Strip return statements from CPU kernels
Created by: vchuravy
fixes #461 (closed)
@eschnett
I remember now why I didn't add this in the first place.
@kernel function f(cond)
    if cond
        return
    end
    # do something useful
end
How should we compile this on the GPU?
The reason for the bug in #461 (closed) is that on the CPU the kernel
@kernel function set_matrix!(A)
    i, j = @index(Global, NTuple)
    A[i, j] = 1
    return nothing
end
gets compiled as:
for lane in ...
    i, j = @index(Global, NTuple)
    A[i, j] = 1
    return nothing
end
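And so execution terminates early: the return inside the loop body exits the whole function after the first lane, and the remaining lanes never run.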
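The early exit can be reproduced in plain Julia without KernelAbstractions. The sketch below is only a stand-in for the CPU lowering, not the code the macro actually generates; `set_all_with_return!` and `set_all!` are hypothetical names chosen for illustration.

```julia
# Stand-in for the lowered kernel: the kernel body is pasted into a loop over lanes.
# Both function names are hypothetical and exist only for this illustration.

function set_all_with_return!(A)
    for lane in eachindex(A)
        A[lane] = 1
        return nothing   # exits the whole function, not just this iteration
    end
end

function set_all!(A)
    for lane in eachindex(A)
        A[lane] = 1
    end
    return nothing       # same body with the return moved out of the loop
end

A = zeros(Int, 4)
set_all_with_return!(A)  # only the first element is written

B = zeros(Int, 4)
set_all!(B)              # every element is written
```

With the return inside the loop only `A[1]` is set, which matches the symptom described for the `set_matrix!` kernel.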
It's easy enough to strip trailing return statements, but handling conditional returns is as hard as conditional synchronize, if not harder.
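For the trailing case, one possible approach is to drop the last statement of the kernel body when it is a `return`. The following is a minimal sketch under that assumption, not the implementation in this PR; `strip_trailing_return` is a hypothetical helper, and it assumes the returned value is unused and can be discarded.

```julia
# Hypothetical helper: remove a trailing `return` from a quoted block.
# Assumes the returned value is not needed and can simply be dropped.
function strip_trailing_return(body::Expr)
    body.head === :block || return body
    # Find the last real statement, skipping LineNumberNodes.
    idx = findlast(x -> !(x isa LineNumberNode), body.args)
    idx === nothing && return body
    last_stmt = body.args[idx]
    if last_stmt isa Expr && last_stmt.head === :return
        new_args = copy(body.args)
        deleteat!(new_args, idx)
        return Expr(:block, new_args...)
    end
    return body
end

# Trailing return: stripped.
body = quote
    A[i] = 1
    return nothing
end
stripped = strip_trailing_return(body)
has_return = any(x -> x isa Expr && x.head === :return, stripped.args)

# Conditional return: not the last statement, so the block is left untouched.
conditional = quote
    if cond
        return
    end
    A[i] = 1
end
untouched = strip_trailing_return(conditional)
```

The helper only inspects the final statement of the block, so a `return` nested inside an `if` survives unchanged, which is exactly the case that remains hard.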
cc: @pxl-th