Strip return statements from CPU kernels

Churavy, Valentin requested to merge vc/return into main

Created by: vchuravy

Fixes #461 (closed). @eschnett I remember now why I didn't add this in the first place.

@kernel function f(cond)
   if cond
       return
   end
   # do something useful
end

How should we compile this on the GPU?

The reason for the bug in #461 (closed) is that on the CPU the kernel

@kernel function set_matrix!(A)
   i,j = @index(Global, NTuple)
   A[i, j] = 1
   return nothing
end

gets compiled as:

   for lane in ...
      i,j = @index(Global, NTuple)
      A[i, j] = 1
      return nothing
   end

And so the return exits the whole loop on the first lane, and execution terminates early.
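To make the failure mode concrete, here is a minimal plain-Julia sketch of the same pattern. It does not use KernelAbstractions at all; `run_lanes!`, `nlanes`, and `with_return` are hypothetical names made up for this illustration, and the loop is only a stand-in for whatever the CPU backend actually generates.

```julia
# Hypothetical stand-in for a lowered CPU kernel (not KernelAbstractions API).
# Each loop iteration plays the role of one lane.
function run_lanes!(A, nlanes; with_return::Bool)
    for lane in 1:nlanes
        A[lane] = 1
        if with_return
            # Exits run_lanes! entirely, not just the current iteration.
            return nothing
        end
    end
    return nothing
end

A = zeros(Int, 4)
run_lanes!(A, 4; with_return=true)    # only the first lane writes

B = zeros(Int, 4)
run_lanes!(B, 4; with_return=false)   # every lane writes
```

With the return left inside the loop body, only `A[1]` is set; once it is removed, all four entries are set.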

Stripping trailing return statements is easy enough, but handling conditional returns becomes as hard as conditional synchronize, if not harder.
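As a rough illustration of the easy case, the sketch below drops a trailing return from a quoted kernel body. This is an assumption about what such a pass could look like, not the code in this merge request; `strip_trailing_return` is a hypothetical helper name.

```julia
# Hypothetical helper, not the implementation in this merge request:
# replace a trailing `return x` in a block with just `x`.
function strip_trailing_return(body::Expr)
    body.head === :block || return body
    args = copy(body.args)
    # Skip LineNumberNodes to find the last real statement.
    idx = findlast(a -> !(a isa LineNumberNode), args)
    idx === nothing && return body
    stmt = args[idx]
    if stmt isa Expr && stmt.head === :return
        # Keep the returned expression so any side effects still run.
        args[idx] = stmt.args[1]
    end
    return Expr(:block, args...)
end

trailing = quote
    A[1] = 1
    return nothing
end

conditional = quote
    if cond
        return
    end
    x = 1
end

stripped = strip_trailing_return(trailing)
untouched = strip_trailing_return(conditional)
```

The trailing form loses its return, while the conditional form comes back with the nested return still in place, which is the case that would need real control-flow handling.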

cc: @pxl-th
