
tune CUDA kernels automatically

Churavy, Valentin requested to merge github/fork/simeonschaub/sds/autotune into master

Created by: simeonschaub

This is still quite rough around the edges, but I am putting it up for feedback. It automatically distributes the threads over the leading dimensions of the ndrange, which gives better performance when the first dimension is small.
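To make the idea concrete for reviewers, here is a minimal sketch of one way such a split could work. It is not the code in this merge request: the function name `split_threads`, the greedy strategy, and the thread budget of 256 are all assumptions chosen for illustration. The sketch fills the first dimension as far as it can, then hands whatever thread budget is left to the next dimension, and so on.

```python
from typing import Tuple


def split_threads(ndrange: Tuple[int, ...], max_threads: int) -> Tuple[int, ...]:
    """Greedily distribute a thread budget over the leading dimensions.

    Hypothetical helper for illustration only. `max_threads` stands in for
    whatever per-block thread count the tuner picks (an assumption here).
    """
    remaining = max_threads
    sizes = []
    for n in ndrange:
        # Take as many threads as this dimension can use, but at least 1
        # so the remaining budget can still be divided safely.
        t = max(1, min(n, remaining))
        sizes.append(t)
        # Whatever budget is left goes to the next dimension.
        remaining //= t
    return tuple(sizes)


if __name__ == "__main__":
    # Small first dimension: leftover threads spill into the second one.
    print(split_threads((4, 1000), 256))     # (4, 64)
    # Large first dimension: it absorbs the whole budget.
    print(split_threads((1000, 1000), 256))  # (256, 1)
```

With a purely one-dimensional split, the `(4, 1000)` case would use only 4 threads per block; spreading the budget over the second dimension keeps the block full.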
