#pragma no_unroll ?

Question asked by Ziple on Apr 26, 2015
Hello everyone,


I have a kernel that does some computations on blocks of matrices (the block size is 128 matrices).

To process a block, I iterate over all the matrices in a block.

It seems that this loop is unrolled too aggressively, which reduces the number of in-flight wavefronts from 4 to 2.


Since I need a certain number of wavefronts to hide the memory latency, I would like to tell the OpenCL compiler NOT to unroll this loop. Is this possible?


I am currently passing the block size as a kernel argument instead of a compile-time constant to avoid the unrolling, but that's not a great solution IMO...
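
To make the workaround concrete, here is a minimal sketch of what I mean. The kernel name `sum_block`, its arguments, and the summation it performs are placeholders made up for illustration, not my real kernel; the only point is that the trip count arrives as a runtime argument. The small shim at the top is also an assumption, there only so the snippet can be compiled as plain C on the host.

```c
/* Host-side shim (illustration only): lets this sketch compile as plain C.
   An OpenCL compiler defines __OPENCL_VERSION__ and skips this part. */
#ifndef __OPENCL_VERSION__
#include <stddef.h>
#define __kernel
#define __global
static size_t get_global_id(unsigned int dim) { (void)dim; return 0; }
#endif

/* Hypothetical kernel: sums element `gid` across every matrix in a block.
   matrices_per_block (128 in my case) is a kernel argument rather than a
   compile-time constant, so the compiler cannot fully unroll the loop. */
__kernel void sum_block(__global const float *block,
                        __global float *result,
                        const int matrix_size,
                        const int matrices_per_block)
{
    const size_t gid = get_global_id(0);
    float acc = 0.0f;

    /* Trip count is only known at run time. */
    for (int m = 0; m < matrices_per_block; ++m) {
        acc += block[(size_t)m * matrix_size + gid];
    }
    result[gid] = acc;
}
```

On the host side the value is then set with `clSetKernelArg` like any other argument. As far as I can tell this only rules out full unrolling; the compiler may still partially unroll a loop with a runtime bound.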


