Hello everyone,
I have a kernel that performs computations on blocks of matrices (the block size is 128 matrices).
To process a block, I iterate over all the matrices in that block.
It seems that this loop is unrolled too aggressively, which reduces the number of in-flight wavefronts from 4 to 2.
As I need a certain number of wavefronts to hide the memory latency, I would like to tell the OpenCL compiler NOT to unroll this loop. Is this possible?
I am currently passing the block size as a kernel argument instead of a compile-time constant to avoid the unrolling, but that's not a great solution IMO...
Thank you for your help,
Regards,
Ziple.
Hi Ziple,
You may try __attribute__((opencl_unroll_hint(1))) for this purpose. As per the OpenCL C Spec:
Section: Specifying Attribute For Unrolling Loops
The __attribute__((opencl_unroll_hint)) and __attribute__((opencl_unroll_hint(n))) attribute qualifiers can be used to specify that a loop (for, while and do loops) can be unrolled. This attribute qualifier can be used to specify full unrolling or partial unrolling by a specified amount. This is a compiler hint and the compiler may ignore this directive.
n is the loop unrolling factor and must be a positive integral compile time constant expression. An unroll factor of 1 disables unrolling. If n is not specified, the compiler determines the unrolling factor for the loop.
NOTE: The __attribute__((opencl_unroll_hint(n))) attribute qualifier must appear immediately before the loop to be affected.
Example:
__attribute__((opencl_unroll_hint(1)))
for (int i=0; i<32; i++)
{
…
}
For details, please refer to the section "Specifying Attribute For Unrolling Loops" in the OpenCL C Spec.
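As a rough illustration of how the hint might be applied to the loop from the question, here is a minimal sketch. The names process_block, process_blocks, NO_UNROLL, GLOBAL and MATRIX_ELEMS, the 4x4 matrix size and the per-matrix work are all assumptions made for the example, not the original poster's code. The attribute is wrapped in a macro only so that the same helper also compiles as plain C on the host for a quick check. As far as I know the attribute comes from OpenCL C 2.0, so the program may need to be built with -cl-std=CL2.0, and since it is only a hint, the wavefront count should be checked again after the change.

```c
/* Sketch only: all names and the per-matrix work are hypothetical. */
#define BLOCK_SIZE   128   /* matrices per block, as in the question */
#define MATRIX_ELEMS 16    /* assumed 4x4 matrices, for illustration only */

#ifdef __OPENCL_VERSION__
/* Compiled as OpenCL C: ask the compiler not to unroll the loop. */
#define NO_UNROLL __attribute__((opencl_unroll_hint(1)))
#define GLOBAL __global
#else
/* Compiled as plain C on the host: the hint expands to nothing. */
#define NO_UNROLL
#define GLOBAL
#endif

/* Placeholder work: add up the first element of every matrix in a block. */
float process_block(GLOBAL const float *block)
{
    float acc = 0.0f;

    /* The attribute must appear immediately before the loop it affects. */
    NO_UNROLL
    for (int m = 0; m < BLOCK_SIZE; ++m)
    {
        acc += block[m * MATRIX_ELEMS];
    }
    return acc;
}

#ifdef __OPENCL_VERSION__
/* Hypothetical kernel: one block per work-item, to keep the sketch short. */
__kernel void process_blocks(__global const float *matrices,
                             __global float *out)
{
    size_t b = get_global_id(0);
    out[b] = process_block(matrices + b * BLOCK_SIZE * MATRIX_ELEMS);
}
#endif
```

With this in place BLOCK_SIZE can stay a compile-time constant, so the workaround of passing it as a kernel argument should no longer be needed, provided the compiler honours the hint.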
Regards,
Thank you for your help, I think that it will do the trick!