I have an array that is always small, but whose size isn't known until after preprocessor evaluation. How can I fully unroll all loops that access this array without resorting to code generators (which I'm considering adding to clUtil as a pre-preprocessor)? Also, if the compiler is able to compute indices at compile time, will this array reside in registers?
E.g.
#define arraySize 4
__kernel void foo()
{
    unsigned int array[arraySize]; // Want this to be in registers
    #pragma unroll arraySize
    for(unsigned int i = 0; i < arraySize; i++)
    {
        // do something with array
    }
}
This code is equivalent to
__kernel void foo()
{
    unsigned int array1;
    unsigned int array2;
    unsigned int array3;
    unsigned int array4;
    // do something with array1, 2, 3, 4
}
It doesn't let you pass a preprocessor macro as an unroll factor.
Originally posted by: rick.weber
> It doesn't let you pass a preprocessor macro as an unroll factor.
I am able to add the following to my kernel:
#define LOOP_UNROLL_FACTOR 4
#pragma unroll LOOP_UNROLL_FACTOR
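If a particular compiler does not expand the macro inside the pragma, one pragma-free alternative is to let the preprocessor do the unrolling itself. The following is only a minimal sketch, written as plain C so it can be compiled on the host. It relies only on standard C preprocessor features, so the same macros should carry over to an OpenCL C kernel, but that is an assumption I have not verified on any specific OpenCL compiler. The names `REPEAT`, `REPEAT_PASTE`, `REPEAT_1` to `REPEAT_4`, `BODY`, `ACCUMULATE`, and `sum_unrolled` are hypothetical helpers made up for this illustration, and the sketch assumes the size macro expands to a literal between 1 and 4.

```c
#define ARRAY_SIZE 4  /* assumed: expands to a literal in the range 1..4 */

/* Hypothetical helpers: REPEAT_n(M) expands to M(0) M(1) ... M(n-1). */
#define REPEAT_1(M) M(0)
#define REPEAT_2(M) REPEAT_1(M) M(1)
#define REPEAT_3(M) REPEAT_2(M) M(2)
#define REPEAT_4(M) REPEAT_3(M) M(3)

/* Two levels, so that ARRAY_SIZE is expanded to 4 before ## pastes it. */
#define REPEAT_PASTE(n, M) REPEAT_##n(M)
#define REPEAT(n, M) REPEAT_PASTE(n, M)

/* "Loop bodies": each expands to one statement with a constant index. */
#define BODY(i) array[i] = (i) * 2u;
#define ACCUMULATE(i) sum += array[i];

unsigned int sum_unrolled(void)
{
    unsigned int array[ARRAY_SIZE];
    unsigned int sum = 0;

    /* Expands to: array[0] = (0) * 2u; ... array[3] = (3) * 2u; */
    REPEAT(ARRAY_SIZE, BODY)

    /* Expands to: sum += array[0]; ... sum += array[3]; */
    REPEAT(ARRAY_SIZE, ACCUMULATE)

    return sum;
}
```

Because every index is a literal after preprocessing, no loop is left for the compiler to unroll. Constant indices are generally what allows a compiler to keep a small private array in registers, although whether it actually does so is still up to the compiler. The obvious cost is that one `REPEAT_n` macro has to be written for every size you want to support.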