I have a simple test kernel that I compile in KernelAnalyzer2. It reports that on Tahiti the kernel will use 110 VGPRs. I tried adding #pragma unroll 1, and it has no effect at all. Is there a known way to prevent the compiler from using so many registers? (Keep in mind this is a dummy test kernel, but the same behavior seems to affect an actual kernel and reduce its occupancy.)
__kernel void test (__global double *distrValues, __global double *distrValuesOut) {
    __private const int id = get_global_id(0);
    __private double den2 = 0.0;  /* initialized so the sum is well defined */
    __private int i;
    for (i = 0; i < 53; i++) {
        den2 += distrValues[i];
    }
    distrValuesOut[id] = den2;
}
Thanks!