
yurtesen
Miniboss

VGPR usage question (code attached)

I have a simple test kernel that I compile in KernelAnalyzer2. It reports that on Tahiti the kernel will use 110 VGPRs. I tried adding #pragma unroll 1, and it has no effect at all. Is there a known way to keep the compiler from using so many registers? (Keep in mind this is a dummy test kernel, but the same behavior seems to affect an actual kernel and reduce its occupancy.)

__kernel void test (__global double *distrValues, __global double *distrValuesOut) {
    __private const int id = get_global_id(0);
    __private double den2;   // left uninitialized in this dummy test
    __private int i;

    for (i = 0; i < 53; i++) {
        den2 += distrValues[i];
    }

    distrValuesOut[id] = den2;
}

Thanks!

1 Solution

Hi yurtesen,

If I compile the code from the first post for Tahiti, it uses 112 VGPRs. If I change the loop limit to 54, it uses only 20 VGPRs. Don't you think something is strange here? If I change the code as follows, it uses only 8 VGPRs! (Although this last change does not seem to have the same effect in my actual program, for some reason.)

You are right; I can reproduce these observations. I will write a small host program and check the actual performance of this kernel when it runs. I will also forward this to the appropriate team.
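For anyone wondering why the jump from 20 to 110+ VGPRs matters for occupancy, here is a rough sketch in plain C. It assumes the commonly documented GCN limits (256 VGPRs per SIMD lane, at most 10 wavefronts per SIMD, VGPRs allocated in blocks of 4) and ignores every other limiter such as SGPRs, LDS, and work-group size. The function name waves_per_simd is made up for this illustration and is not part of any AMD tool.

```c
/* Assumed GCN (Tahiti) limits; these are assumptions of this sketch. */
enum {
    VGPRS_PER_SIMD_LANE = 256, /* VGPR budget shared by all waves on a SIMD */
    MAX_WAVES_PER_SIMD  = 10,  /* hardware cap on resident wavefronts       */
    VGPR_GRANULARITY    = 4    /* VGPRs are allocated in blocks of 4        */
};

/* Hypothetical helper: upper bound on wavefronts per SIMD from VGPR use alone. */
int waves_per_simd(int vgprs_per_work_item)
{
    if (vgprs_per_work_item <= 0)
        return MAX_WAVES_PER_SIMD;

    /* Round the request up to the allocation granularity. */
    int allocated = (vgprs_per_work_item + VGPR_GRANULARITY - 1)
                    / VGPR_GRANULARITY * VGPR_GRANULARITY;

    /* How many waves fit in the register budget, capped by the hardware limit. */
    int waves = VGPRS_PER_SIMD_LANE / allocated;
    return waves > MAX_WAVES_PER_SIMD ? MAX_WAVES_PER_SIMD : waves;
}
```

Under these assumptions, the 110 or 112 VGPR builds would be limited to 2 wavefronts per SIMD, while the 20 and 8 VGPR variants would not be limited by VGPRs at all and could reach the cap of 10.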

20 Replies