7 Replies Latest reply on Nov 30, 2010 6:23 PM by jeff_golds

    Drastically changing GPU behaviour on minor change in kernel


      Hi all,

      Here is a kernel which I am running on an ATI Mobility Radeon HD 4500:

      __kernel void ker(__global int *A, __global int *B, int width)
      {
          int tid = get_global_id(0);
          int a;
          for (int i = tid; i < width && i < tid + 10000; i++) {
              a = A[i] + tid + B[i];
          }
          B[tid] = 9;
      }

      The total number of global work items (i.e. width) is 128x128x128.
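
      For scale, here is a rough host-side sketch in plain C (not part of the kernel; the function names are hypothetical) of how many loop iterations the bounds i<width && i<tid+10000 imply. It assumes one work item per tid from 0 to width-1 and that every iteration is actually executed:

```c
#include <stdint.h>

/* Iterations one work item performs: i starts at tid and runs
   while i < width and i < tid + cap. */
uint64_t iterations_for_item(uint64_t tid, uint64_t width, uint64_t cap)
{
    if (tid >= width) {
        return 0;
    }
    uint64_t remaining = width - tid;
    return remaining < cap ? remaining : cap;
}

/* Sum of iterations_for_item over tid = 0 .. width-1, in closed form. */
uint64_t total_iterations(uint64_t width, uint64_t cap)
{
    if (width <= cap) {
        /* Every work item runs to the end of the range: width + ... + 1. */
        return width * (width + 1) / 2;
    }
    /* Most items do the full cap; the last cap-1 items do cap-1 ... 1. */
    return (width - cap) * cap + cap * (cap + 1) / 2;
}
```

      With width = 128x128x128 = 2,097,152 and a cap of 10,000, total_iterations gives 20,921,525,000 loop iterations across all work items.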

      This kernel takes about 0.93 seconds of wall-clock time, as reported by the time command. BUT as soon as I change the line B[tid] = 9; to
          B[tid] = a;
      my desktop GUI hangs for a while, and when it resumes, the reported wall-clock time is 14.28 seconds.

      What exactly is happening here? Why does such a small change make the kernel so much slower?

      Thank You.