1 Reply Latest reply on Aug 12, 2010 8:24 AM by Marco13

    a finding of loop in kernel

    Fuxianjun

      These two kernels' execution times are almost the same: both take between 8 and 11 ms. Why?

      __kernel void testone(__global float * a, __global float * b, __global float * c)
      {
          int i = get_global_id(0);
          for (int j = 0; j < 10000000; j++)
          {
              c[i] += a[i] * b[i] + j;
          }
      }

      __kernel void testtwo(__global float * a, __global float * b, __global float * c)
      {
          int i = get_global_id(0);
          for (int j = 0; j < 10; j++)
          {
              c[i] += a[i] * b[i] + j;
          }
      }

        • a finding of loop in kernel
          Marco13

          Hello

          Compilers usually unroll loops whose bounds are known at compile time. And depending on how "clever" the compiler is, it might (at least theoretically) reduce each of these kernels to something like

          c[i] += (a[i]*b[i])*n + (n*(n-1)/2);

          where 'n' may be 10 or 10000000.
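To make the closed form concrete, here is a minimal host-side sketch in plain C (not OpenCL C) that compares one work-item's loop with the single-expression version. The function names `loop_sum` and `closed_form` are made up for illustration. The values used below are small and exactly representable, so both versions agree exactly. For large n the float results can differ in rounding, which is one reason a compiler might only perform this rewrite under relaxed floating-point settings.

```c
/* One work-item's worth of the original loop, written as plain C.
   a, b and c stand for a[i], b[i] and c[i]. */
float loop_sum(float a, float b, float c, int n)
{
    for (int j = 0; j < n; j++) {
        c += a * b + j;
    }
    return c;
}

/* Closed form of the same accumulation:
   n copies of a*b, plus 0 + 1 + ... + (n-1) = n*(n-1)/2.
   The product is computed in float so that n = 10000000 does not
   overflow an int. */
float closed_form(float a, float b, float c, int n)
{
    return c + (a * b) * n + (float)n * (n - 1) / 2.0f;
}
```

With a = 2, b = 3, c = 1 and n = 10, both functions return 106: the starting value 1, plus 10 * 6, plus 45.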

          I assume that the timings will be significantly different when you pass the upper bound of the loop as an additional kernel parameter.
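A sketch of what that variant could look like: the same loop body, with the bound supplied at run time so the compiler cannot treat it as a constant. The kernel name `testthree` and the parameter `n` are assumptions, not part of the original post. The `#ifndef` block is only a hypothetical host-side shim so the code also compiles as plain C for a quick check. A real OpenCL build provides `__kernel`, `__global` and `get_global_id` itself and skips that block.

```c
#ifndef __OPENCL_VERSION__
/* Host-only stand-ins (hypothetical), used when compiling as plain C. */
#include <stddef.h>
#define __kernel
#define __global
size_t current_id = 0;  /* which work-item the host pretends to be */
size_t get_global_id(unsigned dim) { (void)dim; return current_id; }
#endif

/* Same body as testone/testtwo, but the loop bound n is a kernel argument. */
__kernel void testthree(__global float * a, __global float * b,
                        __global float * c, int n)
{
    int i = get_global_id(0);
    for (int j = 0; j < n; j++) {
        c[i] += a[i] * b[i] + j;
    }
}
```

On the host side the extra argument would be set like any other scalar, for example with `clSetKernelArg(kernel, 3, sizeof(int), &n)`. The kernel could then be timed once with n = 10 and once with n = 10000000.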

          Another question is whether measuring such short time intervals can be precise enough at all. When the measured time varies by ~50%, there may be something odd...

          bye