
Fuxianjun
Journeyman III

A finding about loops in a kernel

These two kernels' execution times are almost the same: both take between 8 and 11 ms. Why?

__kernel void testone(__global float * a, __global float * b, __global float * c)
{
    int i = get_global_id(0);
    for (int j = 0; j < 10000000; j++)
    {
        c[i] += a[i] * b[i] + j;
    }
}

__kernel void testtwo(__global float * a, __global float * b, __global float * c)
{
    int i = get_global_id(0);
    for (int j = 0; j < 10; j++)
    {
        c[i] += a[i] * b[i] + j;
    }
}

Marco13
Journeyman III

Hello

Usually, compilers will unroll loops whose bounds are compile-time constants. And depending on how "clever" the compiler is, it might (at least theoretically) reduce each of these kernels to something like

c[ i ] += (a[ i ]*b[ i ])*n + (n*(n-1)/2);

where 'n' may be 10 or 10000000.
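To see where that expression comes from: the loop adds a*b once per iteration and also adds j for j = 0 .. n-1, which sums to n*(n-1)/2. Here is a minimal sketch in plain C that checks the identity. The function names loop_sum and closed_form are made up for illustration, and it uses 64-bit integers instead of float so the comparison is exact. With float the two forms can differ by rounding, and for n = 10000000 the term n*(n-1)/2 does not fit in a 32-bit int.

```c
#include <stdint.h>

/* Accumulate the way the kernel loop does: n iterations of (ab + j).
   `ab` stands in for the product a[i]*b[i]. */
static int64_t loop_sum(int64_t ab, int64_t n)
{
    int64_t c = 0;
    for (int64_t j = 0; j < n; j++) {
        c += ab + j;
    }
    return c;
}

/* The same value without a loop: n copies of ab plus 0 + 1 + ... + (n-1). */
static int64_t closed_form(int64_t ab, int64_t n)
{
    return ab * n + n * (n - 1) / 2;
}
```

For example, loop_sum(6, 10) and closed_form(6, 10) should both give 105.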

I assume that the timings will be significantly different when you pass the upper bound of the loop as an additional kernel argument, so that the compiler no longer knows it at compile time...
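A sketch of what such a kernel could look like. The name testparam and the argument n are my own additions, not from the original post. The #ifndef part at the top is only there so the same source also compiles as ordinary C for a quick check on the host; fake_global_id is a hypothetical stand-in for the real work-item index. An OpenCL compiler defines __OPENCL_VERSION__ and skips that part.

```c
/* Host-only shim: lets this kernel source build as plain C for testing. */
#ifndef __OPENCL_VERSION__
#include <stddef.h>
#define __kernel
#define __global
static size_t fake_global_id = 0;   /* hypothetical stand-in for the work-item id */
static size_t get_global_id(unsigned int dim)
{
    (void)dim;                      /* only one dimension is emulated here */
    return fake_global_id;
}
#endif

/* Same body as testone/testtwo, but the loop bound n is a kernel argument,
   so the compiler cannot fold the loop for one specific constant. */
__kernel void testparam(__global const float *a,
                        __global const float *b,
                        __global float *c,
                        int n)
{
    int i = (int)get_global_id(0);
    for (int j = 0; j < n; j++) {
        c[i] += a[i] * b[i] + j;
    }
}
```

On the host side the bound would then be passed like any other scalar argument (with clSetKernelArg), once as 10 and once as 10000000, and the two runs compared.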

Another question is whether measuring such short time intervals can be precise enough at all. When the measured time varies by ~50%, there may be something odd...
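One way to get a feeling for the measurement noise is to run each kernel many times and look at the spread instead of a single number. Below is a small sketch in plain C that only does the bookkeeping. How the samples are obtained is left open (for example from OpenCL event profiling), and the names timing_summary and summarize are made up for this example.

```c
#include <stddef.h>

/* Summary of repeated timing samples, all in milliseconds. */
struct timing_summary {
    double min_ms;
    double max_ms;
    double mean_ms;
};

/* Reduce `count` samples (count must be at least 1) to min, max and mean. */
static struct timing_summary summarize(const double *samples_ms, size_t count)
{
    struct timing_summary s = { samples_ms[0], samples_ms[0], 0.0 };
    double total = 0.0;
    for (size_t k = 0; k < count; k++) {
        if (samples_ms[k] < s.min_ms) s.min_ms = samples_ms[k];
        if (samples_ms[k] > s.max_ms) s.max_ms = samples_ms[k];
        total += samples_ms[k];
    }
    s.mean_ms = total / (double)count;
    return s;
}
```

If min and max are far apart compared to the mean for both kernels, the difference between the kernels is probably hidden in launch overhead and timer noise.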

bye

