Fuxianjun

A problem about kernel parameters' length

Discussion created by Fuxianjun on Aug 14, 2010
Latest reply on Aug 14, 2010 by genaganna

I find in many examples that the kernel parameters all have a length that is a multiple of 64. (Take AMD's MatrixMultiplication for example: the matrix size is 64 * 64. In some vector-addition cases, too, the vector length is a multiple of 64.)

I posted a topic at http://forums.amd.com/devforum/messageview.cfm?catid=390&threadid=137812&enterthread=y and LeeHowes told me that "it hard to believe that the second kernel really has an optimum count of 2".

My questions are:

1. The parameters' lengths are multiples of 64 because the wavefront size is 64 on some AMD hardware chips. Is that correct?

2. Does the global_work_size parameter of clEnqueueNDRangeKernel() mean how many work-items are needed, or how many wavefronts are needed?

3. For question 2, I guess the answer is work-items. So if global_work_size is less than 64, are there still 64 work-items at work in the wavefront? By the same reasoning, is the number of work-items at work always a multiple of 64?

4. So if I want to compute an algebra problem on a matrix, vector, or array of arbitrary size, it is better to append some zeroes to it so that its length is a multiple of 64. Is that correct?
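
To make question 4 concrete, here is a minimal sketch in plain C of the padding I have in mind. It assumes a wavefront size of 64, which may not hold on every device, and the names WAVEFRONT_SIZE, round_up and pad_with_zeros are hypothetical helpers of my own, not part of OpenCL or of any AMD sample. It only shows the host-side arithmetic and does not claim that padding is the best approach.

```c
#include <assert.h>
#include <stddef.h>

/* Assumption: 64 work-items per wavefront, as on some AMD GPUs. */
#define WAVEFRONT_SIZE 64

/* Hypothetical helper: round n up to the next multiple of `multiple`. */
static size_t round_up(size_t n, size_t multiple)
{
    assert(multiple > 0);
    return ((n + multiple - 1) / multiple) * multiple;
}

/* Hypothetical helper: copy n floats from src into dst, then fill the
 * remaining (padded_n - n) elements of dst with zeroes.
 * dst must have room for padded_n floats. */
static void pad_with_zeros(float *dst, const float *src,
                           size_t n, size_t padded_n)
{
    size_t i;
    assert(padded_n >= n);
    for (i = 0; i < n; ++i) {
        dst[i] = src[i];
    }
    for (i = n; i < padded_n; ++i) {
        dst[i] = 0.0f;
    }
}
```

For a vector of length 100, round_up(100, WAVEFRONT_SIZE) gives 128, and that padded length is what I would pass as global_work_size.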
