I'm a newbie in OpenCL.
Can someone here give a brief explanation of when to use a 1D, 2D, or 3D NDRange? What are the advantages or disadvantages of each?
Another question is about task parallelism. The OpenCL 1.0 spec, p. 26, says: "It is logically equivalent to executing a kernel on a compute unit with a work-group containing a single work-item". What does that actually mean? Are there any examples in the SDK that use the task-parallel model? I have an assignment that I think is better suited to the task-parallel model than to the data-parallel model.
It says that clEnqueueTask is equivalent to calling clEnqueueNDRangeKernel with work_dim, the global size, and the local size all set to 1.
But current GPUs can't execute two kernels at once, and to fully utilize a GPU you need many hundreds of work-items, not just one.
The dimension you use depends on the problem you are solving. For example, if you are reducing an array you use 1D, and if you are programming a matrix operation you use 2D.
I think nou explained it correctly, but let me put it in my own terms, which might make it clearer for you.
First, the NDRange. I think there is no practical restriction that ties a particular problem to a particular dimensionality. The different NDRange options exist to make things easier to understand. For example, there is nothing wrong with implementing a 2D matrix addition as a 1D NDRange; using a 2D NDRange only makes the code logically clearer.
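To make the matrix-addition point concrete, here is a minimal sketch in plain C, not actual OpenCL code. The loops stand in for the work-items the runtime would launch, and the loop variables play the role of get_global_id(0) and get_global_id(1). All function names (add_2d, add_1d, run_2d, run_1d) are hypothetical, and I am assuming a row-major matrix stored in one flat buffer.

```c
#include <stddef.h>

/* Stand-in for a kernel body launched over a 2D NDRange.
   gx and gy play the role of get_global_id(0) and get_global_id(1). */
static void add_2d(const float *a, const float *b, float *c,
                   size_t width, size_t gx, size_t gy)
{
    size_t i = gy * width + gx;   /* row-major flat index (assumption) */
    c[i] = a[i] + b[i];
}

/* Stand-in for the same kernel launched over a 1D NDRange.
   gid plays the role of get_global_id(0). */
static void add_1d(const float *a, const float *b, float *c, size_t gid)
{
    c[gid] = a[gid] + b[gid];
}

/* Emulates a 2D launch with global size (width, height). */
void run_2d(const float *a, const float *b, float *c,
            size_t width, size_t height)
{
    for (size_t gy = 0; gy < height; ++gy)
        for (size_t gx = 0; gx < width; ++gx)
            add_2d(a, b, c, width, gx, gy);
}

/* Emulates a 1D launch with global size width * height. */
void run_1d(const float *a, const float *b, float *c,
            size_t width, size_t height)
{
    for (size_t gid = 0; gid < width * height; ++gid)
        add_1d(a, b, c, gid);
}
```

Both versions touch every element exactly once and produce the same result. The 2D form just keeps the row and column visible in the kernel, which matters more once the operation actually depends on them (a transpose or a matrix multiply, say).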
Regarding task parallelism: it is definitely not possible to run two different kernels concurrently on current GPUs, so task parallelism on GPUs is not something to be encouraged for now. GPUs are, however, extremely helpful for data-parallel operations.