How best to map NDRange to the problem at hand?

Discussion created by jski on May 31, 2011
Latest reply on May 31, 2011 by himanshu.gautam

I've been reading about NDRange and am wondering how best to map it to the problem at hand.  An NDRange is a 1-, 2-, or 3-dimensional index space in which each element corresponds to one kernel instance (a work-item).  It appears to be intended to map naturally onto the architectural layout of the GPU.
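
To make the index-space idea concrete, here is a minimal sketch in plain Python that emulates a 2D NDRange. It does not call any OpenCL API. The function name `enumerate_ndrange` and the sizes used are made up for illustration. It assumes a zero global offset, in which case a work-item's global ID is its work-group ID times the local size plus its local ID.

```python
# Plain-Python emulation of a 2D NDRange; no OpenCL API is called here.
# enumerate_ndrange is a hypothetical helper, not part of any library.
from itertools import product


def enumerate_ndrange(global_size, local_size):
    """Yield (global_id, group_id, local_id) for every work-item in a 2D NDRange."""
    gx, gy = global_size
    lx, ly = local_size
    # OpenCL 1.x requires each global size to be a multiple of the local size.
    assert gx % lx == 0 and gy % ly == 0
    for group_x, group_y in product(range(gx // lx), range(gy // ly)):
        for local_x, local_y in product(range(lx), range(ly)):
            # Relation between the IDs when the global offset is zero.
            global_id = (group_x * lx + local_x, group_y * ly + local_y)
            yield global_id, (group_x, group_y), (local_x, local_y)


# An 8 x 4 index space split into work-groups of 4 x 2 work-items.
items = list(enumerate_ndrange(global_size=(8, 4), local_size=(4, 2)))
print(len(items))  # one work-item per element of the index space
```

Each tuple produced corresponds to one kernel instance; in a real kernel the same values would come from `get_global_id`, `get_group_id`, and `get_local_id`.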

If I have two 10K by 10K matrices and wish to multiply them, I would undoubtedly choose a 2D NDRange.  Should it be as large as possible?  And since these matrices are beyond the capacity of the GPU, how should I best map the A and B matrices onto the 2D NDRange that is available?
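
For scale, assuming single-precision floats (the post does not say which type is used), one 10K by 10K matrix takes 10,000 * 10,000 * 4 bytes = 400 MB, so A, B, and the result together come to about 1.2 GB. One possible way to frame the mapping is to split the product into square tiles and treat each tile step as a separate launch over a tile-sized 2D NDRange. The sketch below shows only that decomposition, in plain Python on the host, for square matrices. `matrix_bytes`, `blocked_matmul`, and the `tile` parameter are hypothetical names, and no OpenCL call is made.

```python
# Host-side sketch in plain Python; each (bi, bj, bk) step stands in for one
# hypothetical kernel launch over a tile x tile 2D NDRange.

def matrix_bytes(n, bytes_per_element=4):
    """Storage in bytes for an n x n matrix (4 bytes assumes single precision)."""
    return n * n * bytes_per_element


def blocked_matmul(A, B, tile):
    """Multiply square matrices A and B tile by tile, accumulating into C."""
    n = len(A)
    C = [[0.0] * n for _ in range(n)]
    for bi in range(0, n, tile):          # tile row of C
        for bj in range(0, n, tile):      # tile column of C
            for bk in range(0, n, tile):  # matching tiles of A and B
                # Only one tile each of A, B, and C is touched in this step.
                for i in range(bi, min(bi + tile, n)):
                    for j in range(bj, min(bj + tile, n)):
                        acc = C[i][j]
                        for k in range(bk, min(bk + tile, n)):
                            acc += A[i][k] * B[k][j]
                        C[i][j] = acc
    return C


print(matrix_bytes(10_000))  # bytes for one 10K x 10K single-precision matrix
```

With this framing the NDRange size is tied to the tile size rather than to the full 10K by 10K matrix, and the tile size would be chosen so that the tiles in flight fit in device memory.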