

How best to map NDRange to the problem at hand?

I have been reading about NDRange and am wondering how best to map it to a given problem. An NDRange is a 1-, 2-, or 3-dimensional index space in which each point corresponds to one kernel instance (a work-item). It appears to map well onto the architectural layout of the GPU.

If I have two 10K by 10K matrices and wish to multiply them, I would presumably choose a 2D NDRange. Should it be as large as possible? And since these matrices exceed the capacity of the GPU, how should I map the A and B matrices onto the 2D NDRange that is available?
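
To make the mapping concrete, here is a minimal sketch in plain Python (not OpenCL) of how a 2D NDRange might be sized for one output tile. It assumes one work-item per output element and a 16 by 16 work-group, and it rounds the global size up to a multiple of the work-group size, which OpenCL implementations without non-uniform work-group support require. The names `round_up`, `ndrange_for_tile`, and `run_work_items` are made up for illustration.

```python
def round_up(n, multiple):
    """Smallest multiple of `multiple` that is >= n."""
    return ((n + multiple - 1) // multiple) * multiple


def ndrange_for_tile(rows, cols, local=(16, 16)):
    """Global size (x, y) for an output tile of rows x cols elements.

    Dimension 0 (x) walks the columns, dimension 1 (y) walks the rows.
    The work-group size `local` is an assumption, not a recommendation.
    """
    return (round_up(cols, local[0]), round_up(rows, local[1]))


def run_work_items(rows, cols, local=(16, 16)):
    """Simulate the launch: every (gx, gy) stands for one work-item.

    Work-items that fall in the padding must return early, exactly as a
    kernel would guard with `if (gx >= cols || gy >= rows) return;`.
    """
    global_size = ndrange_for_tile(rows, cols, local)
    computed = 0
    for gy in range(global_size[1]):
        for gx in range(global_size[0]):
            if gx < cols and gy < rows:
                computed += 1  # this work-item would compute C[gy][gx]
    return global_size, computed
```

Under these assumptions the NDRange follows the size of the output tile being computed, not the size of the whole problem, so "as large as possible" really means "as large as the tile that fits on the device".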


1 Reply


What you are describing is an out-of-core matrix multiplication problem.

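The answer is that you will have to divide the matrices into blocks (say, divide the A matrix into blocks of rows and the B matrix into blocks of columns), then send these blocks to the GPU one pair at a time and multiply them. Each product of a row block of A with a column block of B gives one tile of the result matrix.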
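
As a rough illustration of that blocking scheme, here is a sketch in plain Python with no GPU involved. `matmul_block` is a hypothetical stand-in for a single kernel launch, and the block size is an arbitrary parameter; in practice it would be chosen so that a row block of A, a column block of B, and the resulting tile of C all fit in device memory together. For scale, assuming 4-byte floats, one 10K by 10K matrix is about 400 MB.

```python
def matmul_block(a_rows, b_cols):
    """Multiply a block of rows of A by a block of columns of B.

    Stand-in for what one kernel launch would compute on the device.
    `b_cols` holds columns of B, each as a list.
    """
    return [[sum(x * y for x, y in zip(row, col)) for col in b_cols]
            for row in a_rows]


def blocked_matmul(a, b, block):
    """Compute C = A * B one tile at a time.

    Only one row block of A and one column block of B are "on the
    device" at any moment; the host assembles the tiles into C.
    """
    n, m = len(a), len(b[0])
    b_columns = [list(col) for col in zip(*b)]  # B stored column by column
    c = [[0] * m for _ in range(n)]
    for i0 in range(0, n, block):               # row blocks of A
        a_rows = a[i0:i0 + block]
        for j0 in range(0, m, block):           # column blocks of B
            b_cols = b_columns[j0:j0 + block]
            tile = matmul_block(a_rows, b_cols)  # one "kernel launch"
            for di, tile_row in enumerate(tile):
                c[i0 + di][j0:j0 + len(tile_row)] = tile_row
    return c


a = [[1, 2], [3, 4], [5, 6]]
b = [[7, 8, 9], [10, 11, 12]]
result = blocked_matmul(a, b, block=2)
```

The block size here does not have to divide the matrix dimensions evenly; the last block in each direction is simply smaller. A real implementation would also try to keep a row block of A resident while cycling through the column blocks of B, to cut down on transfers.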