
jski
Journeyman III

Mapping a large matrix to NDRange?

If I have 2 large matrices (say, 10K x 10K) that I wish to multiply, and they are too large to reside completely on any GPU device, how should I map the problem to the NDRange? How do I divide up the problem?

How best should I define NDRange? Obviously work_dim=2, but what about global_work_size and local_work_size?

Obviously, only part of each matrix could be loaded.  This is more of a question about global_work_size and local_work_size.

---jski

2 Replies
nareshsankapelly
Journeyman III

Jski,

You have to divide the problem into subproblems and run a kernel for each subproblem.
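As a rough sketch of what that means for the NDRange of each subproblem: assuming one work-item per element of the output tile and a 16 x 16 work-group (both are assumptions for illustration, not something the SDK sample prescribes), global_work_size for a launch is the tile size rounded up to a multiple of local_work_size. OpenCL 1.x requires global_work_size to be evenly divisible by local_work_size when the latter is specified, so the kernel has to skip work-items that fall outside the tile. The helper names `round_up` and `ndrange_for_tile` are hypothetical.

```python
def round_up(value, multiple):
    """Return the smallest multiple of `multiple` that is >= value."""
    return ((value + multiple - 1) // multiple) * multiple


def ndrange_for_tile(tile_rows, tile_cols, local_size=(16, 16)):
    """Return (global_work_size, local_work_size) for one output tile.

    Assumes one work-item per output element, with dimension 0 indexing
    columns and dimension 1 indexing rows (a convention, not a requirement).
    """
    local_cols, local_rows = local_size
    global_size = (round_up(tile_cols, local_cols),
                   round_up(tile_rows, local_rows))
    return global_size, local_size


# Example: one 5000 x 5000 output tile of a 10K x 10K product.
global_size, local_size = ndrange_for_tile(5000, 5000)
print(global_size, local_size)  # (5008, 5008) (16, 16)
```

The two tuples correspond to the global_work_size and local_work_size arguments of clEnqueueNDRangeKernel with work_dim=2. The work-group size that actually works depends on the device, so check CL_DEVICE_MAX_WORK_GROUP_SIZE before settling on 16 x 16.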

Suppose that you have to calculate C = A * B

Please have a look at Matrix Multiplication sample in AMD APP SDK samples.


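Suppose that you have to calculate C (m x k) = A (m x n) * B (n x k).

Divide the C matrix into four blocks of size m/2 x k/2, the A matrix into two row blocks of size m/2 x n, and the B matrix into two column blocks of size n x k/2. Each block of C is the product of one row block of A and one column block of B, so only those two blocks need to be on the device at a time. You have to calculate four matrix multiplications to get the resultant C. That should give you a clear picture of how it works.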
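The decomposition can be checked on the CPU with a minimal sketch like the one below. It uses plain Python lists instead of OpenCL buffers, and the call to `matmul` on each pair of blocks stands in for one kernel launch; the function names are hypothetical and the code is only meant to show the indexing, not to be fast.

```python
def matmul(a, b):
    """Plain triple-loop product of two row-major lists of lists."""
    n, k = len(b), len(b[0])
    return [[sum(row[t] * b[t][j] for t in range(n)) for j in range(k)]
            for row in a]


def blocked_matmul(a, b):
    """Compute C = A * B as four independent sub-products (2 x 2 output blocks)."""
    m, k = len(a), len(b[0])
    row_splits = [(0, m // 2), (m // 2, m)]   # row ranges of the two A blocks
    col_splits = [(0, k // 2), (k // 2, k)]   # column ranges of the two B blocks
    c = [[0] * k for _ in range(m)]
    for r0, r1 in row_splits:
        a_part = a[r0:r1]                         # (r1 - r0) x n block of A
        for c0, c1 in col_splits:
            b_part = [row[c0:c1] for row in b]    # n x (c1 - c0) block of B
            tile = matmul(a_part, b_part)         # stand-in for one kernel launch
            for i, tile_row in enumerate(tile):
                c[r0 + i][c0:c1] = tile_row       # copy the block into its place in C
    return c


# Small check: the blocked result matches the direct product.
a = [[1, 2, 3], [4, 5, 6], [7, 8, 9], [10, 11, 12]]   # 4 x 3
b = [[1, 0, 2, 1], [0, 1, 1, 3], [2, 1, 0, 1]]        # 3 x 4
print(blocked_matmul(a, b) == matmul(a, b))
```

If the blocks are still too large for the device, the same idea applies with a finer split (more row blocks of A and column blocks of B), since each output block only depends on its own pair of inputs.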
