shingoxlf
Journeyman III

arbitrary size matrix multiplication

Does anyone know of a fast matrix multiplication algorithm or code for the GPU that works for arbitrary matrix sizes?

The matrix multiplication sample from the SDK seems to work only when the input matrix dimensions are multiples of 16. For example, if the input matrix is 127X127, it returns wrong results.

2 Replies
dmeiser
Elite

You could try the AMD OpenCL BLAS library:

http://developer.amd.com/tools/heterogeneous-computing/amd-accelerated-parallel-processing-math-libr...

You might want to check out the gemm family of functions in that library (sgemm, dgemm, etc.)
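For checking GPU results at awkward sizes such as 127X127, it can help to have a plain CPU reference of what the gemm family computes, namely C = alpha * A * B + beta * C for an M x K matrix A and a K x N matrix B. The sketch below is only that reference in pure Python, not the library's API; the function name `gemm_reference` and the list-of-lists layout are assumptions made for illustration.

```python
def gemm_reference(alpha, a, b, beta, c):
    """Return alpha * (a * b) + beta * c for row-major lists of lists.

    Hypothetical helper for validating GPU output; it makes no assumption
    about the sizes being multiples of 16 or of anything else.
    """
    m, k, n = len(a), len(b), len(b[0])
    if any(len(row) != k for row in a):
        raise ValueError("inner dimensions of a and b do not match")
    if len(c) != m or any(len(row) != n for row in c):
        raise ValueError("c must be an m x n matrix")

    out = []
    for i in range(m):
        row = []
        for j in range(n):
            acc = 0.0
            for p in range(k):
                acc += a[i][p] * b[p][j]
            # Scale the product and add the scaled input C, as gemm does.
            row.append(alpha * acc + beta * c[i][j])
        out.append(row)
    return out


if __name__ == "__main__":
    a = [[1, 2], [3, 4]]
    b = [[5, 6], [7, 8]]
    zero = [[0, 0], [0, 0]]
    print(gemm_reference(1.0, a, b, 0.0, zero))
```

This is slow for large matrices, so it is only useful as a correctness check against whatever the GPU routine returns.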

himanshu_gautam
Grandmaster

Yes, that's right. Check out the gemm routine in the clAmdBlas library.

As a suggestion, I would not recommend doing a 127X127 matrix multiplication on GPUs as is. It may be better to increase the size of the matrices by adding some padding so that the dimensions become a multiple of 2 (for example, 128X128). The work distribution can be quite suboptimal for odd-sized matrices.
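A minimal sketch of the padding idea, in plain Python on the host side: round each dimension up to the next multiple of some block size, fill the extra rows and columns with zeros, multiply, and crop the result back to the original size. Zero padding does not change the top-left block of the product. The helper names (`round_up`, `pad_matrix`, `crop`, `matmul`) and the block size of 16 are assumptions for illustration; `matmul` here is just a CPU stand-in for the GPU kernel.

```python
def round_up(n, multiple):
    """Smallest multiple of `multiple` that is >= n."""
    return ((n + multiple - 1) // multiple) * multiple


def pad_matrix(mat, multiple):
    """Return a zero-padded copy whose dimensions are multiples of `multiple`."""
    rows, cols = len(mat), len(mat[0])
    prows, pcols = round_up(rows, multiple), round_up(cols, multiple)
    padded = [list(row) + [0.0] * (pcols - cols) for row in mat]
    padded += [[0.0] * pcols for _ in range(prows - rows)]
    return padded


def crop(mat, rows, cols):
    """Keep only the top-left rows x cols block."""
    return [row[:cols] for row in mat[:rows]]


def matmul(a, b):
    """Plain CPU product, standing in for the GPU multiplication."""
    k, n = len(b), len(b[0])
    return [[sum(row[p] * b[p][j] for p in range(k)) + 0.0 for j in range(n)]
            for row in a]


if __name__ == "__main__":
    size, block = 127, 16  # 127 is padded up to 128
    a = [[(i + j) % 3 for j in range(size)] for i in range(size)]
    b = [[(i * j) % 5 for j in range(size)] for i in range(size)]
    padded_product = matmul(pad_matrix(a, block), pad_matrix(b, block))
    result = crop(padded_product, size, size)
    print(len(padded_product), len(result), result == matmul(a, b))
```

The cost is a little extra memory and a few wasted multiply-adds on zeros, which is usually small next to the benefit of evenly sized work-groups.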
