I'm using clAmdBlas version 1.10.274 over Intel OpenCL SDK 3.2 on Windows. For now I'm trying to get a simple program running on my Core i7 920 CPU, but once it is debugged I intend to run it on a GPU.
During the call to clAmdBlasSgemm, I occasionally get warnings (e.g. "2 warnings generated") and more seriously, errors that cause the kernel to be printed and the method to fail with CL_BUILD_PROGRAM_FAILURE. The error is always:
53:95:63: error: can't convert between vector values of different size ('uint8' and 'unsigned long')
Significantly, the errors only occur for specific combinations of arguments to Sgemm. After experimenting with many different calls, the failure appears to require both of the following: 1) transA is true and transB is false, and 2) the matrices have particular dimensions, with no clear pattern that I can see. For example, when A is 24x23 and B is 24x25, I get the correct result multiplying A (transposed) by B. But when A is 25x24 and B is 25x26, Sgemm fails. In all cases, I'm using column-major order and the leading dimension equals the number of rows.
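To make the failing case concrete, here is a minimal sketch in C of how the matrix shapes above map to the Sgemm dimension arguments, plus a plain CPU reference for checking results. It assumes the usual BLAS convention for column-major storage with transA set and transB unset. GemmDims, gemm_dims_tn and ref_sgemm_tn are hypothetical helper names used only for illustration; they are not part of clAmdBlas, and this is not the code that triggers the build error.

```c
#include <stddef.h>

/* Hypothetical helper type (not part of clAmdBlas): the dimension
 * arguments a BLAS-style SGEMM call expects. */
typedef struct {
    size_t M, N, K;       /* C is M x N, inner dimension is K */
    size_t lda, ldb, ldc; /* leading dimensions (column-major) */
} GemmDims;

/* Derive the arguments for C = A^T * B in column-major order, where A is
 * stored as a_rows x a_cols and B as b_rows x b_cols, and each leading
 * dimension equals the stored number of rows (assumption taken from the
 * description above). Returns 0 on success, -1 if the shapes disagree. */
static int gemm_dims_tn(size_t a_rows, size_t a_cols,
                        size_t b_rows, size_t b_cols, GemmDims *out)
{
    if (a_rows != b_rows)
        return -1;        /* inner dimensions of A^T and B must match */
    out->M = a_cols;      /* rows of A^T */
    out->N = b_cols;      /* columns of B */
    out->K = a_rows;      /* shared inner dimension */
    out->lda = a_rows;
    out->ldb = b_rows;
    out->ldc = a_cols;    /* C is M x N, stored with M rows */
    return 0;
}

/* Plain CPU reference for C = A^T * B (alpha = 1, beta = 0), useful for
 * comparing against the library result in the cases that do run. */
static void ref_sgemm_tn(const GemmDims *d, const float *A,
                         const float *B, float *C)
{
    for (size_t j = 0; j < d->N; ++j) {
        for (size_t i = 0; i < d->M; ++i) {
            float acc = 0.0f;
            for (size_t k = 0; k < d->K; ++k)
                acc += A[k + i * d->lda] * B[k + j * d->ldb]; /* A^T(i,k) = A(k,i) */
            C[i + j * d->ldc] = acc;
        }
    }
}
```

Under those assumptions, the failing case (A = 25x24, B = 25x26) corresponds to M=24, N=26, K=25, lda=ldb=25, ldc=24, and the working case (A = 24x23, B = 24x25) to M=23, N=25, K=24, lda=ldb=24, ldc=23.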
I've attached the dumped kernel in the A=25x24 and B=25x26 case.
Any help greatly appreciated!