About acmlgpu-1-1-1

Discussion created by fruitfly1026 on Aug 17, 2010
Latest reply on Aug 19, 2010 by fruitfly1026
A question about the libCALBLAS sample source in acmlgpu1-1-1


   I was looking at GEMM_Shaders.h in acmlgpu1-1, which is used to build the libCALBLAS library in the "./src/libCALBLAS" subdirectory.

   The "szDGEMM_Mult" kernel has 8 inputs (4 for A and 4 for B) and 8 outputs (8 'o' registers for C). Why does the declaration part declare only 4 'o' registers, and why does the kernel still compile and run without problems? Also, when I change the declaration to 8 outputs, it still compiles and runs correctly. However, when I change the kernel to "il_cs_2_0", compilation fails.

   I'm confused now. Thank you for any replies.