laobrasuca

problem with clAmdBlasSgemm

Discussion created by laobrasuca on May 25, 2011
Latest reply on May 26, 2011 by laobrasuca
incorrect results

Hi all,

I need an SGEMM implementation on the GPU, so I tried AMD's clAmdBlasSgemm (Windows 7 Pro 64-bit, clAmdBlas-1.2.78, Radeon HD 5770, driver 11.5, SDK 2.4). While comparing results, I noticed that they are not correct. The sample "example_sgemm.c" that ships with the library computes C = 10*A*B + 20*C, with:

A[] = {
11, 12, 13, 14, 15,
21, 22, 23, 24, 25,
31, 32, 33, 34, 35,
41, 42, 43, 44, 45}

B[] = {11, 12, 13,
    21, 22, 23,
    31, 32, 33,
    41, 42, 43,
    51, 52, 53}

C[] = {
11, 12, 13,
21, 22, 23,
31, 32, 33,
41, 42, 43}

with result

14420, 15790, 17060,
25120, 27490, 29760,
35820, 39190, 42460,
46520, 50890, 55160

while the correct result is:

21370, 22040, 22710, 
37070, 38240, 39410, 
52770, 54440, 56110, 
68470, 70640, 72810
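
To make the comparison reproducible, here is a minimal CPU sketch of C = alpha*A*B + beta*C, assuming row-major storage and no transposition, with M = 4, N = 3, K = 5 read off the shapes of the arrays above. The names ref_sgemm and sample_mismatches are hypothetical helpers for illustration, not part of clAmdBlas, and the plain triple loop says nothing about how the library itself computes the product.

```c
#include <assert.h>
#include <stddef.h>

/* CPU reference: C = alpha * A * B + beta * C, row-major, no transposes.
   A is M x K, B is K x N, C is M x N; lda/ldb/ldc are row strides.
   ref_sgemm is a hypothetical helper name, not a clAmdBlas function. */
void ref_sgemm(size_t M, size_t N, size_t K, float alpha,
               const float *A, size_t lda, const float *B, size_t ldb,
               float beta, float *C, size_t ldc)
{
    assert(lda >= K && ldb >= N && ldc >= N);
    for (size_t i = 0; i < M; ++i) {
        for (size_t j = 0; j < N; ++j) {
            float acc = 0.0f;
            for (size_t k = 0; k < K; ++k)
                acc += A[i * lda + k] * B[k * ldb + j];
            C[i * ldc + j] = alpha * acc + beta * C[i * ldc + j];
        }
    }
}

/* Recomputes the sample (alpha = 10, beta = 20) on the CPU and returns how
   many elements differ from the values quoted above as correct. */
int sample_mismatches(void)
{
    const float A[20] = {11, 12, 13, 14, 15,
                         21, 22, 23, 24, 25,
                         31, 32, 33, 34, 35,
                         41, 42, 43, 44, 45};
    const float B[15] = {11, 12, 13,
                         21, 22, 23,
                         31, 32, 33,
                         41, 42, 43,
                         51, 52, 53};
    float C[12] = {11, 12, 13,
                   21, 22, 23,
                   31, 32, 33,
                   41, 42, 43};
    const float expected[12] = {21370, 22040, 22710,
                                37070, 38240, 39410,
                                52770, 54440, 56110,
                                68470, 70640, 72810};
    int bad = 0;

    ref_sgemm(4, 3, 5, 10.0f, A, 5, B, 3, 20.0f, C, 3);
    for (size_t i = 0; i < 12; ++i)
        if (C[i] != expected[i])
            ++bad;
    return bad;
}
```

Calling sample_mismatches() from a small main and printing the return value is enough to check a host-side expectation before comparing it against the GPU output. All values here are small integers, so the float comparison is exact.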

I've also tested some other cases; here's a simple one:

A[] = {
1, 2, 3,
1, 2, 3,
1, 2, 3}

B[] = {
1, 1, 1,
2, 2, 2,
3, 3, 3}

With transA = transB = clAmdBlasNoTrans, M = N = K = lda = ldb = ldc = 3, alpha = 1 and beta = 0, I get:

6, 12, 18,
6, 12, 18,
6, 12, 18

while it should be 14 everywhere. Am I using it incorrectly, or is something else going on?
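
For this 3x3 case, a small CPU sketch can show both the expected product and what the same loop gives when B is read transposed. It assumes row-major storage; ref_sgemm_square and small_case are hypothetical helper names, not clAmdBlas calls, and the trans_b flag is only an illustration device, not a claim about what the library does internally.

```c
#include <assert.h>
#include <stddef.h>

/* CPU reference for n x n row-major matrices:
   C = alpha * A * op(B) + beta * C, where op(B) is B (trans_b == 0)
   or its transpose (trans_b != 0). Hypothetical helper for illustration. */
void ref_sgemm_square(size_t n, int trans_b, float alpha, const float *A,
                      const float *B, float beta, float *C)
{
    assert(n > 0);
    for (size_t i = 0; i < n; ++i) {
        for (size_t j = 0; j < n; ++j) {
            float acc = 0.0f;
            for (size_t k = 0; k < n; ++k) {
                float b = trans_b ? B[j * n + k] : B[k * n + j];
                acc += A[i * n + k] * b;
            }
            /* As in BLAS, C is not read when beta is zero. */
            C[i * n + j] = (beta == 0.0f) ? alpha * acc
                                          : alpha * acc + beta * C[i * n + j];
        }
    }
}

/* Runs the 3x3 case from the post (alpha = 1, beta = 0) into out[9]. */
void small_case(int trans_b, float out[9])
{
    const float A[9] = {1, 2, 3,
                        1, 2, 3,
                        1, 2, 3};
    const float B[9] = {1, 1, 1,
                        2, 2, 2,
                        3, 3, 3};
    ref_sgemm_square(3, trans_b, 1.0f, A, B, 0.0f, out);
}
```

With trans_b = 0 every element comes out as 14, matching the expectation above. With trans_b = 1 each row comes out as 6, 12, 18, which is the output I reported. Since the transpose of B equals A in this test, that is only a hint of a transpose or storage-order mix-up somewhere, not a diagnosis; a test with less symmetric matrices would narrow it down.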
