
miktis
Journeyman III

clAmdFft generates invalid kernels

I use clAmdFft version 1.8.239 (Win7 64).
Using this command
clAmdFft.Client -x 4096 -y 1 -z 1 --inLayout 2 --outLayout 2 -d
the library generates the file clAmdFft.kernel.Stockham1.cl.

Now let's look at the end of the function FwdPass0:

if(rw)
{
bufOutRe[outOffset + ( ((2*me + 0)/1)*8 + (2*me + 0)%1 + 0 )*1] = B2C0R0.s0; //line 4234
// more lines like these
bufOutIm[outOffset + ( ((2*me + 1)/1)*8 + (2*me + 1)%1 + 7 )*1] = B2C0I7.s1; //line 4265
}

rw is always 1.
bufOutRe points to __local float lds[0]
bufOutIm points to __local float lds[4096]
outOffset is always 0.
me is the local id that goes from 0 to 255.
So if me=255 this makes ((2*255+1)/1)*8 + (2*255+1)%1 + 7 = 4606.
That means that the real part overwrites the imaginary part above __local float lds[4096], and
also that the code accesses, among others, __local float lds[8702], which does NOT exist.
This pattern for the index is found also in function InvPass0.
The funny thing is that the test reports "PASS".
I don't know if I'm missing something.

1 Solution

Hi,

The expression '((2*255+1)/1)*8 + (2*255+1)%1 + 7' evaluates to 4095, not 4606. Any integer modulo 1 is 0, so the expression reduces to 511*8 + 0 + 7.
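For anyone who wants to check the arithmetic for every work-item rather than just me=255, here is a minimal sketch in plain C (not OpenCL). The function names are hypothetical, and the loop ranges (k in 0..1, r in 0..7) are an assumption inferred from the two store lines quoted in the question; the dumped kernel itself is not reproduced here.

```c
#include <assert.h>

/* Index expression copied from the quoted FwdPass0 stores:
 *   ((2*me + k)/1)*8 + (2*me + k)%1 + r
 * me is the local id, k is 0 or 1, r is 0..7 (ranges are assumed
 * from the two lines shown in the question). Names are hypothetical. */
int store_index(int me, int k, int r)
{
    assert(me >= 0 && me < 256);        /* 256 work-items per the post */
    int t = 2 * me + k;
    return (t / 1) * 8 + (t % 1) + r;   /* t % 1 is 0 for every integer t */
}

/* Largest index produced by any work-item and any (k, r) pair. */
int max_store_index(void)
{
    int max = 0;
    for (int me = 0; me < 256; ++me)
        for (int k = 0; k < 2; ++k)
            for (int r = 0; r < 8; ++r) {
                int idx = store_index(me, k, r);
                if (idx > max)
                    max = idx;
            }
    return max;
}
```

Under these assumptions the largest index is 4095, so with the base offsets given in the question the real stores stay within lds[0..4095] and the imaginary stores within lds[4096..8191].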

The ability to dump the kernels is provided only for special-case debugging. It is not really meant for users to consume the kernels directly. If you are trying to compute FFT transforms with our libraries and need support, please use the library API functions. The documentation manual explains how the library can be used.

2 Replies


Yes, you are correct.

Thank you.
