I am using Linux x86-64, CAL 1.2.1beta-1 and the latest driver on a single 4870.
/amdcal/samples/runtime/exportspeed -r 1 -w 256 -h 256 -e passed.
However, exportspeed -r 1 -w 16 -h 16 -e failed.
Could someone please verify this?
I suspect something is wrong with the global buffer, as I have run into another problem too.
The simple kernel below just copies one element from the input buffer (the global buffer) to the output buffer. However, I fail to get correct results when using the options "-w 4 -h 16 -e". Say the third element (float4) of the input is 1.0, 0.0, 1.0, 1.0. The result I get can be 0.0, 0.0, 0.0, 0.0 or other strange numbers such as 0.089. Sometimes I also get the correct result, but the result changes every time I run the program.
The input buffer is 2D float4 and the output is 2D float4 (w/4 by h). The execution domain is (0, 0, w/4, h). However, "-w 16 -h 16 -e" always gives me the correct result.
const CALchar* ILKernel =
"dcl_literal l0, 0x00000002, 0x40490FDB, 0x40490FDB, 0x40490FDB\n"
"mov r0, g[l0.x]\n"
"mov o0, r0\n";
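For reference, here is a minimal CPU-side sketch of the result the kernel above should produce. It assumes the global buffer is addressed linearly in float4 units, so that l0.x = 2 selects the third element. The names `float4`, `domain_elements` and `count_mismatches` are hypothetical helpers for illustration and are not part of CAL.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for one float4 element; not a CAL type. */
typedef struct { float x, y, z, w; } float4;

/* Number of float4 elements in the execution domain (0, 0, w/4, h).
 * Assumes w is a multiple of 4, as in the "-w 4 -h 16" case. */
static size_t domain_elements(size_t w, size_t h)
{
    assert(w % 4 == 0);
    return (w / 4) * h;
}

/* The kernel writes g[index] to every output element, so a correct run
 * leaves all 'count' entries of 'out' equal to g[index].
 * Returns how many output elements differ from that expected value. */
static size_t count_mismatches(const float4 *g, size_t index,
                               const float4 *out, size_t count)
{
    size_t bad = 0;
    for (size_t i = 0; i < count; ++i) {
        if (out[i].x != g[index].x || out[i].y != g[index].y ||
            out[i].z != g[index].z || out[i].w != g[index].w)
            ++bad;
    }
    return bad;
}
```

With "-w 4 -h 16" the domain holds domain_elements(4, 16) = 16 float4 values, and a correct run should give count_mismatches(input, 2, output, 16) == 0. This only models tightly packed host arrays; if the mapped CAL resource has a row pitch larger than w/4, the rows would need to be walked with that pitch instead.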
Another example is memimport_matmul.
If you try -r 1 -w 16 -h 16 -e, the verification passes.
However, if you try -r 1 -w 16 -h 8 -e, the verification fails.