

lipi
Journeyman III

Possible IL bug

vObjIndex0 in scatter_IL

 

The following IL code from the scatter_IL example seems to write only half of the global buffer locations:

const CALchar* ILKernel =
"il_ps_2_0\n"
"dcl_input vObjIndex0\n" // vObjIndex starts at 0 and increments by 1.
"mov g[vObjIndex0.x], vObjIndex0.x\n"
"end\n";

With the global buffer preinitialized to 0xffffffff, the output is:

00000000 00000000 00000000 00000000
00000001 00000001 00000001 00000001
ffffffff ffffffff ffffffff ffffffff
ffffffff ffffffff ffffffff ffffffff
00000004 00000004 00000004 00000004
00000005 00000005 00000005 00000005
ffffffff ffffffff ffffffff ffffffff
ffffffff ffffffff ffffffff ffffffff

 


Because of how the hardware handles vObjIndex in pixel shader mode, you will see this behavior if the height of your buffer is not a multiple of 2.

 

Thanks Micah, I had a height of 1 indeed.

I changed the kernel to use a compute shader and the thread ID instead of the object index, and now it works correctly.
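For readers who want to try the same switch, here is a rough, untested sketch of what a compute-shader counterpart of the original kernel might look like. It only mirrors the structure of the pixel shader version from the first post, swapping vObjIndex0 for vaTid0. The name ILKernelCS, the thread group size of 64, and the use of plain const char* instead of CALchar are assumptions made for illustration, not something taken from the scatter_IL sample.

```c
/* Hypothetical compute-shader variant of the scatter kernel (untested sketch).
 * ILKernelCS is an illustrative name; the real sample uses CALchar from cal.h. */
const char *ILKernelCS =
    "il_cs_2_0\n"
    "dcl_num_thread_per_group 64\n"   /* group size of 64 is an assumption */
    "mov g[vaTid0.x], vaTid0.x\n"     /* vaTid0.x: absolute thread ID */
    "end\n";
```

A compute shader is launched over a grid of thread groups rather than a pixel domain, so the host-side launch code has to change as well.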


>you will see this behavior if you do not have a multiple of 2 height in your buffer.

 

As I understand it, it is not the buffer size that must be a multiple of 2 but the domain invocation size, am I right?

I.e., if I run the program over a domain like this:

 CALdomain domain = {0, 0, 4096, 4096};
 if (calCtxRunProgram(&e, ctx, func, &domain) != CAL_RESULT_OK) { ... }

will vObjIndex0 take every value from 0 to 4096^2 - 1, no matter how the buffers were declared?

 

It really is easier to go to compute shader mode and use vaTid0 instead of vObjIndex0. Unfortunately there are other problems with cs, so I was forced to return to ps.

 


As we have absolutely terrific support from AMD/ATI on this forum, I'll answer this myself:

 

No, vObjIndex0 won't be valid. With a domain size bigger than 512 elements, the behaviour of vObjIndex0 becomes totally unpredictable: any element can be skipped, not just odd or even ones. If you aren't checking the whole buffer, you can easily miss this error. So there is no point in using vObjIndex0 at all.
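Since the skipped elements are easy to miss, a full scan of the read-back buffer is worth having. Below is a minimal sketch of such a check, assuming each element holds four 32-bit components (as in the dump in the first post) and that element i should contain i in its first component. The function name count_unwritten and its parameters are made up for this example and are not part of CAL.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Scan every element of a read-back global buffer and count the ones
 * whose first component does not equal the element index.
 * Assumes four 32-bit components per element (an assumption based on
 * the dump in the first post). Prints at most max_report mismatches
 * to stderr and returns the total number of mismatches. */
size_t count_unwritten(const uint32_t *buf, size_t num_elements,
                       size_t max_report)
{
    assert(buf != NULL);

    size_t bad = 0;
    for (size_t i = 0; i < num_elements; ++i) {
        uint32_t got = buf[4 * i];          /* .x component of element i */
        if (got != (uint32_t)i) {
            if (bad < max_report) {
                fprintf(stderr, "element %zu: expected %08x, got %08x\n",
                        i, (unsigned)i, (unsigned)got);
            }
            ++bad;
        }
    }
    return bad;
}
```

Calling it on the mapped output with num_elements set to the full domain size (width times height) and checking for a return value of 0 covers the whole buffer instead of just the first few rows.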

 

And no, it's not corrupted memory on the GPU, since switching to compute shader mode and using vaTid0 gives totally correct results.
