Archives Discussions

lerlinghagen · ‎02-15-2015

Hi everyone,

I'm having issues when using __local variables in the attached kernel. When run in a loop with constant input data, the output differs at semi-random locations. On a Tahiti GPU, the data starts to differ in a random iteration at a random location. On a Tonga GPU, the data differs in the second iteration at a fixed location. In both cases, the inconsistency starts at memory addresses written by work-items with get_local_id(1) >= 64. For my use case, I'd expect the contents of 'sums' and 'textures' to be the same in each iteration.

Here is the relevant input data:

IMAGE_WIDTH is defined to be 680

IMAGE_HEIGHT is defined to be 512

NUM_DISP is defined to be 112

WINDOW_SIZE is defined to be 5

Global work size is (IMAGE_HEIGHT, NUM_DISP); work-group size is (1, NUM_DISP).

left, right, sums, textures, and prefilterCap are identical for each kernel run.

Both GPUs use the latest non-beta Catalyst drivers.

The inconsistency disappears for NUM_DISP <= 64, and also when the kernel runs on a CPU device. Did I miss a barrier call somewhere? As far as I can see, all work-items hit every barrier and only read the local variable's contents after the barrier.
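For context on the NUM_DISP <= 64 threshold, here is a small arithmetic sketch in plain C. It assumes a wavefront size of 64 work-items, which is the hardware width on GCN GPUs such as Tahiti and Tonga; the helper names are hypothetical and not part of the attached kernel. Since the work-group size is (1, NUM_DISP), the flattened local ID equals get_local_id(1), so the sketch shows that a group of 112 work-items spans two wavefronts and that IDs >= 64 fall into the second one.

```c
#include <stddef.h>

/* Assumption: 64 work-items per wavefront (GCN hardware such as Tahiti/Tonga). */
enum { ASSUMED_WAVEFRONT_SIZE = 64 };

/* Hypothetical helper: number of wavefronts needed to hold one work-group
 * (ceiling division of the group size by the wavefront size). */
static size_t wavefronts_per_group(size_t group_size, size_t wavefront_size)
{
    return (group_size + wavefront_size - 1) / wavefront_size;
}

/* Hypothetical helper: index of the wavefront that a flattened local ID
 * falls into. With a work-group size of (1, NUM_DISP), the flattened
 * local ID is simply get_local_id(1). */
static size_t wavefront_of(size_t flat_local_id, size_t wavefront_size)
{
    return flat_local_id / wavefront_size;
}
```

With NUM_DISP = 112, wavefronts_per_group(112, 64) is 2, and wavefront_of(64, 64) is 1. A group that fits in a single wavefront runs in lockstep, which may hide a missing barrier; with two wavefronts, any read or write of __local memory that is not separated by barrier(CLK_LOCAL_MEM_FENCE) can race between them. Whether that is what happens in the attached kernel cannot be determined from this description alone.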

dipak · ‎04-22-2015

Hi lerlinghagen,

My apologies for this late reply.

Has your problem been resolved? If not, please provide the complete project (including the host-side code) so that we can run it on our end. Also, please include your setup details, such as OS, driver, SDK, and GPU.

Regards,

lerlinghagen · ‎06-25-2015

The code in question has been refactored so the issue is no longer relevant. If it comes up again, I will provide a project for testing.

dipak · ‎06-25-2015

Sure. Until then, it's better to mark this thread as "assumed answered". You can revive it at any time if needed.

Regards,

__local variable inconsistency issue