Is a mem_fence supposed to be needed here?

Discussion created by arsenm on Aug 7, 2011
Latest reply on Aug 7, 2011 by maximmoroz

I'm trying to figure out if my understanding of mem_fence() is wrong based on a problem I'm having.

My understanding of mem_fence() is that it controls the order in which memory operations become visible to other work-items in a work-group, and that memory is always consistent within a single work-item. I've run into a situation where I'm not sure I'm interpreting "consistent within a single work-item" correctly. My problem is that, within a single work-item, a later read from a previously written address returns the old value that the write should have replaced. No other work-items touch that address. If I put a global mem_fence() between the write and the later read, I get the correctly written value.

I'm using SDK 2.5 and Catalyst 11.7 on Linux x86_64 with a Radeon 6970.

Here is pseudocode for what I'm seeing:

__global volatile int* buffer;

i = calculate some index into buffer
buffer[i] = x          // suppose buffer[i] originally contains 'a'
// mem_fence(GLOBAL)   // this works correctly if I place this here
j = some other index calculation   // j happens to be equal to i
y = buffer[j]          // y here is equal to a, and not the recently written x
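
For reference, the behavior expected from a single work-item can be modeled in plain host C. This is only a sketch of the expectation, not the actual kernel: the function name write_then_read, the buffer size BUF_LEN, and the parameter names are hypothetical, and the code runs on the CPU, so it says nothing about what the GPU compiler or hardware actually does. It just shows that, with program order respected and no fence, a read from the same address should return the value that was just written.

```c
/* Sequential model of ONE work-item, written as ordinary host C.
 * Hypothetical names throughout; this is not the real kernel. */
enum { BUF_LEN = 8 };  /* hypothetical buffer size */

/* Fill a buffer with the old value a, write x at index i, then read
 * index j. Both i and j must be in the range [0, BUF_LEN). */
int write_then_read(int a, int x, int i, int j)
{
    volatile int buffer[BUF_LEN];

    for (int k = 0; k < BUF_LEN; ++k)
        buffer[k] = a;      /* every slot originally contains 'a' */

    buffer[i] = x;          /* the write */
    /* no fence here: program order alone is expected to be enough */
    return buffer[j];       /* the later read; yields x when j == i */
}
```

With j equal to i the model returns x, which is the result I expected from the kernel without the fence; with a different j it returns the original a.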