Archives Discussions

arsenm · ‎08-07-2011

I'm trying to figure out if my understanding of mem_fence() is wrong based on a problem I'm having.

My understanding of mem_fence() is that it controls the order in which memory operations become visible to other work items in a workgroup, and that memory is supposed to be consistent within a single work item. I've run into a situation where I'm not sure I'm interpreting "consistent within a single work item" correctly. My problem is that, within a single work item, a later read from a previously written address returns the old, overwritten value. Other work items aren't touching the address. If I put a global mem_fence() between the write and the later read, I get the correct written value.

I'm using SDK 2.5, Catalyst 11.7, Linux x86_64 on a Radeon 6970

Here is pseudocode for what I'm seeing:

    __global volatile int* buffer;

    i = calculate some index into buffer
    buffer[i] = x    // suppose buffer[i] originally contains 'a'
    // mem_fence(GLOBAL)    this works correctly if I place this here
    j = some other index calculation    // j happens to be equal to i
    y = buffer[j]    // y here is equal to a, and not the recently written x

maximmoroz · ‎08-07-2011

mem_fence limits how the compiler may reorder reads and writes. It is perfectly valid to use mem_fence even if you don't need to synchronize between work-items of the same work-group.

In your particular case, if you don't specify a memory fence, the compiler might generate code that reads buffer[j] before writing to buffer[i], since this reordering might speed up the kernel by bunching global memory read instructions together in a single clause.
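
For reference, here is a minimal sketch of where the fence would sit in the pattern from the question. It only illustrates the placement that worked for the original poster; it does not settle whether the fence ought to be required. The names write_then_read and fence_demo are hypothetical, and the block under #ifndef __OPENCL_VERSION__ is an assumed host-side stand-in so the same text also compiles as plain C for a quick check. It is not part of OpenCL.

```c
#ifndef __OPENCL_VERSION__
/* Host-side stand-ins (assumptions for illustration only, not OpenCL). */
#include <stdatomic.h>
#define __kernel
#define __global
#define CLK_GLOBAL_MEM_FENCE 0x2
static void mem_fence(int flags)
{
    (void)flags;
    atomic_thread_fence(memory_order_seq_cst);
}
#endif

/* Hypothetical helper: write x to buffer[i], then read buffer[j].
   The fence orders the store before the later load within this work-item. */
int write_then_read(__global volatile int *buffer, int i, int j, int x)
{
    buffer[i] = x;
    mem_fence(CLK_GLOBAL_MEM_FENCE);
    return buffer[j]; /* when j == i, this should be x */
}

/* Hypothetical kernel, assumed to be launched with a single work-item. */
__kernel void fence_demo(__global volatile int *buffer,
                         __global int *out, int i, int j, int x)
{
    out[0] = write_then_read(buffer, i, j, x);
}
```

In a real kernel i and j would come from the index calculations mentioned in the question; they are passed as arguments here only to keep the sketch short.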


Is a mem_fence supposed to be needed here?