spectral
Adept II

barrier or write_mem_fence

Hi,

I have some doubts about the use of 'write_mem_fence' and 'barrier'.

Here is my code :

__local WorkItemData workItemData;
const size_t lid = get_local_id(0);

if (lid < 1)
{
    workItemData.sky         = sky;
    workItemData.lightsCount = lightsCount;
    workItemData.emitters    = emitters;
}

//---- Sync: wait until 'workItemData' is fully initialized
barrier(CLK_LOCAL_MEM_FENCE);
//write_mem_fence(CLK_LOCAL_MEM_FENCE);

The goal is to ensure that the variable 'workItemData' is fully initialized. Can I use write_mem_fence instead? Will it be faster and more efficient?

Is there some other way to improve performance in this case? For example, if I put more operations between the initialization and the first use, I sometimes don't need to synchronize (but that is dangerous!). Maybe there are other ways.

Thanks

Accepted Solutions
LeeHowes
Staff

Re: barrier or write_mem_fence

As you point out, it's a synchronization issue, so it needs a synchronization operation. In OpenCL the only synchronization operations are barriers. The fence would only guarantee that the compiler and hardware won't reorder writes across the fence; it won't ensure that work items wait for each other.
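A minimal sketch of the pattern discussed here: one work item fills a __local struct, and every work item in the group then waits at the barrier before reading it. The layout of WorkItemData, the kernel name init_shared and the out buffer are assumptions made for illustration; they do not come from the original post. The #ifndef section is only a host-side stand-in so the sketch can also be built as plain C with a single work item.

```c
/* Sketch only: WorkItemData's fields, the kernel name and `out` are
   illustrative assumptions, not the original poster's code. */
#ifndef __OPENCL_VERSION__
/* Host-side stand-ins so the sketch also builds as plain C and runs as
   a single work item; an OpenCL compiler skips this section. */
#include <stddef.h>
typedef unsigned int uint;
#define __kernel
#define __global
#define __local static
#define CLK_LOCAL_MEM_FENCE 1
static size_t get_local_id(uint dim)  { (void)dim; return 0; }
static size_t get_global_id(uint dim) { (void)dim; return 0; }
static void barrier(int flags) { (void)flags; }
#endif

typedef struct {
    __global const float *sky;       /* pointer to a global buffer */
    __global const float *emitters;  /* pointer to a global buffer */
    uint lightsCount;
} WorkItemData;

__kernel void init_shared(__global const float *sky,
                          __global const float *emitters,
                          uint lightsCount,
                          __global float *out)
{
    __local WorkItemData workItemData;

    /* Only the first work item of the group writes the shared struct. */
    if (get_local_id(0) == 0) {
        workItemData.sky         = sky;
        workItemData.emitters    = emitters;
        workItemData.lightsCount = lightsCount;
    }

    /* Every work item in the group stops here until all have arrived,
       and the local-memory writes above are visible afterwards.
       write_mem_fence() alone would order the writer's own stores but
       would not make the other work items wait. */
    barrier(CLK_LOCAL_MEM_FENCE);

    /* Safe to read the shared struct from any work item now. */
    out[get_global_id(0)] = workItemData.sky[0] * (float)workItemData.lightsCount;
}
```

The barrier is placed outside the if block on purpose: all work items in the group have to reach it, not only the one that did the writes.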

Given that the hardware runs 64 work items simultaneously (one wavefront), if you only have 64 work items in your group you could in theory drop the barrier. However, in that case the compiler will drop it for you anyway, so in practice you might as well leave it in. If you have more than 64 work items, you need the barrier to ensure that the other wavefronts wait for the one that did the writes.
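For the compiler to remove the barrier it has to know the group size at compile time. One standard way to state that is the reqd_work_group_size kernel attribute, sketched below. Whether a particular compiler then actually elides the barrier is an assumption worth checking in the generated ISA; the kernel name scale_by_shared, its arguments and the WG_64 macro are hypothetical. As before, the #else branch is only a host-side stand-in so the sketch builds as plain C with a single work item.

```c
/* Sketch only: kernel name, arguments and the WG_64 macro are
   illustrative assumptions. */
#ifdef __OPENCL_VERSION__
/* Promise the compiler that every work-group has exactly 64 work items. */
#define WG_64 __attribute__((reqd_work_group_size(64, 1, 1)))
#else
/* Host-side stand-ins so the sketch also builds as plain C and runs as
   a single work item. */
#include <stddef.h>
typedef unsigned int uint;
#define __kernel
#define __global
#define __local static
#define CLK_LOCAL_MEM_FENCE 1
#define WG_64
static size_t get_local_id(uint dim)  { (void)dim; return 0; }
static size_t get_global_id(uint dim) { (void)dim; return 0; }
static void barrier(int flags) { (void)flags; }
#endif

__kernel WG_64 void scale_by_shared(__global const float *in,
                                    float factor,
                                    __global float *out)
{
    __local float sharedFactor;

    if (get_local_id(0) == 0) {
        sharedFactor = factor;
    }

    /* Kept in the source for correctness on any device. With a fixed
       group size of one wavefront the compiler may be able to remove
       it; with larger groups it is required. */
    barrier(CLK_LOCAL_MEM_FENCE);

    const size_t gid = get_global_id(0);
    out[gid] = in[gid] * sharedFactor;
}
```

With this attribute the host has to enqueue the kernel with a local size of exactly 64, otherwise the enqueue fails.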

5 Replies
binying
Challenger

Re: barrier or write_mem_fence

What kind of data are sky, emitters and lightsCount? This post may be helpful, though:

http://devgurus.amd.com/message/1242922#1242922.

binying
Challenger

Re: barrier or write_mem_fence

"Will it be faster and more efficient?" I would compare the assembly generated for the two versions in Kernel Analyzer.

spectral
Adept II

Re: barrier or write_mem_fence

These are pointers to buffers, so my goal is to ensure they are available in all the work items.

I'm not sure that analyzing the binaries will help! It is a question about synchronization 😛, not instructions.

spectral
Adept II

Re: barrier or write_mem_fence

Thanks a lot, Lee, for your complete explanation.
