Archives Discussions

sonkanit · ‎04-09-2010

Unfortunately, my laptop has a 4570, which doesn't seem to fully support OpenCL; in particular, it lacks the int32 atomic operations.

Is there any other way to make sure that the global increment is done properly? I am thinking of something like a critical section in thread APIs.

I tried using barrier with a global flag, but the graphics card driver crashed. Maybe that is because of my code, since barrier is not called by every work-item.

Thank you in advance.

if(specificCondition)
{
//atom_inc(&globalValue);
globalValue++;
barrier(CLK_GLOBAL_MEM_FENCE);
}

davibu · ‎04-10-2010

Maybe something like:

barrier(CLK_GLOBAL_MEM_FENCE);

if(specificCondition)
{
//atom_inc(&globalValue);
globalValue++;
}

barrier(CLK_GLOBAL_MEM_FENCE);

sonkanit · ‎04-10-2010

Too bad, the condition is somewhat too complex for that. Maybe I will have to rewrite it.

lvella · ‎04-10-2010

Originally posted by: davibu

Maybe something like:

barrier(CLK_GLOBAL_MEM_FENCE);

if(specificCondition)
{
//atom_inc(&globalValue);
globalValue++;
}

barrier(CLK_GLOBAL_MEM_FENCE);

I cannot understand why this code would guarantee the atomicity of the increment. Are you sure about that?

I am also in need of this kind of synchronization, and it seems that the 4870 lacks atomic operations, too.

Is there a workaround for atom_or()?
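For what it's worth, here is a minimal sketch of why a plain globalValue++ can lose updates even with barriers around it. It is plain host-side C, not kernel code, and it only models two work-items by hand; WorkItem, load_step, add_and_store_step and the two simulate_* functions are names made up for this illustration. The increment is really a load, an add and a store, and barrier() only makes the work-items of one work-group wait for each other. It does not stop two of them from interleaving those steps.

```c
/* Hypothetical model: one shared counter, one private register per work-item. */
typedef struct {
    int reg; /* private copy of the counter held by one work-item */
} WorkItem;

/* Step 1 of globalValue++: read the shared value into a private register. */
void load_step(WorkItem *wi, const int *global_value) {
    wi->reg = *global_value;
}

/* Steps 2 and 3: add one privately, then write the result back. */
void add_and_store_step(const WorkItem *wi, int *global_value) {
    *global_value = wi->reg + 1;
}

/* Both work-items load before either one stores: one increment is lost. */
int simulate_interleaved(void) {
    int global_value = 0;
    WorkItem a, b;
    load_step(&a, &global_value);
    load_step(&b, &global_value);
    add_and_store_step(&a, &global_value);
    add_and_store_step(&b, &global_value);
    return global_value;
}

/* Each work-item completes its whole increment before the next one starts,
   which is what an atomic increment would guarantee. */
int simulate_serialized(void) {
    int global_value = 0;
    WorkItem a, b;
    load_step(&a, &global_value);
    add_and_store_step(&a, &global_value);
    load_step(&b, &global_value);
    add_and_store_step(&b, &global_value);
    return global_value;
}
```

In this model simulate_interleaved() ends with 1 although two increments ran, while simulate_serialized() ends with 2. A barrier before and after the if block does not rule out the first ordering.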

Fr4nz · ‎04-10-2010

Originally posted by: lvella

Is there a workaround for atom_or()?

One possible workaround is to coordinate thread writes through two or more variables in global memory that are shared across all the threads, and to order the writes according to the values these variables contain.

There are various mutex techniques available, as you can see here:

http://en.wikipedia.org/wiki/Mutual_exclusion

That said, I think a software mutex is going to be considerably slower than atomic ops, because you end up with "branchy" code... but it is better than nothing.
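A different route that avoids locking altogether, sketched here under assumptions: instead of every work-item updating one shared variable, each work-item writes only to its own slot in a global array, and the slots are combined afterwards in a second pass (on the host or in another kernel). This is not the mutex technique mentioned above but a swapped-in alternative, and it only fits when each work-item contributes a fixed number of values. The sketch is plain C that models the work-items with a loop; work_item, reduce_sum, reduce_or, run_all and the odd-number condition are hypothetical names and choices.

```c
#include <stddef.h>

/* Hypothetical stand-in for the kernel body: work-item `gid` writes only to
   its own slots, so no two work-items ever touch the same address. */
void work_item(size_t gid, const int *input, int *inc_slots, unsigned *or_slots) {
    int condition = (input[gid] % 2) != 0;               /* stands in for specificCondition */
    inc_slots[gid] = condition ? 1 : 0;                  /* replaces atom_inc(&globalValue) */
    or_slots[gid] = condition ? (1u << (gid % 32)) : 0u; /* replaces atom_or() */
}

/* Second pass: add up the per-work-item increments. */
int reduce_sum(const int *slots, size_t n) {
    int total = 0;
    for (size_t i = 0; i < n; ++i) {
        total += slots[i];
    }
    return total;
}

/* Second pass: OR together the per-work-item masks. */
unsigned reduce_or(const unsigned *slots, size_t n) {
    unsigned mask = 0u;
    for (size_t i = 0; i < n; ++i) {
        mask |= slots[i];
    }
    return mask;
}

/* Models launching n work-items and then combining their results. */
void run_all(const int *input, size_t n, int *inc_slots, unsigned *or_slots,
             int *count_out, unsigned *mask_out) {
    for (size_t gid = 0; gid < n; ++gid) {
        work_item(gid, input, inc_slots, or_slots);
    }
    *count_out = reduce_sum(inc_slots, n);
    *mask_out = reduce_or(or_slots, n);
}
```

In a real kernel gid would come from get_global_id(0) and the two slot arrays would be __global buffers sized to the global work size. The cost is extra memory and a second pass, but no work-item ever waits on another.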

nou · ‎04-10-2010

Will it work? With atom_cmpxchg() it is theoretically possible to build a global synchronization mutex, but I doubt it will work, because IMHO one SIMD core executes one work-group. So, for example, a 5870 runs 20 work-groups at the same time, but I think these work-groups must finish before it can run another 20. So I think global synchronization is not possible.


Workaround for the device that doesn't support atomic int32 operation