
redditisgreat · 02-15-2011

I implemented a barrier with atomic operations

My initial testing seems to indicate that it works.

Is there a way to do the same without forcing the compiler into "complete path" memory mode?

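    global uint sema = 0;   // pseudocode: a zero-initialized counter in global memory
    if( get_local_id(0)==0 ) atomic_inc( &sema );          // one arrival per work-group
    while( sema % num_groups )                             // wait until every group has arrived
        if( get_local_id(0)==0 ) atomic_add( &sema, 0 );   // atomic no-op used to re-read sema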
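For anyone who wants to try the idea as a kernel, here is a minimal OpenCL C 1.1 sketch of the same counter-based barrier. This is only one way to write it, not a vetted implementation. The names `global_barrier`, `two_phase`, `data` and `target` are made up for illustration. It assumes `sema` is a `__global uint` buffer that the host zeroed before launch, and that every work-group is resident on the device at the same time; otherwise the spin loop never terminates. It compares against a running `target` instead of using `% num_groups`, because with the modulo test a fast group can arrive at the next barrier before a slow group has seen the counter reach a multiple of `num_groups`. OpenCL 1.x gives no formal guarantee that global memory is consistent across work-groups inside a kernel, so this remains a hack.

```c
// Sketch only: assumes all work-groups are co-resident and *sema starts at 0.
void global_barrier(volatile __global uint *sema, uint target)
{
    // Make sure every work-item in this group has issued its global writes.
    barrier(CLK_GLOBAL_MEM_FENCE);

    if (get_local_id(0) == 0) {
        atomic_inc(sema);                      // this group has arrived
        while (atomic_add(sema, 0) < target)   // atomic add of 0 acts as a read
            ;                                  // spin until all groups have arrived
    }

    // Hold the rest of the group until work-item 0 has left the spin loop.
    barrier(CLK_GLOBAL_MEM_FENCE);
}

// Hypothetical kernel: each work-item writes one value, then reads one
// written by a work-item that may belong to a different work-group.
__kernel void two_phase(__global float *data, volatile __global uint *sema)
{
    size_t gid = get_global_id(0);
    size_t n   = get_global_size(0);

    data[gid] = (float)gid;                       // phase 1: write
    global_barrier(sema, get_num_groups(0));      // first use: target = 1 * num_groups
    float neighbour = data[(gid + 1) % n];        // phase 2: read

    global_barrier(sema, 2 * get_num_groups(0));  // second use: target = 2 * num_groups
    data[gid] = neighbour;
}
```

The second call passes `2 * get_num_groups(0)` because the counter is never reset, so the k-th barrier waits for k times the number of groups.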

MicahVillmow · 02-15-2011

atomics force the compiler down the complete path.

redditisgreat · 02-15-2011

Originally posted by: MicahVillmow atomics force the compiler down the complete path.

Which leads to the following questions:

1. Can I get a terminating loop with another read construct?

2. Can the compiler be changed so that it uses "complete path" only for variables affected by atomic operations?

3. Will the OpenCL standard be extended to include global synchronization primitives, so I won't have to use dirty hacks anymore?

Is this an effective way to achieve global synchronization on the GPU?