makeshift global barrier

Discussion created by sblackwell on Sep 29, 2010
Latest reply on Sep 30, 2010 by sblackwell
trouble reading from global memory after atom_inc


I know there's no global barrier in OpenCL, but I'm trying to create a workaround using the following code:


void barrier(__global uint* scratch) {
  uint nThreads = get_global_size(0);

  /* each thread increments the shared counter once */
  atom_inc(&scratch[0]);

  /* this loop never terminates */
  while(scratch[0] < nThreads) {
  }
}
The idea is that each thread loops until all threads have incremented that memory.

I know the memory is being incremented because scratch[0] == nThreads when I read scratch back to host memory, but the loops never terminate. When I have the threads write the value they see at scratch[0] to another buffer, each one just reports the result of its own atom_inc.

I know I can normally read just fine from global memory, but what am I missing here?
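
For reference, the barrier idea described above can be sketched as follows. This is only an illustration under stated assumptions, not a confirmed fix. It assumes OpenCL 1.1 (for atomic_inc on a volatile pointer), that the host resets scratch[0] to zero before every launch, and that all work-groups are resident on the device at once. The name global_barrier_sketch is hypothetical. The one deliberate change is the volatile qualifier, on the guess that the compiler is otherwise free to load scratch[0] once and reuse the register value in the loop; on some devices a volatile read may still not be enough.

```c
// Sketch only: one-shot spin barrier across all work-items (OpenCL 1.1 atomics).
// Assumptions: the host zeroes scratch[0] before every launch, and every
// work-group of the launch is resident on the device at the same time;
// otherwise spinning work-items can starve unscheduled groups and deadlock.
void global_barrier_sketch(volatile __global uint* scratch) {
  uint nThreads = (uint)get_global_size(0);

  // Announce arrival; atomic_inc returns the value held before the increment.
  atomic_inc(&scratch[0]);

  // volatile asks the compiler to issue a fresh load on every iteration
  // instead of reusing a value it already holds in a register.
  while (scratch[0] < nThreads) {
    // spin until every work-item has incremented the counter
  }
}
```

Because the counter only ever grows, this sketch works at most once per launch; reusing it inside the same kernel would need a second counter or a sense-reversal scheme.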