cancel
Showing results for 
Search instead for 
Did you mean: 

Archives Discussions

d_a_a_
Adept II

Strange dead lock

I'm trying to implement the scan primitive, but I came across a weird behavior. Whenever the dimension of the global array 'x' is greater than 64 (the wavefront size), let's say 128, the following kernel hangs up:

   kernel void scan_dead_lock( global float * x )

  {

     int id = get_local_id(0);

     for( int s = 1; s < get_local_size(0); s <<= 1 )

     {

        if( id >= s ) {

           float tmp = x[id] + x[id - s];

           barrier( CLK_GLOBAL_MEM_FENCE );

           x[id] = tmp;

        }

        else barrier( CLK_GLOBAL_MEM_FENCE );

        barrier( CLK_GLOBAL_MEM_FENCE );

     }

   }

But this one, functionally equivalent, does not:

  kernel void scan( global float * x )

   {

     float tmp;

     int id = get_local_id(0);

     for( int s = 1; s < get_local_size(0); s <<= 1 )

     {

        if( id >= s ) tmp = x[id] + x[id - s];

        barrier( CLK_GLOBAL_MEM_FENCE );

        if( id >= s ) x[id] = tmp;

        barrier( CLK_GLOBAL_MEM_FENCE );

     }

   }

I'm using 'global size' = 'local size', so there is only a work-group in flight. Is the OpenCL compiler doing something wrong, maybe removing the "else barrier( CLK_GLOBAL_MEM_FENCE );" part from the first kernel?

The dead lock occurs both with Catalyst (fglrx) 12.4 and 12.6 using an AMD HD5750 GPU on Debian GNU/Linux.

Thank you.

0 Likes
1 Solution

yes. this can lead to deadlock

if(get_local_id(0)>32)barrier();

else barrier();

View solution in original post

0 Likes
4 Replies
nathan1986
Adept II

Hi,

    GLOBAL barrier  means "each threads stops here and wait until all threads come", if you put the barrier in the selection, some threads will never get the same place so  it waits forever.

barrier must be hitted with all or none thread in work-group. otherwise it lead to undefined behaviour.

Thank you for your replies.

I knew that a barrier must be hit by all threads in the work-group, but I totally forgot that each barrier is uniquely identified and so all threads must hit the *same* barrier. Right?

Thanks.

0 Likes

yes. this can lead to deadlock

if(get_local_id(0)>32)barrier();

else barrier();

0 Likes