7 Replies Latest reply on Oct 29, 2010 7:09 PM by SiegeLord

    Problems with local atomics

    SiegeLord

      The following kernel when run on a CPU device causes the error_buffer to return non-zero values:

      #pragma OPENCL EXTENSION cl_khr_local_int32_base_atomics : enable
      __kernel
      void test(__global int* error_buffer) { __local int fire_table_idx; int local_id = get_local_id(0); if(local_id == 0) fire_table_idx = 0; barrier(CLK_LOCAL_MEM_FENCE); if(fire_table_idx != 0) error_buffer[0] = fire_table_idx; else error_buffer[0] = 0; atom_inc(&fire_table_idx); }

      If I remove the call to atom_inc at the bottom, the kernel functions fine (i.e. fire_table_idx is set to 0).

      Any ideas as to why this might be happening? Am I not using __local variables correctly?
        • Problems with local atomics
          nou

          this is maybe related to non array local variable bug. as workaround change it to

          fire_table_idx to array like
          __local int fire_table_idx[1];
          fire_table_idx[0] = 0; //etc
          • Problems with local atomics
            himanshu.gautam

            SiegeLord,

            the extension name is #pragma OPENCL EXTENSION cl_khr_int64_base_atomics : enable.

            I am not able to understand what do you want to do with this kernel.You are changing the value of same variable error_buffer[0] inside every work group.

              • Problems with local atomics
                SiegeLord

                 

                Originally posted by: nou this is maybe related to non array local variable bug. as workaround change it to

                fire_table_idx to array like __local int fire_table_idx[1]; fire_table_idx[0] = 0; //etc

                That did not help, unfortunately.

                 

                Originally posted by: himanshu.gautam

                the extension name is #pragma OPENCL EXTENSION cl_khr_int64_base_atomics : enable.

                No. That does not enable 32 bit local atomics, which is what I use. With your extension, there are link time errors when compiling the kernel.

                 

                Originally posted by: himanshu.gautam

                I am not able to understand what do you want to do with this kernel.You are changing the value of same variable error_buffer[0] inside every work group.

                This kernel is just to highlight the bug. The bug does not go away if you assign to different entries of the error_buffer.

                I have managed to come up with something that I think works, here it is:

                __kernel void test(__global int* error_buffer)
                {
                    int local_id = get_local_id(0);
                    int global_id = get_global_id(0);
                   
                    __local int fire_table_idx;
                   
                    if(local_id == 0)
                        fire_table_idx = 0;
                    barrier(CLK_LOCAL_MEM_FENCE);
                   
                    error_buffer[global_id] = fire_table_idx;

                    atom_inc(&fire_table_idx);
                   
                    fire_table_idx = 0; // <- set it explicitly to 0 BEFORE any more calls to barrier
                    barrier(CLK_LOCAL_MEM_FENCE); // <- has to be after the previous line
                }

                Setting it explicitly to 0 worked. In my real function there is also a barrier after initialization... I found that the barrier has to go after you reset the fire_table_idx to 0, or this fix won't work.

                Still... this fix seems a tad arbitrary...

                  • Problems with local atomics
                    SiegeLord

                    Actually... no. While this fix works for this simple kernel, it doesn't work for the real kernel I use. Anyone have any other suggestions?

                    Incidentally... while this specific bug appears only on the CPU device, the GPU also gets errors from this local atomic/barrier combination, just not in this simple function.