cancel
Showing results for 
Search instead for 
Did you mean: 

Archives Discussions

SiegeLord
Adept II

Problems with local atomics

The following kernel when run on a CPU device causes the error_buffer to return non-zero values:

#pragma OPENCL EXTENSION cl_khr_local_int32_base_atomics : enable
__kernel
void test(__global int* error_buffer) { __local int fire_table_idx; int local_id = get_local_id(0); if(local_id == 0) fire_table_idx = 0; barrier(CLK_LOCAL_MEM_FENCE); if(fire_table_idx != 0) error_buffer[0] = fire_table_idx; else error_buffer[0] = 0; atom_inc(&fire_table_idx); }

If I remove the call to atom_inc at the bottom, the kernel functions fine (i.e. fire_table_idx is set to 0).

Any ideas as to why this might be happening? Am I not using __local variables correctly?
0 Likes
7 Replies
nou
Exemplar

this is maybe related to non array local variable bug. as workaround change it to

fire_table_idx to array like
__local int fire_table_idx[1];
fire_table_idx[0] = 0; //etc
0 Likes
himanshu_gautam
Grandmaster

SiegeLord,

the extension name is #pragma OPENCL EXTENSION cl_khr_int64_base_atomics : enable.

I am not able to understand what do you want to do with this kernel.You are changing the value of same variable error_buffer[0] inside every work group.

0 Likes

Originally posted by: nou this is maybe related to non array local variable bug. as workaround change it to

fire_table_idx to array like __local int fire_table_idx[1]; fire_table_idx[0] = 0; //etc

That did not help, unfortunately.

Originally posted by: himanshu.gautam

the extension name is #pragma OPENCL EXTENSION cl_khr_int64_base_atomics : enable.

No. That does not enable 32 bit local atomics, which is what I use. With your extension, there are link time errors when compiling the kernel.

Originally posted by: himanshu.gautam

I am not able to understand what do you want to do with this kernel.You are changing the value of same variable error_buffer[0] inside every work group.

This kernel is just to highlight the bug. The bug does not go away if you assign to different entries of the error_buffer.

I have managed to come up with something that I think works, here it is:

__kernel void test(__global int* error_buffer)
{
    int local_id = get_local_id(0);
    int global_id = get_global_id(0);
   
    __local int fire_table_idx;
   
    if(local_id == 0)
        fire_table_idx = 0;
    barrier(CLK_LOCAL_MEM_FENCE);
   
    error_buffer[global_id] = fire_table_idx;

    atom_inc(&fire_table_idx);
   
    fire_table_idx = 0; // <- set it explicitly to 0 BEFORE any more calls to barrier
    barrier(CLK_LOCAL_MEM_FENCE); // <- has to be after the previous line
}

Setting it explicitly to 0 worked. In my real function there is also a barrier after initialization... I found that the barrier has to go after you reset the fire_table_idx to 0, or this fix won't work.

Still... this fix seems a tad arbitrary...

0 Likes

Actually... no. While this fix works for this simple kernel, it doesn't work for the real kernel I use. Anyone have any other suggestions?

Incidentally... while this specific bug appears only on the CPU device, the GPU also gets errors from this local atomic/barrier combination, just not in this simple function.

0 Likes

Please send a test case to streamdeveloper@amd.com

0 Likes

siegelord,

I checked out the kernel you provided.

I do not find any stange behaviour about it when i run it on CPU.

Can you explain your problem?

0 Likes

I think I solved it (thanks to jcornwall on #opencl). What I needed was another barrier in that function to prevent one thread incrementing the local variable before some other thread reaches the check for it being 0. Here's the fixed kernel:

#pragma OPENCL EXTENSION cl_khr_local_int32_base_atomics : enable
__kernel
void test(__global int* error_buffer)
{
__local int fire_table_idx;
int local_id = get_local_id(0);
if(local_id == 0)
fire_table_idx = 0;
barrier(CLK_LOCAL_MEM_FENCE);

if(fire_table_idx != 0)
error_buffer[0] = fire_table_idx;
else
error_buffer[0] = 0;

barrier(CLK_LOCAL_MEM_FENCE);
atom_inc(&fire_table_idx);
}

I think that makes my code run without bugs for now... but I'll reply back if I get a bug related to this again.
0 Likes