cancel
Showing results for 
Search instead for 
Did you mean: 

OpenCL

avinashkrc
Adept I

Getting stuck in a loop, does local variable not visible to other workitems in a work group?

Jump to solution

This is my kernel code:

__kernel void test(__global int *input_vector,__global atomic_int *mem_flag)
{
    local int d[32];
    if(get_local_id(0)==0) {
            d[0] = 100;
        }
    barrier(CLK_GLOBAL_MEM_FENCE| CLK_LOCAL_MEM_FENCE);
    while(1) {
        if(get_local_id(0) == 0) {
            d[0] = 0;
            break;
        }
        else {
            if(d[0] == 0)
            break;
        }
    }

}

Inside while loop > inside else part there is if condition that is expecting d[0] equals to 0 at some point of time. But it isn't happening. I am under the assumption that local variables can be seen by all the workitems in that workgroup. I am using APU A12-9800, ubuntu 14.04, fglrx.

0 Likes
Reply
1 Solution

Accepted Solutions
avinashkrc
Adept I

Re: Getting stuck in a loop, does local variable not visible to other workitems in a work group?

Jump to solution

However, this worked for me

__kernel void test(__global int *input_vector,__global atomic_int *mem_flag)
{
    local int d[32];                                                  
    if(get_local_id(0)==0) {                                          
            d[0] = 100;
        }   
    barrier(CLK_GLOBAL_MEM_FENCE| CLK_LOCAL_MEM_FENCE);               
    while(1) {
            mem_fence(CLK_GLOBAL_MEM_FENCE| CLK_LOCAL_MEM_FENCE);     
           if(d[0] == 0)
            break;
        if(get_local_id(0) == 0) {                                    
            d[0] = 0;   
        }   
    }     
}

View solution in original post

0 Likes
Reply
8 Replies
dipak
Staff
Staff

Re: Getting stuck in a loop, does local variable not visible to other workitems in a work group?

Jump to solution

I think atomics and synchronization operations should be used in the above case. These operations play a special role in making assignments in one work-item visible to another.

Thanks.

0 Likes
Reply
avinashkrc
Adept I

Re: Getting stuck in a loop, does local variable not visible to other workitems in a work group?

Jump to solution

Hi dipak,

I changed the code to this:

local atomic_int d[32];
    if(get_local_id(0)==0) {
            atomic_store(&d[0],100);
        }
    barrier(CLK_GLOBAL_MEM_FENCE| CLK_LOCAL_MEM_FENCE);
    while(1) {
        if(get_local_id(0) == 0) {
            atomic_store(&d[0], 0);
            break;
        }
        else {
           if(atomic_load(&d[0]) == 0)
            break;
        }
    }

still same problem persists. Any other thoughts?

0 Likes
Reply
dipak
Staff
Staff

Re: Getting stuck in a loop, does local variable not visible to other workitems in a work group?

Jump to solution

Please attach the complete code (host+kernel). I'll check and share my observation.

Thanks.

0 Likes
Reply
avinashkrc
Adept I

Re: Getting stuck in a loop, does local variable not visible to other workitems in a work group?

Jump to solution

Hi dipak, code is attached now, however opencl kernel is updated one with the barrier code without atomic. If you want to check atomic one just replace the kernel code with above mentioned one.

0 Likes
Reply
avinashkrc
Adept I

Re: Getting stuck in a loop, does local variable not visible to other workitems in a work group?

Jump to solution

However, this worked for me

__kernel void test(__global int *input_vector,__global atomic_int *mem_flag)
{
    local int d[32];                                                  
    if(get_local_id(0)==0) {                                          
            d[0] = 100;
        }   
    barrier(CLK_GLOBAL_MEM_FENCE| CLK_LOCAL_MEM_FENCE);               
    while(1) {
            mem_fence(CLK_GLOBAL_MEM_FENCE| CLK_LOCAL_MEM_FENCE);     
           if(d[0] == 0)
            break;
        if(get_local_id(0) == 0) {                                    
            d[0] = 0;   
        }   
    }     
}

View solution in original post

0 Likes
Reply
dipak
Staff
Staff

Re: Getting stuck in a loop, does local variable not visible to other workitems in a work group?

Jump to solution

After doing some experiments with the previous kernels, here are my findings:

  • outcome of the program depends on which path inside the while-loop is chosen first by the compiler. when "if" branch is executed first, then d[0] is already set to 0 before testing it in the  "else" branch. But if the "else" branch  (i.e. get_local_id(0) > 0) is executed first, then it generates an infinite loop waiting for d[0] to be 0 and causes the hang. 
  •  a barrier at the end of while loop can force  both the branches to merge after each iteration, so infinite "else" loop should not happen.
  • As I observed, a fence instruction also worked here because  "if" branch was chosen first then.
0 Likes
Reply
avinashkrc
Adept I

Re: Getting stuck in a loop, does local variable not visible to other workitems in a work group?

Jump to solution

Hi dipak,

barrier either at the start or end of while  loop if we have coded in below mentioned way where we terminates work-item at different line number i.e. one thread breaks inside if statement where as other break inside else statement and I believe this is expected as some work-items hit barrier and other don't

while(1) {
    barrier(CLK_GLOBAL_MEM_FENCE| CLK_LOCAL_MEM_FENCE);
        if(get_local_id(0) == 0) {
            d[0] = 0;
            break;
        } else {
           if(d[0] == 0)
            break;
        }
    }

However, if all work-items break at same line number inside same conditional statement then only it works as mentioned in above working code (even if we replace mem_fence with barrier).

0 Likes
Reply
dipak
Staff
Staff

Re: Getting stuck in a loop, does local variable not visible to other workitems in a work group?

Jump to solution

Yes, you are right about the barrier usage. If barrier is used, then all work-items in a work-group must hit the barrier, otherwise the result is undefined and not portable. 

Couple of points to note though. In general, work-items do not have any execution order guarantee. Barriers provide some level of ordering guarantee for work-items within a work-group. It combines thread synchronization as well as memory fence. Whereas, memory fence itself does not provide any ordering guarantee.

Depending on hardware, an implementation may execute many work-items in parallel or one work item one at a time (say, on a single core CPU). So, if outcome of the kernel depends on execution order of the work-items, then it may loose portability and a special attention is required when a different hardware/implementation is selected.