via0517
Journeyman III

help to the synchronized problem in loop

I tested some code and can't understand the result.

I am working with a uint array x[] and two threads. Here is my code:

__kernel void ssum(__global uint *x)
{
    int id = get_global_id(0);

    for (int i = 0; i < 5; i++)
    {
        x[i+id] = x[i+id]*10 + id;

        float x = 123;
        for (int j = 0; j < 10000*id; j++) x = tan(x);
    }
}

The result is 0 10 10 10 10 1.

But thread 1 should run slower than thread 0, since it performs 10000 tan() operations in each iteration. So why does thread 1 always operate on each element of x before thread 0 does?

Is every step of the loop synchronized? Does thread 0 wait while thread 1 is running the tan() operations?

Accepted Solutions
LeeHowes
Staff

Re: help to the synchronized problem in loop

You don't have two threads. You have two work-items. A work-item is not a thread; it is mapped by the compiler and runtime to some underlying thread in an implementation-defined manner. The reality is that we map 64 work-items to a single thread in single instruction, multiple data (SIMD) fashion. That means that all 64 work-items execute a single instruction at the same time (on different data, as we see in your example).

When people use the term "thread" for a work-item, that is a matter of convenience for the programming model, not a description of how it maps to the hardware. A thread is generally considered to be an independent entity with its own program counter (as your surprise in the post implies), whereas a single work-item does not have its own PC during execution.
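
The lockstep behavior described above can be modeled on the host in plain C. This is only a sketch, not OpenCL code and not how the hardware is implemented: it assumes x starts out as all zeros (the original post does not say, but the reported result is consistent with it), and it approximates "one instruction at a time for all work-items" as a load phase followed by a store phase. The names simulate_lockstep, NUM_ITEMS and NUM_STEPS are made up for illustration.

```c
#define NUM_ITEMS 2   /* work-items sharing one hardware thread in this model */
#define NUM_STEPS 5   /* loop trip count taken from the posted kernel */

/* Models the posted loop for NUM_ITEMS work-items running in lockstep.
 * x must hold at least NUM_STEPS + NUM_ITEMS - 1 elements. */
void simulate_lockstep(unsigned int *x)
{
    for (int i = 0; i < NUM_STEPS; i++) {
        unsigned int loaded[NUM_ITEMS];

        /* All work-items perform the load of x[i+id] ... */
        for (int id = 0; id < NUM_ITEMS; id++)
            loaded[id] = x[i + id];

        /* ... before any of them performs the store to x[i+id]. */
        for (int id = 0; id < NUM_ITEMS; id++)
            x[i + id] = loaded[id] * 10 + (unsigned int)id;
    }
}
```

Calling simulate_lockstep on a six-element array of zeros leaves 0 10 10 10 10 1 in it, which matches the result in the question. In each iteration work-item 1 writes element i+1, and one iteration later work-item 0 reads that same element, so neither work-item ever gets ahead of the other.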

realhet
Miniboss

Re: help to the synchronized problem in loop

float x=123;

for (int j=0;j<10000*id;j++) x=tan(x);

The compiler eliminates this completely because the code has no effect: it only modifies a temporary variable that is never used afterwards.

Also, the two threads run in parallel: while thread 0 computes x[i+0], thread 1 computes x[i+1], and so on.

"I am surprised is every step in loop is synchronized"

It is not 'multitasking', it's Single Instruction, Multiple Data (SIMD). ALU operations are executed simultaneously, while memory I/O is somewhat serialized, which can cause stalls that pause ALU instruction processing.

via0517
Journeyman III

Re: help to the synchronized problem in loop

Thank you, you are right. I then tested two work-groups and found that they are not synchronized. I know a barrier can synchronize work-items in the same work-group, but is there any way to synchronize different work-groups?

himanshu_gautam
Grandmaster

Re: help to the synchronized problem in loop

There is no way to synchronize different work-groups.

Usually, people split a kernel into multiple consecutive kernel launches if the problem requires inter-work-group coordination.
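
As a rough illustration of that splitting approach, here is a host-side C model of a sum done in two passes. It is a sketch under assumptions, not OpenCL code: each function stands in for one kernel launch, the outer loop over g stands in for independent work-groups, and the boundary between the two calls plays the role of the synchronization point between work-groups. The names partial_sums_pass and final_sum_pass are hypothetical.

```c
#include <stddef.h>

/* Stand-in for kernel 1: each simulated work-group g writes one partial sum.
 * partials must hold at least ceil(n / group_size) elements. */
void partial_sums_pass(const unsigned int *in, size_t n,
                       size_t group_size, unsigned int *partials)
{
    size_t num_groups = (n + group_size - 1) / group_size;

    for (size_t g = 0; g < num_groups; g++) {
        size_t begin = g * group_size;
        size_t end = begin + group_size < n ? begin + group_size : n;
        unsigned int sum = 0;

        for (size_t k = begin; k < end; k++)
            sum += in[k];
        partials[g] = sum;
    }
}

/* Stand-in for kernel 2: runs only after pass 1 has finished,
 * so every partial sum is already complete when it is read. */
unsigned int final_sum_pass(const unsigned int *partials, size_t num_groups)
{
    unsigned int total = 0;

    for (size_t g = 0; g < num_groups; g++)
        total += partials[g];
    return total;
}
```

In a real OpenCL program the two passes would be two separate kernels enqueued one after the other, assuming the command queue executes them in order so that the second sees the partial results written by the first.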