via0517
Journeyman III

Help with a synchronization problem in a loop

I tested some code and can't understand the result.

I am working with a uint array x[] and two threads. Here is my code:

__kernel void ssum(__global uint *x)
{
    int id= get_global_id(0);

    for (int i=0;i<5;i++)
    {
        x[i+id]=x[i+id]*10+id;

        float x=123;
        for (int j=0;j<10000*id;j++) x=tan(x);
    }
}

The result is 0 10 10 10 10 1.

But thread 1 should run slower than thread 0, since it executes 10000 tan() calls in every iteration. So why, for every element of x, does thread 1 always operate first and thread 0 second?

I am surprised: is every step of the loop synchronized? Does thread 0 wait while thread 1 is running the tan() operations?

1 Solution
LeeHowes
Staff

You don't have two threads. You have two work-items. A work-item is not a thread; it is mapped by the compiler and runtime to some underlying thread in an implementation-defined manner. The reality is that we map 64 work-items to a single thread in single instruction multiple data fashion. That means that all 64 work-items execute a single instruction at the same time (on different data, as we see in your example).

When people use the term "thread" for a work-item, that is a matter of convenience for the programming model, not a description of how it maps to the hardware. A thread is generally considered to be (as you imply, given your surprise in the post) an independent entity with its own program counter. A single work-item does not have its own PC during execution.
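To make the lockstep behaviour concrete, here is a minimal host-side sketch in plain C. It is not OpenCL and not a description of how the hardware is built; it only models "every work-item executes the same instruction at the same time" as "all work-items load, then all work-items store". The function names and the MAX_WORK_ITEMS bound are hypothetical, and the sketch assumes x starts as six zeros, which the original post does not state.

```c
#include <stddef.h>

#define MAX_WORK_ITEMS 64 /* assumed upper bound for this sketch */

/* Model of lockstep (SIMD) execution of the loop in ssum:
 * in each iteration every work-item loads, then every work-item stores.
 * x must hold at least iterations - 1 + num_items elements. */
void simulate_lockstep(unsigned *x, size_t num_items, size_t iterations)
{
    unsigned loaded[MAX_WORK_ITEMS];

    if (num_items > MAX_WORK_ITEMS)
        return;

    for (size_t i = 0; i < iterations; i++) {
        /* One load instruction, executed by all work-items together. */
        for (size_t id = 0; id < num_items; id++)
            loaded[id] = x[i + id];

        /* One multiply-add-and-store step, again for all work-items. */
        for (size_t id = 0; id < num_items; id++)
            x[i + id] = loaded[id] * 10u + (unsigned)id;
    }
}

/* Contrast case: each work-item runs its whole loop before the next
 * one starts, as independent threads of very different speed might. */
void simulate_one_after_another(unsigned *x, size_t num_items, size_t iterations)
{
    for (size_t id = 0; id < num_items; id++)
        for (size_t i = 0; i < iterations; i++)
            x[i + id] = x[i + id] * 10u + (unsigned)id;
}
```

Under those assumptions, simulate_lockstep(x, 2, 5) leaves x as 0 10 10 10 10 1, which matches the reported result, while simulate_one_after_another(x, 2, 5) leaves 0 1 1 1 1 1, which is what you would see if work-item 0 finished all of its iterations before work-item 1 started.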

realhet
Miniboss

float x=123;

for (int j=0;j<10000*id;j++) x=tan(x);

This is eliminated completely by the compiler as this code does nothing. It only alters a temp variable which isn't used later.

Also, those two threads will run in parallel: while thread 0 calculates x[i+0], thread 1 calculates x[i+1], and so on.

"I am surprised: is every step of the loop synchronized?"

It is not 'multitasking', it's Single Instruction Multiple Data. ALU Operations are done simultaneously, and memory IO is somewhat serialized (that can generate stalls and pause the ALU instruction processing).


Thank you, you are right. I then tested two work-groups and found that they are not synchronized. I know that barrier can synchronize work-items in the same work-group, but is there any way to synchronize different work-groups?


There is no way to synchronize different work-groups within a single kernel launch.

Usually, if the problem requires inter-work-group coordination, people split the kernel into multiple consecutive kernel launches.
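As a rough illustration of the multiple-launch pattern, here is a plain C sketch of a two-pass sum. It only simulates the structure on the host: each function stands in for one kernel launch, and the outer loop over g stands in for the independent work-groups. All names here (launch_partial_sums, launch_final_sum, two_pass_sum) are hypothetical. In a real OpenCL host program the two passes would typically be two clEnqueueNDRangeKernel calls on the same in-order command queue, so the first kernel has finished before the second starts.

```c
#include <stddef.h>

/* Stand-in for kernel launch 1: every work-group g reduces its own
 * slice of the input and writes one partial result. The groups never
 * need to see each other's data during this pass. */
void launch_partial_sums(const unsigned *in, size_t n, size_t group_size,
                         unsigned *partial)
{
    size_t num_groups = (n + group_size - 1) / group_size;

    for (size_t g = 0; g < num_groups; g++) {
        size_t begin = g * group_size;
        size_t end = (begin + group_size < n) ? begin + group_size : n;
        unsigned sum = 0;

        for (size_t k = begin; k < end; k++)
            sum += in[k];
        partial[g] = sum;
    }
}

/* Stand-in for kernel launch 2: a single group combines the partials. */
unsigned launch_final_sum(const unsigned *partial, size_t num_groups)
{
    unsigned total = 0;

    for (size_t g = 0; g < num_groups; g++)
        total += partial[g];
    return total;
}

/* partial must have room for ceil(n / group_size) elements. */
unsigned two_pass_sum(const unsigned *in, size_t n, size_t group_size,
                      unsigned *partial)
{
    if (n == 0 || group_size == 0)
        return 0;

    size_t num_groups = (n + group_size - 1) / group_size;

    launch_partial_sums(in, n, group_size, partial);
    /* Launch boundary: every group of pass 1 is done at this point.
     * This is the only place where all groups are "synchronized". */
    return launch_final_sum(partial, num_groups);
}
```

The point of the design is that the boundary between the two launches plays the role of the missing global barrier: nothing in the second pass runs until every work-group of the first pass has written its partial result.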