kbrafford
Adept II

Conceptual issue with local_id()

I have a nicely working OpenCL solution that processes millions of lines of data in logical chunks of 512 lines each. It works fine.

Right now I do this in the code:

    // this is where my thread's samples start
    int my_sample_index = (gid>>1) & 0xFF00;

    // calculate our coeffs
    int my_index = (gid & 0x007F) * 4;

and it seems to work fine.

But it seems like maybe I am supposed to be making use of local_id(), and I am having a hard time getting my head around how to use it.

Is there a good explanation somewhere that makes it clear when you need to make use of that feature of OpenCL?

0 Likes
5 Replies

The local ID is a core part of OpenCL. Your global execution space is broken up into work-groups, and the local ID is the ID of the thread within its work-group. Please read section 1.3 of the OpenCL Programming Guide.
0 Likes

If I don't make use of local memory, does that mean that local_id() will be useless to me?

0 Likes

Originally posted by: kbrafford If I don't make use of local memory, does that mean that local_id() will be useless to me?

 

In your specific example, my_sample_index and my_index correspond to neither get_group_id() nor get_local_id(). They would correspond if my_sample_index = gid & 0xFF00; and my_index = gid & 0x00FF; and only if your local_work_size is 256 and your global_work_size is less than 0xFFFF. In that case I would tell you to use get_local_id(), since it is the more generic solution and works for whatever local_work_size you specify. But in your particular example, it would take more effort to transform get_local_id() into such an index. Just because local_id exists does not mean you have to use it. It's up to you to decide how your threads access the data; that depends entirely on your implementation!
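To make the correspondence above concrete, here is a minimal host-side sketch in plain C (not kernel code) that models the 1-D relationship between the global ID, the group ID and the local ID. It assumes a zero global offset; the names work_item_pos, split_global_id, masked_group_base and masked_local_id are made up for illustration and are not part of OpenCL.

```c
#include <assert.h>
#include <stddef.h>

/* Position of a work-item inside a 1-D NDRange (hypothetical helper type). */
typedef struct {
    size_t group_id;  /* models what get_group_id(0) would return */
    size_t local_id;  /* models what get_local_id(0) would return */
} work_item_pos;

/* Assumption: zero global offset, so
 * global_id == group_id * local_size + local_id. */
static inline work_item_pos split_global_id(size_t gid, size_t local_size)
{
    work_item_pos p;
    assert(local_size > 0);
    p.group_id = gid / local_size;
    p.local_id = gid % local_size;
    return p;
}

/* The bit-mask forms from the reply. They only line up with the
 * values above when local_size == 256 and gid fits in 16 bits. */
static inline size_t masked_group_base(size_t gid) { return gid & 0xFF00; }
static inline size_t masked_local_id(size_t gid)   { return gid & 0x00FF; }
```

Under those assumptions, masked_local_id(gid) equals the local ID and masked_group_base(gid) equals the group ID multiplied by 256. With any other local_work_size the masks stop matching, which is why the get_local_id() form is the more general one.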

0 Likes

I am starting to catch on, I think.  The reason my calculation looks weird is that my algorithm vectorizes trivially, and I am actually able to process 4 conceptual pieces of data in each thread.
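As a rough illustration of the four-items-per-thread indexing, the sketch below (plain C on the host, not kernel code) compares the masked form from the original post with a local-ID form. It assumes a local_work_size of 128 and a zero global offset, neither of which is stated in the thread, and the function and constant names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Assumptions for this sketch only: each work-item handles 4 elements
 * and the work-group size is 128 (4 * 128 = 512 lines per chunk). */
enum { ELEMS_PER_ITEM = 4, ASSUMED_LOCAL_SIZE = 128 };

/* The form from the original post: low 7 bits of the global ID, times 4. */
static inline size_t coeff_index_from_mask(size_t gid)
{
    return (gid & 0x007F) * 4;
}

/* The same index written in terms of a local ID in [0, 128). */
static inline size_t coeff_index_from_local_id(size_t local_id)
{
    assert(local_id < ASSUMED_LOCAL_SIZE);
    return local_id * ELEMS_PER_ITEM;
}
```

If the work-group size really were 128, the kernel could write the second form as get_local_id(0) * 4 and get the same values as the mask. Whether that is an improvement depends on how the kernel is actually launched.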

Looks like I need to find a good book that starts with the basics, even though I already have some working code!

0 Likes

That all depends on your algorithm.
0 Likes