Archives Discussions

kbrafford · ‎02-01-2011

I have a nicely working OpenCL solution that processes millions of lines of data, logically grouped into chunks of 512 lines each. It works fine.

Right now I do this in the code:

    // this is where my thread's samples start
    int my_sample_index = (gid>>1) & 0xFF00;

    // calculate our coeffs
    int my_index = (gid & 0x007F) * 4;

and it seems to work fine.

But it seems like maybe I am supposed to be making use of local_id(), and I am having a hard time getting my head around how to use it.

Is there a good explanation somewhere that makes it clear when you need to make use of that feature of OpenCL?

MicahVillmow · ‎02-01-2011

local id is a core part of OpenCL. Your global execution space is broken up into work-groups. The local id is the ID of the thread within the work-group. Please read section 1.3 of the OpenCL Programming guide.
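To make the relationship between the ids concrete, here is a minimal host-side sketch in plain C (not kernel code). It assumes a 1-D global range with no global offset and a global size that is a multiple of the work-group size. `WorkItemIds`, `split_global_id` and `rebuild_global_id` are hypothetical names for this illustration only; inside a kernel the same values would come from `get_global_id(0)`, `get_group_id(0)` and `get_local_id(0)`.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical host-side model of the ids one work-item sees
   in a 1-D range with no global offset. Not part of the OpenCL API. */
typedef struct {
    size_t global_id;  /* models get_global_id(0) */
    size_t group_id;   /* models get_group_id(0)  */
    size_t local_id;   /* models get_local_id(0)  */
} WorkItemIds;

/* Split a global id into its work-group index and its index inside
   that work-group, for a given work-group size. */
static WorkItemIds split_global_id(size_t global_id, size_t local_size)
{
    WorkItemIds ids;
    assert(local_size > 0);
    ids.global_id = global_id;
    ids.group_id  = global_id / local_size;
    ids.local_id  = global_id % local_size;
    return ids;
}

/* Recombine the two parts; this is the identity that ties them together:
   global_id == group_id * local_size + local_id. */
static size_t rebuild_global_id(WorkItemIds ids, size_t local_size)
{
    return ids.group_id * local_size + ids.local_id;
}
```

With an assumed work-group size of 256, `split_global_id(700, 256)` yields group 2 and local id 188, and `rebuild_global_id` returns 700 again.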

kbrafford · ‎02-01-2011

If I don't make use of local memory, does that mean that local_id() will be useless to me?

laobrasuca · ‎02-03-2011

Originally posted by kbrafford: "If I don't make use of local memory, does that mean that local_id() will be useless to me?"

In your specific example, my_sample_index and my_index correspond to neither get_group_id() nor get_local_id(). They would correspond if my_sample_index = gid & 0xFF00; and my_index = gid & 0x00FF; AND ONLY IF your local_work_size is 256 and your global_work_size is less than 0xFFFF. In that case (with no global offset) gid & 0x00FF equals get_local_id(0) and gid & 0xFF00 equals get_group_id(0) * 256, and I would tell you to use get_local_id(), since it is the more generic solution for whatever local_work_size you specify. But in your particular example, using get_local_id() would take more work to transform it into such an index. Just because get_local_id() exists does not mean you have to use it. It's up to you to decide how your threads access the data; it's totally implementation dependent!
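The claim above can be checked with a small host-side sketch in plain C (not kernel code). It assumes a 1-D range, no global offset, and a local_work_size of 256. All function names are hypothetical and exist only for this illustration; `model_local_id` and `model_group_id` stand in for what `get_local_id(0)` and `get_group_id(0)` would return.

```c
#include <assert.h>
#include <stddef.h>

enum { LOCAL_SIZE = 256 };  /* assumed local_work_size */

/* Stand-ins for get_local_id(0) and get_group_id(0) (1-D, no offset). */
static size_t model_local_id(size_t gid) { return gid % LOCAL_SIZE; }
static size_t model_group_id(size_t gid) { return gid / LOCAL_SIZE; }

/* The masks suggested in the post above. */
static size_t masked_local(size_t gid)      { return gid & 0x00FF; }
static size_t masked_group_base(size_t gid) { return gid & 0xFF00; }

/* The expressions from the original question. */
static size_t original_sample_index(size_t gid) { return (gid >> 1) & 0xFF00; }
static size_t original_coeff_index(size_t gid)  { return (gid & 0x007F) * 4; }

/* Returns 1 if, for every gid below global_size, the masks agree with
   the modelled local id and with group id * LOCAL_SIZE; 0 otherwise. */
static int masks_match_ids(size_t global_size)
{
    for (size_t gid = 0; gid < global_size; ++gid) {
        if (masked_local(gid) != model_local_id(gid))
            return 0;
        if (masked_group_base(gid) != model_group_id(gid) * LOCAL_SIZE)
            return 0;
    }
    return 1;
}
```

In this model the masks agree with the ids as long as every gid fits in 16 bits, and stop agreeing once a gid reaches 0x10000, because the 0xFF00 mask drops the higher bits. The original expressions give different values (for example, gid 512 maps to a sample index of 256 rather than 512), which is why they do not line up with either id directly.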

kbrafford · ‎02-04-2011

I am starting to catch on, I think. The reason my calculation looks weird is that my algorithm vectorizes trivially, and I am actually able to process 4 conceptual pieces of data in each thread.

Looks like I need to find a good book that starts with the basics, even though I already have some working code!

MicahVillmow · ‎02-01-2011

That all depends on your algorithm.

Conceptual issue with local_id()