I have a nicely working OpenCL solution that processes millions of lines of data, logically grouped into chunks of 512 lines each.
Right now I do this in the code:
// this is where my thread's samples start
int my_sample_index = (gid>>1) & 0xFF00;
// calculate our coeffs
int my_index = (gid & 0x007F) * 4;
and it seems to work fine.
But it seems like I am supposed to be using get_local_id() instead of deriving everything from the global ID, and I am having a hard time getting my head around how it is meant to be used.
Is there a good explanation somewhere that makes it clear when you need to use that feature of OpenCL?
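To make the comparison concrete, here is a minimal host-side sketch in plain C (not kernel code) of how the global, group, and local IDs relate. It assumes a 1-D range with zero global offset and a work-group size of 128. That size is an assumption, not something stated above, and LOCAL_SIZE and the emulated_* helpers are hypothetical names that stand in for get_group_id(0) and get_local_id(0). Under those assumptions the mask gid & 0x007F produces the same value the local ID would.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical work-group size: an assumption, not taken from the post. */
#define LOCAL_SIZE 128u

/* Host-side stand-ins for get_group_id(0) and get_local_id(0).
   Valid only for a 1-D range with zero global offset. */
static size_t emulated_group_id(size_t gid) { return gid / LOCAL_SIZE; }
static size_t emulated_local_id(size_t gid) { return gid % LOCAL_SIZE; }

/* Coefficient index exactly as computed in the post, from the global ID. */
static int coeff_index_from_gid(size_t gid) { return (int)(gid & 0x007F) * 4; }

/* The same index computed from a local ID instead of a bit mask. */
static int coeff_index_from_lid(size_t lid) { return (int)lid * 4; }

/* Checks that both routes agree for one global ID, and that the global ID
   can be rebuilt from its group and local parts. */
static void check_identity(size_t gid) {
    size_t group = emulated_group_id(gid);
    size_t lid = emulated_local_id(gid);
    assert(group * LOCAL_SIZE + lid == gid);
    assert(coeff_index_from_gid(gid) == coeff_index_from_lid(lid));
}
```

Calling check_identity(gid) for any gid passes only because 128 is a power of two that matches the 0x007F mask. With a different work-group size the mask and the local ID would no longer agree, which may be one reason to prefer the local ID over hand-written masks.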