To use a compute shader, I know I should use Attribute to specify the thread group size and then handle data sharing between threads. But I still get confused when I get to the actual code. For example, I want to use a compute shader to do simple array addition, with each thread processing multiple elements. I wrote the following code, but it seems to hit an index-out-of-range problem. Can anyone fix it?
Attribute[GroupSize(64, 1, 1)]
kernel void blockAdd(float a[], float b[], out float c[])
{
    int tid = instance().x;
    // each thread processes len elements, len = 1, 2, 4, ...
    int len = 2;
    int start = tid * len;
    int i;
    for (i = 0; i < len; i++)
    {
        c[start + i] = a[start + i] + b[start + i];
    }
}
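
One possible cause, sketched on the CPU rather than in shader code: if the kernel is launched with one thread per array element (or padded up to a multiple of the group size), the last threads compute start + i values beyond the end of the arrays, because each thread now covers len elements. The C++ below only emulates the indexing, with one loop iteration standing in for one thread. The names block_add, threads_needed, num_threads and n are hypothetical and are not part of any shader API; the idea is to launch roughly n / len threads (rounded up) and to guard every access.

```cpp
#include <cstddef>
#include <vector>

// Threads needed to cover n elements when each thread handles len of them.
// Assumes len > 0. Uses ceiling division so a partial last chunk is covered.
std::size_t threads_needed(std::size_t n, std::size_t len)
{
    return (n + len - 1) / len;
}

// Hypothetical CPU emulation of the kernel: each iteration of the outer loop
// plays the role of one GPU thread with id `tid`.
std::vector<float> block_add(const std::vector<float>& a,
                             const std::vector<float>& b,
                             std::size_t num_threads,
                             std::size_t len)
{
    // Only indices valid in both inputs are ever touched.
    const std::size_t n = a.size() < b.size() ? a.size() : b.size();
    std::vector<float> c(n, 0.0f);

    for (std::size_t tid = 0; tid < num_threads; ++tid) {
        const std::size_t start = tid * len;
        for (std::size_t i = 0; i < len; ++i) {
            const std::size_t idx = start + i;
            if (idx >= n) {
                break;  // bounds guard: surplus threads do nothing
            }
            c[idx] = a[idx] + b[idx];
        }
    }
    return c;
}
```

With the guard in place, launching more threads than threads_needed(n, len), for example a full group of 64, is harmless; without it, any thread whose start + i reaches n reads and writes out of range.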