Originally posted by: gaurav.garg
Did you enable pre-processing using the -pp flag?
When you set the GROUP_SIZE Attribute on a kernel, the threads are divided into groups, and the threads within a group can share data with each other through shared memory. InstanceInGroup() gives the thread ID within its group; instance() gives the global ID.
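To make the difference between the two IDs concrete, here is a small CPU-side sketch in Python (not Brook+ code). It assumes a 1D launch and the usual linear layout where global ID = group index * group size + ID within the group; the group size of 64 and the helper names are hypothetical and only for illustration.

```python
# Assumed group size, standing in for the kernel's GROUP_SIZE attribute.
GROUP_SIZE = 64


def split_global_id(global_id, group_size=GROUP_SIZE):
    """Return (group_index, id_in_group) for a 1D global thread ID."""
    return divmod(global_id, group_size)


def join_ids(group_index, id_in_group, group_size=GROUP_SIZE):
    """Rebuild the global thread ID from the group index and the in-group ID."""
    return group_index * group_size + id_in_group
```

Under these assumptions, the thread with global ID 130 sits in group 2 with in-group ID 2, and only threads that share a group index can see the same shared memory.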
Thank you. Now Attribute works well.
I have another question, about 2D arrays (width, height). The input and output arrays are both width x height. I want to launch width threads and have each thread handle one column of the arrays.
Here is the code:
uint4 offset(0,0,0,0);
uint4 index(width,1,1,1);
kernelGPU.domainOffset(offset);
kernelGPU.domainSize(index);
kernel void kernelGPU(float input1[][], float input2[][], out float output[][])
{
    int2 index = instance().xy;
    int a = index.x;
    int b = index.y;
    for (i = 0; i < height; i++)
    {
        output = input1 + input2;
        b = b + 1;
    }
}
The result is incorrect. Could anyone tell me how to arrange the code?
Thank you very much.
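For reference, the access pattern being asked about (width threads, each walking down one column) can be emulated on the CPU as below. This is only a Python sketch of the intended indexing, not Brook+ code: the function names are hypothetical, and it assumes the arrays are indexed row first, as [y][x].

```python
def column_sum_thread(x, input1, input2, output, height):
    """Work done by one emulated thread: add column x of the two inputs."""
    for y in range(height):
        # Each element is addressed explicitly by (row y, column x).
        output[y][x] = input1[y][x] + input2[y][x]


def run_column_sum(input1, input2):
    """Emulate a 1D launch of `width` threads, one per column."""
    height = len(input1)
    width = len(input1[0])
    output = [[0.0] * width for _ in range(height)]
    for x in range(width):  # x stands in for the thread's global ID
        column_sum_thread(x, input1, input2, output, height)
    return output
```

The point of the sketch is that every read and write names both the row and the column explicitly, with the column fixed per thread and the row advanced by the loop.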