I have a question about the domain of execution when using scatter output. For example, suppose I have this code:
var2[z] = var1[x][y+offset][z];
var1[x][y+other_offset][z] = var2[z];
Can I run the domain over just the x and y dimensions, without running over the z dimension, since every thread is going to use the entire z dimension anyway?
So, basically, I just want the domain to cover the two outer for loops, i and j (the x and y indices above), not k (the z index). I will unroll the k loop inside the kernel.
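To make clear what I mean, here is a rough sketch in plain C, not actual kernel code. The names `NX`, `NY`, `NZ`, `kernel_body` and `run_domain` are made up for illustration, and the bounds check on the offset rows is my own assumption. The two loops in `run_domain` stand in for the 2D domain of execution, and each call to `kernel_body` is what I would like a single thread to do:

```c
#include <assert.h>

/* Hypothetical sizes, for illustration only. */
#define NX 2
#define NY 4
#define NZ 3

/* What one thread would do: it is launched once per (x, y)
   and walks the whole z dimension itself. */
static void kernel_body(float var1[NX][NY][NZ], int x, int y,
                        int offset, int other_offset)
{
    float var2[NZ];
    int src = y + offset;        /* row that is read (gather)     */
    int dst = y + other_offset;  /* row that is written (scatter) */

    assert(x >= 0 && x < NX && y >= 0 && y < NY);

    /* Assumption: threads whose offset rows fall outside the array do nothing. */
    if (src < 0 || src >= NY || dst < 0 || dst >= NY)
        return;

    /* The k (z) loop lives inside the kernel; this is what I would unroll. */
    for (int z = 0; z < NZ; ++z)
        var2[z] = var1[x][src][z];
    for (int z = 0; z < NZ; ++z)
        var1[x][dst][z] = var2[z];
}

/* Stand-in for the domain of execution: only x and y are iterated. */
static void run_domain(float var1[NX][NY][NZ], int offset, int other_offset)
{
    for (int x = 0; x < NX; ++x)        /* i loop */
        for (int y = 0; y < NY; ++y)    /* j loop */
            kernel_body(var1, x, y, offset, other_offset);
}
```

On the CPU these loops run in order, so the result is well defined. On the GPU I assume I would also have to make sure that no thread reads a row that another thread is scattering to.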