}
So what happens is that I need each workgroup to take an equal number of elements from a queue that has nr_queue elements. In the 1st step the queue has 1 element, so only group 0 will get 1 element. Thus "nr_roots" becomes 1 and I put it in BC[5]. Then I do a barrier to make sure that all the workitems in the workgroup know that nr_roots is 1, but when I output nr_roots to BC[6] the result is 0. Does anyone have any idea why this might be? Thanks!
Maybe you can enable this,
#pragma OPENCL EXTENSION cl_amd_printf : enable
so that you can print nr_roots at those lines that confuse you. Well, this
may not be the smartest way, but it works.
-----------------
#pragma OPENCL EXTENSION cl_amd_printf : enable
__kernel void async_copy (__global int* in, __global int* out, __local int * sdata)
{
    int2 threadIdx;
    int2 blockIdx;
    int2 blockDim;
    threadIdx.x = (int)get_local_id(0);
    threadIdx.y = (int)get_local_id(1);
    blockIdx.x = (int)get_group_id(0);
    blockIdx.y = (int)get_group_id(1);
    blockDim.x = (int)get_local_size(0);
    blockDim.y = (int)get_local_size(1);
    int xDim = (int)get_global_size(0);
    int yDim = (int)get_global_size(1);
    int2 idx;
    idx.x = blockDim.x*blockIdx.x + threadIdx.x;
    idx.y = blockDim.y*blockIdx.y + threadIdx.y;
    int xPos = (int)get_global_id(0);
    int yPos = (int)get_global_id(1);
    if((xPos == 0) && (yPos==0))
        printf("%d.%d %d.%d\n", xPos, yPos, idx.x, idx.y);
    int gidx = yPos*xDim + xPos;
    out[gidx] = gidx;
    ...
}
The problem seems to be that the variable needs to be declared with the "volatile" qualifier, so that threads read it from memory and not from registers (at least that's what I think). Adding volatile seems to solve it.
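It might have something to do with the fact that BC is in global memory, but you're using a local barrier.
Barriers can't be conditional either: every work-item in the work-group has to reach the same barrier, so it can't sit inside a branch that only some of them take.
Not to mention that every thread is writing to the same addresses.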
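To make those three points concrete, here is a minimal sketch of how a per-group value could be shared safely. It is not the original kernel: the kernel name take_roots, the debug buffer, and the even-split formula are all assumptions made up for illustration. The shared value lives in __local memory so that it matches a CLK_LOCAL_MEM_FENCE barrier, the barrier sits outside the if, and each work-item writes to its own slot.

```opencl
// Hypothetical kernel: the name, the debug buffer and the split formula
// are illustrative assumptions, not taken from the original post.
// debug must hold one int per work-item (global size elements).
__kernel void take_roots(const int nr_queue, __global int *debug)
{
    // One copy per work-group, visible to all of its work-items.
    __local int nr_roots;

    const int lid       = (int)get_local_id(0);
    const int grp       = (int)get_group_id(0);
    const int nr_groups = (int)get_num_groups(0);

    // Only one work-item per group computes and stores the shared value.
    if (lid == 0) {
        // Even split: the first (nr_queue % nr_groups) groups take one extra.
        nr_roots = nr_queue / nr_groups
                 + ((grp < nr_queue % nr_groups) ? 1 : 0);
    }

    // Outside the branch, so every work-item in the group reaches it.
    // The fence flag matches the address space of nr_roots (__local).
    barrier(CLK_LOCAL_MEM_FENCE);

    // Each work-item writes to its own address, so there is no write race.
    debug[get_global_id(0)] = nr_roots;
}
```

With nr_queue equal to 1, this split gives group 0 one element and every other group none, which is the situation described in the question. If the shared value had to stay in a __global buffer instead, the barrier would need CLK_GLOBAL_MEM_FENCE, and even then it only synchronizes work-items within the same work-group, not across groups.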