Hi,
I have started working on OpenCL very recently.
I was porting some CUDA kernels to OpenCL kernels.
In one of the kernels, I found a statement like this:
int res = atomicInc(classified, (unsigned int)objects.cols);
I want to port this line to OpenCL. OpenCL has an atomic_inc
function, but it takes only one parameter, not two.
In CUDA, atomicInc first compares the old value with the second parameter and then either increments it or resets it to 0.
I want to have the same functionality as CUDA in OpenCL.
Could anyone show me the right way? How can I implement CUDA's atomicInc functionality
on top of OpenCL's atomic_inc?
Any idea would be appreciated.
Thanks.
Maybe you can combine atom_inc() and atom_cmpxchg().
Hi,
Thanks for your reply.
atom_cmpxchg only checks whether the two values are equal.
But in CUDA, atomicInc checks whether the old value is less than the second parameter.
Is there something by which I can get the above functionality?
Thanks in advance.
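For reference, the CUDA documentation describes atomicInc(address, val) as reading old from address, storing ((old >= val) ? 0 : (old+1)) back, and returning old. A minimal sequential model of that rule in plain C might look like the sketch below. It is deliberately not atomic and only pins down the arithmetic; the function name cuda_atomic_inc_model is made up for this illustration.

```c
/* Sequential model of CUDA's atomicInc(address, val). NOT atomic.
   cuda_atomic_inc_model is a hypothetical name used only in this sketch. */
unsigned int cuda_atomic_inc_model(unsigned int *address, unsigned int val)
{
    unsigned int old = *address;
    /* Wrap to 0 once the counter has reached val, otherwise add 1. */
    *address = (old >= val) ? 0u : old + 1u;
    /* Like CUDA's atomicInc, return the value read before the update. */
    return old;
}
```

With val = 2 the counter cycles through 0, 1, 2 and then wraps back to 0.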
Right now OpenCL has no function that does the same as CUDA's atomicInc, but you can get it by adding a check before calling OpenCL's atomic_inc() function, something like: if the check on the old value passes, call atomic_inc(), else return 0.
No, that will not work.
Because the kernel runs in parallel, the check is done in parallel too: multiple threads can compare at the same time, all get the same old value,
and all increment it by one. Only the atomic operation itself is serialized, not the check in front of it.
I would try something like:
    while( true ){
        old = *ptr;
        if( old >= val ){
            // limit reached: try to wrap the counter to 0
            if( old == atomic_cmpxchg(ptr, old, 0) )
                break;
        } else {
            // below the limit: try to store old+1
            if( old == atomic_cmpxchg(ptr, old, old+1) )
                break;
        }
    }
Obviously not so efficient, but I think it would have the same functionality as CUDA's atomicInc.
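The same compare-and-swap retry loop can be tried out on the host. The sketch below uses C11 <stdatomic.h> instead of OpenCL's atomic_cmpxchg, so it is a model of the idea rather than kernel code; the function name inc_wrap is made up here. In an OpenCL kernel, ptr would presumably be a volatile __global unsigned int * and the exchange would be atomic_cmpxchg(ptr, old, next), compared against old as in the loop above.

```c
#include <stdatomic.h>

/* Host-side C11 model of the CAS loop. inc_wrap is a hypothetical name. */
unsigned int inc_wrap(_Atomic unsigned int *ptr, unsigned int val)
{
    unsigned int old = atomic_load(ptr);
    for (;;) {
        /* Same rule as CUDA's atomicInc: wrap to 0 at val, else add 1. */
        unsigned int next = (old >= val) ? 0u : old + 1u;
        /* The store only succeeds if *ptr still equals old. On failure,
           old is refreshed with the current value and the loop retries. */
        if (atomic_compare_exchange_weak(ptr, &old, next))
            return old; /* value seen just before our update */
    }
}
```

Because the new value is computed from old and only stored if *ptr still equals old, a thread that lost the race simply recomputes from the fresh value instead of applying a stale increment.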
Thanks for your reply. But don't you think multiple threads might check the same old value in the comparison?
I.e., the first thread may check old(=3)<*ptr(=5); by the time it increments, another thread may make the same comparison (old=3<*ptr=5).
In that case, the result won't be correct.
What do you think?
That's why you have to use atomic_cmpxchg. If another thread has intervened and changed the value since it was read into old, the atomic_cmpxchg will fail and the thread will try again by rereading *ptr.
The ptr also has to be declared volatile so that the compute unit is forced to reread its value in every iteration of the loop.
ekondis wrote:
The ptr also has to be declared volatile so that the compute unit is forced to reread its value in every iteration of the loop.
The volatile keyword is supported in OpenCL:
The type qualifiers const, restrict and volatile as defined by the C99
specification are supported. These qualifiers cannot be used with image2d_t,
image3d_t , image2d_array_t, image1d_t, image1d_buffer_t and
image1d_array_t types. Types other than pointer types shall not use the
restrict qualifier.