Hello, I'm trying to implement an atomic counter.
I got my information from this post, http://www.khronos.org/message_boards/viewtopic.php?t=4520, and from the http://www.khronos.org/registry/cl/extensions/ext/cl_ext_atomic_counters_32.txt specification.
The specification says:
The Host initializes the atomic counter by passing a buffer object as an
argument to the kernel. The counter initial value is taken from the first
4 bytes of the buffer object. During the run of the kernel, the global
buffer is not updated by the atomic operations done on the counter. At the
end of the kernel’s run, the updated value of the counter is written to
the buffer object.
but it does not specify what type the buffer needs to have.
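For reference, the lifecycle in the quoted passage can be modelled in plain C++. This is only a host-side sketch and not OpenCL code: run_emulated_kernel, buffer and work_items are made-up names, and the sequential loop stands in for work-items that would really run in parallel on the device.

```cpp
#include <atomic>
#include <cstdint>
#include <cstring>

// Hypothetical host-side model of the counter lifecycle; not part of any OpenCL API.
// `buffer` stands in for the buffer object, `work_items` for the number of work-items.
inline void run_emulated_kernel(unsigned char* buffer, unsigned work_items) {
    // 1. The initial value is taken from the first 4 bytes of the buffer.
    std::uint32_t initial = 0;
    std::memcpy(&initial, buffer, sizeof(initial));
    std::atomic<std::uint32_t> counter(initial);

    // 2. During the run, increments touch only the counter, never the buffer.
    //    (Sequential here; on a device the work-items would run concurrently.)
    for (unsigned i = 0; i < work_items; ++i) {
        counter.fetch_add(1, std::memory_order_relaxed);
    }

    // 3. At the end of the run, the updated value is written back to the buffer.
    const std::uint32_t final_value = counter.load();
    std::memcpy(buffer, &final_value, sizeof(final_value));
}
```

The point of the model is that only the first 4 bytes matter, so a buffer of sizeof(cl_int) bytes holding the start value is consistent with the quoted text.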
So, here are my kernel arguments on the host side:
cl_int *p;
p = (int *)malloc(sizeof(cl_int));
p[0] = 0;
cl::Buffer index( context , CL_MEM_READ_WRITE | CL_MEM_COPY_HOST_PTR , sizeof(cl_int), p);
//executing kernel
kernel_func3(resultBuffer2, resultBuffer3, count, cl::__local(sizeof(cl_int)*REP*2*64), cl::__local(sizeof(cl_int)*REP*2), resultBuffer2_index, p);
cmdqueue.finish();
This compiles but segfaults inside kernel_func3.
I've tried to use the type counter32_t, but the compiler just reports
‘counter32_t’ was not declared in this scope
How do I have to initialize the buffer to get the AMD atomic counter to work?
Thanks kevin,
And here is my kernel declaration:
__kernel void compare( __global int* input, __global int* output, const unsigned int count, __local int * shared , __local int * suspect, __global int* output_index, counter32_t point )
Are you checking that atomic counters are supported on the device you are attempting to execute on, and that it is initialized correctly in the kernel?
I have checked for support, and it is supported:
extensions | cl_khr_global_int32_base_atomics cl_khr_global_int32_extended_atomics cl_khr_local_int32_base_atomics cl_khr_local_int32_extended_atomics cl_khr_3d_image_writes cl_khr_byte_addressable_store cl_khr_gl_sharing cl_ext_atomic_counters_32 cl_amd_device_attribute_query cl_amd_vec3 cl_amd_printf cl_amd_media_ops cl_amd_popcnt |
I'm not sure what you mean by initialization in the kernel.
But even with my atomic operations commented out, my kernel crashes.
Thanks
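The support check mentioned in this post can also be done programmatically. A minimal sketch, assuming the space-separated extension string has already been fetched from the device (for example via CL_DEVICE_EXTENSIONS); has_extension is a hypothetical helper name, not an OpenCL function.

```cpp
#include <sstream>
#include <string>

// Hypothetical helper: returns true if `name` appears as a whole token in the
// space-separated `extensions` string reported by the device.
inline bool has_extension(const std::string& extensions, const std::string& name) {
    std::istringstream tokens(extensions);
    std::string token;
    while (tokens >> token) {
        // Compare whole tokens so a longer name that merely contains `name`
        // is not reported as a match.
        if (token == name) {
            return true;
        }
    }
    return false;
}
```

Calling has_extension(extensions, "cl_ext_atomic_counters_32") on the list above would confirm support. Comparing whole tokens rather than searching for a substring avoids false positives from similarly named extensions.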
Then you most likely have a buffer overrun in your kernel.
Yes, I found my mistake.
I'm sorry, it's a stupid one.
cl_int *p;
p = (cl_int *)malloc(sizeof( cl_int));
p[0] = 0;
cl::Buffer index( context , CL_MEM_READ_WRITE | CL_MEM_COPY_HOST_PTR , sizeof(cl_int), p);
//executing kernel
kernel_func3(resultBuffer2, resultBuffer3, count, cl::__local(sizeof(cl_int)*REP*2*64), cl::__local(sizeof(cl_int)*REP*2), resultBuffer2_index,p);
should have been (passing the cl::Buffer index as the last argument instead of the host pointer p):
cl_int *p;
p = (cl_int *)malloc(sizeof( cl_int));
p[0] = 0;
cl::Buffer index( context , CL_MEM_READ_WRITE | CL_MEM_COPY_HOST_PTR , sizeof(cl_int), p);
//executing kernel
kernel_func3(resultBuffer2, resultBuffer3, count, cl::__local(sizeof(cl_int)*REP*2*64), cl::__local(sizeof(cl_int)*REP*2), resultBuffer2_index,index);