I am seeing inconsistency when reading the histogram buffers (rhist, ghist, bhist) back from the kernel. For the same input data I see variations in the values in these buffers.
The code in the kernel is shown below.
Do the unary increment operators behave correctly in an OpenCL kernel?
rhist[ output[index + 0] ]++;
ghist[ output[index + 1] ]++;
bhist[ output[index + 2] ]++;
For the first run:
when i=1 rhist: 3935 ghist: 3060 bhist: 2884
i=2 rhist: 7533 ghist: 8436 bhist: 6656
For the second run, I am seeing these inconsistencies:
when i=1 rhist: 3935 ghist: 3062 bhist: 2885
i=2 rhist: 7532 ghist: 8438 bhist: 6656
**********************************
Please find the application code below.
1. Creating the buffers:
rhistBuffer = clCreateBuffer(context,CL_MEM_READ_WRITE | CL_MEM_USE_HOST_PTR,sizeof(cl_int) * 256 ,rhist,&status);
ghistBuffer = clCreateBuffer(context,CL_MEM_READ_WRITE | CL_MEM_USE_HOST_PTR,sizeof(cl_int) * 256 ,ghist,&status);
bhistBuffer = clCreateBuffer(context,CL_MEM_READ_WRITE | CL_MEM_USE_HOST_PTR,sizeof(cl_int) * 256 ,bhist,&status);
2. Setting the kernel arguments:
/* the rhist array to the kernel */
status = clSetKernelArg(kernel,6, sizeof(cl_mem),(void *)&rhistBuffer);
/* the ghist array to the kernel */
status = clSetKernelArg(kernel,7, sizeof(cl_mem),(void *)&ghistBuffer);
/* the bhist array to the kernel */
status = clSetKernelArg(kernel,8, sizeof(cl_mem),(void *)&bhistBuffer);
3. Reading back from the GPU:
status = clEnqueueReadBuffer(commandQueue,rhistBuffer,CL_TRUE, 0,256 * sizeof(cl_int),rhist, 0, NULL, &events[1]);
status = clWaitForEvents(1, &events[1]);
clReleaseEvent(events[1]);
status = clEnqueueReadBuffer(commandQueue,ghistBuffer,CL_TRUE, 0,256 * sizeof(cl_int),ghist, 0, NULL, &events[1]);
status = clWaitForEvents(1, &events[1]);
clReleaseEvent(events[1]);
status = clEnqueueReadBuffer(commandQueue,bhistBuffer,CL_TRUE, 0,256 * sizeof(cl_int),bhist, 0, NULL, &events[1]);
status = clWaitForEvents(1, &events[1]);
clReleaseEvent(events[1]);
Please let me know where I am going wrong. I have verified the code, but I am not getting the correct values from the three buffers.
Micah Villmow,
Thanks, but my GPU doesn't support the atomics extension cl_khr_global_int32_base_atomics (found via ./CLInfo).
I am getting the following error in the GPU build:
error: bad argument type to opencl atom op: expected pointer to int/uint with addrSpace global/local atom_inc(rhist[output[aa + 0]]);
For the GPU, how should I solve this race condition, since the extension is not available?
On the CPU, outputs are consistent with the atom_inc functions. I modified the code to:
atom_inc(&rhist[output[index + 0]]);
atom_inc(&ghist[output[index + 1]]);
atom_inc(&bhist[output[index + 2]]);
But I am still unsure how to do this on the GPU.
Hi All
Please give me some pointers on a solution. I am also not getting correct values from this operation on the GPU. How can I avoid the race condition on the GPU without the atom_inc functions?
rhist[ output[index + 0] ]++;
ghist[ output[index + 1] ]++;
bhist[ output[index + 2] ]++;
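One standard way to avoid atomics entirely is privatization: give every work-item (or work-group) its own partial histogram, so no two work-items ever write the same location, and merge the partials afterwards in a second kernel or on the host. A kernel-side sketch only, not a drop-in replacement: `output`, the 256-bin layout, and the `pixels_per_item` partitioning are assumptions based on your snippets, and only the red channel is shown (green and blue work the same way with their own scratch buffers):

```c
/* Sketch: race-free histogram without atomics via privatization.
 * partial_r must hold get_global_size(0) * 256 uints, zero-initialized. */
__kernel void partial_hist(__global const uchar *output,
                           __global uint *partial_r,
                           uint pixels_per_item)
{
    uint gid = get_global_id(0);
    __global uint *my_bins = partial_r + gid * 256; /* private slice */

    for (uint p = 0; p < pixels_per_item; ++p) {
        uint index = (gid * pixels_per_item + p) * 3; /* RGB triplets */
        my_bins[output[index + 0]]++; /* no race: slice is per-work-item */
    }
}
```

The trade-off is memory: the scratch buffer grows with the number of work-items, so this is usually combined with a small global size where each work-item loops over many pixels, followed by a reduction pass that sums the slices into the final 256-bin histogram.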
Hi Micah Villmow,
Thanks for the clarification. I will pull this code back to the host side, but at the cost of a performance hit.
Regards
Pavan
Hi Micah Villmow,
I am new to parallel programming and couldn't follow the approach. Can you please list the steps in terms of OpenCL functions?
Thanks
Pavan