I tried using clEnqueueMapBuffer, but I couldn't get the output. In fact, I am getting NULL output.
Could you point me to any examples of the above scenario (multiple kernels, reusing the output buffer of one kernel as the input buffer of another kernel, and of course some modifications between the kernel executions)?
I hope you understand what I am asking.
Hi Nou and all,
Please help me find an example of the above scenario.
I tried with clEnqueueMapBuffer, and also with creating new buffers for each kernel without clEnqueueMapBuffer. It seems I have not grasped how buffer management works.
Thanks in advance.
Are you doing something like this?
1. clCreateBuffer with CL_MEM_READ_WRITE
2. Call kernel
4. Modify data
6. call kernel
This should work without any problem.
Scenario: I want the output buffer of a kernel to be modified in the host application, and then the modified buffer sent as input to a kernel.
The kernel modifies the input buffer and stores the results in the same buffer.
I am doing something like this:
1. outputBuffer = clCreateBuffer(context, CL_MEM_READ_WRITE | CL_MEM_USE_HOST_PTR, sizeof(cl_uint) * width * 3, output, &status);
2. output = (cl_uint *)clEnqueueMapBuffer(commandQueue, outputBuffer, CL_TRUE, CL_MAP_READ | CL_MAP_WRITE, 0, sizeof(cl_uint) * width * 3, 0, NULL, &events, &status);
// Map the host memory "output" to the device buffer "outputBuffer" so that I can use "outputBuffer" as input to the kernel.
3. clSetKernelArg(kernel, 4, sizeof(cl_mem), (void *)&outputBuffer);
5. clEnqueueReadBuffer(commandQueue, outputBuffer, CL_TRUE, 0, width * 3 * sizeof(cl_uint), output, 0, NULL, &events);
6. Modify "output" in the host application.
7. clSetKernelArg(kernel, 0, sizeof(cl_mem), (void *)&outputBuffer); // Here I am setting the same outputBuffer as input to the kernel, without creating a new buffer object, assuming the host buffer "output" and the device buffer "outputBuffer" are in sync.
9. Then clEnqueueReadBuffer(commandQueue, outputBuffer, CL_TRUE, 0, width * 3 * sizeof(cl_uint), output, 0, NULL, &events);
Please correct me if the flow is wrong.
In clEnqueueReadBuffer, you are copying data from the pointer output to the same memory location, because the buffer was created with CL_MEM_USE_HOST_PTR on output.
I would suggest using map/unmap instead of clEnqueueReadBuffer. Since your buffer is already in host memory, map/unmap should be faster.
Also, you have not unmapped the pointer after the second step; you must unmap it before launching the kernel.