cyndwith
Journeyman III

How can we use multiple kernels on a single device sequentially?

I have been trying to implement some image processing operations. I need to run two kernel operations on the image, one after the other.

That is, there are two filters, and the output of one filter is fed to the other. Can someone help me with how to do this? I know events can be used for this kind of thing, but I have only learnt the theory and have never really been exposed to how to code it.

0 Likes
6 Replies
rwelsch
Adept I

If your command queue is not set as an "out-of-order" queue, you can just enqueue the kernels in the desired order in your command queue.

From: http://www.khronos.org/registry/cl/sdk/1.2/docs/man/xhtml/clCreateCommandQueue.html

If the CL_QUEUE_OUT_OF_ORDER_EXEC_MODE_ENABLE property of a command-queue is not set, the commands enqueued to a command-queue execute in order. For example, if an application calls clEnqueueNDRangeKernel to execute kernel A followed by a clEnqueueNDRangeKernel to execute kernel B, the application can assume that kernel A finishes first and then kernel B is executed. If the memory objects output by kernel A are inputs to kernel B then kernel B will see the correct data in memory objects produced by execution of kernel A. If the CL_QUEUE_OUT_OF_ORDER_EXEC_MODE_ENABLE property of a command-queue is set, then there is no guarantee that kernel A will finish before kernel B starts execution.

The easiest way is to put clFinish(queue) between the kernel executions. For the recommended ways, you can browse through the AMD APP Samples.

You should not use clFinish() for this!

By default all queues are in-order queues, i.e. FIFO.

All you do is invoke each kernel in turn with the right arguments, and OpenCL will ensure they are run with the right data in the order specified.

You can use the 'wait' flag on the data transfer calls (the blocking_read / blocking_write argument) when you move data around if you need to ensure host<->device synchronisation. You can use clFinish at this point too, but it is definitely not recommended as a general synchronisation mechanism: for starters, it will kill any potential performance stone cold dead.

0 Likes

I agree with notzed. I only mentioned clFinish as the easiest way, not the recommended way.

0 Likes

Thanks a lot for all your replies; they have been helpful.

I want to execute two kernels such that the output of one kernel goes to the other. In that case, do I have to set the kernel arguments and enqueue the second kernel only after the first one (I guess not)? Or is it OK to create both kernels, set the kernel arguments of both, and then just enqueue them with a wait-for command? Will this work?

I tried to go through the AMD APP Samples looking for a similar type of implementation (multiple kernels), but I couldn't find any good example. Can anyone suggest a good sample program in the AMD APP Samples illustrating the above model?

Thanks again...

0 Likes

Yes, the approach you mentioned should work. For proper usage of events, I recommended the AMD APP Samples. I am not sure whether they include multiple-kernel samples, but almost every sample requires synchronization to make sure kernel execution has finished before reading back the output.

0 Likes