Archives Discussions

rotor · ‎06-22-2011

Multiple Kernels

Hi collective brains,

I have a couple of questions about running multiple kernels concurrently on GPUs. I have searched the internet on this issue but found no clear answer, and I could not find any code example either. Any help would be really appreciated.

1) How do I enable multiple kernels to run concurrently? Some people say to enable out-of-order execution on a single queue; others say to use multiple queues, one per kernel. Which is the best way? Is there an example of this?

2) I have a kernel (say kernel1) that processes different sets of data independently, and for each set of data it needs a different amount of shared (local) memory. A simple solution is to launch the kernel only once to process all the data sets and allocate local memory for the upper bound. However, this is inefficient in my case because the upper and lower bounds differ greatly, so occupancy is not optimal.

So what I am planning to do is launch multiple copies of kernel1 (i.e. kernel11, kernel12, kernel13, ...) with different local memory usage to process data set1, set2, set3.

For example:

kernel11 = clCreateKernel( program_t,"kernel1",&clstatus);

kernel12 = clCreateKernel( program_t,"kernel1",&clstatus);

If I create the kernels that way, will OpenCL treat them as different kernels, or will OpenCL recognize them as the same kernel (since their code is the same)?
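To put rough numbers on the occupancy concern in question 2, here is a minimal sketch in C. It assumes local memory is the only limiting resource and that a compute unit has 32 KB of it; both are illustrative assumptions (the real figure can be queried with CL_DEVICE_LOCAL_MEM_SIZE, and registers or hardware limits may cap occupancy further). The helper name groups_per_cu is hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: upper bound on the number of work-groups that can be
 * resident on one compute unit when local memory is the only limit.
 * Register pressure and hardware work-group limits are ignored here. */
size_t groups_per_cu(size_t local_mem_per_cu, size_t local_mem_per_group)
{
    assert(local_mem_per_group > 0);  /* avoid division by zero */
    return local_mem_per_cu / local_mem_per_group;  /* integer floor */
}
```

With the assumed 32 KB per compute unit, a launch sized for a 16 KB upper bound allows at most groups_per_cu(32768, 16384) = 2 resident work-groups, while a data set that only needs 2 KB could have up to groups_per_cu(32768, 2048) = 16.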

3) Regarding my problem in 2), can you suggest a better solution?

Many Thanks

nou · ‎06-22-2011

1. Both ways should work, but currently the AMD implementation doesn't support any concurrent kernel execution on one GPU.

2. They will be treated as different kernels (each clCreateKernel call returns a separate kernel object with its own argument values).

3. You can set the local memory size differently at each execution, so I don't see a reason why you need multiple kernels. You only need multiple kernel objects when you use multiple host threads.

himanshu_gautam · ‎06-23-2011

If you have more than one GPU to run your kernels on, then it would be good to have two queues and use different kernels. Otherwise, just use one kernel and change the size of local memory required via kernel arguments.

rotor · ‎06-23-2011

Thanks a lot, nou and himanshu. I get the point now.
