Hi,
I have 5 kernels to launch in a specific order, so for now I use clEnqueueNDRangeKernel from C++.
I would like to create another kernel that is able to launch these kernels, but I don't want to put the code of all 5 kernels into the same "build".
4 of the kernels have been saved in binary format, to avoid recompiling them each time!
The last kernel is dynamically generated at run time.
I suspect this is not possible, but I thought I would ask.
Thanks
You can call another kernel from a kernel; it behaves like an ordinary function call. But you can't "dynamically link" binary kernels.
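To make this concrete, here is a minimal sketch of what calling a kernel from a kernel looks like. The OpenCL C source is held in a C++ string, as it would be before being passed to clCreateProgramWithSource. The kernel names `scale`, `offset` and `pipeline` are invented for illustration and do not come from the original question. All three kernels have to live in the same program source, which is exactly the restriction being discussed.

```cpp
#include <cassert>
#include <string>

// OpenCL C source kept as a host-side string.
// All kernel names below are hypothetical examples.
const std::string kProgramSource = R"CLC(
__kernel void scale(__global float* data, float factor)
{
    size_t i = get_global_id(0);
    data[i] *= factor;
}

__kernel void offset(__global float* data, float delta)
{
    size_t i = get_global_id(0);
    data[i] += delta;
}

// "Dispatcher" kernel: calls the other two kernels as plain functions.
// No new NDRange is launched; each work-item simply runs both bodies in order.
__kernel void pipeline(__global float* data, float factor, float delta)
{
    scale(data, factor);
    offset(data, delta);
}
)CLC";
```

Because the calls run inside the calling work-item, there is no synchronization across the whole NDRange between the two stages. A chain like this is only safe when each stage touches just the work-item's own elements.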
That is exactly the problem!
I would like to compile the last kernel and declare that it is related to the other kernels!
They are all in the same 'program', so technically it should be possible!
I don't want to go back to C++ and use a C++ kernel to dispatch the kernels 😞
It is not possible to launch (enqueue) a kernel from within a kernel, but a kernel can call another kernel as an ordinary function.
Note: clEnqueueNativeKernel can call any C function, and that function can in turn launch any kernel with clEnqueueNDRangeKernel.
The problem is that I have several kernels. Imagine 10 kernels.
One of the kernels is generated dynamically and can change while the software is running.
So I must recompile it each time; the other kernels are saved as binaries.
I don't want to put everything in the same 'program', because I don't want to recompile everything each time just because a few functions have changed!
On the GPU, compilation takes a lot of time!
I see only two solutions:
1. Compile the kernel code in another thread while setting up the data for that kernel's buffers (asynchronous implementation).
2. Avoid dynamic kernel generation by writing a single kernel that handles these dynamic cases.
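The first option could be sketched roughly as follows. To keep the sketch runnable without an OpenCL installation, `build_dynamic_kernel` is a hypothetical stand-in for the slow step (in real code, clCreateProgramWithSource followed by clBuildProgram), and `prepare_buffers` is a hypothetical stand-in for the host-side data setup. Only the overlap pattern with `std::async` is the point here.

```cpp
#include <cassert>
#include <chrono>
#include <future>
#include <string>
#include <thread>
#include <utility>

// Hypothetical result type; real code would hold a cl_program / cl_kernel.
struct BuiltProgram {
    std::string source;
    bool ok;
};

// Hypothetical stand-in for the slow compile step.
// It only sleeps to simulate compile time; it does not call OpenCL.
BuiltProgram build_dynamic_kernel(std::string source)
{
    std::this_thread::sleep_for(std::chrono::milliseconds(50));
    return BuiltProgram{std::move(source), true};
}

// Start the build on a worker thread and return immediately.
std::future<BuiltProgram> start_build(std::string source)
{
    return std::async(std::launch::async, build_dynamic_kernel, std::move(source));
}

// Hypothetical host-side work that does not depend on the compiled kernel.
int prepare_buffers(int n)
{
    int checksum = 0;
    for (int i = 0; i < n; ++i) checksum += i;
    return checksum;
}

// Overlap the two: kick off the build, prepare data, then wait for the build.
BuiltProgram build_while_preparing(const std::string& source, int n, int& checksum_out)
{
    std::future<BuiltProgram> pending = start_build(source);
    checksum_out = prepare_buffers(n);  // runs while the build is in flight
    assert(pending.valid());
    return pending.get();               // blocks only if the build is still running
}
```

This does not make the compilation itself faster; it only hides part of the cost behind work the host has to do anyway. The kernel still cannot be enqueued until `get()` returns.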