Do you know of any samples where the device fission extension is used?
For better cache usage in a raycaster, I am trying to have every core of the CPU work on its own column of the picture instead of working on random work-groups.
There are a couple of things the documentation left me unsure about.
- Can I have 8 command queues, one for each core of the CPU, as well as a command queue for the parent device (the entire CPU)?
That way I could use the individual cores for the raycasting, but then use the entire CPU for post-processing.
- Can I build the program with only one build call, passing all sub-devices and the parent device as parameters?
- If I use only one program, can I use only one instance of each kernel as well?
Could all cores then work with the same kernel, with different arguments, at the same time?
So when I want to start the kernel on all cores, would I first set the common kernel arguments and then, for every core, just set the core-specific arguments, enqueue the kernel, and move on to the next core (change a kernel argument, enqueue, ...)?
- Can you somehow use the device fission extension with the OpenCL 1.1 C++ bindings and the current StreamSDK?
If I define USE_CL_DEVICE_FISSION, it won't compile because cl_ext.h is still at revision 10424 instead of 11702.
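
To make the per-core loop from the kernel question more concrete, here is a minimal sketch of the column split it would need. It assumes each core gets one contiguous block of columns, which is only one possible layout. The names `column_range` and `columns_for_core` are made up for illustration and are not part of OpenCL. The OpenCL calls appear only as comments, since whether this pattern is valid with one kernel object is exactly what is being asked.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper type, not part of OpenCL: a contiguous block of
   image columns assigned to one core (one sub-device). */
typedef struct {
    size_t first;  /* index of the first column in the block */
    size_t count;  /* number of columns in the block */
} column_range;

/* Split `width` columns as evenly as possible among `num_cores` cores.
   When the division leaves a remainder, the first `width % num_cores`
   cores each receive one extra column. */
static column_range columns_for_core(size_t width, size_t num_cores,
                                     size_t core)
{
    assert(num_cores > 0 && core < num_cores);

    size_t base  = width / num_cores;
    size_t extra = width % num_cores;

    column_range r;
    r.first = core * base + (core < extra ? core : extra);
    r.count = base + (core < extra ? 1 : 0);
    return r;
}

/* Intended use, once per frame (assumed pattern, OpenCL calls omitted):
 *
 *   set the common kernel arguments once
 *   for core = 0 .. 7:
 *       r = columns_for_core(width, 8, core)
 *       clSetKernelArg(...)          with r.first and r.count
 *       clEnqueueNDRangeKernel(...)  on the queue of that core
 */
```

For a picture 640 columns wide and 8 cores, every core would get 80 columns, so core 3 would start at column 240.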
Thanks for reading.