Originally posted by: viewon01 Hi,
We have an application that must be able to work on images (textures), and it should run on both CPU and GPU.
So, I would like to know if there is a way to efficiently transfer this data to the kernel?
1 - If I'm on the CPU, I transfer an array of bytes like any other data. This also works on the GPU.
2 - If I'm on the GPU, I can use OpenGL interop! But it will not work on the CPU!
So, is there a single method to transfer and work on this data without checking whether we are on the CPU or the GPU?
If I'm on the GPU, will I get better performance if I use textures? Or is it better to keep my image in global memory?
A few things you need to remember:
Images are not supported on the CPU. You should keep your data in global memory.
GL interop is supported on both CPU and GPU. You will usually get good performance on the GPU when using GL interop.
You should keep your data in global memory if you want the same code to work on both devices.
Textures are optimized for specific access patterns, so performance depends entirely on your algorithm.
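To illustrate the "keep your data in global memory" advice, here is a minimal sketch of a kernel that takes the image as a plain buffer, so the same source can be built for a CPU or a GPU device. The kernel name, the `width`/`height` parameters, the `uchar4` RGBA8 layout and the invert operation are all assumptions made for the example, not something from the original posts.

```c
// Hypothetical example kernel: inverts the RGB channels of an RGBA8 image
// stored as a plain row-major buffer in global memory (no image objects).
// Assumption: one uchar4 per pixel, tightly packed rows.
__kernel void invert_rgba_buffer(__global uchar4 *pixels,
                                 const int width,
                                 const int height)
{
    const int x = (int)get_global_id(0);
    const int y = (int)get_global_id(1);

    // The global size may be rounded up past the image size; skip the padding.
    if (x >= width || y >= height)
        return;

    const int idx = y * width + x;   // row-major pixel index
    uchar4 p = pixels[idx];
    p.x = 255 - p.x;
    p.y = 255 - p.y;
    p.z = 255 - p.z;                 // alpha (p.w) is left unchanged
    pixels[idx] = p;
}
```

On the host side this would typically be fed from a buffer created with clCreateBuffer and launched over a 2D range with clEnqueueNDRangeKernel, with no device-specific branching.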
In addition to what genaganna said, you don't really need identical kernels for CPU and GPU. You can compile a custom kernel for each device, which is to some extent the beauty of it.
You could create one kernel using __global char4 * Image for the CPU and another using image2d_t with a byte sampler for the GPU (assuming you're using the standard RGBA layout).
The kernels wouldn't look too dissimilar if you did that.
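As a rough sketch of what the image-based GPU variant might look like: the kernel below assumes the images were created with the CL_RGBA / CL_UNSIGNED_INT8 format, and the kernel name, sampler name and invert operation are hypothetical. A buffer-based CPU variant would do the same per-pixel work but index a __global pointer with y * width + x instead of calling the image functions.

```c
// Hypothetical example kernel for devices that support images.
// Assumption: src and dst were created as CL_RGBA / CL_UNSIGNED_INT8 images.
__constant sampler_t pixel_sampler = CLK_NORMALIZED_COORDS_FALSE |
                                     CLK_ADDRESS_CLAMP_TO_EDGE |
                                     CLK_FILTER_NEAREST;

// Separate source and destination images are used because an image cannot be
// both read and written by the same kernel in OpenCL 1.x.
__kernel void invert_rgba_image(__read_only image2d_t src,
                                __write_only image2d_t dst)
{
    const int2 pos = (int2)((int)get_global_id(0), (int)get_global_id(1));

    // The global size may be rounded up past the image size; skip the padding.
    if (pos.x >= get_image_width(src) || pos.y >= get_image_height(src))
        return;

    uint4 p = read_imageui(src, pixel_sampler, pos);  // channels in 0..255
    p.x = 255u - p.x;
    p.y = 255u - p.y;
    p.z = 255u - p.z;                                 // alpha (p.w) unchanged
    write_imageui(dst, pos, p);
}
```

The host could then decide per device which variant to build, for example by checking CL_DEVICE_IMAGE_SUPPORT with clGetDeviceInfo before creating image objects.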