1 Reply Latest reply on Jul 30, 2014 1:58 AM by dipak

    Downside of using image vs global memory?

    boxerab

      I have a kernel that processes RGB images. Currently, I take each channel one by one and run the same kernel on it.

      The kernel input is a global memory buffer: data is moved in chunks from the global buffer into local memory for processing, then stored into another global buffer as output.

      I was thinking of refactoring this to store all three channels in an RGBA buffer, and operate on all three channels at the same time, using vector operations. I understand that images have better spatial caching.

      Is there any disadvantage to this refactor? I realize that I will have to reduce the number of pixels per chunk, because I will now have three times the amount of data.
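To make the chunk-size concern concrete, here is a rough sketch of the arithmetic in plain C. The function name and parameters are hypothetical, and the local-memory budget is assumed to come from a device query such as CL_DEVICE_LOCAL_MEM_SIZE. Note that if the chunk is held as RGBA, the alpha channel still occupies space, so the footprint per pixel is four channels rather than three.

```c
#include <stddef.h>

/* Hypothetical helper: how many pixels fit in a local-memory budget.
 * local_mem_bytes:   budget in bytes (assumed to be queried from the device,
 *                    e.g. the value reported for CL_DEVICE_LOCAL_MEM_SIZE)
 * channels:          1 for a single plane, 4 for RGBA (alpha still takes space)
 * bytes_per_channel: 4 for float, 1 for uchar */
size_t max_pixels_per_chunk(size_t local_mem_bytes,
                            size_t channels,
                            size_t bytes_per_channel)
{
    size_t bytes_per_pixel = channels * bytes_per_channel;
    if (bytes_per_pixel == 0) {
        return 0; /* avoid division by zero on degenerate input */
    }
    return local_mem_bytes / bytes_per_pixel;
}
```

With a hypothetical 32 KiB budget and float data, a single-channel chunk could hold 8192 pixels, while an RGBA chunk could hold 2048.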

      Thanks!

        • Re: Downside of using image vs global memory?
          dipak

          Hi,

          I think you are moving in the right direction. OpenCL image objects are a better option for operating on non-linear structures such as 2D and 3D images, especially for spatial operations like convolution and filtering. Most GPUs have special hardware (e.g. texture memory/cache) to handle images, and it provides very fast access to, and filtering of, texture images. There are a few limitations worth knowing, though:

          1) There is a limit on the maximum size of 2D and 3D image objects (it can be queried using clGetDeviceInfo).

          2) The same image object cannot be both read and written in the same kernel (this restriction is lifted in OpenCL 2.0).

          3) Image data cannot be accessed directly through a pointer; it is accessible only through the built-in image read/write functions.

          4) Most OpenCL implementations store image objects internally in a non-linear layout, so copying or mapping an image object to linear memory, such as a buffer or linear host memory, may have some performance impact.

          5) There is a limit on the number of image objects that can be used as kernel arguments (for read and for write), although these limits are large enough for general applications.
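As an illustration of the first limitation, the size check can be sketched in plain C. This is a minimal sketch, not a complete host program: the function names are hypothetical, and the limit values are assumed to have been obtained beforehand from clGetDeviceInfo with CL_DEVICE_IMAGE2D_MAX_WIDTH and CL_DEVICE_IMAGE2D_MAX_HEIGHT. The tile count is just one possible way to react when an image is too large.

```c
#include <stddef.h>

/* Returns 1 if a width x height image fits within the device's 2D image limits.
 * max_w and max_h are assumed to be the values reported by clGetDeviceInfo for
 * CL_DEVICE_IMAGE2D_MAX_WIDTH and CL_DEVICE_IMAGE2D_MAX_HEIGHT. */
int image2d_fits(size_t width, size_t height, size_t max_w, size_t max_h)
{
    return width > 0 && height > 0 && width <= max_w && height <= max_h;
}

/* Number of tiles needed along one axis when the extent may exceed the limit
 * (ceiling division). Returns 0 if the limit itself is 0. */
size_t tiles_along_axis(size_t extent, size_t max_extent)
{
    if (max_extent == 0) {
        return 0;
    }
    return (extent + max_extent - 1) / max_extent;
}
```

For example, with hypothetical limits of 16384 x 16384, a 20000-pixel-wide image would not fit as a single image object and would need two tiles along its width.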

           

          Regards,