5 Replies Latest reply on Jan 9, 2013 10:08 AM by stoney

    row_pitch error with CL_MEM_USE_HOST_PTR

    stoney

      I'm using an AMD Radeon HD 7970 (just installed and updated) on Windows 7-64, AMD Catalyst 12.11.

       

      In clCreateImage2D, when using CL_MEM_USE_HOST_PTR, all but one of the implementations I've tried allow passing a row_pitch that is greater than the number of bytes of pixel data in a row, so long as row_pitch is a multiple of the pixel size.

       

      With the AMD Radeon HD 7970, the excess row_pitch is not ignored, so the resulting image is skewed. In other words, it's treating the data past the end of each host row as part of the image.
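To make the skew concrete, here is a minimal sketch of the offset arithmetic. The helper names `pixelOffset` and `packedOffset` are hypothetical and not part of OpenCL. `packedOffset` is only an assumed model of the symptom (rows read as if tightly packed), not a statement about what the driver actually does internally.

```cpp
#include <cassert>
#include <cstddef>

// Byte offset of pixel (x, y) when consecutive rows start rowPitch bytes apart.
// This is the intended reading of a padded host buffer.
std::size_t pixelOffset(std::size_t x, std::size_t y,
                        std::size_t rowPitch, std::size_t pixelSize)
{
    assert(rowPitch % pixelSize == 0);  // pitch must be a multiple of the pixel size
    return y * rowPitch + x * pixelSize;
}

// Byte offset of pixel (x, y) if the pitch is disregarded and rows are assumed
// to be tightly packed (assumed model of the faulty behavior).
std::size_t packedOffset(std::size_t x, std::size_t y,
                         std::size_t width, std::size_t pixelSize)
{
    return (y * width + x) * pixelSize;
}
```

For a 256-pixel-wide RGBA8 image with 32 bytes of padding per row (a pitch of 1056 bytes), row 1 starts at byte 1056 under the intended reading but at byte 1024 under the packed reading. Under that model, each successive row drifts by another 32 bytes (8 pixels), which would show up as a skewed image.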

       

      On Mac, I've tested an AMD Radeon HD 5870 and an nVidia GeForce GT 650M, which behave properly. On Windows, I've tested an nVidia GeForce GTX 680, which worked fine, and the AMD 7970, which didn't. Interestingly, the AMD CPU OpenCL implementation works properly on Windows.

       

      The problem does not occur when using CL_MEM_COPY_HOST_PTR, or when not supplying the host pointer and calling enqueueWriteImage to copy the data.

       

      This is clearly a bug, but I'm not sure how to report it, or if I just did that.

       

      I can supply a snippet of source code if necessary.

       

      Thanks.

        • Re: row_pitch error with CL_MEM_USE_HOST_PTR
          binying

          I can supply a snippet of source code if necessary.

          --It would be nice if you upload it.

            • Re: row_pitch error with CL_MEM_USE_HOST_PTR
              stoney

              I've attached to this post a zip file containing the source code for a console app that tests every available OpenCL device for this bug.

              Here is an extract from the main test file showing how I'm creating the Image2D when it fails:

              ------

              // the width and height are arbitrary
              size_t const kImageWidth        = 256;
              size_t const kImageHeight       = 128;
              size_t const kImagePaddingBytes = 32;

              // create the input image; each row carries kImagePaddingBytes of padding
              size_t const imageRowbytes = kImageWidth * sizeof(cl_uchar4) + kImagePaddingBytes;
              size_t const imageByteSize = imageRowbytes * kImageHeight;

              // allocate imageByteSize bytes (element count = byte count / pixel size)
              boost::scoped_array<cl_uchar4> inputImage(new cl_uchar4[imageByteSize / sizeof(cl_uchar4)]);

              // fill input with random values
              boost::random::mt19937 gen;

              cl_uchar* imgBytes = reinterpret_cast<cl_uchar*>(inputImage.get());

              for (size_t i = 0; i < imageByteSize; ++i)
                  imgBytes[i] = gen() & 0xff;

              cl::ImageFormat imageFormat(CL_RGBA, CL_UNSIGNED_INT8);

              // create the image over the host buffer, passing the padded row pitch
              cl::Image2D image2D = cl::Image2D(context,
                                                CL_MEM_USE_HOST_PTR | CL_MEM_READ_ONLY,
                                                imageFormat,
                                                kImageWidth,
                                                kImageHeight,
                                                imageRowbytes,
                                                inputImage.get());

              ------

              I then create a Buffer and run a kernel that copies the image into the buffer. The test concludes by reading the buffer to host memory and comparing it with the inputImage used as the host buffer for the image.
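The comparison step can be sketched as follows. This is not the code from the attached zip. It assumes the kernel writes a tightly packed buffer (no padding between rows), and the function name `matchesPaddedSource` is hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Returns true if every row of the tightly packed result equals the pixel
// bytes of the corresponding row in the padded source. Padding bytes in the
// source are skipped and never compared.
bool matchesPaddedSource(const unsigned char* packed,
                         const unsigned char* padded,
                         std::size_t width, std::size_t height,
                         std::size_t pixelSize, std::size_t rowPitch)
{
    const std::size_t rowBytes = width * pixelSize;
    assert(rowPitch >= rowBytes);
    for (std::size_t y = 0; y < height; ++y) {
        if (std::memcmp(packed + y * rowBytes,
                        padded + y * rowPitch,
                        rowBytes) != 0)
            return false;
    }
    return true;
}
```

If the padding leaks into the image, the rows after the first no longer line up with the source rows, and a check of this kind fails.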

               

              The comparison fails on Tahiti (the HD 7970) if I use CL_MEM_USE_HOST_PTR as above.

               

              If I use CL_MEM_COPY_HOST_PTR instead of CL_MEM_USE_HOST_PTR, then it works. If I don't specify either, then use enqueueWriteImage to write the host buffer to the GPU, it also works properly.

               

              Here is the output of the test app on my Windows 7 box:

              ----

              Starting OpenCL Rowbytes Test
              Testing device: 'AMD Accelerated Parallel Processing - Tahiti'
                      Mode: Host copy - Passed
                      Mode: Explicit write - Passed
                      Mode: Host map - Failed
              Testing device: 'AMD Accelerated Parallel Processing -        Intel(R) Core(TM) i7-3930K CPU @ 3.20GHz'
                      Mode: Host copy - Passed
                      Mode: Explicit write - Passed
                      Mode: Host map - Passed
              Testing device: 'NVIDIA CUDA - GeForce GTX 680'
                      Mode: Host copy - Passed
                      Mode: Explicit write - Passed
                      Mode: Host map - Passed
              OpenCL Rowbytes Test Complete

              ----

              I hope that's clear and manageable.

                • Re: row_pitch error with CL_MEM_USE_HOST_PTR
                  developer

                  Thanks for your detailed post. Will dig more next week. Thanks!

                  • Re: row_pitch error with CL_MEM_USE_HOST_PTR
                    heman

                    hi stoney,

                    Thanks for the detailed test case. I am able to reproduce the problem here.

                    Also, the memory corruption only happens when some padding is present. If the padding is 0, all the test cases pass.

                     

                    I will send it to the right people.

                     

                    Thanks

                    workitem7

                      • Re: row_pitch error with CL_MEM_USE_HOST_PTR
                        stoney

                        Yes, the point of this bug report is that row padding should be ignored. The row pitch is required to be a multiple of the pixel size, but the padding is not supposed to be transferred to the cl image.
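As an illustration of which bytes are expected to end up in the image, here is a minimal sketch that copies only the pixel bytes of each row into a tightly packed buffer. The name `stripRowPadding` is hypothetical. The same routine could serve as a host-side workaround, at the cost of an extra copy.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Copies width * pixelSize bytes from each padded row and drops the rest,
// returning a tightly packed image.
std::vector<unsigned char> stripRowPadding(const unsigned char* padded,
                                           std::size_t width, std::size_t height,
                                           std::size_t pixelSize, std::size_t rowPitch)
{
    const std::size_t rowBytes = width * pixelSize;
    assert(rowPitch >= rowBytes && rowPitch % pixelSize == 0);

    std::vector<unsigned char> packed;
    packed.reserve(rowBytes * height);
    for (std::size_t y = 0; y < height; ++y) {
        const unsigned char* row = padded + y * rowPitch;
        packed.insert(packed.end(), row, row + rowBytes);  // pixel bytes only
    }
    return packed;
}
```

A buffer packed this way could be passed with a row pitch of 0, which sidesteps padded rows entirely.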

                         

                        Row padding shows up in several common situations, such as when processing frames in Adobe After Effects, when processing only a "Region of Interest" in a frame, or when using the padding to select one field of an interlaced frame. If the implementation ignores the row padding, there is no need to copy a padded frame into a tightly packed buffer before using it to fill a cl image.
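The region-of-interest and interlaced-field cases can both be expressed as a base pointer plus a row pitch over the same frame memory. The sketch below is illustrative only: `PitchedView`, `regionOfInterest`, and `interlacedField` are hypothetical names, not part of OpenCL or After Effects.

```cpp
#include <cassert>
#include <cstddef>

// A window onto frame memory: where it starts, its extent, and the byte
// distance between the starts of consecutive rows.
struct PitchedView {
    const unsigned char* base;
    std::size_t width;     // pixels per row
    std::size_t height;    // number of rows
    std::size_t rowPitch;  // bytes between row starts
};

// Region of interest: same pitch, shifted base, smaller extent.
PitchedView regionOfInterest(const PitchedView& frame,
                             std::size_t x0, std::size_t y0,
                             std::size_t w, std::size_t h,
                             std::size_t pixelSize)
{
    assert(x0 + w <= frame.width && y0 + h <= frame.height);
    PitchedView v = { frame.base + y0 * frame.rowPitch + x0 * pixelSize,
                      w, h, frame.rowPitch };
    return v;
}

// One field of an interlaced frame: field 0 is the even rows, field 1 the
// odd rows. Doubling the pitch skips the rows of the other field.
PitchedView interlacedField(const PitchedView& frame, std::size_t field)
{
    assert(field < 2);
    const std::size_t rows = (frame.height + 1 - field) / 2;
    PitchedView v = { frame.base + field * frame.rowPitch,
                      frame.width, rows, 2 * frame.rowPitch };
    return v;
}
```

In both cases the view's `base` and `rowPitch` are what would be handed to the image-creation call as the host pointer and row pitch, so the approach only works if the implementation honors the pitch.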

                         

                        Thanks for passing this on to the Right People!

                        - Stoney