stoney
Journeyman III

row_pitch error with CL_MEM_USE_HOST_PTR

I'm using an AMD Radeon HD 7970 (just installed and updated) on 64-bit Windows 7 with AMD Catalyst 12.11.

In clCreateImage2D, when using CL_MEM_USE_HOST_PTR, all but one of the implementations I've tried allow passing a row_pitch that is greater than the number of bytes of pixel data in a row, so long as row_pitch is a multiple of the pixel size.

On the AMD Radeon HD 7970, the excess row_pitch is not skipped, so the resulting image is skewed. In other words, it treats the data past the end of each host row as part of the image.
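To make the skew concrete, here is a minimal host-only sketch of the two behaviours. It does not call OpenCL at all; pixelOffset and readImage are hypothetical helpers written for illustration, and the honorPitch flag is only my model of what the failing device appears to do, not a description of the actual driver.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper (not an OpenCL call): byte offset of pixel (x, y) in a
// host buffer whose rows start rowPitch bytes apart.
inline std::size_t pixelOffset(std::size_t x, std::size_t y,
                               std::size_t rowPitch, std::size_t pixelSize) {
    return y * rowPitch + x * pixelSize;
}

// Hypothetical helper: gather a padded host image into a tightly packed one.
// honorPitch == true  : rows are rowPitch bytes apart (expected behaviour).
// honorPitch == false : rows are assumed to be width * pixelSize bytes apart,
//                       so padding bytes leak into the image (the skew).
std::vector<std::uint8_t> readImage(const std::vector<std::uint8_t>& host,
                                    std::size_t width, std::size_t height,
                                    std::size_t pixelSize, std::size_t rowPitch,
                                    bool honorPitch) {
    assert(rowPitch >= width * pixelSize);
    assert(host.size() >= rowPitch * height);
    const std::size_t stride = honorPitch ? rowPitch : width * pixelSize;
    std::vector<std::uint8_t> out(width * height * pixelSize);
    for (std::size_t y = 0; y < height; ++y)
        for (std::size_t x = 0; x < width; ++x)
            for (std::size_t b = 0; b < pixelSize; ++b)
                out[(y * width + x) * pixelSize + b] =
                    host[pixelOffset(x, y, stride, pixelSize) + b];
    return out;
}
```

With a 2x2 image of 1-byte pixels and one padding byte per row, honoring the pitch skips the padding, while ignoring it pulls the first row's padding byte into the second row.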

On Mac, I've tested an AMD Radeon HD 5870 and an nVidia GeForce GT 650M, which behave properly. On Windows, I've tested an nVidia GeForce GTX 680, which worked fine, and the AMD 7970, which didn't. Interestingly, the AMD CPU OpenCL implementation works properly on Windows.

It does not fail when using CL_MEM_COPY_HOST_PTR, or when not supplying the host pointer and calling enqueueWriteImage to copy the data.

This is clearly a bug, but I'm not sure how to report it, or if I just did that.

I can supply a snippet of source code if necessary.

Thanks.

5 Replies
binying
Challenger

"I can supply a snippet of source code if necessary."

It would be nice if you could upload it.


I've attached to this post a zip file containing the source code for a console app that tests all the OpenCL devices for this bug.

Here is an extract from the main test file showing how I'm creating the Image2D when it fails:

------

// the width and height are arbitrary
size_t const kImageWidth        = 256;
size_t const kImageHeight       = 128;
size_t const kImagePaddingBytes = 32;

    // create the input image; each row carries kImagePaddingBytes of padding
    size_t const imageRowbytes = kImageWidth * sizeof(cl_uchar4) + kImagePaddingBytes;
    size_t const imageByteSize = imageRowbytes * kImageHeight;

    // imageByteSize is in bytes, so divide by the element size when allocating
    // (imageRowbytes is a multiple of sizeof(cl_uchar4), so this is exact)
    boost::scoped_array<cl_uchar4> inputImage(new cl_uchar4[imageByteSize / sizeof(cl_uchar4)]);

    // fill input (including the padding bytes) with random values
    boost::random::mt19937 gen;
    cl_uchar* imgBytes = reinterpret_cast<cl_uchar*>(inputImage.get());
    for (size_t i = 0; i < imageByteSize; ++i)
        imgBytes[i] = gen() & 0xff;

    cl::ImageFormat imageFormat(CL_RGBA, CL_UNSIGNED_INT8);

    // create the image directly over the host buffer, passing the padded row pitch
    cl::Image2D image2D = cl::Image2D(context,
                                      CL_MEM_USE_HOST_PTR | CL_MEM_READ_ONLY,
                                      imageFormat,
                                      kImageWidth,
                                      kImageHeight,
                                      imageRowbytes,
                                      inputImage.get());

------

I then create a Buffer and run a kernel that copies the image into the buffer. The test concludes by reading the buffer to host memory and comparing it with the inputImage used as the host buffer for the image.
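For readers who don't want to open the zip, the comparison step can be sketched roughly as follows. This is not the code from the attached test app; matchesIgnoringPadding is a hypothetical helper, and it assumes the buffer read back from the device is tightly packed (no padding) while the host image keeps its padded rows.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical helper: returns true when every row of the tightly packed
// `result` equals the pixel bytes of the matching padded row in `host`.
// Padding bytes at the end of each host row are skipped, never compared.
bool matchesIgnoringPadding(const unsigned char* host, std::size_t hostRowPitch,
                            const unsigned char* result,
                            std::size_t rowBytes, std::size_t height) {
    assert(hostRowPitch >= rowBytes);
    for (std::size_t y = 0; y < height; ++y) {
        const unsigned char* hostRow   = host + y * hostRowPitch;
        const unsigned char* resultRow = result + y * rowBytes;
        if (std::memcmp(hostRow, resultRow, rowBytes) != 0)
            return false;  // this row differs, e.g. because padding leaked in
    }
    return true;
}
```

Here rowBytes would be kImageWidth * sizeof(cl_uchar4) and hostRowPitch would be imageRowbytes from the extract above.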

The comparison fails on Tahiti if I use CL_MEM_USE_HOST_PTR as above.

If I use CL_MEM_COPY_HOST_PTR instead of CL_MEM_USE_HOST_PTR, then it works. If I don't specify either, then use enqueueWriteImage to write the host buffer to the GPU, it also works properly.

Here is the output of the test app on my Windows 7 box:

----

Starting OpenCL Rowbytes Test
Testing device: 'AMD Accelerated Parallel Processing - Tahiti'
        Mode: Host copy - Passed
        Mode: Explicit write - Passed
        Mode: Host map - Failed
Testing device: 'AMD Accelerated Parallel Processing -        Intel(R) Core(TM) i7-3930K CPU @ 3.20GHz'
        Mode: Host copy - Passed
        Mode: Explicit write - Passed
        Mode: Host map - Passed
Testing device: 'NVIDIA CUDA - GeForce GTX 680'
        Mode: Host copy - Passed
        Mode: Explicit write - Passed
        Mode: Host map - Passed
OpenCL Rowbytes Test Complete

----

I hope that's clear and manageable.


Thanks for your detailed post. I will dig into it more next week. Thanks!


Hi stoney,

Thanks for the detailed test case. I am able to reproduce the problem here.

Also, the memory corruption only happens when some padding is present. If the padding is 0, all the test cases pass.

I will send it to the right people.

Thanks

workitem7


Yes, the point of this bug report is that row padding should be ignored. The row pitch is required to be a multiple of the pixel size, but the padding is not supposed to be transferred to the cl image.
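As I read the clCreateImage2D description, a row pitch of 0 means "tightly packed", and a non-zero pitch must be at least width times the pixel size and a multiple of the pixel size. A minimal sketch of that check, with hypothetical helper names of my own (isValidRowPitch and effectiveRowPitch are not OpenCL functions):

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper: checks a host row pitch against the rule described
// above. A pitch of 0 is accepted and stands for width * pixelSize.
bool isValidRowPitch(std::size_t rowPitch, std::size_t width,
                     std::size_t pixelSize) {
    if (pixelSize == 0) return false;
    if (rowPitch == 0) return true;
    return rowPitch >= width * pixelSize && rowPitch % pixelSize == 0;
}

// Hypothetical helper: the number of bytes between consecutive row starts
// that a reader of the host buffer is expected to use.
std::size_t effectiveRowPitch(std::size_t rowPitch, std::size_t width,
                              std::size_t pixelSize) {
    assert(isValidRowPitch(rowPitch, width, pixelSize));
    return rowPitch == 0 ? width * pixelSize : rowPitch;
}
```

For the test case in this thread, 256 pixels of 4 bytes plus 32 bytes of padding gives a pitch of 1056, which passes both conditions.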

Row padding shows up in several common situations, such as when processing frames in Adobe After Effects, when processing only a "Region of Interest" in a frame, or when using the padding to select one field of an interlaced frame. It's important to ignore the row padding to avoid needing to copy a frame with row padding to remove the padding before using it to fill a cl image.
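To illustrate the last two cases, here is a rough host-only sketch of how a region of interest or a single field can be described with nothing but a start pointer, a size, and a row pitch, so no repacking copy is needed. HostImageView, regionOfInterest, and field are hypothetical names for illustration; the resulting data pointer and rowPitch are what would be handed to the image creation call, assuming the implementation honors the pitch.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical description of an image that lives inside a larger host frame.
struct HostImageView {
    const unsigned char* data;  // first byte of the first pixel
    std::size_t width;          // in pixels
    std::size_t height;         // in rows
    std::size_t rowPitch;       // bytes between consecutive row starts
};

// Region of interest: same pitch as the full frame, shifted start, smaller
// size. Everything outside the region acts as row padding.
HostImageView regionOfInterest(const HostImageView& frame, std::size_t pixelSize,
                               std::size_t x0, std::size_t y0,
                               std::size_t w, std::size_t h) {
    assert(x0 + w <= frame.width && y0 + h <= frame.height);
    return { frame.data + y0 * frame.rowPitch + x0 * pixelSize,
             w, h, frame.rowPitch };
}

// One field of an interlaced frame: start at row fieldIndex (0 or 1) and
// double the pitch so that every other row is skipped.
HostImageView field(const HostImageView& frame, std::size_t fieldIndex) {
    assert(fieldIndex < 2);
    const std::size_t rows = (frame.height + 1 - fieldIndex) / 2;
    return { frame.data + fieldIndex * frame.rowPitch,
             frame.width, rows, 2 * frame.rowPitch };
}
```

In both cases the "padding" is real pixel data belonging to the rest of the frame, which is why it matters that it is skipped and not pulled into the image.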

Thanks for passing this on to the Right People!

- Stoney
