
Wheedle
Journeyman III

Newbie question about workspace, buffers and char3

Hi everybody,

I’m stuck on a small problem that is most probably due to my lack of knowledge of OpenCL. What I’m trying to do is load a color image, send it to my GPU, transform it to grayscale and get it back. To do so I’m using buffers, because it’s only an exercise. Since it’s most probably a logical mistake, I’m also writing down my thinking. Here it is:

In order to read the picture I first use OpenCV:

IplImage* frame = cvLoadImage("D:\\Special Folders\\Pictures\\SamplePictures\\JellyfishS.jpg");       

W = frame->width;

H = frame->height;

To access the image itself, the IplImage structure provides a char pointer, “imageData”. I can also access the width and height of the image in pixels.

I also create an “empty image” with only one channel that will receive the result of the kernel.

IplImage * op_data = cvCreateImage(cvSize(frame->width,frame->height),frame->depth,1); //last argument is the number of channel

After creating the platform, context, queue, program and so on, I create the buffers like this:

size_t mem_size = (size_t)frame->imageSize; //return the size of the image in bytes

size_t outputSize = (size_t)op_data->imageSize;

cl_mem data = clCreateBuffer(context, CL_MEM_READ_ONLY | CL_MEM_COPY_HOST_PTR,mem_size, (void*)frame->imageData, &ciErrNum);

cl_mem output = clCreateBuffer(context,CL_MEM_WRITE_ONLY,outputSize,NULL,&ciErrNum);
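Both buffer sizes above come straight from imageSize. As far as I know, OpenCV computes imageSize as height times widthStep, and widthStep may include padding at the end of each row, so it can be larger than width * height * channels. Here is a minimal sketch for checking the two numbers against each other; the helper names are hypothetical and not part of OpenCV or OpenCL:

```c
#include <assert.h>
#include <stddef.h>

/* Bytes needed for a tightly packed image: no padding between rows. */
size_t packed_image_bytes(size_t width, size_t height, size_t channels)
{
    assert(channels > 0);
    return width * height * channels;
}

/* Bytes needed when every row is padded up to a multiple of `align` bytes.
   Hypothetical helper: the real row stride of an IplImage is in widthStep. */
size_t padded_image_bytes(size_t width, size_t height, size_t channels,
                          size_t align)
{
    assert(align > 0);
    size_t row = width * channels;                      /* payload bytes per row */
    size_t stride = (row + align - 1) / align * align;  /* rounded up to align  */
    return stride * height;
}
```

If the two values differ for a given image, a kernel that computes its index as i + j * width is assuming a layout the buffer does not have.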

My kernel is created like this:

cl_kernel mykernel = clCreateKernel (myprog , "luminosityGrayscale",&status);

I set the arguments:

status = clSetKernelArg(mykernel, 0, sizeof(cl_mem), &data);

status = clSetKernelArg(mykernel,1,sizeof(cl_mem), &output);

And finally I execute it like this:

size_t globalws[2] = {W, H};

cl_event evt;

status = clEnqueueNDRangeKernel(queue, mykernel, 2, NULL, globalws, NULL, 0, NULL, &evt);

I read back the data like this:

status = clEnqueueReadBuffer(queue, output, CL_TRUE, 0, outputSize, (void*)op_data->imageData, 0, NULL, NULL);

Now, because the image I manipulate is a 24-bit RGB image (so 8 bits per channel) and because OpenCV stores the pixels interleaved, like this: RGBRGBRGBRGB, I decided to use a uchar3 for the kernel's input argument, thinking that this would let me manipulate each pixel easily, and a uchar for the output argument, since the resulting image should have only one channel (so only 8 bits per pixel). As shown in the code above, I also created a workspace with the size of my image in pixels (which means one work-item per pixel, if I understand the concept correctly). So my kernel looks like this:

__kernel void luminosityGrayscale(__global uchar3 *data, __global uchar *output) {
    const int i = get_global_id(0);
    const int j = get_global_id(1);
    const int sizei = get_global_size(0);
    int index = i + j * sizei;
    uint R = (int)data[index].s0;
    uint G = (int)data[index].s1;
    uint B = (int)data[index].s2;
    uint average = R * 0.07 + G * 0.71 + B * 0.21;
    output[index] = (uchar)average;
}
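For comparison, here is a minimal CPU sketch of the same arithmetic in plain C, useful for checking the GPU output against. It assumes a tightly packed buffer with 3 bytes per pixel and no row padding; the function name is hypothetical. The weights are applied to components 0, 1 and 2 exactly as in the kernel. Note that OpenCV normally loads color images in B, G, R order, so component 0 may in fact be blue.

```c
#include <assert.h>
#include <stddef.h>

/* CPU reference for the kernel's arithmetic (hypothetical helper).
   src: width * height pixels, 3 bytes each, tightly packed.
   dst: width * height bytes, one gray value per pixel. */
void luminosity_gray_cpu(const unsigned char *src, unsigned char *dst,
                         size_t width, size_t height)
{
    assert(src != NULL && dst != NULL);
    for (size_t j = 0; j < height; ++j) {
        for (size_t i = 0; i < width; ++i) {
            size_t index = i + j * width;
            /* Packed layout: pixel `index` starts at byte index * 3. */
            const unsigned char *p = src + index * 3;
            unsigned c0 = p[0], c1 = p[1], c2 = p[2];
            /* Same weights and truncation as the kernel above. */
            dst[index] = (unsigned char)(c0 * 0.07 + c1 * 0.71 + c2 * 0.21);
        }
    }
}
```

The point of the sketch is the stride: each pixel is addressed at index * 3 bytes, which is what the host buffer actually contains.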

The result of this is wrong. I do get a grayscale image, but it looks duplicated (three or four times), superposed and shifted.

My guess is that I screwed up the index part in the kernel, but I can’t pinpoint the flaw. It’s also possible that I made a mistake somewhere else. So my question is: “Where is my logic flawed?”


antzrhere
Adept III

The reason it is wrong is that all 3-component vector types (char3, int3, float3, etc.) are laid out IN MEMORY as 4-component vectors (i.e. your uchar3 actually has the size and alignment of a uchar4: 4 bytes). 3-component vectors are a special case that was added to OpenCL later in its life cycle (in OpenCL 1.1). This allows the fourth component to be ignored in arithmetic.

That's why your image is 'shifted': because of the different sizes, work-item n reads from byte 4n of the buffer while pixel n actually starts at byte 3n, so each read is misaligned by one more byte than the previous one.
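The drift can be sketched in plain C. This is only an illustration of the offsets, assuming a tightly packed 3-byte-per-pixel host buffer; the function names are hypothetical:

```c
#include <assert.h>
#include <stddef.h>

/* Byte offset the kernel actually reads for data[index] when data is a
   uchar3 pointer: the element stride is 4 bytes, not 3. */
size_t uchar3_read_offset(size_t index)
{
    return index * 4;
}

/* Byte offset where pixel `index` really starts in a packed 3-byte buffer. */
size_t packed_pixel_offset(size_t index)
{
    return index * 3;
}

/* Which packed pixel, and which channel inside it, the first component of
   data[index] lands on. */
void landed_on(size_t index, size_t *pixel, size_t *channel)
{
    assert(pixel != NULL && channel != NULL);
    size_t byte = uchar3_read_offset(index);
    *pixel = byte / 3;
    *channel = byte % 3;
}
```

For example, data[1] starts at byte 4, which is the second channel of pixel 1, and data[3] starts at byte 12, which is the first channel of pixel 4. The reads also run through the buffer a third faster than the pixels do, so the later work-items end up reading past the end of the 3-byte-per-pixel data.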

One workaround is to supply your data as an RGBA image. You can then read it as a char4, or as a char3, in which case the alpha channel is ignored. You could also read each component as a scalar value from a char[] buffer, but this would be very inefficient and bad practice.
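A minimal host-side sketch of that conversion, assuming a tightly packed 3-byte source with no row padding (the function name is hypothetical):

```c
#include <assert.h>
#include <stddef.h>

/* Copy packed 3-byte pixels into a 4-byte-per-pixel buffer so that each
   pixel matches the 4-byte stride of uchar3/uchar4 on the device.
   dst must hold pixel_count * 4 bytes. */
void expand_to_four_channels(const unsigned char *src, unsigned char *dst,
                             size_t pixel_count)
{
    assert(src != NULL && dst != NULL);
    for (size_t n = 0; n < pixel_count; ++n) {
        dst[4 * n + 0] = src[3 * n + 0];
        dst[4 * n + 1] = src[3 * n + 1];
        dst[4 * n + 2] = src[3 * n + 2];
        dst[4 * n + 3] = 255; /* padding / alpha, ignored when read as uchar3 */
    }
}
```

The input buffer would then be created with pixel_count * 4 bytes. OpenCV can also do this conversion itself with cvCvtColor and CV_BGR2BGRA. OpenCL C additionally provides vload3, which reads three consecutive components from a scalar pointer, so a kernel taking a __global const uchar * argument could probably call vload3(index, data) on the packed buffer without any host-side conversion.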
