
Newbie question about workspace, buffers and char3

Question asked by Wheedle on Mar 3, 2012
Latest reply on Mar 20, 2012 by mpineyro

Hi everybody,

I’m stuck on a small problem that is most probably due to my lack of knowledge of OpenCL. What I’m trying to do is capture a color image, send it to my GPU, convert it to grayscale, and get it back. I’m using buffers because it’s only an exercise. Since it’s most probably a logical mistake, I’m also writing down my reasoning. Here it is:

In order to read the picture I first use OpenCV:


IplImage* frame = cvLoadImage("D:\\Special Folders\\Pictures\\SamplePictures\\JellyfishS.jpg");       

W = frame->width;

H = frame->height;


To access the image itself, the IplImage structure provides a char pointer, “imageData”. I can also read the width and height of the image in pixels.

I also create an “empty image” with only one channel that will receive the result of the kernel.


IplImage * op_data = cvCreateImage(cvSize(frame->width,frame->height),frame->depth,1); // last argument is the number of channels


After creating the platform, context, queue, program and so on, I create the buffers like this:

size_t mem_size = (size_t)frame->imageSize; // size of the image in bytes

size_t outputSize = (size_t)op_data->imageSize;

cl_mem data = clCreateBuffer(context, CL_MEM_READ_ONLY | CL_MEM_COPY_HOST_PTR,mem_size, (void*)frame->imageData, &ciErrNum);

cl_mem output = clCreateBuffer(context,CL_MEM_WRITE_ONLY,outputSize,NULL,&ciErrNum);
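As a side note on the two sizes above, here is a minimal sketch in plain C of how I understand imageSize to be derived. It assumes that each row is padded to a multiple of 4 bytes (which I believe is OpenCV's default for IplImage) and that imageSize equals height times the padded row size. The helper names row_stride and image_bytes are made up for this illustration; in real code the row size should be read from frame->widthStep.

```c
#include <assert.h>
#include <stddef.h>

/* Bytes per image row when each row is padded up to a multiple of `align`.
   ASSUMPTION: align = 4 mirrors OpenCV's default row alignment for IplImage;
   real code should read frame->widthStep instead of recomputing it. */
size_t row_stride(size_t width, size_t channels, size_t align)
{
    assert(align > 0);
    size_t raw = width * channels;              /* unpadded bytes per row */
    return (raw + align - 1) / align * align;   /* round up to alignment  */
}

/* Total buffer size in bytes under the same assumption
   (height times padded row size). */
size_t image_bytes(size_t width, size_t height, size_t channels, size_t align)
{
    return height * row_stride(width, channels, align);
}
```

For example, a 5-pixel-wide, 3-channel row holds 15 bytes of pixel data but would occupy 16 bytes, so a byte offset computed as 3 * (x + y * width) would drift by one byte per row. When the width is a multiple of 4 the padding is zero and this does not matter.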


My kernel is created like this:

cl_kernel mykernel = clCreateKernel (myprog , "luminosityGrayscale",&status);


I set the arguments:

status = clSetKernelArg(mykernel, 0, sizeof(cl_mem), &data);

status = clSetKernelArg(mykernel,1,sizeof(cl_mem), &output);


And finally I execute it like this:

size_t globalws[2] = {W, H};

cl_event evt;

status = clEnqueueNDRangeKernel(queue, mykernel,2, 0, globalws, NULL,0, NULL, &evt);
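To check my understanding of this launch, here is a small host-side sketch in plain C that emulates the W x H range with two nested loops and applies the same index formula the kernel uses (i + j * size_i). The function names are hypothetical and nothing here calls OpenCL; it only tests whether every pixel index is produced exactly once.

```c
#include <stddef.h>

/* Linear index the kernel would compute for work-item (i, j) in a
   2-D range whose first dimension has size_i work-items. */
size_t linear_index(size_t i, size_t j, size_t size_i)
{
    return i + j * size_i;
}

/* Emulates the W x H launch on the host and counts how many work-items
   land on each index. `hits` must have room for w * h entries.
   Returns 1 if every index in [0, w*h) is hit exactly once, else 0. */
int covers_each_pixel_once(size_t w, size_t h, unsigned char *hits)
{
    for (size_t k = 0; k < w * h; ++k)
        hits[k] = 0;
    for (size_t j = 0; j < h; ++j)          /* get_global_id(1) */
        for (size_t i = 0; i < w; ++i)      /* get_global_id(0) */
            hits[linear_index(i, j, w)]++;
    for (size_t k = 0; k < w * h; ++k)
        if (hits[k] != 1)
            return 0;
    return 1;
}
```

Note that this is an element index, not a byte offset. It only matches the memory layout if the element type has the size I expect and the rows carry no padding.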


I read back the data like this:

status = clEnqueueReadBuffer(queue, output, CL_TRUE, 0, outputSize, (void*) op_data->imageData, 0, NULL, NULL); // wait-list count is a cl_uint, so 0 rather than NULL


Now, because the image I manipulate is a 24-bit RGB image (so 8 bits per channel) and because OpenCV stores images as an interleaved sequence like RGBRGBRGBRGB, I decided to use a char3 for the kernel's input argument (uchar3 in the code below), thinking that this would let me manipulate each pixel easily. For the output argument I use a char (uchar below), since the resulting image should only have one channel (so only 8 bits per pixel). As shown in the code above, I also created a workspace with the size of my image in pixels, which means one work-item per pixel, if I understand the concept correctly. So my kernel looks like this:


__kernel void luminosityGrayscale (__global uchar3 * data, __global uchar * output){

       const int i = get_global_id(0);

       const int j = get_global_id(1);

       const int sizei = get_global_size(0);

       int index = i+j*sizei;

       uint R = (int)data[index].s0;

       uint G = (int)data[index].s1;

       uint B =(int)data[index].s2;

       uint average = R*0.07+G*0.71+B*0.21;    

       output[index] = (uchar)average;  

}

The result of this is wrong. I do get a grayscale image, but it looks like the picture has been duplicated three or four times, with the copies shifted and superimposed.

My guess is that I messed up the index calculation in the kernel, but I can’t pinpoint the flaw. It’s also possible that I made a mistake somewhere else. So my question is: “Where is my logic flawed?”
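One thing that might be worth checking is the distance in bytes between consecutive pixels. The sketch below is plain C on the host, not the kernel itself, and the function names are made up. It converts a packed 3-bytes-per-pixel buffer with a configurable pixel stride, using the same weights as the kernel. It rests on an assumption: as far as I can tell from the OpenCL 1.1 specification, a 3-component vector such as uchar3 has the size of a 4-component one, so data[index] on a uchar3 pointer would step 4 bytes per pixel while the image data steps 3.

```c
#include <stddef.h>

/* Luminosity weights copied from the kernel above, applied to the three
   bytes of one pixel in the order they are stored. */
unsigned char gray_from_bytes(const unsigned char *p)
{
    return (unsigned char)(p[0] * 0.07 + p[1] * 0.71 + p[2] * 0.21);
}

/* Converts an interleaved 3-channel buffer to one byte per pixel.
   `pixel_stride` is the distance in bytes between consecutive pixels:
   3 for tightly packed data, 4 to mimic indexing through a uchar3 pointer
   (ASSUMPTION: 3-component vectors are sized like 4-component ones). */
void to_gray(const unsigned char *src, size_t src_bytes,
             unsigned char *dst, size_t pixels, size_t pixel_stride)
{
    for (size_t k = 0; k < pixels; ++k) {
        size_t off = k * pixel_stride;
        /* Guard against reading past the end when the stride is too large. */
        dst[k] = (off + 3 <= src_bytes) ? gray_from_bytes(src + off) : 0;
    }
}
```

With a stride of 3 each output pixel is built from its own three bytes. With a stride of 4 the reads drift by one byte per pixel, mix channels of neighbouring pixels, and run off the end of the buffer early, which could plausibly produce the shifted, repeated picture described above. If that turns out to be the cause, one option would be to declare the kernel argument as __global uchar * and read each pixel with vload3(index, data), which loads three packed components starting at data + 3 * index.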