7 Replies Latest reply on Sep 29, 2009 6:33 AM by omkaranathan

    ERROR: kernel must return void; pointer argument must point to addrSpace __global, __local, or __constant

    alexaverbuch

      Hi,

      I have recently added a few more parameters to my Kernel and now I get the following error:

      /tmp/OCLSr0DUi.cl(17): error: kernel must return void; pointer argument must
                point to addrSpace __global, __local, or __constant
        __kernel void edgeDetectKernel(    __global      uint4 * input,

      After adding the new parameters I also added code in the Host to allocate memory, create buffers, set Kernel arguments, etc. I don't know where to look next...

      I have had this error once before, but that was when I was (foolishly) trying to pass a pointer-to-pointer as a parameter.

      Does anyone know what could be causing this error?

      Thanks in advance,

      Alex

      __kernel void edgeDetectKernel( __global uint4 * input,
                                      __private uint * intermediate,
                                      __global uint * output,
                                      __global uint * clSobelOpX,
                                      __global uint * clSobelOpY,
                                      __private uint kernelCount,
                                      const uint2 sobelDim,
                                      const uint2 inputOutputDim,
                                      const uint2 intermediateDim )
      {
          // CODE
      }

          alexaverbuch

          Here is some of the code for creating my cl_mem buffers

          inputBuffer = clCreateBuffer(
              context,
              CL_MEM_READ_ONLY | CL_MEM_USE_HOST_PTR,
              sizeof(cl_uint4) * width * height,
              input,
              &status);

          intermediateBuffer = clCreateBuffer(
              context,
              CL_MEM_READ_WRITE | CL_MEM_USE_HOST_PTR,
              sizeof(cl_uint) * intermediateDim.u32[0] * intermediateDim.u32[1],
              intermediate,
              &status);

          outputBuffer = clCreateBuffer(
              context,
              CL_MEM_WRITE_ONLY | CL_MEM_USE_HOST_PTR,
              sizeof(cl_uint) * width * height,
              output,
              &status);

              alexaverbuch

              Hmm, OK, so a kernel's pointer argument cannot point to __private memory...

              Is there any way to allocate a buffer for a Kernel, and make it private?

              It's not possible to allocate memory INSIDE the Kernel, so what other alternative is there for providing a Kernel with its own private work space (an array)?

                  jcpalmer

                  Just statically declare it in the body of the kernel like:

                  __kernel void edgeDetectKernel( ...){

                  uint   intermediate[37];

                  }

                  Be careful though, as this is not an unlimited resource. It is also multiplied by the work-group size, since every work-item gets its own copy of the array. I know this is a school project, so this is probably fine. If this were commercial, I think you would want to look hard at your design to see if you can avoid it, especially if the array is large.
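To put a rough number on that warning, here is a minimal C sketch of the arithmetic. The helper name `private_bytes_per_group` and the work-group size of 64 used below are illustrative assumptions, not values from this thread; real limits depend on the device.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: bytes of private memory consumed by one work-group
 * when every work-item declares an array of `elements` values, each
 * `element_size` bytes wide. */
static size_t private_bytes_per_group(size_t elements, size_t element_size,
                                      size_t work_group_size)
{
    assert(element_size > 0 && work_group_size > 0);
    return elements * element_size * work_group_size;
}
```

For example, `uint intermediate[37]` with a 4-byte `uint` and an assumed work-group size of 64 gives `private_bytes_per_group(37, 4, 64)`, i.e. 9472 bytes per work-group.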

                      alexaverbuch

                      Hi jcpalmer,

                      I can't declare it like that, as this application performs edge detection on images using Sobel operators.

                      First I convert the image into grey-scale (black and white), and then I perform the edge detection.

                      I want each Kernel to perform the matrix convolutions on a "block" of the image, and the "block" size is dictated by the amount of "global threads"/KERNEL_COUNT. I decide KERNEL_COUNT at compile-time, but the "block" size obviously depends on the image size too, so it is calculated at run-time.

                      When performing the matrix convolutions with the Sobel operators there are data dependencies between neighbouring "blocks" of the grey-scale image. To minimize synchronization points and maximize parallelism I wanted to allocate an intermediate buffer (that has a little "extra" area) to each Kernel (privately) and have them all do a bit of redundant work.

                      I.e. sacrifice some memory to gain parallelism.
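The "extra area" idea can be sketched in plain C as an index calculation: each block owns a range of rows and additionally covers a few overlapping rows on either side. This is only a sketch under assumptions: the image is split by rows, and the names `BlockRows`, `block_rows` and the `halo` parameter (1 for a 3x3 Sobel operator) are hypothetical, not taken from the original code.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical description of the rows one block works on. */
typedef struct {
    size_t core_begin, core_end; /* rows this block is responsible for: [begin, end) */
    size_t halo_begin, halo_end; /* rows it reads and redundantly computes: [begin, end) */
} BlockRows;

/* Split `height` rows into `block_count` blocks and widen block number
 * `block` by `halo` rows on each side, clamped to the image. */
static BlockRows block_rows(size_t height, size_t block_count,
                            size_t block, size_t halo)
{
    assert(block_count > 0 && block < block_count);

    /* Round up so that the blocks together cover every row. */
    size_t rows_per_block = (height + block_count - 1) / block_count;

    BlockRows r;
    r.core_begin = block * rows_per_block;
    if (r.core_begin > height)
        r.core_begin = height;
    r.core_end = r.core_begin + rows_per_block;
    if (r.core_end > height)
        r.core_end = height;

    /* Extend by the halo without running off either edge of the image. */
    r.halo_begin = (r.core_begin > halo) ? r.core_begin - halo : 0;
    r.halo_end = (r.core_end + halo < height) ? r.core_end + halo : height;
    return r;
}
```

With 100 rows, 4 blocks and a halo of 1, block 1 owns rows 25 to 49 but covers rows 24 to 50, so the rows on each block boundary are computed twice in exchange for not having to wait on a neighbour.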

                      My back-up plan is simply to NOT use an "intermediate" buffer and perform all operations on the __global output buffer.

                      But this means I need a barrier between the grey-scale calculation and the edge-detection calculation. Still parallel... but not pretty in my opinion.
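For reference, the two-pass structure looks like this as a sequential host-side C sketch. It is not the original kernel: the function names, the 8-bit RGBA input layout and the plain-average grey conversion are all assumptions made for illustration. One point worth keeping in mind is that the OpenCL `barrier()` built-in only synchronizes work-items within a single work-group, so on a device a dependency that spans the whole image is usually expressed as two separate kernel launches, one per pass.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Pass 1: RGBA to grey-scale. A plain average of R, G and B is assumed. */
static void to_grey(const unsigned char *rgba, unsigned *grey,
                    size_t w, size_t h)
{
    for (size_t i = 0; i < w * h; ++i)
        grey[i] = (rgba[4 * i] + rgba[4 * i + 1] + rgba[4 * i + 2]) / 3u;
}

/* Pass 2: 3x3 Sobel on the finished grey image. Border pixels are set to 0.
 * The edge strength is approximated as |gx| + |gy|. */
static void sobel(const unsigned *grey, unsigned *out, size_t w, size_t h)
{
    static const int gx[3][3] = { {-1, 0, 1}, {-2, 0, 2}, {-1, 0, 1} };
    static const int gy[3][3] = { {-1, -2, -1}, {0, 0, 0}, {1, 2, 1} };

    assert(w > 0 && h > 0);
    for (size_t y = 0; y < h; ++y) {
        for (size_t x = 0; x < w; ++x) {
            if (x == 0 || y == 0 || x == w - 1 || y == h - 1) {
                out[y * w + x] = 0;
                continue;
            }
            int sx = 0, sy = 0;
            for (size_t dy = 0; dy < 3; ++dy) {
                for (size_t dx = 0; dx < 3; ++dx) {
                    int v = (int)grey[(y + dy - 1) * w + (x + dx - 1)];
                    sx += gx[dy][dx] * v;
                    sy += gy[dy][dx] * v;
                }
            }
            out[y * w + x] = (unsigned)(abs(sx) + abs(sy));
        }
    }
}
```

Calling `to_grey` and then `sobel` mirrors the back-up plan: the second pass only starts once every grey value has been written to the shared buffer.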

                      I think the extra memory usage is not so large... as we are talking about (several copies of) a 2D image, nothing more.

                      Thanks all the same.

                      Any further suggestions about the way that OpenCL was designed to handle these cases would be greatly appreciated.

                      Alex