4 Replies Latest reply on Sep 22, 2010 7:59 AM by n0thing

    array of buffer objects

    laobrasuca
      how to pass it as an argument to a kernel

      Hi all,

      I have a collection of cl_mem buffers that I want to use in a single kernel execution. So far I have made an array of cl_mem:

      cl_mem ArrayOfBuffers = new cl_mem[NumberOfBuffers];

      (and I create each cl_mem from an array of floats)

      Then I set it as a kernel argument like this:

      clSetKernelArg(kernel, 0, sizeof(cl_mem*), (void *)&ArrayOfBuffers)

      However, I have a problem receiving it in the kernel, where I have:

      __kernel void kernel_job(__global float** Array)

      but compilation fails with the error: kernel arguments can't be declared with types bool/half/pointer-to-pointer. How can I "receive" it in the kernel?

      I really want to avoid running one kernel per buffer. The problem is that I have hundreds of buffers, which would mean hundreds of kernel calls and thus a lot of overhead.

        • array of buffer objects
          n0thing

          You should do something like this -

          cl_mem *ArrayOfBuffers = new cl_mem[NumberOfBuffers];

          And set the arguments like -

          for(int i = 0; i < NumberOfBuffers; i++)

               clSetKernelArg(kernel, i, sizeof(cl_mem), (void *)&ArrayOfBuffers)

          And inside your kernel -

          __kernel void kernel_job(__global float* Array0, __global float* Array1, ...)

          This is the only way, because pointer-to-pointer kernel arguments are unsupported in OpenCL.

          • array of buffer objects
            n0thing

            EDIT to the above post -

            clSetKernelArg(kernel, i, sizeof(cl_mem), (void *)&ArrayOfBuffers[i]);

              • array of buffer objects
                laobrasuca

                 

                Originally posted by: n0thing EDIT to the above post -

                 

                clSetKernelArg(kernel, i, sizeof(cl_mem), (void *)&ArrayOfBuffers[i]);

                 

                Thanks for the fast reply!

                But the problem is that I could hit the maximum number of arguments. By the way, what does CL_DEVICE_MAX_PARAMETER_SIZE mean? It says "Max size in bytes of the arguments that can be passed to a kernel", but I don't really understand how to compute the maximum number of arguments from this.

                  • array of buffer objects
                    n0thing

                    It's the limit on the total size in bytes of all the arguments to a kernel. If all of your arguments are float*, then each one takes 4 bytes.

                    Currently 1024 bytes is the maximum limit on AMD GPUs, so that gives you a maximum of 1024/4 = 256 arguments. So in your case you can use up to 256 buffers in 1 kernel.
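
The division above can be written out as a tiny helper. This is only a sketch: `max_kernel_args` is a hypothetical name, not an OpenCL function, and the 4-byte argument size is the assumption made in the post. On a real device the limit would be read with clGetDeviceInfo and CL_DEVICE_MAX_PARAMETER_SIZE, and the result should be treated as an upper bound rather than a guarantee.

```cpp
#include <cstddef>

// Hypothetical helper (not part of OpenCL): upper bound on how many kernel
// arguments of `arg_size` bytes fit into `max_parameter_size` bytes.
// `max_parameter_size` is assumed to be the value reported for
// CL_DEVICE_MAX_PARAMETER_SIZE; `arg_size` is assumed to be the device-side
// size of one argument (4 bytes for a pointer, per the post above).
std::size_t max_kernel_args(std::size_t max_parameter_size, std::size_t arg_size) {
    if (arg_size == 0) {
        return 0;  // avoid dividing by zero on a nonsensical argument size
    }
    return max_parameter_size / arg_size;  // integer division rounds down
}
```

With the numbers from the post, `max_kernel_args(1024, 4)` gives 256; if the device used 8-byte pointers, the same limit would allow only 128.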

                    I think it's better to pack 4 buffers into 1 buffer by using float4 buffers; that will also give you better performance for reads/writes.
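
One possible reading of the float4 suggestion, sketched on the host side: interleave four equally sized float arrays so that element i of each array fills one lane of the i-th float4. The function name `pack_as_float4` and the interleaved layout are assumptions for illustration; the post does not say exactly how the data should be arranged. The packed array would then back a single buffer declared in the kernel as `__global float4*`, with the four original arrays read through `.x`, `.y`, `.z` and `.w`.

```cpp
#include <cstddef>
#include <stdexcept>
#include <vector>

// Hypothetical helper: interleave four float arrays of equal length into one
// array laid out as consecutive float4 values {a[i], b[i], c[i], d[i]}.
std::vector<float> pack_as_float4(const std::vector<float>& a,
                                  const std::vector<float>& b,
                                  const std::vector<float>& c,
                                  const std::vector<float>& d) {
    const std::size_t n = a.size();
    if (b.size() != n || c.size() != n || d.size() != n) {
        throw std::invalid_argument("all four arrays must have the same length");
    }
    std::vector<float> packed(4 * n);
    for (std::size_t i = 0; i < n; ++i) {
        packed[4 * i + 0] = a[i];  // lane .x
        packed[4 * i + 1] = b[i];  // lane .y
        packed[4 * i + 2] = c[i];  // lane .z
        packed[4 * i + 3] = d[i];  // lane .w
    }
    return packed;
}
```

Packing this way cuts the number of buffer arguments by a factor of four, so a few hundred arrays would need roughly a quarter as many kernel arguments. It only works directly when the four arrays in each group have the same length.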

                    Using more arguments will also increase your kernel invocation time, but how much that matters depends on how long your kernel itself takes to run.