8 Replies Latest reply on Mar 11, 2013 12:11 AM by himanshu.gautam

    opencl porting

    sajis997

      Hi forum,

       

      I am trying to port an existing CUDA application to OpenCL, and I cannot find the OpenCL equivalent of the following CUDA function:

       

       

      cudaError_t cudaMallocPitch(void **devPtr, size_t *pitch, size_t width, size_t height)


       

       

      Allocates at least width (in bytes) * height bytes of linear memory on the device and returns in *devPtr a pointer to the allocated memory. The function may pad the allocation to ensure that corresponding pointers in any given row will continue to meet the alignment requirements for coalescing as the address is updated from row to row. The pitch returned in *pitch by cudaMallocPitch() is the width in bytes of the allocation. The intended usage of pitch is as a separate parameter of the allocation, used to compute addresses within the 2D array. Given the row and column of an array element of type T, the address is computed as:

          T* pElement = (T*)((char*)BaseAddress + Row * pitch) + Column; 

      For allocations of 2D arrays, it is recommended that programmers consider performing pitch allocations using cudaMallocPitch(). Due to pitch alignment restrictions in the hardware, this is especially true if the application will be performing 2D memory copies between different regions of device memory (whether linear memory or CUDA arrays).

       

      Parameters:

      devPtr - Pointer to allocated pitched device memory

      pitch - Pitch for allocation

      width - Requested pitched allocation width (in bytes)

      height - Requested pitched allocation height
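
      As a minimal sketch of the address formula quoted above, the helper below applies it to a plain host allocation with T fixed to float. The name pitched_elem is hypothetical and nothing here calls CUDA or OpenCL; it only illustrates that pitch is a row stride in bytes, which may be larger than the row width.

```c
#include <assert.h>
#include <stddef.h>

/* Address of element (row, col) in a pitched 2D allocation of floats.
 * base  : start of the allocation
 * pitch : row stride in BYTES (may exceed width * sizeof(float))
 * Mirrors: T* pElement = (T*)((char*)BaseAddress + Row * pitch) + Column; */
static float *pitched_elem(void *base, size_t pitch, size_t row, size_t col)
{
    assert(base != NULL);
    assert(pitch % sizeof(float) == 0); /* keep every row float-aligned */
    return (float *)((char *)base + row * pitch) + col;
}
```

      For example, with a pitch of four floats, element (row 1, column 2) lands six floats past the base address, even if only three floats per row hold real data.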

       

      It would be very nice to get some hints on how to derive the OpenCL version.

       

       

      Thanks

      Sajjad

        • Re: opencl porting
          realhet

          Hi,

           

          You can use clCreateBuffer() if you're going to allocate 1D memory.

          For 2D memory there's a more complicated one: clCreateImage2D() (or even 3D). With this you can specify element format too (for example xyzw/bytes).

           

          Don't forget to use the CL_MEM_READ_WRITE flag!

            • Re: opencl porting
              himanshu.gautam

              clCreateImage() is the latest API. It can handle 1D, 2D and 3D images.

               

              cl...2D() is deprecated as of OpenCL 1.2.

              But NVIDIA seems to have no plans to move to OpenCL 1.2.

              So you may actually be safe using the clCreateImage2D() API if you are targeting the NVIDIA platform.

              • Re: opencl porting
                sajis997

                Hi

                 

                I am not sure if I explained the issue properly.

                I think I need something that basically represents a 2D array in a linear manner. So technically, I do need to use clCreateBuffer(....). What confuses me is how to calculate the pitch value mentioned in cudaMallocPitch(...).

                Any more ideas, folks?

                 

                 

                Regards

                Sajjad

                  • Re: opencl porting
                    nou

                    AFAIK there is no such API.

                    • Re: opencl porting
                      himanshu.gautam

                      Hi,

                      I agree with nou. It also looks like a nice feature to offer developers.

                      But IMHO you can add some padding manually when calculating the buffer size. Generally, power-of-two sizes are preferable, so that wavefronts do not diverge while reading/writing. But a power of two may not always be practical, so I suggest you try the next multiple of 16 or 64 as the actual size. It would be nice if you could share your results.
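
                      The manual padding suggested above could look like this on the host. This is only a sketch under assumptions: padded_pitch and padded_size are hypothetical helper names, and the alignment (16 or 64 bytes) is a tuning choice to experiment with, not a value taken from the OpenCL spec.

```c
#include <assert.h>
#include <stddef.h>

/* Round width_bytes up to the next multiple of align (align must be > 0). */
static size_t padded_pitch(size_t width_bytes, size_t align)
{
    assert(align > 0);
    return (width_bytes + align - 1) / align * align;
}

/* Total number of bytes needed for `height` rows of padded width. */
static size_t padded_size(size_t width_bytes, size_t height, size_t align)
{
    return padded_pitch(width_bytes, align) * height;
}
```

                      The value returned by padded_size() would then be passed as the size argument of clCreateBuffer(), and the pitch would be handed to the kernel as an extra argument so it can compute row offsets.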

                      • Re: opencl porting
                        realhet

                        For this purpose: "represents a 2D array in the linear manner"

                        I'd use simple linear memory allocated with clCreateBuffer().

                        And inside the kernel it can be accessed as simply as: buffer[mad24(y,width,x)]

                        You should use a pitch of your own choice if there would be too many bank conflicts based on your program's memory access pattern.

                        If you use Image it will utilize the hardware texture samplers (with a possibility for mip mapping, aniso filtering, converting to 0.0..1.0 range), but for this simple 2D linear thing I think it is too much.
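
                        To illustrate the "pitch of your own choice" idea, here is a host-side sketch in plain C. The names index2d and pack_rows are hypothetical; the first mirrors the kernel expression above with width replaced by a pitch measured in elements, and the second copies a tightly packed array into the padded layout row by row. OpenCL 1.1 also provides clEnqueueWriteBufferRect(), which accepts separate host and buffer row pitches for this kind of transfer; it is not used here so that the example stays independent of any OpenCL installation.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Linear index of (x, y) when each row occupies pitch_elems elements.
 * In a kernel the same expression could be written mad24(y, pitch_elems, x). */
static size_t index2d(size_t x, size_t y, size_t pitch_elems)
{
    assert(x < pitch_elems);
    return y * pitch_elems + x;
}

/* Copy a tightly packed width x height float array into a padded layout
 * whose rows are pitch_elems floats apart. Padding elements are untouched. */
static void pack_rows(float *dst, size_t pitch_elems,
                      const float *src, size_t width, size_t height)
{
    assert(pitch_elems >= width);
    for (size_t y = 0; y < height; ++y)
        memcpy(dst + y * pitch_elems, src + y * width, width * sizeof(float));
}
```

                        Keep in mind that mad24() only uses the low 24 bits of its integer operands, so this shortcut assumes the indices stay within that range.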

                          • Re: opencl porting
                            sajis997

                            Hi forum,

                             

                            Thanks for all the hints over this issue.

                             

                             

                            I am creating a linear array on the device and another texture buffer on device for faster texture fetch. The linear array is created as follows:

                             

                            [code]
                            typedef cl_float2 cData;
                            cData* hvfield = NULL;

                            ......................................
                            ......................................

                                // allocate and zero the host-side source data
                                hvfield = (cData*)malloc(sizeof(cData) * DS);
                                memset(hvfield, 0, sizeof(cData) * DS);

                                // create the device buffer, initialized with a copy of the host data
                                dvfield = clCreateBuffer(cxGPUContext, CL_MEM_READ_WRITE | CL_MEM_COPY_HOST_PTR, sizeof(cData) * DS, hvfield, &ciErrNum);
                                oclCheckErrorEX(ciErrNum, CL_SUCCESS, pCleanup);
                            [/code]

                             

                            Now, while creating the texture object, I am confused about the cl_image_format:

                             

                            [code]
                               cl_image_format image_format;
                               image_format.image_channel_order = CL_RGBA;
                               image_format.image_channel_data_type = CL_UNSIGNED_INT8;
                            [/code]

                            I am not sure about the second assignment in the snippet immediately above. The image texture basically points to the linear buffer I created first, and the data type there is cData.

                            What should the data type be here? Since I typedef it as typedef cl_float2 cData, what should the assignment be for

                             

                               image_format.image_channel_data_type = ????

                             

                            Regards

                            Sajjad

                              • Re: opencl porting
                                himanshu.gautam

                                As I understand it, you want to use a buffer of cl_float2 type as a cl_image object. Here is what I found in the spec for that.

                                The image channel data type can be CL_FLOAT and the image channel order can be CL_RG (from 5.6 & 5.7 of the OpenCL 1.2 spec). You could also use CL_R as the channel order, but with double the indexes in the cl_image object.

                                Hope this helps.
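
                                To make the "double the indexes" remark concrete, here is a small sketch in plain C of one reading of it: with CL_RG each texel holds a whole two-float element, while with CL_R the image would need to be twice as wide and each element would be split across two neighbouring texels. The helper names are hypothetical, and whether a given channel order and data type combination is actually available on a device is something to check with clGetSupportedImageFormats().

```c
#include <assert.h>
#include <stddef.h>

/* Image width in texels needed to hold elems_per_row two-float elements,
 * given how many float channels one texel carries (2 for CL_RG, 1 for CL_R). */
static size_t image_width_texels(size_t elems_per_row, int channels_per_texel)
{
    assert(channels_per_texel == 1 || channels_per_texel == 2);
    return elems_per_row * 2 / (size_t)channels_per_texel;
}

/* With CL_R, component 0 (x) or 1 (y) of element i sits at this texel column. */
static size_t r_texel_x_for_component(size_t i, int component)
{
    assert(component == 0 || component == 1);
    return 2 * i + (size_t)component;
}
```

                                With CL_RG the kernel would read element i from texel column i directly, which is why that layout is usually the more convenient of the two.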
