sajis997
Adept I

opencl porting

Hi forum,

I am trying to port an existing CUDA application to OpenCL, and I cannot find the OpenCL counterpart of the following CUDA function:

    cudaError_t cudaMallocPitch(void **devPtr,
                                size_t *pitch,
                                size_t width,
                                size_t height)


Allocates at least width (in bytes) * height bytes of linear memory on the device and returns in *devPtr a pointer to the allocated memory. The function may pad the allocation to ensure that corresponding pointers in any given row will continue to meet the alignment requirements for coalescing as the address is updated from row to row. The pitch returned in *pitch by cudaMallocPitch() is the width in bytes of the allocation. The intended usage of pitch is as a separate parameter of the allocation, used to compute addresses within the 2D array. Given the row and column of an array element of type T, the address is computed as:

    T* pElement = (T*)((char*)BaseAddress + Row * pitch) + Column; 

For allocations of 2D arrays, it is recommended that programmers consider performing pitch allocations using cudaMallocPitch(). Due to pitch alignment restrictions in the hardware, this is especially true if the application will be performing 2D memory copies between different regions of device memory (whether linear memory or CUDA arrays).

Parameters:

devPtr - Pointer to allocated pitched device memory

pitch - Pitch for allocation

width - Requested pitched allocation width (in bytes)

height - Requested pitched allocation height

It would be very nice to get some hints on how to derive the OpenCL version.

Thanks

Sajjad

0 Likes
8 Replies
realhet
Miniboss

Hi,

You can use clCreateBuffer() if you're going to allocate 1D memory.

For 2D memory there's a more complicated one: clCreateImage2D() (or even 3D). With this you can specify element format too (for example xyzw/bytes).

Don't forget to use the CL_MEM_READ_WRITE flag!

0 Likes

clCreateImage() is the latest API. It can handle 1D, 2D and 3D images.

cl...2D() is deprecated as of OpenCL 1.2.

But NVIDIA seems to have no plans to move to OpenCL 1.2.

So you may actually be safe using the clCreateImage2D() API if you are targeting the NVIDIA platform.

0 Likes

Hi

I am not sure I explained the issue properly.

I think I need something that basically represents a 2D array in a linear manner. So technically, I do need to use clCreateBuffer(....). What I am confused about is how to calculate the pitch value that cudaMallocPitch(...) returns.

Any more ideas, folks?

Regards

Sajjad

0 Likes

AFAIK there is no such API.

0 Likes

Hi,

I also agree with nou. It also looks like a nice feature to offer developers.

But IMHO you can add some padding manually when calculating the buffer size. Generally, power-of-two sizes are preferable, so that wavefronts do not diverge while reading/writing. But a power of two may not always be practical, so I suggest you try rounding the row size up to the next multiple of 16 or 64. It would be nice if you could share your results.
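To make the manual padding suggestion concrete, here is a minimal sketch in plain C. The helper names padded_pitch and pitched_size are hypothetical, and the alignment value (16 or 64 bytes) is only the tuning guess from the post above, not a value that OpenCL reports the way cudaMallocPitch() does.

```c
#include <assert.h>
#include <stddef.h>

/* Round a row width (in bytes) up to the next multiple of `alignment`.
   `alignment` is an assumption to tune per device, e.g. 16 or 64. */
size_t padded_pitch(size_t width_bytes, size_t alignment)
{
    assert(alignment > 0);
    return ((width_bytes + alignment - 1) / alignment) * alignment;
}

/* Total number of bytes to pass to clCreateBuffer() for `height` padded rows. */
size_t pitched_size(size_t width_bytes, size_t height, size_t alignment)
{
    return padded_pitch(width_bytes, alignment) * height;
}
```

The padded pitch would then be passed to the kernel as an extra argument, playing the role of the pitch value in the CUDA address formula from the first post.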

0 Likes

For this purpose: "represents a 2D array in the linear manner"

I'd use simple linear memory allocated with clCreateBuffer().

Inside the kernel it can be accessed as simply as: buffer[mad24(y,width,x)]

You should use a pitch of your own choice if there would be too many bank conflicts based on your program's memory access pattern.
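As a rough host-side illustration of the indexing above with a pitch of your own choice, the sketch below copies a tightly packed 2D float array into a buffer with padded rows and computes the same index the kernel would use. The names pitched_index and pack_rows are hypothetical, and the pitch is counted in elements, not bytes.

```c
#include <stddef.h>
#include <string.h>

/* Linear index of element (x, y) when each row occupies pitch_elems elements.
   Mirrors buffer[mad24(y, pitch, x)] on the kernel side. */
size_t pitched_index(size_t x, size_t y, size_t pitch_elems)
{
    return y * pitch_elems + x;
}

/* Copy a tightly packed width x height array into a destination whose rows
   are dst_pitch_elems apart (dst_pitch_elems >= width). Padding is untouched. */
void pack_rows(float *dst, size_t dst_pitch_elems,
               const float *src, size_t width, size_t height)
{
    for (size_t y = 0; y < height; ++y)
        memcpy(dst + y * dst_pitch_elems, src + y * width, width * sizeof(float));
}
```

Note that mad24 only gives correct results when its multiplication operands fit in 24 bits. On OpenCL 1.1 and later, clEnqueueWriteBufferRect() accepts separate host and buffer row pitches and may be able to replace the manual row-by-row copy.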

If you use Image it will utilize the hardware texture samplers (with a possibility for mip mapping, aniso filtering, converting to 0.0..1.0 range), but for this simple 2D linear thing I think it is too much.

0 Likes

Hi forum,

Thanks for all the hints over this issue.

I am creating a linear array on the device and another texture buffer on device for faster texture fetch. The linear array is created as follows:

    typedef cl_float2 cData;

    cData* hvfield = NULL;

    ......................................

    // allocate the OpenCL object for the source data
    hvfield = (cData*)malloc(sizeof(cData) * DS);
    memset(hvfield, 0, sizeof(cData) * DS);

    dvfield = clCreateBuffer(cxGPUContext, CL_MEM_READ_WRITE | CL_MEM_COPY_HOST_PTR,
                             sizeof(cData) * DS, hvfield, &ciErrNum);
    oclCheckErrorEX(ciErrNum, CL_SUCCESS, pCleanup);

Now, while creating the texture object, I am confused about the cl_image_format:

    cl_image_format image_format;
    image_format.image_channel_order = CL_RGBA;
    image_format.image_channel_data_type = CL_UNSIGNED_INT8;

I am not sure about the second assignment in the snippet above. The image texture basically points to the linear buffer I created first, and the data type there is cData.

What should the data type be here? Since I typedef it as typedef cl_float2 cData, what should the assignment be for

    image_format.image_channel_data_type = ????

Regards

Sajjad

0 Likes

As I understand it, you want to use a buffer of cl_float2 type as a cl_image object. Here is what I found in the spec for that.

The image channel data type can be CL_FLOAT and the image channel order can be CL_RG (from 5.6 & 5.7 of the OpenCL 1.2 spec), so that each texel holds one cl_float2. You could also use CL_R as the channel order, but then each texel holds a single float and you need twice as many indexes in the cl_image object.
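To illustrate the difference between the two channel orders, here is a small sketch in plain C with no OpenCL headers. The enum and both helpers are hypothetical stand-ins; in real host code the assignment would be image_channel_order = CL_RG (or CL_R) with image_channel_data_type = CL_FLOAT, as described above.

```c
#include <stddef.h>

/* Hypothetical stand-ins for the two channel orders discussed:
   the value is the number of float channels per texel. */
enum channel_order { ORDER_R = 1, ORDER_RG = 2 };

/* Image width in texels needed to hold n_float2 cl_float2 elements per row,
   assuming CL_FLOAT channels. */
size_t texels_per_row(size_t n_float2, enum channel_order order)
{
    return (n_float2 * 2) / (size_t)order;
}

/* x coordinate of the texel that holds component c (0 = .x, 1 = .y)
   of the i-th cl_float2 element in a row. */
size_t texel_x(size_t i, int c, enum channel_order order)
{
    return order == ORDER_RG ? i : 2 * i + (size_t)c;
}
```

With CL_RG one image read returns both components of an element, while with CL_R the kernel would need two reads at adjacent x coordinates.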

Hope this helps.