boxerab
Challenger

What is the fastest way to transfer data between host and device on the Cape Verde architecture?

I have an HD 7700 GPU on Windows 7.

Here is how I am currently managing my memory:

1) I allocate a host buffer using the new[] operator, aligned to the page size (4096 bytes)

2) I allocate a device buffer via clCreateBuffer, using the CL_MEM_READ_ONLY flag

3) I transfer from host to device using clEnqueueWriteBuffer(...), with blocking set to CL_FALSE.
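One caveat on step 1: plain new[] does not by itself guarantee page alignment, so the alignment has to be requested explicitly. A minimal sketch, assuming a C++17 compiler and a 4096-byte page size as stated above; the helper names (alloc_page_aligned, free_page_aligned, is_page_aligned) are hypothetical and not from the original post:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <new>

// Assumed page size, matching the 4096 bytes mentioned in the post.
constexpr std::size_t kPageSize = 4096;

// Allocate `bytes` of host memory whose start address is page-aligned,
// using the C++17 aligned form of operator new[].
unsigned char* alloc_page_aligned(std::size_t bytes) {
    assert(bytes > 0);
    void* raw = ::operator new[](bytes, std::align_val_t(kPageSize));
    return static_cast<unsigned char*>(raw);
}

// Memory from alloc_page_aligned must be released with the matching
// aligned operator delete[].
void free_page_aligned(unsigned char* p) {
    ::operator delete[](p, std::align_val_t(kPageSize));
}

// Check whether a pointer sits on a page boundary.
bool is_page_aligned(const void* p) {
    return reinterpret_cast<std::uintptr_t>(p) % kPageSize == 0;
}
```

The resulting pointer can then be passed as the source pointer to clEnqueueWriteBuffer in step 3.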

I have found this method to be pretty fast. Can I do better?

Thanks.

2 Replies
dipak
Big Boss

Transferring data between host memory and device memory (for example, with clEnqueueReadBuffer or clEnqueueWriteBuffer) usually requires the host memory to be pinned. Pinning takes time, so when data is transferred frequently or repeatedly, you can avoid the pinning cost by using host memory that is already pinned. Pinned host memory can be created by passing the CL_MEM_ALLOC_HOST_PTR or CL_MEM_USE_HOST_PTR flag at buffer creation. Use that pinned memory as the source for each transfer to device memory. For more details, I suggest going through sections 1.3 and 1.4 of Chapter 1 [OpenCL Performance and Optimization] in the AMD OpenCL Programming Optimization Guide.

Regards,


Thanks, Dipak. I switched to CL_MEM_USE_HOST_PTR and saw a slight improvement in performance.
