
How to reduce map/unmap overhead on APUs?

Question asked by Linuxhippy on Aug 5, 2015
Latest reply on Aug 9, 2015 by Linuxhippy



I would like to make use of zero-copy in an APU environment for legacy code.

I intend to use the following code for data transfer:


// Create buffers, somewhere else in the application
inBuf = clCreateBuffer(context, CL_MEM_READ_ONLY, bufSize, NULL, &err); // input
outBuf = clCreateBuffer(context, CL_MEM_WRITE_ONLY | CL_MEM_ALLOC_HOST_PTR, bufSize, NULL, &err); // output

// Get a direct pointer to the input buffer (blocking map for writing)
inPtr = (unsigned char *) clEnqueueMapBuffer(commands, inBuf, CL_TRUE, CL_MAP_WRITE, 0, bufSize, 0, NULL, NULL, &err);

// do something with the data pointed to by inPtr

clEnqueueUnmapMemObject(commands, inBuf, inPtr, 0, NULL, NULL); // unmap inPtr

// Access the result (blocking map for reading)
outPtr = (unsigned char *) clEnqueueMapBuffer(commands, outBuf, CL_TRUE, CL_MAP_READ, 0, bufSize, 0, NULL, NULL, &err);

clEnqueueUnmapMemObject(commands, outBuf, outPtr, 0, NULL, NULL); // unmap outPtr



Is this the correct way to perform data transfer?


Also, for me low invocation/map overhead is more important than peak throughput on the GPU: the OpenCL kernels will be executed as part of a legacy application where double-buffered data transfers are not possible, so all calls to map/unmap should be fast. Do the parameters chosen for buffer creation in the code above make sense for this scenario?


I've created a trace using CodeXL, and map/unmap with code very similar to the snippet above (only with 3 in/out buffers) has quite high overhead compared to the actual kernel invocation:


[Screenshot: Bildschirmfoto vom 2015-08-05 17_28_43.png]


As you can see, the kernel executes in ~1.5 ms (the first buffer map is slow because it has to wait for kernel execution).

However, mapping the input buffers (CL_MAP_WRITE) is horribly slow, taking 0.18-0.25 ms each.

Isn't there anything I can do to reduce this overhead?


The APU I used is an AMD A10-7800 (Spectre) running CentOS 7 with the latest Catalyst drivers.


Thank you in advance, Clemens