I would like to know whether it is possible to get profiling information about data transfer times when using buffers allocated with the CL_MEM_USE_HOST_PTR flag. Currently, I can't find anything in the profiler output (sprofile 2.4 from AMD APP 2.6) about the upload or download times.
My code is quite simple: it creates the buffers (with the USE_HOST_PTR flag), unmaps them (I don't think this is even necessary), and then maps the destination buffer to read the data from the GPU after kernel execution.
In the API trace from sprofile I see that the map command does specify a memory size, although its runtime is exceptionally short; for the unmappings, I don't get to see the memory transfer size at all. Is this information simply unavailable, or is there some other way to extract it?
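
For reference, one way the timings could be derived by hand, independent of the profiler, is from OpenCL event timestamps. This is only a sketch under assumptions: the command queue would have to be created with CL_QUEUE_PROFILING_ENABLE, and the two timestamps would come from clGetEventProfilingInfo (CL_PROFILING_COMMAND_START and CL_PROFILING_COMMAND_END) on the events returned by clEnqueueMapBuffer and clEnqueueUnmapMemObject. The helper names below are hypothetical, and the OpenCL calls themselves are left out so that the arithmetic stands alone.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: elapsed time in milliseconds between two
 * OpenCL profiling timestamps. The timestamps are device time
 * counters expressed in nanoseconds. */
static double elapsed_ms(uint64_t start_ns, uint64_t end_ns)
{
    assert(end_ns >= start_ns);
    return (double)(end_ns - start_ns) / 1.0e6;
}

/* Hypothetical helper: effective bandwidth in GB/s (10^9 bytes per
 * second) for a command that moved `bytes` bytes. Returns 0.0 when
 * the interval is zero, since no rate can be computed then. */
static double bandwidth_gb_per_s(uint64_t bytes, uint64_t start_ns, uint64_t end_ns)
{
    assert(end_ns >= start_ns);
    uint64_t dt_ns = end_ns - start_ns;
    if (dt_ns == 0) {
        return 0.0;
    }
    /* bytes per nanosecond is numerically equal to GB/s */
    return (double)bytes / (double)dt_ns;
}
```

Note that a very short interval on a map or unmap event would not necessarily mean the measurement is wrong: with CL_MEM_USE_HOST_PTR the runtime may be able to avoid a copy at that point, in which case the event would only cover bookkeeping rather than an actual transfer.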