Bug in clEnqueueMapBuffer in SDK 2.9-1

Question asked by nibal on Sep 17, 2015
Latest reply on Sep 28, 2015 by dipak

As discussed in another thread, "Optimization Guide Memory Allocation", the Optimization Guide says that when the fglrx display driver supports VM, transferring data from the application to the GPU device should be zero-copy, provided the buffer is created with the appropriate flags in clCreateBuffer and the transfer goes through clEnqueueMapBuffer. I assume this works in other SDK versions, since it is written in the guide.

In my case:


clinfo | grep Driver

Driver version: 1445.5 (VM)


I'm using CL_MEM_ALLOC_HOST_PTR in clCreateBuffer and clEnqueueMapBuffer to transfer the data. For exactly the same amount of data, CodeXL reports:


A) Read/Write Buffers

WriteBuffer: 173 ms for 6241 calls, 0.02773 ms each

ReadBuffer: 122 ms for 390 calls, 0.312 ms each


B) Map/Unmap Buffers

MapBuffer: 193 ms for 6630 calls, 0.02907 ms each

UnmapBuffer: 120 ms for 6630 calls, 0.01811 ms each


Note that the total for the Read/WriteBuffer calls (295 ms) is actually slightly less than the total for the Map/UnmapBuffer calls (313 ms), a far cry from the zero-copy behavior it should be.
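The comparison can be reproduced from the CodeXL figures above. The sketch below only restates that arithmetic; the struct and function names are made up for illustration, and the totals are the rounded values quoted in this post.

```c
#include <assert.h>
#include <stdio.h>

/* One row of the CodeXL summary quoted above (names are illustrative). */
struct timing {
    const char *name;
    double total_ms;
    long calls;
};

/* Average time per call in milliseconds. */
static double per_call_ms(struct timing t)
{
    assert(t.calls > 0);
    return t.total_ms / (double)t.calls;
}

/* Sum of the total times over n rows. */
static double sum_ms(const struct timing *rows, int n)
{
    double sum = 0.0;
    for (int i = 0; i < n; ++i)
        sum += rows[i].total_ms;
    return sum;
}

static const struct timing read_write[] = {
    {"WriteBuffer", 173.0, 6241},
    {"ReadBuffer",  122.0, 390},
};

static const struct timing map_unmap[] = {
    {"MapBuffer",   193.0, 6630},
    {"UnmapBuffer", 120.0, 6630},
};

/* Prints both totals and how much slower the map/unmap path is. */
static void print_summary(void)
{
    double rw = sum_ms(read_write, 2);
    double mu = sum_ms(map_unmap, 2);
    printf("Read/Write total: %.0f ms\n", rw);
    printf("Map/Unmap total:  %.0f ms\n", mu);
    printf("Map/Unmap is %.1f%% slower\n", 100.0 * (mu - rw) / rw);
}
```

Calling `print_summary()` from `main` reports the two totals side by side; `per_call_ms` gives the per-call averages from the same rows.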

Please fix.