I have been reading about mapping buffers in OpenCL to achieve zero-copy performance. However, this functionality seems to be discussed only with regard to newer APUs, not discrete GPUs, where data has to be sent over the PCIe bus.
My question is:
For discrete GPUs like the 5870, is there any implementation (and hence speed) difference between writing a buffer to the device via a copy (e.g. clEnqueueWriteBuffer) and mapping a buffer (e.g. clEnqueueMapBuffer) and then writing to it?
I am ultimately trying to determine whether switching to buffer mapping on a discrete GPU will give me any speed increase and, if so, why it would be faster given that the data goes over the PCIe bus in either scenario.
In addition, does AMD have a 'best practices' document for minimizing this bottleneck? If so, a link would be appreciated.
Here is a reference to an APU article that discusses buffer mapping.