I just got a Linux box with an AMD E-350 APU (Accelerated Processing Unit). Since the GPU on this APU sits on the same die as the CPU and so doesn't have the PCIe bottleneck, I'm trying to understand the best way to transfer data between the CPU and the GPU using OpenCL.
I have experimented a little with the PCIeBandwidth example in the APP SDK 2.4. I'm getting ~2.5 GB/s, which looks good (especially compared with another system that has a Radeon on the PCIe bus). Is there a better way to run this kind of test? Is there any sample code available?
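To put the ~2.5 GB/s in context, here is a back-of-the-envelope yardstick. It assumes single-channel DDR3-1066 memory with a 64-bit (8-byte) bus, which is my assumption and not something I have verified on this board. It also assumes that a host-to-device copy on the APU both reads and writes the same DRAM, so each copied byte costs two bytes of memory traffic:

$$
B_{\text{peak}} \approx 1066\ \text{MT/s} \times 8\ \text{B} \approx 8.5\ \text{GB/s},
\qquad
\frac{2 \times 2.5\ \text{GB/s}}{8.5\ \text{GB/s}} \approx 0.59
$$

If those assumptions hold, the measured copy rate already uses roughly 60% of the theoretical memory bandwidth, so I may be closer to a memory limit than to an interconnect limit.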
Also, I understand that zero-copy is not yet supported on Linux. Will that affect the device-host bandwidth on the APU? If so, is there an estimate of how much?