Hi,
The fastest way on AMD hardware is to use a pinned memory buffer, which avoids the additional copy into the application's address space. You should also make sure that the format you read back in matches the format of the buffer you are reading from, and that the offset into the destination buffer is naturally aligned for the chosen internal format. That way the driver does not have to do any additional alignment or format conversion work.
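To make the alignment point concrete, here is a minimal sketch of an offset check. It assumes "naturally aligned" means the byte offset is a multiple of the size of one element of the chosen format. The helper names `is_naturally_aligned` and `align_up` are hypothetical and not part of any driver or API, and the element size is whatever your format actually uses.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper: returns true when `offset` (in bytes) is a multiple
 * of `element_size`, the size in bytes of one element of the chosen format
 * (assumed here to be what "naturally aligned" means). */
bool is_naturally_aligned(size_t offset, size_t element_size)
{
    return element_size != 0 && (offset % element_size) == 0;
}

/* Hypothetical helper: rounds `offset` up to the next multiple of
 * `element_size`. Returns `offset` unchanged if it is already aligned
 * or if `element_size` is zero. */
size_t align_up(size_t offset, size_t element_size)
{
    if (element_size == 0) {
        return offset;
    }
    size_t remainder = offset % element_size;
    return remainder == 0 ? offset : offset + (element_size - remainder);
}
```

For example, with a 4-byte element an offset of 6 is not naturally aligned, and `align_up(6, 4)` gives 8, which is.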
Thanks,
Graham