I am currently evaluating Kaveri+OpenCL for image processing tasks, where we have the following requirements:
- For image data: the highest possible memory bandwidth plus zero-copy access (CPU/GPU coherency and SVM support are not required).
- For data structures: SVM, so that we can pass data structures containing pointers to the GPU. The highest throughput is not required here.
I tried to find documentation, and at least for earlier APUs there were different memory buses (Onion and Garlic) with very different performance characteristics.
Is there documentation available for Kaveri covering the same topics?
E.g. I would be interested in how the different memory-mapping flags influence read/write throughput on both sides (CPU and GPU), and which options yield zero-copy.
Thank you in advance, Clemens