Hi,
I have an AMD A10-7850K and wanted to test and play with
GitHub - HSAFoundation/CLOC: CL Offline Compiler: Compile OpenCL kernels to HSAIL
But I have also thought about getting a dedicated GPU, like an RX480
or RX460, to improve the potential performance. Is this
possible? I mean, does the IOMMUv2 enable the dedicated
GPU to use the system memory as real shared memory, like
the integrated GPU in the APU?
PS: I was not able to find any official documentation on the IOMMUv2,
but maybe I just missed it. Information about how cache
coherence between the CPU and GPU cores is implemented in the APU
is also somewhat limited. I only found third-party websites with
information on that. Maybe somebody can point me to a good
source.
Thanks!
BR
Simon
OK,
I got a dedicated GPU, but so far I have had no success using it with HSA.
I guess it's not supposed to work because cache coherence is only guaranteed
for the APU.
BR
Simon
The role of the IOMMU IP block is to act as an MMU for PCI devices. In other words, it sits between the PCIe device (in your case the GPU) and the RC (Root Complex [1]) and translates addresses according to the tables it has been given. For example, the GPU instructs its DMA engines to write to memory region 0x1000, the request is sent over the PCIe lanes, and the IOMMU block translates 0x1000 to another address such as 0x2E000000. The GPU still thinks it is writing to 0x1000 through the DMA. By the way, the GPU only writes to RAM (CPU/system memory) through the DMA engines, and to VRAM (its own memory) through its own memory controller.
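To make the translation step above a bit more concrete, here is a rough sketch in Python. It is only a toy model: the class name ToyIommu, the single-level table, and the 4 KiB page size are my own assumptions for illustration, and a real IOMMU walks multi-level page tables in hardware. It just shows how a device address like 0x1000 can end up at 0x2E000000 without the device knowing.

```python
PAGE_SIZE = 4096  # assumption: 4 KiB pages


class ToyIommu:
    """Toy single-level I/O page table (hypothetical, for illustration only)."""

    def __init__(self):
        # device page number -> system physical frame number
        self.page_table = {}

    def map_page(self, device_addr, phys_addr):
        """Install a mapping; both addresses must be page-aligned."""
        if device_addr % PAGE_SIZE or phys_addr % PAGE_SIZE:
            raise ValueError("addresses must be page-aligned")
        self.page_table[device_addr // PAGE_SIZE] = phys_addr // PAGE_SIZE

    def translate(self, device_addr):
        """Translate the address a device put on the bus to a physical one."""
        page, offset = divmod(device_addr, PAGE_SIZE)
        if page not in self.page_table:
            # Stand-in for an I/O page fault on an unmapped address.
            raise LookupError(f"I/O page fault at {device_addr:#x}")
        return self.page_table[page] * PAGE_SIZE + offset


iommu = ToyIommu()
iommu.map_page(0x1000, 0x2E000000)

# The device issues a DMA write to 0x1000; memory sees 0x2E000000.
print(hex(iommu.translate(0x1000)))  # 0x2e000000
# The offset within the page is carried over unchanged.
print(hex(iommu.translate(0x1010)))  # 0x2e000010
```

The device only ever sees its own addresses. Whoever programs the table (normally the OS driver) decides which physical pages they land on, and anything unmapped faults instead of reaching memory.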
Coherency would be a big problem given that the memory systems are different. Even if that were solved, it would provide no real performance benefit that I can think of, since the PCIe transfers need to take place anyway.