We recently worked on this code to showcase asynchronous DMA + kernel execution on AMD GPUs. Please go through it and give us your feedback. We hope it helps developers and students achieve better performance on AMD hardware.
Courtesy: Andryeyev German
Message was edited by: Himanshu Gautam