I need to optimize a working OpenCL program for AMD GPUs.
I would like to rewrite one core kernel in ISA assembly, but I'm stuck on the toolchain.
I have analyzed the OpenCL code in CodeXL, which gives me IL and ISA versions, but I don't know how to modify these and then run them.
I have used "Export Binary" (device-specific binaries would be fine), but I don't know how to run the resulting binaries in my own program.
Any suggestions would be very welcome at this point.