What is the intermediate language for AMD's APU machines (in particular, the A10-5800 APU)? Is it AMD IL? And how can I retrieve the AMD IL from an executable? For a discrete AMD Radeon GPU, I used the build option "-save-temps" to generate the *.il files, but when I do the same on the AMD APU machine, the generated .il file looks quite different for the same OpenCL application. I am not sure how to read it. Can you point me to a specification for it?
Secondly, do the new Kaveri systems (A10-7850K and A10-7700K) support the HSA IL as their intermediate language? Is that already supported by the Beta Catalyst driver that is out?
The OpenCL stack provided by AMD uses AMD IL as its interface to the shader compiler backend, which it uses when targeting a GPU. The differences that you see are due to different GPU implementations. Because the chips differ, the compiler frontend makes different choices about the exact IL it emits. In particular, A10-5800 is a Trinity, which uses an HD 5000 series GPU (also known as Evergreen series). You don't state which discrete GPU you are comparing that IL with, but if it is Southern Islands (mostly HD 7000 series) or something newer, the IL is going to be different. Even among chips of the same generation, there can be small differences in IL. Kaveri has one of the newer SI (GCN) GPUs.
HSAIL is a different intermediate language. It and the rest of the programming environment are described in a draft specification here: http://hsafoundation.com/standards/
If you want to easily see AMD IL and generated GPU instructions (which we call ISA) for an OpenCL kernel on any supported GPU, I suggest that you get CodeXL and use the Kernel Analysis mode. This is much easier than trying to deal with the saved binaries.
If you really want to mess with the saved binaries, they are packaged in a pretty vanilla container. The ELF magic cookie in the first few bytes might lead you to try readelf on them. That said, depending on how the binary was built, the sections present can vary. So the -save-temps option and CodeXL are better ideas.
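For anyone who does want to poke at a saved binary without readelf, here is a minimal Python sketch that checks the ELF magic cookie and lists the section names with their file offsets and sizes. It relies only on the standard ELF header layout; the function name list_elf_sections is made up for illustration, and nothing here is specific to AMD's tooling. Which sections actually show up depends on how the binary was built, as noted above.

```python
import struct

ELF_MAGIC = b"\x7fELF"


def list_elf_sections(data):
    """Return [(name, file_offset, size), ...] for every section header.

    Hypothetical helper: it only follows the generic ELF layout and makes
    no assumption about which sections an OpenCL binary contains.
    """
    if data[:4] != ELF_MAGIC:
        raise ValueError("not an ELF file (magic cookie missing)")

    is_64 = data[4] == 2                   # EI_CLASS: 1 = 32-bit, 2 = 64-bit
    endian = "<" if data[5] == 1 else ">"  # EI_DATA: 1 = little, 2 = big

    if is_64:
        # ELF64: e_shoff at 0x28; e_shentsize/e_shnum/e_shstrndx at 0x3A
        (shoff,) = struct.unpack_from(endian + "Q", data, 0x28)
        shentsize, shnum, shstrndx = struct.unpack_from(endian + "HHH", data, 0x3A)
    else:
        # ELF32: e_shoff at 0x20; e_shentsize/e_shnum/e_shstrndx at 0x2E
        (shoff,) = struct.unpack_from(endian + "I", data, 0x20)
        shentsize, shnum, shstrndx = struct.unpack_from(endian + "HHH", data, 0x2E)

    def read_header(index):
        """Read sh_name, sh_offset and sh_size of one section header."""
        base = shoff + index * shentsize
        (name_off,) = struct.unpack_from(endian + "I", data, base)
        if is_64:
            offset, size = struct.unpack_from(endian + "QQ", data, base + 24)
        else:
            offset, size = struct.unpack_from(endian + "II", data, base + 16)
        return name_off, offset, size

    if shnum == 0:
        return []

    # Section names live in the string table section indexed by e_shstrndx.
    _, str_off, str_size = read_header(shstrndx)
    strtab = data[str_off:str_off + str_size]

    sections = []
    for index in range(shnum):
        name_off, offset, size = read_header(index)
        end = strtab.find(b"\0", name_off)
        if end < 0:
            end = len(strtab)
        name = strtab[name_off:end].decode("ascii", "replace")
        sections.append((name, offset, size))
    return sections
```

To use it, read the saved binary in "rb" mode and pass the bytes to list_elf_sections; the first entry is normally the empty-named null section. The sketch does not handle the extended numbering used when a file has more than about 65,000 sections.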
Thank you for your response. I have a few follow-up questions.
1. I realize HSAIL is a different intermediate language, but my
understanding was that the Kaveri systems will support HSAIL, and that AMD
IL will no longer be supported. Is that not true? And if both will be
supported, do you know when HSAIL support will be (or is) available and how
to access the HSAIL from the binary on Kaveri systems?
2. I am trying to use the -save-temps option to modify the saved binaries.
With respect to that, if I want to replace the AMD IL with my own IL and
create an OpenCL program using OpenCL's clCreateProgramWithBinary with the
new IL, do I need to make sure that the binary only has the .amdil section
and not the remaining sections? Would that work?
On Wed, Feb 26, 2014 at 6:50 AM, Roland Ouellette
I'm not sure of the status of the new HSA language, runtime and such.
If we have released something, it really can only be considered preliminary as the standard has not yet been approved.
The first GPU that really can do all the things promised by the HSA language is Kaveri.
AMD IL will continue to be used in parallel with HSAIL.
The CAL interface that we used to have let you program directly with AMD IL.
I don't think we have published anything about AMD IL since then.
The interfaces for programming the GPU for computation are OpenCL, C++ AMP, DX compute, OpenGL Compute, and the new HSAIL interface.
What is really unsatisfactory about using OpenCL?
Maybe we can fix your real problem.
That said, if you omit the .text section from a saved binary, I think that the OpenCL runtime will rebuild the compiled ISA from the IL. I have not looked at an object in a while, but you would also want to remove any sections that hold the IL in binary form, since the runtime might prefer those over the text form.
It seems really odd to work directly with AMD IL, other than looking at it and noticing opportunities for optimization.
Likewise, CodeXL will let you examine the machine ISA, but it's not like you really should want to do much about that, beyond noting poor optimization choices and incorrect code.
What are you trying to achieve?