
arsenm
Adept III

Error when replacing IL in binary

I'm trying to replace the .amdil section in the binary, but I get an error when I try to load it ("Internal error: Input OpenCL binary is not for the target!").

I generated a binary (on Cayman) without the .llvmir or executable sections, and then used objcopy to replace the IL section (kind of like shown here). I don't see where the error would come from, since the device type should be the same. In the kernel generated from OpenCL C, there is a block of comments that looks like ";device:cayman". Does the loader look for these comments or something? I tried sticking in the same block of comments from a reference kernel, but I get the same error.

0 Likes
15 Replies

A few questions:
1) Are you attempting to generate a binary for the same device?
2) Are you updating the .rodata section with the correct metadata information (this is not documented, but it is the same as the comments in the code)?
3) How important is this feature to your project?
0 Likes

1. Yes, it's the same device

2. I was wondering whether that was there; however, I was trying to eliminate that variable by starting with a binary from the same device.

3. It's pretty important. My CAL/IL version of the project is ~30% faster than my OpenCL version, and I need to replace it because CAL is being deprecated. This is for a largeish distributed computing science project, and the current CAL version is responsible for about 80% of our results.

I think I could achieve similar speed with only a few small chunks of IL though (I would much prefer inline IL or perhaps some kind of intrinsic extension functions). I believe most of the problem is that I can't get to the dldexp and dfrac IL instructions for a couple of replacements I use for math functions. The most important function in this project is (double precision) exp(). My OpenCL kernel uses the standard exp. The IL exp replacement I'm using uses dfrac and dldexp. I've tried to achieve the same thing in OpenCL using the fract() and ldexp() functions, but the compiler doesn't seem to emit dfrac, and it emits two dldexp instructions. The whole thing ends up 50% slower than the somewhat more accurate standard function.

I can post the pieces of IL that I really want to reach if you're interested.
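
For readers unfamiliar with the approach, here is a minimal sketch of how an exp() replacement can be built from a fractional-part step and an ldexp step. It is not the poster's IL, which was never shown in the thread; the function name and the use of Python's `2.0 ** f` in place of a polynomial approximation are assumptions made purely for illustration.

```python
import math

LOG2_E = 1.4426950408889634  # log2(e)


def exp_via_frac_ldexp(x):
    """Illustrative only: exp(x) = 2**(x * log2(e)), split into two parts."""
    t = x * LOG2_E
    n = math.floor(t)   # integer part of the base-2 exponent
    f = t - n           # fractional part in [0, 1); the role a dfrac-style step plays
    # A real kernel would approximate 2**f with a short polynomial;
    # here the built-in power operator stands in for it.
    return math.ldexp(2.0 ** f, n)  # scale by 2**n; the role a dldexp-style step plays
```

The point of the split is that the scaling by 2**n is exact, so only the small range [0, 1) has to be approximated.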

0 Likes

arsenm,
This isn't a path that we support, so I can't guarantee that it will work, but here is one thing you can try.
Write an OpenCL program and compile it to a binary containing IL. Modify the IL to contain your sequence of instructions without touching the bootstrap code, then reinject the modified IL.
0 Likes

I get the same error even when I put back the same IL that was in the original binary, so something must be wrong with the way I'm using objcopy to replace the section:

objcopy -I elf32-i386 -O elf32-i386 --alt-machine-code=0x3f8 -R ".amdil" -R ".source" -R ".llvmir" --add-section ".amdil"=replacement_kernel.il original.bin new.bin

0 Likes

Comparing the modified and unmodified binaries with readelf, I noticed that the type of the ELF object and a few other small details had changed. I wrote a quick little utility to replace the section manually with libelf, and that seems to be working.
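
The kind of header comparison described above can also be scripted. The following is a rough sketch, assuming a 32-bit little-endian ELF (which matches the elf32-i386 target in the objcopy command); the function names are hypothetical, and it only reports differing header fields, it does not replace any section.

```python
import struct

# Layout of the ELF32 file header (52 bytes), little-endian.
ELF32_HEADER = struct.Struct("<16sHHIIIIIHHHHHH")
FIELDS = (
    "e_ident", "e_type", "e_machine", "e_version", "e_entry",
    "e_phoff", "e_shoff", "e_flags", "e_ehsize", "e_phentsize",
    "e_phnum", "e_shentsize", "e_shnum", "e_shstrndx",
)


def read_elf32_header(data):
    """Return the ELF32 header fields of `data` as a dict."""
    if data[:4] != b"\x7fELF":
        raise ValueError("not an ELF file")
    if data[4] != 1 or data[5] != 1:
        raise ValueError("this sketch only handles 32-bit little-endian ELF")
    return dict(zip(FIELDS, ELF32_HEADER.unpack_from(data, 0)))


def diff_elf32_headers(a, b):
    """Map each differing header field to its (value_in_a, value_in_b) pair."""
    ha, hb = read_elf32_header(a), read_elf32_header(b)
    return {name: (ha[name], hb[name]) for name in FIELDS if ha[name] != hb[name]}
```

To use it, read original.bin and new.bin in binary mode and pass both byte strings to `diff_elf32_headers`; a changed `e_type` would show up as a single entry in the result.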

0 Likes

The bitcoin OpenCL grinder also needs an instruction that is not exposed in OpenCL. They solved it by substituting a different function with the same input/output configuration, then doing a binary search-and-replace on the compiled kernel to swap the placeholder for the needed instruction.
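
The search-and-replace idea can be sketched generically as follows. This is not the grinder's actual code; the function name is hypothetical and the byte patterns are left to the caller, since the real instruction encodings are not given in this thread. The one constraint the sketch enforces is that the replacement has the same length as the placeholder, so no offsets or sizes in the binary shift.

```python
def replace_placeholder(binary, placeholder, replacement):
    """Swap every occurrence of `placeholder` for `replacement` in `binary`.

    Both patterns must have the same length so the layout of the
    compiled kernel is left unchanged.
    """
    if len(placeholder) != len(replacement):
        raise ValueError("replacement must be the same length as the placeholder")
    if binary.count(placeholder) == 0:
        raise ValueError("placeholder pattern not found")
    return binary.replace(placeholder, replacement)
```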

It seems to me that offering only OpenCL as the target language for AMD GPUs is a waste, seeing as the hardware has powerful instructions that go unused...

Edit: URL to kernel: http://bitcointalk.org/?topic=6458.0

0 Likes

Originally posted by: MicahVillmow
"arsenm, While this isn't a path that we support, so I can't guarantee that it will work, one thing you can try. Write an OpenCL program and compile the binary to IL. Modify the IL to contain your sequence of instructions, but not touching the boot strap code. Reinject the modified IL."



Is it possible to reinject the modified IL using only the OpenCL API, or do I have to use CAL API to do that?

0 Likes

galmok,
The BFI_INT instruction is being exposed soon.
0 Likes
arsenm
Adept III

I made some progress on this. The problem was a mistyped instruction that doesn't actually exist: I had written dl_arena_uav_id instead of dcl_arena_uav_id. The result was the misleading "Input OpenCL binary is not for the target!" error.

0 Likes

I have been following the hints in this thread on replacing the IL code in the ELF binary. I need a factor of 2-3 speedup over the OpenCL code. Otherwise, I will need to switch to FPGAs.

I got it to work using libelf and replacing the IL code in the .amdil section with the IL code generated from C. The problem is that merely adding comments to the IL code sometimes causes it to fail.

; Start of main().
works, but
; Start of main(). Duck! Duck! Duck!
fails with the error
calclCompile failedInternal error: Input OpenCL binary is not for the target!

It does not matter what the comment is, only the number of characters used.  Replacing the string with spaces has the exact same effect. It is very repeatable.  Any thoughts?

OpenCL 1.1 AMD-APP-SDK-v2.5 (684.213)
Catalyst_11.8 Driver
Linux 2.6.38-11-generic #48-Ubuntu

0 Likes

My experience is that the error "calclCompile failedInternal error: Input OpenCL binary is not for the target!" is a pretty generic and usually misleading error that seems to be caused by most problems.

It is sometimes legitimate, for instance if you're using IL instructions not supported on whatever you're building for, but sometimes it isn't. For example, you get the same error if you use instructions that don't exist at all.

I had a few other weird cases that resulted in this (though I don't think I ran into anything where changing comments caused it), but I don't remember what they were right now.

0 Likes

I found the problem that prevented modifying the IL source. If the size of the ELF .amdil section is changed, then the symbol table needs to reflect the new size. More precisely, the sizes of the symbols in the .symtab section need to mirror the sizes of the .amdil and .rodata sections.

I had the following three symbols in .symtab. The first needed to reflect the size of the .amdil section, and the other two were related to the size of the .rodata section.

Symbol 1: __OpenCL_ker_name_amdil
Symbol 2: __OpenCL_ker_name_metadata
Symbol 3: __OpenCL_ker_name_fmetadata
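
As a rough illustration of that fix, the sketch below updates the st_size field of one named symbol. It assumes ELF32 little-endian symbol entries (16 bytes each) and that the raw contents of .symtab and its string table have already been extracted; the function names are hypothetical, and writing the patched table back into the binary is left out.

```python
import struct

# ELF32 symbol entry: st_name, st_value, st_size, st_info, st_other, st_shndx
ELF32_SYM = struct.Struct("<IIIBBH")


def symbol_name(strtab, offset):
    """Read the NUL-terminated name stored at `offset` in the string table."""
    end = strtab.index(b"\x00", offset)
    return strtab[offset:end].decode("ascii")


def patch_symbol_size(symtab, strtab, name, new_size):
    """Return a copy of `symtab` with the st_size of symbol `name` set to `new_size`."""
    out = bytearray(symtab)
    for off in range(0, len(symtab) - ELF32_SYM.size + 1, ELF32_SYM.size):
        st_name, st_value, _, st_info, st_other, st_shndx = ELF32_SYM.unpack_from(symtab, off)
        if symbol_name(strtab, st_name) == name:
            ELF32_SYM.pack_into(out, off, st_name, st_value, new_size,
                                st_info, st_other, st_shndx)
            return bytes(out)
    raise KeyError(name)
```

For a kernel called ker_name, one would call this once for `__OpenCL_ker_name_amdil` with the new length of the .amdil contents, and similarly for the metadata symbols if .rodata changed.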

0 Likes

riataman,
If you can provide an example of what fails, we can investigate it. However, this is an unsupported path at this time, so I can't give any guarantees.
0 Likes

Glad you were able to find the solution, hopefully it will be useful to others that are experimenting with this.
0 Likes

eugenek,
As this is not a supported path, there is no API to do what you want.
0 Likes