It's possible to inject any binary code into an OpenCL .elf image and then execute it using the latest OpenCL API. (There's an easier way to do it using CAL, but it's kind of broken on the 7970; only the 11.12_79xx 64-bit Catalyst works well for that.)
The ISA disassembler you mentioned is inside the drivers, but AFAIK it exists only for debugging purposes and there is no ISA asm -> ISA binary compiler in the drivers.
The best you can do is to write your own limited instruction set assembler (or code generator) for your specific tasks. Since July you can check the ISA documentation for the 7970 here -> developer/SDKs/AMD APP SDK/documentation/ AMD_Southern_Islands_Instruction_Set_Architecture_2012_Aug.pdf
Also there is a good overview on instruction encoding in GCN_2620_final.pdf
(site is under maintenance, I wasn't able to give exact links)
Thanks for your answer. I was just wondering, because I am designing 128-bit floating point arithmetic on this hardware and I'd like to make it fast and "usable".
"because I am designing 128bit floating point arithmetic on this hardware"
Someone else was also interested in integer operations; check this thread -> http://devgurus.amd.com/thread/159816 (Integer operations in GCN). It covers the instructions you'll need.
In bignum arithmetic we have to work with carries. On GCN they can be handled elegantly with S registers and the v_addc_u32 instruction. At the moment this could be a reason to choose writing the code in GCN asm.
For example here's a bigint multiplier cell:
v_mul_u32_u24 Work, MCols[CIdx], MRows[RIdx] \
v_add_i32 MAcc[AccIdx], vcc, MAcc[AccIdx], Work \
v_mul_hi_u32_u24 Work, MCols[CIdx], MRows[RIdx] \
v_addc_u32 Carry, vcc, CIn, Work, vcc
It calculates a 48-bit product from two 24-bit operands (MCols[CIdx]*MRows[RIdx]) and accumulates the result in two 32-bit registers (named MAcc[AccIdx] and Carry). 'Work' is a temporary register, and vcc is a temporary 'special 1-bit' register which moves the carry from the first addition into the second addition.
Now do that in OpenCL within 4 clock cycles!
(Btw, it would be so cool if there were a v_mad_u32_u24 that could generate a carry -> this way bigint multiplication could be 33% faster.)
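To make the carry flow in the multiplier cell easier to follow, here's a small Python model of the four instructions. It's only a sketch of the arithmetic as described above, not anything that runs on the GPU; the function name `mul_cell` and its argument names are made up for illustration.

```python
MASK24 = 0xFFFFFF
MASK32 = 0xFFFFFFFF

def mul_cell(acc, carry_in, a, b):
    """Model of the 4-instruction cell (hypothetical helper, plain integers).

    acc      -> MAcc[AccIdx] (32-bit accumulator)
    carry_in -> CIn
    a, b     -> MCols[CIdx], MRows[RIdx] (only the low 24 bits are used)
    Returns (new_acc, carry, vcc).
    """
    product = (a & MASK24) * (b & MASK24)   # full 48-bit product

    work = product & MASK32                 # v_mul_u32_u24: low 32 bits
    total = acc + work                      # v_add_i32: add, carry out to vcc
    new_acc = total & MASK32
    vcc = total >> 32                       # 0 or 1

    work = product >> 32                    # v_mul_hi_u32_u24: upper 16 bits
    total = carry_in + work + vcc           # v_addc_u32: add with carry in
    carry = total & MASK32
    vcc = total >> 32                       # carry out of the second addition

    return new_acc, carry, vcc

# Example: the largest 24-bit operands on top of a full accumulator.
new_acc, carry, vcc = mul_cell(0xFFFFFFFF, 0, 0xFFFFFF, 0xFFFFFF)
```

The point of the model is that nothing is lost: new_acc + (carry << 32) + (vcc << 64) always equals acc + (carry_in << 32) + a*b.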
Thanks, it's really helpful, but how can I assemble this code?
It is clear to me that writing the arithmetic part in GCN asm would be really, really fast, but
I have no idea how to assemble your code, or any code written in Tahiti ISA.
There's no official way to do it.
But you can compile a skeleton OpenCL kernel (compile it to binary only) and then patch your own binary code into it.
You'll have to handle .elf files (especially an ELF file inside an ELF image).
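For the ELF-inside-ELF part, here's a rough Python sketch that lists the sections of an image and looks for a section whose contents start with the ELF magic again. It assumes a 32-bit little-endian image; I'm not claiming that's what every driver version emits, and the function names are just for illustration. Which section actually holds the inner image is something to check on your own binary.

```python
import struct

ELF_MAGIC = b"\x7fELF"

def elf32_sections(image):
    """Return {section_name: (file_offset, size)} for a 32-bit LE ELF image."""
    if image[:4] != ELF_MAGIC:
        raise ValueError("not an ELF image")
    if image[4] != 1 or image[5] != 1:
        raise ValueError("this sketch only handles 32-bit little-endian ELF")
    hdr = struct.unpack_from("<16sHHIIIIIHHHHHH", image, 0)
    shoff, shentsize, shnum, shstrndx = hdr[6], hdr[11], hdr[12], hdr[13]

    # Each section header is ten 32-bit words; word 0 is the name offset,
    # word 4 the file offset and word 5 the size.
    headers = [struct.unpack_from("<10I", image, shoff + i * shentsize)
               for i in range(shnum)]
    strtab_off = headers[shstrndx][4]

    sections = {}
    for h in headers:
        start = strtab_off + h[0]
        end = image.index(b"\0", start)
        sections[image[start:end].decode("ascii")] = (h[4], h[5])
    return sections

def find_nested_elf(image):
    """Return (name, offset, size) of the first section that is itself an ELF."""
    for name, (offset, size) in elf32_sections(image).items():
        if size >= 4 and image[offset:offset + 4] == ELF_MAGIC:
            return name, offset, size
    return None
```

Once you have the offset and size of the inner image, the same `elf32_sections` call on that slice gives you the sections of the inner ELF, which is where the patching happens.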
Read the GCN ISA manual, so you learn how to encode those instructions you're going to use. Also learn s_branch instructions because you'll have to adjust jump offsets.
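As an example of what "adjusting jump offsets" means, here's a Python sketch that builds the 32-bit word for s_branch. It's based on my reading of the SOPP encoding in the Southern Islands ISA manual (fixed upper bits, a 7-bit opcode, a signed 16-bit immediate counted in dwords relative to the instruction after the branch), so please verify the constants against the manual before relying on them. The function name is made up.

```python
import struct

# Assumed from the SI ISA manual: SOPP instructions start with the
# bit pattern 101111111 in bits [31:23]; s_branch is opcode 2.
SOPP_PREFIX = 0b101111111 << 23
S_BRANCH_OP = 2

def encode_s_branch(branch_addr, target_addr):
    """Encode s_branch located at byte address branch_addr, jumping to target_addr.

    The immediate is (target - (branch + 4)) / 4, i.e. a dword count
    relative to the next instruction.
    """
    delta = target_addr - (branch_addr + 4)
    if delta % 4:
        raise ValueError("branch target must be dword aligned")
    simm16 = delta // 4
    if not -0x8000 <= simm16 <= 0x7FFF:
        raise ValueError("branch target out of range for a 16-bit offset")
    return SOPP_PREFIX | (S_BRANCH_OP << 16) | (simm16 & 0xFFFF)

# Bytes to write into the code section (little-endian dword):
# a branch at byte 8 that jumps back to byte 0.
patch = struct.pack("<I", encode_s_branch(8, 0))
```

If you insert or remove instructions between a branch and its target, every such immediate has to be recomputed this way.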
There is also GCN-specific technical info in the latest OpenCL manual.
It's not that easy, but AFAIK that's the only way. Since July there is official ISA documentation out. When the GCN card came out, it was harder, because all we had was a pdf brochure that luckily contained all the different instruction encodings and some tech info, and a disassembler.
Thanks for all the info. So I have to write my code in byte code if I'd like to write a low-level, assembly-like program.
Could you please show me a "project", e.g. a simple piece of code that doubles a value, with all of the dirty work: patching binary code into an OpenCL kernel, handling ELF images, and so on?
Hey, give me a few free days. I've already planned to make a 'tutorial' on how to execute binary code under OpenCL, because I haven't touched OpenCL for 4-5 months, and now, looking at my stuff, I don't remember what I did there.
(I'm still using the three_years_deprecated CAL API; it's simpler, but it's quite dead now. I'll have to find out again what the different registers are used for under OpenCL, and finally write it down, haha.)
Hello, it took me more than a few days, but maybe you could try this: http://realhet.wordpress.com/
(Hope it will work for you, though it's absolutely beta.)
Thanks for the answers
I will try your assembler for sure once I get Win7 installed on this machine.
I do some research in number theory, so I'll try your assembler versus OpenCL, and
I'm really excited about the result.