Hi,
s_load_buffer went from a 32-bit to a 64-bit encoding so it can support a bigger offset, which is reasonable.
But why rearrange the flag bits in the MUBUF encoding?
Also in VOP3a, only the clamp bit is repositioned.
And most interestingly, the VOP2 opcodes have changed as well. o.O (I haven't looked at the other opcodes yet, but I'd guess they changed too.)
Does this incompatibility have a higher purpose? A lower transistor count on the chip, for example?
For the assembler it is not that good, as it now has to support both the old and the new ISA. Including Evergreen, there are now 3 different ISAs to produce in total.
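To illustrate what supporting both means for the assembler, here is a minimal sketch in Python of a VOP2 encoder that picks its opcode table by ISA. The labels "gcn1" and "gcn3", the function name and the table layout are all made up for this example, and the opcode numbers are my reading of the ISA manuals, so check them before relying on them. As far as I can tell the VOP2 field layout itself stays the same and only the opcode numbers move.

```python
# Hypothetical sketch: one VOP2 encoder, with the opcode table chosen per ISA.
# "gcn1"/"gcn3" are labels invented for this example (old vs. new ISA).
# Opcode values are assumptions taken from the ISA manuals; verify before use.
VOP2_OPCODES = {
    "gcn1": {"v_cndmask_b32": 0, "v_add_f32": 3, "v_sub_f32": 4, "v_mul_f32": 8},
    "gcn3": {"v_cndmask_b32": 0, "v_add_f32": 1, "v_sub_f32": 2, "v_mul_f32": 5},
}


def encode_vop2(isa, mnemonic, vdst, src0, vsrc1):
    """Return the 32-bit VOP2 instruction word for the given ISA.

    vdst and vsrc1 are VGPR numbers (0..255); src0 is the raw 9-bit
    source field, where a VGPR n is written as 256 + n.
    """
    op = VOP2_OPCODES[isa][mnemonic]
    if not (0 <= vdst < 256 and 0 <= vsrc1 < 256 and 0 <= src0 < 512):
        raise ValueError("operand out of range")
    # Layout: bit 31 = 0, OP[30:25], VDST[24:17], VSRC1[16:9], SRC0[8:0]
    return (op << 25) | (vdst << 17) | (vsrc1 << 9) | src0


# v_add_f32 v1, v2, v3 on each ISA: same operands, different opcode field.
old_word = encode_vop2("gcn1", "v_add_f32", 1, 256 + 2, 3)
new_word = encode_vop2("gcn3", "v_add_f32", 1, 256 + 2, 3)
print(hex(old_word), hex(new_word))
```

With a table like this, the per-ISA difference for VOP2 is just data. The MUBUF and VOP3a changes would need per-ISA field positions as well, which this sketch does not cover.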
(Also, the disassembler still does not work on ISA-only ELF files, but I guess that is to protect kernel files by making them harder to reverse-engineer.)
On the hardware side, it's a great new GPU. I keep finding new stuff: just now I found a real-time clock (s_memrealtime) and a special 16-bit shl thingy.