I fully support you. I propose a RISC CPU that has, for example, 16 pipelines, with execution units, caches and registers shared across the pipelines. I also suggest direct instructions instead of the Instruction->MOPs hierarchy. An instruction could operate on operands up to a cache line in size, and the number of operands could vary.
Good news: AMD have changed the specifications for the future SSE5 instructions to make them more compatible with Intel's AVX scheme, see: http://support.amd.com/us/Processor_TechDocs/43479.pdf
To avoid confusion, they have also changed the name SSE5 to XOP, FMA4 and CVT16.
Thank you so much, AMD. Please tell us whether Bulldozer will support all of XOP, FMA4 and CVT16.
Bad news: Intel have recently changed the specs for their FMA instructions, so that the compatibility is lost again.
In the initial preliminary specifications, AMD had 3 different operands on FMA instructions, and Intel had 4 operands.
Now, both companies have revised their specifications: AMD now has 4 operands on FMA instructions and Intel has 3 operands.
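To make the 3-operand versus 4-operand difference concrete, here is a toy sketch in Python. It is only an illustration under stated assumptions: the register names, the dict-based register file and the function names `fma3` and `fma4` are made up for this example, and it models neither the real encodings nor the single rounding of a hardware FMA. The real 3-operand instructions also come in several variants that choose which operand is overwritten; this sketch arbitrarily picks one.

```python
# Toy model of operand counts only (hypothetical names throughout).
# Registers are entries in a dict; a*b+c here is an ordinary
# multiply followed by an add, not a fused single-rounding operation.

def fma3(regs, dst_src, src2, src3):
    """3-operand form: the destination doubles as the first source,
    so its old value is overwritten by the result."""
    regs[dst_src] = regs[dst_src] * regs[src2] + regs[src3]

def fma4(regs, dst, src1, src2, src3):
    """4-operand form: a separate destination, so all three
    source registers keep their values."""
    regs[dst] = regs[src1] * regs[src2] + regs[src3]

# 3-operand: r0 = r0 * r1 + r2, and the original r0 is lost.
regs3 = {"r0": 2.0, "r1": 3.0, "r2": 4.0}
fma3(regs3, "r0", "r1", "r2")

# 4-operand: r0 = r1 * r2 + r3, and r1, r2, r3 are all preserved.
regs4 = {"r0": 0.0, "r1": 2.0, "r2": 3.0, "r3": 4.0}
fma4(regs4, "r0", "r1", "r2", "r3")
```

With the 3-operand form, a compiler that still needs the old value of the overwritten register has to copy it first; the 4-operand form avoids that copy at the cost of encoding one more operand.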
Apparently, Intel are to blame for not informing AMD in time about this change. They certainly knew that AMD planned to make compatible instructions because there have been patent sharing negotiations about this issue.
Maintaining compatibility seems to be a game of chasing a moving target, as long as both companies keep secrets from each other rather than cooperate.
Can somebody from AMD please comment on how you will react to Intel's latest change in their FMA spec? Will future AMD processors use 3 or 4 operands on FMA instructions, or support both forms?
As far as I can see, semantic compatibility is moot because these instruction set extensions (from AMD and Intel) are not syntactically compatible anyway.
BTW I don't think you can patent an instruction format (i.e., encoding), only the implementation.
Some of the new AMD instructions are different from anything Intel have, for example the half precision floating point calculations. Some instructions are identical in every way, for example PTEST. Some instructions serve the same need but are slightly different, for example AMD's VPCMOV and Intel's BLEND instructions.
AMD changed the coding of their FMA instructions to make them fully compatible with Intel's instructions. Unfortunately, Intel changed the specs for their FMA instructions so that compatibility is lost once again.
This situation is intolerable to the software community. We must find a way to standardize the x86 instruction set and force AMD and Intel to cooperate on instruction formats rather than playing tricks on each other for the sake of short term PR gains.
Originally posted by: avk Agner: It seems that AMD has heard you. But, alas, Intel did a dirty trick.
Yes, AMD have certainly done what I have argued so heavily for here and elsewhere - thank you very much for that - but I have no idea whether they would have done the same had I not voiced my opinion. I don't know what the motives behind Intel's change were.
All these problems could be avoided if we had a public forum for discussion of new instructions, supported by both Intel and AMD.
The only real reason for Intel's decision to change their FMA specification is to make it incompatible with AMD, IMHO. Somehow Intel managed to find out that AMD would adopt Intel's FMA, so Intel chose to change the specification first. Of course, Intel will never say that; it will say that "3 operands is much simpler to implement than 4" or something like that.