AMD and Intel are making mutually incompatible instructions and are using different instruction codes for almost identical instructions. This is certainly not what the IT community wants, but it is a consequence of free competition: the two companies compete to invent new instructions and keep their plans secret from each other. We have already seen the two companies assign different codes to the same instruction, but the worst nightmare is yet to come: assigning different instructions to the same code.
The current situation is very unfortunate for the software industry. Very few software developers are willing to bear the costs of developing, testing and maintaining separate versions of their software for AMD and Intel.
This problem is a consequence of a market situation where each company has to keep its plans secret for reasons of competition. A voluntary peace agreement is unlikely, so the only cure is a legal or political intervention. The initiative for a legal intervention may come from AMD, because the current situation is more advantageous to Intel than to AMD. The best that can come out of such a process is a public standardization committee where new instructions are discussed and approved. A less ambitious outcome would be an agreement about which part of the opcode space each company can use for its innovations.
However, such a legal process could take years, and AMD cannot remain passive in the meantime. I will therefore discuss what AMD could do in the present situation if no peace agreement with Intel can be obtained.
The history in a nutshell:
- AMD invented 3DNow, Intel invented SSE, Intel won. AMD had to copy SSE.
- AMD invented x64, Intel invented IA64, AMD won. Intel copied x64.
- AMD invented SSE5, Intel invented AVX, Intel won. AMD will have to copy AVX.
The situation of SSE5 versus AVX is particularly troublesome. We have two different schemes for coding instructions with more than two operands. These two schemes are mutually incompatible, and it would be quite costly in terms of instruction decoding hardware to support both. The AVX scheme is technically superior, as I have argued elsewhere (http://aceshardware.freeforums.org/intel-avx-kills-amd-sse5-t538.html), so I have no doubt that AVX will win this competition.
AMD will have to revise their SSE5 specification to fit the AVX coding scheme. Call it SSE5R or whatever. Some of the SSE5 instructions can simply be replaced by the almost equivalent instructions in the Intel AVX and FMA instruction sets, but many of the SSE5 instructions have no equivalent Intel instructions - yet.
Here comes the next problem. How can AMD find a vacant bit combination in the AVX scheme without running the risk that Intel has something else in the pipeline using the same code for something else? I have asked in Intel's AVX forum whether there is space reserved for other vendors, but got no answer (http://softwarecommunity.intel.com/isn/Community/en-US/forums/thread/30257153.aspx).
I have therefore made a list of what AMD could do if Intel refuses to assign part of the AVX code space to AMD:
(1). Use some of the unused bits in the VEX prefix to indicate new AMD instructions. This would be a very dangerous solution. One important feature of the VEX coding scheme is that the instruction length can be determined from the VEX prefix and the mod/reg/rm byte alone. No matter which bit combination AMD chooses, there is a possibility that Intel has already assigned the same bit combination to some other instruction with a different length. This would create an incompatibility that is impossible to resolve.
(2). Put a VEX prefix on codes that are already in use by AMD. The 3DNow instructions don't need a VEX prefix because VEX is not allowed on MMX instructions. This frees the following codes for other use:
0E, 0F, 24, 25, 7A, 7B preceded by VEX with mm = 01.
(3). Define a new VEX prefix. The current VEX prefixes begin with C4 and C5. These are the same codes as the old LES and LDS instructions, which are not allowed in 64-bit mode. In 32-bit mode, the distinction between VEX prefix and LES/LDS is based on the two leftmost bits of the subsequent byte, which are 11 if it is a VEX prefix. This bit combination would indicate an illegal register operand on LES/LDS. There is one more byte value that can be used in the same way, namely the hexadecimal value 62. This is the BOUND instruction, which is not allowed in 64-bit mode and cannot have a register operand. The 62 byte value can be used as a VEX prefix for AMD instructions. However, this is the only remaining byte value that has this property. Using it in an unwise and shortsighted way may prevent future extensions. Using 62 as a three-byte VEX prefix analogously to C4 would not add much to the opcode space. I would prefer to make it a four-byte VEX prefix. The first byte is 62, and the next two bytes should have exactly the same meaning as for the C4 VEX prefix, including the instruction length information. A single bit of the fourth byte should indicate an AMD instruction. You could make a public announcement saying that the part of the opcode space defined by this bit = 1 is AMD territory. Everybody else stay out, unless copying an AMD instruction. The remaining seven bits of the fourth byte are available for future extensions.
(4). If you fear that Intel may have other plans for the 62 byte, then there are two other byte values that can be used for VEX prefixes, although this is a little more tricky. These are D4 and D5. These codes are currently assigned to the obsolete instructions AAM and AAD, which are not allowed in 64-bit mode. The distinction between VEX prefix and AAM/AAD in 32-bit mode would still be based on the two leftmost bits of the subsequent byte being 11. The second byte of the AAM and AAD instructions is almost always = 0A (= 10 decimal). This is the radix, or number base, for unpacked BCD calculations. Other values are possible, but partly undocumented and almost never used. The AMD manual and a few old Intel manuals state that other values are possible, while most manuals specify only the value 0A. Values other than 0A are not supported by assemblers and compilers. The only values that make sense when used for radix conversions are in the interval 0x02 - 0x10. The value would have to be greater than or equal to 0xC0 to interfere with the use as a VEX prefix. It is theoretically possible that some programmer has amused himself by using AAM or AAD for purposes other than those intended, with a byte value >= 0xC0. This would probably be some old and obscure DOS program.
The probability that such a VEX prefix would break existing software is so low that I would consider it permissible from a purely technical point of view. However, there is another consideration that cannot be ignored, and that has to do with PR. It is possible that a competitor or a nit-picking IT journalist would claim that the processor might be incompatible with existing software, even if there is no proof that such software exists at all. For this reason, it should be possible to switch off the VEX use of these codes in 32-bit mode, for example by means of a bit in the EFLAGS register.
(5). Same as (4), but available only in 64-bit mode. This assumes that high-end users will be using 64-bit mode anyway by that time.
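The disambiguation rule that options (3) to (5) rely on can be sketched in a few lines of Python. This is only an illustration of the rule as described above, not decoder logic from any manual. The names `LEGACY_MNEMONIC`, `classify`, `d4_d5_prefix_in_32bit` and `aam` are made up for this example, and the `d4_d5_prefix_in_32bit` flag is a stand-in for the proposed off switch in (4) or the 64-bit-only restriction in (5). The `aam` helper models the commonly documented behaviour of AAM (AH = AL / radix, AL = AL mod radix) to show why the usual radix 0A never collides with a prefix.

```python
# Hypothetical model of the prefix / legacy-instruction disambiguation.
# All names in this block are illustrative assumptions.

# Legacy instruction that shares its first byte with a (proposed) prefix.
LEGACY_MNEMONIC = {
    0xC4: "LES",    # existing three-byte VEX prefix
    0xC5: "LDS",    # existing two-byte VEX prefix
    0x62: "BOUND",  # prefix byte proposed in option (3)
    0xD4: "AAM",    # prefix byte proposed in option (4)
    0xD5: "AAD",    # prefix byte proposed in option (4)
}


def classify(first, second, mode64, d4_d5_prefix_in_32bit=True):
    """Return "prefix" or the legacy mnemonic for a two-byte sequence."""
    if first not in LEGACY_MNEMONIC:
        raise ValueError(f"byte {first:#04x} is not a candidate prefix")
    if mode64:
        # None of these legacy instructions is allowed in 64-bit mode.
        return "prefix"
    if first in (0xD4, 0xD5) and not d4_d5_prefix_in_32bit:
        # Models the off switch from option (4), or option (5).
        return LEGACY_MNEMONIC[first]
    # 32-bit mode: the two leftmost bits of the next byte decide.
    if (second >> 6) == 0b11:
        return "prefix"
    return LEGACY_MNEMONIC[first]


def aam(al, radix=0x0A):
    """Model of AAM: returns (AH, AL) = (AL // radix, AL % radix)."""
    return divmod(al, radix)


print(classify(0xD4, 0x0A, mode64=False))  # ordinary AAM with radix 10
print(classify(0xD4, 0xC8, mode64=False))  # second byte >= 0xC0
print(aam(57))                             # splits 57 into digits 5 and 7
```

With the default settings, the three lines print `AAM`, `prefix` and `(5, 7)`. The only AAM/AAD encodings that change meaning are those whose second byte is 0xC0 or higher, which is the rare case discussed under (4).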