
agner
Adept I

Typos in Programmer's Manual Volume 6 ?

Possible errors in spec. for new XOP instructions

I was happy to see the new Programmer’s manual volume 6 which has made a lot of modifications for the sake of compatibility with the VEX coding scheme. Such a revision was certainly a wise thing to do (see thread http://forums.amd.com/devforum/messageview.cfm?catid=203&threadid=98392&enterthread=y )

As I started to put the new codes into my disassembler ( http://www.agner.org/optimize/#objconv ) I found some inconsistencies that might be typos:

Page 114ff: The instructions VFRCZPS/PD/SS/SD have opcodes 81,80,83,82 respectively. I believe this should be 80,81,82,83 respectively.

Page 124: The code for VFRCZSS is specified in the YMM version only (L=1). I believe the XMM version (L=0) will be available first.

Page 29, table 1-5: The source operands in 3-operand instructions are swapped when XOP.W = 0 and not swapped if XOP.W = 1 (relative to the order the operands have on instructions that don't allow swapping). It is opposite for 4-operand instructions (table 1-2). This is inconsistent, but possibly correct.

 

0 Likes
9 Replies
rex8664
Staff

Hi Agner,

For the FRCZ instructions, you are correct on both counts -- thanks for pointing that out.  One other encoding typo that you may notice is that VCVTPS2PH should have an mmmmm value of 08, not 09.  I'm sure a few more typos will be discovered, and at some point we'll release an update to the document.
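For anyone updating a decoder table, the two confirmed FRCZ corrections could be captured roughly as follows. This is only a minimal sketch in Python: the opcode values are the ones proposed in the first post, while the names `VFRCZ_OPCODES`, `SCALAR_FORMS` and `decode_vfrcz` are hypothetical, and rejecting L=1 for the scalar forms is an assumption rather than something stated in the manual.

```python
# Opcode bytes for the VFRCZ family as corrected in this thread
# (the manual listed them as 81, 80, 83, 82). Names are illustrative only.
VFRCZ_OPCODES = {
    0x80: "VFRCZPS",
    0x81: "VFRCZPD",
    0x82: "VFRCZSS",
    0x83: "VFRCZSD",
}

# Scalar forms, expected with L=0 (XMM) per the correction above.
SCALAR_FORMS = {"VFRCZSS", "VFRCZSD"}


def decode_vfrcz(opcode, l_bit):
    """Return the mnemonic for an opcode byte, or None if it is not accepted."""
    name = VFRCZ_OPCODES.get(opcode)
    if name is None:
        return None
    if name in SCALAR_FORMS and l_bit != 0:
        # Assumption: scalar forms have no YMM (L=1) encoding.
        return None
    return name
```

With this table, `decode_vfrcz(0x82, 0)` yields `"VFRCZSS"`, and the same opcode with L=1 is rejected.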

As for the operand swapping, I'm not sure what you mean about the inconsistency -- can you provide more detail?  At any rate, the tables are indeed correct.

Dave Christie

 

0 Likes

Originally posted by: rex8664 As for the operand swapping, I'm not sure what you mean about the inconsistency -- can you provide more detail?


For example, VCOMB does not allow operand swapping. It has the W bit = 0. The order of the source operands is specified as:

src1 = vvvv, src2 = rm (possibly a memory operand)

VPROTB allows operand swapping.

VPROTB with W = 1: src1 = vvvv, src2 = rm (possibly a memory operand)

VPROTB with W = 0: src1 = rm, src2 = vvvv (not a memory operand)

The "natural order" of the operands is src1 = vvvv, src2 = rm. The swapped order is src1 = rm, src2 = vvvv. This is also in accordance with Intel's VEX scheme. So on VPROTB, the operands are swapped if W=0, but on VCOMB the operands are not swapped when W=0.

If it is later decided to allow operand swapping on VCOMB, then it will be: operands are swapped if W=1. So VCOMB and VPROTB have opposite meanings of the W bit.

The four-operand instructions have swapped the operands if W=1, and not swapped if W=0, for both AMD and Intel. So the VPROTx instructions are opposite of all other instructions on the W bit.
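To make the difference concrete, here is a minimal Python sketch of how a decoder might pick the source operand order for the two three-operand examples above. The function name `source_operands` is hypothetical, and only the VCOMB and VPROTB cases described in this post are covered; it is not a general XOP decoder.

```python
def source_operands(mnemonic, w_bit):
    """Return (src1, src2) as encoding field names for the cases in this post."""
    if w_bit not in (0, 1):
        raise ValueError("W must be 0 or 1")
    if mnemonic == "VCOMB":
        # No swapping allowed; only W=0 is described.
        if w_bit != 0:
            raise ValueError("VCOMB is only described with W=0")
        return ("vvvv", "rm")
    if mnemonic == "VPROTB":
        # Natural order with W=1, swapped order with W=0.
        return ("vvvv", "rm") if w_bit == 1 else ("rm", "vvvv")
    raise KeyError(mnemonic)
```

Note that W=0 gives the natural order for VCOMB but the swapped order for VPROTB, which is the inconsistency in question.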

---

Speaking of inconsistencies, there is also an inconsistency with the pp bits. Instructions introduced by AMD have pp=00, while instructions introduced by Intel have pp=01 for obscure reasons. This disagreement may cause problems if the pp bits are used in the future for register extensions or whatever. It would be natural to use the pp bits for extensions in the vector size beyond 256 bits and reserve the unused mm bits for changes that affect opcode length.

0 Likes

Ah yes, I see what you mean.  I suppose that could be viewed as an inconsistency, although I don't see it as being problematic.  We don't expect to add the swapping ability to VCOM in the future.

Regarding the pp bits: With Intel's VEX encoding, these reflect the presence of a 66, F2 or F3 prefix on the legacy 2- or 3-byte opcode encoding of the instruction.  For every VEX-encoded instruction (as defined so far at least), there is a parallel legacy encoding that may use one of these prefixes (and Intel's original FMA encodings used the 66 prefix, hence have a pp value of 01).  XOP instructions have no equivalent legacy encodings, hence no use of these prefixes to mirror in the pp field.  And so we simply decided to keep the pp field reserved.
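The mapping between pp and the legacy prefixes can be sketched as below. This is a minimal Python illustration, not taken from the manual text quoted in this thread: the bit layout of the last prefix byte (W, inverted vvvv, L, pp) and the pp values for F3 and F2 are assumed from the usual three-byte VEX/XOP layout and should be checked against the manual. The names `PP_TO_LEGACY_PREFIX` and `split_last_prefix_byte` are hypothetical.

```python
# pp value -> legacy prefix byte mirrored by a VEX encoding (None = no prefix).
PP_TO_LEGACY_PREFIX = {
    0b00: None,
    0b01: 0x66,
    0b10: 0xF3,
    0b11: 0xF2,
}


def split_last_prefix_byte(byte):
    """Split the final byte of a three-byte VEX/XOP prefix into its fields."""
    return {
        "W": (byte >> 7) & 1,
        "vvvv": ~(byte >> 3) & 0xF,  # assumed to be stored inverted
        "L": (byte >> 2) & 1,
        "pp": byte & 0b11,
    }
```

Under these assumptions, an XOP instruction would always decode with `pp == 0`, while an Intel FMA encoding would give `PP_TO_LEGACY_PREFIX[pp] == 0x66`.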

 

0 Likes

Originally posted by: rex8664 XOP instructions have no equivalent legacy encodings, hence no use of these prefixes to mirror in the pp field.  And so we simply decided to keep the pp field reserved.


I understand. I guess I shouldn't ask you why Intel changed their prefix policy and added a 66 prefix to ALL new instructions from SSSE3 onward. This weird decision is reflected in the pp bits of Intel's instructions, including instructions that don't have a legacy encoding. I was just expecting you to follow the same pattern as Intel in order to make it easier to use the pp bits for some purpose in the future.

---

I have noticed that those instructions that SSE5 shared with SSE4.1 are missing in the new manual. Does this mean that all SSE4.1 instructions will be covered in the next AMD processor and hence don't need to be categorized as XOP instructions?

0 Likes

Originally posted by: agner I have noticed that those instructions that SSE5 shared with SSE4.1 are missing in the new manual. Does this mean that all SSE4.1 instructions will be covered in the next AMD processor and hence don't need to be categorized as XOP instructions?

Yes -- see my response to your question on my blog entry, if you haven't already.

BTW, the XOP document has been updated to version 3.03, fixing a few errors beyond what you've pointed out already.  Please let us know of any other issues you find or questions you have.

0 Likes

Rev 3.03 caught some additional ones I had noticed, but not:

- FPCOMNEQ* "equal" (imm8=4) should be VPCOMEQ*

- VPHADDDQ opcode should be C8 instead of CB (I think), as VPHADDUDQ is D8.

- VPHADDWQ opcode should be C7 instead of D7 (I think), otherwise it's the same as VPHADDUWQ.

- Most of the VPCOM instruction mnemonics show only 3 operands; this is inconsistent with the text, which calls for 4 operands.

- VPROTB fixed-count mnemonic 2nd operand should be xmm2/mem128 rather than just xmm2.

0 Likes

Originally posted by: peterjohnson Rev 3.03 caught some additional ones I had noticed, but not:

- VPHADDDQ opcode should be C8 instead of CB (I think), as VPHADDUDQ is D8.

Upon further comparison with the old SSE5 spec, it appears this should be vice-versa.  VPHADDDQ = CB is correct, but VPHADDUDQ should be DB instead of D8.

0 Likes

Originally posted by: peterjohnson
Originally posted by: peterjohnson Rev 3.03 caught some additional ones I had noticed, but not:

- VPHADDDQ opcode should be C8 instead of CB (I think), as VPHADDUDQ is D8.

Upon further comparison with the old SSE5 spec, it appears this should be vice-versa.  VPHADDDQ = CB is correct, but VPHADDUDQ should be DB instead of D8.

Thanks Peter -- taking this into account, you are correct on all counts.
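The kind of clash reported above (two mnemonics sharing one opcode) is easy to catch mechanically. Below is a minimal Python sketch using only the four corrected VPHADD opcodes from this thread. The helper names are hypothetical, and the "unsigned = signed + 0x10" check is just a pattern visible in these four values, not a rule quoted from the manual.

```python
# Corrected opcodes as agreed in this thread.
XOP_VPHADD_OPCODES = {
    "VPHADDWQ": 0xC7,
    "VPHADDUWQ": 0xD7,
    "VPHADDDQ": 0xCB,
    "VPHADDUDQ": 0xDB,
}


def find_duplicate_opcodes(table):
    """Return (first_name, second_name, opcode) for every opcode used twice."""
    seen = {}
    duplicates = []
    for name, opcode in table.items():
        if opcode in seen:
            duplicates.append((seen[opcode], name, opcode))
        else:
            seen[opcode] = name
    return duplicates


def unsigned_offset_ok(table):
    """Check that each VPHADDUxx opcode equals its VPHADDxx opcode plus 0x10."""
    for name, opcode in table.items():
        if name.startswith("VPHADDU"):
            signed = "VPHADD" + name[len("VPHADDU"):]
            if signed in table and table[signed] + 0x10 != opcode:
                return False
    return True
```

Running `find_duplicate_opcodes` on a table where VPHADDWQ is still D7 reports the collision with VPHADDUWQ, which is how that typo shows up.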

0 Likes
extremeseos0007
Journeyman III

Hello to everyone at this forum. I am really new at this and look forward to a good time discussing.

0 Likes