Archives Discussions

yeyang · ‎08-30-2007

On page 1 of the SSE5 documentation, the "Integer multiply accumulate" instructions are listed as (IMAC, IMADC). Shouldn't they really be (PMAC, PMADC)?

Edit: The same applies to the text on the SSE5 web page.

devcentral · ‎09-06-2007

Thanks for bringing this to our attention. We will fix the web page shortly, and will request an update for the next revision of the document. Glad to see that community review is working!

yeyang · ‎09-22-2007

Hi, some more questions regarding SSE5:

1. In the description of PCMOV it says the instruction can use results from PCMPx and CMPx. Can the results from PCOMx and COMx (added in SSE5) be used as well?

2. It seems to me that the SSE5 PCOMx and COMx instructions actually "supersede" the various SSE2 PCMPx and CMPx instructions, is it truly so?

3. The graphical representations of FMADDx and FNMADDx always show dest = src1, but depending on Opcode3, dest could be src3 when OC1=1, right?

4. For FMADDx and FNMADDx instructions, should setting the OC[1:0] bits to 10b or 11b have exactly the same result?

5. The OC[1:0] bits of any PMACx instruction are always 10b; are the other values (notably 00b and 01b) purposely excluded?

Thanks in advance.

devcentral · ‎09-25-2007

Here are some answers from one of our engineers:

1. In the description of PCMOV it says the instruction can use results from PCMPx and CMPx. Can the results from PCOMx and COMx (added in SSE5) be used as well?

Yes, the results from PCOMx and COMx can be used by PCMOV as well.
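As a rough illustration of how a compare result feeds a conditional move, here is a minimal scalar sketch in C. It assumes the compare yields an all-ones or all-zeros mask per element and that the conditional move is a bitwise select on that mask. The helper names (cmp_gt_mask, bit_select, max4) are hypothetical and are not SSE5 intrinsics.

```c
#include <stdint.h>

/* Hypothetical helper: per-element signed compare that yields an all-ones
   or all-zeros mask, the form of result assumed for PCMPx/PCOMx here. */
static uint32_t cmp_gt_mask(int32_t a, int32_t b)
{
    return (a > b) ? 0xFFFFFFFFu : 0u;
}

/* Hypothetical helper: bitwise select. Each result bit is taken from
   'if_set' where the mask bit is 1, otherwise from 'if_clear'. */
static uint32_t bit_select(uint32_t if_set, uint32_t if_clear, uint32_t mask)
{
    return (if_set & mask) | (if_clear & ~mask);
}

/* Example use: branch-free element-wise maximum over four 32-bit lanes. */
static void max4(const int32_t a[4], const int32_t b[4], int32_t out[4])
{
    for (int i = 0; i < 4; i++) {
        uint32_t mask = cmp_gt_mask(a[i], b[i]);
        out[i] = (int32_t)bit_select((uint32_t)a[i], (uint32_t)b[i], mask);
    }
}
```

The select does not care which instruction produced the mask, which is consistent with the answer that PCOMx and COMx results work as well.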

2. It seems to me that the SSE5 PCOMx and COMx instructions actually "supersede" the various SSE2 PCMPx and CMPx instructions, is it truly so?

Yes, the SSE5 PCOMx and COMx instructions are supersets of PCMPx and CMPx. They are also more powerful, since the destination can be a different register from the source registers, so the source registers are not overwritten.

3. The graphical representations of FMADDx and FNMADDx always show dest = src1, but depending on Opcode3, dest could be src3 when OC1=1, right?

Yes, based on OC[1:0], the dest can be any of the three sources. The graphical representation shows only src1 for brevity.

4. For FMADDx and FNMADDx instructions, should setting the OC[1:0] bits to 10b or 11b have exactly the same result?

For FMADDx and FNMADDx instructions, OC[1:0] = 10b implies that src2 can be a memory operand, while OC[1:0] = 11b implies that src3 can be a memory operand. Both versions are provided for maximum flexibility.
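To make the distinction concrete, here is a minimal one-lane sketch in C. It assumes FMADD computes src1 * src2 + src3 and FNMADD computes -(src1 * src2) + src3, and it does not model the rounding of the real instructions. The function names are hypothetical stand-ins, not SSE5 intrinsics. The only difference between the two forms is which operand is read from memory.

```c
/* One-lane scalar stand-ins (assumed semantics, rounding not modeled). */
static float fmadd_lane(float src1, float src2, float src3)
{
    return src1 * src2 + src3;
}

static float fnmadd_lane(float src1, float src2, float src3)
{
    return -(src1 * src2) + src3;
}

/* Form corresponding to OC[1:0] = 10b: src2 may be a memory operand. */
static float fmadd_src2_mem(float src1, const float *src2_mem, float src3)
{
    return fmadd_lane(src1, *src2_mem, src3);
}

/* Form corresponding to OC[1:0] = 11b: src3 may be a memory operand. */
static float fmadd_src3_mem(float src1, float src2, const float *src3_mem)
{
    return fmadd_lane(src1, src2, *src3_mem);
}
```

Both forms compute the same arithmetic. Having both lets either the multiplicand or the addend be loaded directly from memory without a separate load instruction.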

5. The OC[1:0] bits of any PMACx instruction are always 10b; are the other values (notably 00b and 01b) purposely excluded?

For PMACx, OC[1:0] is always 10b. This is intentional, for two reasons. First, unlike FMAC, this flexibility is largely not required for PMACx in real applications. Second, for many versions of PMACx, the widths of the packed bit-fields (for example, byte vs. word) of the sources and the dest are not the same, and it does not make sense to overwrite any source other than the addend in such cases.
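The width mismatch can be sketched in C as follows. This is a hypothetical widening multiply-accumulate with 16-bit sources and a 32-bit addend and result; it does not model the lane selection or saturation of any specific PMACx variant, and the function name is made up for illustration.

```c
#include <stdint.h>

/* Hypothetical widening multiply-accumulate: 16-bit multiplicands,
   32-bit addend and result. 'dest' may alias 'addend' (in-place update),
   but it cannot alias src1 or src2 because their element width differs. */
static void mac_words_to_dwords(const int16_t src1[4], const int16_t src2[4],
                                const int32_t addend[4], int32_t dest[4])
{
    for (int i = 0; i < 4; i++) {
        int32_t product = (int32_t)src1[i] * (int32_t)src2[i];
        /* Add in unsigned arithmetic so overflow wraps instead of being
           undefined behaviour; saturating variants are not modeled. */
        dest[i] = (int32_t)((uint32_t)product + (uint32_t)addend[i]);
    }
}
```

A product such as 1000 * 1000 does not fit in a 16-bit field, so only the 32-bit addend has the right shape to receive the result.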

Thanks.


SSE5 Instruction Set doc