3 Replies Latest reply on Sep 25, 2007 8:09 PM by devcentral

    SSE5 Instruction Set doc

    yeyang
      On page 1 of SSE5 documentation the "Integer multiply accumulate" instructions are described as (IMAC, IMADC). Shouldn't it really be (PMAC, PMADC)?

      Edit: The same applies to the text on the SSE5 webpage.
        • SSE5 Instruction Set doc
          Thanks for bringing this to our attention. We will fix the web page shortly, and will request an update for the next revision of the document. Glad to see that community review is working!
          • SSE5 Instruction Set doc
            yeyang
            Hi, some more questions regarding SSE5 -

            1. In the description of PCMOV it says the instruction can use results from PCMPx and CMPx. Can the results from PCOMx and COMx (added in SSE5) be used as well?

            2. It seems to me that the SSE5 PCOMx and COMx instructions actually "supersede" the various SSE2 PCMPx and CMPx instructions, is it truly so?

            3. The graphical representations of FMADDx and FNMADDx always show dest = src1, but depending on Opcode3 dest could be src3 when OC1=1, right?

            4. For FMADDx and FNMADDx instructions, setting OC[1,0] bits as 10b or 11b should have exactly the same result?

            5. The OC[1,0] bits of any PMACx instruction are always 10b; are the other values (notably 00b and 01b) purposely excluded?

            Thanks in advance.
              • SSE5 Instruction Set doc
                Here are some answers from one of our engineers:

                1. In the description of PCMOV it says the instruction can use results from PCMPx and CMPx. Can the results from PCOMx and COMx (added in SSE5) be used as well?

                Yes, the results from PCOMx and COMx can be used by PCMOV as well.
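
                As an illustration only (not taken from the SSE5 document), the way a compare result feeds a bitwise conditional move can be modeled in plain C. The sketch assumes that a packed compare writes an all-ones or all-zeros mask per element and that the conditional move takes each bit from the first source where the selector bit is 1 and from the second source otherwise. The names `compare_gt`, `bitwise_select`, `packed_max_demo` and the four-lane layout are hypothetical and are not SSE5 intrinsics.

```c
#include <assert.h>
#include <stdint.h>

#define LANES 4  /* assumed: four 32-bit elements per 128-bit register */

/* Model of a packed compare: a lane becomes all ones when a > b,
   otherwise all zeros. */
static void compare_gt(const int32_t a[LANES], const int32_t b[LANES],
                       uint32_t mask[LANES])
{
    for (int i = 0; i < LANES; i++)
        mask[i] = (a[i] > b[i]) ? 0xFFFFFFFFu : 0u;
}

/* Model of a bitwise conditional move: where a selector bit is 1 the
   bit comes from src1, where it is 0 the bit comes from src2. */
static void bitwise_select(const uint32_t src1[LANES],
                           const uint32_t src2[LANES],
                           const uint32_t sel[LANES],
                           uint32_t dest[LANES])
{
    for (int i = 0; i < LANES; i++)
        dest[i] = (src1[i] & sel[i]) | (src2[i] & ~sel[i]);
}

/* Per-lane maximum built from compare + select. Returns 1 on success. */
static int packed_max_demo(void)
{
    const int32_t a[LANES] = {5, -3, 7, 0};
    const int32_t b[LANES] = {2, 4, 7, -1};
    const int32_t expected[LANES] = {5, 4, 7, 0};
    uint32_t ua[LANES], ub[LANES], mask[LANES], out[LANES];

    for (int i = 0; i < LANES; i++) {
        ua[i] = (uint32_t)a[i];  /* reinterpret the lanes as raw bits */
        ub[i] = (uint32_t)b[i];
    }

    compare_gt(a, b, mask);
    bitwise_select(ua, ub, mask, out);

    for (int i = 0; i < LANES; i++)
        assert(out[i] == (uint32_t)expected[i]);
    return 1;
}
```

                Because the select works bit by bit, any instruction that produces such masks can supply the selector, which is consistent with the answer above.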

                2. It seems to me that the SSE5 PCOMx and COMx instructions actually "supersede" the various SSE2 PCMPx and CMPx instructions, is it truly so?

                Yes, the SSE5 PCOMx and COMx instructions are supersets of PCMPx and CMPx. They are also more powerful, since the destination can be a different register from the source registers, so the source registers are not overwritten.

                3. The graphical representations of FMADDx and FNMADDx always show dest = src1, but depending on Opcode3 dest could be src3 when OC1=1, right?

                Yes, based on OC[1:0], the dest can be any of the three sources. The graphical representation shows only src1 for brevity.
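
                A rough functional model in C may help picture this. It assumes FMADDx computes src1 * src2 + src3 and FNMADDx computes -(src1 * src2) + src3, and it treats registers as indices into a small array. The register-file size and the function names are hypothetical, and the OC[1:0] encoding itself is not modeled. The plain C expression is only a stand-in for the fused operation, so the values are chosen to be exactly representable.

```c
#include <assert.h>

enum { NUM_REGS = 8 };  /* hypothetical register-file size */

/* dest = src1 * src2 + src3, with dest allowed to name one of the sources. */
static void fmadd(float regs[NUM_REGS], int dest, int src1, int src2, int src3)
{
    regs[dest] = regs[src1] * regs[src2] + regs[src3];
}

/* dest = -(src1 * src2) + src3 */
static void fnmadd(float regs[NUM_REGS], int dest, int src1, int src2, int src3)
{
    regs[dest] = -(regs[src1] * regs[src2]) + regs[src3];
}

/* Same arithmetic, different register overwritten. Returns 1 on success. */
static int fmadd_dest_demo(void)
{
    float r[NUM_REGS] = {2.0f, 3.0f, 4.0f};  /* r0 = 2, r1 = 3, r2 = 4 */
    float s[NUM_REGS] = {2.0f, 3.0f, 4.0f};
    float n[NUM_REGS] = {2.0f, 3.0f, 4.0f};

    fmadd(r, 0, 0, 1, 2);   /* dest = src1: r0 becomes 10, r2 keeps 4 */
    fmadd(s, 2, 0, 1, 2);   /* dest = src3: s2 becomes 10, s0 keeps 2 */
    fnmadd(n, 2, 0, 1, 2);  /* -(2 * 3) + 4 = -2, written over src3 */

    assert(r[0] == 10.0f && r[2] == 4.0f);
    assert(s[2] == 10.0f && s[0] == 2.0f);
    assert(n[2] == -2.0f);
    return 1;
}
```

                The computed value is the same in the first two calls; only the register that receives it changes, which is why a diagram showing dest = src1 still conveys the arithmetic.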

                4. For FMADDx and FNMADDx instructions, setting OC[1,0] bits as 10b or 11b should have exactly the same result?

                For FMADDx and FNMADDx instructions, OC[1:0] = 10b implies that src2 can be a memory operand, while OC[1:0] = 11b implies that src3 can be a memory operand. Both versions are provided for maximum flexibility.

                5. The OC[1,0] bits of any PMACx instruction are always 10b; are the other values (notably 00b and 01b) purposely excluded?

                For PMACx, OC[1:0] is always 10b. This is intentional, for two reasons. First, unlike FMAC, this flexibility is largely not required for PMACx in real applications. Second, for many versions of PMACx, the widths of the packed fields (for example, byte vs. word) of the sources and the dest are not the same, and it does not make sense to overwrite any source other than the addend in such cases.
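
                The width argument can be sketched in C with a simplified widening multiply-accumulate. It assumes 16-bit multiplicands and a 32-bit addend and result. The lane count, the function names and the lane pairing are hypothetical and do not reproduce the exact element selection of any particular PMACx instruction.

```c
#include <assert.h>
#include <stdint.h>

#define MAC_LANES 4  /* assumed lane count for this illustration */

/* Multiply 16-bit elements and accumulate into 32-bit elements.
   Only the addend has the same element width as the result. */
static void widening_mac(const int16_t a[MAC_LANES], const int16_t b[MAC_LANES],
                         const int32_t addend[MAC_LANES], int32_t dest[MAC_LANES])
{
    for (int i = 0; i < MAC_LANES; i++)
        dest[i] = (int32_t)a[i] * (int32_t)b[i] + addend[i];
}

/* Returns 1 on success. */
static int widening_mac_demo(void)
{
    const int16_t a[MAC_LANES] = {1000, -2000, 300, 32767};
    const int16_t b[MAC_LANES] = {1000, 3, -300, 2};
    int32_t acc[MAC_LANES] = {1, 2, 3, 4};
    const int32_t expected[MAC_LANES] = {1000001, -5998, -89997, 65538};

    /* The result overwrites the addend, the only operand it fits into. */
    widening_mac(a, b, acc, acc);

    for (int i = 0; i < MAC_LANES; i++)
        assert(acc[i] == expected[i]);
    return 1;
}
```

                In this model most of the products fall outside the 16-bit range, so writing the result back over either multiplicand would not be meaningful, while the 32-bit addend can hold it.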

                Thanks.