DarkShikari

Merging behavior with setCC instructions?

Discussion created by DarkShikari on May 23, 2009
Latest reply on Jun 11, 2009 by edward_yang

The Phenom architecture manual says that

xor eax, eax
mov al, foo

is worse than

movzx eax, foo

because of partial-register merging penalties.

But what about setCC? What merging penalties apply to the setCC instructions, which can only write 8-bit registers and don't zero the upper bits? Should I do

xor eax, eax
setne al

or should I do

setne al
movzx eax, al

The former is faster on all the Intel chips I've tested, since the xor can be executed well in advance, but I don't know how Phenom merging penalties affect this.
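
To make the two candidates concrete, here is a minimal sketch of both as GCC-style inline assembly wrapped in C. The function names (ne_xor_first, ne_movzx_after, self_check) are made up for illustration, and the plain-C fallback for non-x86 builds is an assumption so the snippet compiles anywhere. Note that in the first variant the xor has to come before the compare, since xor itself overwrites the flags. This only shows that the two sequences produce the same value; it says nothing about which one Phenom prefers.

```c
#include <assert.h>

#if (defined(__GNUC__) || defined(__clang__)) && \
    (defined(__x86_64__) || defined(__i386__))
#define HAVE_X86_INLINE_ASM 1
#else
#define HAVE_X86_INLINE_ASM 0
#endif

/* Variant 1: zero the whole register first, then write only the low byte.
   The xor must precede the cmp because xor itself overwrites the flags. */
static unsigned ne_xor_first(unsigned a, unsigned b)
{
#if HAVE_X86_INLINE_ASM
    unsigned r;
    __asm__("xorl %%eax, %%eax\n\t"
            "cmpl %2, %1\n\t"
            "setne %%al"
            : "=&a"(r)          /* early-clobber keeps a and b out of eax */
            : "r"(a), "r"(b)
            : "cc");
    return r;
#else
    return a != b;              /* portable fallback (assumption, not the asm path) */
#endif
}

/* Variant 2: write the low byte, then zero-extend it into the full register. */
static unsigned ne_movzx_after(unsigned a, unsigned b)
{
#if HAVE_X86_INLINE_ASM
    unsigned r;
    __asm__("cmpl %2, %1\n\t"
            "setne %%al\n\t"
            "movzbl %%al, %%eax"
            : "=&a"(r)
            : "r"(a), "r"(b)
            : "cc");
    return r;
#else
    return a != b;              /* portable fallback (assumption, not the asm path) */
#endif
}

/* Checks that both variants agree with the C expression a != b. */
static void self_check(void)
{
    static const unsigned v[] = { 0u, 1u, 2u, 0x100u, 0xFFFFFFFFu };
    const unsigned n = sizeof v / sizeof v[0];
    for (unsigned i = 0; i < n; i++)
        for (unsigned j = 0; j < n; j++) {
            unsigned expected = (unsigned)(v[i] != v[j]);
            assert(ne_xor_first(v[i], v[j]) == expected);
            assert(ne_movzx_after(v[i], v[j]) == expected);
        }
}
```

Calling self_check() from a test harness confirms the two sequences are interchangeable in result, so the choice between them comes down purely to timing.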
