

jayadb3
Journeyman III

AMD Fast Short Rep Mov performance issue

AMD's processors advertise support for Fast Short REP MOV (FSRM) but have very poor rep mov performance on unaligned data. This causes performance problems because memcpy implementations that rely on this flag, including very important ones like glibc's memcpy, can pick a suboptimal code path. I wrote a blog post that goes into more detail, "Zen 3's Amazing Slow Short Rep Mov". There is also some discussion of this on the glibc issue tracker, and a Hacker News discussion of how this bug makes Rust programs doing some simple things slower than Python. I hope AMD is able to fix this issue in microcode, since in some workloads it can have a significant performance impact.
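For anyone who wants to check their own machine, here is a minimal sketch of how the flag can be read and how a plain rep movsb copy can be issued for testing. It assumes GCC or Clang on x86 and that FSRM is reported in CPUID leaf 7, subleaf 0, EDX bit 4. The helper names has_fsrm and copy_rep_movsb are made up for this example, and this is not glibc's actual selection logic, which is more involved.

```c
#include <stddef.h>
#include <string.h>

#if defined(__x86_64__) || defined(__i386__)
#include <cpuid.h>
#define ON_X86 1
#else
#define ON_X86 0
#endif

/* Hypothetical helper: returns 1 if the CPU advertises Fast Short REP MOV
 * (assumed here to be CPUID leaf 7, subleaf 0, EDX bit 4), else 0. */
static int has_fsrm(void)
{
#if ON_X86
    unsigned int eax, ebx, ecx, edx;
    /* __get_cpuid_count returns 0 if the requested leaf is not supported. */
    if (!__get_cpuid_count(7, 0, &eax, &ebx, &ecx, &edx))
        return 0;
    return (edx >> 4) & 1;
#else
    return 0; /* not an x86 CPU, so there is no such flag */
#endif
}

/* Hypothetical helper: copies n bytes with a bare "rep movsb", which is the
 * kind of path a memcpy may prefer when the FSRM flag is set. Falls back to
 * memcpy on non-x86 targets so the sketch still compiles there. */
static void *copy_rep_movsb(void *dst, const void *src, size_t n)
{
#if ON_X86
    void *ret = dst;
    /* RDI = destination, RSI = source, RCX = byte count; all three are
     * updated by the instruction, hence the read-write constraints. */
    __asm__ volatile("rep movsb"
                     : "+D"(dst), "+S"(src), "+c"(n)
                     :
                     : "memory");
    return ret;
#else
    return memcpy(dst, src, n);
#endif
}
```

To look at the unaligned case, you could call copy_rep_movsb with source and destination pointers offset by a few bytes from an aligned buffer, time it in a loop, and compare against the aligned case on a CPU where has_fsrm() returns 1.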
