Another prefetch question

Discussion created by LGS on Apr 10, 2008
Latest reply on Apr 16, 2008 by LGS
I am trying to improve the performance of my app. Profiling with CodeAnalyst shows that the largest share of the time is spent in a hash table lookup. The hash table is relatively large (1.7 GB on a 2 GB system), so it is not surprising that much of the time is spent here.

What is surprising is what happened when I added a prefetchnta instruction in an attempt to ensure the correct hash node would be available when needed. The "cmp reg/mem" instruction that checks the hash node now takes much less time, but most of that time now shows up on the prefetchnta instruction instead.
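To make the pattern concrete, here is a minimal sketch of a prefetched lookup, not the actual code from the app. Everything in it is an assumption made for illustration: the `node_t` layout, the multiplicative hash in `slot_for`, open addressing with linear probing, key 0 meaning "empty slot", and the idea of looking up a batch of keys so that the prefetch for key `i + dist` can be issued while key `i` is being checked. `_mm_prefetch` with `_MM_HINT_NTA` is the compiler intrinsic that normally compiles to prefetchnta on x86.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#if defined(__x86_64__) || defined(__i386__) || defined(_M_X64) || defined(_M_IX86)
#include <xmmintrin.h>
/* Issues a non-temporal prefetch hint (prefetchnta) for the line holding p. */
#define PREFETCH_NTA(p) _mm_prefetch((const char *)(p), _MM_HINT_NTA)
#else
#define PREFETCH_NTA(p) ((void)(p)) /* no-op on non-x86 targets */
#endif

/* Hypothetical node layout; key 0 marks an empty slot. */
typedef struct {
    uint64_t key;
    uint64_t value;
} node_t;

/* Hypothetical hash: multiplicative hash folded into a power-of-two table. */
static size_t slot_for(uint64_t key, size_t mask)
{
    return (size_t)((key * 0x9E3779B97F4A7C15ULL) >> 32) & mask;
}

/* Insert with linear probing. Assumes key != 0 and the table is not full. */
void table_insert(node_t *table, size_t size, uint64_t key, uint64_t value)
{
    assert(key != 0 && size != 0 && (size & (size - 1)) == 0);
    size_t mask = size - 1;
    size_t s = slot_for(key, mask);
    while (table[s].key != 0 && table[s].key != key)
        s = (s + 1) & mask;
    table[s].key = key;
    table[s].value = value;
}

/* Count how many of keys[0..n) are present. While key i is being checked,
 * the home slot of key i + dist is prefetched, so the memory access for a
 * later lookup can overlap with the work of the current one. */
size_t count_hits(const node_t *table, size_t size,
                  const uint64_t *keys, size_t n, size_t dist)
{
    assert(size != 0 && (size & (size - 1)) == 0);
    size_t mask = size - 1;
    size_t hits = 0;
    for (size_t i = 0; i < n; i++) {
        if (i + dist < n)
            PREFETCH_NTA(&table[slot_for(keys[i + dist], mask)]);

        size_t s = slot_for(keys[i], mask);
        while (table[s].key != 0) {        /* stop at the first empty slot */
            if (table[s].key == keys[i]) {
                hits++;
                break;
            }
            s = (s + 1) & mask;
        }
    }
    return hits;
}
```

The lookahead `dist` is a tuning knob. With `dist` set to 0 the prefetch is issued immediately before the load it is meant to hide, so there is almost no other work for it to overlap with.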

Isn't prefetchnta essentially an asynchronous operation? I expected that the prefetch would continue to run while I performed other operations, not block until the memory is retrieved. But CodeAnalyst shows a large chunk of time being spent on that instruction.

I'm on an AMD Turion 64 X2.
