

zorroblsa
Journeyman III

How can I optimize my SSE program on an AMD CPU?

I wrote an assembly function using SSE to calculate a vector-matrix multiplication. It works well on an Intel CPU, taking only 30% of the time of the FPU assembly generated by VC8. But on my AMD CPU (Athlon X2 3600+) it takes about twice as long as the FPU version. I tried 3DNow!, which was even worse. Is SIMD on AMD just slow?

Can someone help me? Any suggestions are welcome.
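For readers following along, here is a minimal sketch of the kind of routine described above, written with C intrinsics rather than hand-written assembly. It is not the poster's code (none was posted). It assumes a row vector times a row-major 4x4 float matrix, and the function names `vec4_mul_mat4_scalar` and `vec4_mul_mat4_sse` are made up for illustration. Unaligned loads are used so the sketch does not depend on how the caller allocates its data.

```c
#include <stddef.h>

/* Use SSE intrinsics when the compiler targets them; otherwise fall back. */
#if defined(__SSE__) || defined(_M_X64) || (defined(_M_IX86_FP) && _M_IX86_FP >= 1)
#include <xmmintrin.h>
#define HAVE_SSE 1
#else
#define HAVE_SSE 0
#endif

/* Scalar reference: out = v * m, v is a row vector, m is row-major 4x4. */
void vec4_mul_mat4_scalar(const float v[4], const float m[16], float out[4])
{
    for (int col = 0; col < 4; ++col) {
        out[col] = v[0] * m[0 * 4 + col] + v[1] * m[1 * 4 + col]
                 + v[2] * m[2 * 4 + col] + v[3] * m[3 * 4 + col];
    }
}

/* SSE version (hypothetical): broadcast each vector element and
 * accumulate it against the matching matrix row, four columns at a time. */
void vec4_mul_mat4_sse(const float v[4], const float m[16], float out[4])
{
#if HAVE_SSE
    __m128 acc = _mm_mul_ps(_mm_set1_ps(v[0]), _mm_loadu_ps(m + 0));
    acc = _mm_add_ps(acc, _mm_mul_ps(_mm_set1_ps(v[1]), _mm_loadu_ps(m + 4)));
    acc = _mm_add_ps(acc, _mm_mul_ps(_mm_set1_ps(v[2]), _mm_loadu_ps(m + 8)));
    acc = _mm_add_ps(acc, _mm_mul_ps(_mm_set1_ps(v[3]), _mm_loadu_ps(m + 12)));
    _mm_storeu_ps(out, acc);
#else
    /* No SSE available at compile time: reuse the scalar path. */
    vec4_mul_mat4_scalar(v, m, out);
#endif
}
```

Calling both functions on the same input and comparing the outputs is a quick sanity check before timing anything. Whether the SSE path is actually faster on a given CPU has to be measured.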

4 Replies
Brane214
Journeyman III

I am no expert, but you probably have to take into account that the SSE unit in AMD's K8 is much slower than the one in Intel's Core 2 Duo, since it can process only 64 bits per clock cycle, so each 128-bit SSE operation has to be split into two halves.

Also, the memory access pattern can be a very influential factor.

I was toying with some asm routines in the Linux kernel and managed to speed them up on K8/K10 just by removing a couple of prefetches that were supposed to improve performance on Intel...
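Since the effect of a prefetch differs between CPUs, the practical approach is to time both variants on the target machine. Below is a rough sketch of such a comparison, not the kernel routines mentioned above. The names `sum_plain`, `sum_prefetch`, `time_variant` and the prefetch distance of 64 elements are all assumptions chosen for illustration, and `__builtin_prefetch` is a GCC/Clang builtin (it compiles to nothing on other compilers here).

```c
#include <stddef.h>
#include <time.h>

/* GCC/Clang provide __builtin_prefetch; elsewhere make it a no-op. */
#if defined(__GNUC__) || defined(__clang__)
#define PREFETCH(p) __builtin_prefetch(p)
#else
#define PREFETCH(p) ((void)0)
#endif

/* Hypothetical prefetch distance, in elements; tune per CPU. */
#define PREFETCH_AHEAD 64

/* Variant 1: plain loop, relying on the hardware prefetcher. */
float sum_plain(const float *a, size_t n)
{
    float s = 0.0f;
    for (size_t i = 0; i < n; ++i)
        s += a[i];
    return s;
}

/* Variant 2: same loop with an explicit software prefetch hint. */
float sum_prefetch(const float *a, size_t n)
{
    float s = 0.0f;
    for (size_t i = 0; i < n; ++i) {
        if (i + PREFETCH_AHEAD < n)
            PREFETCH(&a[i + PREFETCH_AHEAD]);
        s += a[i];
    }
    return s;
}

typedef float (*sum_fn)(const float *, size_t);

/* Run fn reps times and return elapsed CPU time in seconds.
 * The volatile sink discourages the compiler from dropping the calls. */
double time_variant(sum_fn fn, const float *a, size_t n, int reps, float *result)
{
    volatile float sink = 0.0f;
    clock_t start = clock();
    for (int i = 0; i < reps; ++i)
        sink = fn(a, n);
    clock_t end = clock();
    *result = sink;
    return (double)(end - start) / CLOCKS_PER_SEC;
}
```

To get meaningful numbers, use an array much larger than the cache and enough repetitions for `clock()` to resolve the difference. Both variants add the elements in the same order, so their results should match exactly.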

eduardoschardong
Journeyman III

It would be easier to help if you posted the source code.
rramshan
Staff

Could you describe the problem in detail and post source code if possible? We can take a look at it.

This response is provided for informational purposes only, is provided “AS IS” and does not obligate AMD to provide any of the services, technology, or programs described.


I think posting just part of the source code would be enough.
