4 Replies Latest reply on Feb 3, 2009 2:32 PM by tanja1

    How can I optimize my SSE program on an AMD CPU?

      I wrote an assembler function using SSE to calculate a vector-matrix multiplication. It works well on an Intel CPU, taking only 30% of the time of the FPU assembler generated by VC8. But on my AMD CPU (Athlon X2 3600+) it takes about twice as long as the FPU version. I tried 3DNow!, which performed even worse. Is AMD's SIMD just slow?
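
      For illustration, a routine of this kind might look like the sketch below, written with SSE intrinsics in C rather than the original assembler. The function name vec4_mul_mat4_sse, the 4x4 size, the row-major layout, and the unaligned loads are all assumptions made for the example, not details of the actual code.

```c
#include <xmmintrin.h>  /* SSE intrinsics (_mm_* functions, __m128 type) */

/* Hypothetical helper: out = v * m, where v is a 1x4 row vector and
   m is a 4x4 matrix stored row-major (m[4*row + col]). */
static void vec4_mul_mat4_sse(const float v[4], const float m[16], float out[4])
{
    /* Broadcast each vector component across all four lanes. */
    __m128 x = _mm_set1_ps(v[0]);
    __m128 y = _mm_set1_ps(v[1]);
    __m128 z = _mm_set1_ps(v[2]);
    __m128 w = _mm_set1_ps(v[3]);

    /* Load the four matrix rows. Unaligned loads work for any pointer;
       _mm_load_ps may be faster on some CPUs when the data is known to
       be 16-byte aligned. */
    __m128 r0 = _mm_loadu_ps(m + 0);
    __m128 r1 = _mm_loadu_ps(m + 4);
    __m128 r2 = _mm_loadu_ps(m + 8);
    __m128 r3 = _mm_loadu_ps(m + 12);

    /* out = x*row0 + y*row1 + z*row2 + w*row3, four columns at once. */
    __m128 acc = _mm_add_ps(
        _mm_add_ps(_mm_mul_ps(x, r0), _mm_mul_ps(y, r1)),
        _mm_add_ps(_mm_mul_ps(z, r2), _mm_mul_ps(w, r3)));

    _mm_storeu_ps(out, acc);
}
```

      Timing a routine like this against the plain FPU version on both CPUs, with the same data and alignment, would make it easier to see which instructions account for the difference.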

      Can someone help me? Any suggestions are welcome.