2 Replies Latest reply on Nov 17, 2011 1:28 AM by francium

    About Optimization of Opteron


      I'm benchmarking an Opteron cluster using High Performance Lapack.

      When compiling with the PGI optimization options '-fastsse -tp=shanghai-64', the final performance is a little bit LOWER than with just '-fastsse'.

      When compiling with Open64 using '-O3 -march=barcilona', the result is the same: a little bit lower than with just '-O3'.

      I supposed it would be faster when specifying the target processor, but the result is weird.

        • About Optimization of Opteron

          Both the Open64 and PGI compilers default to the architecture of the machine used for the compilation when no -march (Open64) or -tp (PGI) option is given.  This differs from gcc, which defaults to generic code generation when no architecture switch is given.

          What does /proc/cpuinfo show as your processor?
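
          If the cluster has many nodes, a small script can condense that output. The following is only a rough sketch, assuming Python is available on the nodes and that /proc/cpuinfo uses the usual x86 Linux layout (one blank-line-separated block per logical CPU); the function name summarize_cpuinfo is made up for illustration.

```python
from collections import Counter


def summarize_cpuinfo(text):
    """Count logical CPUs per (model name, cpu family, model) triple.

    `text` is the full content of /proc/cpuinfo; blocks describing one
    logical CPU each are assumed to be separated by blank lines.
    """
    counts = Counter()
    for block in text.strip().split("\n\n"):
        fields = {}
        for line in block.splitlines():
            # Each line looks like "key<TAB>: value"; split on the first colon.
            key, sep, value = line.partition(":")
            if sep:
                fields[key.strip()] = value.strip()
        if "processor" in fields:
            triple = (
                fields.get("model name", "?"),
                fields.get("cpu family", "?"),
                fields.get("model", "?"),
            )
            counts[triple] += 1
    return counts
```

          Calling summarize_cpuinfo(open("/proc/cpuinfo").read()) on the build machine and on a compute node, then comparing the two results, would show whether the code is being compiled on the same processor type it runs on.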

          For PGI, it could be that the system you are compiling on defaults to -tp=istanbul instead, hence the better performance there.

          For Open64, what does "-v" show for a compilation of a simple file with your two choices?  That would show where they are different.  Also, note that the option is spelled -march=barcelona (not barcilona) but I assume that is just a typo in your posting above.
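
          The verbose output tends to be long, so it can help to compare the two runs mechanically. This is a minimal sketch, assuming the "-v" output of each compilation has already been saved as text (for example by redirecting the compiler's output to a file); the function name flag_differences is hypothetical, and it only compares whitespace-separated tokens, not their order.

```python
def flag_differences(log_a, log_b):
    """Return the tokens that appear in only one of two verbose logs.

    Both arguments are the captured text of a compiler "-v" run.
    The result is a pair of sorted lists: tokens only in log_a,
    and tokens only in log_b.
    """
    tokens_a = set(log_a.split())
    tokens_b = set(log_b.split())
    only_a = sorted(tokens_a - tokens_b)
    only_b = sorted(tokens_b - tokens_a)
    return only_a, only_b
```

          Passing the log from the plain -O3 build as log_a and the log from the -O3 -march=barcelona build as log_b lists the options the driver hands to the back end in one case but not the other.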