7 Replies Latest reply on Oct 14, 2017 4:13 AM by epigramx

    Dreadful OpenGL performance


      I guess this can be considered a kind of follow-up to "Abysmal OpenGL performance (RX480)".


      Basically, I tried my testcase on the following systems:

      • Core 2 Duo E8400 + Radeon 7750: ~19 FPS
      • Phenom II X4 965 + Radeon RX 480: ~25 FPS
      • Core 2 Duo 6320 + GeForce GT 430: ~58 FPS


      I believe I don't need any further explanation.

      My educated guess is that gl commands aren't dispatched to a separate thread.


      I would have liked to give some more info, but I had problems with both CodeXL and PerfStudio.

      Instructions to use the thing shouldn't be any different from those contained in this last link.

        • Re: Dreadful OpenGL performance

          One of our GL driver engineers is looking into this report.

          • Re: Dreadful OpenGL performance

            Here are a few initial observations.


            I ran the application with Crimson 16.9.2 on a Windows 10 x64 machine with an RX480 + i7-6700K and was getting ~54 FPS.
            What version of the driver and what OS are you running?




            I was also able to capture performance with CodeXL and PerfStudio when I ran GSDumpGUI.exe with the following command line arguments:

            E:\Work\AMD\pcsx2\bin\plugins\GSdx32-SSE2.dll  E:\Work\AMD\Community\perf-case.7z\gsdx_20160924182111.gs GSReplay -1


            The obvious hotspots here included:


               -> GSDeviceOGL::SetupCB

               -> GSRendererOGL::SetupIA

               -> GSRendererOGL::SendDraw


            I will try a few more configurations and let you know what else I find.



            • Re: Dreadful OpenGL performance

              Just to elaborate slightly on this (as I am one of the developers for PCSX2), this performance drop is consistent across the entire AMD range, regardless of computer specs.


              OpenGL performance is usually roughly half that seen on DX11 using the same card/setup.


              On Nvidia cards, OpenGL and DX11 perform about the same; OpenGL is sometimes 1-2% slower, but generally the speed is the same.


              So there is certainly an issue with the driver. One of our guys, who makes hardware for a living and also works on GSDX, said the AMD OpenGL driver seems very single-threaded, whereas Nvidia has a multithreaded OpenGL driver. This wasn't obvious until he enabled multithreaded support in GSDX when initialising OpenGL; that is when the gap between the card manufacturers appeared.

              • Re: Dreadful OpenGL performance

                Is there any update on this at all?

                • Re: Dreadful OpenGL performance

                  Hi Mirh,


                  One of our developers found an optimization to the OpenGL Program Pipeline implementation. It should get rolled into a release soon.




                  Aaron Hagan

                  This was in October. Still nothing.


                  Almost a year later, I managed to find yet another testcase.



                  The aforementioned AMD systems can only get ~3, ~20 and ~60 FPS in each of the tests respectively (basically no matter the GPU).

                  The Nvidia smartphone-sized PC can reach 6 (7 with the multi-thread switch), 45 and 105 FPS instead.

                  • Re: Dreadful OpenGL performance

                    Bumping this, given that engineers seem keen these days.

                    • Re: Dreadful OpenGL performance

                      I believe I know the main source of the dramatic loss of performance compared to the competitor: see the "Threaded Validation and Submission" section of "OpenGL like Vulkan".


                      The Mesa driver on Linux attempts to do the same (spawn a thread dedicated to draw calls) and it also has about 30% higher performance under certain conditions.


                      PS. It might be more valuable for renderers that are maxing out a CPU core.
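
                      To make the "thread dedicated to draw calls" idea concrete, here is a minimal sketch of the general pattern in plain C++. It is not how any actual driver (AMD, Nvidia or Mesa) is implemented, and it makes no real GL calls. The names CommandQueue, submit, finish and run_demo are hypothetical and only there for illustration. The application thread records each command and returns immediately, while a single worker thread does the expensive part.

```cpp
#include <cassert>
#include <condition_variable>
#include <deque>
#include <functional>
#include <mutex>
#include <thread>
#include <utility>

// Hypothetical illustration only: none of these names come from GSdx,
// PCSX2 or any real OpenGL driver.
class CommandQueue {
public:
    using Command = std::function<void()>;

    CommandQueue() : worker_([this] { run(); }) {}

    ~CommandQueue() {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            stopping_ = true;
        }
        wake_.notify_one();
        worker_.join();  // the worker drains any remaining commands first
    }

    // Called on the application thread: queue the command and return at once.
    void submit(Command cmd) {
        assert(cmd && "submit() needs a callable command");
        {
            std::lock_guard<std::mutex> lock(mutex_);
            pending_.push_back(std::move(cmd));
        }
        wake_.notify_one();
    }

    // Block until everything submitted so far has executed (loosely like glFinish).
    void finish() {
        std::unique_lock<std::mutex> lock(mutex_);
        idle_.wait(lock, [this] { return pending_.empty() && !busy_; });
    }

private:
    void run() {
        std::unique_lock<std::mutex> lock(mutex_);
        for (;;) {
            wake_.wait(lock, [this] { return stopping_ || !pending_.empty(); });
            if (pending_.empty()) return;  // only reachable once stopping_ is set
            Command cmd = std::move(pending_.front());
            pending_.pop_front();
            busy_ = true;
            lock.unlock();
            cmd();  // stands in for the driver's validation + submission work
            lock.lock();
            busy_ = false;
            if (pending_.empty()) idle_.notify_all();
        }
    }

    std::mutex mutex_;
    std::condition_variable wake_;
    std::condition_variable idle_;
    std::deque<Command> pending_;
    bool busy_ = false;
    bool stopping_ = false;
    std::thread worker_;  // declared last so the members above exist before it starts
};

// Submits `draw_calls` fake draw calls and returns how many the worker executed.
int run_demo(int draw_calls) {
    int executed = 0;  // written only by the worker; read after finish()
    CommandQueue queue;
    for (int i = 0; i < draw_calls; ++i)
        queue.submit([&executed] { ++executed; });
    queue.finish();
    return executed;
}
```

                      With this shape, the renderer thread only pays for a lock and a push per call, so whatever the worker does per command no longer competes with the renderer for the same core. Whether that helps in practice depends on how often the application has to wait for results, since every such wait behaves like finish() above.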