19 Replies Latest reply on Nov 26, 2018 10:08 PM by xhuang

    Dreadful OpenGL performance


      I guess this can be considered a kind-of follow up of Abysmal OpenGL performance (RX480)


      Basically, I tried my testcase on the following systems:

      • Core 2 Duo E8400 + Radeon 7750: ~19 FPS
      • Phenom II X4 965 + Radeon RX 480: ~25 FPS
      • Core 2 Duo 6320 + GeForce GT 430: ~58 FPS


      I believe I don't need any further explanation.

      My educated guess is that gl commands aren't dispatched to a separate thread.


      Then I would have liked to give some more info, but I had problems with both CodeXL and PerfStudio.

      Instructions to use the thing shouldn't be any different from those contained in this last link.

        • Re: Dreadful OpenGL performance

          One of our GL driver engineers is looking into this report.

          • Re: Dreadful OpenGL performance

            Here are a few initial observations.


            I ran the application with Crimson 16.9.2 on a Windows 10 x64 machine with an RX480 + i7-6700K and was getting around 54 FPS.
            What version of the driver and what OS are you running?




            I was also able to capture performance with CodeXL and PerfStudio if I ran GSDumpGUI.exe with the following command line arguments:

            E:\Work\AMD\pcsx2\bin\plugins\GSdx32-SSE2.dll  E:\Work\AMD\Community\perf-case.7z\gsdx_20160924182111.gs GSReplay -1


            The obvious hotspots here included:


               -> GSDeviceOGL::SetupCB

               -> GSRendererOGL::SetupIA

               -> GSRendererOGL::SendDraw


            I will try a few more configurations and let you know what else I find.



            • Re: Dreadful OpenGL performance

              Just to elaborate slightly on this (as I am one of the developers for PCSX2), this performance drop is consistent across the entire AMD range, regardless of computer specs.


              OpenGL performance is usually roughly half that seen on DX11 using the same card/setup.


              On Nvidia cards the performance of OpenGL vs DX11 is about the same; sometimes OpenGL is 1-2% slower, but generally it runs at the same speed.


              So there is certainly an issue with the driver. One of our guys, who makes hardware for a living and also works on GSDX, said the OpenGL driver seems very single-threaded, whereas Nvidia has a multithreaded driver for OpenGL. This wasn't obvious until he enabled multithreaded support in GSDX when initialising OpenGL; that is when the gap between the card manufacturers appeared.
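
              For readers unfamiliar with what a "multithreaded driver" means here, the idea can be sketched roughly as below. This is only an illustration in Python and assumes nothing about how either vendor actually implements it: the render thread merely enqueues commands, and a dedicated worker thread does the expensive work. `CommandThread`, `submit`, `finish` and `fake_draw` are hypothetical names, not driver or GSDX APIs.

```python
import queue
import threading


class CommandThread:
    """Hypothetical stand-in for a driver-side submission thread."""

    _STOP = object()  # sentinel telling the worker to exit

    def __init__(self):
        self._commands = queue.Queue()
        self.executed = []  # results, in the order commands were submitted
        self._worker = threading.Thread(target=self._run, daemon=True)
        self._worker.start()

    def submit(self, func, *args):
        # Returns immediately: the caller only pays for the enqueue.
        self._commands.put((func, args))

    def _run(self):
        # Worker loop: pull commands off the queue and execute them in order.
        while True:
            item = self._commands.get()
            if item is self._STOP:
                break
            func, args = item
            self.executed.append(func(*args))

    def finish(self):
        # Block until everything queued so far has been executed.
        self._commands.put(self._STOP)
        self._worker.join()


def fake_draw(n):
    # Placeholder for validation and submission work (illustrative only).
    return n * 2


if __name__ == "__main__":
    ct = CommandThread()
    for i in range(5):
        ct.submit(fake_draw, i)
    ct.finish()
    print(ct.executed)
```

              With a single-threaded design, the work done in `fake_draw` would instead run on the calling thread, so a renderer that is already CPU-bound on that thread would have less headroom.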

              • Re: Dreadful OpenGL performance

                Is there any update on this at all?

                • Re: Dreadful OpenGL performance

                  Hi Mirh,


                  One of our developers found an optimization to the OpenGL Program Pipeline implementation. It should get rolled into a release soon.




                  Aaron Hagan

                  This was in October. Still nothing.


                  Almost a year later, I managed to find yet another testcase.



                  Aforementioned AMD systems can only get ~3, ~20 and ~60 fps in each of the tests respectively (basically no matter the GPU)

                  The Nvidia smartphone-sized PC can reach 6 (7 with the multi-thread switch), 45 and 105 fps instead.

                  • Re: Dreadful OpenGL performance

                    Bumping this, given that engineers seem keen these days.

                    • Re: Dreadful OpenGL performance

                      I believe I might know a source of the dramatic loss of performance compared to the competitor. Scroll to "Threaded Validation and Submission" in: OpenGL like Vulkan. The Mesa driver on Linux attempts to do the same (spawn a thread dedicated to draw calls) and it also has about 30% higher performance under certain conditions.


                      Another reason is that even without that feature, NVIDIA is faster compared to AMD at OpenGL rendering.


                      The issue might be more apparent on renderers that max out their CPU thread.


                      EDIT: I no longer believe that's the main contributor, see below.

                      • Re: Dreadful OpenGL performance

                        The multithreading feature of other drivers appears NOT to be the main contributor to their better performance. Even if I turn that feature off on the Mesa driver, the performance of that open source driver remains about 30 to 40% better on renderers that are CPU hungry.


                        I know something similar is true of the NVIDIA driver on Windows when its threading optimization feature is turned off and it is confirmed that there is not much CPU activity beyond the main renderer. Maybe AMD software has a simple design flaw that holds it back.

                          • Re: Dreadful OpenGL performance

                            The open driver might not be all bells and whistles either, losing even against the hated fglrx (in CPU-bound cases, but that's still saying a lot considering it's way faster elsewhere).

                            EDIT: that's due to a roughly 25% performance regression in recent months. Unsure about comparisons made with a fixed version.


                            Ping aaronhagan & dwitczak

                              • Re: Dreadful OpenGL performance

                                Even if some native games are CPU bound, I wouldn't call their renderers necessarily CPU hungry, since it might be the game logic that is CPU bound. Try an emulator renderer like Citra's or Cemu's, which ensures the renderer is on a CPU-bound thread, and you'll see a significant handicap on the AMD OpenGL driver for Windows compared to Mesa on Linux.


                                PS. Most native PC games are efficient enough at the system side of rendering to not be low FPS before that condition is met, so people don't even notice. But in those specific cases that the FPS remains low because of that condition, the OpenGL driver for Windows reveals that it's significantly inefficient.

                            • Re: Dreadful OpenGL performance

                              Hello mirh, using gl_vs_vk on an AMD R9 Fury with the latest driver, I get ~5 fps, ~30 fps and ~90 fps respectively. May I have the latest test results from your side, as well as the GPU/OS/driver info?

                                • Re: Dreadful OpenGL performance

                                  I just tested the GL_vs_VK tool on an Nvidia NVS 315 (pretty much a display adapter rather than a graphics card) using Windows 10 and the 391.03 Quadro driver.


                                  These are the results of my test:

                                  Test1: 11FPS

                                  Test2: 61FPS

                                  Test3: 32-53FPS (Fluctuates quite a bit)


                                  Considering I beat the R9 Fury, a card vastly more powerful than this thing I'm using, by over 2x in 2 of the 3 tests, that is an abysmal showing from the R9 card.
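
                                  A quick check of that claim against the figures quoted in this thread, as a small Python sketch. The variable names are made up for illustration, and the R9 Fury numbers are the approximate values from the previous post.

```python
# FPS figures quoted in this thread for gl_vs_vk tests 1-3.
r9_fury = (5, 30, 90)          # approximate values from xhuang's post
nvs_315 = (11, 61, (32, 53))   # Test 3 fluctuated between 32 and 53


def ratio(a, b):
    """How many times faster a is than b."""
    return a / b


test1 = ratio(nvs_315[0], r9_fury[0])
test2 = ratio(nvs_315[1], r9_fury[1])
test3_low = ratio(nvs_315[2][0], r9_fury[2])
test3_high = ratio(nvs_315[2][1], r9_fury[2])

print(f"Test 1: {test1:.2f}x")
print(f"Test 2: {test2:.2f}x")
print(f"Test 3: {test3_low:.2f}x to {test3_high:.2f}x")
```

                                  Tests 1 and 2 come out at roughly 2.2x and 2.0x in favour of the NVS 315, while in Test 3 the R9 Fury is ahead.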

                                  • Re: Dreadful OpenGL performance

                                    Same results as last time (even though the E8400 became a Q9505).

                                    The 7750 is on Windows 7 x64 with the latest 18.8.1.


                                    My very broadly educated guess is that you are getting massively CPU-limited.

                                      • Re: Dreadful OpenGL performance

                                        Hello mirh and refractionpcsx2, gl_vs_vk has Vulkan support. Are you seeing a performance gap for VK?

                                          • Re: Dreadful OpenGL performance

                                            Hi xhuang, sorry, this NVS 315 doesn't support Vulkan as far as I can tell, so I am unable to test that on this machine.


                                            Just for some additional test data, I had a friend test his AMD card to see what results he gets; they are as follows:

                                            GPU: Asus AMD R7 360

                                            OS: Windows 7 64bit

                                            Driver version: 18.8.1



                                            Test 1: 18 fps

                                            Test 2: 58 fps

                                            Test 3: 148-192 fps




                                            Test 1: 6 fps

                                            Test 2: 41 fps

                                            Test 3: 150 fps


                                            As for myself, I can try it on my GTX 980Ti tonight to see what kind of performance numbers that gives.



                                            OK, I tested my 980Ti using the 397.93 drivers (CPU is an i5 4690k @ 4.3GHz). Here are the results; I would expect an R9 Fury to reach at least 75% of these.



                                            Test 1: 25 FPS

                                            Test 2: 91 FPS

                                            Test 3: 298 FPS




                                            Test 1: 35 FPS

                                            Test 2: 153 FPS

                                            Test 3: 1300-1800 FPS (and a lot of squealing xD )

                                      • Re: Dreadful OpenGL performance

                                        I do not know if it's related, but I have noticed performance problems with some older games in D3D9 related to vertex processing.
                                        Even in a quite new game, Final Fantasy XIII, transferring a 358400-byte vertex buffer kills performance on my old R7 360, and the game does this every frame.

                                        When I forced the pool in IDirect3DDevice9::CreateVertexBuffer to change from D3DPOOL_MANAGED to D3DPOOL_SYSTEMMEM, so the buffer stays in RAM, I got 60 FPS vs 15 FPS (tested in the save menu; that vertex buffer contains the vertices of menu elements like the hand cursor etc.).

                                        It is possible that some common code in the AMD driver responsible for vertex processing has performance flaws.


                                        I have the code for my tweaks in a wrapper on GitHub here: GitHub - Nucleoprotein/OneTweakNG: OneTweak for all games with game performance fixes.


                                        Something similar also happens in RE4 (not HD) and King's Bounty: The Legend, but for those games I change the behavior flags in IDirect3D9::CreateDevice to D3DCREATE_MIXED_VERTEXPROCESSING; that fixes them (>60 FPS vs 30-40 FPS in KB, 30 FPS vs 15 FPS in RE4).
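
                                        To put the Final Fantasy XIII numbers above in perspective, here is a back-of-the-envelope calculation in Python. It only uses the buffer size and frame rates quoted in this post; the function names are made up for illustration.

```python
def frame_time_ms(fps):
    """Milliseconds spent per frame at a given frame rate."""
    return 1000.0 / fps


def upload_rate_mb_per_s(bytes_per_frame, fps):
    """Data uploaded per second, in MB (10**6 bytes)."""
    return bytes_per_frame * fps / 1e6


VERTEX_BUFFER_BYTES = 358400  # size quoted in the post

slow = frame_time_ms(15)  # with D3DPOOL_MANAGED
fast = frame_time_ms(60)  # with D3DPOOL_SYSTEMMEM
extra_ms = slow - fast
rate = upload_rate_mb_per_s(VERTEX_BUFFER_BYTES, 60)

print(f"extra time per frame: {extra_ms:.1f} ms")
print(f"upload rate at 60 FPS: {rate:.1f} MB/s")
```

                                        That is about 50 ms of extra time per frame for a stream of roughly 21.5 MB/s, which is small next to what a PCIe link can carry. If the measurement holds, this would point at per-upload overhead rather than raw bandwidth, though that is an inference and not something measured here.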

                                        • Re: Dreadful OpenGL performance

                                          Hello again,

                                          I noticed a big improvement in radv:

                                          radv: Align large buffers to the fragment size. - Patchwork

                                          i.e. a patch to allocate VRAM in power-of-two sizes, which was also added to drm/amdgpu for Linux 4.20. Maybe it would benefit Windows too; I don't know what you are currently using in the Windows drivers, but this is a good place to start looking.
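
                                          As a rough illustration of what rounding an allocation size up means, here is a minimal Python sketch. It is not the radv or amdgpu code; `align_up`, `next_power_of_two` and the 64 KiB `FRAGMENT_SIZE` are assumptions made up for the example.

```python
FRAGMENT_SIZE = 64 * 1024  # hypothetical fragment size, for illustration only


def align_up(size, alignment):
    """Round size up to the next multiple of alignment (a power of two)."""
    if alignment <= 0 or alignment & (alignment - 1) != 0:
        raise ValueError("alignment must be a positive power of two")
    return (size + alignment - 1) & ~(alignment - 1)


def next_power_of_two(size):
    """Smallest power of two that is >= size, for size >= 1."""
    if size < 1:
        raise ValueError("size must be at least 1")
    return 1 << (size - 1).bit_length()


if __name__ == "__main__":
    request = 358400  # arbitrary example size in bytes
    print(align_up(request, FRAGMENT_SIZE))
    print(next_power_of_two(request))
```

                                          The trade-off in both cases is some wasted memory per allocation in exchange for sizes that are easier for the memory manager to map efficiently.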