OpenGL & Vulkan

Vulkan: Poor performance due to barrier REGION_BIT being ignored causing full flush

This is related to an old discussion about PCSX2's poor OpenGL performance, which was rudely archived with no resolution, as seen here: https://community.amd.com/t5/opengl-vulkan/dreadful-opengl-performance/td-p/212252/

However, one of our developers has recently been working on a Vulkan renderer and has been investigating poor performance on AMD when using a "pipeline barrier from COLOR_ATTACHMENT -> FRAGMENT SHADER, but the memory flags are COLOR_ATTACHMENT -> INPUT_ATTACHMENT, which is framebuffer-local (VK_DEPENDENCY_BY_REGION_BIT)". He is doing what is described here: https://www.khronos.org/registry/vulkan/specs/1.2-extensions/html/vkspec.html#renderpass-feedbackloo...

What he noticed was that, regardless of the settings he passed, the driver performed a full barrier and invalidated everything, which probably causes a write-back of everything to VRAM. This is VERY slow. Nvidia doesn't suffer from the same problem.

 

HJ35HD7.png

 

I don't know whether OpenGL is doing the same thing, but performance is even worse there when using barriers, so I wouldn't be surprised if it was.

 

The code for the barrier is as follows:

 

static void ColorBufferBarrier(GSTexture* rt)
{
    // Make colour attachment writes visible to input attachment reads of the same image.
    const VkImageMemoryBarrier barrier = {
        VK_STRUCTURE_TYPE_IMAGE_MEMORY_BARRIER, nullptr,
        VK_ACCESS_COLOR_ATTACHMENT_READ_BIT | VK_ACCESS_COLOR_ATTACHMENT_WRITE_BIT, // srcAccessMask
        VK_ACCESS_INPUT_ATTACHMENT_READ_BIT,                                        // dstAccessMask
        VK_IMAGE_LAYOUT_GENERAL, VK_IMAGE_LAYOUT_GENERAL,                           // oldLayout, newLayout (no transition)
        VK_QUEUE_FAMILY_IGNORED, VK_QUEUE_FAMILY_IGNORED,                           // no queue family ownership transfer
        static_cast<GSTextureVK*>(rt)->GetTexture().GetImage(),
        {VK_IMAGE_ASPECT_COLOR_BIT, 0u, 1u, 0u, 1u}};                               // mip level 0, array layer 0

    vkCmdPipelineBarrier(g_vulkan_context->GetCurrentCommandBuffer(),
        VK_PIPELINE_STAGE_COLOR_ATTACHMENT_OUTPUT_BIT, VK_PIPELINE_STAGE_FRAGMENT_SHADER_BIT,
        VK_DEPENDENCY_BY_REGION_BIT, 0, nullptr, 0, nullptr, 1, &barrier);
}
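
For context: with classic VkRenderPass objects, a pipeline barrier recorded inside a render pass instance is only valid if the current subpass declares a matching self-dependency at render pass creation. The thread doesn't show that part, so the following is only a rough sketch of what such a declaration might look like. The helper name MakeFeedbackLoopSelfDependency and the single-subpass setup are assumptions for illustration and are not taken from PCSX2. The fallback definitions are there only so the snippet compiles without the Vulkan SDK installed; they mirror the names and values in vulkan_core.h.

```cpp
#include <cstdint>

#if __has_include(<vulkan/vulkan.h>)
#include <vulkan/vulkan.h>
#else
// Stand-ins so this sketch compiles without the Vulkan SDK.
// Names, field order and values mirror vulkan_core.h.
using VkFlags = std::uint32_t;
constexpr VkFlags VK_PIPELINE_STAGE_FRAGMENT_SHADER_BIT = 0x00000080;
constexpr VkFlags VK_PIPELINE_STAGE_COLOR_ATTACHMENT_OUTPUT_BIT = 0x00000400;
constexpr VkFlags VK_ACCESS_INPUT_ATTACHMENT_READ_BIT = 0x00000010;
constexpr VkFlags VK_ACCESS_COLOR_ATTACHMENT_READ_BIT = 0x00000080;
constexpr VkFlags VK_ACCESS_COLOR_ATTACHMENT_WRITE_BIT = 0x00000100;
constexpr VkFlags VK_DEPENDENCY_BY_REGION_BIT = 0x00000001;
struct VkSubpassDependency
{
    std::uint32_t srcSubpass;
    std::uint32_t dstSubpass;
    VkFlags srcStageMask;
    VkFlags dstStageMask;
    VkFlags srcAccessMask;
    VkFlags dstAccessMask;
    VkFlags dependencyFlags;
};
#endif

// Hypothetical helper: builds a self-dependency for `subpass` that uses the
// same stage and access masks as the pipeline barrier in ColorBufferBarrier().
inline VkSubpassDependency MakeFeedbackLoopSelfDependency(std::uint32_t subpass)
{
    VkSubpassDependency dep = {};
    dep.srcSubpass = subpass; // same index on both sides makes it a self-dependency
    dep.dstSubpass = subpass;
    dep.srcStageMask = VK_PIPELINE_STAGE_COLOR_ATTACHMENT_OUTPUT_BIT;
    dep.dstStageMask = VK_PIPELINE_STAGE_FRAGMENT_SHADER_BIT;
    dep.srcAccessMask = VK_ACCESS_COLOR_ATTACHMENT_READ_BIT | VK_ACCESS_COLOR_ATTACHMENT_WRITE_BIT;
    dep.dstAccessMask = VK_ACCESS_INPUT_ATTACHMENT_READ_BIT;
    // Both stages are framebuffer-space stages, so the spec requires BY_REGION here.
    dep.dependencyFlags = VK_DEPENDENCY_BY_REGION_BIT;
    return dep;
}
```

The returned struct would be passed through VkRenderPassCreateInfo::pDependencies, e.g. MakeFeedbackLoopSelfDependency(0) for a render pass with a single subpass.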

8 Replies

No response, AMD? An acknowledgement would be nice. I figured 4 months might be enough for at least a "thanks for the report".

 

Thanks.

Hi @refractionpcsx2 ,

Sorry for this delayed response. I have informed the Vulkan team about this issue. I will let you know once I get any feedback from them.

Thanks.

Thank you, I appreciate it 🙂

Hi,

I've opened a ticket to track this issue and get someone to look at it. Thanks for reporting.

Owen

Hi refractionpcsx2,

Would it be possible to provide an executable with source code to reproduce this issue so it's easier on our end?

Thanks,

Owen

Testcase:

https://drive.google.com/file/d/1IRvZaVo55ljhh4PPoUPhEfs762psrleV/view?usp=share_link

Source code:

https://github.com/PCSX2/pcsx2

 

To run the testcase, simply drag the DRAGME.gs.xz file on top of the executable pcsx2-qtx64.exe.

Hey,

Sorry for the delay. The investigation is ongoing, and it looks like VK_DEPENDENCY_BY_REGION_BIT won't do anything on dGPUs. It seems to be mostly used on mobile GPUs with tile-based renderers. Is PCSX2 still using this method? And what environment showed this issue? A vulkaninfo dump would be useful.

Thanks,

Owen

Hi there! 

So sorry for the delay in getting back to you. I must have missed the email; we just happened to be talking about this on our Discord today.

 

We do indeed still use this bit, and as far as we're aware both Intel and Nvidia benefit from it being set, not just tile-based iGPUs. It was one of the main reasons Vulkan was faster than OpenGL for us.

 

This is a comparison on an Nvidia 2070 Super with Need for Speed Carbon (which does a lot of barriers)

OGL:
https://media.discordapp.net/attachments/612095738712817665/1200441378476408942/image.png
VK:

https://media.discordapp.net/attachments/612095738712817665/1200441390258212884/image.png

 

If you would like, I can provide this sample in a package. It's what we call a GS dump, which is a recording of the instructions sent to the PS2's GPU, so you can run it on your end with a preconfigured PCSX2 (no BIOS or game required, so no copyrighted materials). There might be a few draws in there though :D. We also have a "debug device" option in the emulator for annotating the draws in tools such as RenderDoc 😄
