

Caching(?) issue in trivial Vulkan test

Test summary

I have a test which does the following steps, all in a single command buffer submitted to one queue:

Compute dispatch 1: Writes '1' to an SSBO.
Render pass 1: Fullscreen triangle. Reads the SSBO; outputs green if the value is '1', otherwise red.
Compute dispatch 2: Writes '2' to the same SSBO.
Render pass 2: Fullscreen triangle. Reads the SSBO; outputs green if the value is '2', otherwise red.


Each compute dispatch is a single work group containing a single thread. In between each dispatch/pass, there is an ALL->ALL pipeline barrier with a global memory barrier whose source and destination access masks are both VK_ACCESS_MEMORY_READ_BIT | VK_ACCESS_MEMORY_WRITE_BIT. This should cause full serialization and cache invalidation.

Both compute dispatches use the same compute pipeline. Both render passes use the same graphics pipeline. The value that is written to the SSBO (and checked against in the fragment shader) is controlled by push constant.

Output (vs expected output)

The expected output is that the rendered image is green and that when inspecting the buffer contents in RenderDoc, the contents of the SSBO should be:

Compute dispatch 1: 1
Render pass 1: 1
Compute dispatch 2: 2
Render pass 2: 2


The output I get is that the rendered image is red and the SSBO contents are:

Compute dispatch 1: 1
Render pass 1: 1
Compute dispatch 2: 2
Render pass 2: 1


This behavior is fully consistent (not intermittent). At least on my machine (TM).

Environment

I am running on a Radeon RX 5700 XT (731FC1) on Windows 10.
Driver version: 21.50.02.01-220309a-377495C-AMD-Software-Adrenalin-Edition
The test works as expected on Intel UHD Graphics.

Workarounds

If I remove Render pass 1, the issue disappears. (This could of course be timing related.)
If I use two identical compute pipelines, instead of reusing the same one, the issue disappears. (I don't see how this can be timing related.)

This appears to me to be a caching issue. But with the ALL->ALL barriers introduced, I don't see how the app-side synchronization could be any more conservative.

Validation layers

The validation layers do not report any issues (Vulkan SDK 1.3.204.1). The validation layers are running correctly: I have confirmed this by deliberately introducing an error (a resource not being freed) and seeing that it is reported.

Reproducer

Here is the test: https://drive.google.com/file/d/1TpZ9I7QJIcUhd4W9h-DxV0DCT72h1kM9/view?usp=sharing
Test with two compute pipelines workaround: https://drive.google.com/file/d/1ZF0iWqNz6TxxD8oxRL_ctcH3gBoNGr3J/view?usp=sharing
It requires Visual C++ 2019 runtime redistributables.

There are two ways to run the test:

 

Tests.exe --vulkan --test BugFillBuffer

 

This will run the test once using offscreen rendering and check that the result is an all green image. Otherwise it will output Fail to the log and save the output as BugFillBuffer.png. (I initially thought this was related to vkCmdFillBuffer, but found it was not necessary to reproduce the issue, hence the test name.)
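As an illustration of what an "all green" check on the offscreen result might look like, here is a minimal sketch. It is not taken from the test source. It assumes the image has been read back as tightly packed 8-bit RGBA and that "green" means exactly (0, 255, 0, 255); the function name isAllGreen is hypothetical.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical check: returns true only if the buffer is non-empty, holds whole
// RGBA8 pixels, and every pixel is opaque pure green (0, 255, 0, 255).
bool isAllGreen(const std::vector<std::uint8_t>& rgba) {
    if (rgba.empty() || rgba.size() % 4 != 0) {
        return false;
    }
    for (std::size_t i = 0; i < rgba.size(); i += 4) {
        const bool green = rgba[i] == 0 && rgba[i + 1] == 255 &&
                           rgba[i + 2] == 0 && rgba[i + 3] == 255;
        if (!green) {
            return false;  // e.g. a red pixel from the failing render pass
        }
    }
    return true;
}
```

An exact comparison is reasonable here because the fragment shader outputs a constant color. A test that involved blending or filtering would need a tolerance instead.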

 

Tests.exe --vulkan --test BugFillBufferLoop

 

This will run the same test, but output the results to the swapchain in an infinite loop. This makes it easy to capture in RenderDoc or Radeon GPU Profiler.

Running without --vulkan will use OpenGL as the rendering API. The issue does not appear in OpenGL.

dipak

Hi @Chainsawkitten,

Thank you for reporting the issue. I'm moving the post to the AMD Vulkan forum.

Also, you have been whitelisted for the AMD Devgurus community.

Thanks.

 


Hey Chainsawkitten,

To make it easier for us to reproduce, can you provide the vulkaninfo output and the source code/build instructions for this issue?

Thanks,

Owen


vulkaninfo: https://pastebin.com/kGiFSs5U

My test doesn't call Vulkan functions directly, but uses an abstraction layer I've written. So there's quite a bit of code involved in this "trivial" test.

Code

git clone --branch AMDBug --recurse-submodules https://github.com/Chainsawkitten/HymnToBeauty.git

The test code is in tests/Video/Bugs.cpp .
The relevant shaders are tests/Video/shaders/WriteBuffer.comp, tests/Video/shaders/FullscreenTriangle.vert and tests/Video/shaders/VerifyFillBuffer.frag .
The shaders are mostly in GLSL but contain some macros used by my shader pre-processor for the sake of resource bindings.

Prerequisites

git, cmake, Vulkan SDK

Building

mkdir HymnToBeauty-build
cd HymnToBeauty-build
cmake -G "Visual Studio 16 2019" -D VulkanRenderer=ON ../HymnToBeauty/

Open the Visual Studio solution and build the Tests project. You can see the generator strings to use for other Visual Studio versions here.
If you build in Debug, validation layers will be enabled. They are disabled in Release.

Workaround

Check out the AMDBugWorkaround branch.


Thanks for providing reproducible source. I've created a ticket to track this issue.


Hi,

Can you try with the latest AMD driver? We weren't able to reproduce this issue on our side.

Thanks,

Owen


The issue still reproduces for me on 21.50.21.11-220428a-379219C-AMD-Software-Adrenalin-Edition.


Can you try the latest driver: 22.20.50200?

On Windows it should be 22.10.1.

Thanks,

Owen


On 22.10.1 the test no longer reproduces. The original issue (light binning) also appears to be working correctly without the workaround.
