Test summary
I have a test which does the following steps, all in a single command buffer submitted to one queue:
Step | Action
Compute dispatch 1 | Writes '1' to an SSBO.
Render pass 1 | Fullscreen triangle. Reads the SSBO; if the value is '1', outputs green, otherwise red.
Compute dispatch 2 | Writes '2' to the same SSBO.
Render pass 2 | Fullscreen triangle. Reads the SSBO; if the value is '2', outputs green, otherwise red.
Each compute dispatch is a single work group containing a single thread. In between each dispatch/pass, there is an ALL->ALL pipeline barrier with a global memory barrier whose access masks are VK_ACCESS_MEMORY_READ_BIT | VK_ACCESS_MEMORY_WRITE_BIT. This should cause full serialization and cache invalidation.
Both compute dispatches use the same compute pipeline. Both render passes use the same graphics pipeline. The value that is written to the SSBO (and checked against in the fragment shader) is controlled by push constant.
Output (vs expected output)
The expected output is a green image and, when inspecting the buffer in RenderDoc, the following SSBO contents at each step:
Step | SSBO value
Compute dispatch 1 | 1
Render pass 1 | 1
Compute dispatch 2 | 2
Render pass 2 | 2
The output I get is that the rendered image is red and the SSBO contents are:
Step | SSBO value
Compute dispatch 1 | 1
Render pass 1 | 1
Compute dispatch 2 | 2
Render pass 2 | 1
This behavior is fully consistent (not intermittent), at least on my machine (TM).
Environment
I am running on a Radeon RX 5700 XT (731FC1) on Windows 10.
Driver version: 21.50.02.01-220309a-377495C-AMD-Software-Adrenalin-Edition
The test works as expected on Intel UHD Graphics.
Workarounds
If I remove Render pass 1, the issue disappears. (This could of course be timing related.)
If I use two identical compute pipelines, instead of reusing the same one, the issue disappears. (I don't see how this can be timing related.)
This appears to me to be a caching issue. But with the ALL->ALL barriers introduced, I don't see how the app-side synchronization could be any more conservative.
Validation layers
The validation layers do not report any issues (Vulkan SDK 1.3.204.1). The validation layers are running correctly; I have confirmed this by deliberately introducing an error (a resource not being freed) and seeing that it is reported.
Reproducer
Here is the test: https://drive.google.com/file/d/1TpZ9I7QJIcUhd4W9h-DxV0DCT72h1kM9/view?usp=sharing
Test with two compute pipelines workaround: https://drive.google.com/file/d/1ZF0iWqNz6TxxD8oxRL_ctcH3gBoNGr3J/view?usp=sharing
It requires Visual C++ 2019 runtime redistributables.
There are two ways to run the test:
Tests.exe --vulkan --test BugFillBuffer
This will run the test once using offscreen rendering and check that the result is an all green image. Otherwise it will output Fail to the log and save the output as BugFillBuffer.png. (I initially thought this was related to vkCmdFillBuffer, but found it was not necessary to reproduce the issue, hence the test name.)
Tests.exe --vulkan --test BugFillBufferLoop
This will run the same test, but output the results to the swapchain in an infinite loop. This makes it easy to capture in RenderDoc or Radeon GPU Profiler.
Running without --vulkan will use OpenGL as the rendering API. The issue does not appear in OpenGL.
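The pass/fail check in the first mode amounts to verifying that every pixel of the read-back image is green. A minimal sketch of such a check follows; it is not the code from the test, the name IsAllGreen is hypothetical, and it assumes tightly packed 8-bit RGBA pixels where green means exactly (0, 255, 0).

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper (not from the test source).
// Assumptions: tightly packed 8-bit RGBA, and "green" is exactly (0, 255, 0).
// The alpha channel is ignored.
inline bool IsAllGreen(const std::vector<std::uint8_t>& rgba,
                       std::size_t width, std::size_t height) {
    assert(rgba.size() == width * height * 4 && "buffer size must match the image dimensions");
    for (std::size_t i = 0; i < rgba.size(); i += 4) {
        const bool green = rgba[i] == 0 && rgba[i + 1] == 255 && rgba[i + 2] == 0;
        if (!green) {
            return false;  // a single red pixel is enough to fail the test
        }
    }
    return true;
}
```

With the bug present, the second render pass writes red, so a check like this fails on the first pixel.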
Hi @Chainsawkitten,
Thank you for reporting the issue. I'm moving the post to the AMD Vulkan forum.
Also, you have been whitelisted for the AMD Devgurus community.
Thanks.
Hey Chainsawkitten,
To make it easier for us to reproduce, can you provide the vulkaninfo output and the source code/build instructions for this issue?
Thanks,
Owen
vulkaninfo: https://pastebin.com/kGiFSs5U
My test doesn't call Vulkan functions directly, but uses an abstraction layer I've written. So there's quite a bit of code involved in this "trivial" test.
Code
git clone --branch AMDBug --recurse-submodules https://github.com/Chainsawkitten/HymnToBeauty.git
The test code is in tests/Video/Bugs.cpp.
The relevant shaders are tests/Video/shaders/WriteBuffer.comp, tests/Video/shaders/FullscreenTriangle.vert and tests/Video/shaders/VerifyFillBuffer.frag.
The shaders are mostly in GLSL but contain some macros used by my shader pre-processor for the sake of resource bindings.
Prerequisites
git, cmake, Vulkan SDK
Building
mkdir HymnToBeauty-build
cd HymnToBeauty-build
cmake -G "Visual Studio 16 2019" -D VulkanRenderer=ON ../HymnToBeauty/
Open the Visual Studio solution and build the Tests project. You can see the generator strings to use for other Visual Studio versions here.
If you build in Debug, validation layers will be enabled. They are disabled in Release.
Workaround
Check out the AMDBugWorkaround branch.
Thanks for providing a reproducible source. I've created a ticket to track this issue.
Hi,
Can you try with the latest AMD driver? We weren't able to reproduce this issue on our side.
Thanks,
Owen
The issue still reproduces for me on 21.50.21.11-220428a-379219C-AMD-Software-Adrenalin-Edition.
Can you try the latest: 22.20.50200?
On Windows it should be: 22.10.1
Thanks,
Owen
On 22.10.1 the test no longer reproduces. The original issue (light binning) also appears to be working correctly without the workaround.