OpenGL & Vulkan

dutta
Adept II

GPU driver hang

I have a Vulkan issue that only seems to occur on my GPU. So far I have also tested it on an R9 Nano and an Nvidia 1060, and it works fine on both. The validation layers only report errors related to texture layouts, which worked previously. This is my issue:

Running in an optimized build, the GPU hangs after a couple of seconds and fails to recover. Every frame is identical; nothing in the scene changes. On one occasion the GPU did recover, but both screens turned purple. The time before the hang varies, but I never manage to run for more than a minute. Synchronizing each frame by inserting a fence on all queues and immediately waiting for it does hurt performance, but does not remove the hang. In a very few cases the GPU managed to recover and showed that vkAcquireNextImageKHR returned VK_ERROR_DEVICE_LOST, but I could not find any earlier command returning this error. I also noticed that when I waited on a fence that had never been submitted, vkWaitForFences returned VK_TIMEOUT, which does not seem to correspond to the specification, but this issue is probably unrelated. I am sure I am doing something wrong, but without the GPU recovering, and without the validation layers telling me anything interesting, it is impossible to know where to look.

Running in a debug build, however, produces no such issues. All submissions are done on a single thread, and I can't see how I could be racing against the GPU, since I am in fact waiting for the GPU to finish after each submit and present. My hardware is an RX 480, and the driver version is 19.7.2, running on Windows 10 build 1903. The issue is hard to extract into a minimal repro, but the code can be found at https://github.com/gscept/nebula/tree/master/code/render/coregraphics/vk; the relevant files should be vkgraphicsdevice.cc and vksubcontexthandler.cc.
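To narrow down which call first reports the device loss, one option is to route every Vulkan return value through a small tracker that remembers the first error. The following is only a rough sketch of that idea, not code from the repository. It assumes a stand-in Result enum (with the same numeric values as VkResult, so it compiles without the Vulkan SDK), and ResultTracker, TRACK, fakeAcquireNextImage, fakeQueueSubmit and runFrames are all hypothetical names made up for illustration.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Stand-in for VkResult so the sketch builds without the Vulkan SDK (assumption).
// The numeric values mirror the Vulkan headers; negative values are errors.
enum class Result : std::int32_t {
    Success = 0,          // VK_SUCCESS
    NotReady = 1,         // VK_NOT_READY
    Timeout = 2,          // VK_TIMEOUT
    ErrorDeviceLost = -4  // VK_ERROR_DEVICE_LOST
};

inline bool isError(Result r) { return static_cast<std::int32_t>(r) < 0; }

// What we remember about the first failing call.
struct FirstFailure {
    Result code = Result::Success;
    std::string call;  // source text of the failing expression
    int frame = -1;    // frame index in which it failed
};

// Hypothetical helper: records only the first error it sees.
class ResultTracker {
public:
    // Returns true if the call succeeded (including non-error status codes).
    bool check(Result r, const char* call, int frame) {
        if (isError(r) && !failed()) {
            first_.code = r;
            first_.call = call;
            first_.frame = frame;
        }
        return !isError(r);
    }
    bool failed() const { return isError(first_.code); }
    const FirstFailure& first() const { return first_; }

private:
    FirstFailure first_;
};

// Wraps a call so the tracker also gets the expression text.
#define TRACK(tracker, frame, expr) (tracker).check((expr), #expr, (frame))

// Fake calls standing in for vkAcquireNextImageKHR and vkQueueSubmit.
// Here the submit fails in frame 2, and the acquire only fails afterwards.
inline Result fakeAcquireNextImage(int frame) {
    return frame >= 3 ? Result::ErrorDeviceLost : Result::Success;
}
inline Result fakeQueueSubmit(int frame) {
    return frame >= 2 ? Result::ErrorDeviceLost : Result::Success;
}

// Runs frames until the first error and reports where it came from.
inline FirstFailure runFrames(int count) {
    ResultTracker tracker;
    for (int frame = 0; frame < count && !tracker.failed(); ++frame) {
        if (!TRACK(tracker, frame, fakeAcquireNextImage(frame))) break;
        TRACK(tracker, frame, fakeQueueSubmit(frame));
    }
    return tracker.first();
}
```

In the real renderer the fake functions would be replaced by the actual Vulkan calls and Result by VkResult. With the fake calls above, runFrames(10) reports the submit in frame 2 as the first failure rather than the acquire in the following frame.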

5 Replies
dorisyan
Staff

Hi dutta, thanks for your report. We will take a look into it.


Thanks! I want to add that I made a mistake in my first post. I think my hang did not come from vkAcquireNextImageKHR but from vkQueueSubmit, which makes more sense. I can provide a RenderDoc capture if that will help.


Hi @dutta, it would help if you could provide a RenderDoc capture. Also, could you write a minimal code sample that reproduces this problem?

Yes, dorisyan is absolutely right. dutta, could you provide a minimal test project (with an executable and a Visual Studio solution)? I can test it on AMD/NVIDIA.


Sorry for the late response. I am going to have a look at it, but it requires a little bit of work considering it's not a small project. 
