nezarn
Adept II

Driver Crash\BSOD with Vulkan

Hey,

I'm getting a driver crash, and sometimes a BSOD, when I play Hatsune Miku: Project Diva F in the RPCS3 emulator.

Here you can read some discussion: Vulkan: Project Diva F broken graphics (AMD specific) · Issue #2129 · RPCS3/rpcs3 · GitHub. Some time ago, graphics were broken in a lot of games with Vulkan on AMD GCN3 and GCN4 cards (GCN2 and older cards that support Vulkan always worked fine), and even with a workaround it's not perfect. I'm only mentioning this because it may be related in some way, and that issue also contains some discussion of this BSOD\driver crash.

This driver crash\BSOD is 100% reproducible in this game (Hatsune Miku: Project Diva F), in the song called "Black★Rock Shooter".

The file where the BSOD is coming from is ATIKMDAG.SYS

Windows crashdump: Dropbox - 100416-17093-01.dmp

PS.: I'm using the latest driver with an RX480. I'm posting this issue here because the rpcs3 devs told me that getting a BSOD is a driver bug\issue, as you can see in that GitHub issue link.

0 Likes
22 Replies
Zephiris
Adept I

fwiw, I'm getting a similar issue using Vulkan with the Dolphin emulator on Super Smash Brothers Brawl. It happens 100% of the time when it loads stage 2 in "classic mode", on an RX 480 with 16.10.1 and stock settings (no OC).

Not sure how to reproduce this more easily, but even if emulators or games are sending invalid commands somehow, obviously it shouldn't crash the entire driver.

0 Likes

IDK what they changed from GCN2 to GCN3 and GCN4, but it's interesting that only GCN3 and GCN4 cards are affected...

It would be nice if someone from AMD would reply.

edit: I also always get 100% GPU usage in rpcs3 with Vulkan, while for others GPU usage is fine. (This 100% GPU usage can be easily reproduced by running one of the samples that are included in rpcs3.)

0 Likes
dwitczak
Staff

I could look into this, granted the following:

1) You can provide clear instructions on how to reproduce the issue + app source code is available.

2) There's no need to download any third-party content.

3) No issues are reported by the latest version of the validation layers, taken straight from the project's repository.

Edit: The reason behind all of these is that it doesn't take much for a misbehaving app to TDR a driver in Vulkan, with the vast majority of cases turning out to be app-side issues.

0 Likes

Well, you need to have a PS3 and the game (so basically you need 3rd party content). The Vulkan source code is here: rpcs3/rpcs3/Emu/RSX/VK at master · RPCS3/rpcs3 · GitHub. Feel free to take a look, and if you see anything wrong\missing, post it here: Vulkan: multiple issues on newer AMD (GCN3+) cards · Issue #2201 · RPCS3/rpcs3 · GitHub

But if you have a PS3 and the game, here's how to make the game run and reproduce the BSOD\driver crash:

1. Download latest RPCS3 from here: AppVeyor

2. You need to get some firmware modules from your PS3 (or if you can find a pup file extractor, you can get them from the official PS3 firmware updates)

3. Use these settings: http://i.imgur.com/f4KO3ii.png (+ in graphics select Vulkan, and a resolution, 1280x720 or 1920x1080, frame limit auto, in audio Xaudio2, at input if you have an Xinput gamepad, then select Xinput)

4. Here is a savegame so the required song is unlocked Dropbox - NPUB31241_SYSTEM00.ZIP (unzip it into rpcs3\dev_hdd0\home\00000001\savedata\)

5. Run the game. If you have the disc version, select Boot->Boot Game and browse to the game's folder (you need to rip your disc). If you have the PSN version, you need to back it up into a pkg and use Boot->Install Game, then place the .rap file (which is basically the license showing that you own the game) in rpcs3\dev_hdd0\home\00000001\exdata\. The game will then be in the list; simply double-click on it.

6. In game select Load Game, then Play->Rhythm Game and select the song called "Black★Rock Shooter", then you can either play with start, or watch video, both will crash\BSOD at the same place after a while (and always)

edit: sometime later, when I have time, I can post some renderdoc\perfstudio stuff, so at least you could help get rid of the graphical issues that are present on GCN3+ cards

0 Likes

Sorry, no go, we won't be able to help if we are required to use third-party content to reproduce the issue.

On the other hand, a RenderDoc-based repro is fine, that's something I can definitely try helping you out with.

0 Likes

Here is a RenderDoc capture from a game's main menu. At the bottom you can clearly see the corruption that happens (captured with RenderDoc 0.31).

Dropbox - graphicalissue.rdc

Here is a video of what it looks like: 2016 10 15 23 05 37 - YouTube  (if the link doesn't contain the timestamp, go to 0:25, there you can see it)

And btw, I mentioned that the emulator always uses 100% GPU on newer AMD cards. You can easily reproduce that: just download the latest rpcs3 version from my comment above, then, after setting up the emulator to use Vulkan, boot an included sample from rpcs3\dev_hdd0\game\TEST12345\USRDIR\ (so you don't even need to have a PS3, since the samples don't need any files from the firmware)

For example when running gs_gcm_hello_world.elf : RX480: http://i.imgur.com/Kg0hxul.png R9 290: http://i.imgur.com/wxEM8bU.png

0 Likes

It appears that the glitches are introduced in colour passes #7-#11, where the app appears to be doing a blur.

The biggest problem that stands out in the trace you provided is that your application is executing a lot of renderpasses without defining external dependencies. This can lead to corruption like what we're seeing, because the GPU is free to run the commands in an overlapping manner, which may lead to RAW (read-after-write) hazards.

For performance reasons, also please consider coalescing the huge number of renderpasses your application is using right now, so that the draw calls are embedded in subpasses.

Hi, I added the following dependency to try and finish previous writes before they are read as textures:

  dependencies[1].srcSubpass = VK_SUBPASS_EXTERNAL;
  dependencies[1].srcAccessMask = VK_ACCESS_COLOR_ATTACHMENT_WRITE_BIT | VK_ACCESS_DEPTH_STENCIL_ATTACHMENT_WRITE_BIT;
  dependencies[1].srcStageMask = VK_PIPELINE_STAGE_COLOR_ATTACHMENT_OUTPUT_BIT;
  dependencies[1].dstSubpass = 0;
  dependencies[1].dstAccessMask = VK_ACCESS_SHADER_READ_BIT;
  dependencies[1].dstStageMask = VK_PIPELINE_STAGE_FRAGMENT_SHADER_BIT;
  dependencies[1].dependencyFlags = VK_DEPENDENCY_BY_REGION_BIT;

It didn't help; I'm sure it's my fault for not understanding the spec on those fields. Since it is an emulator, there's no way for the renderer to know what the next draw command will look like, or whether the render target will even change and the current one be used as an input, so I've been using one subpass and setting src to external.

What's the correct way to set up this dependency?

0 Likes

The BY_REGION flag doesn't make much sense for a single-subpass renderpass, please remove it. Also, if your application needs to sample DS image views, you should consider ORing srcStageMask with VK_PIPELINE_STAGE_LATE_FRAGMENT_TESTS_BIT.

In general, please use validation layers to confirm your Vulkan back-end does everything OK.
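
The two suggestions above (dropping the BY_REGION flag and ORing the late-fragment-tests stage into srcStageMask) could be applied to the dependency from the earlier post roughly as follows. This is only a sketch: the helper name make_external_to_subpass0_dependency is hypothetical and not taken from RPCS3, the stand-in definitions exist only so the snippet builds without the Vulkan SDK, and nothing in the thread confirms that this change alone removes the corruption.

```cpp
#include <cassert>
#include <cstdint>

#if __has_include(<vulkan/vulkan.h>)
#include <vulkan/vulkan.h>
#else
// Stand-ins mirroring names and values from the Vulkan header, present only
// so this sketch compiles without the SDK. Real code should include the header.
using VkFlags = std::uint32_t;
constexpr std::uint32_t VK_SUBPASS_EXTERNAL = ~0u;
constexpr VkFlags VK_PIPELINE_STAGE_FRAGMENT_SHADER_BIT = 0x00000080;
constexpr VkFlags VK_PIPELINE_STAGE_LATE_FRAGMENT_TESTS_BIT = 0x00000200;
constexpr VkFlags VK_PIPELINE_STAGE_COLOR_ATTACHMENT_OUTPUT_BIT = 0x00000400;
constexpr VkFlags VK_ACCESS_SHADER_READ_BIT = 0x00000020;
constexpr VkFlags VK_ACCESS_COLOR_ATTACHMENT_WRITE_BIT = 0x00000100;
constexpr VkFlags VK_ACCESS_DEPTH_STENCIL_ATTACHMENT_WRITE_BIT = 0x00000400;
struct VkSubpassDependency {
    std::uint32_t srcSubpass;
    std::uint32_t dstSubpass;
    VkFlags srcStageMask;
    VkFlags dstStageMask;
    VkFlags srcAccessMask;
    VkFlags dstAccessMask;
    VkFlags dependencyFlags;
};
#endif

// Hypothetical helper: dependency from work outside the renderpass to
// subpass 0 of a single-subpass renderpass.
VkSubpassDependency make_external_to_subpass0_dependency() {
    VkSubpassDependency dep{};
    dep.srcSubpass = VK_SUBPASS_EXTERNAL;
    dep.dstSubpass = 0;
    // Colour writes complete at COLOR_ATTACHMENT_OUTPUT; depth/stencil writes
    // complete at LATE_FRAGMENT_TESTS, which matters if DS views are sampled.
    dep.srcStageMask = VK_PIPELINE_STAGE_COLOR_ATTACHMENT_OUTPUT_BIT |
                       VK_PIPELINE_STAGE_LATE_FRAGMENT_TESTS_BIT;
    dep.srcAccessMask = VK_ACCESS_COLOR_ATTACHMENT_WRITE_BIT |
                        VK_ACCESS_DEPTH_STENCIL_ATTACHMENT_WRITE_BIT;
    // The previous results are read as textures in the fragment shader.
    dep.dstStageMask = VK_PIPELINE_STAGE_FRAGMENT_SHADER_BIT;
    dep.dstAccessMask = VK_ACCESS_SHADER_READ_BIT;
    // No VK_DEPENDENCY_BY_REGION_BIT for a single-subpass renderpass.
    dep.dependencyFlags = 0;
    return dep;
}
```

The returned struct would go into the pDependencies array of VkRenderPassCreateInfo; running the result under the validation layers is still the way to confirm the rest of the back-end is consistent with it.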

0 Likes

Oh, and reg. 100% GPU utilization: assuming you do not use any kind of CPU-side frame limiting, that's absolutely fine. After all, wasn't the idea behind Vulkan to squeeze as much juice out of the GPU as possible?
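
For readers unfamiliar with the term, a CPU-side frame limiter can be as small as the sketch below. The FrameLimiter class is hypothetical and is not how RPCS3 or any driver implements its "frame limit" setting; it only illustrates the idea of sleeping on the CPU so that frames are not submitted faster than a target rate, which leaves the GPU idle part of the time.

```cpp
#include <cassert>
#include <chrono>
#include <thread>

// Hypothetical CPU-side frame limiter: blocks the render loop so that
// frames are paced at no more than `target_fps` per second.
class FrameLimiter {
    using clock = std::chrono::steady_clock;

public:
    explicit FrameLimiter(double target_fps)
        : frame_time_(std::chrono::duration_cast<clock::duration>(
              std::chrono::duration<double>(1.0 / target_fps))),
          next_frame_(clock::now() + frame_time_) {}

    // Call once per frame, after the frame's work has been submitted.
    void wait() {
        const auto now = clock::now();
        if (now < next_frame_) {
            // Ahead of schedule: sleep until the next frame slot.
            std::this_thread::sleep_until(next_frame_);
            next_frame_ += frame_time_;
        } else {
            // Behind schedule: restart pacing from now instead of catching up.
            next_frame_ = now + frame_time_;
        }
    }

private:
    clock::duration frame_time_;
    clock::time_point next_frame_;
};
```

In a render loop this would be one limiter.wait() call after each present; without something like it, an application that renders as fast as it can is expected to keep the GPU busy.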

0 Likes

Then why does the 100% GPU usage happen only on GCN3 and GCN4 cards? GCN2 cards and Nvidia cards don't ever show 100% usage.

Just tried DOOM, and it happens there too. Gonna ask my friends to check their GPU usage...

edit: it also happens in every Vulkan sample.

0 Likes

Obviously, I can't speak for other vendors. As for GCN2, I don't have an answer off the cuff.

0 Likes

Asked my friends to run the Cube sample from Vulkan SDK, so far these are the GPU usages:

http://i.imgur.com/r919zoa.png  R9 290 22%

http://i.imgur.com/hAU98jm.jpg  Nvidia GTX 950 56%

Meanwhile mine:

http://i.imgur.com/cMvxBOT.png

So something is clearly wrong driver wise....

edit: R9 280 (while running Cube sample and an Android emulator) http://gpuz.techpowerup.com/16/10/18/3yp.png

edit2: Fury http://i.imgur.com/NnxByMD.png  (same 100% usage as on the RX480, so it's confirmed that only GCN3 and GCN4 cards are affected)

0 Likes

We're looking into this. Thanks for reporting.

Looks like the graphical artifact I mentioned in the first post can be reproduced in another emulator, and on OpenGL....

Cemu - Graphical Corruption on AMD - YouTube

It would be nice to find something that is free and could reproduce this issue in both Vulkan and OpenGL.... (since big emulator projects tend to follow the OpenGL\Vulkan specifications, there's a chance that something is very broken driver-wise on newer AMD cards....)

edit: you can also reproduce this issue with PCSX2, in the BIOS screen (in OpenGL, and it's not as bad as in RPCS3 w/ Vulkan)

http://i.imgur.com/x0rl8C3.png

TL;DR: Either something is very broken Vulkan\OpenGL wise on newer cards, or every emulator in the world has the same issue (which is quite unlikely)

edit2: looks like soon (tomorrow) I can upload a method to reproduce this graphical issue OpenGL-wise.

0 Likes

As stated earlier, I can't help much unless you can provide a repro without third-party dependencies. Please correct me if I'm wrong, but I believe a bios binary also counts as one.

0 Likes

I believe this should technically do it.

0 Likes

I was able to reproduce the issue you reported locally, so I'm going to forward this for internal investigation.

A Vulkan repro would still be appreciated. Thanks.

It doesn't take much in Vulkan to miss a barrier which could lead to artifacts like the one visible on that clip. I will be happy to have a closer look at this issue, provided freely available assets can be used to reproduce it.

There are memory barriers in OpenGL too (although much simpler to use than in APIs like Vulkan!), which - if skipped - could cause similar effects.

Again, I really can't say much beyond these general guidelines, if I don't have a way to reproduce the issue locally.

0 Likes

So any update regarding 100% GPU usage? (in vulkan with GCN3 and 4 cards)

0 Likes

I don't have any update at this point, I'm afraid.

0 Likes

Since there's still no news (and the rpcs3 dev only fixes Nvidia issues), I found a way to reproduce the driver crash.

0. Emulator source code: GitHub - RPCS3/rpcs3: PS3 emulator/debugger

1. Download the emulator from here https://rpcs3.net/download

2. Download PS3 firmware update from here PS3 System Software Update – Latest Version 4.81

3. Download Project Diva F Demo from here Dropbox - UP0177-NPUB90958_00-PJDF393PRPRTRIAL.pkg

4. After extracting the emulator, go to File -> Install Firmware

5. Install the Demo with File -> Install .pkg

6. In Configuration -> GPU set the renderer to Vulkan (and for good audio, set Xaudio2 in the Audio tab). In Configuration -> Pads set up a controller or keyboard controls (it's needed, because the issue happens during gameplay)

7. Run the game, and in the song list, choose Black★Rock Shooter, difficulty doesn't matter (if it doesn't wanna load, change PPU decoder to Interpreter (fast) in configuration -> cpu)

8. After a while, game will crash the driver. (based on this video Project Diva F | Black Rock Shooter | EXTREME PERFECT - YouTube  it happens when camera would show that blinking lamp, around 0:44 in the video)

This driver crash also happens on Linux, even with the open-source drivers. Also note that it doesn't happen with every AMD card; from what I've seen, it only happens on newer cards (on some GCN3 cards, and on my RX480).

I hope you guys can debug the issue, whether it's really a driver issue (as the stubborn rpcs3 dev says, which I don't believe, because then how come every other Vulkan app\game works just fine) or an emulator issue.

edit: It looks like it's the same issue as this one: Vulkan: Fix GPU hangs on AMD Polaris by stenzek · Pull Request #4924 · dolphin-emu/dolphin · GitHub. If I modify the source code to disable primitive restart, it doesn't crash the driver:

  if (rsx::method_registers.restart_index_enabled())
    properties.ia.primitiveRestartEnable = VK_FALSE;
  //else
  //  properties.ia.primitiveRestartEnable = VK_FALSE;

But ofc since this isn't a real fix, it breaks graphics in everything.

edit: primitive restart is broken, workaround was made in the emulator (which basically disables primitive restart on RX4xx, RX5xx and Vega)
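
A per-device version of that workaround might look roughly like the sketch below. It is not RPCS3's actual code: the function names are hypothetical, and the device-name substrings are assumptions about how the driver reports RX4xx, RX5xx and Vega cards. The result would feed VkPipelineInputAssemblyStateCreateInfo::primitiveRestartEnable.

```cpp
#include <cassert>
#include <string>

// Hypothetical check for the card families named in the thread.
// The substrings are assumptions about the reported device name.
bool primitive_restart_is_unsafe(const std::string& device_name) {
    static const char* const affected[] = {"RX 4", "RX 5", "Vega"};
    for (const char* tag : affected) {
        if (device_name.find(tag) != std::string::npos) {
            return true;
        }
    }
    return false;
}

// Decide whether to enable hardware primitive restart for a pipeline:
// only when the game asks for it and the device is not on the list above.
bool choose_primitive_restart(bool game_requests_restart,
                              const std::string& device_name) {
    return game_requests_restart && !primitive_restart_is_unsafe(device_name);
}
```

As noted above, simply turning primitive restart off breaks graphics, so on affected cards the restart indices would still have to be handled some other way; that part is not shown here.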

0 Likes