OpenGL & Vulkan

JulianXhokaxhiu
Journeyman III

Vulkan HDR Crash on the latest AMD drivers ( 22.8.1 )

Hi,
 
I've found a weird AMD driver crash that I have no idea how to fix. My Vulkan code seems to be as it should be (based on https://gpuopen.com/learn/using-amd-freesync-premium-pro-hdr-code-samples/), and AFAIK this wasn't happening until release 22.7.1.

 

The piece of code that triggers the crash is this one:

 

As soon as that code gets executed, this exception is returned:
---
Exception thrown at 0x00007FF87F3D3E00 (amdvlk64.dll) in example-02-metaballsDebug.exe: 0xC0000005: Access violation reading location 0xFFFFFFFFFFFFFFFF.
---

 

This is the call stack if it helps:

 

amdvlk64.dll!00007ff87f3d3e00()
vulkan-1.dll!00007ff8852bdf96()
example-02-metaballsDebug.exe!bgfx::vk::SwapChainVK::createSwapChain() Line 6809
  at julianxhokaxhiu\bgfx\src\renderer_vk.cpp(6809)
example-02-metaballsDebug.exe!bgfx::vk::SwapChainVK::create(VkCommandBuffer_T * _commandBuffer, void * _nwh, const bgfx::Resolution & _resolution, bgfx::TextureFormat::Enum _depthFormat) Line 6501
  at julianxhokaxhiu\bgfx\src\renderer_vk.cpp(6501)
example-02-metaballsDebug.exe!bgfx::vk::FrameBufferVK::create(unsigned short _denseIdx, void * _nwh, unsigned int _width, unsigned int _height, bgfx::TextureFormat::Enum _format, bgfx::TextureFormat::Enum _depthFormat) Line 7468
  at julianxhokaxhiu\bgfx\src\renderer_vk.cpp(7468)
example-02-metaballsDebug.exe!bgfx::vk::RendererContextVK::init(const bgfx::Init & _init) Line 1861
  at julianxhokaxhiu\bgfx\src\renderer_vk.cpp(1861)
example-02-metaballsDebug.exe!bgfx::vk::rendererCreate(const bgfx::Init & _init) Line 4437
  at julianxhokaxhiu\bgfx\src\renderer_vk.cpp(4437)
example-02-metaballsDebug.exe!bgfx::rendererCreate(const bgfx::Init & _init) Line 2757
  at julianxhokaxhiu\bgfx\src\bgfx.cpp(2757)
example-02-metaballsDebug.exe!bgfx::Context::rendererExecCommands(bgfx::CommandBuffer & _cmdbuf) Line 2808
  at julianxhokaxhiu\bgfx\src\bgfx.cpp(2808)
example-02-metaballsDebug.exe!bgfx::Context::renderFrame(int _msecs) Line 2445
  at julianxhokaxhiu\bgfx\src\bgfx.cpp(2445)
example-02-metaballsDebug.exe!bgfx::renderFrame(int _msecs) Line 1480
  at julianxhokaxhiu\bgfx\src\bgfx.cpp(1480)
example-02-metaballsDebug.exe!entry::Context::run(int _argc, const char * const * _argv) Line 534
  at julianxhokaxhiu\bgfx\examples\common\entry\entry_windows.cpp(534)
example-02-metaballsDebug.exe!main(int _argc, const char * const * _argv) Line 1187
  at julianxhokaxhiu\bgfx\examples\common\entry\entry_windows.cpp(1187)
[External Code]

 

 
Any help in this regard is appreciated.

 

Thank you in advance and best regards,
Julian
Solution Architect/Engineer
https://julianxhokaxhiu.com/
dipak
Big Boss

Thank you for providing the above information. I've reported the issue to the Vulkan team.

Thanks.


Hey JulianXhokaxhiu,

Looks like you're missing the VkDisplayNativeHdrSurfaceCapabilitiesAMD struct in the pNext chain when setting up VkSurfaceCapabilities2KHR.

The sample code uses:

// VkHdrMetadataEXT
s_HdrMetadataEXT.sType = VK_STRUCTURE_TYPE_HDR_METADATA_EXT;
s_HdrMetadataEXT.pNext = nullptr;

// VkDisplayNativeHdrSurfaceCapabilitiesAMD
s_DisplayNativeHdrSurfaceCapabilitiesAMD.sType = VK_STRUCTURE_TYPE_DISPLAY_NATIVE_HDR_SURFACE_CAPABILITIES_AMD;
s_DisplayNativeHdrSurfaceCapabilitiesAMD.pNext = &s_HdrMetadataEXT;

// VkSurfaceCapabilities2KHR
s_SurfaceCapabilities2KHR.sType = VK_STRUCTURE_TYPE_SURFACE_CAPABILITIES_2_KHR;
s_SurfaceCapabilities2KHR.pNext = &s_DisplayNativeHdrSurfaceCapabilitiesAMD;

VkResult res = g_vkGetPhysicalDeviceSurfaceCapabilities2KHR(s_physicalDevice,
                                                            &s_PhysicalDeviceSurfaceInfo2KHR,
                                                            &s_SurfaceCapabilities2KHR);

Your code initializes like this:

VkHdrMetadataEXT s_HdrMetadataEXT;
s_HdrMetadataEXT.sType = VK_STRUCTURE_TYPE_HDR_METADATA_EXT;
s_HdrMetadataEXT.pNext = nullptr;

VkSurfaceCapabilities2KHR surfaceCapabilities;
surfaceCapabilities.sType = VK_STRUCTURE_TYPE_SURFACE_CAPABILITIES_2_KHR;
surfaceCapabilities.pNext = &s_HdrMetadataEXT;
result = vkGetPhysicalDeviceSurfaceCapabilities2KHR(physicalDevice, &m_surface, &surfaceCapabilities);
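
To make the difference between the two chains concrete, here is a minimal sketch of how a pNext chain gets walked. It deliberately avoids the Vulkan headers: ChainNode, StructType, findInChain and the two helper functions are made-up stand-ins for illustration, not real Vulkan or driver code, and how the driver actually consumes the chain is an assumption.

```cpp
#include <cassert>
#include <cstdint>

// Illustrative stand-ins only: they mimic the sType/pNext header that
// extensible Vulkan structs begin with, but are not the real Vulkan types.
enum class StructType : std::uint32_t {
    SurfaceCapabilities2,
    NativeHdrCapabilities,
    HdrMetadata,
};

struct ChainNode {
    StructType sType;
    ChainNode* pNext;
};

struct HdrMetadata : ChainNode {};
struct NativeHdrCapabilities : ChainNode {};
struct SurfaceCapabilities2 : ChainNode {};

// Walks the chain from `head` and returns the first node of the wanted
// type, or nullptr if the chain ends without one.
ChainNode* findInChain(ChainNode* head, StructType wanted) {
    for (ChainNode* node = head; node != nullptr; node = node->pNext) {
        if (node->sType == wanted) {
            return node;
        }
    }
    return nullptr;
}

// Chain as in the sample: capabilities -> native HDR caps -> HDR metadata.
bool sampleChainHasNativeHdr() {
    HdrMetadata metadata;
    metadata.sType = StructType::HdrMetadata;
    metadata.pNext = nullptr;

    NativeHdrCapabilities nativeHdr;
    nativeHdr.sType = StructType::NativeHdrCapabilities;
    nativeHdr.pNext = &metadata;

    SurfaceCapabilities2 caps;
    caps.sType = StructType::SurfaceCapabilities2;
    caps.pNext = &nativeHdr;

    return findInChain(&caps, StructType::NativeHdrCapabilities) != nullptr;
}

// Shorter chain: capabilities -> HDR metadata, with no native HDR node.
bool shortChainHasNativeHdr() {
    HdrMetadata metadata;
    metadata.sType = StructType::HdrMetadata;
    metadata.pNext = nullptr;

    SurfaceCapabilities2 caps;
    caps.sType = StructType::SurfaceCapabilities2;
    caps.pNext = &metadata;

    return findInChain(&caps, StructType::NativeHdrCapabilities) != nullptr;
}

// Usage: the sample-style chain contains the node, the shorter one does not.
void demoChains() {
    assert(sampleChainHasNativeHdr());
    assert(!shortChainHasNativeHdr());
}
```

Note that the walk only terminates cleanly because the last node's pNext is nullptr; every struct in the chain needs both fields set.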

 

JulianXhokaxhiu
Journeyman III

Hello Owen,

Thanks for the reply. As a matter of fact, I tried my code both with and without that block, and it crashes the same way. Nevertheless, for the sake of completeness, I added it to my code as well, and I'm getting the exact same crash.

Also, for your information, my previous code iteration ran fine on AMD driver releases prior to 22.7.1, which makes me think something broke on your end starting with that release.

Here is the updated stack trace after adding VkDisplayNativeHdrSurfaceCapabilitiesAMD:

 

 

 	amdvlk64.dll!00007ffb58903e00()	Unknown
 	vulkan-1.dll!00007ffbc575df96()	Unknown
>	example-02-metaballsDebug.exe!bgfx::vk::SwapChainVK::createSwapChain() Line 6813	C++
 	example-02-metaballsDebug.exe!bgfx::vk::SwapChainVK::create(VkCommandBuffer_T * _commandBuffer, void * _nwh, const bgfx::Resolution & _resolution, bgfx::TextureFormat::Enum _depthFormat) Line 6501	C++
 	example-02-metaballsDebug.exe!bgfx::vk::FrameBufferVK::create(unsigned short _denseIdx, void * _nwh, unsigned int _width, unsigned int _height, bgfx::TextureFormat::Enum _format, bgfx::TextureFormat::Enum _depthFormat) Line 7472	C++
 	example-02-metaballsDebug.exe!bgfx::vk::RendererContextVK::init(const bgfx::Init & _init) Line 1861	C++
 	example-02-metaballsDebug.exe!bgfx::vk::rendererCreate(const bgfx::Init & _init) Line 4437	C++
 	example-02-metaballsDebug.exe!bgfx::rendererCreate(const bgfx::Init & _init) Line 2757	C++
 	example-02-metaballsDebug.exe!bgfx::Context::rendererExecCommands(bgfx::CommandBuffer & _cmdbuf) Line 2808	C++
 	example-02-metaballsDebug.exe!bgfx::Context::renderFrame(int _msecs) Line 2445	C++
 	example-02-metaballsDebug.exe!bgfx::renderFrame(int _msecs) Line 1480	C++
 	example-02-metaballsDebug.exe!entry::Context::run(int _argc, const char * const * _argv) Line 534	C++
 	example-02-metaballsDebug.exe!main(int _argc, const char * const * _argv) Line 1187	C++
 	[External Code]	

 

 

Can you please double check on your side where that pointer ends up in the driver code?

I am currently on AMD release 22.8.2.

Thank you in advance,
Julian

Solution Architect/Engineer
https://julianxhokaxhiu.com/

Thanks for the update JulianXhokaxhiu, I've created internal ticket SWDEV-360445 and we're looking into it.

-Owen


Hi Julian

Your HDR branch changed the type of m_surface from VkSurfaceKHR to VkPhysicalDeviceSurfaceInfo2KHR. Although you updated the rest of the code to accommodate this change, you appear to have missed setting the sType and pNext members of the new struct, which causes segfaults in the validation layers and in our driver. Setting m_surface's sType to VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_SURFACE_INFO_2_KHR and pNext to null in the createSurface method resolves this issue.

Thanks,

Owen

(Issue was resolved by Ramel Mammo)

 


Hello Owen,

First of all, thank you very much for the kind reply. I tried to apply the changes you suggested ( https://github.com/julianxhokaxhiu/bgfx/blob/feature/vulkan-hdr/src/renderer_vk.cpp#L6722-L6723 ), but when running this change on top of the latest AMD driver, 22.10.3, I get this error:

 

Exception thrown at 0x50E65AF2 (amdvlk32.dll) in example-09-hdrDebug.exe: 0xC0000005: Access violation reading location 0xCDCDCDCD.

 

Any idea what is going on here? This code works fine under Nvidia w/Vulkan.

Thank you in advance and best regards,
Julian

//EDIT: OK, it seems I can trigger this issue only in debug builds; a release build works just fine. Something weird is still happening here, though. Can you please double check that there is no leftover code on your side that triggers this behavior? I don't see why it works fine in Release but not in Debug. Thanks!

Solution Architect/Engineer
https://julianxhokaxhiu.com/

Hey Julian,

 

From Ramel Mammo:

 

It appears you have applied the change to the wrong struct. It looks like you set the sType and pNext for VkWin32SurfaceCreateInfoKHR, but you need to set the sType and pNext of the m_surface member, like so:

VkResult SwapChainVK::createSurface() {
    m_surface.sType = VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_SURFACE_INFO_2_KHR;
    m_surface.pNext = nullptr;
    ...
}

The issue is that VkPhysicalDeviceSurfaceInfo2KHR m_surface is declared in the header file, but its sType and pNext are never set. This happens to work in release builds because the structure's memory ends up zeroed, whereas in debug builds it keeps garbage values.
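
As a rough illustration of that debug/release difference, here is a minimal sketch. It assumes m_surface lives in heap memory that the MSVC debug runtime fills with 0xCD bytes, which would match the 0xCDCDCDCD address in the exception earlier in the thread. SurfaceInfo2Like, kSurfaceInfo2Type, fillLikeDebugHeap, initHeader and headerIsValid are hypothetical names used only for this example; they are not bgfx or Vulkan code.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Hypothetical stand-in with the same leading fields as
// VkPhysicalDeviceSurfaceInfo2KHR; not the real Vulkan type.
struct SurfaceInfo2Like {
    std::uint32_t sType;
    const void*   pNext;
    std::uint64_t surface;
};

// Placeholder value for illustration; real code would use
// VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_SURFACE_INFO_2_KHR.
constexpr std::uint32_t kSurfaceInfo2Type = 1;

// Simulates what a never-initialized member can look like in a debug build:
// every byte is 0xCD, so pNext becomes a non-null garbage pointer.
void fillLikeDebugHeap(SurfaceInfo2Like& info) {
    std::memset(&info, 0xCD, sizeof(info));
}

// The fix from the thread: set the header fields explicitly.
void initHeader(SurfaceInfo2Like& info) {
    info.sType = kSurfaceInfo2Type;
    info.pNext = nullptr;
}

// A consumer can only follow pNext safely if the header is well-formed.
bool headerIsValid(const SurfaceInfo2Like& info) {
    return info.sType == kSurfaceInfo2Type && info.pNext == nullptr;
}

void demoDebugFill() {
    SurfaceInfo2Like info{};      // zeroed memory, as in the lucky release case
    assert(info.pNext == nullptr);

    fillLikeDebugHeap(info);      // debug-like garbage in every field
    assert(!headerIsValid(info));

    initHeader(info);             // explicit initialization fixes both cases
    assert(headerIsValid(info));
}
```

With zeroed memory pNext is already null, so a chain walk stops immediately; with the 0xCD pattern the same walk would dereference a garbage address. Setting both fields explicitly removes the dependence on how the memory happened to be filled.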

 

Thanks,

Owen


Thanks a lot for the reply. Indeed I noticed that just setting

m_surface.pNext = nullptr;

is enough to make it work, but setting them both is better to ensure the Vulkan code is validated correctly 🙂

Thanks a lot, it has been quite a rollercoaster but I'm glad we found a way out. Appreciated!

Solution Architect/Engineer
https://julianxhokaxhiu.com/