
dutta
Adept II

Descriptor set layouts and shader matching

I have an issue where the behavior is inconsistent between the AMD implementation and the Nvidia one when creating descriptor set layouts. I have shader code which uses two descriptor sets, qualified with layout(set=0) and layout(set=5) respectively. The issue is that when I create descriptor set layouts from these shaders, on AMD I need to create a layout for every set from 0 through 5, where sets 1, 2, 3 and 4 are just empty layouts with 0 bindings. If I don't, the vkCreate*Pipelines calls cause an access violation at some later stage. On Nvidia, I have to create layouts only for sets 0 and 5, otherwise the same function produces the same error.

Gathering from the documentation, the only 'required' behaviour for the set qualifier is to determine which descriptor set is bound where when calling vkCmdBindDescriptorSets, where in my case I would bind 0, NULL, NULL, NULL, NULL, 5. However, it seems like the pipeline linkage tries to match the descriptor set layouts against the shader code by looking at the set qualifier (used as-is as an index on AMD, but compacted incrementally on Nvidia) and uses it as an offset into the pBindings passed to vkCreateDescriptorSetLayout, assuming they match, but that's just my speculation.

However, this is an edge case where there are 'gaps' in the descriptor set numbering, and I can't find anything in the specification about how this is supposed to work, so it would be nice to know what I can assume is the right way.
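
Roughly what this looks like on my side, with placeholder names (simplified, not my actual code):

    // Placeholder sketch: set0Bindings is the VkDescriptorSetLayoutBinding array
    // for the set the shader declares as layout(set=0); likewise for set 5.
    VkDescriptorSetLayoutCreateInfo info = {};
    info.sType        = VK_STRUCTURE_TYPE_DESCRIPTOR_SET_LAYOUT_CREATE_INFO;
    info.bindingCount = set0BindingCount;
    info.pBindings    = set0Bindings;

    VkDescriptorSetLayout set0Layout;
    vkCreateDescriptorSetLayout(device, &info, NULL, &set0Layout);
    // ...and the same again with set5Bindings to get set5Layout.

    // On Nvidia, these two layouts are all that is needed. On AMD, I additionally
    // have to create empty layouts (bindingCount = 0, pBindings = NULL) for sets
    // 1..4, otherwise pipeline creation later crashes with an access violation.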

0 Likes
8 Replies
dwitczak
Staff

Thank you for your message. According to the Vulkan 1.0.6 Specification (Vulkan 1.0.6 + WSI Extensions - A Specification):

vkCmdBindDescriptorSets causes the sets numbered [firstSet.. firstSet+descriptorSetCount-1] to use the bindings stored in pDescriptorSets[0..descriptorSetCount-1] for subsequent rendering commands (either compute or graphics, according to the pipelineBindPoint). Any bindings that were previously applied via these sets are no longer valid.

From the language above, I'd say that even though your app only needs to update a total of 2 descriptor set bindings, it would still need to pass a total of 6 DS bindings in the <pDescriptorSets> array, with the items at indices 1..4 (inclusive) set to NULL. It sounds like the behavior you're seeing with our driver is valid, then.

And I assume the sets passed to vkCmdBindDescriptorSets must match the layout created from vkCreateDescriptorSetLayout, in that the pDescriptorSets entries passed must have identical indices. The issue is not with vkCmdBindDescriptorSets (as far as I know yet), but with creating the array of descriptor set layouts: it only creates one descriptor set layout per set actually declared in the shader, instead of creating a descriptor set layout for the entire range, where sets 1..4 inclusive would have their pBindings set to VK_NULL_HANDLE and bindingCount set to 0.


The issue I had was that when creating the range 1..4 of empty descriptor set layouts I would get an access violation on the Nvidia driver, while on the AMD driver I had to create them. An easy way to resolve this is to just change the set index from 5 to 1 so the range is contiguous, but I can't find anywhere that a contiguous range of descriptor sets is in any way required, which is why I'm confused as to what the correct behavior is. The access violation only occurs when trying to create a pipeline object using the pipeline layout created from the descriptor set range.
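
For reference, this is roughly how I create those empty layouts (simplified, placeholder names):

    // An 'empty' descriptor set layout: a valid handle that contains no bindings.
    VkDescriptorSetLayoutCreateInfo emptyInfo = {};
    emptyInfo.sType        = VK_STRUCTURE_TYPE_DESCRIPTOR_SET_LAYOUT_CREATE_INFO;
    emptyInfo.bindingCount = 0;
    emptyInfo.pBindings    = NULL;

    VkDescriptorSetLayout emptyLayout;
    vkCreateDescriptorSetLayout(device, &emptyInfo, NULL, &emptyLayout);
    // This handle is then used for each of the 'gap' sets 1..4 in the pipeline layout.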

0 Likes

Let me check if we're on the same page:

1. You're seeing a behavioral difference between our driver and the competitor's.

2. This happens for a scenario where a certain shader stage defines two descriptor set bindings: one at index 0 and the other one at index 5.

3. When creating a pipeline layout, your observation is that, on our driver, you need to create a pipeline where all descriptor set layouts in range from 0 to 5 inclusive point to relevant VkDescriptorSetLayout instances, in order for the compute/GFX pipeline creation to succeed without crashing. On the competitor's driver, following this pattern causes a crash because it seems like their driver has an issue with descriptor set layouts whose descriptorCount field is set to 0.

4. Finally, when binding the descriptor sets, the "continuity" (in the sense of language I cited in my first response) of the range of descriptor sets you need to bind appears to be defined by the pipeline layout.

If my understanding above is correct, the third bullet could indicate an issue in our driver. I'm going to try to reproduce it locally shortly and will get back to you.

0 Likes

Precisely! I can't really speak for how vkCmdBindDescriptorSets behaves just yet; I just assumed the correct way to handle unused descriptor sets when binding was to simply pass a null handle.

0 Likes

I think I know where the problem's coming from. In the "Valid Usage" section for vkCmdBindDescriptorSets(), there's the following language:

* pDescriptorSets must be a pointer to an array of descriptorSetCount valid VkDescriptorSet handles

However, you earlier said that:

"vkCmdBindDescriptorSets, where in my case I would bind 0, NULL, NULL, NULL, NULL, 5. "

It is illegal to pass NULL VkDescriptorSet handles via the <pDescriptorSets> argument of the vkCmdBindDescriptorSets() function. What you should do instead is initialize a "dummy" descriptor set, encapsulating zero descriptors, and use that descriptor set handle instead of the NULL values.

Edit: I forgot to add: I checked today, and pipelines which have "gaps" in their descriptor set layout bindings work as expected with our driver, as long as the descriptor set and pipeline layouts are correctly configured (that is, the pipeline layouts are set up correctly and no NULL handles are passed at binding time).
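
As a rough sketch of what I mean (placeholder names; descriptorPool, emptyLayout, set0 and set5 stand for the application's own objects):

    // Allocate a "dummy" descriptor set, containing zero descriptors, from an
    // empty descriptor set layout (one created with bindingCount = 0).
    VkDescriptorSetAllocateInfo allocInfo = {};
    allocInfo.sType              = VK_STRUCTURE_TYPE_DESCRIPTOR_SET_ALLOCATE_INFO;
    allocInfo.descriptorPool     = descriptorPool;
    allocInfo.descriptorSetCount = 1;
    allocInfo.pSetLayouts        = &emptyLayout;

    VkDescriptorSet dummySet;
    vkAllocateDescriptorSets(device, &allocInfo, &dummySet);

    // Use the dummy set instead of NULL for the unused sets 1..4 at bind time.
    VkDescriptorSet sets[6] = { set0, dummySet, dummySet, dummySet, dummySet, set5 };
    vkCmdBindDescriptorSets(cmdBuffer, VK_PIPELINE_BIND_POINT_GRAPHICS,
                            pipelineLayout, 0, 6, sets, 0, NULL);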

0 Likes

Yes, I know they work with the AMD driver, but I was wondering what is to be considered the correct behavior, because having 'gaps' is the only acceptable way for some of my shaders with the AMD driver, while having no 'gaps' is the only option on the Nvidia driver. I can't find any documentation which suggests what the correct behavior is supposed to be. Does pSetLayouts (used for vkCreatePipelineLayout) have to match the layout(set=X) qualifier in the shader, so that each pSetLayouts entry matches the descriptor set described in the shader?

In the Nvidia driver it seems that pSetLayouts[0] matches set 0 and pSetLayouts[1] matches set 5, because the sets are not indexed directly into pSetLayouts but instead seem to work like a compacted list. On the AMD driver, however, it seems like pSetLayouts must actually match the descriptor sets defined with set=X in the shader, meaning that pSetLayouts[3] must correspond to any descriptor using layout(set=3), and if no such entry exists it causes an access violation. It seems like the number of descriptor set layouts the AMD driver expects follows the maximum value of the set qualifiers, whereas the Nvidia driver expects one descriptor set layout for each unique set qualifier present.

The AMD driver works just fine with the 'gaps' thing, that's not the issue. The issue is what to expect from the Vulkan spec, because as long as I know what is expected to work I can stick with that and just wait for it to be patched in the other driver.

I have the code on GitHub; if you want to have a look, it's in the function SetupDescriptorLayout(AnyFX::ShaderEffect* effect). AMD_DESC_SETS is set to 1 when we want it to work on the AMD driver.


nebula-trifid/vkshaderprogram.cc at vulkan · gscept/nebula-trifid · GitHub

I will take into consideration that binding NULL to vkCmdBindDescriptorSets is not allowed. It should be easy to just create a dummy descriptor set with 0 bindings and use that whenever I need to fill the range with 'NULL' descriptor sets.


Edit: I think I messed up a bit: I was talking about vkCreateDescriptorSetLayout and pBindings when I actually meant vkCreatePipelineLayout and pSetLayouts. I apologize for the obvious mistake. The access violation happens when creating pipeline objects, and it goes away if the pipeline layout conforms to whichever of the two schemes the driver in question expects.

0 Likes

If the range of descriptor sets your application uses is not contiguous, the missing descriptor sets still count. For each such descriptor set, you need to include a stub definition when filling the VkPipelineLayoutCreateInfo structure, where by "stub" I mean a valid instance of VkDescriptorSetLayout which does not contain any descriptors.

We believe our driver's behavior is correct. Our understanding stems from the fact that the Vulkan specification does not recognize the concept of descriptor set numbering discontinuity, and in particular does not deem it invalid. Please consider raising this issue with the Khronos consortium, perhaps by filing an issue under Issues · KhronosGroup/Vulkan-Docs · GitHub.

I'd like to reiterate that this use case indicates an issue in the application.
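
A minimal sketch of that stub approach, with placeholder names (set0Layout and set5Layout being the layouts for the sets the shader actually declares, stubLayout being a valid VkDescriptorSetLayout created with bindingCount = 0):

    // One pSetLayouts entry per set number 0..5; the unused sets 1..4 get the stub.
    VkDescriptorSetLayout setLayouts[6] = {
        set0Layout, stubLayout, stubLayout, stubLayout, stubLayout, set5Layout
    };

    VkPipelineLayoutCreateInfo layoutInfo = {};
    layoutInfo.sType          = VK_STRUCTURE_TYPE_PIPELINE_LAYOUT_CREATE_INFO;
    layoutInfo.setLayoutCount = 6;
    layoutInfo.pSetLayouts    = setLayouts;

    VkPipelineLayout pipelineLayout;
    vkCreatePipelineLayout(device, &layoutInfo, NULL, &pipelineLayout);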

0 Likes

Yes, I agree that this use case is an issue, and it's basically an edge case which is handled neither by the Khronos reference compiler nor by the LunarG validation layer. The problem is not really with the drivers; the compiler should pick this up and report it as an error. For the record, I had no intention of pitching the driver as faulty compared to the Nvidia one, but merely to show that there is no established 'right' way to handle this.

I will try to raise this with Khronos. Hopefully the reference compiler can implement validation for discontinuous sets.

Thank you for the exchange!

0 Likes