0 Replies Latest reply on Aug 28, 2015 5:53 PM by quiret

    Direct3D 12 problem with Descriptor Table

    quiret

      Hello GPU developers!

       

      I came here to ask about a problem that I encounter with Direct3D 12. Since AMD has created this beautiful world of low-level graphics APIs for the PC, I think this is the right place to ask. I am just hoping that you guys know enough about Microsoft's mapping to your Mantle API.

       

      Right. So I do not entirely grasp the flexibility guarantees of a Descriptor Table. From the docs I read that it is supposed to describe the access patterns that HLSL shaders will use on a Descriptor Heap. So I try to create such a table with two ranges, a one-element CBV range and a one-element UAV range, and the application does not do what I would expect it to do. It is a shame, because I usually have no problems with assembler, low-level programming and memory management, so I think there is a catch somewhere...

       

      Here is the code that I use to create the Root Signature.

      // Create a root signature.
      {
          CD3DX12_DESCRIPTOR_RANGE ranges[2];
          CD3DX12_ROOT_PARAMETER rootParameters[1];
      
          ranges[1].Init(D3D12_DESCRIPTOR_RANGE_TYPE_CBV, 1, 0, 0, 0);
          ranges[0].Init(D3D12_DESCRIPTOR_RANGE_TYPE_UAV, 1, 1, 0, 1);
          rootParameters[0].InitAsDescriptorTable(_countof(ranges), ranges, D3D12_SHADER_VISIBILITY_PIXEL);
      
          // Allow input layout and deny unnecessary access to certain pipeline stages.
          D3D12_ROOT_SIGNATURE_FLAGS rootSignatureFlags =
              D3D12_ROOT_SIGNATURE_FLAG_ALLOW_INPUT_ASSEMBLER_INPUT_LAYOUT |
              D3D12_ROOT_SIGNATURE_FLAG_DENY_HULL_SHADER_ROOT_ACCESS |
              D3D12_ROOT_SIGNATURE_FLAG_DENY_DOMAIN_SHADER_ROOT_ACCESS |
              D3D12_ROOT_SIGNATURE_FLAG_DENY_GEOMETRY_SHADER_ROOT_ACCESS;// |
              //D3D12_ROOT_SIGNATURE_FLAG_DENY_PIXEL_SHADER_ROOT_ACCESS;
      
          CD3DX12_ROOT_SIGNATURE_DESC rootSignatureDesc;
          rootSignatureDesc.Init(_countof(rootParameters), rootParameters, 0, nullptr, rootSignatureFlags);
      
          ComPtr<ID3DBlob> signature;
          ComPtr<ID3DBlob> error;
          ThrowIfFailed(D3D12SerializeRootSignature(&rootSignatureDesc, D3D_ROOT_SIGNATURE_VERSION_1, &signature, &error));
          ThrowIfFailed(m_device->CreateRootSignature(0, signature->GetBufferPointer(), signature->GetBufferSize(), IID_PPV_ARGS(&m_rootSignature)));
      }
      

       

      I think the above code creates the mapping displayed in this diagram.

      descriptor_table_mapping_problem.png
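
To make the mapping I expect concrete, here is a small stand-alone C++ model of it. `TableRange`, `resolveSlot` and `myTable` are names made up for this sketch and are not part of D3D12. The sketch only encodes my assumption that a range places its base register at `offsetInTable` descriptors from the table start, so it shows what I expect, not what the driver actually does.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical stand-ins for illustration only; these are not D3D12 types.
enum class RangeType { CBV, UAV };

struct TableRange {
    RangeType type;           // register class: b# for CBV, u# for UAV
    uint32_t  numDescriptors; // how many consecutive registers the range covers
    uint32_t  baseRegister;   // first shader register in the range
    uint32_t  offsetInTable;  // slot of baseRegister, counted from the table start
};

// Returns the slot (relative to the table's base descriptor) that a shader
// register resolves to, or -1 if no range covers that register.
inline int resolveSlot(const std::vector<TableRange>& table,
                       RangeType type, uint32_t shaderRegister)
{
    for (const TableRange& r : table) {
        if (r.type == type &&
            shaderRegister >= r.baseRegister &&
            shaderRegister - r.baseRegister < r.numDescriptors) {
            return static_cast<int>(r.offsetInTable + (shaderRegister - r.baseRegister));
        }
    }
    return -1;
}

// The two ranges from the root signature above, in the same array order.
inline std::vector<TableRange> myTable()
{
    return {
        { RangeType::UAV, 1, 1, 1 }, // ranges[0]: u1 -> slot 1
        { RangeType::CBV, 1, 0, 0 }, // ranges[1]: b0 -> slot 0
    };
}
```

Under this model, b0 resolves to slot 0 and u1 to slot 1 regardless of the order of the two ranges in the array, because both offsets are given explicitly.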

      In the diagram you can see the HLSL declarations I use to access the GPU memory. I have purposefully used committed upload-heap buffers, so that I can inspect on the CPU the values that the GPU has written. And for some reason the GPU redirects every access to "mybuf" into the CBV that is mapped to register b0!

       

      // Create descriptor heaps.
      {
          // Describe and create a render target view (RTV) descriptor heap.
          D3D12_DESCRIPTOR_HEAP_DESC rtvHeapDesc = {};
          rtvHeapDesc.NumDescriptors = FrameCount;
          rtvHeapDesc.Type = D3D12_DESCRIPTOR_HEAP_TYPE_RTV;
          rtvHeapDesc.Flags = D3D12_DESCRIPTOR_HEAP_FLAG_NONE;
          ThrowIfFailed(m_device->CreateDescriptorHeap(&rtvHeapDesc, IID_PPV_ARGS(&m_rtvHeap)));
      
          m_rtvDescriptorSize = m_device->GetDescriptorHandleIncrementSize(D3D12_DESCRIPTOR_HEAP_TYPE_RTV);
      
          // Describe and create a constant buffer view (CBV) descriptor heap.
          // Flags indicate that this descriptor heap can be bound to the pipeline 
          // and that descriptors contained in it can be referenced by a root table.
          D3D12_DESCRIPTOR_HEAP_DESC cbvHeapDesc = {};
          cbvHeapDesc.NumDescriptors = 2;
          cbvHeapDesc.Flags = D3D12_DESCRIPTOR_HEAP_FLAG_SHADER_VISIBLE;
          cbvHeapDesc.Type = D3D12_DESCRIPTOR_HEAP_TYPE_CBV_SRV_UAV;
          ThrowIfFailed(m_device->CreateDescriptorHeap(&cbvHeapDesc, IID_PPV_ARGS(&m_cbvHeap)));
      
          m_cbvUavDescriptorSize = m_device->GetDescriptorHandleIncrementSize(D3D12_DESCRIPTOR_HEAP_TYPE_CBV_SRV_UAV);
      }
      

       

      What is important is the constant buffer view descriptor heap (m_cbvHeap). I basically say that I want a heap that can hold two descriptors, so I assume it is an array of two descriptors that can be offset. Let's move on.
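
The "array of descriptors that can be offset" assumption boils down to simple handle arithmetic. Below is a minimal stand-alone sketch of it. `CpuHandle` and `slotHandle` are hypothetical names for illustration, and `incrementSize` stands in for the value returned by `GetDescriptorHandleIncrementSize`. As far as I understand, this is the same computation that the `CD3DX12_CPU_DESCRIPTOR_HANDLE` constructor taking an offset and an increment size performs.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical stand-in for a CPU descriptor handle: an address-like value.
struct CpuHandle {
    std::size_t ptr;
};

// Handle of slot `index` in a heap whose first descriptor is at `heapStart`.
// `incrementSize` is the per-descriptor stride for the heap type.
inline CpuHandle slotHandle(CpuHandle heapStart, uint32_t index, uint32_t incrementSize)
{
    return CpuHandle{ heapStart.ptr + static_cast<std::size_t>(index) * incrementSize };
}
```

With a heap start of 1000 and an increment of 32 (both made-up numbers), slot 0 sits at 1000 and slot 1 at 1032.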

       

      Here is the code that I use to set up the heap... and later bind it to the pipeline.

      // Create the GPU memory.
      {
          ThrowIfFailed(m_device->CreateCommittedResource(
              &CD3DX12_HEAP_PROPERTIES(D3D12_HEAP_TYPE_UPLOAD),
              D3D12_HEAP_FLAG_NONE,
              &CD3DX12_RESOURCE_DESC::Buffer(1024 * 64),
              D3D12_RESOURCE_STATE_GENERIC_READ,
              nullptr,
              IID_PPV_ARGS(&m_constantBuffer)));
      
          // Describe and create a constant buffer view.
          D3D12_CONSTANT_BUFFER_VIEW_DESC cbvDesc = {};
          cbvDesc.BufferLocation = m_constantBuffer->GetGPUVirtualAddress();
          cbvDesc.SizeInBytes = (sizeof(ConstantBuffer) + 255) & ~255;    // CB size is required to be 256-byte aligned.
      
          CD3DX12_CPU_DESCRIPTOR_HANDLE handle01( m_cbvHeap->GetCPUDescriptorHandleForHeapStart(), 0, this->m_cbvUavDescriptorSize );
      
          cbvDescriptor = handle01;
      
          m_device->CreateConstantBufferView(&cbvDesc, handle01);
      
          // Initialize and map the constant buffers. We don't unmap this until the
          // app closes. Keeping things mapped for the lifetime of the resource is okay.
          ZeroMemory(&m_constantBufferData, sizeof(m_constantBufferData));
          ThrowIfFailed(m_constantBuffer->Map(0, nullptr, reinterpret_cast<void**>(&m_pCbvDataBegin)));
          memcpy(m_pCbvDataBegin, &m_constantBufferData, sizeof(m_constantBufferData));
      
          size_t texBufSize = ( sizeof(DepthColorSortSample) * this->m_width * this->m_height + 255 ) & ~255;
      
          ThrowIfFailed(m_device->CreateCommittedResource(
              &CD3DX12_HEAP_PROPERTIES(D3D12_HEAP_TYPE_UPLOAD),
              D3D12_HEAP_FLAG_NONE,
              &CD3DX12_RESOURCE_DESC::Buffer(texBufSize),
              D3D12_RESOURCE_STATE_GENERIC_READ,
              nullptr,
              IID_PPV_ARGS(&m_uavEndBuffer)));
      
          D3D12_UNORDERED_ACCESS_VIEW_DESC uavDesc = {};
          uavDesc.ViewDimension = D3D12_UAV_DIMENSION_BUFFER;
          uavDesc.Format = DXGI_FORMAT_UNKNOWN;
          uavDesc.Buffer.FirstElement = 0;
          uavDesc.Buffer.CounterOffsetInBytes = 0;
          uavDesc.Buffer.Flags = D3D12_BUFFER_UAV_FLAG_NONE;
          uavDesc.Buffer.NumElements = ( m_width * m_height );
          uavDesc.Buffer.StructureByteStride = sizeof(DepthColorSortSample);
      
          CD3DX12_CPU_DESCRIPTOR_HANDLE handle02( m_cbvHeap->GetCPUDescriptorHandleForHeapStart(), 1, this->m_cbvUavDescriptorSize );
      
          uavDescriptor = handle02;
              
          m_device->CreateUnorderedAccessView( m_uavEndBuffer.Get(), NULL, &uavDesc, handle02 );
      
          // Map for eternity.
          ThrowIfFailed(m_uavEndBuffer->Map(0, nullptr, (void**)&m_uavEndDataBegin));
      
          // Zero out.
          memset( m_uavEndDataBegin, 0, texBufSize );
      }
      

       

      This code makes use of the assumption that m_cbvHeap is an array of two descriptors, so I put the CBV at slot 0 and the UAV at slot 1 (a.k.a. offset 0 into the heap and offset 1 into the heap). Now if we go back to the diagram, it says that offset 0 should contain the pointer to the CBV and offset 1 should contain the pointer to the UAV, but the output makes no sense!
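
As a side note on the setup code above: both the CBV size and the UAV buffer size are rounded up with `(n + 255) & ~255`. Here is a minimal sketch of that rounding as a helper, so it is clear what sizes end up in the descriptors. `alignUp256` is a name I made up for this post.

```cpp
#include <cstddef>

// Rounds `size` up to the next multiple of 256, equivalent to the
// `(size + 255) & ~255` expression used in the setup code.
constexpr std::size_t alignUp256(std::size_t size)
{
    return (size + 255) & ~static_cast<std::size_t>(255);
}

// Compile-time checks of the rounding behavior.
static_assert(alignUp256(1) == 256, "rounds up to the next multiple");
static_assert(alignUp256(256) == 256, "already aligned sizes are unchanged");
static_assert(alignUp256(257) == 512, "one past a boundary goes to the next one");
```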

       

      And then there is the point where I map the Descriptor Heap and the resources to the pipeline. Here is the code.

      // Create and record the bundle.
      {
          ThrowIfFailed(m_device->CreateCommandList(0, D3D12_COMMAND_LIST_TYPE_BUNDLE, m_bundleAllocator.Get(), m_pipelineState.Get(), IID_PPV_ARGS(&m_bundle)));
          m_bundle->SetDescriptorHeaps(1, m_cbvHeap.GetAddressOf());
          m_bundle->SetGraphicsRootSignature(m_rootSignature.Get());
          m_bundle->IASetPrimitiveTopology(D3D_PRIMITIVE_TOPOLOGY_TRIANGLELIST);
          m_bundle->IASetVertexBuffers(0, 1, &m_vertexBufferView);
          m_bundle->SetGraphicsRootDescriptorTable(0, m_cbvHeap->GetGPUDescriptorHandleForHeapStart());
          m_bundle->DrawInstanced(3, 1, 0, 0);
          ThrowIfFailed(m_bundle->Close());
      }
      

       

      I am confused because if I modify my application so that I put the UAV descriptor directly into the root signature using ID3D12GraphicsCommandList::SetGraphicsRootUnorderedAccessView, it works! Why would the Descriptor Table approach, which Microsoft says is probably the best for broadest compatibility, fail?

       

      Here is the full HLSL code, in case you guys need it.

      cbuffer ConstantBuffer : register(b0)
      {
          float2 screensize;
      };
      
      struct SortedColorItem
      {
          float4 color;
          float depth;
      };
      
      struct BufferInput
      {
          SortedColorItem colorItems[8];
          float isUsed;
      };
      
      RWStructuredBuffer <BufferInput> mybuf : register(u1);
      
      struct PSInput
      {
          float4 position : SV_POSITION;
          float4 color : COLOR;
      };
      
      PSInput VSMain(float4 position : POSITION, float4 color : COLOR)
      {
          PSInput result;
      
          result.position = position;
          result.color = color;
      
          return result;
      }
      
      float4 PSMain(PSInput input) : SV_TARGET
      {
          float texBufferCoord = screensize[0] * input.position.y + input.position.x;
      
          BufferInput newInput = mybuf[texBufferCoord];
      
          float gColor;
      
          if ( newInput.isUsed == 0.0f )
          {
              newInput.isUsed = 1.0;
              newInput.colorItems[0].color.x++;
      
              gColor = 1.0f;
          }
          else
          {
              newInput.isUsed = 0.0f;
      
              gColor = 0.0f;
          }
      
          mybuf[texBufferCoord] = newInput;
      
          float red = input.position.x / screensize[0];
      
          return float4(red, gColor, ( newInput.colorItems[0].color.x / 50.0f ) % 1.0f, 1.0);
      }
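
For comparing the shader-side element layout with the StructureByteStride set in the UAV description, here is a C++ mirror of the two HLSL structs. This is only a sketch: it assumes that structured-buffer elements are packed with 4-byte scalars and no cbuffer-style 16-byte padding, and the C++ struct names simply copy the HLSL ones. My real code uses DepthColorSortSample, which is not shown in this post. `bufferBytes` is a made-up helper.

```cpp
#include <cstddef>

// C++ mirror of the HLSL SortedColorItem (float4 color; float depth).
struct SortedColorItem {
    float color[4];
    float depth;
};

// C++ mirror of the HLSL BufferInput (8 items followed by one float).
struct BufferInput {
    SortedColorItem colorItems[8];
    float isUsed;
};

// Sizes under the tight-packing assumption stated above.
static_assert(sizeof(SortedColorItem) == 20, "4 floats + 1 float");
static_assert(sizeof(BufferInput) == 164, "8 * 20 bytes + 1 float");

// Bytes needed to hold one BufferInput element per pixel.
constexpr std::size_t bufferBytes(std::size_t width, std::size_t height)
{
    return sizeof(BufferInput) * width * height;
}
```

If that assumption holds, one element per pixel needs 164 bytes, so a 2x2 target needs 656 bytes before any alignment.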
      

       

      I am grateful for any response that helps me solve this problem.

       

      If you need anything more from me (a sample, screenshots, etc.), just tell me.