Transfer queue copy operations

I'm attempting to copy staged data from a VkBuffer or VkImage bound in host-visible memory into a VkImage in device-local memory.

vkCmdCopyBuffer appears to work with any queue family.

However, I've only been able to get vkCmdCopyBufferToImage and vkCmdCopyImage to work in a queue family with graphics enabled.

The spec reads:

The VkCommandPool that commandBuffer was allocated from must support transfer, graphics, or compute operations.

and...

VK_PIPELINE_STAGE_TRANSFER_BIT: Execution of copy commands. This includes the operations resulting from all transfer commands. The set of transfer commands comprises vkCmdCopyBuffer, vkCmdCopyImage, vkCmdBlitImage, vkCmdCopyBufferToImage, vkCmdCopyImageToBuffer, vkCmdUpdateBuffer, vkCmdFillBuffer, vkCmdClearColorImage, vkCmdClearDepthStencilImage, vkCmdResolveImage, and vkCmdCopyQueryPoolResults.

This suggests to me that transfer queue families should support these commands.

The obvious workaround is to run vkCmdCopyBuffer (the host-to-GPU copy) on the transfer queue family and vkCmdCopyBufferToImage (the copy within local GPU memory) on the graphics queue family. This avoids stalling the graphics queue, since local memory copies are faster than PCIe transfers.

Is this expected behavior?

  • Hardware: AMD GPU (R9 Nano)
  • Requested API: 1.0.13
  • Radeon driver: 16.4.2
  • Windows 10
dwitczak
Staff

Thank you for your report. This does not seem to be legitimate driver behavior. Would you mind providing an example application which reproduces this issue? Modifying the SDK's cube application to do the job would be just fine.

Attached is a simple application which displays a 24-bit 512x512 BMP file using one of four methods, selected by the variables enable_transfer_queue and use_buffer_copy:

enable_transfer_queue makes use of the first available queue family for transfer operations only.
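
To make the selection concrete, here is a minimal sketch of how a transfer-only queue family could be picked. It is not the code from the attached application. The constants are local stand-ins that mirror the values of VK_QUEUE_GRAPHICS_BIT, VK_QUEUE_COMPUTE_BIT and VK_QUEUE_TRANSFER_BIT so the snippet compiles without vulkan.h, and find_transfer_only_family is a hypothetical helper name. It assumes "transfer operations only" means a family that advertises transfer but neither graphics nor compute.

```c
#include <stddef.h>
#include <stdint.h>

/* Local stand-ins mirroring the VkQueueFlagBits values (assumption:
 * used here only so the sketch builds without the Vulkan headers). */
enum {
    QUEUE_GRAPHICS_BIT = 0x1,
    QUEUE_COMPUTE_BIT  = 0x2,
    QUEUE_TRANSFER_BIT = 0x4
};

/* Hypothetical helper: returns the index of the first queue family whose
 * flags include transfer but neither graphics nor compute, or -1 if no
 * such family exists. queue_flags[i] holds the queueFlags of family i. */
int find_transfer_only_family(const uint32_t *queue_flags, size_t count)
{
    for (size_t i = 0; i < count; ++i) {
        uint32_t flags = queue_flags[i];
        int has_transfer = (flags & QUEUE_TRANSFER_BIT) != 0;
        int has_gfx_or_compute =
            (flags & (QUEUE_GRAPHICS_BIT | QUEUE_COMPUTE_BIT)) != 0;
        if (has_transfer && !has_gfx_or_compute) {
            return (int)i;
        }
    }
    return -1;
}
```

In a real application the flags array would be filled from vkGetPhysicalDeviceQueueFamilyProperties, and a result of -1 would mean falling back to the graphics queue family.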

use_buffer_copy adds an intermediate buffer to move data from host memory to device memory.

Possible transfer configurations:

* Buffer (host-visible) ===> Image (device-local)

* Buffer (host-visible) ===> Buffer (device-local) ===> Image (device-local)

Because the minimum image transfer granularity on my transfer queue family is (8, 8, 8), I transfer a 3D image with a depth of 8 and simply blit the 1st layer to the swap chain.
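
For reference, this is a rough sketch of the granularity rule as I read it in the spec's description of minImageTransferGranularity: each offset component must be a multiple of the granularity, and each extent component must either be a multiple of it or reach the edge of the image subresource. A granularity of (0, 0, 0) allows only whole-subresource copies. Vec3u and both function names are hypothetical stand-ins (the real API uses VkOffset3D and VkExtent3D), and this is not taken from the attached application.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for VkOffset3D / VkExtent3D. */
typedef struct { uint32_t x, y, z; } Vec3u;

/* One axis: the offset must be aligned, and the extent must be aligned
 * or end exactly at the image edge. */
static bool axis_ok(uint32_t offset, uint32_t extent,
                    uint32_t image_size, uint32_t gran)
{
    if (gran == 0 || offset % gran != 0) {
        return false;
    }
    return (extent % gran == 0) || (offset + extent == image_size);
}

/* Hypothetical helper: checks a copy region against a queue family's
 * minimum image transfer granularity. */
bool region_respects_granularity(Vec3u offset, Vec3u extent,
                                 Vec3u image_size, Vec3u gran)
{
    if (gran.x == 0 && gran.y == 0 && gran.z == 0) {
        /* (0, 0, 0): only the whole subresource may be transferred. */
        return offset.x == 0 && offset.y == 0 && offset.z == 0 &&
               extent.x == image_size.x &&
               extent.y == image_size.y &&
               extent.z == image_size.z;
    }
    return axis_ok(offset.x, extent.x, image_size.x, gran.x) &&
           axis_ok(offset.y, extent.y, image_size.y, gran.y) &&
           axis_ok(offset.z, extent.z, image_size.z, gran.z);
}
```

With a granularity of (8, 8, 8), a full 512x512x8 copy at offset (0, 0, 0) passes this check, while a 4-texel-wide sub-region in the middle of the image does not. Under this reading a copy that covers the full depth of the image also passes, because the extent reaches the image edge.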

The problem is that the combination enable_transfer_queue=true and use_buffer_copy=false (a direct buffer-to-image copy on the transfer queue) does not work.

OK, so this one is an issue we recently fixed. I'm not sure if the necessary changes have already gone out, but what I can say for certain is that the driver you are using is outdated 🙂 Please check whether you can reproduce the problem with the latest driver version (APU)

I get the same behavior.

I've been using [Radeon Software Crimson Edition 16.7.3 Driver Version 16.30.2311] for the past week to no avail.

I upgraded to [Radeon Software Crimson Edition 16.8.3 Driver Version 16.30.2511] today to re-test the issue.

Driver properties:

[Screenshot: pastedImage_0.png]

Vulkan Device Properties:

[Screenshot: pastedImage_2.png]

amdvlk64.dll properties:

* MD5: be12504d74301bcc97f5ebeec17ce6b9

[Screenshot: pastedImage_4.png]
