OpenCL

eric880212
Adept I

How to use multiple OpenCL command queues

I want to launch multiple kernels. I know that to have them execute concurrently I need to use multiple command queues, but I have a few questions.

I ran the command "/opt/amd-gpu/bin/clinfo" (it was installed with the driver). It printed a lot of information, including "Max on device queues : 1".

So I guess I am not able to use multiple command queues, and a command queue is a hardware resource on the GPU, right?

In that case, do I need to change my GPU? Or does it have something to do with the driver or the CPU?

 

1 Solution
dipak
Big Boss

>> Max on device queues : 1

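This limit applies to device-side queues: it is the maximum number of device queues that can be created per context. It does not restrict how many host-side command queues you can create. The value can be queried using clGetDeviceInfo with param CL_DEVICE_MAX_ON_DEVICE_QUEUES. (CL_DEVICE_QUEUE_ON_DEVICE_MAX_SIZE is a separate query that reports the maximum size of a device queue in bytes.)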

Concurrent execution of commands from multiple host-side queues mainly depends on hardware capabilities, such as the number of asynchronous compute engines (ACEs) and hardware queues available on the device. If you are using a recent AMD GPU, I think it should have multiple ACEs and hardware queues.

As described in the section "Command Queue" in AMD_OpenCL_Programming_Optimization_Guide.pdf :

"A hardware queue can be thought of as a GPU entry point. The GPU can process kernels from several compute queues concurrently. All hardware queues ultimately share the same compute cores. The use of multiple hardware queues is beneficial when launching small kernels that do not fully saturate the GPU. "

"An OpenCL queue is assigned to a hardware queue on creation time. The hardware compute queues are selected according to the creation order within an OpenCL context. If the hardware supports K concurrent hardware queues, the Nth created OpenCL queue within a specific OpenCL context will be assigned to the (N mod K) hardware queue. The number of compute queues can be limited by specifying the GPU_NUM_COMPUTE_RINGS environment variable."
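To make the (N mod K) rule quoted above concrete, here is a minimal sketch in plain C. It is only an illustration of the arithmetic, not driver code: the function names hw_queue_for and may_run_concurrently are hypothetical, and I am assuming that N counts queue creation order from 0 within one context.

```c
#include <assert.h>
#include <stddef.h>

/* Illustrative model of the "(N mod K)" rule from the guide.
 * creation_index: creation order of the OpenCL queue within its context
 *                 (assumption: counting starts at 0).
 * hw_queue_count: K, the number of concurrent hardware queues (must be > 0).
 * Returns the index of the hardware queue the OpenCL queue is mapped to. */
static size_t hw_queue_for(size_t creation_index, size_t hw_queue_count)
{
    assert(hw_queue_count > 0);
    return creation_index % hw_queue_count;
}

/* Hypothetical helper: two OpenCL queues can only overlap on the hardware
 * if they were mapped to different hardware queues. Returns 1 if so. */
static int may_run_concurrently(size_t a, size_t b, size_t hw_queue_count)
{
    return hw_queue_for(a, hw_queue_count) != hw_queue_for(b, hw_queue_count);
}
```

For example, with K = 2 the first and third queues created in a context (indices 0 and 2) land on the same hardware queue, so under this model only queues with different remainders can overlap. Whether kernels actually overlap still depends on the device and on how much of the GPU each kernel occupies.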

Thanks.
