

bonsaiscott
Journeyman III

Current Status of Asynch Compute & Device Side Child Kernel Enqueue Support?

What is the current status of OpenCL support on Hawaii (and Tahiti or Pitcairn, if possible) for the following:

1) Host-initiated simultaneous GPU device kernel execution

2) GPU device initiated GPU child kernel spawn

In other words, when will AMD support the equivalent of CUDA's Hyper-Q and Dynamic Parallelism for their GPUs under OpenCL?

Scott

0 Likes
4 Replies
sudarshan
Staff

Hi,

1. Concurrent kernel execution is already supported on AMD devices. This can be done by creating multiple command queues for the device and enqueuing independent kernels across those queues.

2. Dynamic parallelism is expected to be supported with OpenCL 2.0.
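As a rough illustration of point 1, here is a minimal sketch using PyOpenCL (an assumption; the thread itself does not name a host API binding). The kernel `scale` and the helpers `assign_round_robin` and `run_independent_kernels` are hypothetical names made up for this example. It only shows how independent kernels can be spread over several command queues on one device; whether they actually overlap on the GPU depends on the driver and hardware.

```python
# Hypothetical kernel used only for illustration.
KERNEL_SRC = """
__kernel void scale(__global float *data, const float factor) {
    size_t i = get_global_id(0);
    data[i] *= factor;
}
"""


def assign_round_robin(num_kernels, num_queues):
    """Return, for each kernel index, the index of the queue it is submitted to."""
    if num_queues < 1:
        raise ValueError("need at least one command queue")
    return [k % num_queues for k in range(num_kernels)]


def run_independent_kernels(num_kernels=4, num_queues=2, n=1 << 20):
    """Enqueue independent kernels across several queues on a single device."""
    # Imported here so the helper above can be used without an OpenCL runtime.
    import numpy as np
    import pyopencl as cl

    ctx = cl.create_some_context(interactive=False)
    device = ctx.devices[0]
    # Several command queues on the same device, as described above.
    queues = [cl.CommandQueue(ctx, device) for _ in range(num_queues)]
    program = cl.Program(ctx, KERNEL_SRC).build()

    mf = cl.mem_flags
    host = [np.full(n, 1.0, dtype=np.float32) for _ in range(num_kernels)]
    # One buffer per kernel so the kernels have no data dependencies.
    bufs = [cl.Buffer(ctx, mf.READ_WRITE | mf.COPY_HOST_PTR, hostbuf=h)
            for h in host]

    for k, q in enumerate(assign_round_robin(num_kernels, num_queues)):
        program.scale(queues[q], (n,), None, bufs[k], np.float32(k + 2))

    # Flush every queue first so all work is submitted before waiting.
    for queue in queues:
        queue.flush()
    for queue in queues:
        queue.finish()

    for k in range(num_kernels):
        cl.enqueue_copy(queues[0], host[k], bufs[k])  # blocking copy
    return host


if __name__ == "__main__":
    results = run_independent_kernels()
    print([float(r[0]) for r in results])
```

Flushing all queues before calling `finish()` on any of them is a deliberate choice here: it gives the runtime the chance to have work from every queue in flight at once. A profiler is still needed to confirm real overlap.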

0 Likes

Thank you for the reply sudarshan.

So today on Hawaii I can create 64 queues (8 ACEs with 8 queues each) and have 64 different kernels running simultaneously?

Does this work under Linux? With the standard Radeon driver, or is a FirePro required? Is there any sample code that demonstrates this?

The only reference I can find to OpenCL 2.0 support is for the OpenCL 1.2 beta driver, but the list doesn't include Dynamic Parallelism. Is there any ETA for this under Linux?

0 Likes

Hi,

I have never tried creating so many command queues for a single device, so I am not sure about it. If you do try it and have some insights to share, that would be a great help.

For concurrent kernel execution, there is sample code available in AMD APP SDK 2.9 at AMD_APP_SDK\2.9\samples\opencl\cpp_cl\ConcurrentKernel.

Drivers for OpenCL 2.0 are not yet released.

0 Likes

There is no document for Hawaii, but the guide for the 79xx series says that it has two "Asynchronous Compute Engine / Command Processor" units that can process command queues concurrently.

So that means two kernels at a time. You can ask the OpenCL runtime for more, but it will just dispatch them in some serial order. According to CodeXL, the 7850 can also execute two at a time (with the Radeon driver on Win7 x64). Maybe all GCN 1.0 parts have two command processors.
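To make the "two at a time, the rest serialized" idea concrete, here is a toy model. It is only a sketch under my own assumptions (kernels start in submission order, each occupies one hardware slot for its whole duration) and it says nothing about how the AMD runtime really schedules work. The function name `makespan` is made up for this example.

```python
import heapq


def makespan(durations, hw_slots):
    """Total time to run kernels in submission order on `hw_slots` concurrent slots.

    Toy model: each kernel takes one slot for its whole duration, and the
    next kernel starts on whichever slot becomes free first.
    """
    if hw_slots < 1:
        raise ValueError("need at least one hardware slot")
    # Each heap entry is the time at which one slot becomes free.
    free_at = [0.0] * hw_slots
    heapq.heapify(free_at)
    finish = 0.0
    for duration in durations:
        start = heapq.heappop(free_at)
        end = start + duration
        heapq.heappush(free_at, end)
        finish = max(finish, end)
    return finish


if __name__ == "__main__":
    kernels = [1.0] * 8  # eight equal kernels, arbitrary time units
    for slots in (1, 2, 8):
        print(slots, makespan(kernels, slots))
```

In this model, creating more queues than there are hardware slots does not shorten the total time; the extra kernels simply wait for a slot.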

I have never checked whether concurrent execution works on Linux, but code with multiple queues runs correctly.

P.S. I think I heard somewhere that the 7790 (which may be GCN 1.1) can execute three at a time, but I may be wrong. That may also be the case for Hawaii, since it is something like GCN 1.1 too.

0 Likes