I've turned on out-of-order queues in my project running on an HD 7700.
This significantly improves performance, I highly recommend it!
The only caveat is that you have to manage the kernel sequence yourself, using events.
My question: which AMD cards support out of order queues? Also, is this feature purely on
the driver side, or is there silicon required to support it?
Thanks!
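To illustrate the point about managing the kernel sequence with events, here is a toy model in plain Python. It makes no OpenCL calls. `Command` and `run_out_of_order` are hypothetical names invented for this sketch, and the "pick the last ready command" policy is an assumption chosen only to make reordering visible. In real host code the equivalent mechanism is the `event_wait_list` argument of `clEnqueueNDRangeKernel`.

```python
from collections import namedtuple

# Hypothetical stand-in for an enqueued command: a name plus the names of the
# commands whose events it waits on (its "event wait list").
Command = namedtuple("Command", ["name", "wait_for"])


def run_out_of_order(commands):
    """Simulate an out-of-order queue and return the execution order.

    Any command whose wait list is satisfied may run. To make reordering
    visible, this toy scheduler always picks the last ready command in
    enqueue order (an assumption; a real runtime may pick any ready command).
    """
    pending = list(commands)
    completed = set()
    order = []
    while pending:
        # A command is ready once every event it waits on has completed.
        ready = [c for c in pending if all(w in completed for w in c.wait_for)]
        if not ready:
            raise ValueError("unsatisfiable wait list (cycle or unknown event)")
        chosen = ready[-1]
        pending.remove(chosen)
        completed.add(chosen.name)
        order.append(chosen.name)
    return order


# No wait lists: nothing constrains the order, so it may differ from enqueue order.
unordered = [
    Command("produce", ()),
    Command("transform", ()),
    Command("consume", ()),
]

# Wait lists enforce the produce -> transform -> consume chain.
chained = [
    Command("produce", ()),
    Command("transform", ("produce",)),
    Command("consume", ("transform",)),
]

print(run_out_of_order(unordered))
print(run_out_of_order(chained))
```

The first list runs in an order unrelated to enqueue order, while the second is forced into the dependency order. The same explicit dependencies are needed between kernels on a queue that really does execute out of order.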
Out-of-order device-side queues have been part of the OpenCL standard since 1.0. Every GPU should support them, AFAIK. For host-side queues, it depends on the implementation. I know of one device that does not support them: Altera clearly states in its programming guide that its runtime does not support such queues.
As per the OpenCL spec, support for out-of-order queues is not mandatory. So I guess that if you pass the out-of-order flag during command queue creation, the implementation may ignore it when the platform does not support out-of-order execution.
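Since support is optional, it can help to check what the device reports before relying on the flag. A minimal sketch, assuming the property bitfield has already been read in host code with `clGetDeviceInfo` and `CL_DEVICE_QUEUE_PROPERTIES`: the two bit values below match the Khronos `cl.h` header and are redefined locally so the example runs without an OpenCL installation. The helper names `supports_out_of_order` and `describe` are hypothetical.

```python
# Bit values as defined in the Khronos cl.h header, redefined here so this
# sketch runs without an OpenCL installation.
CL_QUEUE_OUT_OF_ORDER_EXEC_MODE_ENABLE = 1 << 0
CL_QUEUE_PROFILING_ENABLE = 1 << 1


def supports_out_of_order(queue_properties):
    """Return True if the out-of-order bit is set in the reported bitfield."""
    return bool(queue_properties & CL_QUEUE_OUT_OF_ORDER_EXEC_MODE_ENABLE)


def describe(queue_properties):
    """List the names of the known property bits that are set."""
    names = []
    if queue_properties & CL_QUEUE_OUT_OF_ORDER_EXEC_MODE_ENABLE:
        names.append("out-of-order")
    if queue_properties & CL_QUEUE_PROFILING_ENABLE:
        names.append("profiling")
    return names


# Example: a device that reports profiling support only.
reported = CL_QUEUE_PROFILING_ENABLE
print(supports_out_of_order(reported), describe(reported))
```

If the out-of-order bit is not reported, a speedup measured after passing the flag most likely has another cause.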
AFAIK, out-of-order host-side queues are not currently supported on the AMD platform, but out-of-order device-side queues are. However, certain devices have hardware support for handling multiple commands from multiple queues simultaneously. Hence, they can provide a significant performance boost if many independent tasks are enqueued concurrently to the device via multiple OpenCL queues.
Anyway, as you mentioned, you see a performance improvement when you turn on out-of-order mode. That's really interesting. Could you please be more explicit about which commands you enqueued and how? Meanwhile, I'll check with some experienced folks for more details.
Regards,
I was mistaken on device-side vs host-side. I think the OP is talking about the former.
Thanks, Dipak. Yeah, this turned out to be wishful thinking. On closer inspection, the out-of-order flag is ignored; it makes no difference.
Are there plans to support host-side out-of-order queues? Seems to me this would be easier
than doing device-side out-of-order queues, something we already have implemented.
Thanks,
Aaron
None that I know of.
zypo - I've branched the newer questions into a new thread: Device queues.
--Prasad