Hi,
I'm a little confused about the properties of an "in-order" command queue.
Section 4.7.4 of the APP SDK programming guide (page 4-36) states: "
Command-queues that are configured to execute in-order are guaranteed to
complete execution of each command before the next command begins. This
synchronization guarantee can often be leveraged to avoid explicit
clWaitForEvents() calls between command submissions. Using
clWaitForEvents() requires intervention by the host CPU and additional
synchronization cost between the host and the GPU; by leveraging the in-order
queue property, back-to-back kernel executions can be efficiently handled
directly on the GPU hardware.
"
However, the same page also says "AMD Southern Islands GPUs can execute multiple kernels simultaneously when there are no dependencies."
My questions are:
I'm hoping to get a clear explanation of the in-order queue behavior.
... simultaneously when there are no dependencies."
So if you enqueue one kernel that uses buffers A and B, and a second one that uses C and D, they can execute simultaneously. What matters is the result at the end.
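The point about independent buffers can be sketched as a toy model. This is plain Python, not OpenCL, and it is not the AMD runtime's actual algorithm: `Command`, `conflicts`, and `assign_waves` are hypothetical names made up for illustration. The sketch assumes each command declares which buffers it reads and writes, and it groups commands into "waves" that could overlap without changing what an in-order queue promises.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Command:
    """Hypothetical stand-in for an enqueued kernel or DMA transfer."""
    name: str
    reads: frozenset
    writes: frozenset


def conflicts(earlier, later):
    """True if `later` must wait for `earlier` because they share a buffer hazard."""
    return bool(
        earlier.writes & later.reads      # read-after-write
        or earlier.reads & later.writes   # write-after-read
        or earlier.writes & later.writes  # write-after-write
    )


def assign_waves(queue):
    """Place each command in the earliest wave after every earlier command it conflicts with."""
    waves = []
    for i, cmd in enumerate(queue):
        wave = 0
        for j in range(i):
            if conflicts(queue[j], cmd):
                wave = max(wave, waves[j] + 1)
        waves.append(wave)
    return waves


# Enqueue order matters: this list plays the role of one in-order queue.
queue = [
    Command("k1", reads=frozenset({"A"}), writes=frozenset({"B"})),
    Command("k2", reads=frozenset({"C"}), writes=frozenset({"D"})),
    Command("k3", reads=frozenset({"B", "D"}), writes=frozenset({"E"})),
]

waves = assign_waves(queue)
for cmd, wave in zip(queue, waves):
    print(f"{cmd.name}: wave {wave}")
```

In this model `k1` and `k2` share no buffers, so they land in the same wave and could run side by side, while `k3` reads what both of them wrote and has to wait for the next wave.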
1. I believe multiple kernel execution happens for kernels coming in from different command queues, not from the same CQ.
2. I am not sure about this one.. Even if they do run asynchronously, I am sure the driver has the right dependency checks, so that code relying on the in-order property does not break....
If it did break, this forum would be filled with bug reports.. 🙂
Anyway, I will ask the driver folks to look at this thread and comment for you.
They are the right people to handle your question.
-
Bruhaspati
- If a command queue is in-order, then why can multiple kernels run simultaneously? Shouldn't they be executed sequentially?
- It seems DMA commands can be asynchronous with respect to other commands in the same queue, and developers should use blocking APIs or events to check their status. How is "in-order" defined in that case?
Basically, nou answered your questions already. Concurrent execution occurs for independent operations (the same is true for kernels and DMA operations). Runtime guarantees "in-order" execution in terms of the application visibility to the execution, but the runtime can run some operations concurrently on the hardware.
An "out-of-order" queue requires event tracking by the application, and in reality it's a useless mode: if an application uses multiple queues, that can already be called an "out-of-order" execution model, because the app has to track events and synchronize to get the proper order.
Thank you guys.
It's really helpful.
Would you please explain “Runtime guarantees "in-order" execution in terms of the application visibility to the execution”? I don’t fully understand it.
Haibo
It means that some kernels can be executed concurrently, but from the application's view it seems like they are executed one after another.
@Haibo,
It means that, as a programmer, your assumptions about an in-order command queue will not be broken.
You can assume the in-orderness of the command queue and write your code accordingly.
That's what it means..
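As a rough illustration of "your assumptions will not be broken", here is a small Python simulation (not OpenCL; the kernels `k1`, `k2`, `k3` and the `run` helper are made up for this sketch). It assumes `k1` and `k2` touch unrelated buffers and `k3` consumes both results. Running the two independent kernels in either order leaves the buffers in the same final state, which is all the application can observe.

```python
def k1(buf):
    # Reads A, writes B.
    buf["B"] = [x * 2 for x in buf["A"]]


def k2(buf):
    # Reads C, writes D; shares nothing with k1.
    buf["D"] = [x + 1 for x in buf["C"]]


def k3(buf):
    # Reads B and D, so it depends on both k1 and k2.
    buf["E"] = [b + d for b, d in zip(buf["B"], buf["D"])]


def run(order):
    """Execute the kernels in the given order on fresh buffers and return the final state."""
    buf = {"A": [1, 2, 3], "C": [10, 20, 30]}
    for kernel in order:
        kernel(buf)
    return buf


in_order = run([k1, k2, k3])    # the order the application enqueued
reordered = run([k2, k1, k3])   # stands in for the hardware overlapping k1 and k2

print(in_order == reordered)
print(in_order["E"])
```

Both runs produce identical buffers, so code written against strict in-order semantics sees nothing different. Moving `k3` ahead of `k1` or `k2` would not be allowed, since it reads their outputs.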