I understand that within one context, kernels complete in FIFO order, i.e. in the same order in which the requests were filed in the command queue.
Consider two contexts on the same device:
(a) In context 1, a batch of kernel computations has been launched by filling queue 1 with calls to kernels k1, k2 and k3 and then issuing a flush.
Assume this task T1 (= k1+k2+k3) takes about 1 s to complete.
(b) Concurrently, I have another task T2, consisting of kernel k4, that would typically take about 1 ms to execute, and I would like it to complete quickly (i.e. without waiting for T1 to complete).
(1) Is there a way, while T1 is executing, to preempt the GPU by submitting T2 in context 2?
(2) If so, can I assume that T2 will complete before T1?
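To make the two outcomes I am asking about concrete, here is a toy timeline model. It is only arithmetic on the durations above (1 s for T1, 1 ms for T2); it does not call OpenCL and makes no claim about how any real driver schedules work. The function names, the 100 ms submission time, and the zero-cost context switch are all my own assumptions for illustration.

```python
# Toy timeline model: NOT OpenCL, just arithmetic on the durations in the question.
# All times are integer microseconds to avoid floating-point rounding.

T1_US = 1_000_000  # T1 = k1 + k2 + k3, about 1 s (from the question)
T2_US = 1_000      # T2 = k4, about 1 ms (from the question)


def serialized_completion(t1_us, t2_us):
    """No preemption: the device finishes T1 before it starts T2.

    Returns (T1 completion time, T2 completion time).
    """
    t1_done = t1_us
    t2_done = t1_us + t2_us
    return t1_done, t2_done


def preempted_completion(t1_us, t2_us, t2_submit_us):
    """Hypothetical preemption: T2 interrupts T1 at submission time.

    Assumes T2 is submitted while T1 is still running and that
    switching between the two tasks costs nothing.
    Returns (T1 completion time, T2 completion time).
    """
    assert 0 <= t2_submit_us < t1_us, "T2 must arrive while T1 is running"
    t2_done = t2_submit_us + t2_us
    t1_done = t1_us + t2_us  # T1 is delayed by the time spent on T2
    return t1_done, t2_done


if __name__ == "__main__":
    submit_us = 100_000  # hypothetical: T2 submitted 100 ms after T1 starts
    print("serialized:", serialized_completion(T1_US, T2_US))
    print("preempted: ", preempted_completion(T1_US, T2_US, submit_us))
```

In the serialized case T2 only completes after the full second spent on T1; in the preempted case T2 completes about 1 ms after it is submitted, and T1 finishes slightly later. The second case is the behaviour I am hoping for.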
I tend to think some GPU preemption mechanism should exist, but I don't know whether this is the way to use it.