Thanks Udeepta,
Well, the reason I'm interested in GPU preemption is CPU-GPU co-processing optimization. Consider this loop:
(1) I have a main GPU task T1 that works on a data sample N.
(2) In the meantime, I have the CPU working on the previous result of the GPU computation, i.e. sample N-1.
This way I can take full advantage of the GPU and CPU working concurrently, with only one synchronization point per iteration, i.e. at the beginning of each loop.
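To make the loop concrete, here is a minimal CPU-only sketch of that pipelining pattern. It is only an illustration under assumptions: `run_t1`, `cpu_process` and `run_pipeline` are hypothetical names, and a single worker thread stands in for the GPU, so no real GPU API is involved.

```python
from concurrent.futures import ThreadPoolExecutor


def run_t1(sample):
    """Hypothetical stand-in for the GPU batch task T1 (here: doubles the sample)."""
    return sample * 2


def cpu_process(t1_result):
    """Hypothetical stand-in for the CPU work done on a finished T1 result."""
    return t1_result + 1


def run_pipeline(count):
    """Process samples 0..count-1, overlapping T1 on sample N with CPU work on N-1."""
    outputs = []
    if count <= 0:
        return outputs
    # One worker thread plays the role of the GPU (an assumption for this sketch).
    with ThreadPoolExecutor(max_workers=1) as gpu:
        pending = gpu.submit(run_t1, 0)  # prime the pipeline with sample 0
        for n in range(1, count + 1):
            # The only synchronization point, at the beginning of each loop:
            # wait for T1 on sample n-1 to finish.
            previous = pending.result()
            if n < count:
                # Launch T1 on sample n; it runs while the CPU works below.
                pending = gpu.submit(run_t1, n)
            # CPU works on the result of sample n-1 concurrently with T1.
            outputs.append(cpu_process(previous))
    return outputs
```

In this sketch the worker is busy with T1 for the whole iteration, which is exactly why a short task T2 submitted by the CPU in the middle of `cpu_process` would have to wait behind T1 unless some form of preemption is available.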
Now the point is that while the CPU is doing its own work, it could benefit from running short tasks T2 on the GPU, since the processing involved in T2 is more suitable for the GPU than for the CPU.
The issue is that T1 is a batch and cannot really be split into pieces. If I did split it, I'd lose most of the benefits of concurrent GPU-CPU execution, since extra synchronization points would have to be inserted.
So this is why having some form of GPU preemption (on a different context, for example) would be highly desirable.