Out-of-order queues are currently not supported by the AMD implementation.
I don't understand why you are interested in clEnqueueTask when clEnqueueNDRangeKernel gives you more flexibility. I have never used clEnqueueTask, so I cannot say anything for sure.
But as per the spec:
"clEnqueueTask is equivalent to calling clEnqueueNDRangeKernel with work_dim = 1, global_work_offset = NULL, global_work_size[0] set to 1, and local_work_size[0] set to 1."
So it should not be possible to run different tasks on different compute units of a GPU. Also, device fission is only available for CPUs, so on a CPU you should be able to run many kernels at once.