I recently tried out the D3D11 Command Lists feature on 4 threads and observed a decrease in performance. After profiling with GPU Perf Studio, I was shocked to see that all threads (immediate + deferred) wait on FinishCommandList calls. That's basically why I'm seeing slower performance: with one or more deferred contexts, the driver actually stops rendering on the immediate context. You can also see from the attached image that FinishCommandList calls do not execute in parallel, so each call waits on the other one(s). I do hope this will somehow change in D3D12, as I was under the impression that Command Lists will be the norm there.
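To make the serialization described above concrete, here is a rough portable C++ sketch. It is not D3D11 code and does not claim to show what any driver really does internally. It only models the behaviour I observed, under the assumption that the "finish" step takes a single driver-wide lock. The names `run_model`, `ModelResult` and `driver_lock` are made up for this illustration.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical model, NOT the D3D11 API: each thread stands in for one
// deferred context that records commands and then "finishes" its list.
struct ModelResult {
    int lists_finished;         // number of command lists produced
    int max_concurrent_finish;  // peak number of threads inside "finish" at once
};

ModelResult run_model(int thread_count, int commands_per_thread) {
    assert(thread_count > 0 && commands_per_thread >= 0);

    std::mutex driver_lock;        // ASSUMPTION: one lock shared by every finish call
    std::atomic<int> in_finish{0};
    std::atomic<int> peak{0};
    std::atomic<int> finished{0};

    auto worker = [&](int id) {
        // Recording phase: purely thread-local, so it runs in parallel.
        std::vector<int> recorded;
        recorded.reserve(static_cast<std::size_t>(commands_per_thread));
        for (int i = 0; i < commands_per_thread; ++i) {
            recorded.push_back(id * commands_per_thread + i);
        }

        // "Finish" phase: serialized by the assumed driver-wide lock.
        std::lock_guard<std::mutex> guard(driver_lock);
        const int now = ++in_finish;

        // Track the highest number of threads ever seen inside this phase.
        int seen = peak.load();
        while (now > seen && !peak.compare_exchange_weak(seen, now)) {
        }

        // Stand-in for whatever translation work the driver performs.
        long long checksum = 0;
        for (int c : recorded) {
            checksum += c;
        }
        (void)checksum;

        --in_finish;
        ++finished;
    };

    std::vector<std::thread> threads;
    for (int t = 0; t < thread_count; ++t) {
        threads.emplace_back(worker, t);
    }
    for (auto& th : threads) {
        th.join();
    }
    return ModelResult{finished.load(), peak.load()};
}
```

Calling `run_model(4, 1000)` mirrors my 4-thread setup: recording overlaps across threads, but because of the assumed lock only one thread at a time is ever inside the finish phase, so any time spent there adds up instead of overlapping.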