peak GCN performance possible with 1 wave

Discussion created by nerdralph on Feb 20, 2017
Latest reply on Feb 24, 2017 by nerdralph

Everything I've read so far about wave occupancy suggests (or even explicitly states) that a minimum of 4 waves in flight is required for full VALU occupancy on GCN. After scrutinizing the documentation and code, I've concluded that full VALU utilization can be obtained with just one wave. That only holds for kernels made up entirely of vector instructions, though, so for practical purposes the minimum is 2 waves.
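
A toy model of instruction issue makes the 1-wave versus 2-wave distinction concrete. This is only an illustrative sketch, not a description of the real GCN arbiter. It assumes that in each issue slot a wave can issue at most one instruction, that the SIMD accepts at most one vector and one scalar instruction per slot, that lower-numbered waves get priority, and that there are no latencies or dependencies. The function name `valu_utilization` and the 'V'/'S' instruction encoding are made up for the example.

```python
def valu_utilization(programs, slots):
    """Fraction of issue slots in which the VALU receives an instruction.

    programs: one string per wave, looped forever; 'V' = vector, 'S' = scalar.
    slots:    number of issue slots to simulate.
    All of this is a simplified, hypothetical model (see assumptions above).
    """
    pcs = [0] * len(programs)      # per-wave program counter
    busy = 0                       # slots in which a vector instruction issued
    for _ in range(slots):
        valu_taken = False         # at most one vector instruction per slot
        salu_taken = False         # at most one scalar instruction per slot
        for w, prog in enumerate(programs):   # fixed priority: wave 0 first
            op = prog[pcs[w] % len(prog)]
            if op == 'V' and not valu_taken:
                valu_taken = True
                pcs[w] += 1
            elif op == 'S' and not salu_taken:
                salu_taken = True
                pcs[w] += 1
            # otherwise this wave stalls until the next slot
        busy += valu_taken
    return busy / slots


if __name__ == "__main__":
    print(valu_utilization(["V"], 100))             # one wave, vector only
    print(valu_utilization(["VVVS"], 100))          # one wave, 1 scalar in 4
    print(valu_utilization(["VVVS", "VVVS"], 100))  # two waves, same mix
```

Under these assumptions, a single vector-only wave keeps the VALU busy in every slot, a single wave with one scalar instruction in four leaves it idle a quarter of the time, and a second wave fills those gaps.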

Nerd Ralph: Inside AMD GCN code execution