8 Replies Latest reply on Aug 26, 2015 12:22 AM by youwei

    SGPR usage tripled on GCN-1.2 (v8) GPUs




      I've analyzed an OpenCL kernel using CodeXL and am quite happy with its register usage: on GCN 1.0/1.1 devices the maximum of 10 wavefronts per SIMD can be queued, so memory latencies can hopefully be hidden efficiently.

      However, on GCN-1.2 devices (Tonga), SGPR usage exploded: while the same kernel consumes 32 SGPRs on Cape Verde, it requires 94 SGPRs on Tonga, which limits the kernel to 5 concurrent waves per SIMD (screenshots attached).
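
      For reference, the wave limits quoted above are consistent with a simple budget calculation. The snippet below is only a rough sketch: the helper name `waves_per_simd`, the per-SIMD budget of 512 SGPRs, and the cap of 10 waves are assumptions chosen because they reproduce the numbers CodeXL reports, not values confirmed for Tonga. Allocation granularity is ignored.

```python
def waves_per_simd(sgprs_per_wave, sgpr_budget=512, max_waves=10):
    """Estimate how many wavefronts fit on one SIMD when limited by SGPRs.

    sgpr_budget and max_waves are assumed values (hypothetical defaults),
    picked to match the occupancy figures quoted in this post.
    """
    if sgprs_per_wave <= 0:
        raise ValueError("sgprs_per_wave must be positive")
    # Each wave reserves its own SGPRs; the hardware cap bounds the result.
    return min(max_waves, sgpr_budget // sgprs_per_wave)


print(waves_per_simd(32))  # Cape Verde figure from the post -> 10
print(waves_per_simd(94))  # Tonga figure from the post -> 5
```

      If the real SGPR budget per SIMD on Tonga differs from this assumed value, the estimate for 94 SGPRs would change accordingly.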


      Any idea why the same code running on Tonga requires almost 3 times the SGPRs?

      Have there been architectural changes to Tonga or are there pitfalls when it comes to SGPR usage?


      Thank you in advance, Clemens