For a group size of 64 and 1 wavefront per SIMD (640 threads in total), I can reliably write to and read from 8 sr registers per thread. Beyond that, I get corruption. Does this sound correct, or is it possibly my problem? What determines this maximum, and can I increase it, for example by going into the ISA?
Thanks for any hints from the hardware gurus...
cheers, e