
Using all 256 registers?

Discussion created by iya on Jul 18, 2010
Latest reply on Jul 19, 2010 by MicahVillmow

Hello,

I've written a generator for .il code, and would like to use as many registers as possible. My hardware is a 4850.

As I understand it, limiting the group size to the wavefront size of 64 should be the only requirement, but in neither OpenCL nor IL have I ever succeeded in getting the compiler to allocate more than 122 GPRs. A group size of 256 gets up to 63.

Am I forgetting something or is it a current compiler limitation?

NumWavefrontPerSIMD = 2 seems to be the problem. Is there a way to limit this to 1?
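To make the arithmetic behind that suspicion explicit, here is a rough sketch. It assumes (this is not confirmed anywhere in the thread) that each SIMD has a pool of 256 GPRs per thread and that the pool is split evenly between the wavefronts resident on it. The names `GPR_POOL`, `wavefronts_per_group` and `gpr_budget` are made up for illustration.

```python
# Assumption (not confirmed by the thread): each SIMD exposes a pool of
# 256 GPRs per thread, split evenly between the wavefronts resident on it.
GPR_POOL = 256
WAVEFRONT_SIZE = 64  # threads per wavefront on the 4850, as stated above


def wavefronts_per_group(group_size: int) -> int:
    """Number of wavefronts needed to hold one work-group (rounded up)."""
    return -(-group_size // WAVEFRONT_SIZE)


def gpr_budget(resident_wavefronts: int) -> int:
    """Upper bound on GPRs per thread under the even-split assumption."""
    return GPR_POOL // resident_wavefronts


if __name__ == "__main__":
    # (group size, resident wavefronts, NUM_GPRS the compiler reported)
    cases = [(64, 2, 122), (256, wavefronts_per_group(256), 63)]
    for group_size, waves, observed in cases:
        print(f"group size {group_size}: {waves} wavefronts, "
              f"budget {gpr_budget(waves)}, observed {observed}")
```

Under that assumption the ceilings would be 128 and 64, which sits close to the 122 and 63 actually observed; the sketch does not explain the remaining gap. A single resident wavefront would give the full 256.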

; ----------------- CS Data ------------------------
; Input Semantic Mappings
;    No input mappings
GprPoolSize = 0
CodeLen = 11808;Bytes
PGM_END_CF = 0; words(64 bit)
PGM_END_ALU = 0; words(64 bit)
PGM_END_FETCH = 0; words(64 bit)
MaxScratchRegsNeeded = 3
;AluPacking = 0.0
;AluClauses = 0
;PowerThrottleRate = 0.0
; texResourceUsage[0] = 0x00000000
; texResourceUsage[1] = 0x00000000
; texResourceUsage[2] = 0x00000000
; texResourceUsage[3] = 0x00000000
; fetch4ResourceUsage[0] = 0x00000000
; fetch4ResourceUsage[1] = 0x00000000
; fetch4ResourceUsage[2] = 0x00000000
; fetch4ResourceUsage[3] = 0x00000000
; texSamplerUsage = 0x00000000
; constBufUsage = 0x00000000
ResourcesAffectAlphaOutput[0] = 0x00000000
ResourcesAffectAlphaOutput[1] = 0x00000000
ResourcesAffectAlphaOutput[2] = 0x00000000
ResourcesAffectAlphaOutput[3] = 0x00000000
;SQ_PGM_RESOURCES = 0x3000027A
SQ_PGM_RESOURCES:NUM_GPRS = 122
SQ_PGM_RESOURCES:STACK_SIZE = 2
SQ_PGM_RESOURCES:FETCH_CACHE_LINES = 0
SQ_PGM_RESOURCES:PRIME_CACHE_ENABLE = 1
; CS Setup Mode = Fast (i.e setup R0.x)
; NumThreadPerGroup = 64
; NumWavefrontPerSIMD = 2
; IsMaxNumWavePerSIMD = true
; SetBufferForNumGroup = false
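As a sanity check on the dump, the packed SQ_PGM_RESOURCES word can be unpacked and compared against the per-field lines. This is only a sketch: the bit positions below are inferred from the values in this dump (the low byte matches NUM_GPRS, the next byte matches STACK_SIZE), not taken from a register reference, and `low_fields` is a made-up helper name.

```python
# Bit positions are an assumption inferred from the dump itself:
# bits 0-7 line up with NUM_GPRS and bits 8-15 with STACK_SIZE.
SQ_PGM_RESOURCES = 0x3000027A


def low_fields(word: int) -> dict:
    """Extract the two low byte-wide fields from the packed register word."""
    return {
        "NUM_GPRS": word & 0xFF,
        "STACK_SIZE": (word >> 8) & 0xFF,
    }


if __name__ == "__main__":
    for name, value in low_fields(SQ_PGM_RESOURCES).items():
        print(f"SQ_PGM_RESOURCES:{name} = {value}")
```

Both values agree with the dump, so the 122 figure is what actually gets programmed into the register rather than just a reporting quirk.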
