AnsweredAssumed Answered

USER_SGPR value restricted to 14bits?

Question asked by realhet on Jul 12, 2015
Latest reply on Jul 13, 2015 by realhet

Hi,

 

I've made a small kernel in GCN ASM and decided to launch it with 167million workitems. WorkGroupSize is 256. I calculate the 0 base global_id value as simple as GID = USER_SGPR*256+V0. And just noticed that the value of the USER_SGPR reg (S12 at the moment) can not exceed 16384, and it is AND-ed, so it basically repeats every GID values from 0 to 4million 40 times. I counted the distinct gid values with buffer_atomic_add.

 

My question is this: Is there a 14 bit limit on the USER_SGPR value? And if so, what's the proper way to get the higher bits?

 

I don't think I ran this problem in the past, so maybe it's a new/altered functionality. Or maybe it is related to the largest texture size in 1 dimension.

I'm trying to use the simplest header as possible:

__kernel __attribute__((reqd_work_group_size(256,1,1))) void main(__global uint* a) { a[0]=a[0]%10; }

It gives only 2 user elements: IMM_UAV 12, s[4:7];  IMM_CONST_BUFFER 1, s[8:11];    With USER_SGPR = 12

 

I initialize the kernel like this:

  s_buffer_load_dword  s0, s[8:11], 0x00    //buffer base offset

  s_or_b32      s5, s5, $40000                     //set resource record size

  s_mul_i32     s1, s12, 256                       //!!!!!!!!!!!!!!!!!!!!!!!  s12 is ANDed with 16383, but don't know why

  s_waitcnt     lgkmcnt(0)

  s_add_u32     s4, s4, s0 \ s_addc_u32 s5 ,s5, 0  //s[4:7] = buffer, add base offset to resource

  v_add_i32     v1, vcc, s1, v0                   //!!!!!!!!!!!!!!!!!!!!!!! So, the calculated 0based GID can't exceed 4194304 (=WorkGroupSize*16384)

  alias UAV=s[4:7], LID=v0, GID=v1

 

I'm using: HD7770 Cape Verde, Cat 14.12

 

Thank you!

Outcomes