foomanchoo
Adept I

workgroups, scalar registers, constant buffers

good morning.

i currently use 64 constant uint in every thread of every workgroup in my kernel like so:

__constant uint * coeff = coeff_kernel_argument;

uint coeff1 = coeff[0];

...

uint coeff63 = coeff[63];

i am delighted to see that the 64 values end up in scalar registers (i am talking about Tahiti), as confirmed with the ISA dump.

now i would like to change it to:

__constant uint * coeff = coeff_kernel_argument + get_global_id(0) / get_local_size(0) * 64;

unfortunately this results in the coeff?? variables being stored in VGPRs, even though their values are the same for every thread of a workgroup.

i am also beginning to rewrite my kernel in AMDIL for other reasons, so a workaround that only works at the IL level would suffice.

thanks a lot.

4 Replies
coordz
Adept II

__constant uint * coeff = coeff_kernel_argument + get_global_id(0) / get_local_size(0) * 64;

unfortunately this results in the coeff?? variables being stored in VGPRs, even though their values are the same for every thread of a workgroup.

Your calculation uses the global thread id, which varies per thread, so the compiler places the resulting values in VGPRs.


the expression "get_global_id(0) / get_local_size(0)" is constant across all threads of a workgroup, which fits perfectly with the scalar registers being a resource shared per workgroup.


get_group_id(0) is constant for all threads in a work group. get_global_id(0) returns a unique number for every thread across all work groups.

"get_global_id(0) / get_local_size(0)" can't be evaluated at compile time, because its value depends on launch parameters. So the compiler cannot treat it as a constant.
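To make the relationship between the two expressions concrete, here is a minimal host-side sketch in plain C (not OpenCL C, and not anything from the ISA dump). It models a 1-D NDRange and checks when "get_global_id(0) / get_local_size(0)" matches get_group_id(0). The struct `ndrange_1d` and the `model_*` and `quotient_*` helpers are hypothetical names made up for this illustration. The model assumes the OpenCL 1.1+ rule that the global id includes the global work offset while the group id does not.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical host-side model of a 1-D NDRange (not an OpenCL type). */
typedef struct {
    size_t global_offset; /* global work offset given at enqueue time */
    size_t local_size;    /* work-items per work-group */
    size_t num_groups;    /* number of work-groups */
} ndrange_1d;

/* Models get_global_id(0) for work-item `local` of work-group `group`. */
static size_t model_global_id(const ndrange_1d *r, size_t group, size_t local) {
    return r->global_offset + group * r->local_size + local;
}

/* True if global_id / local_size has a single value inside every group. */
static bool quotient_uniform_per_group(const ndrange_1d *r) {
    assert(r->local_size > 0);
    for (size_t g = 0; g < r->num_groups; ++g) {
        size_t first = model_global_id(r, g, 0) / r->local_size;
        for (size_t l = 1; l < r->local_size; ++l) {
            if (model_global_id(r, g, l) / r->local_size != first) {
                return false;
            }
        }
    }
    return true;
}

/* True if global_id / local_size equals the group id for every work-item. */
static bool quotient_equals_group_id(const ndrange_1d *r) {
    assert(r->local_size > 0);
    for (size_t g = 0; g < r->num_groups; ++g) {
        for (size_t l = 0; l < r->local_size; ++l) {
            if (model_global_id(r, g, l) / r->local_size != g) {
                return false;
            }
        }
    }
    return true;
}
```

With a zero global offset the quotient equals the group id for every thread, so writing the offset as get_group_id(0) * 64 in the kernel expresses the same index without going through a per-thread value. With a non-zero offset the two can differ, and the quotient is not even uniform within a group unless the offset is a multiple of the local size. Whether the compiler then keeps the loaded values in SGPRs is not guaranteed by this and would need to be checked in the ISA dump.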

One workaround would be to:

__constant uint * coeff = coeff_kernel_argument;

uint coeff0 = coeff[0];

if (get_local_id(0) == 0)

{

  shared_mem[0] = coeff0;

}

uint coeff1 = coeff[1];

if (get_local_id(0) == 0)

{

  shared_mem[1] = coeff1;

}

...

uint coeff63 = coeff[63];

if (get_local_id(0) == 0)

{

  shared_mem[63] = coeff63;

}

barrier(CLK_LOCAL_MEM_FENCE);

__local uint coeffForWorkGroup;

if (get_local_id(0) == 0)

{

    coeffForWorkGroup = shared_mem[get_group_id(0)];

}

barrier(CLK_LOCAL_MEM_FENCE);

uint coeffForWorkGroupPrivateCopy = coeffForWorkGroup;
