viscocoa
Adept I

Question regarding registers


What is the addressable unit of registers? A byte or a 4-byte integer/float, or a 16-byte vector?

If I declare a byte-type private variable, will it occupy a 1-byte, 4-byte, or 16-byte register?

How are registers used by the ALU? In the following statements,

float4 x;

x.x = 1.f;                              // writing to the first element of x

float y = x.y + 2.f;                // reading the second element of x

can the last two statements be executed simultaneously?

Thank you in advance!


MicahVillmow
Staff

Re: Question regarding registers

Registers can be thought of as 16 bytes of memory that are addressable as four 4-byte components on the VLIW architecture.

On the GCN architecture, a scalar register is 4 bytes and a vector register is 64x4 bytes.
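As a rough illustration of those sizes, here is a minimal sketch in plain C. The type and function names are hypothetical and exist only for this example; they are not AMD definitions. It assumes the "64x4 bytes" figure means one 4-byte value for each of the 64 work-items in a wavefront.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model types for illustration only. */

/* VLIW view: one 16-byte register seen as four 4-byte components. */
typedef struct {
    uint32_t x, y, z, w;
} vliw_reg_t;

/* GCN view: a scalar register holds a single 4-byte value. */
typedef uint32_t gcn_sgpr_t;

/* GCN view: a vector register holds one 4-byte value per lane.
   The lane count of 64 is an assumption taken from "64x4 bytes". */
#define GCN_LANES 64
typedef struct {
    uint32_t lane[GCN_LANES];
} gcn_vgpr_t;

/* Size helpers so the numbers can be checked directly. */
size_t vliw_reg_bytes(void) { return sizeof(vliw_reg_t); }
size_t gcn_sgpr_bytes(void) { return sizeof(gcn_sgpr_t); }
size_t gcn_vgpr_bytes(void) { return sizeof(gcn_vgpr_t); }
```

Under these assumptions the helpers report 16, 4, and 256 bytes respectively, which is just the arithmetic from the post written out.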

viscocoa
Adept I

Re: Question regarding registers


Thank you Micah.

What is GCN?

On the Radeon 5870, if I declare a float variable and the compiler decides to allocate a register for it, will the variable occupy a whole 16-byte vector? Will the compiler consider packing multiple float variables into a 4-element vector?

I find it hard to predict when the compiler will allocate a register. I think there are lots of optimizations inside the compiler.

MicahVillmow
Staff

Re: Question regarding registers

GCN stands for Graphics Core Next and is our architecture for HD7XXX chips.

The compiler will pack registers when it can, but if too many registers are required, or if the index pattern into an array is dynamic and the array is above a certain size threshold, then the registers get spilled to memory.

viscocoa
Adept I

Re: Question regarding registers

Hi Micah,

I only know that the 5870 has 256 KB of vector registers. Does it also have separate scalar registers?

Do you mean that any array declared as a private variable, and indexed dynamically will be pushed into the global memory? An example is as below:

__kernel void mykernel() {
    int array[10];
    int total = 0;

    for (int i = 0; i < 10; i++)
        total += array[i]; // array is indexed dynamically, and will be put into global memory, and make the kernel super slow???
}

MicahVillmow
Staff

Re: Question regarding registers

That is a statically indexed array, as the indices can be determined at compile time.

The only way this gets pushed into uncached global memory is if the array is too large to fit into registers.

Only the HD7XXX series has scalar registers.
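To make the static versus dynamic distinction concrete, here is a small sketch in plain C rather than OpenCL C. The function names are hypothetical, and whether a particular compiler actually keeps the array in registers depends on its own heuristics; the sketch only shows what the compiler can and cannot know at compile time.

```c
#define N 10

/* Static indexing: the trip count is a compile-time constant, so a
   compiler can unroll the loop and see array[0] .. array[9] as ten
   separately known elements. */
int sum_static(const int array[N]) {
    int total = 0;
    for (int i = 0; i < N; i++)
        total += array[i];
    return total;
}

/* Dynamic indexing: idx is only known at run time, so the compiler
   cannot tell at compile time which element will be read. */
int pick_dynamic(const int array[N], unsigned idx) {
    return array[idx % N];
}
```

In the first function every index can be resolved ahead of time, which is the case described in the reply above. In the second, the element depends on a run-time value, which is the kind of access pattern that may force the array out of registers once it passes the size threshold.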

viscocoa
Adept I

Re: Question regarding registers

Hi Micah, I don't understand the phrase "addressable as 4 4-byte components".

Are the registers indexed in a linear space, like WORD0, WORD1, WORD2, WORD3, WORD4, ...

Or are they indexed in a two-layer space, like VECTOR0.x, VECTOR0.y, VECTOR0.z, VECTOR0.w, VECTOR1.x, ...

Registers are a precious resource on GPUs. I want to (1) save some registers and (2) speed up the calculations. Could you kindly answer yes or no to the following questions?

(1) a char variable occupies a 16-bit register

(2) the following variables may be packed into a float4 by the compiler:

float x;

float3 y;

(3) accessing different components of a vector register will be serialized, and slow down the calculation. E.g.

float4 v;

v.x = 1.f;                              // writing to the first element of v

float y = v.y + 2.f;                // reading the second element of v

Because v.x and v.y are in the same vector register, is there an access conflict?

Thank you very much in advance!

MicahVillmow
Staff

Re: Question regarding registers

1) A char occupies 4 bytes of a 16-byte register.

2) They may be packed into 4 components of up to 4 different registers.

3) No, what is in a vector in the source does not directly correspond to what is in a vector register.
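One way to read answers 1) and 2) is in terms of 4-byte components: a char or a float takes one component, a float3 takes three, and four components make up one 16-byte register. The sketch below, in plain C with hypothetical names, computes the smallest number of registers a set of variables could need under the assumption of perfect packing. The real compiler is free to spread the components across more registers.

```c
/* Assumption: each register offers four 4-byte components (VLIW case). */
enum { COMPONENTS_PER_REG = 4 };

/* Lower bound on registers needed for a given number of occupied
   components, assuming the compiler packs them perfectly. */
unsigned min_registers(unsigned components) {
    return (components + COMPONENTS_PER_REG - 1) / COMPONENTS_PER_REG;
}
```

For example, a float x (one component) plus a float3 y (three components) gives min_registers(1 + 3), which fits in a single register at best, while adding one more char pushes the bound to two.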

viscocoa
Adept I

Re: Question regarding registers

Thank you very much Micah!

May I ask how registers are addressed? In a 1D space like WORD0, WORD1, WORD2, WORD3, WORD4, ..., or in a 2D space like VECTOR0.x, VECTOR0.y, VECTOR0.z, VECTOR0.w, VECTOR1.x, ...?

Sorry for so many questions. I don't understand GPU assembly.
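For reference, the two naming schemes in the question can describe the same storage. The sketch below, in plain C with hypothetical names, converts between a linear word index and a (register, component) pair, assuming four 4-byte components per register as described earlier in the thread. It does not claim which form the hardware instruction encoding uses.

```c
/* A (register, component) pair such as VECTOR1.y. */
typedef struct {
    unsigned reg;   /* register number, e.g. 1 for VECTOR1 */
    char comp;      /* component letter: 'x', 'y', 'z' or 'w' */
} reg_component_t;

/* Map a linear word index (WORD0, WORD1, ...) to register.component,
   assuming four components per register. */
reg_component_t word_to_component(unsigned word) {
    static const char names[4] = { 'x', 'y', 'z', 'w' };
    reg_component_t rc;
    rc.reg = word / 4;
    rc.comp = names[word % 4];
    return rc;
}

/* Inverse mapping: comp_index is 0 for x, 1 for y, 2 for z, 3 for w. */
unsigned component_to_word(unsigned reg, unsigned comp_index) {
    return reg * 4 + comp_index;
}
```

Under this mapping WORD4 corresponds to VECTOR1.x and WORD5 to VECTOR1.y, so the 1D and 2D views differ only in notation.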

MicahVillmow
Staff

Re: Question regarding registers