Archives Discussions

yurtesen · 05-26-2014

In http://developer.amd.com/wordpress/media/2012/12/AMD_Southern_Islands_Instruction_Set_Architecture.p...

at section 9.3.1

LDS Direct reads occur in vector ALU (VALU) instructions and allow the LDS to supply a single DWORD value which is broadcast to all threads in the wavefront and is used as the SRC0 input to the ALU operations. A VALU instruction indicates that input is to be supplied by LDS by using the LDS_DIRECT for the SRC0 field.

I would like to know how many clock cycles of penalty this has compared with using data that is already in a register.

Do the ALUs have hidden registers to receive the data in SRC0? If not, where does the broadcast data get stored?

realhet · 05-27-2014

Hi,

src_lds_direct takes exactly the same amount of time as a vector or scalar register operand (measured with s_memtime).

It works like broadcasting a scalar register to the whole wavefront (WF), except that you can have up to 16KB of constants rather than only 103*4 bytes, and the ALU can still work at maximum utilization.
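To make the comparison concrete, here is a rough Python sketch of the idea, not real GCN code. It models a VALU add whose SRC0 is a single DWORD shared by all 64 lanes of a wavefront, and it works out the constant capacity from the figures in this post. The function name `valu_add_broadcast` and the LDS contents are made up for illustration.

```python
WAVEFRONT_SIZE = 64  # lanes per wavefront on Southern Islands

def valu_add_broadcast(src0_dword, src1_per_lane):
    """Toy model: one SRC0 value is broadcast to every lane and added to SRC1."""
    assert len(src1_per_lane) == WAVEFRONT_SIZE
    # 32-bit wraparound, as an integer add on DWORDs would behave
    return [(src0_dword + v) & 0xFFFFFFFF for v in src1_per_lane]

# Capacity comparison using the figures quoted in the post.
sgpr_constant_dwords = (103 * 4) // 4    # constants that fit in scalar registers
lds_constant_dwords = (16 * 1024) // 4   # constants that fit in 16KB of LDS

# Hypothetical LDS contents: DWORD i simply holds the value i.
lds = list(range(lds_constant_dwords))
lanes = list(range(WAVEFRONT_SIZE))      # per-lane SRC1 values

# Every lane sees the same LDS DWORD as SRC0, like a broadcast scalar register.
result = valu_add_broadcast(lds[1000], lanes)
```

With these numbers the LDS can hold 4096 DWORD constants against 103 in scalar registers, which is the whole appeal as long as the read itself costs nothing extra.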

SRC0 can select from 512 different things: 256 vregs, 128 sregs and 128 special things (I guess those come from the scalar ALU as well). lds_direct is one of these specials. The others include many int and float constants, debug/trap registers, state flags, and even a code that marks immediate data right after the instruction dword.
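The breakdown above can be sketched as a small decoder for the 9-bit SRC0 field. This is only an illustration: the ordering of the three ranges, and the two named codes (254 for LDS_DIRECT, 255 for the literal marker), are assumptions recalled from the ISA manual and should be checked against the document before you rely on them. `classify_src0` is a hypothetical helper, not part of any tool.

```python
from collections import Counter

# Assumed encodings; verify against the ISA manual before relying on them.
SRC_LDS_DIRECT = 254   # assumed code for LDS_DIRECT
SRC_LITERAL = 255      # assumed code for "literal follows the instruction dword"

def classify_src0(code):
    """Classify a 9-bit SRC0 value into the three groups described above."""
    if not 0 <= code <= 0x1FF:
        raise ValueError("SRC0 is a 9-bit field (0..511)")
    if code >= 256:
        return "vreg"      # assumed: 256..511 select v0..v255
    if code < 128:
        return "sreg"      # assumed: 0..127 is the scalar register range
    return "special"       # assumed: 128..255 holds constants, flags, etc.

# Count how many codes land in each group.
counts = Counter(classify_src0(c) for c in range(512))
```

Under these assumptions the counts come out as 256 vregs, 128 sregs and 128 specials, and both `SRC_LDS_DIRECT` and `SRC_LITERAL` fall in the special group.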


LDS Direct Read performance