givenchy
Journeyman III

Latency of accessing registers (private memory)

Hi,

I can find the global memory latency in the manual.

But I don't see the private memory latency there. NVIDIA's manual gives the latency as "zero cycles, or 24 cycles for a read-after-write dependency".

What about AMD OpenCL?

Regards,

Thanks

realhet
Miniboss

Hi,

There's no such latency on AMD. In every cycle, an instruction can read 3 registers and write 1 register.

The only penalty I know of is when a vector instruction that writes to a scalar register is followed by a scalar instruction. That can cost a 1-cycle penalty, but the compiler will avoid it anyway.
