So I have a laptop with a 6850U CPU and 4x8 GB of 6400 MT/s LPDDR5 RAM. Each of the 4 channels has a bus width of 32 bits. The theoretical memory bandwidth for this setup should be ~102 GB/s. I ran several memory bandwidth tests (ramspeed, STREAM, AIDA64), and the upper limit for the bandwidth seems to be ~51 GB/s (even for the GPU, although that is based on observed LLM inference speed).
Here is a review of the same laptop where AIDA64 memory copy speed is presented and all the speeds are below 51 GB/s.
What is limiting the memory bandwidth?
Have you verified that the memory clocks up to 3200 MHz (6400 MT/s) when running the benchmark?
The CPU-Z Memory screen from that link you posted shows the memory downclocked to ~800MHz.
It seems to be running at 6400 MT/s during the STREAM memory test (monitored with dmidecode and lshw):
lshw output:
H/W path   Device   Class    Description
================================================================
/0/1                memory   512KiB L1 cache
/0/2                memory   4MiB L2 cache
/0/3                memory   16MiB L3 cache
/0/6                memory   32GiB System Memory
/0/6/0              memory   8GiB Synchronous Unbuffered (Unregistered) 6400 MHz (0.2 ns)
/0/6/1              memory   8GiB Synchronous Unbuffered (Unregistered) 6400 MHz (0.2 ns)
/0/6/2              memory   8GiB Synchronous Unbuffered (Unregistered) 6400 MHz (0.2 ns)
/0/6/3              memory   8GiB Synchronous Unbuffered (Unregistered) 6400 MHz (0.2 ns)
/0/14               memory   128KiB BIOS
dmidecode output:
Handle 0x0009, DMI type 17, 92 bytes
Memory Device
    Array Handle: 0x0006
    Error Information Handle: 0x0008
    Total Width: 32 bits
    Data Width: 32 bits
    Size: 8 GB
    Form Factor: Other
    Set: None
    Locator: DIMM 0
    Bank Locator: P0 CHANNEL A
    Type: LPDDR5
    Type Detail: Synchronous Unbuffered (Unregistered)
    Speed: 6400 MT/s
And the test result:
-------------------------------------------------------------
Function    Best Rate MB/s   Avg time   Min time   Max time
Copy:           41048.7      0.042795   0.038978   0.048379
Scale:          24536.5      0.068154   0.065209   0.075868
Add:            26594.3      0.093679   0.090245   0.102343
Triad:          26701.4      0.094547   0.089883   0.104958
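To put the STREAM numbers above next to the two candidate limits discussed in this thread (102.4 GB/s and 51.2 GB/s), here is a minimal Python sketch. The function names `parse_stream_rates` and `fraction_of_peak` are made up for illustration, and the sketch assumes STREAM's "MB/s" means 10^6 bytes per second.

```python
# Sketch only: helper names are hypothetical, not part of STREAM.
STREAM_OUTPUT = """\
Function Best Rate MB/s Avg time Min time Max time
Copy: 41048.7 0.042795 0.038978 0.048379
Scale: 24536.5 0.068154 0.065209 0.075868
Add: 26594.3 0.093679 0.090245 0.102343
Triad: 26701.4 0.094547 0.089883 0.104958
"""


def parse_stream_rates(text):
    """Return {kernel name: best rate in MB/s} from STREAM result lines."""
    rates = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        fields = rest.split()
        # A result line looks like "Copy: rate avg min max"; skip anything else.
        if not sep or len(fields) != 4:
            continue
        try:
            rates[name.strip()] = float(fields[0])
        except ValueError:
            continue
    return rates


def fraction_of_peak(rate_mb_s, peak_gb_s):
    """Measured rate as a fraction of a peak (assumes 1 GB = 1000 MB)."""
    return (rate_mb_s / 1000.0) / peak_gb_s


if __name__ == "__main__":
    for kernel, rate in parse_stream_rates(STREAM_OUTPUT).items():
        print(
            f"{kernel:6s} {rate / 1000.0:6.1f} GB/s  "
            f"{fraction_of_peak(rate, 102.4):.0%} of 102.4 GB/s, "
            f"{fraction_of_peak(rate, 51.2):.0%} of 51.2 GB/s"
        )
```

This only restates the measured figures as percentages; it does not say which of the two limits is the real one.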
Open CPU-Z and monitor the Memory tab while the benchmark is running to verify the speed is actually getting to 3200MHz.
CPU-Z is not available on Linux. I ran it in a Windows VM, but it cannot access the memory info.
CPU-X is a similar tool for Linux. Here is the screenshot:
Voltages are all set at 0.5 V, so it probably does not downclock.
Thanks for posting the screenshot. That doesn't seem to be a realtime display of the memory operating frequency, which is what CPU-Z shows.
But apparently with LPDDR5 51.2GB/s is the max bandwidth. Here are links to Micron and Samsung that both state the same.
https://www.micron.com/products/memory/dram-components/lpddr5
https://semiconductor.samsung.com/us/dram/lpddr/lpddr5/
In the Micron link, it clearly states that the bandwidth of 51.2 GB/s is for a quad-channel system. So I thought maybe that's the limit.
But it seems there are LPDDR5 RAMs with a 16-bit bus width as well [Link]:
"LPDDR5 DRAMs offer additional power-savings using the dynamic voltage scaling (DVS) feature, in which the memory controller can reduce both the DRAM frequency and voltage during channel idle times. LPDDR DRAM channels are typically 16- or 32-bits wide, in contrast to the typical standard DDR DRAM channels which are 64-bit wide"
So it may be that in the Micron link, the maximum bandwidth is calculated for four 16-bit channels.
For the Ryzen 6000 series APUs, the memory controller can handle four 32-bit channels:
And from the output of the "dmidecode" command, my laptop has four 32-bit memory modules running at 6400 MT/s, on channels A, B, C and D (I might install Windows and run CPU-Z to confirm that it runs at 6400 MT/s). So the expected bandwidth is:
128 bit * 6400 MT/s * (1 byte / 8 bit) = 102.4 GB/s
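The same arithmetic can be written as a small Python sketch, which makes it easy to compare the four 32-bit channel case with the four 16-bit channel case that might be behind the 51.2 GB/s figure. The function name `peak_bandwidth_gb_s` and its parameters are illustrative, and 1 GB is taken as 10^9 bytes.

```python
def peak_bandwidth_gb_s(channels, channel_width_bits, data_rate_mt_s):
    """Theoretical peak bandwidth in GB/s (1 GB = 10**9 bytes).

    bandwidth = total bus width in bytes * transfers per second
    """
    total_bus_bits = channels * channel_width_bits
    bytes_per_transfer = total_bus_bits / 8
    # MT/s * bytes gives MB/s; divide by 1000 to get GB/s.
    return bytes_per_transfer * data_rate_mt_s / 1000.0


if __name__ == "__main__":
    configs = {
        "4 x 32-bit @ 6400 MT/s": (4, 32, 6400),
        "2 x 64-bit @ 6400 MT/s": (2, 64, 6400),
        "4 x 16-bit @ 6400 MT/s": (4, 16, 6400),
    }
    for label, args in configs.items():
        print(f"{label}: {peak_bandwidth_gb_s(*args):.1f} GB/s")
```

With these inputs, the two 128-bit configurations give the same result, and the 64-bit configuration gives half of it.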
I also contacted the AMD technical support:
"Please note that Ryzen™ 7 PRO 6850U CPU doesn't have 4 memory channels, it has 2 memory channels.
AMD Ryzen™ 7 PRO 6850U Drivers
So, the result you are seeing is expected."
But it is not the technical answer I was looking for, as it is possible for this APU to handle four 32-bit channels.
For anyone who sees this post, has a similar APU and RAM setup, and is running Windows: I'd appreciate it if you could post your RAM speeds and configuration here.
Thanks for sharing the link. It explains why LPDDR5 is getting half the bandwidth and why the standard calculation does not return the expected result.
https://www.synopsys.com/designware-ip/technical-bulletin/key-features-about-lpddr5.html
"... the memory controller in the SoC typically runs at half the CK frequency at the DDR PHY Interface in the DFI 1:2 ratio mode. For example, for an LPDDR4/4X speed of 4267 Mbps, the CK and DQS run at 2133 MHz, and the C/A has a data-rate of 2133 Mbps and controller clock runs at 1066 MHz.
Such a clocking scheme is not scalable at LPDDR5 speeds. Thus, LPDDR5 adopts a new clocking scheme, where CK runs at one fourth the data-strobe frequency at speeds higher than 3200 Mbps, and at half the data-strobe frequency at speeds under 3200 Mbps. Hence, even at 6400 Mbps, this clocking scheme requires CK to operate only at 800 MHz."
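The clock figures in the quoted passage can be reproduced with a short Python sketch. It assumes the data strobe runs at half the per-pin data rate (double data rate), which is what the quoted 4267 Mbps / 2133 MHz pair implies. The function names are made up, and treating exactly 3200 Mbps as the half-rate case is an assumption, since the quote only covers "higher than" and "under" 3200 Mbps.

```python
def lpddr4_clocks_mhz(data_rate_mbps):
    """Return (CK, controller clock) in MHz for the LPDDR4/4X scheme in the quote."""
    ck = data_rate_mbps / 2   # CK and DQS run at half the data rate
    controller = ck / 2       # DFI 1:2 mode: controller at half the CK frequency
    return ck, controller


def lpddr5_clocks_mhz(data_rate_mbps):
    """Return (data-strobe clock, CK) in MHz for the LPDDR5 scheme in the quote."""
    strobe = data_rate_mbps / 2   # assumption: strobe at half the data rate
    # Above 3200 Mbps CK is 1/4 of the strobe clock, otherwise 1/2.
    # The boundary case (exactly 3200 Mbps) is an assumption of this sketch.
    divider = 4 if data_rate_mbps > 3200 else 2
    return strobe, strobe / divider


if __name__ == "__main__":
    print("LPDDR4/4X @ 4267 Mbps (CK, controller):", lpddr4_clocks_mhz(4267))
    print("LPDDR5 @ 6400 Mbps (strobe, CK):", lpddr5_clocks_mhz(6400))
```

For 6400 Mbps this gives a CK of 800 MHz, which matches both the quote and the ~800 MHz reading mentioned earlier in the thread.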
So the standard formula (bus width * data rate / 8) assumes that the memory clock frequency is one half of the data rate, but for LPDDR5 it is only one fourth. Therefore, when the formula gives a result of 102.4 GB/s, it is really 51.2 GB/s.
I found your question asked on a Reddit thread and the same link was also referenced there explaining why LPDDR5 has half the expected bandwidth.
https://www.reddit.com/r/thinkpad/comments/xztt8c/every_laptop_with_6th_gen_ryzen_lpddr5_memory/
rayle, the AMD specifications here say the 6850U has two memory channels. This would account for the half speed measured. John.
"DDR5 splits the memory module into two independent 32-bit addressable subchannels to increase efficiency and lower the latencies of data accesses for the memory controller. The data width of the DDR5 module is still 64-bit. However, breaking it down into two 32-bit addressable channels increases overall performance." Kingston
It is two 64-bit channels, which is the same as four 32-bit channels, for a total bus width of 128 bits. So the memory bandwidth should be the same, I think.
rayle, please run Thaiphoon Burner and post a screenshot. John.
I could not find any alternative to Thaiphoon Burner that runs on Linux. There were some suggestions, but I could not get them to work.
rayle, thanks, John.