News Three-year-old CPU beats Intel's fastest current chip in RAM benchmark — 7 GHz Core i9-11900K tops 8.3 GHz Core i9-14900KS in PYPrime 32B

bit_user

Polypheme
Ambassador
Sadly, latency is a tradeoff that DDR5 had to make, in order to continue scaling density and bandwidth. If you compare the lowest latency DDR5 (in terms of ns, not CL), it's nearly as good as fast DDR4 memory. At least, that's what I saw in examples I looked at.

Remember kids: CL is counted in memory clock cycles, so you need to divide CL by the memory clock (half the MT/s data rate) to get actual time, in order to compare them.
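To make that division concrete, here is a minimal sketch. The function name `cas_latency_ns` and the two example kits are my own illustrative choices, not anything from the article. It only covers CAS latency, not the full round-trip memory latency that benchmarks report.

```python
def cas_latency_ns(data_rate_mts: float, cl: int) -> float:
    """Convert a CL rating (in clock cycles) to nanoseconds.

    Hypothetical helper for illustration only.
    """
    # DDR transfers twice per clock, so the memory clock in MHz
    # is half the advertised MT/s data rate.
    clock_mhz = data_rate_mts / 2
    # One clock cycle lasts 1000 / clock_mhz nanoseconds.
    return cl * 1000 / clock_mhz


# Example kits chosen for illustration (assumed, not from the article).
for name, rate, cl in [("DDR4-3600 CL16", 3600, 16), ("DDR5-6000 CL30", 6000, 30)]:
    print(f"{name}: {cas_latency_ns(rate, cl):.2f} ns")
```

The much higher CL number on the DDR5 kit turns out to be only about a nanosecond slower in real time, which is the point of comparing in ns rather than CL.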

The article said:
Splave was running DDR4-3913 dual-rank, double-sided memory for interleaving gains
That's somewhat redundant. The key detail is "dual-rank". It just so happens that most dual-rank DIMMs are also double-sided and vice versa. However, I think it should be possible to have chip-stacked DRAM that's dual-rank without being double-sided.
 

jkflipflop98

Distinguished
Back in Ye Olde days when the company would just hand us free processors on request, I used to be into this scene. I'd go down to the SEM labs and fill a dewar with a bunch of LN2 and head home for nerdy fun on the weekends.

Gets expensive when you have to buy your own CPUs, though.
 

TheHerald

Upstanding
It's also the size of the cache: as it grows, latency grows with it. I never tried Rocket Lake, but going from CFL to CFL+ to CML (8700 --> 9900 --> 10900), latency progressively increased. An 8700K with tuned DDR4-4000 can drop to 32-33 ns, while with a 10900K, even with DDR4-4400 CL16 RAM, you'd hit a wall at around 36 ns. You can get lower than that, but only with unsavory amounts of voltage.
 

bit_user

Polypheme
Ambassador
TheHerald said:
It's also the size of the cache, as it grows latency grows with it. I never tried Rocket lake but going from CFL to CFL+ to CML (8700 --> 9900 --> 10900) latency progressively increased,
Huh? All of those CPUs used the formula 32/256/2048 kB for L1d/L2/L3 cache per core. The only thing that changed in each case was the number of cores. That meant more stops on the ring bus, but you also got more L3 cache, as a side effect.
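As a quick sanity check on the per-core formula, a small sketch: the core counts below are for the top desktop part of each generation (8700K, 9900K, 10900K), and the names `L3_PER_CORE_KB` and `total_l3_mb` are just made up for this example.

```python
# Per-core L3 slice size from the 32/256/2048 kB formula above.
L3_PER_CORE_KB = 2048

# Core counts of the top desktop part in each generation.
CORES = {"Core i7-8700K": 6, "Core i9-9900K": 8, "Core i9-10900K": 10}


def total_l3_mb(cores: int) -> int:
    """Total L3 in MB when every core brings one 2048 kB slice."""
    return cores * L3_PER_CORE_KB // 1024


for cpu, cores in CORES.items():
    print(f"{cpu}: {cores} cores -> {total_l3_mb(cores)} MB L3")
```

So total L3 goes 12, 16, 20 MB purely because of the extra cores (and extra ring stops), not because the per-core slice changed.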

Another thing about them is they all used basically the same process node. If you're using a denser process node, then you can sometimes increase cache size without increasing the latency (in clock cycles). The other thing you can do to reduce the effect of cache on memory latency is simply clock higher while keeping cache latency at the same number of clock cycles, thereby reducing how many ns the cache tag RAM lookups add to memory latency.
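To illustrate the clock-speed point, a rough sketch: the 40-cycle lookup cost and the two clock speeds are made-up numbers for the arithmetic, not measurements of any of these CPUs, and `cycles_to_ns` is a hypothetical helper.

```python
def cycles_to_ns(cycles: int, clock_ghz: float) -> float:
    """Time taken by a fixed number of clock cycles at a given clock speed."""
    # At N GHz there are N cycles per nanosecond.
    return cycles / clock_ghz


# Assumed, illustrative values: a 40-cycle cache lookup at two clock speeds.
LOOKUP_CYCLES = 40
for ghz in (4.0, 5.0):
    print(f"{LOOKUP_CYCLES} cycles @ {ghz} GHz = {cycles_to_ns(LOOKUP_CYCLES, ghz):.1f} ns")
```

Same cycle count, but the faster clock shaves a couple of ns off what the lookup contributes before the request even reaches the memory controller.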
 