News AMD and JEDEC Develop DDR5 MRDIMMs With Speeds Up To 17,600 MT/s

InvalidError

Titan
Moderator
This looks like it could be another HBM vs Hybrid Memory Cube situation. It will be interesting to see which memory type becomes the future standard.
I don't really see this as a competition: on-package and 3D-stacked memory gives you TBs/s bandwidth at the lowest possible latency and power while external memory lets you trade some bandwidth, latency and power for however much memory you want.

IMO, MRDIMM will be a transient thing: once the cost of on-package memory comes down to the point of becoming the norm and CPUs get enough of it for their intended workloads' typical working set (16-64GB for desktop, 64-256GB for servers), external memory bandwidth and latency won't matter much. Then the external memory controllers will be scrapped in favor of 12-16 extra PCIe lanes per scrapped channel and the option of using CXL-mem for memory expansion instead.

If JEDEC is going to complexify the DIMM protocol by adding what basically sounds like a multi-channel mux chip on the DIMM, may as well go one step further by offloading basic DRAM housekeeping functions to the mux/buffer chip and turn the CPU-DIMM interface into a simple transactional protocol ("I want to read/write N bytes starting from address X with transaction tag Y and service priority Z"), effectively moving the memory controller off-die.
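To make the "simple transactional protocol" idea a bit more concrete, here is a minimal sketch of what one such request might carry. Everything in it is hypothetical: `MemRequest`, `Op` and `service_order` are names made up for illustration, and the "higher priority first, then arrival order" rule is only an assumption about how a buffer chip could schedule work, not anything JEDEC has described.

```python
from dataclasses import dataclass
from enum import Enum


class Op(Enum):
    READ = "read"
    WRITE = "write"


@dataclass(frozen=True)
class MemRequest:
    """One hypothetical transaction sent from the CPU to the DIMM buffer."""
    op: Op
    address: int    # X: starting byte address
    length: int     # N: number of bytes to read or write
    tag: int        # Y: transaction tag, lets replies be matched to requests
    priority: int   # Z: service priority (assumed: higher is served first)


def service_order(requests):
    """Return tags in the order a buffer chip might serve them:
    highest priority first, ties broken by arrival order (an assumption)."""
    indexed = list(enumerate(requests))
    indexed.sort(key=lambda pair: (-pair[1].priority, pair[0]))
    return [req.tag for _, req in indexed]


queue = [
    MemRequest(Op.READ, 0x1000, 64, tag=1, priority=0),
    MemRequest(Op.WRITE, 0x2000, 64, tag=2, priority=3),
    MemRequest(Op.READ, 0x3000, 128, tag=3, priority=3),
]
print(service_order(queue))
```

The point of the tag is that the DIMM-side controller would be free to reorder requests around refresh and bank conflicts, with the CPU matching replies by tag instead of by timing.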
 

InvalidError

Titan
Moderator
I do wonder about the latency hit compared to normal... Also, how much more stress does the doubling put on the memory controller?
M(C)R-DIMMs don't put any extra load on memory controllers besides the strain from increased bus speed since all control and data signals are buffered on each DIMM and look like a single-point load to the memory controller no matter how many chips are on the buffer's back-end.
 

user7007

Commendable
Mar 9, 2022
45
33
1,560
So long as we don't have two competing standards on the consumer side, it sounds good to me. I know Intel is showing off similar tech for servers, but presumably (hopefully) the JEDEC version is all consumers ever have to worry about. On the server side I'm less worried; Intel needs every competitive advantage it can get at the moment.
 

bit_user

Titan
Ambassador
I hope this version somehow avoids the tradeoff of doubling the minimum burst length, but I'm not too optimistic.

I do wonder about the latency hit compared to normal...
Well, I think I read MCR DRAM adds a cycle. I'm not sure if that's relative to registered or unbuffered DRAM, though.

Assuming MR DRAM is implemented the same way, the effect should be similar.
 

bit_user

Titan
Ambassador
I don't really see this as a competition: on-package and 3D-stacked memory gives you TBs/s bandwidth at the lowest possible latency and power while external memory lets you trade some bandwidth, latency and power for however much memory you want.
I think you misunderstood that post. @jeremyj_83 was likening the MCR DRAM vs. MRDRAM contest to that of HMC vs HBM.

If JEDEC is going to complexify the DIMM protocol by adding what basically sounds like a multi-channel mux chip on the DIMM, may as well go one step further by offloading basic DRAM housekeeping functions to the mux/buffer chip and turn the CPU-DIMM interface into a simple transactional protocol ("I want to read/write N bytes starting from address X with transaction tag Y and service priority Z"), effectively moving the memory controller off-die.
We don't yet know that they're not, I think?
 

bit_user

Titan
Ambassador
Rambus? They didn't have to do anything but file lawsuits and settle for royalties from all major DRAM manufacturers... sounds like a pretty sweet deal!
Rambus was real, at first. It's only when they got snubbed by the other DRAM makers that they decided to sue for DDR infringing on some of their patents. I think the success of that effort is what basically turned them into an IP factory / patent troll.

BTW, Rambus goes back a pretty long ways. I think one of the first big design wins they got was in the Nintendo 64.
 

InvalidError

Titan
Moderator
We don't yet know that they're not, I think?
What few details there were about MRDIMM align pretty much exactly with what we already know about MCRDIMM, so my guess is JEDEC is standardizing Intel&friends' stuff for everyone else who wants to hop in under a JEDEC brand/standard.

MCRDIMM is just the normal DDR5 interface except there are data buffer chips on the DIMM multiplexing the DQs in a round-robin fashion across chip ranks. The 8800MT/s DIMMs are dual-rank and the projected 17600MT/s ones would most likely be quad-rank.
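As a rough sketch of that round-robin multiplexing (my own toy model, not taken from any spec): the buffer takes one beat from each rank in turn, so each rank only has to run at the interface rate divided by the rank count. The function names and the beat-by-beat interleaving granularity are assumptions for illustration.

```python
def interleave_round_robin(rank_bursts):
    """Merge equal-length bursts, one per rank, into a single bus stream
    by taking one beat from each rank in turn."""
    if not rank_bursts:
        return []
    length = len(rank_bursts[0])
    if any(len(burst) != length for burst in rank_bursts):
        raise ValueError("all ranks must supply bursts of the same length")
    bus = []
    for beat in range(length):
        for burst in rank_bursts:
            bus.append(burst[beat])
    return bus


def per_rank_rate(interface_mts, ranks):
    """Rate each rank's DRAM chips must sustain, in MT/s."""
    return interface_mts / ranks


rank0 = ["A0", "A1", "A2", "A3"]
rank1 = ["B0", "B1", "B2", "B3"]
print(interleave_round_robin([rank0, rank1]))
print(per_rank_rate(8800, 2))    # dual-rank DIMM at 8800 MT/s
print(per_rank_rate(17600, 4))   # projected quad-rank DIMM at 17600 MT/s
```

In both cases the chips behind the buffer would be running at 4400 MT/s, which is why the quad-rank guess for 17600 MT/s seems plausible.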
 

InvalidError

Titan
Moderator
Considering "G.Skill showed off real-world memory tests of its 8400MT/s " the first generation of this is kind of dead in the water already?
If MR-DIMM is the same as MCR-DIMM, then it is intended for 100+ core server CPUs, and those only do DDR5-4800 with a single DIMM slot per channel for stability reasons.

M(C)R-DIMM is basically a next-generation FB-DIMM that operates the DDR5 bus at N-times speed to multiplex commands and data to N-rank DIMMs, which provides N-times DIMM density, bandwidth and access concurrency scaling.
 

bit_user

Titan
Ambassador
MCRDIMM is just the normal DDR5 interface except there are data buffer chips on the DIMM multiplexing the DQs in a round-robin fashion across chip ranks. The 8800MT/s DIMMs are dual-rank and the projected 17600MT/s ones would most likely be quad-rank.
That's an oversimplification. They change the minimum burst size to 128 bytes, which isn't the optimal granularity for the cache hierarchy. This is possibly an even bigger tradeoff than any additional latency they add.
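For reference, the arithmetic behind the burst-size concern can be sketched like this. A standard DDR5 subchannel is 32 data bits wide with a burst length of 16; the 64-byte cache line and the "doubled burst" case are assumptions used to illustrate the 128-byte figure, and `burst_bytes` is just a helper name for this example.

```python
def burst_bytes(width_bits, burst_length):
    """Bytes moved by one burst on a data bus of the given width."""
    return width_bits // 8 * burst_length


CACHE_LINE_BYTES = 64  # assumption: typical cache line size on current CPUs

ddr5_burst = burst_bytes(width_bits=32, burst_length=16)
# Assumption: model the 128-byte minimum as a doubled burst on the same width.
doubled_burst = burst_bytes(width_bits=32, burst_length=32)

print(ddr5_burst, ddr5_burst // CACHE_LINE_BYTES)
print(doubled_burst, doubled_burst // CACHE_LINE_BYTES)
```

With a 64-byte burst, one access fills exactly one cache line. With a 128-byte minimum, every access drags in a second line whether or not the core wanted it, which is the granularity mismatch in question.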
 

bit_user

Titan
Ambassador
Considering "G.Skill showed off real-world memory tests of its 8400MT/s " the first generation of this is kind of dead in the water already?
It's a good point, and something I've also wondered about. My suspicion is that 8400 is only possible with the highest-binned DDR5 chips and at a significant efficiency penalty - not insurmountable for a pair of heatsink-clad DIMMs in a gaming machine, but a potential deal-breaker for servers with 24 DIMMs or more. A key point about MCR DRAM is that the actual DRAM chips need only operate at half of the interface speed.

It's probably not just about cost & efficiency, though. I assume MR/MCR DRAM will continue frequency-scaling well beyond the ceiling of regular DDR5.
 

InvalidError

Titan
Moderator
That's an oversimplification. They change the minimum burst size to 128 bytes, which isn't the optimal granularity for the cache hierarchy. This is possibly an even bigger tradeoff than any additional latency they add.
While the images I have seen about MCR may imply that data from ranks gets merged into 128-byte bursts, the accompanying text says ranks are individually controlled, in which case the two interleaved data bursts can be for two completely unrelated 64-byte chunks.
 

hasten

Distinguished
Oct 25, 2007
204
43
18,720
Rambus was real, at first. It's only when they got snubbed by the other DRAM makers that they decided to sue for DDR infringing on some of their patents. I think the success of that effort is what basically turned them into an IP factory / patent troll.

BTW, Rambus goes back a pretty long ways. I think one of the first big design wins they got was in the Nintendo 64.
I jumped on the Rambus bandwagon with a P4 2.66 GHz. The only downside was no upgrade path after the demise of RDRAM. It was fine technology, but complexity drove up costs and it couldn't compete in retail. I guess I didn't like the dummy modules because they were one more point of failure.

I'm not being facetious when calling Rambus the winner... there were a handful of silly, frivolous lawsuits, but the DDR patents were legit and deserved compensation.
 

Kamen Rider Blade

Distinguished
Dec 2, 2013
1,432
967
20,060
I think the RAMBUS patent on "MicroThreading" within the RAM package is about to expire in ~2025.

I think they got it around 2005, but since it's RAMBUS, nobody wants to deal with their licensing / proprietary BS.

But since it's about to expire, I can see JEDEC secretly readying their own version of "MicroThreading" within the RAM package to increase throughput / bandwidth.
 

bit_user

Titan
Ambassador
This is some stuff for EPYC; I doubt they'll release registered DIMMs for consumers on the cheap.
Right. MCR means dual-rank, at minimum. More chips -> higher cost. That's all the more reason for consumer platforms to pass this one by.

We've been hearing about DDR6, for a while. That's what I expect to propel consumer platforms forward, at least if in-package memory doesn't take off first.

For servers, MCR is a general technique that I expect should be equally applicable to DDR6 as DDR5.

For those above: the stick will be 128-bit. It's QDR, which is twice the bandwidth of DDR (64-bit).
DDR2 is already quad-data rate, in a sense. I believe DDR3 doubled it, again. I doubt DIMM widths will get any wider, since we're seeing channel bifurcation to 32-bit, in DDR5 and even down to 16-bit, in LPDDR5 and I think maybe DDR6.
 
DDR2 is already quad-data rate, in a sense. I believe DDR3 doubled it, again. I doubt DIMM widths will get any wider, since we're seeing channel bifurcation to 32-bit, in DDR5 and even down to 16-bit, in LPDDR5 and I think maybe DDR6.
Quad data rate means 400 MHz = 1600 MT/s (QDR is mostly seen on GPUs).
DDR5 split the 64 bits into two 32-bit streams (aka QDR, or dual channel at half bandwidth): no bandwidth benefit, just latency related. Technically, effective bandwidth drops due to extra control bits, but it's still DDR.
DDR4-4400 = 2200 MHz * 2 (DDR) = 4400 MT/s * (64/8) (bus width in bytes) = 35.2 GB/s
DDR5-4400 = 2200 MHz * 4 (it has two DDR streams) = 8800 MT/s * (32/8) (half bus width in bytes) = 35.2 GB/s

Here they go with two 64-bit streams, one for each rank, which would double bandwidth. Latency would probably be slow due to the buffer, but bandwidth would double.
DDR5 MCR 4400 = 2200 MHz * 4 (two DDR streams) = 8800 MT/s * (64/8) (full bus width) = 70.4 GB/s

That is the bandwidth for a single RAM stick (single channel); multiply that by the number of channels and you'll get a nice bandwidth boost.
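For anyone who wants to play with the numbers, the same arithmetic fits in a few lines. This is only a sketch of theoretical peak bandwidth using the figures from this post; `peak_gb_per_s` is a made-up helper name, and it ignores command overhead, refresh and everything else that eats into real throughput.

```python
def peak_gb_per_s(transfers_mts, bus_width_bits):
    """Theoretical peak bandwidth in GB/s:
    mega-transfers per second times bytes per transfer, divided by 1000."""
    return transfers_mts * (bus_width_bits / 8) / 1000


CLOCK_MHZ = 2200

ddr4_4400 = peak_gb_per_s(CLOCK_MHZ * 2, 64)   # one DDR stream, 64-bit bus
ddr5_4400 = peak_gb_per_s(CLOCK_MHZ * 4, 32)   # two DDR streams, 32 bits each
mcr_4400 = peak_gb_per_s(CLOCK_MHZ * 4, 64)    # two DDR streams, 64 bits each

print(ddr4_4400, ddr5_4400, mcr_4400)

# Per-socket figure for a hypothetical 12-channel server platform.
print(mcr_4400 * 12)
```

The 12-channel count in the last line is just an example value; plug in whatever your platform has.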