News AMD Responds to Claims of EPYC Genoa Memory Bug, Says It's On Track

A BIOS update to support RAM on a server? It's a big joke. Rushed design.
Timing was important to show dominance over Intel Sapphire Rapids, with even a single DIMM showing strength.

But you are right, it is not a mature platform. Sapphire Rapids is the opposite: it has been a long time coming, and between re-steppings (silicon fixes) and other delays it is a fully fledged, bug-free platform, while Genoa is bleeding-edge peak performance with questionable stability. Interesting counterpoints of competition: stability plus a dated design vs. peak performance with teething issues.
 
2DPC support should die with DDR5.

You hamstring bandwidth too much by forcing one memory controller to support so much electrical load that you have to compromise on the bandwidth.

Could you please explain how the mere presence of the 2DPC option hurts your use case? If you have the option you don't have to use it. But if you need it, well, then you better have it.

Some workloads benefit more from extra memory than from faster memory. And some workloads cannot even run when you don't have enough memory. And with 2DPC you can use cheaper modules and save quite some dosh.

I am looking at the prices of a particular retailer that sells 64GB modules for $455 but 128GB modules cost $1650. Imagine, then, that we need 3TB of memory across two 12-channel Epyc sockets. We can do this with 48x64GB for the cost of $21840 but with 24x128GB it jumps to $39600. $18k is a lot of money. It may easily be more than half the price of the rest of the system. If it's your money - go ahead and spend it however you like. But please don't tell me how I should spend mine.
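
To spell out the arithmetic, here is a minimal sketch; the module prices are just the example retail figures above and will vary by retailer:

```python
# Rough cost comparison for 3 TB across two 12-channel Epyc sockets.
# Prices are the example retail figures quoted above; adjust to taste.
CHANNELS_PER_SOCKET = 12
SOCKETS = 2

def config_cost(dimm_gb, dimm_price, dimms_per_channel):
    dimms = CHANNELS_PER_SOCKET * SOCKETS * dimms_per_channel
    return dimms, dimms * dimm_gb, dimms * dimm_price

# 2DPC with 64 GB modules vs. 1DPC with 128 GB modules
for gb, price, dpc in [(64, 455, 2), (128, 1650, 1)]:
    n, total_gb, cost = config_cost(gb, price, dpc)
    print(f"{n} x {gb} GB ({dpc}DPC): {total_gb // 1024} TB for ${cost:,}")

# Output:
# 48 x 64 GB (2DPC): 3 TB for $21,840
# 24 x 128 GB (1DPC): 3 TB for $39,600
```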
 
Could you please explain how the mere presence of the 2DPC option hurts your use case? If you have the option you don't have to use it. But if you need it, well, then you better have it.

Some workloads benefit more from extra memory than from faster memory. And some workloads cannot even run when you don't have enough memory. And with 2DPC you can use cheaper modules and save quite some dosh.

I am looking at the prices of a particular retailer that sells 64GB modules for $455 but 128GB modules cost $1650. Imagine, then, that we need 3TB of memory across two 12-channel Epyc sockets. We can do this with 48x64GB for the cost of $21840 but with 24x128GB it jumps to $39600. $18k is a lot of money. It may easily be more than half the price of the rest of the system. If it's your money - go ahead and spend it however you like. But please don't tell me how I should spend mine.

2DPC is "2 DIMMs per channel"; with 12 channels, where is the 48x64GB coming from?

Sorry, I'm not a computer specialist, but I don't see the logic.

Unless you mean 24x64 against 12x128 (1DPC), is this a typo?
 
2DPC is "2 DIMMs per channel"; with 12 channels, where is the 48x64GB coming from?

Sorry, I'm not a computer specialist, but I don't see the logic.

Unless you mean 24x64 against 12x128 (1DPC), is this a typo?

He wrote "two 12-channel Epyc sockets", so 24 channels.

Imagine, then, that we need 3TB of memory across two 12-channel Epyc sockets. We can do this with 48x64GB for the cost of $21840 but with 24x128GB it jumps to $39600.


1DPC means 24 DIMMs. 2DPC means 48 DIMMs. 48x64 = ... etc

// Stefan
 
This is the mandate of the cross-licensing agreement with Intel. You can see the effect of that mandate in AMD's APUs with integrated graphics, which have some limits on the PCI Express versions and lane counts.
 
They don't call it Semi-Accurate for nothing...😉
Charlie's spiel isn't exactly generous with the love.

But even if he had not erred, I'd have to agree with Paul's assessment, that with 12 channels and denser DIMMs, even an outright lack of dual DIMM support wouldn't disqualify the product for the majority of its users.

I'd actually love to have some info on the number of ranks that the various Zen CPUs support: quad-rank LR-DIMMs with 64GB per stick seem to be becoming available, and that could be interesting even for "desktop" boards.
 
He wrote "two 12-channel Epyc sockets", so 24 channels.

1DPC means 24 DIMMs. 2DPC means 48 DIMMs. 48x64 = ... etc

Thanks for answering on my behalf!

Please note that I specifically chose the 2 socket example because if you are buying a single socket system but you think you'll need lots of memory, it might be better to obtain a dual socket one. And the price difference of the DIMMs can almost make up for the extra cost of the dual socket system. Plus it will have higher aggregate throughput.

Of course, if you already have the single socket and you need more memory, then you better have a 2DPC capable system.
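
As a back-of-the-envelope sketch of that trade-off: the barebones system prices below are made-up placeholders, not quotes; only the DIMM prices come from my earlier example.

```python
# Hypothetical comparison: single-socket 2DPC vs. dual-socket 1DPC for 3 TB.
# DIMM prices are the example figures from earlier in the thread;
# the barebones system prices are invented placeholders for illustration.
single_socket_barebones = 8_000   # hypothetical 1P system price
dual_socket_barebones   = 12_000  # hypothetical 2P system price

single_1p_2dpc = single_socket_barebones + 24 * 1650  # 24 x 128 GB, 2DPC on 12 channels
dual_2p_1dpc   = dual_socket_barebones   + 48 * 455   # 48 x 64 GB, 1DPC on 24 channels

print(f"1P, 24 x 128 GB (2DPC): ${single_1p_2dpc:,}")  # $47,600
print(f"2P, 48 x 64 GB (1DPC):  ${dual_2p_1dpc:,}")    # $33,840
# The ~$18k DIMM saving can offset much of the second socket's cost,
# and the 2P box also has twice the aggregate memory bandwidth.
```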
 
Thanks for answering on my behalf!

Please note that I specifically chose the 2 socket example because if you are buying a single socket system but you think you'll need lots of memory, it might be better to obtain a dual socket one. And the price difference of the DIMMs can almost make up for the extra cost of the dual socket system. Plus it will have higher aggregate throughput.

Of course, if you already have the single socket and you need more memory, then you better have a 2DPC capable system.

Makes perfect sense. There are so many workloads, and some of them will favor single-socket systems just for latency's sake; for those, the single-socket 2DPC config makes a lot of sense if they need loads of memory. Just as there are many workloads that can be split up so that each socket mostly uses the memory that is close to it.

Different workloads, different setups. Just having the possibility of 2DPC is great, regardless of the pricing in your example. Some just don't care about price and only want performance.

// Stefan
 
Could you please explain how the mere presence of the 2DPC option hurts your use case? If you have the option you don't have to use it. But if you need it, well, then you better have it.
It mostly has to do with turning down the transfer speed per DIMM once you add a second DIMM onto the same channel.
Because you're adding extra electrical load onto the memory controller, your speed is compromised.

Doesn't really matter if it's AMD or Intel; their memory controllers' maximum transfer speed will get noticeably nerfed if you add in 2DPC.

That's how it has always been, that's how it's going to be.

I'd rather see companies moving to 1DPC and just give you 2x the memory controllers to compensate, so you can get speed & capacity if that's where you want things to go.
 
It mostly has to do with turning down the transfer speed per DIMM once you add a second DIMM onto the same channel.
Because you're adding extra electrical load onto the memory controller, your speed is compromised.

However, the question wasn't about adding a second DIMM to the channel; it was about having the possibility to add a second DIMM to the channel, i.e. having a slot for it. And yes, there is a super minor difference because of the added trace lengths, etc., but it's not really until you actually add the DIMM that there is a problem.

Doesn't really matter if it's AMD or Intel; their memory controllers' maximum transfer speed will get noticeably nerfed if you add in 2DPC.
That's how it has always been, that's how it's going to be.

Show a benchmark; remember, we're not talking about adding a second DIMM here, we're talking about having the slot for it. We all know why a second DIMM will nerf the speed.

I'd rather see companies moving to 1DPC and just give you 2x the memory controllers to compensate, so you can get speed & capacity if that's where you want things to go.

The cost is prohibitive, but that is in a way what AMD has done by going to 12 channels. I agree that it is a way to solve it, but it also means a lot more pins on the CPU and traces that need to be laid out, whereas the 2DPC config is relatively easy in comparison. Some workloads would be better off for it; however, not all need the speed and will say DDR5-4000 or 4400 is good enough, they just need the amount of memory.
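
To put rough numbers on that speed-versus-slots trade-off, here is a small sketch; the 1DPC figure is Genoa's advertised DDR5-4800, while the 2DPC figure is just an illustrative assumption of the kind of derating being discussed:

```python
# Peak theoretical bandwidth per socket for a 12-channel DDR5 controller.
# DDR5 channels are 64 bits (8 bytes) wide; ECC bits are ignored here.
CHANNELS = 12
BYTES_PER_TRANSFER = 8

def peak_gbps(mt_per_s):
    # transfers/s * bytes/transfer * channels, expressed in GB/s
    return CHANNELS * mt_per_s * BYTES_PER_TRANSFER / 1000

print(f"1DPC @ DDR5-4800: {peak_gbps(4800):.0f} GB/s")  # ~461 GB/s
print(f"2DPC @ DDR5-4000: {peak_gbps(4000):.0f} GB/s")  # ~384 GB/s (assumed speed)
# You give up some peak bandwidth but double the DIMM slots per socket.
```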
 
a dual-socket server would need 48 total slots. As such, we believe that most 2DPC configs will likely either be for single-socket servers or use a reduced number of channels in dual-socket servers. In fact, the Tyan server that lists 2DPC support only has a single socket.
I wonder if we're nearing the end of the 2-socket era. At least, 2-sockets in the mainstream server. As the number of cores per server continues to grow, it makes less and less sense to add a costly and bottleneck-prone multi-CPU cache coherency bus.

market insiders have even predicted that support for 2DPC could end with the DDR6 standard.
True, but not for the reasons stated in the article. External DRAM can't compete with in-package memory for either speed or efficiency. Therefore, in-package DRAM will provide a performance memory tier, while external CXL.mem devices will provide the capacity - even scaling up to an entire rack drawer full of DRAM.
 
2DPC support should die with DDR5.

You hamstring bandwidth too much by forcing one memory controller to support so much electrical load that you have to compromise on the bandwidth.
That's what registered/buffered memory is for.

Could you please explain how the mere presence of the 2DPC option hurts your use case? If you have the option you don't have to use it. But if you need it, well, then you better have it.
As the article mentioned, Alder Lake had an interesting property: the mere presence of a second, empty DIMM slot per channel lowered the maximum speed of the first slot!

We can do this with 48x64GB
As the article mentioned, you'll likely have trouble finding boards with 48 DIMM slots!
 
I'd actually love to have some info on the number of ranks that the various Zen CPUs support: quad-rank LR-DIMMs with 64GB per stick seem to be becoming available, and that could be interesting even for "desktop" boards.
Neither AMD nor Intel desktop CPUs support LR or Registered memory. Conversely, neither TR Pro nor Xeon W 2400/3400 support Unbuffered.
 
48 slots in a side by side design is not very feasible, but it wouldn't be so bad in a front and back layout. Many server chassis are moving to 950mm length to do more per rack U, and that's plenty of room for a front/back design.

I've also worked on quad CPU designs where the memory was installed on daughter boards that ran vertically off the motherboard. That design held 64 memory slots, but needed 3U. That wasn't DDR5 and I don't know how far you can push bus speeds in that orientation. At the same time I can see that being a popular arrangement for CXL memory.
 
48 slots in a side by side design is not very feasible, but it wouldn't be so bad in a front and back layout.
I think it's fun to imagine putting each CPU and its memory on opposite sides of the same PCB. I know it's impractical, but perhaps not impossible. A simpler solution would probably be two separate boards (or 2 instances of the same board) that hook together via a pair of PCIe-like connectors that are mirror images (one male, one female).

I've also worked on quad CPU designs where the memory was installed on daughter boards that ran vertically off the motherboard.
As you say, CXL will bring this back.
 
I think it's fun to imagine putting each CPU and its memory on opposite sides of the same PCB. I know it's impractical, but perhaps not impossible. A simpler solution would probably be two separate boards (or 2 instances of the same board) that hook together via a pair of PCIe-like connectors that are mirror images (one male, one female).
Not sure what you mean by opposite sides, but by front/back I wasn't intending to mean one CPU in front by the drives and one in the back by the I/O ports, but an orientation where the CPUs are almost kissing - just one toward the front and one toward the back. In my experience this has become more common than side by side. CPU-to-CPU traces are shorter and don't have to run under the memory slots, which simplifies the board. In blade servers front/back is pretty much the only orientation available - where CPU 2 eats the heat blowing back from CPU 1. Not ideal, but as I say it's the only way. In full 1U servers, where there is more real estate, they tend to offset the CPUs just a bit to minimize that effect.