News: Samsung's Memory-Semantic CXL SSD Brings a 20X Performance Uplift

dehjomz

Distinguished
Dec 31, 2007
81
55
18,610
I know this is a datacenter application, but for the consumer space, if/when CXL-enabled CPUs come to market, can the OS be installed to a CXL drive, and what, if anything, will a CXL-based drive do for OS load times, application load times, and game-level load times? Will CXL drives enable persistent memory, for near instant OS loads from cold boot? Is the proliferation of CXL why Intel cancelled Optane? This Samsung drive seems like Optane on steroids.

I have so many questions about what CXL will enable for the consumer market. Like as the PC consumer market begins its transition to DDR5 RAM, should I save my DDR4 RAM modules... will they be able to one day be used in a CXL.mem device for additional RAM/cache/storage?
 

abufrejoval

Reputable
Jun 19, 2020
615
452
5,260
Well, the promise of NV-DIMMs was that you wouldn't ever boot again unless you had to recover from a catastrophic failure. When memory is non-volatile, you can just switch off and on and resume, very much like you do on an Android phone.

And then remember that servers aren't supposed to boot or sit idle: cloud vendors invest much more in making sure their ARM servers are always loaded than in copying the wonky turbo or power-saving features x86 carries over from its desktop and notebook legacy.

I don't think anyone on the CXL committees is thinking about consumer use cases; I doubt we'll see "CXL ready" stickers on any Intel notebooks or gamer kit.

CXL is designed for, and hopefully paid for by, hyperscaler customers and/or deep government-sponsored HPC pockets.

Perhaps some consumer gaming or desktop kit may eventually speak CXL, but only once it's cheaper to reuse IP blocks from the cloud/HPC kit instead of maintaining PCIe-only libraries.

Generally the industry does not want you to reuse any RAM kits (or any other components, for that matter): It's bad for business!

But I've also heard from former colleagues who built HPC hardware that DRAM really does age in HPC environments (probably in clouds, too) and eventually becomes unreliable and fails, at which point any savings from recycling are quickly wiped out. CPUs (and other logic chips) age as well; they just used to have higher margins built in.

That's changing as the competition is heating up ...quite literally.

And then the biggest challenge I see with CXL is that none of today's operating systems is anywhere near able to understand what it is or how to manage the variety of devices connected this way. Remember that both Linux and VMS/WNT are DDOSes, dictator disk operating systems, with a bit of networking grafted on and no notion of peers or of how to collaborate with them.

Unix can't really tell a GPU from a printer or an audio port. And believe me, you won't like what happens on your screen if Linux were to try to schedule 10k GPU cores on an RTX 4000. Now just imagine it trying to manage the allocation of memory blocks between CPU and GPU cores spread across various motherboards and buses throughout the data center...

I recommend you not hold your consumer breath for CXL.
 

InvalidError

Titan
Moderator
I have so many questions about what CXL will enable for the consumer market.
It won't do anything for the consumer market. CXL is little more than PCIe with a cache-coherency (and encryption in 2.0) layer on top. It makes no sense on single-socket consumer systems where the only system memory pool in the system is controlled by the only CPU in the system.

CXL is primarily intended for stitching large-scale systems (multiple processors, accelerators, large-scale storage, etc. across multiple boards) together, where something needs to keep track of what is being cached where in order to mitigate memory I/O conflicts between all of the devices on the bus competing for access to the different system-wide memory pools.

Basically, CXL is "Big-Tin" stuff. Unnecessary overhead and complexity for consumer stuff.
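
For reference, the spec splits that "layer on top" into three sub-protocols multiplexed over the PCIe physical layer. A rough map (the one-line summaries are my own paraphrase, not spec wording):

Code:
# Rough map of the three CXL sub-protocols (paraphrased, not spec wording).
CXL_PROTOCOLS = {
    "CXL.io":    "PCIe-style I/O: discovery, configuration, DMA, interrupts",
    "CXL.cache": "lets a device coherently cache host memory",
    "CXL.mem":   "lets the host use device-attached memory as cacheable system memory",
}

for name, role in CXL_PROTOCOLS.items():
    print(f"{name:9} {role}")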
 

USAFRet

Titan
Moderator
I know this is a datacenter application, but for the consumer space, if/when CXL-enabled CPUs come to market, can the OS be installed to a CXL drive, and what, if anything, will a CXL-based drive do for OS load times, application load times, and game-level load times? Will CXL drives enable persistent memory, for near instant OS loads from cold boot? Is the proliferation of CXL why Intel cancelled Optane? This Samsung drive seems like Optane on steroids.

I have so many questions about what CXL will enable for the consumer market. Like as the PC consumer market begins its transition to DDR5 RAM, should I save my DDR4 RAM modules... will they be able to one day be used in a CXL.mem device for additional RAM/cache/storage?
We're already close to that point.

Consider boot time:

HDD:
15 sec of BIOS
15 sec of drive and Windows
= 30 seconds total

SSD that is 5 times faster than the HDD:
15 sec of BIOS
3 sec of drive and Windows
= 18 seconds total

Faster, but not magical.


Take the spinning drive out of the equation with solid state, and we are way deep into diminishing returns.
Raw drive speed is only a small part of the equation.
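
To put rough numbers on those diminishing returns, a quick back-of-the-envelope sketch (the 15 sec firmware time and the 5x figure come from above; the 50x case is just an extra illustration of my own):

Code:
# Toy boot-time model: a fixed firmware phase plus a drive-bound phase that
# scales with storage speed. All figures are illustrative, not measurements.
BIOS_SEC = 15.0
HDD_DRIVE_SEC = 15.0

for label, speedup in [("HDD", 1), ("SSD, 5x faster", 5), ("NVMe, 50x faster", 50)]:
    total = BIOS_SEC + HDD_DRIVE_SEC / speedup
    print(f"{label:18} {total:5.1f} s total boot")

# Even an infinitely fast drive can't get below the 15 s firmware floor,
# which is why raw drive speed is only a small part of the equation.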
 
  • Like
Reactions: Makaveli

escksu

Reputable
BANNED
Aug 8, 2019
877
353
5,260
Despite the fancy name, this product is basically a DRAM cache/buffer paired with NAND. The concept is not new and has been around for ages: HDDs have a small buffer that works the same way, just much smaller (~32-64 MB). This is a device-level implementation.

On the PC side, a portion of main memory is used as a cache for data from the SSD/HDD. It serves the same purpose.

Btw, none of this speeds up boot time. When you power off the machine, the data in the cache is flushed to storage. So when you power up and boot into Windows, the cache has to be refilled from storage (storage is the bottleneck here).
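
A minimal sketch of that device-level buffer, just to show why it can't help a cold boot (the class and its behaviour are invented purely for illustration): the DRAM side is volatile, so it gets flushed at power-off and starts empty on the next boot.

Code:
# Toy model of a volatile DRAM write-back buffer in front of NAND
# (names and behaviour invented purely for illustration).
class BufferedSSD:
    def __init__(self):
        self.nand = {}        # persistent backing store
        self.dram = {}        # volatile cache/buffer

    def write(self, lba, data):
        self.dram[lba] = data                 # absorbed at DRAM speed

    def read(self, lba):
        if lba not in self.dram:              # miss: fetch from NAND
            self.dram[lba] = self.nand.get(lba)
        return self.dram[lba]                 # hit: DRAM speed

    def power_off(self):
        self.nand.update(self.dram)           # flush dirty data to NAND
        self.dram.clear()                     # DRAM contents are lost

ssd = BufferedSSD()
ssd.write(0, "windows boot files")
ssd.power_off()
# Next power-up: the cache is empty, so the first reads (i.e. booting) still
# come from NAND -- the DRAM buffer only helps once it has warmed up again.
assert 0 not in ssd.dram and ssd.read(0) == "windows boot files"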
 

escksu

Reputable
BANNED
Aug 8, 2019
877
353
5,260
I know this is a datacenter application, but for the consumer space, if/when CXL-enabled CPUs come to market, can the OS be installed to a CXL drive, and what, if anything, will a CXL-based drive do for OS load times, application load times, and game-level load times? Will CXL drives enable persistent memory, for near instant OS loads from cold boot? Is the proliferation of CXL why Intel cancelled Optane? This Samsung drive seems like Optane on steroids.

I have so many questions about what CXL will enable for the consumer market. Like as the PC consumer market begins its transition to DDR5 RAM, should I save my DDR4 RAM modules... will they be able to one day be used in a CXL.mem device for additional RAM/cache/storage?

No, it does nothing for the consumer market; consumer PCs have effectively had this all along. All PCs (be it Windows/Linux/macOS) use part of main memory to cache data from storage (be it HDD or SSD). So using a CXL SSD will do nothing to improve performance.

Such SSDs are only meant for heavily loaded servers, where they free up the CPU and main memory for other tasks. In that environment, the CPU simply transfers data from main memory to the CXL SSD's DRAM cache (much faster than transferring to NAND), and after that the drive manages its own transfers to NAND.
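
You can watch the host-side version of this on any PC: the OS keeps recently read file data in otherwise-idle RAM, so a repeat read is served from memory instead of the drive. A crude way to see it (Linux assumed; the file path below is hypothetical, any large existing file will do):

Code:
# Crude page-cache demo: the first read likely comes from the drive, the
# second is likely served from main memory. The path is hypothetical.
import time

PATH = "/var/tmp/some_large_file.bin"

def timed_read(path):
    start = time.perf_counter()
    with open(path, "rb") as f:
        while f.read(1 << 20):   # read in 1 MiB chunks
            pass
    return time.perf_counter() - start

cold = timed_read(PATH)   # cold-ish: data has to come from storage
warm = timed_read(PATH)   # warm: served from the OS cache in main memory
print(f"cold: {cold:.3f} s, warm: {warm:.3f} s")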
 

bit_user

Titan
Ambassador
It makes no sense on single-socket consumer systems where the only system memory pool in the system is controlled by the only CPU in the system.
Leaving aside the "consumer" part, the rest isn't necessarily true. For workstations and servers, where you will have in-package DRAM* for high-bandwidth access to "hot" data, CXL.mem gives you a way to scale memory capacity that doesn't involve having to bring out large numbers of DDR5 channels directly from the CPU. Linux already supports memory tiering, where pages can be migrated between faster and slower memory tiers. This is like what Intel tried to do with Optane DIMMs, but scales even better.
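
For the curious: on a kernel with CXL/tiering support, a CXL.mem expander typically shows up as a CPU-less NUMA node with a larger distance value, and the tiering code migrates pages between those nodes. A quick way to peek at the layout (Linux sysfs assumed; whether a memory-only node actually exists depends on the hardware):

Code:
# List NUMA nodes and distances via the standard Linux sysfs interface.
# On a box with CXL.mem (or other tiered memory), the expander shows up
# as an extra, CPU-less node with a higher distance than local DRAM.
import glob
import os

for node in sorted(glob.glob("/sys/devices/system/node/node[0-9]*")):
    with open(os.path.join(node, "distance")) as f:
        distances = f.read().split()
    has_cpus = bool(glob.glob(os.path.join(node, "cpu[0-9]*")))
    kind = "CPUs + local DRAM" if has_cpus else "memory-only (e.g. CXL expander)"
    print(f"{os.path.basename(node)}: distance={distances} ({kind})")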

CXL is primarily intended for stitching large-scale systems (multiple processors, accelerators, large-scale storage, etc. across multiple boards) together
Just because it's cache-coherent doesn't mean that's the only context where it makes sense.

There's plausible speculation that the next Mac Pro will use CXL.mem to complement in-package DRAM, in just such a fashion.

* The first mainstream server CPU with in-package DRAM will be the HBM variants of Sapphire Rapids... if Intel can ever get them out the door!
: O
 
Last edited:

InvalidError

Titan
Moderator
Leaving aside the "consumer" part, the rest isn't necessarily true. For workstations and servers, where you will have in-package DRAM* for high-bandwidth access to "hot" data, CXL.mem gives you a way to scale memory capacity that doesn't involve having to bring out large numbers of DDR5 channels directly from the CPU.
PCIe/CXL memory expansion would make sense to avoid hitting the swapfile, but at 80-100ns of latency. I have already predicted as much while musing about the probability of exactly that happening when AMD and Intel decide to put HBM on mainstream CPUs. You'll likely still want lower-latency stuff as supplemental memory when your workload is bigger than the HBM can handle, though.

If CPUs drop parallel interfaces for directly attached external memory, I suspect they will come up with an ultra-lightweight "PCIe alt-mode" to enable much lower latency memory-to-host connections.
 

bit_user

Titan
Ambassador
PCIe/CXL memory expansion would make sense to avoid hitting the swapfile but at 80-100ns of latency ... You'll likely still want lower-latency stuff as supplemental memory when your workload is bigger than the HBM can handle though.
Last I checked, 80-100 ns latency is what you get when accessing unbuffered, directly-connected DRAM DIMMs.

According to this, the CXL 2.0 interconnect adds only 20-40 ns of latency for CXL.mem devices:

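Putting those two figures together, the rough arithmetic looks like this (≈90 ns local DRAM and the 20-40 ns adder from above; the NAND number is just my own ballpark for scale):

Code:
# Rough latency ladder based on the figures discussed above (approximate).
LOCAL_DRAM_NS = 90          # unbuffered, directly attached DIMMs (~80-100 ns)
CXL_ADDER_NS = (20, 40)     # CXL 2.0 interconnect overhead for CXL.mem
NAND_READ_NS = 50_000       # ballpark NVMe NAND read, for scale (my assumption)

lo, hi = (LOCAL_DRAM_NS + d for d in CXL_ADDER_NS)
print(f"local DRAM : ~{LOCAL_DRAM_NS} ns")
print(f"CXL.mem    : ~{lo}-{hi} ns ({lo / LOCAL_DRAM_NS:.1f}-{hi / LOCAL_DRAM_NS:.1f}x DRAM)")
print(f"NAND flash : ~{NAND_READ_NS:,} ns (~{NAND_READ_NS // LOCAL_DRAM_NS}x DRAM)")
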
BTW, nice to chat with you again. I'm not sure how active I'll be, but it's reassuring to see you still around here.

P.S. Did you ever upgrade? I seem to recall you saying something about holding out for 2x single-thread performance or something. Did Zen 3 or Alder Lake cross the threshold? I'm starting to contemplate a move from Sandy Bridge to Alder Lake, myself. Supermicro's WS680 boards are finally in stock. Now, I just need to find some unbuffered DDR5 ECC DIMMs!!
 

abufrejoval

Reputable
Jun 19, 2020
615
452
5,260
Same, but I also do a few more restarts due to GPU driver updates and AGESA BIOS updates, so my number is more in the 10-12 range for the year.
AGESA has "stabilized" for AM4: I don't expect many more changes, now that there are no new CPUs coming that way.

I'm surprised the GPU driver updates require reboots. I see that with Intel iGPUs, but with Nvidia it's never required, except on Linux for major CUDA updates. But then I avoid gaming driver updates even on Windows and stick to the drivers that come with CUDA releases, which generally don't require reboots there either.

The only AMD GPUs I currently have are APUs in notebooks and those just don't receive any updates whatsoever (thanks, Lenovo 😕!)

Windows itself is another story, though: not only are reboots required, but I've found no way to avoid forced reboots since upgrading to Windows Server 2019. Since I typically run VMs there, those tend to be killed brutally, not even shut down.

I keep telling it not to shut down the machine; it keeps telling me that I'll have to reboot within two weeks, and then it gets "smart" and reboots the following night, mistaking the lack of user input for permission to reboot...
 
  • Like
Reactions: bit_user