News Samsung's Memory-Semantic CXL SSD Brings a 20X Performance Uplift

Admin · Aug 3, 2022

Samsung's Memory-Semantic CXL SSD targets AI and ML applications with a large DRAM cache.

Samsung's Memory-Semantic CXL SSD Brings a 20X Performance Uplift : Read more

dehjomz · Aug 3, 2022

I know this is a datacenter application, but for the consumer space, if/when CXL-enabled CPUs come to market, can the OS be installed to a CXL drive, and what, if anything, will a CXL-based drive do for OS load times, application load times, and game-level load times? Will CXL drives enable persistent memory, for near instant OS loads from cold boot? Is the proliferation of CXL why Intel cancelled Optane? This samsung drive seems like Optane on steroids.

I have so many questions about what CXL will enable for the consumer market. Like as the PC consumer market begins its transition to DDR5 RAM, should I save my DDR4 RAM modules... will they be able to one day be used in a CXL.mem device for additional RAM/cache/storage?

abufrejoval · Aug 3, 2022

Well the promise of NV-DIMMs was that you wouldn't ever boot again unless you had to recover from a catastrophic failure. When memory is non-volatile you can just switch off and on and resume very much like you do on an Android.

And then remember that servers aren't supposed to boot or be idle: cloud vendors invest much more in making sure that their ARM servers are always loaded than in copying these wonky turbo or power saving features x86 has from its desktop and notebook legacy.

I don't think anyone on CXL comittees is thinking about consumer use cases, I doubt we'll see "CXL ready" stickers on any Intel notebooks or gamer kit.

CXL is designed and hopefuly paid for by hyperscaler customers and/or deep government sponsored HPC pockets.

Perhaps some consumer gaming or desktop kit may eventually speak CXL, but only once it's cheaper to reuse IP blocks from the cloud/HPC kit instead of maintaining PCIe-only libraries.

Generally the industry does not want you to reuse any RAM kits (or any other components, for that matter): It's bad for business!

But from former colleagues of mine, who produced HPC hardware, I also heard that DRAM really does age in HPC environments (probably in clouds, too) and would eventually become unreliable and fail, at which points all savings from recycling are quickly destroyed. Actually CPUs (and other logic chips) do, too, they just used to have higher margins built in.

That's changing as the competition is heating up ...quite literally.

And then the biggest challenge I see with CXL is that none of today's operating systems are near able to understand what it is or how to manage all the variety of devices thus connected. Remember that both Linux and VMS/WNT are DDOS, dictator disk operating systems with a bit of networking grafted on them and no notion of peers and on how to collaborate with them.

Unix can't really tell a GPU from a printer or audio port. And believe me you won't like what happens on your screen if Linux were to try to schedule 10k GPU cores on an RTX 4000. And now just imagine it trying to manage the allocation of memory blocks between CPU and GPU cores spread around various motherboards and busses across the data center...

I recommend you not hold your consumer breath for CXL.

jkflipflop98 · Aug 3, 2022

None of that makes a shred of technical sense.

InvalidError · Aug 3, 2022

dehjomz said:
I have so many questions about what CXL will enable for the consumer market.

It won't do anything for the consumer market. CXL is little more than PCIe with a cache-coherency (and encryption in 2.0) layer on top. It makes no sense on single-socket consumer systems where the only system memory pool in the system is controlled by the only CPU in the system.

CXL is primarily intended for stitching large-scale systems (multiple processors, accelerators, large-scale storage, etc. across multiple boards) together where something needs to keep tabs on what is being cached where to mitigate potential memory IO conflicts between all of the stuff that is on the bus competing for memory access across all of the different system-wide memory pools.

Basically, CXL is "Big-Tin" stuff. Unnecessary overhead and complexity for consumer stuff.

USAFRet · Aug 3, 2022

dehjomz said:
I know this is a datacenter application, but for the consumer space, if/when CXL-enabled CPUs come to market, can the OS be installed to a CXL drive, and what, if anything, will a CXL-based drive do for OS load times, application load times, and game-level load times? Will CXL drives enable persistent memory, for near instant OS loads from cold boot? Is the proliferation of CXL why Intel cancelled Optane? This samsung drive seems like Optane on steroids.

I have so many questions about what CXL will enable for the consumer market. Like as the PC consumer market begins its transition to DDR5 RAM, should I save my DDR4 RAM modules... will they be able to one day be used in a CXL.mem device for additional RAM/cache/storage?

We're already close to that point.

Consider boot time

HDD:
15 sec of BIOS,
15 sec of drive and Windows

30 seconds total.

SSD that is 5 times faster than the HDD
15 sec of BIOS
3 sec of drive and Windows

18 secs total.

Faster, but not magical.

Take the spinning drive out of the equation with solid state, and we are way deep into diminishing returns.
Raw drive speed is only a small part of the equation.

InvalidError · Aug 3, 2022

USAFRet said:
Faster, but not magical.

If you don't like boot times...

Standby:
0s BIOS
0s boot
~1s for the monitor to detect sync regardless of SSD or HDD.

I reboot my system maybe 4-5 times per year and that includes the 1H/2H Windows updates.

USAFRet · Aug 3, 2022

InvalidError said:
I reboot my system maybe 4-5 times per year and that includes the 1H/2H Windows updates.

That's about the same amount as I do.
Major updates, and after a vacation.

Other than that, it is ON and idling.

Makaveli · Aug 3, 2022

USAFRet said:
That's about the same amount as I do.
Major updates, and after a vacation.

Other than that, it is ON and idling.

Same but I also do a few more restarts due to GPU driver updates, AGESA Bios updates so my number is more in the 10-12 range for the year.

escksu · Aug 3, 2022

Despite the fancy names, this product is basically just having a DRAM cache/buffer pairing with NAND. Concept is not new and has been around for ages. HDDs have a small buffer that works the same way, just that its much smaller in size (~32-64MB). This is device level implrementation.

For PC side, a portion of our main memory is being used as cache for data from SSD/HDD. Serves the same purpose.

Btw, all these do not speed up boot time. When you power off the machine, the data from the cache is flushed to the storage. So, when you power up and boot to windows, the cache has to be filled from storge (bottleneck is storage here).

escksu · Aug 3, 2022

dehjomz said:
I know this is a datacenter application, but for the consumer space, if/when CXL-enabled CPUs come to market, can the OS be installed to a CXL drive, and what, if anything, will a CXL-based drive do for OS load times, application load times, and game-level load times? Will CXL drives enable persistent memory, for near instant OS loads from cold boot? Is the proliferation of CXL why Intel cancelled Optane? This samsung drive seems like Optane on steroids.

I have so many questions about what CXL will enable for the consumer market. Like as the PC consumer market begins its transition to DDR5 RAM, should I save my DDR4 RAM modules... will they be able to one day be used in a CXL.mem device for additional RAM/cache/storage?

No, it does nothing for consumer market. Consumer PCs already have this speed all along. All PCs (be it windows/linux/MacOS)) are known to utilise part of the man memory for caching data from storage (be it HDD or SSD). So, using CXL SSD will do nothing to improve performance.

Such SSDs are only mean for intensive servers by freeing up CPU and main memory for other tasks. In such environment, the CPU simply transfer data from main memory to CXL SSD DRAM cache (much faster than transferring to NAND). After that, the drive manages its own trnasfers to NAND.

bit_user · Aug 4, 2022

InvalidError said:
It makes no sense on single-socket consumer systems where the only system memory pool in the system is controlled by the only CPU in the system.

Leaving aside the "consumer" part, the rest isn't necessarily true. For workstations and servers, where you will have in-package DRAM* for high-bandwidth access to "hot" data, CXL.mem gives you a way to scale memory capacity that doesn't involve having to bring out large numbers of DDR5 channels directly from the CPU. Linux already supports memory tiering, where pages can be migrated between faster and slower memory tiers. This is like what Intel tried to do with Optane DIMMs, but scales even better.

InvalidError said:
CXL is primarily intended for stitching large-scale systems (multiple processors, accelerators, large-scale storage, etc. across multiple boards) together

Just because it's cache-coherent doesn't mean that's the only context where it makes sense.

There's plausible speculation that the next Mac Pro will use CXL.mem to complement in-package DRAM, in just such a fashion.

* The first mainstream server CPU with in-package DRAM will be the HBM variants of Sapphire Rapids... if Intel can ever get them out the door!
: O

bit_user · Aug 4, 2022

USAFRet said:
Other than that, it is ON and idling.

I put mine to sleep. It uses almost no power in sleep. Far less than even a truly idle system (and my box never seems completely idle, if I have even a couple web browser windows open and minimized).

InvalidError · Aug 4, 2022

bit_user said:
Leaving aside the "consumer" part, the rest isn't necessarily true. For workstations and servers, where you will have in-package DRAM* for high-bandwidth access to "hot" data, CXL.mem gives you a way to scale memory capacity that doesn't involve having to bring out large numbers of DDR5 channels directly from the CPU.

PCIe/CXL memory expansion would make sense to avoid hitting the swapfile but at 80-100ns of latency and I have already predicted as much while musing the probability of exactly that happening when AMD and Intel decide to put HBM on mainstream CPUs. You'll likely still want lower-latency stuff as supplemental memory when your workload is bigger than the HBM can handle though.

If CPUs drop parallel interfaces for directly attached external memory, I suspect they will come up with an ultra-lightweight "PCIe alt-mode" to enable much lower latency memory-to-host connections.

bit_user · Aug 4, 2022

InvalidError said:
PCIe/CXL memory expansion would make sense to avoid hitting the swapfile but at 80-100ns of latency ... You'll likely still want lower-latency stuff as supplemental memory when your workload is bigger than the HBM can handle though.

Last I checked, 80-100 ns latency is what you get when accessing unbuffered, directly-connected DRAM DIMMs.

The Intel 12th Gen Core i9-12900K Review: Hybrid Performance Brings Hybrid Complexity

www.anandtech.com

According to this, the CXL 2.0 interconnect adds only 20-40 ns of latency for CXL.mem devices:

Latency Considerations Of IDE Deployment On CXL Interconnects

Protecting high-value data passed across interconnects with minimal performance impact.

semiengineering.com

BTW, nice to chat with you, again. I'm not sure how active I'll be, but it's reassuring to see you still around here.

P.S. did you ever upgrade? I seem to recall you saying something about holding out for 2x single-thread performance or something. Did Zen 3 or Alder Lake cross the threshold? I'm starting to contemplate a move from Sandybridge to Alder Lake, myself. Supermicro's WS680 boards are finally in stock. Now, I just need to find some unbuffered DDR5 ECC DIMMs!!

abufrejoval · Aug 15, 2022

Makaveli said:
Same but I also do a few more restarts due to GPU driver updates, AGESA Bios updates so my number is more in the 10-12 range for the year.

AGESA has "stabilized" for AM4: I don't expect many more changes, now that there are no new CPUs coming that way.

I'm surprised by the GPU driver updates requiring reboots. I see that for Intel iGPUs, but for Nvidia it's never required, except with Linux for major CUDA updates. But then I avoid game driver updates even on Windows and stick to using the drivers that come with CUDA updates, which regularly require no reboots there.

The only AMD GPUs I currently have are APUs in notebooks and those just don't receive any updates whatsoever (thanks, Lenovo 😕!)

Windows itself is another story, though; not only are reboots required, but I've found no way to avoid forced reboots since upgrading to Windows server 2019. Since I typically run VMs there, those tend to be killed brutally, not even shut down.

I keep telling it not to shut down the machine, it keeps telling me that I'll have to reboot within two weeks and then it get "smart" and reboots the following night, mistaking lack of user input for a permission to reboot...

Search

News Samsung's Memory-Semantic CXL SSD Brings a 20X Performance Uplift

Admin

Administrator

dehjomz

Distinguished

abufrejoval

Honorable

jkflipflop98

Distinguished

InvalidError

Titan

USAFRet

Titan

InvalidError

Titan

USAFRet

Titan

Makaveli

Splendid

escksu

Reputable

escksu

Reputable

bit_user

Titan

bit_user

Titan

InvalidError

Titan

bit_user

Titan

The Intel 12th Gen Core i9-12900K Review: Hybrid Performance Brings Hybrid Complexity

Latency Considerations Of IDE Deployment On CXL Interconnects

abufrejoval

Honorable

TRENDING THREADS

Latest posts

Moderators online

Share this page