This makes me extremely happy. I tend to perform tasks that are more memory intensive than compute intensive. My most recent project took 3 days to run because my Ryzen 7 2700X + 64 GB RAM ran out of RAM. I've been frustrated by the fact that EPYC clock speeds are so much lower than Ryzen and
I inexpertly puzzle over your surprisingly common predicament - the effective ~64 GB ceiling for super cost-effective AM4 - yet it seems that with the advent of NVMe there is an opportunity to simulate vast amounts of memory, or better yet, to code around the problem.
An AM4 X570 board has 16 lanes of PCIe 4.0 bandwidth available for NVMe if the GPU runs at x8, and 24 lanes if headless.
Four of those lanes are the chipset uplink, but multiple NVMe drives can share that chipset bandwidth.
That's a lot of bandwidth for NVMe (~2 GB/s per PCIe 4.0 lane per direction) - roughly 32 GB/s each way, or ~64 GB/s theoretical total counting both directions, for a realistic x8 GPU rig as above.
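For the curious, a rough tally of that lane math as a sketch: the split below assumes an X570 board with the GPU slot bifurcated to x8/x8, and ~2 GB/s of usable throughput per PCIe 4.0 lane per direction (both figures are ballpark assumptions, not measurements).

```python
# Back-of-the-envelope PCIe 4.0 bandwidth tally for an AM4/X570 rig.
# Assumptions (illustrative only): GPU slot bifurcated x8/x8,
# ~2 GB/s usable per PCIe 4.0 lane, per direction.

GBPS_PER_LANE = 2.0  # approx. PCIe 4.0 throughput per lane, one direction

lanes = {
    "gpu_slot_nvme (x8 via bifurcation)": 8,
    "cpu_m2_slot (x4)": 4,
    "chipset_uplink (x4, shared by chipset NVMe)": 4,
}

total_lanes = sum(lanes.values())
one_way = total_lanes * GBPS_PER_LANE
print(f"NVMe-usable lanes: {total_lanes}")
print(f"Theoretical one-way bandwidth: {one_way:.0f} GB/s")
print(f"Both directions combined:      {2 * one_way:.0f} GB/s")
```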
Cleverly utilised, this could serve as a RAM extender for many of these RAM-intensive tasks - or so it seems to this newb.
I can see how OS memory swapping to RAID arrays as a RAM extender could be fraught, but consider an array of up to 3x CPU-attached PCIe 4.0 NVMe drives plus 5x chipset NVMe drives (2x PCIe 4.0 x4 and 3x PCIe 4.0 x1): cleverly distributing the working data around such an array could preclude the need for DRAM in many operations that use it today. It's still slower of course, but days could become hours very cheaply.
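A minimal sketch of "distributing the working data around such an array", assuming each drive is mounted at its own path (the /mnt/nvme* paths, shard count, and array shapes below are hypothetical): numpy.memmap shards a large array across the drives, and the loop only ever keeps one small block of pages hot in RAM.

```python
# Hypothetical sketch: shard a large working set across several NVMe drives,
# each mounted at its own path, instead of holding it all in DRAM.
import numpy as np

MOUNTS = ["/mnt/nvme0", "/mnt/nvme1", "/mnt/nvme2", "/mnt/nvme3"]  # hypothetical mounts
ROWS_PER_SHARD, COLS = 1_000_000, 512                              # illustrative sizes

# One memory-mapped shard per drive; pages are faulted in from NVMe on demand.
shards = [
    np.memmap(f"{m}/shard.f32", dtype=np.float32, mode="w+",
              shape=(ROWS_PER_SHARD, COLS))
    for m in MOUNTS
]

# Stream over the shards one block at a time so only a small window lives in RAM.
BLOCK = 65_536
running_sum = 0.0
for shard in shards:
    for start in range(0, ROWS_PER_SHARD, BLOCK):
        block = shard[start:start + BLOCK]   # touches ~BLOCK*COLS*4 bytes of pages
        running_sum += float(block.sum())    # any per-block compute step goes here

print(f"sum over all shards: {running_sum}")
```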
Note also that each NVMe drive can affordably carry 8+ controller cores and a substantial DRAM cache.
Perhaps some simple calculations could even be performed independently by the NVMe drives themselves.
In short - code should consider whether data really needs to reside in RAM during problem execution. An array of NVMe drives, often with data prefetched from NAND into their DRAM caches, can be a productive substitute - and it is hugely scalable in capacity.
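A hedged sketch of the prefetch idea, assuming Linux and Python 3.8+ (the file path is hypothetical): madvise(WILLNEED) asks the kernel to start reading the next window off the NVMe while the current window is being processed. Strictly speaking this prefetches into the host page cache rather than the drive's own DRAM, but it hides the same latency.

```python
# Hedged sketch: stream a big on-NVMe file window by window, hinting the kernel
# to prefetch the *next* window while the current one is being processed.
# Requires Linux and Python 3.8+ for mmap.madvise / mmap.MADV_WILLNEED.
import mmap
import os

PATH = "/mnt/nvme0/bigdata.bin"   # hypothetical data file on an NVMe mount
WINDOW = 256 * 1024 * 1024        # 256 MiB processing window (page-aligned)

size = os.path.getsize(PATH)
with open(PATH, "rb") as f:
    mm = mmap.mmap(f.fileno(), size, prot=mmap.PROT_READ)
    for offset in range(0, size, WINDOW):
        nxt = offset + WINDOW
        if nxt < size:
            # Ask the kernel to begin pulling the next window off the drive now.
            mm.madvise(mmap.MADV_WILLNEED, nxt, min(WINDOW, size - nxt))
        window = mm[offset:offset + WINDOW]   # process this window (copied into RAM here)
        _ = sum(window[::4096])               # placeholder "work": touch one byte per page
    mm.close()
```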
From what I hear, even the big players like to save a buck when they can, but for some jobs anything but the fastest hardware is a false economy.
Once the task becomes routine, though, a cost-effective farm is preferred for batch-processing the big recurring jobs. A source told me the 3900X was the sweet-spot CPU for his big math clusters of that sort.