Bandwidth saturation is something that needs to be considered. The VRAM drive came out slightly ahead in the real-world 4K results for both the average and 99th-percentile figures. I think that could be because it's faster than an SSD/NVMe drive while matching a RAM disk at read speeds, and reading is mostly what games do; the data they write usually isn't large, bandwidth-intensive files anyway.

A GPU is normally slotted into an x16 slot, so it has more bandwidth available than a standard x4 NVMe device (a rough comparison of the theoretical link rates is sketched below). I'll also assume it has less overhead than an x16 quad-M.2 carrier card, plus no real thermal-throttling concerns. The lack of PCIe 4.0 testing is unfortunate; I'd suspect improved performance there, but that holds true for NVMe on PCIe 4.0 as well.

I also think an AMD Ryzen 3950X or 5950X would make a great test: with more cores, its combined L2 cache is more substantial than Intel's, which has fewer cores and in turn less combined L2 cache.

What may have happened with the SSD/NVMe/VRAM results edging out the RAM disk is that you've got system memory bandwidth plus the drive's bandwidth combined, as opposed to only system memory bandwidth to saturate, so it scores a bit higher in practice. Though I think it's also limited by how quickly, and how much, you can fill the L2 cache. In general the L2 cache isn't enormous, but these more core-heavy CPUs effectively widen it even at the same per-core sizes, which means fewer L3 latency penalties under stress. I think it really takes something like a VRAM drive, PCIe 4.0, and a core-heavy CPU to show the maximum upside in practice. That's not something a user would normally test for; it's a very obscure and rare use case with high-end hardware. On older hardware you'd never see that difference, and you might even write it off as CPU variance when that might not be the case.

I'm also curious whether Microsoft will bump the NTFS allocation unit size limit for compression above 4096 bytes; that's a limitation that newer CPUs could potentially benefit from. NTFS itself is pretty old, and when that limit was put in place quad cores were the maximum core count for consumers, which could be a reason compression wasn't enabled beyond that allocation unit size. That's just my speculation on it. It may have been tested and showed no upside, so they didn't bother making it an option, unless there's another reason I'm unaware of that limits it and can't be resolved, like the 4 GB limit on x86.
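As a ballpark for that x16 vs. x4 point, here's a quick back-of-the-envelope calculation (my own figures, not from the video) of the theoretical one-direction PCIe link bandwidth per generation and lane count:

```python
# Rough theoretical PCIe bandwidth numbers to illustrate why an x16 GPU slot
# has far more link bandwidth than an x4 NVMe device, and what PCIe 4.0 adds.
GEN_RATE_GT_S = {"3.0": 8.0, "4.0": 16.0}   # per-lane raw rate in GT/s
ENCODING_EFFICIENCY = 128 / 130             # 128b/130b line coding for Gen3/Gen4

def theoretical_gb_s(gen: str, lanes: int) -> float:
    """Approximate one-direction bandwidth in GB/s, before protocol overhead."""
    return GEN_RATE_GT_S[gen] * ENCODING_EFFICIENCY * lanes / 8  # Gb/s -> GB/s

for gen in ("3.0", "4.0"):
    for lanes in (4, 16):
        print(f"PCIe {gen} x{lanes}: ~{theoretical_gb_s(gen, lanes):.1f} GB/s")
```

That works out to roughly 3.9 GB/s for a Gen3 x4 NVMe link versus about 15.8 GB/s for a Gen3 x16 slot, and about double both on PCIe 4.0, before any protocol overhead.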
Something I'd like to see is ATTO Disk Benchmark run for the following scenarios on a Ryzen 3950X/5950X and an Intel i9-9900K, formatted NTFS with a 4096-byte allocation unit size and compression enabled (a rough stand-in for that kind of sweep is sketched after the list):
Ryzen 3950X/5950X:
256KB to 8MB I/O size/file size
256KB to 16MB I/O size/file size
256KB to 64MB I/O size/file size
256KB to 128MB I/O size/file size
Intel i9-9900K:
256KB to 2MB I/O size/file size
256KB to 8MB I/O size/file size
256KB to 16MB I/O size/file size
256KB to 128MB I/O size/file size
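For anyone who wants a quick approximation without ATTO itself, here's a minimal sketch of that kind of I/O-size sweep in Python. The file path, the 256 MB working set, and the use of plain buffered I/O are my assumptions; real ATTO uses unbuffered/overlapped Windows I/O, so treat the output as rough relative numbers only.

```python
# Minimal ATTO-style sweep sketch: sequential writes and reads of a test file
# at increasing I/O sizes. Reads may be served from the OS file cache, so the
# read figures are optimistic; this is a rough relative comparison, not ATTO.
import os
import time

TEST_FILE = r"R:\atto_sketch.bin"   # hypothetical path on the drive under test
FILE_SIZE = 256 * 1024 * 1024       # 256 MB working set per pass (assumption)
IO_SIZES = [256 * 1024 * (2 ** n) for n in range(10)]  # 256 KB .. 128 MB

def sweep(io_size: int) -> tuple[float, float]:
    """Return (write_MBps, read_MBps) for one I/O size."""
    block = os.urandom(io_size)             # incompressible data
    blocks = max(1, FILE_SIZE // io_size)

    start = time.perf_counter()
    with open(TEST_FILE, "wb") as f:
        for _ in range(blocks):
            f.write(block)
        f.flush()
        os.fsync(f.fileno())                # push the data out of the OS cache
    write_mbps = blocks * io_size / (time.perf_counter() - start) / 1e6

    start = time.perf_counter()
    with open(TEST_FILE, "rb") as f:
        while f.read(io_size):
            pass
    read_mbps = blocks * io_size / (time.perf_counter() - start) / 1e6
    return write_mbps, read_mbps

if __name__ == "__main__":
    for size in IO_SIZES:
        w, r = sweep(size)
        print(f"{size // 1024:>7} KB  write {w:8.1f} MB/s  read {r:8.1f} MB/s")
    os.remove(TEST_FILE)
```

Running that on the same volume formatted with and without compression at the 4096-byte allocation unit size would at least show how the throughput curve shifts with I/O size on each CPU.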