News PCIe 6.0 and 7.0 standards hit a roadblock — compliance slowdown could lead to broader delays

I literally just showed you a 2x improvement. That's full, 4k random reads, not merely an interface benchmark.


Again, if you cannot notice the difference between 80 usec and 40 usec, then tell me how you're going to notice a difference going from 40 usec to 9 usec?? For it to be true that shaving 40 usec off read latency is unnoticeable, there must be so much CPU overhead that there's no way you're going to notice that next 31 usec!
You've shown no real world client benchmarks showing any significant difference. You even said booting was faster but showed no evidence, and when you mentioned gaming, I showed you evidence that the advantages don't actually appear in real-world use.

edit: Additional performance that cannot be realized may as well be theoretical. Without low QD performance being addressed, interface performance can never be realized.
 

bit_user

Titan
Ambassador
You've shown no real world client benchmarks showing any significant difference.
I showed quantitative data showing the claimed real improvements in I/O performance.

You're the one claiming it doesn't help your use cases. I accept that there are some use cases where you're not going to see a benefit between SATA and NVMe SSDs, but that's simply because the SSD isn't your bottleneck - not because of the speed or limitations of its NAND!
 
I showed quantitative data showing the claimed real improvements in I/O performance.
Performance potential, not how it actually applies.
You're the one claiming it doesn't help your use cases. I accept that there are some use cases where you're not going to see a benefit between SATA and NVMe SSDs, but that's simply because the SSD isn't your bottleneck - not because of the speed or limitations of its NAND!
If QD1 performance doesn't exceed SATA's capability then NVMe doesn't matter for QD1 does it?
 

bit_user

Titan
Ambassador
If QD1 performance doesn't exceed SATA's capability then NVMe doesn't matter for QD1 does it?
You keep ignoring the data I presented, where a fast SATA drive had a QD1 4k random read latency of about 82 usec, while a fast PCIe 5.0 NVMe drive had a QD1 4k random read latency of just 40.2 usec. That clearly shows the interface is a potential bottleneck for QD1.
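
To put rough numbers on that: at QD1 only one request is in flight, so IOPS is simply the inverse of the round-trip latency. Here is a minimal Python sketch, assuming 4096-byte transfers and decimal megabytes. The function names are made up for illustration, and only the two latency figures come from the data above.

```python
def qd1_iops(latency_us: float) -> float:
    """At queue depth 1, exactly one request completes per round trip."""
    return 1_000_000 / latency_us


def throughput_mb_s(iops: float, block_bytes: int = 4096) -> float:
    """Convert IOPS to MB/s (decimal megabytes). Block size is an assumption."""
    return iops * block_bytes / 1_000_000


# Latencies quoted above: fast SATA drive vs. fast PCIe 5.0 NVMe drive.
for name, latency_us in [("SATA", 82.0), ("PCIe 5.0 NVMe", 40.2)]:
    iops = qd1_iops(latency_us)
    print(f"{name}: {iops:,.0f} IOPS, {throughput_mb_s(iops):.1f} MB/s")
```

That works out to roughly 12,200 IOPS (about 50 MB/s) versus roughly 24,900 IOPS (about 102 MB/s). Both sit far below SATA's nominal 600 MB/s, so at QD1 the difference comes from round-trip latency rather than link bandwidth.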

Funny enough, even that TechSpot article you linked shows two games that experienced a measurable benefit from NVMe drives:

5-p.webp


13-p.webp


Now, consider two facts about that article:
  1. They used somewhat early PCIe 4.0 drives.
  2. Their test platform was a Ryzen 9 3900XT

Now, what do you suppose will happen when they use a faster CPU? That's right - the CPU portion of the loading time will decrease, resulting in a bigger relative difference between the drives! Furthermore, if they used even faster drives, we'd see yet bigger spread in the loading time graphs.
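
As a rough illustration of that argument, here is a toy Python model where loading time is just CPU time plus I/O time. Every number in it is hypothetical and chosen only to show the shape of the effect, and it assumes CPU work and I/O waits never overlap, which real game loaders do not guarantee.

```python
def load_time_s(cpu_s: float, io_s: float) -> float:
    # Assumption: CPU work and I/O wait are strictly serial, so they add.
    return cpu_s + io_s


def relative_gap(cpu_s: float, slow_io_s: float, fast_io_s: float) -> float:
    """Fraction of the slow drive's load time saved by the fast drive."""
    slow = load_time_s(cpu_s, slow_io_s)
    fast = load_time_s(cpu_s, fast_io_s)
    return (slow - fast) / slow


# Hypothetical I/O times for a slower and a faster drive (not measured data).
SLOW_IO_S, FAST_IO_S = 4.0, 2.0

# Hypothetical CPU portions of the load for an older and a newer CPU.
for label, cpu_s in [("older CPU", 16.0), ("newer CPU", 6.0)]:
    gap = relative_gap(cpu_s, SLOW_IO_S, FAST_IO_S)
    print(f"{label}: fast drive saves {gap:.0%} of the load time")
```

In this toy model the absolute gap between the drives stays at 2 s either way. What changes is the relative gap, which doubles from 10% to 20% as the CPU portion shrinks.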
 
You keep ignoring the data I presented, where a fast SATA drive had a QD1 4k random read latency of about 82 usec, while a fast PCIe 5.0 NVMe drive had a QD1 4k random read latency of just 40.2 usec. That clearly shows the interface is a potential bottleneck for QD1.
Or it could just be better controllers, or using fewer NAND packages etc etc etc. Bottom line is that QD1 does not exceed the throughput capability of SATA, and drawing conclusions from read latency alone is foolish.
Funny enough, even that TechSpot article you linked shows two games that experienced a measurable benefit from NVMe drives:
Yes, 2/9 games had measurable improvements of 6s and 5s respectively, which is ~30% and still nowhere near the potential of NVMe.
Now, consider two facts about that article:
  1. They used somewhat early PCIe 4.0 drives.
  2. Their test platform was a Ryzen 9 3900XT

Now, what do you suppose will happen when they use a faster CPU? That's right - the CPU portion of the loading time will decrease, resulting in a bigger relative difference between the drives! Furthermore, if they used even faster drives, we'd see yet bigger spread in the loading time graphs.
Okay, here are 12900K results, and PCIe 5.0 drives only top 3/9 of the games tested and are beaten by PCIe 3.0 drives in some of the others:
https://www.techpowerup.com/review/crucial-t700-pro-4-tb/16.html
Doom Eternal shows the biggest disparity, but it's also less than 2s difference between best/best.
 

bit_user

Titan
Ambassador
Or it could just be better controllers, or using fewer NAND packages etc etc etc.
SATA is just an unnecessary extra step between the CPU and the SSD controller. Why go over PCIe to a SATA controller, which talks to the SSD controller, instead of having the SSD controller itself sit directly on PCIe? That necessarily adds latency, which in turn hurts your QD1 IOPS numbers, because QD1 IOPS are a function of the end-to-end round-trip time.

Bottom line is that QD1 does not exceed the throughput capability of SATA
That would be impossible. QD1 performance is limited by the end-to-end latency. The interface sits idle while the controller is fulfilling the request, and again while the host CPU handles the interrupt and wakes up the userspace thread that issued the request. That thread then submits another request, which goes back into the kernel, through the device driver, and finally back out to the SATA controller and to the SSD. That dead time is why QD1 4k benchmarks are always way slower than high-QD or large-transaction benchmarks, where you can either keep the interface busy with multiple requests or just amortize the overhead with large transfers.
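
The dead-time point can be sketched with an idealized model: IOPS is the smaller of a latency bound (queue depth divided by round-trip latency) and a link bound (bandwidth divided by transfer size). This Python sketch assumes the 82 usec SATA latency stays constant as queue depth rises, which real drives do not do, and it uses SATA III's nominal 600 MB/s with 4096-byte transfers. The function name is invented for illustration.

```python
def modeled_iops(queue_depth: int, latency_us: float,
                 link_mb_s: float, block_bytes: int = 4096) -> float:
    """Idealized IOPS: limited by round trips in flight or by the link."""
    # With N requests in flight, N complete per round trip (assumes the
    # per-request latency does not grow with queue depth).
    latency_bound = queue_depth * 1_000_000 / latency_us
    # The link can never move more than bandwidth / transfer size.
    link_bound = link_mb_s * 1_000_000 / block_bytes
    return min(latency_bound, link_bound)


SATA_LATENCY_US = 82.0   # QD1 4k random read latency quoted earlier
SATA_LINK_MB_S = 600.0   # nominal SATA III bandwidth (assumed usable)

link_bound = SATA_LINK_MB_S * 1_000_000 / 4096
for qd in (1, 4, 32):
    iops = modeled_iops(qd, SATA_LATENCY_US, SATA_LINK_MB_S)
    print(f"QD{qd}: {iops:,.0f} IOPS, link utilization {iops / link_bound:.1%}")
```

Under these assumptions QD1 uses only about 8% of the SATA link, and the link itself does not become the limit until somewhere past QD12.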

Okay, here are 12900K results, and PCIe 5.0 drives only top 3/9 of the games tested and are beaten by PCIe 3.0 drives in some of the others:
https://www.techpowerup.com/review/crucial-t700-pro-4-tb/16.html
Doom Eternal shows the biggest disparity, but it's also less than 2s difference between best/best.
This one segments very distinctly:

ratchet-and-clank.png


This one has less distinct stair-stepping, but shows quite some variation between drives:

the-last-of-us.png


Source: https://www.techpowerup.com/review/corsair-mp700-pro-se-4-tb/16.html
 
SATA is just an unnecessary extra step between the CPU and the SSD controller. Why go over PCIe to a SATA controller, which talks to the SSD controller, instead of having the SSD controller itself sit directly on PCIe? That necessarily adds latency, which in turn hurts your QD1 IOPS numbers, because QD1 IOPS are a function of the end-to-end round-trip time.
Oh, I don't disagree that if you're buying new drives today, getting SATA drives is a waste. My entire point has been that NAND itself is a bigger limiting factor on performance than the interface is. Replacing SATA SSDs with NVMe SSDs is not going to improve performance noticeably outside of workloads that can leverage sequential transfer, and most client workloads don't. Even the QD1 improvements of ~2x don't translate 1:1 into real-world applications.
I did not realize they'd done another PCIe 5.0 drive review with a new platform!

CP2077 is the only carryover between the two, it looks like, and the results, while a lot lower, don't seem dramatically different. I thought potentially the CPU would reorder things more.
This one segments very distinctly:
Funny thing about Ratchet & Clank is that it was one of the first games Sony cited as "couldn't be done without the high speed NVMe SSD" for the PS5. It's entirely possible that this has something to do with it leveraging NVMe better than most newer games, though Nixxes did have to do a ton of reworking in general on it.

edit: I would love to see P5800X/905P results for these tests
 
I hadn't checked Albert's site in a while as there hadn't been anything new, but he posted this back in May: https://www.boringtextreviews.com/2...e-was-insanely-ahead-of-its-time-heres-proof/

Now we just need someone to do a whole suite to see how notable the performance differences are. I think the improvement from PCIe 4.0 over 3.0 comes from the controllers' better low-QD random performance, so I wouldn't expect to see much more from 5.0. I noticed this when testing my SSDs and seeing how much better the P41 Pro I got was than my P31, even though the numbers are still low.
 

bit_user

Titan
Ambassador
I hadn't checked Albert's site in a while as there hadn't been anything new, but he posted this back in May: https://www.boringtextreviews.com/2...e-was-insanely-ahead-of-its-time-heres-proof/
That pretty nicely illustrates what I was talking about. The upgrade from i7-7700K to i9-14900K reduced the CPU component of the loading time so much that it became primarily I/O-limited.

So, it's like I said: if you can't see the improvement between SATA and a fast NVMe drive, you're probably not going to see much benefit from Optane, either. Except, in this case, he's not even starting all the way back at SATA!
 
That pretty nicely illustrates what I was talking about. The upgrade from i7-7700K to i9-14900K reduced the CPU component of the loading time so much that it became primarily I/O-limited.
Yeah, I was mostly interested to see if the gap grew since, as we saw with TPU's CP2077 results, nothing really changed except that every drive loaded faster. None of them suddenly changed position because a CPU bottleneck wasn't there (like the PCIe 4.0 drives with much better controllers leapfrogging PCIe 3.0).
So, it's like I said: if you can't see the improvement between SATA and a fast NVMe drive, you're probably not going to see much benefit from Optane, either. Except, in this case, he's not even starting all the way back at SATA!
That's why I really want to see a full benchmark from Optane to SATA on a top end platform. On slower systems you can't really tell the difference between SATA and NVMe, but Albert's results showing PCIe 4.0 at about half of 3.0 mean you'd see at minimum that much, and probably more, over SATA. A single test doesn't indicate whether it's an outlier or not, though, so it would be really interesting to see how the top of each interface + Optane fared.
 

bit_user

Titan
Ambassador
That's why I really want to see a full benchmark from Optane to SATA on a top end platform. On slower systems you can't really tell the difference between SATA and NVMe,
I can't remember exactly which (was it Gemini Lake? Raspberry Pi 4?), but I remember having that thought when someone complained about the lack of NVMe on one of these SBCs. I concluded that the CPU was so slow it'd be hard to find a practical case where there was much difference.
 
