News: Intel Gets the Jitters: Plans, Then Nixes, PCIe 4.0 Support on Comet Lake

I get the craze for more raw bandwidth, especially with SSDs, but can we also focus on IOPS? We need higher and more consistent IOPS, much like Intel's Optane 905P offers, for a smoother experience. You can throw as much raw bandwidth at the problem as PCIe 4.0 x16 allows, but file sizes haven't changed, and IOPS hold steady only to drop drastically at certain queue depths.

That said, for anything Intel it is best to wait. If you're on any Skylake variant, don't waste the money on an upgrade unless the boost is worth it for the work you plan to do.
 

Miss this one?

https://www.tomshardware.com/news/adata-sage-ssd-unveiled
 
Reactions: alextheblue

I did. However, the question is whether that peak theoretical IOPS figure comes from a specific kind of test (most likely sequential) and whether it holds up or drops off like most others do.

IOPS can be pushed higher, but for a lot of drives those numbers only apply to very specific tasks. I just want to see something more like the Optane 905P, where everything is more consistent even if it isn't the highest speed.
 

There is a whooooole lot at play, but sequential vs. random IOPS and throughput can vary widely through no fault of the SSD or its controller. Saying that high, sustained IOPS are only possible with sequential I/O is a generalization that isn't exactly true, and you'll find plenty of folks who will prove it wrong.

This is where application and system/kernel optimization play a huge part. You may think, "Jeez, this SSD in my system is SO SLOW when writing data. It's awful!" when the problem isn't the SSD at all, but rather that the I/O scheduler isn't ideal for the storage, the application doing the writes is inefficient, and so on.

Sure, there are some SSDs that do better than others with certain tasks, but in my experience that was largely true of early, smaller SSDs. Recent improvements in SSDs, controllers, and OS/kernel-level efficiency have resulted in faster, more consistent performance across the board.

Are your writes page-aligned? Are your partitions aligned with page boundaries? Are you using an I/O scheduler that doesn't waste cycles trying to reorder writes before sending them to a storage controller that doesn't care (think Linux CFQ vs. deadline or noop)? Is your application the bottleneck due to bad or inefficient code when writing data?
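
If you want to sanity-check a couple of those on a Linux box, something like this rough sketch does the trick; the device name and the 4 KiB page-size figure are assumptions you'd swap for your own setup:

```python
import pathlib

# Rough sketch for a Linux box: show the active I/O scheduler and check
# whether each partition starts on a flash-page boundary. The device name
# and the 4 KiB page size are assumptions; adjust them for your own drive.
DEVICE = "nvme0n1"       # hypothetical device name
PAGE_SIZE = 4096         # assumed flash page size for the alignment check
SECTOR_SIZE = 512        # sysfs reports a partition's 'start' in 512-byte sectors

sysfs = pathlib.Path("/sys/block") / DEVICE

# The active scheduler is shown in [brackets], e.g. "[none] mq-deadline kyber".
print(f"{DEVICE} scheduler: {(sysfs / 'queue' / 'scheduler').read_text().strip()}")

# If a partition's byte offset isn't a multiple of the page size, 4 KiB
# writes can straddle flash pages and cost extra read-modify-write cycles.
partitions = sorted(sysfs.glob(f"{DEVICE}p[0-9]*")) or sorted(sysfs.glob(f"{DEVICE}[0-9]*"))
for part in partitions:
    offset = int((part / "start").read_text()) * SECTOR_SIZE
    status = "aligned" if offset % PAGE_SIZE == 0 else "NOT aligned"
    print(f"{part.name}: starts at byte {offset} ({status} to {PAGE_SIZE})")
```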

Throughput and IOPS are great for comparing drives in generic ways, but unless your application and OS are extremely efficient and properly aligned to the storage, you won't see the full potential of the SSD, and it won't be the SSD's fault. Random vs. sequential IOPS are largely inconsequential when it comes to actual SSD performance, IMO.

Seeing a large difference between random and sequential performance on a modern SSD raises some red flags, and I would call the application and OS into question: an incorrect or inefficient system setup could be triggering garbage collection on the SSD more often than it should, and if the SSD's GC isn't great, bad writes will expose that. You could argue that poor GC causing poor SSD performance is the fault of the SSD, but someone else could just as easily argue that the app and system are set up incorrectly and the SSD is being blamed unfairly.

Just my $0.02.
 

I don't disagree that there are more factors at play. The OS, software, and driver levels can all change performance.

Articles focus on MB/s, which has taken storage from a massive bottleneck to merely one of the slower parts of the system. I just want to see how a drive performs under different loads, not just some top-end number.
 

alextheblue

Distinguished
That's why we have reviews. The PCIe 4.0 models out now have already shown improvements in IOPS as well as sequential speeds over older models. We'll have to see some reviews to know how much that affects real performance. When the next-gen units hit (like the Sage), we'll also need testing to see whether they suffer a performance hit in real-world applications when limited to PCIe 3.0.

That aside, this delay is really embarrassing for Intel. When Samsung releases a PCIe 4.0 SSD or two, a lot of people buying that hot new Intel platform will start grumbling.
 

InvalidError

Titan
Moderator
I did. However, the question is whether that peak theoretical IOPS figure comes from a specific kind of test (most likely sequential) and whether it holds up or drops off like most others do.
Unless you are reading from and writing to a virgin SSD, there really isn't such a thing as a truly sequential workload, since the SSD's wear-leveling and caching algorithms will randomize the physical layout over time. After a while, your neatly sequential user-space workload might just as well be small-to-medium-size random I/O as far as the SSD's controller is concerned. That's why SSDs that have been in use and getting slower for a while can regain much of their original performance after a full erase resets the mapping.

If you want to know an SSD's worst-case sustainable IOPS, you'd have to look for benchmarks that measure IOPS after something like a full drive's worth of random 4K writes and erasures on top of an initially mostly-full drive.
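
For the curious, the gist of that kind of torture test looks roughly like the sketch below. It is only an illustration with made-up numbers (scratch file, 4 GiB working set, a few minutes of runtime); a proper worst-case run would hit the raw device with something like fio using O_DIRECT, deep queues, and hours of preconditioning.

```python
import os
import random
import time

# Illustrative only: hammer a scratch file with synchronous random 4 KiB
# writes and report IOPS per interval, so you can watch steady-state
# behaviour emerge as the drive's spare area and caches fill up.
PATH = "scratch.bin"          # hypothetical scratch file on the SSD under test
WORKING_SET = 4 * 1024**3     # 4 GiB working set (real tests cover most of the drive)
BLOCK = 4096                  # 4 KiB random writes
INTERVAL = 10                 # seconds per reporting interval
INTERVALS = 18                # ~3 minutes total; real preconditioning runs for hours

# Pre-fill the working set so every later write is an overwrite, not an append.
chunk = os.urandom(1024 * 1024)
with open(PATH, "wb") as f:
    for _ in range(WORKING_SET // len(chunk)):
        f.write(chunk)

payload = os.urandom(BLOCK)
blocks = WORKING_SET // BLOCK

# O_SYNC forces each write to the device instead of just the page cache;
# it's pessimistic compared to O_DIRECT with deep queues, but keeps the sketch simple.
fd = os.open(PATH, os.O_RDWR | os.O_SYNC)
try:
    for _ in range(INTERVALS):
        done = 0
        start = time.monotonic()
        while time.monotonic() - start < INTERVAL:
            os.lseek(fd, random.randrange(blocks) * BLOCK, os.SEEK_SET)
            os.write(fd, payload)
            done += 1
        print(f"{done / INTERVAL:,.0f} write IOPS over the last {INTERVAL} s")
finally:
    os.close(fd)
    os.remove(PATH)
```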
 

JayNor

Reputable
I see from other articles that the need for retimers on PCIe 4.0 motherboards is expected. I also recall AMD dropping PCIe 4.0 support for existing motherboards when it was announced. Also, PCIe 4.0 is not supported on the Renoir chips. Was the need for expensive retimers the underlying issue in all these cases?

 

InvalidError

Titan
Moderator
I see from other articles that the need for retimers on PCIe 4.0 motherboards is expected. I also recall AMD dropping PCIe 4.0 support for existing motherboards when it was announced.
PCIe 4.0 worked fine on most 300- and 400-series motherboards with hard-wired PCIe/NVMe slots (no PCIe switches for x16/x8x8/x8x4x4 support) until AMD decided to kill it with an AGESA update. As for retimers/buffers, those are only necessary for slots that sit farther away from the CPU or chipset, whichever they originate from.
 

bit_user

Polypheme
Ambassador
I get the craze for more raw bandwidth, especially with SSDs, but can we also focus on IOPS? We need higher and more consistent IOPS, much like Intel's Optane 905P offers, for a smoother experience. You can throw as much raw bandwidth at the problem as PCIe 4.0 x16 allows, but file sizes haven't changed, and IOPS hold steady only to drop drastically at certain queue depths.
Not enough attention is paid to low-queue depth performance, and this is where Optane is king. Everyone always touts these QD32 numbers, but those aren't relevant for the majority of desktop users.
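
To put a number on it, a minimal QD1 probe can be as simple as the sketch below; the file path and sample count are placeholders, and the OS page cache will flatter the results unless the file is much larger than RAM (or you drop the cache first).

```python
import os
import random
import time

# Minimal QD1 probe: read random 4 KiB blocks one at a time from a large
# existing file and report average latency plus effective IOPS. With never
# more than one request outstanding, this is the kind of access pattern a
# lot of everyday desktop work actually looks like.
PATH = "some_big_file.bin"   # placeholder: any multi-GiB file on the drive under test
BLOCK = 4096
SAMPLES = 5000               # number of random reads to time (arbitrary)

size = os.path.getsize(PATH)
blocks = size // BLOCK

fd = os.open(PATH, os.O_RDONLY)
try:
    start = time.monotonic()
    for _ in range(SAMPLES):
        os.lseek(fd, random.randrange(blocks) * BLOCK, os.SEEK_SET)
        os.read(fd, BLOCK)
    elapsed = time.monotonic() - start
finally:
    os.close(fd)

print(f"avg latency: {elapsed / SAMPLES * 1e6:.0f} us, ~{SAMPLES / elapsed:,.0f} IOPS at QD1")
```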
 

bit_user

Polypheme
Ambassador
Unless you are reading from and writing to a virgin SSD, there really isn't such a thing as a truly sequential workload, since the SSD's wear-leveling and caching algorithms will randomize the physical layout over time. After a while, your neatly sequential user-space workload might just as well be small-to-medium-size random I/O as far as the SSD's controller is concerned.
While performance will degrade on a nearly full (or not properly TRIM'd) drive, it's not as bad as you state.

First, you have to specify what you mean by random. If we're talking 128 kB, that's a different level of performance than 4 kB, since SSD erase blocks are larger than 4 kB (though I'm not exactly sure how big they are these days).

Second, we should be clear about what the bottleneck in random performance actually is. A significant amount of it is host transaction overhead, which you won't see even when the SSD's controller has to break writes up into smaller pieces.

Anandtech used to benchmark drives in this state, but I'm not seeing those tests in their most recent SSD review. Anyway, you should really find and cite some benchmarks to properly qualify these remarks. It's certainly not nearly as bad as sequential writes dropping to the throughput of 4 kB random writes.

If you want to know an SSD's worst-case sustainable IOPS, you'd have to look for benchmarks that measure IOPS after something like a full drive's worth of random 4K writes and erasures on top of an initially mostly-full drive.
Yes, citing these would help you substantiate your claims.
 

bit_user

Polypheme
Ambassador
Regarding the main topic, though, I can foresee this biting into Intel's 2020 holiday sales. Just imagine building or upgrading a PC with a PCIe 4.0-capable GPU and SSD (especially with Nvidia and Samsung finally getting on the 4.0 bandwagon), and then plugging them into a CPU/mobo with just PCIe 3.0. For some buyers, that's probably going to be the deciding factor to go with an AMD CPU.

It's hard to make the case that Intel needed PCIe 4.0, but that doesn't mean it's not going to hurt their sales.
 
Reactions: Soaptrail
Not enough attention is paid to low-queue depth performance, and this is where Optane is king. Everyone always touts these QD32 numbers, but those aren't relevant for the majority of desktop users.

I remember the TH review of the 905P stating it was the smoothest Windows experience they had ever had. I think we get obsessed with one stat and push it while ignoring the other factors that can benefit you.

The other day a guy posted a RAID 60 array of 24 WD Velociraptors, and all he could think about was the MB/s. The sad thing is that a single SATA SSD would probably have better IOPS than all of those drives together and give a smoother, better OS experience.

IOPS is king in SANs, though, especially when you have multiple VMs running through them.

Regarding the main topic, though, I can foresee this biting into Intel's 2020 holiday sales. Just imagine building or upgrading a PC with a PCIe 4.0-capable GPU and SSD (especially with Nvidia and Samsung finally getting on the 4.0 bandwagon), and then plugging them into a CPU/mobo with just PCIe 3.0. For some buyers, that's probably going to be the deciding factor to go with an AMD CPU.

It's hard to make the case that Intel needed PCIe 4.0, but that doesn't mean it's not going to hurt their sales.

Agreed. It's the same as AMD's old AM3 setup that had PCIe 3.0 from the CPU while the rest was 2.0.

I am honestly surprised to see AMD push to it first. Intel was normally the one to adopt the newest standards while they were still pricey, while AMD waited until they hit a more price-friendly point.

In the long run, though, this won't really kill Intel. I mean, how many years did we have Netburst vs. K8, and Intel turned it around in a single generation? So long as they don't pull another Barcelona, or worse, another Bulldozer, this is going to be a much more competitive market. AMD is doing well and actually challenging Intel properly. While I am not one to say 16 cores is needed in mainstream, it's not a bad thing either. Now we just need something to push NAND prices down to magnetic drive prices, or for Optane/NVDIMMs to take hold in the mainstream.
 
Reactions: bit_user

InvalidError

Titan
Moderator
First, you have to specify what you mean by random. If we're talking 128 kB, that's a different level of performance than 4 kB, since SSD erase blocks are larger than 4 kB (though I'm not exactly sure how big they are these days).
It can only do larger erases if it has larger contiguous unallocated areas to erase in the first place, and there won't be many of those on a mostly full drive that has seen a drive's worth of random 4K writes on top of that.
 

bit_user

Polypheme
Ambassador
It can only do larger erases if it has larger contiguous unallocated areas to erase in the first place, and there won't be many of those on a mostly full drive that has seen a drive's worth of random 4K writes on top of that.
That's dubious.

First, the majority of files are likely to be larger than 4 kB, many of them much larger. So each time you delete one, it will probably free up a lot of contiguous space.

Second, the SSD's controller can reduce fragmentation by packing smaller writes together. Don't forget that most SSDs buffer writes in DRAM, and then in some NAND written as MLC, before rewriting the data more densely. Also, modern filesystems (and, by extension, I presume whatever low-level mapping the SSD controller uses) are quite good at avoiding fragmentation unless you churn a lot of data very close to full capacity. Finally, SSDs continually patrol the NAND for errors, which gives them an additional opportunity to coalesce free space.

Third, you can argue abstractions all you want, but what @jimmysmitty seems to care about is actual performance. So the discussion should really be led by the data. If you don't have some decent benchmarks on which to hang your comments, then I think they're not worth the bits comprising them.

Fourth, I've only come close to filling an SSD on one system, and that was with a bunch of extremely large video files. On most of my PCs, the SSDs have never gotten more than about 80% full.
 

bit_user

Polypheme
Ambassador
Typical database table updates aren't, and that's what the bulk of SSD endurance benchmarks are based on.
Typical users aren't filling their SSDs with databases, and they darn sure aren't doing updates all day long (which would burn out a consumer drive way prematurely)! We're talking about typical users here, hence the rejection of the QD32 IOPS numbers, etc.
 
Reactions: Soaptrail

InvalidError

Titan
Moderator
Typical users aren't filling their SSDs with databases, and they darn sure aren't doing updates all day long (which would burn out a consumer drive way prematurely)! We're talking about typical users here, hence the rejection of the QD32 IOPS numbers, etc.
Typical users may not do small random writes on the same scale as databases, but where I live, Windows and most other software still have things like the registry doing a fair number of read-modify-write operations on a daily basis, and those add up over months and years.
 

bit_user

Polypheme
Ambassador
Typical users may not do small random writes on the same scale as databases, but where I live, Windows and most other software still have things like the registry doing a fair number of read-modify-write operations on a daily basis, and those add up over months and years.
No. You're off by several orders of magnitude.

If you actually compare the write endurance of write-oriented enterprise SSDs, it's something like 100-fold what a typical consumer SSD can handle.

This is the problem with coming up with a scenario off the top of your head instead of focusing on actual data. Let's see some data: if the problem is as big and prevalent as you suggest, it should not be hard to find. Otherwise, this is all just a bunch of noise.
 