News Gigabyte AI TOP 100E SSD features incredible 219,000 TBW endurance rating — 183X more than the venerable Samsung 990 Pro

Admin

Administrator
Staff member

Drazen

Distinguished
Dec 29, 2015
29
8
18,535
My Samsung 970 Pro, an MLC drive, is rated for 1,200 TBW. SLC flash reaches the 100,000 TBW range, but its capacity is low and it no longer exists in the consumer world.
Today everyone in the consumer world uses TLC, with at most 600 TBW.
So, no way.
 

Li Ken-un

Distinguished
May 25, 2014
123
86
18,660
Unless there is a huge catch (like crappy random I/O or huge latency), this product should put all extant M.2 Optanes out to pasture. The highest capacity M.2 Optanes won’t fit on most motherboards and enclosures. And the M.2 2280 ones are capped at 118 GB.
 
Hard to believe they could get that endurance rating with the usual TLC flash. MLC, SLC?

It's easy to believe. They can use a large amount of NAND for overprovisioning combined with a more optimistic TBW figure, since they figure most drives will either be replaced or fail in some non-NAND way before then. It's been a while since an endurance test was performed. The Tech Report did one 10 years ago, and the Samsung 840 lasted over 2.5x past its 73 TBW rating before any uncorrectable errors popped up, and much longer past that while still in a usable state. Drive controllers have much greater ECC capabilities than they used to, and NAND is more durable, so the TBW figure is just what they feel comfortable warranting it at, not the actual lifespan.


USAFRet

Titan
Moderator
And in normal consumer use, or even most corporate use...100 times infinity is still infinity.

I've asked this before, but I'll ask again...

Have any of you ever had a solid state drive die from too many write cycles?
Not just dead (I've had that), but specifically from going over the warranty TBW, AND it actually went into read only mode, or whatever that particular drive does.

If so, please list the specific drive make/model, and the relevant numbers.
 

CRamseyer

Distinguished
Jan 25, 2015
426
11
18,795
Because I am not going to reward their decision to randomly include a buzzword "AI" in the name of an unrelated product.

The name is engineered for SEO -- it has both "AI" and "TOP" in it so if you search for AI TOPS you will find it. You may think it's clever, I find it disgusting.

It is for a specialized AI workload so why not have AI and/or TOPS in the name? It's an amazing piece of technology that is unmatched in the industry.
 

parkerthon

Distinguished
Jan 3, 2011
93
104
18,710
It is for a specialized AI workload so why not have AI and/or TOPS in the name? It's an amazing piece of technology that is unmatched in the industry.
Sure, it’s unmatched, but to what end? Who is running AI workloads on a workstation? And what are they running exactly? This is Gigabyte. Who is running an expensive AI model, where the highest cost is the GPU, on a consumer brand SSD? It’s like taking a Ford Fiesta but dropping in an engine that runs 200k miles between oil changes.
 
And in normal consumer use, or even most corporate use...100 times infinity is still infinity.

I've asked this before, but I'll ask again...

Have any of you ever had a solid state drive die from too many write cycles?
Not just dead (I've had that), but specifically from going over the warranty TBW, AND it actually went into read only mode, or whatever that particular drive does.

If so, please list the specific drive make/model, and the relevant numbers.
My OCZ Agility 3 120 GB SATA drive definitely became unstable because of its low write endurance, though it took 7 (9.5) years. It somehow managed to live through some 25k power-on hours, 56 TBW, and 1550 power cycles.
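To put the figures in that report in perspective, here is a rough sketch that converts them into full-drive writes and an average write rate. It uses only the numbers quoted above (56 TBW, 120 GB, 25k power-on hours), assumes decimal units, and ignores write amplification and any controller-side compression, so it says nothing exact about actual NAND wear. The function names are made up for illustration.

```python
def full_drive_writes(tbw_terabytes: float, capacity_gb: float) -> float:
    """Host writes expressed as multiples of the drive's capacity (decimal units)."""
    return tbw_terabytes * 1000 / capacity_gb


def average_gb_per_power_on_hour(tbw_terabytes: float, power_on_hours: float) -> float:
    """Average host writes per hour the drive was powered on."""
    return tbw_terabytes * 1000 / power_on_hours


# Figures from the post above: 56 TBW on a 120 GB drive over ~25,000 power-on hours.
agility_fills = full_drive_writes(56, 120)
agility_rate = average_gb_per_power_on_hour(56, 25_000)

print(f"Full-drive writes: {agility_fills:.0f}")
print(f"Average write rate: {agility_rate:.2f} GB per power-on hour")
```

By this estimate the drive was overwritten a few hundred times in total, at a little over 2 GB per powered-on hour.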
 

bit_user

Titan
Ambassador
Hard to believe they could get that endurance rating with the usual TLC flash. MLC, SLC?
The capacities would suggest it's MLC, not SLC. It's probably the same NAND chips modern enthusiast SSDs use, but all configured to run in pseudo-MLC mode. When you do that on NAND capable of usable endurance in QLC mode, it's probably not too surprising (TBH, I wouldn't have guessed these endurance levels would be possible outside of SLC).

Today everyone in the consumer world uses TLC, with at most 600 TBW.
No, only enthusiast drives are TLC. Most of the consumer world is using QLC.

the Samsung 840 lasted over 2.5x past its 73TBW rating before any uncorrectable errors popped up,
Okay, so 182.5 TBW vs. 219,000 TBW? You're off by 3 orders of magnitude.
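The arithmetic behind that comparison can be checked with a few lines. This is only a sketch using the figures quoted in the thread: the 73 TBW rating and 2.5x overshoot for the Samsung 840, and the 219,000 TBW rating in the headline.

```python
import math

# Figures quoted in the thread.
samsung_840_rated_tbw = 73
observed_multiplier = 2.5
ai_top_rated_tbw = 219_000

# Writes the Samsung 840 reportedly survived before uncorrectable errors.
samsung_840_observed_tbw = samsung_840_rated_tbw * observed_multiplier

# How many times larger the new rating is, and the gap in orders of magnitude.
ratio = ai_top_rated_tbw / samsung_840_observed_tbw
orders_of_magnitude = math.log10(ratio)

print(f"Observed 840 endurance: {samsung_840_observed_tbw} TBW")
print(f"Ratio: {ratio:.0f}x, about {orders_of_magnitude:.1f} orders of magnitude")
```

The ratio comes out to 1,200x, which is where the "3 orders of magnitude" comes from.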

Anyway, I think you're reading too much into the 2.5x discrepancy. Part of it is that the manufacturer needs to build in some margin of error, because the endurance is going to follow something like a bell curve. They'd want to be sure that they guarantee about 2-3 sigmas less than the mean endurance.
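As a rough illustration of that margin, the sketch below assumes endurance across drives is normally distributed and sets the rating k standard deviations below the mean. The mean and sigma values are invented for the example and are not measurements of any real drive.

```python
from statistics import NormalDist


def rated_endurance(mean_tbw: float, sigma_tbw: float, k: float) -> float:
    """Rating placed k standard deviations below the mean endurance."""
    return mean_tbw - k * sigma_tbw


def fraction_below_rating(mean_tbw: float, sigma_tbw: float, k: float) -> float:
    """Share of drives expected to wear out before reaching the rating."""
    dist = NormalDist(mu=mean_tbw, sigma=sigma_tbw)
    return dist.cdf(rated_endurance(mean_tbw, sigma_tbw, k))


# Hypothetical population: mean 200 TBW, standard deviation 40 TBW.
for k in (2, 3):
    rating = rated_endurance(200, 40, k)
    share = fraction_below_rating(200, 40, k)
    print(f"k={k}: rating {rating:.0f} TBW, {share:.3%} of drives fall short")
```

With these made-up numbers, a 2-sigma margin already puts the rating well below the mean, so a typical drive outlasting its rating by a large factor is what you would expect.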

Also, they probably allow for it to be operated at higher temperatures than The Tech Report's testing used, and that's very detrimental to endurance!

and much longer past that while still in a usable state.
I'm pretty sure it doesn't need to get that bad, before they'll accept it for return under warranty.

So now a high endurance rating makes a drive “AI” instead of enterprise?
Last I checked, enterprise M.2 drives are pretty much extinct. I bought the last one I could find, and that was a Samsung PM9A3. The only place I could still find them for sale is ebay.
 

bit_user

Titan
Ambassador
Because I am not going to reward their decision to randomly include a buzzword "AI" in the name of an unrelated product.
Heh, sounds like you got triggered.
; )

They gave a plausible explanation for it, as well as mentioning some software utility they have which supposedly manages swapping to the drive in a way that I guess is more efficient than simply letting the OS do it.

Sure, it’s unmatched, but to what end? Who is running AI workloads on a workstation? And what are they running exactly?
Pick an LLM and try running it on a machine with too little RAM to hold it all in memory. That's what this drive is intended to optimize.

Who is running an expensive AI model, where the highest cost is the GPU, on a consumer brand SSD?
Microsoft's AI PC spec requires only 45 TOPS. You can reach that with an RTX 3050.
 

Pierce2623

Upstanding
Dec 3, 2023
213
175
260
The capacities would suggest it's MLC, not SLC. It's probably the same NAND chips modern enthusiast SSDs use, but all configured to run in pseudo-MLC mode. When you do that on NAND capable of usable endurance in QLC mode, it's probably not too surprising (TBH, I wouldn't have guessed these endurance levels would be possible outside of SLC).


No, only enthusiast drives are TLC. Most of the consumer world is using QLC.
Only extra cheap drives are QLC. Right now, the standard configuration for the best price/performance for pure consumer drives is dram-less but with TLC.
 

bit_user

Titan
Ambassador
Have any of you ever had a solid state drive die from too many write cycles?
...
If so, please list the specific drive make/model, and the relevant numbers.
Some users of Apple M-series Macs have supposedly worn out their SSDs, if they got models with only like 8 GB of RAM. The heavy swapping is what killed it.

If you read the description of this drive, that's exactly the type of activity it's designed for - heavy swapping!

Only extra cheap drives are QLC.
Extra cheap machines (e.g. Chromebooks) are what most consumers probably have.

Right now, the standard configuration for the best price/performance for pure consumer drives is dram-less but with TLC.
Optimizing for price/performance is already a cut above what most prebuilt PCs do.
 

CmdrShepard

Prominent
Dec 18, 2023
452
335
560
Heh, sounds like you got triggered.
Of course I did, stupidity is the universal trigger.
They gave a plausible explanation for it, as well as mentioning some software utility they have which supposedly manages swapping to the drive in a way that I guess is more efficient than simply letting the OS do it.
What plausible explanation could there be for making a ridiculous claim of "Our SSD is better for AI"?
Pick an LLM and try running it on a machine with too little RAM to hold it all in memory. That's what this drive is intended to optimize.
Why would a sane, normal person who knows what they are doing ever do such a stupid thing?!?

It's:

- Cheaper
- Faster
- More power-efficient
- Doesn't require special software

To just buy more RAM, or if you can't afford it just quantize the damn LLM down to a manageable size.
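The quantization point can be made concrete with a back-of-the-envelope estimate of weight storage. This is a sketch only: it counts weights alone (no KV cache, activations, or runtime overhead), and the 70-billion-parameter model and 64 GB of RAM are hypothetical values picked for illustration.

```python
def weights_size_gb(params_billion: float, bits_per_weight: int) -> float:
    """Approximate size of the weights alone, in decimal GB."""
    return params_billion * bits_per_weight / 8


def weights_fit_in_ram(params_billion: float, bits_per_weight: int, ram_gb: float) -> bool:
    """True if the weights alone fit in the given amount of RAM."""
    return weights_size_gb(params_billion, bits_per_weight) <= ram_gb


# Hypothetical 70B-parameter model on a machine with 64 GB of RAM.
for bits in (16, 8, 4):
    size = weights_size_gb(70, bits)
    fits = weights_fit_in_ram(70, bits, ram_gb=64)
    print(f"{bits}-bit: {size:.0f} GB of weights, fits in 64 GB RAM: {fits}")
```

Under these assumptions the 16-bit and 8-bit versions would spill out of memory, while the 4-bit version would fit without touching the SSD.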

This is clearly just a gimmick to fleece some dumb people.
 

usertests

Distinguished
Mar 8, 2013
627
584
19,760
We still need more info on this. That is a crazy spec.

Microsoft's AI PC spec requires only 45 TOPS.
40+ TOPS for Copilot+, I believe.

I've found it very hard to find TOPS ratings (INT8) for some of the consumer discrete GPUs. For example, that figure isn't on Nvidia's website or TechPowerUp for the RTX 3060. I have to go to this Wccftech article to find out it's supposedly 101 TOPS. Any leads on this?
 
It is for a specialized AI workload so why not have AI and/or TOPS in the name? It's an amazing piece of technology that is unmatched in the industry.
Well, you could argue that point about GPUs too! Let's hope nVidia/AMD don't start using AI TOPS in the already really long-winded naming schemes currently in use. RTX4070 Ti Super AI TOPS!! Or AIB partners: 4070 Ti Super AI TOPS Windforce 57 SCC OC!!! Uggghh. No thanks.
 

CmdrShepard

Prominent
Dec 18, 2023
452
335
560
We still need more info on this. That is a crazy spec.
No, no we don't. We need to let it fade into obscurity where it belongs and hope that such idiotic naming doesn't catch on with other AIBs.

The only sense this product makes is for training, not for inference and only if it can really endure writing thousands of checkpoints of multi-gigabyte model files.

And even then, using a UPS and writing to a large RAM drive with every n-th checkpoint written to storage would be way faster.
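A minimal sketch of that checkpointing scheme, assuming every checkpoint goes to a RAM-backed directory and only every n-th one is copied to persistent storage. The two temporary directories below are stand-ins; in a real setup the first would be a RAM disk mount and the second an SSD path, and the function name and file naming are hypothetical.

```python
import shutil
import tempfile
from pathlib import Path


def save_checkpoint(step: int, payload: bytes, ram_dir: Path, disk_dir: Path,
                    persist_every: int) -> bool:
    """Write every checkpoint to ram_dir; copy every persist_every-th to disk_dir.

    Returns True when the checkpoint was also persisted.
    """
    ram_path = ram_dir / f"ckpt_{step:06d}.bin"
    ram_path.write_bytes(payload)
    if step % persist_every == 0:
        shutil.copy2(ram_path, disk_dir / ram_path.name)
        return True
    return False


# Demo with temporary directories standing in for the RAM drive and the SSD.
with tempfile.TemporaryDirectory() as ram, tempfile.TemporaryDirectory() as disk:
    ram_dir, disk_dir = Path(ram), Path(disk)
    persisted_steps = [
        step for step in range(1, 11)
        if save_checkpoint(step, b"\x00" * 1024, ram_dir, disk_dir, persist_every=5)
    ]
    ram_count = len(list(ram_dir.iterdir()))
    disk_count = len(list(disk_dir.iterdir()))

print(f"Checkpoints in RAM dir: {ram_count}, persisted to disk: {disk_count}")
```

With `persist_every=5`, only a fifth of the checkpoint writes reach the SSD, which is the write reduction the post is pointing at. The trade-off is that anything not yet persisted is lost on power failure, hence the UPS.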

TL;DR -- this is a solution looking for a non-existent problem.