News Unpowered SSD endurance investigation finds severe data loss and performance issues, reminds us of the importance of refreshing backups

You have to use the SSD vendor's tool to do a low-level reformat, if you want any chance of restoring its reserve of spare blocks. Even then, if the NAND cells have suffered heavy wear, an old drive will wear out at an accelerated rate. It also might not retain written data as well, when left in an unpowered state.
So where is the expiration date on my new SSD box? Because, as you may know, SSDs sitting on a warehouse shelf are not powered on.
 
So where is the expiration date on my new SSD box? Because, as you may know, SSDs sitting on a warehouse shelf are not powered on.
It's the data in the cells that decays, not the cells themselves (at least, not just from sitting there). They don't expect you to read whatever values were stored on the drive before you wrote anything to it.

Once you write something to a cell, that's when the clock starts ticking, so to speak. Except, as I mentioned, the rate at which the charge decays depends on various factors, primarily the temperature at which the drive is stored.

Think of the cells like nano-scale rechargeable batteries. As soon as you write something to them, the charge starts leaking (mostly out, but some charge can leak in from neighboring cells).
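For a rough feel of how strongly temperature dominates, here's a toy Arrhenius-style estimate. The ~1.1 eV activation energy is just an assumed, commonly cited ballpark for charge loss, not a figure from any particular datasheet:

```python
import math

K_BOLTZMANN_EV = 8.617e-5  # Boltzmann constant in eV/K

def acceleration_factor(t_cool_c: float, t_warm_c: float, ea_ev: float = 1.1) -> float:
    """Arrhenius acceleration factor: how much faster charge loss proceeds
    at t_warm_c compared with t_cool_c. ea_ev is an assumed activation
    energy, not a spec for any particular drive."""
    t_cool = t_cool_c + 273.15
    t_warm = t_warm_c + 273.15
    return math.exp((ea_ev / K_BOLTZMANN_EV) * (1.0 / t_cool - 1.0 / t_warm))

# Example: if a drive retains data for ~12 months stored at 25 C,
# a crude Arrhenius estimate of retention at 40 C would be:
months_at_25c = 12.0
af = acceleration_factor(25.0, 40.0)
print(f"acceleration factor 25C -> 40C: {af:.1f}x")
print(f"estimated retention at 40C: {months_at_25c / af:.1f} months")
```

The point isn't the exact numbers, just that a modest bump in storage temperature can cut retention by a large factor.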
 
Anyone have concerns that he used cheap no-name drives and not high-quality drives from a major manufacturer? There have been some truly crappy generic hard drives in the past, so why would SSDs be different? Garbage in, garbage out probably applies to this experiment (in my opinion).
 
It's the data in the cells that decays, not the cells themselves (at least, not just from sitting there). They don't expect you to read whatever values were stored on the drive before you wrote anything to it.

Once you write something to a cell, that's when the clock starts ticking, so to speak. Except, as I mentioned, the rate at which the charge decays depends on various factors, primarily the temperature at which the drive is stored.
You quoted the part of my comment where I was talking about new drives (closed box) sitting on a shelf. I know it's the data that are affected, not the cells themselves and it's exactly why I said that this "performance degradation" thing was misleading. The performance of the drive itself doesn't go bad because it's not powered on. Format it and it's as good as new.
 
You quoted the part of my comment where I was talking about new drives (closed box) sitting on a shelf. I know it's the data that are affected, not the cells themselves
Okay, then I misunderstood your comment.

it's exactly why I said that this "performance degradation" thing was misleading. The performance of the drive itself doesn't go bad because it's not powered on. Format it and it's as good as new.
Yeah, the performance degrades when the controller is encountering errors. You're right that when you read freshly-written data, it will tend to be error-free, and thus no performance penalty is incurred.

One thing, though. A worn drive should have lower write performance, because the writing process is iterative. An SSD controller writes a value to a cell, then has to read it back and check the voltage, then write it again. This process can repeat, and the more worn the cells are, the more iterations tend to be needed.

This is why SSDs use fast pSLC write buffers, and then the controller has to go back and "compress" that data by writing it at native density. Otherwise, people would complain about how slow the write performance is on most modern drives.
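Here's a toy model of that write/verify loop, purely illustrative (the noise numbers are made up, not any vendor's algorithm), just to show why more worn cells tend to need more iterations:

```python
import random

def write_cell(target_v: float, wear: float, tolerance: float = 0.05,
               max_tries: int = 20) -> int:
    """Toy write/verify loop: program the cell, read it back, and retry
    until the voltage lands within tolerance of the target.
    wear in [0, 1] widens the placement error, so worn cells need more tries."""
    for attempt in range(1, max_tries + 1):
        actual_v = target_v + random.gauss(0.0, 0.02 + 0.10 * wear)  # program pulse
        if abs(actual_v - target_v) <= tolerance:                    # verify read
            return attempt
    return max_tries

random.seed(0)
for wear in (0.0, 0.5, 1.0):
    tries = [write_cell(target_v=2.0, wear=wear) for _ in range(10_000)]
    print(f"wear={wear:.1f}: average program/verify iterations = {sum(tries)/len(tries):.2f}")
```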
 
HDDs are very reliable in my experience. I'm still using one from 2013 connected to my Wii U. I retrieved one from a laptop that hadn't been powered on in years, and the drive still worked just fine with no apparent issues.

I've also never encountered problems with SSDs or flash memory; though I still wouldn't use them for important cold storage, I suspect the failure rate is much lower than this YouTuber's experiment would have you believe.
 
This lack of longevity isn't news. Do real backups; don't treat a cold-stored SSD as one.


“As per the JEDEC standards for consumer SSDs, SSDs after reaching their endurance limits, should be able to store the data for at least 1 year at 30°C. For enterprise SSDs, the retention period is at least 3 months at 40°C.

However, nobody exactly knows how long an SSD will keep its data in place once we disconnect it from the host system. However, there are different studies and standards set for the manufacturers which we are going to discuss in this article.

For consumer-grade SSDs, data retention typically ranges between 1 to 5 years without power under normal storage conditions i.e. around 30°C. Some manufacturers claim data retention of up to 10 years, but this can be shorter if the SSD is subjected to higher temperatures or if it has been heavily used during its lifespan. At room temperature (about 25°C), retention of up to 1 year is considered reliable, but at higher temperatures (30°C or more), data may only last a year or less.”
 
I've also never encountered problems with SSDs or flash memory; though I still wouldn't use them for important cold storage, I suspect the failure rate is much lower than this YouTuber's experiment would have you believe.
That depends. In general, retention should get worse as the cells shrink and more bits are packed per cell. There have been improvements in cell design and manufacturing, but I suspect they've been just enough to keep pace with the capacity increases.

That video involved low-end TLC drives, which probably date back to when TLC was the highest-density available. If you tried the same experiment with newer, low-end QLC drives, I expect the results would generally be as bad or worse.

However, nobody exactly knows how long an SSD will keep its data in place once we disconnect it from the host system.
The manufacturers do. I'm sure they have charge leakage models and curve-fit real-world test data, as part of their quality control procedures.
 
The manufacturers do. I'm sure they have charge leakage models and curve-fit real-world test data, as part of their quality control procedures
Agreed, to a point. Retention, and where the cells are in their lifetime of wear, is entirely use dependent. For a given level of wear and tear, the SSD manufacturer can infer a value within a tolerance, but for a specific drive the number of variables is too great.
As for the extract pulled out in bold: the article goes on to explain more. 1 to 5 years unpowered is the range, from the minimum through to the best you should expect from a consumer SSD within its rated lifetime.
 
I don't know if you or whoever you were quoting got confused, but even the Crucial M500 (from 2013) had self-refresh. There was a bug in one version of their firmware, where it didn't run if you had ASPM enabled, so they advised people either to disable ASPM or upgrade their firmware.
The reason I bring up the MX500 is that it was the only consumer SSD found to perform time-based refresh of its cells. Contrast that with the other consumer SSDs tested (FireCuda 520, SN850, Patriot P210, TeamGroup CX2, 970 EVO Plus, WD Blue WDS100T2B0A), which were shown not to automatically refresh NAND pages after they have degraded over time. The way it was proven that this group of SSDs does not automatically refresh its NAND was the observation of severe degradation in the read speed of data written to the SSDs long ago (6-36 months), followed by a restoration of performance only after the drives were rewritten block by block. There was some discussion of FTL fragmentation and the TRIM implementation as the cause of the performance decrease, but for various technical reasons both ideas were dismissed.

I'm intentionally mentioning time-based refresh here because technically all of these SSDs do implement read-disturb refreshes, which could be construed as "self-refresh", but that is not relevant to this conversation.


Another thing I want to stress is that none of these SSDs (talking about the external thread) actually lost data. None of them had been aged enough to prevent the controller from reading the NAND, but the controller did have to apply ECC and charge-offset routines to read the data, which significantly slowed performance. Also, the SSD controllers failed to rewrite the weak NAND cells even after the first read attempt; only a manual rewrite of the SSDs restored performance.


The kind of densities modern SSDs have wouldn't be possible if they didn't self-refresh.
LDPC is probably the biggest contributor to helping SSDs achieve relatively error-free operation at high densities, given the relatively high raw BER of the NAND used. Also keep in mind that the NAND manufacturers went from a 15nm-class node to a 45nm-class node when they went from planar to 3D NAND, which also helped a lot (there is a technical paper detailing all of this in post #14 of the linked thread).
 
As for the extract pulled out in bold: the article goes on to explain more. 1 to 5 years unpowered is the range, from the minimum through to the best you should expect from a consumer SSD within its rated lifetime.
Rated lifetime? I've only ever seen a warranty term and that indeed ranges anywhere from 1 to 5 years. Never seen longer than that, on a storage product. Any of these fabled 10-year drives must be specialized industrial models and probably very low-density, low-performance as well.

I really wouldn't trust an SSD to retain data for the entire warranty period, without power. In this article + thread, we have two accounts of drives giving up long before that. One was the article's author (not the video, but Mark Tyson gave an anecdote about his laptop SSD) and another is the guy whose wife's laptop wouldn't boot. Remember that the warranty is on the hardware, not your data.
 
The reason I bring up the MX500 is ...
Thank you for providing those details. I remain skeptical, but don't have time to explore the matter further, at this time.

P.S. during the SATA era, I bought lots of Crucial SSDs, including a couple MX500's. Those and Intel were my go-to brands. The last 5 SSDs I bought were a mix of SK Hynix and Samsung 990 Pro.

P.P.S. I do keep an eye on the SMART stats of my drives. I also run integrity checks semi-regularly. I've never seen any with a significant number of their reserve blocks used up. I have some drives as old as the aforementioned M500 from 2013 and even a super-old C300 (SLC!).
 
I've never, ever heard of or seen such a thing. Not among server SSDs, at least. I guess I can imagine that for specialty industrial applications.
Went hunting for the specs for the enterprise drive, these are worded slightly differently from what I remember:

• 2.5” standard form factor with SATA standard interface connector.
• Compliant with SATA revision 3.1 standard with 6.0 Gb/s transfer rate.
• Compliant with ATA/ATAPI-8 standard and ACS-2 command protocol.
• Built-in-Voltage detector for power shielding FW protection.
• Native Command Queuing up to 32 commands
• Garbage collection and TRIM Data Set Management command
• Global wear leveling algorithm evens program/erase count
• Early weak block retirement - Supports SMART feature command set.
• Supports 28/48 bit LBA mode command
• Built-in temperature sensor (Thermal Throttling) function to adjust the access speed of NAND flash and keep the SSD system stable.
• Standard IPC A-610E
[Product Specifications]
• ECC Capability: Hardware LDPC ECC engine (120bit/1KB) and Block/Page RAID
• Program / Erase Endurance: 3,000 P/E cycles
• Optimal sustained performance: Direct-To-TLC and SLC Cache Architecture
• Data Endurance & Data integrity: StaticDataRefresh technology, Early weak block retirement, Global Wear leveling
• Data Retention: ****
- 10% of program / Erase Endurance cycles: 10 Years
- 100% of program / Erase Endurance cycles: 1 Years
• Performance: (Maximum Read/Write)*
-512GB: 540/520 MB/s
-1TB: 540/510 MB/s
-2TB: 540/510 MB/s
(Test Platform: Average Value is based on Serial ATA 6.0Gb/s interface.)
• IOPS:*
-512GB: 49,000/82,000
-1TB: 51,000/85,000
-2TB: 50,000/85,000
• TBW:
-512GB: 728TB
-1TB: 1456TB
-2TB: 2912TB
• Humidity: 10% to 95% RH, non-condensing
• Operating System: Windows XP/7/8/10, or Windows Embedded Systems, DOS, Linux

*The value is various bases on the capacity and the test platform.
**Duration: 30 min x 3 axis
*****The value is based on normal program/erase endurance at room temperature. High environmental temperature may shorten the retention period.


Drive was actually a bit slower on the Read/Write: closer to 480/460, according to my notes (I benchmark all new drives, just for reference), but maybe that was due to the slightly older ThinkPad?

However, yeah, this set of specs doesn't specify powered up vs powered down for data retention, but 10 years for 300 P/E cycles? Who can resist? Probably due to the "StaticDataRefresh technology" which most drives don't seem to have, but which only applies while the drive is powered on. Unfortunately, this drive is now sold out everywhere, probably discontinued.

Maybe if Optane had caught on more, and gotten less expensive, it might have longer powered down retention? Still waiting for the next generation of storage technology.

Can tell from the overkill of putting this drive into a retired laptop "just in case" - I take my "Oh, Honey..."'s seriously. ("How can we, prevent this, from happening, again?" - mantra from my time in the US Navy)

Finally,
Specs are interesting:
3,000 P/E cycles should, neglecting write amplification, on a 512G drive, mean around 1500 TB TBW. Yet, rated TBW is only 728 TB. This presumably means write amplification ratio > 2.0, which seems a little high, although, that's probably a consequence of the "StaticDataRefresh technology": it kicks up the ratio, in order to preserve the data. Worth it, I guess.
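For reference, the back-of-envelope arithmetic behind that (same figures as above, ignoring GB/GiB rounding and spare area):

```python
capacity_tb = 0.512      # 512 GB drive, expressed in TB
pe_cycles = 3000         # rated program/erase cycles
rated_tbw = 728          # TBW from the spec sheet

raw_write_budget_tb = capacity_tb * pe_cycles     # ~1536 TB of raw NAND writes
implied_waf = raw_write_budget_tb / rated_tbw     # write amplification the rating implies

print(f"raw NAND write budget: {raw_write_budget_tb:.0f} TB")
print(f"implied write amplification factor: {implied_waf:.2f}")
```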
 
You have to use the SSD vendor's tool to do a low-level reformat, if you want any chance of restoring its reserve of spare blocks. Even then, if the NAND cells have suffered heavy wear, an old drive will wear out at an accelerated rate. It also might not retain written data as well, when left in an unpowered state.
I get what you mean, but a low-level format isn't a thing in the SSD world. You need tracks and sectors and such to lay down a low-level format. Most modern SSDs write all data with an internal encryption key, and the drive erase does nothing other than rotate the key, thereby invalidating all data - it does nothing at a page/block level. Performing an erase operation on every block causes an immense amount of wear on drives.
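A toy illustration of why rotating the key is enough (the XOR "cipher" here is just a stand-in so the example is self-contained; real drives use a hardware AES engine):

```python
import hashlib, secrets

def keystream(key: bytes, length: int) -> bytes:
    # Toy keystream derived from the key -- a stand-in for the drive's
    # hardware encryption engine, NOT real cryptography.
    out = bytearray()
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(out[:length])

def xor(data: bytes, key: bytes) -> bytes:
    return bytes(a ^ b for a, b in zip(data, keystream(key, len(data))))

media_key = secrets.token_bytes(32)                 # internal key the user never sees
stored = xor(b"user data on the NAND", media_key)   # what actually hits the flash

print(xor(stored, media_key))    # normal read: the plaintext comes back

media_key = secrets.token_bytes(32)   # "secure erase": just rotate the key
print(xor(stored, media_key))         # same NAND contents, now unreadable garbage
```

No page or block was erased, yet the old data is gone for good, which is why the key swap is so fast and so gentle on the NAND.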
 
I use Samsung SSDs and Samsung FIT Plus flash drives. I have never had a failure with a Samsung SSD or flash drive. I would never use a cheap brand or worn out drive to store important files. That is foolish.
When my SSD gets to about 80% of its expected useful write life TBW rating, I start shopping for a replacement SSD on sale.
I use MLC or TLC flash memory, no QLC.
I have not used mechanical magnetic hard disk drives, vacuum tube TVs, or fax machines, for many years.

SSD and flash drive specs do not state a minimum data retention time. The JEDEC standard is at least one year after the device reaches its write cycle end-of-life, because the leakage gets too high. But, a new name brand MLC drive has very low memory cell leakage and can store data for many years, if stored at a cool temperature.
To detect data rot early, I store files on new Samsung FIT Plus flash drives, along with SHA-256 hash values, generated by TeraCopy. At least once a year, I run a file hash verify with TeraCopy. I have multiple local and remote backup copies, so I can rewrite any file that fails hash verify, or discard a drive with a high error rate.
My most important files are stored in Folder A, plus a duplicate backup copy in Folder B, for redundancy.

I store the flash drives in a fireproof waterproof digital media safe, at a cool temperature.
Checking hash values periodically is known as an active archive strategy, much better than hoping files are intact with a passive archive strategy.
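Here's a minimal sketch of that kind of verify pass in plain Python rather than TeraCopy (the archive and manifest paths are hypothetical):

```python
import hashlib, json, pathlib

def sha256_of(path: pathlib.Path) -> str:
    h = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            h.update(chunk)
    return h.hexdigest()

def build_manifest(root: pathlib.Path, manifest: pathlib.Path) -> None:
    """Record a SHA-256 for every file under root (run once, right after writing the archive)."""
    hashes = {str(p.relative_to(root)): sha256_of(p)
              for p in sorted(root.rglob("*")) if p.is_file()}
    manifest.write_text(json.dumps(hashes, indent=2))

def verify_manifest(root: pathlib.Path, manifest: pathlib.Path) -> list[str]:
    """Return the files whose current hash no longer matches the manifest, or that are missing."""
    hashes = json.loads(manifest.read_text())
    bad = []
    for name, digest in hashes.items():
        p = root / name
        if not p.is_file() or sha256_of(p) != digest:
            bad.append(name)
    return bad

# Hypothetical paths -- point these at the flash drive and a manifest kept elsewhere.
# build_manifest(pathlib.Path("E:/archive"), pathlib.Path("D:/archive.manifest.json"))
# print(verify_manifest(pathlib.Path("E:/archive"), pathlib.Path("D:/archive.manifest.json")))
```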

MLC flash memory is getting harder to find, but Transcend offers a number of MLC consumer grade, prosumer grade, and industrial grade flash products, that I have added to my Samsung drives.

Disk Genius software includes a sector speed test. I run a check when I buy a new drive and store the results. Once a year I run the sector speed test again to watch for drive slowing as cells wear out, causing read retries that slow down performance.
 
Went hunting for the specs for the enterprise drive, these are worded slightly differently from what I remember:
Kinda odd not to include a link or tell us the brand & model number.

As for the specs, they look pretty standard, except for that one about endurance:

• Data Retention: ****
- 10% of program / Erase Endurance cycles: 10 Years
- 100% of program / Erase Endurance cycles: 1 Years
So, that's a very interesting carve-out. I haven't seen something like that, previously.

However, yeah, this set of specs doesn't specify powered up vs powered down for data retention,
Right, good point. Could make sense, even as a "powered up" stat, because once you max the P/E cycles, you can no longer reliably refresh faltering blocks.

Maybe if Optane had caught on more, and gotten less expensive, it might have longer powered down retention? Still waiting for the next generation of storage technology.
I had wondered about this and done some poking around, but what I found seemed to indicate that Optane's data retention was even worse than NAND!

Can tell from the overkill of putting this drive into a retired laptop "just in case" - I take my "Oh, Honey..."'s seriously. ("How can we, prevent this, from happening, again?" - mantra from my time in the US Navy)
Backup to hard disk or cloud. That's your best bet. Anything else is a crapshoot. At least, if you're going to leave it powered-off.

3,000 P/E cycles should, neglecting write amplification, on a 512G drive, mean around 1500 TB TBW. Yet, rated TBW is only 728 TB. This presumably means write amplification ratio > 2.0, which seems a little high, although,
As NAND capacity increases, erase blocks are getting larger. That should mean write amplification getting worse. It also depends a lot on what sorts of writes they think are happening.

I once noticed Firefox was doing zillions of fairly small writes. Probably cookies, I'd guess. Logging is another source of small writes, and that's a more common issue on embedded systems, like how they seemed to be positioning that drive.

that's probably a consequence of the "StaticDataRefresh technology": it kicks up the ratio, in order to preserve the data. Worth it, I guess.
Could be. They might be estimating how many times a cell will typically get refreshed, over the course of the warranty period and subtracting that from the number of supported user writes.
 
I get what you mean, but a low-level format isn't a thing in the SSD world.
NVMe has a format command. I'm not very clear on the details of exactly what it's meant to do. It does seem to operate at the namespace-level, rather than on the entire device.

I think it's conceivable that vendor tools have some kind of low-level command that resets the FTL, but if any of their tools expose it, it'd be the kind of thing that requires double-confirmation to activate.
 
I have never had a failure with a Samsung SSD or flash drive.
Back when I was buying Crucial SATA SSDs, the customer reviews on Samsung 8xx Pro drives were a little worse. I think Samsung has improved, since then. However, their 990 Pro had a bug in the initial firmware that was burning up people's endurance.

Also, it came out quite late in the lifecycle of the 980 Pro that they also had a premature death issue.

Anyway, for pretty much any popular SSD model, you can probably find someone claiming it died and took their data with it.

I would never use a cheap brand or worn out drive to store important files. That is foolish.
The only non-foolish solution is to backup your important files, preferably on optical or magnetic storage.

When my SSD gets to about 80% of its expected useful write life TBW rating, I start shopping for a replacement SSD on sale.
I wouldn't wait that long, especially if you can't afford for it to fail.

I use MLC or TLC flash memory, no QLC.
The later models to use MLC and TLC were far better than the early ones. Remember that when MLC and TLC were first introduced, it was on low-end drives and they weren't great.

Disk Genius software includes a sector speed test. I run a check when I buy a new drive and store the results. Once a year I run the sector speed test again to watch for drive slowing as cells wear out, causing read retries that slow down performance.
I wouldn't bother with a speed test. It's much simpler and more reliable to look at the SMART stats.
 
Rated lifetime? I've only ever seen a warranty term and that indeed ranges anywhere from 1 to 5 years. Never seen longer than that, on a storage product. Any of these fabled 10-year drives must be specialized industrial models and probably very low-density, low-performance as well.

I really wouldn't trust an SSD to retain data for the entire warranty period, without power. In this article + thread, we have two accounts of drives giving up long before that. One was the article's author (not the video, but Mark Tyson gave an anecdote about his laptop SSD) and another is the guy whose wife's laptop wouldn't boot. Remember that the warranty is on the hardware, not your data.
What are you trying to argue?

1, I used the words “rated lifetime” meaning endurance. I read endurance, as provided by the manufacturer, as assuming a perfectly written drive, all cells evenly written. I was trying to convey the rated lifetime of a cell that has been used frequently and is approaching its P/E limit.
2, I agree, warranties tend to range between 1 and 5 years - more typically 5 years, from those I have seen. The Samsung 850 Pro SSD (MLC) came with a 10-year physical warranty or the limit of its endurance. I have one of these devices and it still works beautifully.
3, The ‘fabled’ 10-year drives - SLC: the charge is there or it isn’t, one cell, one bit. A non-zero charge (above a trigger threshold) will set the bit. The potential for leakage is still present, but it is only a factor if the charge falls below the trigger threshold. So a full charge decaying all the way to the trigger point is still valid data on an SLC drive, whereas a TLC drive will have lost 3 bits across 8 voltage levels (7 threshold windows) and QLC will have lost 4 bits across 16 voltage levels (15 windows). So one big window (SLC) versus 7 small windows (TLC), and smaller still for QLC - you can see that the data is more fragile in TLC, and more so in QLC (see the sketch at the end of this post).

The link I posted gives an idea of how long data may be retained and an idea of how quickly it can be lost. Even if you just look at the tables you can see the causes for the loss of the trapped charge and loss of data.

(Opinion) My expectation is for data to degrade, and corruption to begin, after about a year. I don’t trust TLC, and I don’t trust QLC, for archiving. I can no longer buy MLC; I would if I could. SLC was prohibitively expensive.
Archive data to appropriate media. Refresh the backups periodically and keep a number of copies (3 is the minimum recommended), all stored in different places.
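Here's the sketch mentioned in point 3 above, just putting numbers on how the per-state voltage window shrinks as bits per cell go up (idealized, ignoring the uneven level spacing real drives use):

```python
# Idealized view: n bits per cell means 2**n charge states squeezed into
# the same overall voltage range, so each state's window shrinks accordingly.
for name, bits in (("SLC", 1), ("MLC", 2), ("TLC", 3), ("QLC", 4)):
    levels = 2 ** bits              # distinguishable charge states
    windows = levels - 1            # thresholds between adjacent states
    rel_window = 1.0 / windows      # window width relative to SLC's single window
    print(f"{name}: {bits} bit(s)/cell, {levels} levels, {windows} threshold windows, "
          f"each ~{rel_window:.0%} the width of SLC's window")
```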
 
The only non-foolish solution is to backup your important files, preferably on optical or magnetic storage.
I wouldn't trust optical. Sure, factory-pressed discs can last a very long time, but unless you find archival-grade media and have climate-controlled storage...

The safest way seems to be having multiple copies of your dataset at different physical locations, having redundancy at each location, and keeping the data online on a filesystem with data checksumming and periodic scrubbing (i.e. ZFS). But that's tedious and expensive.

I recently plugged in a cold-storage hard drive after a couple of years offline, and it suddenly had unreadable sectors, even though it had been stored in a shock-absorbent case in a relatively controlled climate. Obviously just a single data point which shouldn't be extrapolated - but it did make me reconsider my archival strategy.
 
I use 3-2-1-1-0 backup. At least 3 copies of important files, stored on at least 2 types of storage media, with at least 1 copy offsite for disaster protection, at least 1 copy on immutable media, and backups periodically hash-verified for 0 errors.

Samsung FIT Plus flash drives are fast, small, and rugged, metal body, temperature proof, shock proof, magnet proof, water proof, X-ray proof, dust proof, altitude proof, within specs. Old school fragile mechanical HDD can't compete.

FIT Plus flash drives have a 5-year warranty and are available up to 512 GB storage capacity, but do not support SMART diagnostic parameters (only a few premium model USB flash drives support SMART), so I use DiskGenius to check the sector speeds once a year to look for slowing performance. Since they are not used for heavy writing loads, they do not wear out. DiskGenius also lets the user retire the slowest sectors, as well as bad sectors, so only the healthiest sectors are used.

I replace the Samsung SSDs when they reach about 80% of TBW, no need to replace them sooner. Early SSDs had a manufacturer learning curve and reliability has improved. My early Intel and Kingston SSDs failed. My Samsung 850 Pro MLC SSD is 8 years old, with a 10-year warranty, and still running strong.

(I also buy Samsung smart TVs and smartphones and they have never failed.)

I make a year-end backup on Verbatim archival-grade DVDs.

I use encrypted cloud storage at Filen (Germany) and Proton Drive (Switzerland). I do not use any Big Tech snoop cloud storage.

There is some backup software that can also run scheduled file integrity checks and file refresh to recharge flash memory cells, but it is not common, or a user can do it manually once a year.
 
Samsung FIT Plus flash drives are fast, small, and rugged, metal body, temperature proof, shock proof, magnet proof, water proof, X-ray proof, dust proof, altitude proof, within specs. Old school fragile mechanical HDD can't compete.
LOL, are you throwing your backup media against a wall, sticking it inside degaussing coils, dunking it in the toilet, flying on a plane with it, or leaving it on the shelf of a dusty shed? If not, then none of that matters. The only thing that probably matters is the rate of charge leakage vs. the rate at which HDD media loses its magnetic coercivity. The steel body of HDDs provides pretty good protection against external magnets. I've personally knocked a 1 TB HDD off a table and onto a hardwood floor, by accident. It was unplugged, obviously, and no damage was done. I scanned the entire drive - zero errors.

I don't really care what you do. But, you seem to be distracting from the main issue. Anyway, it's definitely good that you keep backups and check your media.

My early Intel and Kingston SSDs failed.
I have an Intel 520 and 530 that are still in service. Also, the datacenter equivalent of the 750. Those are the only 3 Intel SSDs I ever bought. So, I'm 3 for 3.

My Samsung 850 Pro MLC SSD is 8 years old, with a 10-year warranty, and still running strong.
Okay, thanks for that. So, now I do know of a consumer SSD that had a 10-year warranty.

(I also buy Samsung smart TVs and smartphones and they have never failed.)
Don't get me wrong - I like Samsung as a brand. I haven't personally had any bad experiences with them. I even bought two 990 Pro SSDs!

I make a year-end backup on Verbatim archival-grade DVDs.
That's cool. I still have some Taiyo Yuden blanks (RIP), but I've mostly moved on to BD-R.

I use encrypted cloud storage at Filen (Germany) and Proton Drive (Switzerland).
I think Proton is good. I use their cloud services, too.

There is some backup software that can also run scheduled file integrity checks and file refresh to recharge flash memory cells, but it is not common, or a user can do it manually once a year.
I keep all my important stuff on Linux. I use BTRFS, which has a built-in checksum. When I want to "refresh" a volume, I use badblocks, but not before running fstrim on volumes that support it. Also, when using badblocks to refresh a volume, never do this on the whole device - always just whichever logical devices have data you want to keep. Then run fstrim again afterwards.
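For anyone curious, a rough sketch of that sequence (device and mount point are hypothetical placeholders; badblocks -n is the non-destructive read-write mode, the filesystem should be unmounted while it runs, and the whole thing needs root):

```python
import subprocess

# Hypothetical placeholders -- substitute your own logical device and mount point.
DEVICE = "/dev/mapper/data"   # the logical device holding the data you want to keep
MOUNTPOINT = "/mnt/data"

def run(*cmd: str) -> None:
    print("+", " ".join(cmd))
    subprocess.run(cmd, check=True)

# 1. Trim while the filesystem is still mounted, so free space is reported to the drive.
run("fstrim", "-v", MOUNTPOINT)

# 2. Unmount, then rewrite every block in place with badblocks' non-destructive
#    read-write test (-n): it reads each block, writes test patterns, and restores
#    the original contents, which refreshes the stored charge.
run("umount", MOUNTPOINT)
run("badblocks", "-nsv", DEVICE)

# 3. Remount and trim again, since the rewrite pass also re-wrote blocks the
#    filesystem considers free.
run("mount", DEVICE, MOUNTPOINT)
run("fstrim", "-v", MOUNTPOINT)
```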
 
NVMe has a format command. I'm not very clear on the details of exactly what it's meant to do. It does seem to operate at the namespace-level, rather than on the entire device.

I think it's conceivable that vendor tools have some kind of low-level command that resets the FTL, but if any of their tools expose it, it'd be the kind of thing that requires double-confirmation to activate.
It is possible to change the logical sector size on some drives, but this is a logical operation, not what we old dogs have come to think of as a low-level format, where a magnetic head physically alters the number of tracks, etc.

With smaller cell sizes and more bits per cell, the number of erase operations that can be performed is reduced. For this reason, performing an erase on all blocks is to be avoided on modern drives that you're planning to keep in operation. It is something people used to do to prepare for disposal. However, that's mostly deprecated today and replaced with the crypto key swap operation I mentioned before. It's more secure, faster (instant), and gentler on the drive than performing a block erase.