Question: Flash drive data retention improvement and verification?

Crag_Hack

Distinguished
Dec 25, 2015
Hi, I did my homework investigating data retention length for flash drives. There seemed to be some disagreement about how to prolong data retention on these drives when they aren't powered on frequently. Some people claimed the flash drive controller would refresh the data when plugged in for a while (in addition to doing wear leveling and garbage collection). Others claimed you need to use something like HDDErase to wipe the drive and then recopy the data to refresh it. What's the dealyo? Who's right?

Also for a quality drive let's say a Sandisk Ultra Flair (I use these all the time) what's a conservative ball park estimate for how long the drive can preserve data without being turned on? (if it's even possible to give such an estimate)

And I know it's hard to give #s for things like this but are there any conservative #s regarding how often to plug in a flash drive and how long in order to preserve its data? Or how often one needs to do an HDDErase type of operation.

Lastly, how can one verify that the data on a flash drive is still intact? I have a bunch of flash drives I use for work with older OSes on them, like Mac OS Catalina, that only get used every year or two at the most, and I would like to know if they are still happy or if I have to set up their contents again. Some might have been sitting around for 3-4 years since they were last used. One minor complication might be that I'm on Windows but some of the drives are formatted Mac OS Extended (Journaled).

Thanks!
 
Regarding:

"what's a conservative ball park estimate for how long the drive can preserve data without being turned on?"

My conservative answer is zero.

Why? Any given storage device can fail at any given time without reason.

Which is the basic reason that multiple backups should be made and verified recoverable.

Just my thoughts on the matter.
 
Just to note, not all flash thumb drives do all that stuff like wear leveling, garbage collection and cell refreshing, though the top brands do. But that is one reason for cheap drives being slow and non-durable. As to the non-powered data retention, it ought to be pretty similar to an SSD, since it's still NAND flash, but the period depends on the flash used, MLC or TLC or QLC, the brand, the generation, the specific type like 3D V-NAND versus others.

Tom's just posted this article about tests done on some no-name, very small SSDs that were used to test the retention period. https://www.tomshardware.com/pc-com...ds-us-of-the-importance-of-refreshing-backups

They generated hashes for the files on the drives in order to verify them after sitting unpowered for a while, to answer the question of how to check your files. Of course you have to store those hashes somewhere that they won't get lost or corrupted. You can probably consider the retention issue to be the same for thumb drives.
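To make the hash idea concrete, here is a minimal Python sketch of how you could do the same thing yourself. It is not the method from the article, just one way to do it: the function names (`sha256_of`, `build_manifest`, `verify`) and the choice of SHA-256 are my own assumptions. It walks a mounted drive, records a hash per file, and later re-reads every file to compare.

```python
from __future__ import annotations

import hashlib
from pathlib import Path

CHUNK = 1024 * 1024  # read 1 MiB at a time so large ISOs don't fill RAM


def sha256_of(path: Path) -> str:
    """Return the SHA-256 hex digest of one file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for block in iter(lambda: fh.read(CHUNK), b""):
            digest.update(block)
    return digest.hexdigest()


def build_manifest(root: Path) -> dict[str, str]:
    """Map each file's path (relative to root) to its SHA-256 digest."""
    return {
        p.relative_to(root).as_posix(): sha256_of(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def verify(root: Path, manifest: dict[str, str]) -> dict[str, list[str]]:
    """Re-hash the drive and report files that are missing, changed or unreadable."""
    report: dict[str, list[str]] = {"missing": [], "changed": [], "unreadable": []}
    for rel, expected in manifest.items():
        target = root / rel
        if not target.is_file():
            report["missing"].append(rel)
            continue
        try:
            actual = sha256_of(target)
        except OSError:
            # A read error on a file that exists is itself a warning sign.
            report["unreadable"].append(rel)
            continue
        if actual != expected:
            report["changed"].append(rel)
    return report
```

You would run `build_manifest` once while the drive is known to be good, save the result (for example with `json.dump`) somewhere other than the drive itself, and run `verify` against it at each check. Note this works at the file level, so the drive has to be mounted by an OS that can read its filesystem; Windows does not read Mac OS Extended volumes natively, so those drives would need to be checked on a Mac or with a third-party driver.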

After one year unpowered, they had no problems. After two years, there was some data loss. Those drives likely didn't use the highest-quality flash. JEDEC's requirement is one year at 30°C (so it should last longer at normal room temperatures) for SSDs, and some manufacturers claim multiple years. EaseUS even says 2 to 5 years is expected. However, there are plenty of reports of people losing data after a drive sat unpowered for 6 months or even less, so it can vary with each specific drive even if the manufacturer's design is intended to meet the JEDEC specifications or exceed them.

If you're going to use thumb drives or SSDs as long-term unpowered storage and don't want to have to power them on regularly, you can go with the one-year assumption with backups. Make two or more copies of the data, generate hashes for it and store at least THREE copies of those hashes on other media (so you won't risk having only two mismatched hashes and not know which one is right), and then compare them every year. The more copies you have, the less likely that you'll completely lose any particular chunk of data, as at least one copy will be good.

If the device is a good first-party brand, it will probably have the features to allow it to refresh the cells if it's plugged in for a while, and the time while you're checking the hashes might be enough to take care of it for another year, from what I've read. When you do that, you are also forcing the controller to access all the cells with data, during which the controller will be evaluating the condition of the cells and potentially performing reallocations or moving data around for wear-leveling, which would put any moved data in freshly-charged cells.
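On the point about keeping three copies of the hashes: the reason is that with three you can take a majority vote, while with two a mismatch leaves you guessing. A rough Python sketch of that vote is below. The names `majority_digest` and `reconcile` are made up for illustration, and it assumes each stored copy is a simple mapping of file path to hash.

```python
from collections import Counter


def majority_digest(copies):
    """Return the digest a strict majority of stored copies agree on, else None.

    `copies` holds one entry per stored manifest; None means that copy
    has no entry for the file.
    """
    counts = Counter(c for c in copies if c is not None)
    if not counts:
        return None
    digest, votes = counts.most_common(1)[0]
    return digest if votes * 2 > len(copies) else None


def reconcile(manifests):
    """Merge several copies of a hash manifest into one trusted manifest.

    Returns (trusted, disputed): trusted maps path -> digest where a
    majority agreed; disputed lists paths where no majority exists.
    """
    trusted, disputed = {}, []
    all_paths = sorted(set().union(*(m.keys() for m in manifests)))
    for path in all_paths:
        winner = majority_digest([m.get(path) for m in manifests])
        if winner is None:
            disputed.append(path)
        else:
            trusted[path] = winner
    return trusted, disputed
```

With three copies where one has gone bad, `reconcile` still returns the hash the other two agree on. With only two copies that disagree, the file ends up in `disputed`, which is exactly the "don't know which one is right" situation.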

Early SSDs didn't have any consideration of charge leakage during long periods without power, which led to a lot of data loss. Then the manufacturers fixed that and added these features, and I can't imagine that they need a very long period of time for the controller to do what's needed since they don't specifically tell us that it's needed, otherwise it would often not work and they'd still be getting flak for it. The same ought to apply to thumb drives that have methods for refreshing or testing after long periods unpowered.
 
@USAFRet My use case is a stick I carry on me at all times for backing up super important work that gets powered on pretty regularly, and then boot drives for Mac OS versions, Win 10/11, and another boot drive with 20 or so different boot setups created with Yumi (I'd hate for that guy to crap out).

@evermorex76 That Tom's Hardware article is exactly what inspired me to do the research and create this thread. Do you think Sandisk Ultra Flairs have cell refreshing functionality? Any idea how often I'd need to plug in my drives and let them do their thing, and how long to keep them plugged in, in order to refresh the cells and prevent data loss? I totally wouldn't be bothered doing it every several months or so. If necessary I could obviously do something like hash calculations at the same time to have the controller look at all the data. Also, the TH article referenced using CrystalDiskInfo and looking at the Hardware ECC Recovered value to get an assessment of the damage.
 
"Do you think Sandisk Ultra Flairs have cell refreshing functionality? Any idea how often I'd need to plug in my drives and let them do their thing and how long to keep them plugged in in order to refresh the cells and prevent data loss?"

If anybody had the functionality, it would be brands like Sandisk, Samsung, etc., who actually make the flash chips and controllers and charge a premium for their products. Even second-tier like Patriot or Silicon Power probably have it as they use controllers and flash from major brands and aren't producing cheap products. I'd only be concerned about the "grab a handful from the bin at MicroCenter while at the register" products or the fly-by-night randomly-generated brands on Amazon who might be getting cast-off components that didn't make the grade for other brands, or use mass manufacturers that simply make components using older generation equipment and technology to make everything as cheap as possible with little firmware development and the like. Plugging them in every few months should be more than enough, and just a few minutes ought to be enough for it to happen from what I've read. Generating and comparing hashes each time would simply be a full verification step, both confirming that the data is good and forcing the controller to check every bit.

I'm not sure the Hardware ECC Recovered number would directly help you assess anything unless you tracked it over time and saw it was mounting significantly. It could indicate that there are cells that are weak/failing, but the controller at that point ought to be reallocating those blocks anyway instead of leaving it to you to keep track of it. If the error was recovered, then the controller was able to read the data even though it took some extra work, and then either decided the block is bad or decided it wasn't a problem yet, and an individual user isn't going to know whether they should override the controller's decision about that. Uncorrectable errors would be more important, plus Media and Data Integrity Errors, reallocation counts, etc. But looking at my system, the NVMe drives don't even have most of those reported by CrystalDiskInfo for some reason, only the one SATA drive I have connected right now.

More importantly, that only works on SSDs and HDDs. CrystalDiskInfo doesn't show a USB thumb drive I plugged in at all, because thumb drives don't implement SMART and they aren't using a protocol like SATA or NVMe. (A drive in a UASP enclosure gets that data passed through so CrystalDiskInfo can read it.) I'm not sure there's actually any way to track or read that sort of data on a regular thumb drive.

If you're really concerned about it, these days there are USB SSDs that are actually NVMe drives with an integrated USB controller, but they're in a small form factor rather than only being a full-size drive with a cable. They can be roughly the size of a pack of gum or as small as a regular thumb drive, and the smallest sizes aren't even expensive. (Though obviously you aren't going to find anything like a 16GB model that you might just use for a bootable recovery image or cloning software.)

I have had many Sandisk thumb drives that were fine after several months or a year or more where the data wasn't altered but they were plugged in occasionally. I'd use Yumi or Ventoy or another tool like it to load ISOs of various versions of disk imaging/cloning tools for work as IT support. Yumi/Ventoy itself wouldn't be updated on the drive for a very long time, and while I might add a new version of one of the software tools, I usually kept the older versions as well for compatibility reasons, and all of it continued to work just fine without any efforts to refresh data. Or I just stored copies of application installers or other tools that I might run in Windows and rarely changed those. I usually used Sandisk because they were quality, (relatively) fast drives that were not expensive, so I don't have data for other brands except the no-name ones that I would only use for making copies of data for immediate transfers, and those rarely lasted long before they just stopped working and weren't even detected. I haven't worked for over 3 years but I still have 3 of the Sandisk drives, some of which still have the data on them from back then, and those drives are at least 5 years old. I rarely even plug them in anymore, like every 6 months or even less often.
 