Investigation: Is Your SSD More Reliable Than A Hard Drive?

Status
Not open for further replies.
[citation][nom]pjmelect[/nom]What I would like to know is what component on the SSD drive failed, was it the memory chips themselves or the controller or additional logic or was it due to construction defects or capacitors etc.[/citation]

There's still a lot of low level data that we didn't go into because it would require about five more pages (exaggerating but reliability is complicated). First there's the validated vs. unvalidated number. If you read the whole article then you know unvalidated is about 2x to 3x higher. In these cases, the SSD experiences some sort of compatibility problem or firmware related problem. That's 50-66% of the errors right there.
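As a rough check on the 50-66% figure: if the unvalidated failure rate is k times the validated one, the share of unvalidated failures beyond the validated baseline is (k - 1) / k. A minimal sketch, assuming that reading of "2x to 3x higher" (the function name is made up for illustration):

```python
def excess_share(ratio: float) -> float:
    """Share of failures beyond the validated baseline.

    `ratio` is the unvalidated failure rate divided by the validated one.
    Assumption: the extra failures are the compatibility/firmware ones.
    """
    if ratio < 1:
        raise ValueError("ratio must be at least 1")
    return (ratio - 1) / ratio


if __name__ == "__main__":
    for ratio in (2.0, 3.0):
        print(f"{ratio:.0f}x higher -> {excess_share(ratio):.1%} of failures")
```

At 2x that works out to one half and at 3x to two thirds, which lines up with the 50-66% range quoted above.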

Breaking things down further, I'd estimate up to 25% of the validated errors are probably flash translation layer related, meaning it's a recoverable error that a manufacturer can fix with a firmware update. (FYI, even though this is a "recoverable error," that only means the physical level of the drive can be brought back. Your data is likely gone. Btw, this also occurs with hard drives.) Keep in mind that this often still requires an RMA, because a simple firmware reflash won't fix the problem on the consumer end.

Robin Harris at Storage bits made a great point, "All SSDs do is replace a hard drive’s head disk assembly - the platters and heads - with a lot of flash chips. The rest of the stuff is the same - and that stuff accounts for about half of all drive failures. So the best we can expect is that SSDs could be twice as reliable.

But flash isn’t that reliable either, especially as feature sizes shrink. Few know that it takes ≈20 volts to write NAND flash, which is a lot when insulators are molecules thick. Entire plane failures on flash die are common. "


If you look back five years, the DRAM segment led the way in the memory business. Volatile memory production set the prices for the industry, and that was the R&D that broke the barriers for solid state research. Now manufacturing prices are dependent on non-volatile (NAND) production. DRAM isn't really the driving force anymore. As a result, the defective parts per million (DPM) is going to be higher for the latest and greatest [NAND] because manufacturers are constantly pushing the borders of density to drive down cost.

The conversation could go on and on. But reliability is no easy subject to discuss.

Rule #1 in storage: Backup Backup Backup!

Cheers,
Andrew Ku
TomsHardware.com
 
OK! Great article and now I have a better idea about SSDs. Back up, bu, bu...

I hope this won't be another megahertz, megapixel, etc. race among manufacturers, where everything is reduced to a few numbers.

I haven't gotten into SSDs yet because I preferred to have a lot of cheap redundancy first, and right now I am planning my first SSD purchase. But is the SSD race only going to get more explosive on speed, or will we see a stronger push on reliability? I would gladly pay the same for less speed and more reliability, since general performance would be very close. There is no need for only one measure to dominate the market.
 
So basically consumer grade SSDs are NOT ready for primetime as they are unreliable. Enterprise folks who can afford SLC SSDs and redundancy can use SSDs without the fear of lost data, but consumers get a POS for a SSD with MLC SSDs and poor firmware.
 
[citation][nom]beenthere[/nom]So basically consumer grade SSDs are NOT ready for primetime as they are unreliable. Enterprise folks who can afford SLC SSDs and redundancy can use SSDs without the fear of lost data, but consumers get a POS for a SSD with MLC SSDs and poor firmware.[/citation]

Not at all. If you look at the graph, the population of X25-Ms that we published had an AFR of ~0.7%. That's pretty good. But it's not an order of magnitude better than what we see in hard drives. Time will tell if the 3rd and 4th years are as kind.

Cheers,
Andrew Ku
TomsHardware.com

 


Maxtor is gone..... But anyways, Seagate and WD reacted no differently than the SSD vendors when we started to make our initial inquiries. All the storage people hate it when you start to really dig deep into failure rates, because the secret gets out that MTBF is almost meaningless.

All the storage manufacturers refused to assist us in collecting data. Unfortunately for them, we happen to have great friends at data centers, so we know a lot more about hard drives than they would like us to.

And by no means are hard drives perfect. Far from it: some of the consumer drives hit a cumulative failure rate of 40% in the fifth year. That's almost half of your drives failing after five years of constant use. That's kind of nuts.

It doesn't matter if it's hard drives or SSDs, MTBF makes it sound like you could hand a drive off to your grandchild. Gordon Hughes (one of the creators of Secure Erase and S.M.A.R.T.) made an interesting comment to me the other day: "It's all about the math." And that's completely true. The way the statistics are presented makes it sound like these things last forever. Totally false.
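To put some numbers on "it's all about the math," here is a minimal sketch of how an MTBF figure translates into an annualized failure rate (AFR) and a cumulative failure rate. It assumes a constant failure rate (the exponential model), which real drives don't strictly follow, and the 1,000,000-hour MTBF is a made-up spec-sheet value for illustration, not a figure from the article. The 40% five-year number is the one quoted above.

```python
import math

HOURS_PER_YEAR = 8760  # 24 * 365, i.e. a drive that is powered on all year


def afr_from_mtbf(mtbf_hours: float) -> float:
    """AFR implied by an MTBF, assuming a constant failure rate."""
    return 1.0 - math.exp(-HOURS_PER_YEAR / mtbf_hours)


def cumulative_failure(afr: float, years: float) -> float:
    """Fraction of drives expected to have failed after `years` at a constant AFR."""
    return 1.0 - (1.0 - afr) ** years


def constant_afr_from_cumulative(cumulative: float, years: float) -> float:
    """Constant AFR that would produce the given cumulative failure fraction."""
    return 1.0 - (1.0 - cumulative) ** (1.0 / years)


def mtbf_from_afr(afr: float) -> float:
    """MTBF in hours implied by a constant AFR (inverse of afr_from_mtbf)."""
    return -HOURS_PER_YEAR / math.log(1.0 - afr)


if __name__ == "__main__":
    spec_mtbf = 1_000_000  # hypothetical spec-sheet MTBF, in hours
    afr = afr_from_mtbf(spec_mtbf)
    print(f"Spec-sheet AFR:       {afr:.2%}")
    print(f"Five-year cumulative: {cumulative_failure(afr, 5):.2%}")

    observed = 0.40  # cumulative failure rate in year five, from the post above
    implied_afr = constant_afr_from_cumulative(observed, 5)
    print(f"AFR implied by 40% in five years: {implied_afr:.2%}")
    print(f"MTBF implied by that AFR: {mtbf_from_afr(implied_afr):,.0f} hours")
```

Under these assumptions a million-hour MTBF corresponds to an AFR under 1%, while a 40% cumulative failure rate over five years implies an AFR closer to 10%. That gap is the reason spec-sheet MTBF says so little about what happens in the field.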


Cheers,
Andrew Ku
TomsHardware.com

 
Well I've used SCSI HDs in many PCs for twenty years and had only one fail. I've tried IDE/SATA drives and had many failures. So I guess it depends on your definition of "fail" and the impact of lost data.

As noted in the very beginning of the article, OCZ, Corsair, Intel and others have ALL had data loss and other operational issues on their consumer grade MLC SSDs. That's simply unacceptable to me - period. I won't pay $3000 for a quality SLC SSD nor will I buy an unreliable MLC SSD. If and when the purveyors of half-baked SSDs come up with reliable consumer grade SSDs, then I'm willing to invest in them. Until then these jokes for consumer grade SSDs can be bought by people who don't care about lost data or suckers who don't know what they are in for. No responsible person would buy the current consumer grade SSDs if data security is important.
 
Just bustin' yer chops ...

I have had two catastrophic RAID-0 fails (go figger) ... really did mess my whole life up, including taxes, etc.

Now ... I have all SSDs and rotate 1TB F3's, in a drive carrier (drawer) ...
... I just drag and drop entire drive-trees into various mechanical partitions.
I used to sleep on a couch, while a tape backup system backed up all our company drives, across a Novell, Ethernet. ... No more of THAT !

=Alvin=


 
I'm still concerned that SSDs might actually start getting more and more UN reliable, that the controllers might get better, but the actual flash itself becomes more error prone. So failure rates, however they're tabulated are going to at best level off, not get better with every product iteration. Especially if the last part about write voltage leading to failure is more common as lithography shrinks (assuming the same voltage is required at smaller sizes?).


It seems like making the drive more reliable only makes financial sense to a certain extent. Clearly OCZ is still around and has had a perception of lower reliability drives for two Sandforce generations. It's also been the speed standard as well. So what happens when SSDs become a cutthroat, thin margin business? Do drives get better to protect minuscule margins, or do the drives fall off in quality to save a few bucks? Probably both

[citation][nom]acku[/nom]There's still a lot of low level data that we didn't go into because it would require about five more pages (exaggerating but reliability is complicated).


Robin Harris at Storage bits made a great point, "All SSDs do is replace a hard drive’s head disk assembly - the platters and heads - with a lot of flash chips. The rest of the stuff is the same - and that stuff accounts for about half of all drive failures. So the best we can expect is that SSDs could be twice as reliable. But flash isn’t that reliable either, especially as feature sizes shrink. Few know that it takes ≈20 volts to write NAND flash, which is a lot when insulators are molecules thick. Entire plane failures on flash die are common. [/citation]


Please, Andrew, give us the five pages of "low level" information.

 
[citation][nom]compton[/nom]I'm still concerned that SSDs might actually start getting more and more UN reliable, that the controllers might get better, but the actual flash itself becomes more error prone. So failure rates, however they're tabulated are going to at best level off, not get better with every product iteration. Especially if the last part about write voltage leading to failure is more common as lithography shrinks (assuming the same voltage is required at smaller sizes?). It seems like making the drive more reliable only makes financial sense to a certain extent. Clearly OCZ is still around and has had a perception of lower reliability drives for two Sandforce generations. It's also been the speed standard as well. So what happens when SSDs become a cutthroat, thin margin business? Do drives get better to protect minuscule margins, or do the drives fall off in quality to save a few bucks? Probably both. Please, Andrew, give us the five pages of "low level" information.[/citation]

We have some stuff related to this we're going to look into. Stay tuned. :)
 
nice study, but if we want to compare two tecnologies (not products) I think we need to go deeper. Not only the amounts of failures (and I agree 2 years is not time enough to see write endurance problems, but it would be the same for mechanical HDDs: what percentage of the failures were head landings) it would be important to get to what caused that failures. And then analyze what problems are intrinsec to technology, the maker, the user or the environment and then see what are the Achilles Heels for every technology. SSDs are still new, HDDs had maturity problems and lessons-to-be-learned time ago (in a period with less stress). If you want to give an orientation of what reliability expect when you receive your next SSD box, its ok (Simply put: you can not forget backups still). Perhaps getting no less than mechanical HDD it is not bad either...
 
[citation][nom]serendipiti[/nom]nice study, but if we want to compare two tecnologies (not products) I think we need to go deeper. Not only the amounts of failures (and I agree 2 years is not time enough to see write endurance problems, but it would be the same for mechanical HDDs: what percentage of the failures were head landings) it would be important to get to what caused that failures. And then analyze what problems are intrinsec to technology, the maker, the user or the environment and then see what are the Achilles Heels for every technology. SSDs are still new, HDDs had maturity problems and lessons-to-be-learned time ago (in a period with less stress). If you want to give an orientation of what reliability expect when you receive your next SSD box, its ok (Simply put: you can not forget backups still). Perhaps getting no less than mechanical HDD it is not bad either...[/citation]

If you read the first page, then you know that write endurance isn't really an issue. That's something confirmed by our friends at XS.

And as for the type of failures, I believe that was detailed in an earlier forum post.

Cheers,
Andrew Ku
TomsHardware.com
 
Not to mention the fact that "planned obsolescence" can be good for the bottom line, so long as the market does not get gun-shy.

Unfortunately, with American made cars, it finally DID ... Until Chrysler came out with their 7/70 warranty, which restored confidence.

Solid warranties and "no questions (no fault)" swap-out policy (NEW drive), just like Craftsman(TM) Tools ... That would certainly help to define one manufacturer, above another.

Just an I-Deer ... I am sure this is already the case, in the IT market.
... (or soon will be standard OP).

=Alvin=
 
[citation][nom]Alvin Smith[/nom]Not to mention the fact that "planned obsolescence" can be good for the bottom line, so long as the market does not get gun-shy. Unfortunately, with American made cars, it finally DID ... Until Chrysler came out with their 7/70 warranty, which restored confidence. Solid warranties and "no questions (no fault)" swap-out policy (NEW drive), just like Craftsman(TM) Tools ... That would certainly help to define one manufacturer, above another. Just an I-Deer ... I am sure this is already the case, in the IT market. ... (or soon will be standard OP). =Alvin=[/citation]

Alvin makes an excellent point. Random failures actually matter more in enterprise. I mean, how many of us have a personal computer (I'm referring to your main computer) with a hard drive older than three or four years? We tend to upgrade because the capacity in 2007/2008 was half of what it is now.

SSDs are unique because they have a high initial investment cost that you're going to have a tendency to stretch out. Even if write endurance isn't a problem, SSD failures can be a real issue. When you pay $100 for a 2 TB hard drive, buying a $100 3 TB hard drive four years down the road isn't that bad. When you pay $300 for a 128 GB SSD, I'd wager that you'd want to use that as long as possible.

FYI, in the IT world, the practice is just to stock up on spares, because any down time is bad.

Cheers,
Andrew Ku
TomsHardware.com
 
[citation][nom]acku[/nom]If you read the first page, then you know that write endurance isn't really an issue. That's something confirmed by our friends at XS. And as for the type of failures, I believe that was detailed in an earlier forum post. Cheers, Andrew Ku TomsHardware.com[/citation]

Sorry, but on the first page I see theoretical numbers simplified from a complex equation. The numbers of cycles come from intensive testing. Two years is only a small fraction of how long a drive should last, so most of the cells have probably used only a small percentage of their write cycles. What will happen when most of the disk has used half of its write cycles? In theory that should happen in 10 years, OK. But to compare against HDDs: what is the expected life of the bearings in an HDD, and should I expect the lifetime of the HDD to be based on that?
I am very cautious about moving a notebook with an HDD while it is turned on (yes, I have experienced HDD heads landing, even without moving the computer). With SSDs this is not an issue. Then what issues and precautions should I take with SSDs? Choose the proper brand to get quality electronics? Does it all come down to brands and product quality, or is there any environmental measure to ensure endurance? Do you know any sand clock manufacturer and its return rates: is that a problem of the technology itself or of the quality of the manufacturing, handling, etc.?
 
The equation isn't just theoretical. It's a published standard used by all SSD manufacturers, like ISO protocols. Look up JESD218. I could just as well say that the PE cycle rating is an arbitrary value. It is rated for 3000 PE cycles, but it could really be 2500 or 6000.

The number of cycles comes from a small batch of testing during a precalibration of the NAND cells. There is a standard deviation with every batch, and we have no way of measuring that in the real world. Another problem with flash.
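To show how much that spread in PE cycles matters, here is a minimal sketch using a commonly quoted simplified endurance estimate: capacity times PE cycles, divided by write amplification, divided by daily host writes. This is not the text of JESD218. The 10 GB/day workload and the write amplification of 3 are assumptions picked purely for illustration; the 128 GB capacity and the 2500/3000/6000 PE cycle values come from this thread.

```python
def years_of_endurance(
    capacity_gb: float,
    pe_cycles: int,
    host_writes_gb_per_day: float,
    write_amplification: float,
) -> float:
    """Rough years until the rated PE cycles are used up.

    Assumes perfect wear leveling and a constant daily workload.
    """
    total_nand_writes_gb = capacity_gb * pe_cycles
    total_host_writes_gb = total_nand_writes_gb / write_amplification
    return total_host_writes_gb / host_writes_gb_per_day / 365


if __name__ == "__main__":
    capacity_gb = 128            # drive size mentioned earlier in the thread
    host_writes_gb_per_day = 10  # assumed desktop workload
    write_amplification = 3      # assumed; varies by controller and workload

    for pe_cycles in (2500, 3000, 6000):
        years = years_of_endurance(
            capacity_gb, pe_cycles, host_writes_gb_per_day, write_amplification
        )
        print(f"{pe_cycles} PE cycles -> roughly {years:.0f} years")
```

Even at the low end of the range the estimate comes out in decades under these assumptions, which fits the point that write endurance is unlikely to be the limiting factor for typical consumer workloads.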

On hard drives, you're forgetting fly height. The new hard drives don't have that start/stop park issue. And, it's not ball bearings anymore. It's a fluid bearing, hence wear isn't what it used to be. There's no metal to metal contact. Surprising, but this is backed up by the research done at Google and if you are a true storage nut, Storage Bits has commented on this multiple times.

Third, we're separating durability from reliability. We're talking purely about the tech. I've already commented on how solid state works well for extreme conditions. We merely attempted to scientifically measure reliability inherent in the stability of the media. The durability of the media is a separate subject.

And if you don't believe my point on write endurance, I suggest you talk to Anvil and Ao1 at XS. They've actually shown it not to be an issue in a real-world environment.
 
I've been interested in the 120-128GB range, and so I've done some research on newegg.com. I looked at the user feedback for drives with a significant amount of feedback and the results were very bad. I tabulated the percentage of reviews with a 1 or 2 egg rating, figuring that those were the people whose drives failed or in some way were really bad. In fact very many of the reviews point out that the drives worked great at first but failed in the first week or so.

Intel 5% out of 159
http://www.newegg.com/Product/Product.aspx?Item=N82E16820167035

Intel 5% out of 61
http://www.newegg.com/Product/Product.aspx?Item=N82E16820167050

Crucial 7% out of 359
http://www.newegg.com/Product/Product.aspx?Item=N82E16820148348

Intel 10% out of 107
http://www.newegg.com/Product/Product.aspx?Item=N82E16820167042

Mushkin 15% out of 75
http://www.newegg.com/Product/Product.aspx?Item=N82E16820226152

OCZ 19% out of 83
http://www.newegg.com/Product/Product.aspx?Item=N82E16820227714

Crucial 20% out of 50
http://www.newegg.com/Product/Product.aspx?Item=N82E16820148442

OCZ 22% out of 160
http://www.newegg.com/Product/Product.aspx?Item=N82E16820227578

G.Skill 23% out of 236
http://www.newegg.com/Product/Product.aspx?Item=N82E16820231378

AData 26% out of 47
http://www.newegg.com/Product/Product.aspx?Item=N82E16820211471

OCZ 29% out of 155
http://www.newegg.com/Product/Product.aspx?Item=N82E16820227706

Corsair 31% out of 116
http://www.newegg.com/Product/Product.aspx?Item=N82E16820233125

OCZ 31% out of 215
http://www.newegg.com/Product/Product.aspx?Item=N82E16820227395

OCZ 35% out of 171
http://www.newegg.com/Product/Product.aspx?Item=N82E16820227590

OCZ 37% out of 405
http://www.newegg.com/Product/Product.aspx?Item=N82E16820227551

OCZ 38% out of 153
http://www.newegg.com/Product/Product.aspx?Item=N82E16820227543

Corsair 59% out of 78
http://www.newegg.com/Product/Product.aspx?Item=N82E16820233181

As a comparison- I run the following models of WD drives in several of my computers. It sure seems that the WD rotating drives beat the pants off of almost all SSD's in terms of customer satisfaction.

WD Caviar Black 640 10% out of 2188
http://www.newegg.com/Product/Product.aspx?Item=N82E16822136319

WD Caviar Black 750 12% out of 1120
http://www.newegg.com/Product/Product.aspx?Item=N82E16822136283

WD Velociraptor 300 11% out of 693
http://www.newegg.com/Product/Product.aspx?Item=N82E16822136322
 
Huge difference in sample size. Plus enthusiasts are always the first adopters and also the first to complain if something goes wrong. Something like 650 million hard drives are sold every year. Compare that to 11 million SSDs.
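To illustrate the sample size point, here is a minimal sketch that puts a 95% Wilson score interval around a few of the percentages listed above. The counts of low ratings are reconstructed by rounding percent times review count, so they are approximations, and the choice of the Wilson interval is mine rather than anything from the article. It only captures sampling noise, not the self-selection of who bothers to write a review.

```python
import math


def wilson_interval(bad: int, total: int, z: float = 1.96) -> tuple:
    """Wilson score interval for a proportion (z = 1.96 is about 95%)."""
    p = bad / total
    denom = 1 + z * z / total
    center = (p + z * z / (2 * total)) / denom
    half = z * math.sqrt(p * (1 - p) / total + z * z / (4 * total * total)) / denom
    return center - half, center + half


if __name__ == "__main__":
    # (label, percent of 1-2 egg reviews, number of reviews), from the list above
    rows = [
        ("AData", 26, 47),
        ("Corsair", 59, 78),
        ("OCZ", 37, 405),
        ("WD Caviar Black 640", 10, 2188),
    ]
    for label, percent, total in rows:
        bad = round(percent / 100 * total)  # reconstructed count, approximate
        low, high = wilson_interval(bad, total)
        print(f"{label:<20} {percent:>3}% of {total:>4}: {low:.1%} to {high:.1%}")
```

The interval for a drive with around 50 reviews spans more than twenty percentage points, while the one for a drive with over 2000 reviews is only a couple of points wide, so the small-sample entries in that list are much noisier than they look.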
 
SSD is great in theory but I don't know how well it'll be adopted over the next few years in real laptops/PCs.
 
[citation][nom]beenthere[/nom]Someone is trying to convince people that SSD issues are insignificant, but the facts and people's experience say otherwise.[/citation]

Someone who?
 
[citation][nom]Device Unknown[/nom]Please tell me English is your 3rd language. I couldn't understand anything you said lol[/citation]


I almost fell out of my chair trying to understand it myself Lmao
 