Intel Clarifies 600p SSD Endurance Limitations, But TBW Ratings Can Be Misleading

Status
Not open for further replies.
Why not include a small additional block of flash, such as 14nm flash, that is permanently read-only (likely adding no noticeable increase to the cost of the drive)? Then, when the drive enters read-only mode, it activates that 128KB of flash, which simply displays an error message and instructions on the boot screen of the PC, explaining that the drive is out of spare sectors and has entered read-only mode. It could then provide instructions for recovering the data on the drive under Windows, Mac OS, and Linux.

PS, I purchased a Samsung 850 pro a while back, and in about 2 years, I have put over 100TB of writes onto the 256GB drive, and I am not doing anything too write intensive. (occasional photo and video editing)

I believe that Intel has gone with low quality flash, and has decided to give it a very low endurance rating.

My ~40nm flash SanDisk SSD (an old 120GB SATA 2 drive) has had over 200TB written to it and is still going strong. There are no new dead cells, though periodic tests with SpinRite on level 2 show that the error rate has gone up, so the SoC is doing more ECC work. It still works reliably.

Intel is either using a very small process node, or they are using the lowest level of binned chips. For example, some companies bin their highest quality chips for enterprise use, the mid-range parts for consumer use, and the lowest bin for things like flash drives and SD cards.

They may have tapped into that bottom level for their consumer market SSDs and needed to give it a low endurance.

If we look at the warranty for a wide range of hard drives (likely SSDs too), we see from failure rate metrics published by companies like Backblaze and Google that the drives with the shortest warranty have the highest failure rate.

http://i.imgur.com/ZTQ86Uy.jpg

Most companies set their warranty limit at the level where the projected failure rate is expected to go beyond the 5-7% mark, as RMAs cost money.
 


For Samsung, avoid the TLC flash based drives and you will have good reliability.

Their TLC flash based drives are a worse value than their Pro drives when you consider the price difference and what you give up.

Their TLC flash significantly cuts down on the BOM cost of the drive, as far less silicon is needed to make the drive. But only a very tiny fraction of those cost savings are passed on to the consumer. Beyond that, if you spread the cost of the drive across not only the amount of space but also the endurance, you end up with a drive that costs more per TBW.

In the long run, these drives end up being more expensive. TLC flash gets even worse when you consider the increased write amplification.
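The cost-per-TBW argument can be made concrete with a rough Python sketch. The two drives and all of their prices and TBW figures below are hypothetical placeholders chosen only to show the arithmetic; they are not real quotes or specs for any product.

```python
def cost_per_gb(price_usd: float, capacity_gb: float) -> float:
    """Price spread over capacity only."""
    if capacity_gb <= 0:
        raise ValueError("capacity_gb must be positive")
    return price_usd / capacity_gb


def cost_per_tbw(price_usd: float, tbw: float) -> float:
    """Price spread over rated endurance (terabytes written)."""
    if tbw <= 0:
        raise ValueError("tbw must be positive")
    return price_usd / tbw


# Hypothetical drives: same capacity, different price and rated endurance.
drive_a = {"price_usd": 300.0, "capacity_gb": 1000, "tbw": 400}  # cheaper
drive_b = {"price_usd": 450.0, "capacity_gb": 1000, "tbw": 800}  # pricier

for name, d in (("drive_a", drive_a), ("drive_b", drive_b)):
    print(
        name,
        f"${cost_per_gb(d['price_usd'], d['capacity_gb']):.2f}/GB",
        f"${cost_per_tbw(d['price_usd'], d['tbw']):.4f}/TBW",
    )
```

With these made-up inputs the cheaper drive wins per GB but loses per TBW, which is the shape of the argument above. Whether that holds for real drives depends entirely on actual prices and ratings.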

 
Supporting Gary "What is Aleppo" Johnson?

I question your IQ, Palmer Freeman Luckey.

Gary Johnson is not even a Libertarian; he is an SJW shill.
 

You admit that everything you stated had nothing to do with reliability. Which is what you were responding to in my point. And judging from the downvotes and your upvotes, I think there's a lot of Samsung SSD owners who disliked me saying that Intel>Crucial/Micron>Samsung is the all-time pecking order of SSD reliability. :lol:

I just dislike that discussion being posted in response to me, as if it were something I was arguing with you about. It has nothing to do with the long-term reliability rates of Intel vs Crucial/Micron vs Samsung.
We're in total agreement on the 600P TBW lock limit.


That's a very good point, and something that I've been mulling in the back of my mind for the last few days too. You've pushed me over the edge though and convinced me to ditch the 960 Evo 1TB I was going for and to spring for the 960 Pro 1TB. At $150USD more, it makes a lot of sense.
 

Actually I was responding to your comment about "Intel SSD haters". I agree that Intel have historically been fairly reliable, but I was making the point that the frustration towards Intel in the thread was about the endurance lock, not reliability.
 
It is unfair to compare the TBW rating of the Intel 600p 128GB model against that of the Samsung 960 EVO 1000GB model, since TBW is obviously proportional to capacity. By spec, the Intel 600p TBW is actually slightly better than the Samsung 960 EVO's. Here are the numbers:
Intel 600p           Samsung 960 EVO
128GB -> 72 TBW      N/A
256GB -> 144 TBW     250GB -> 100 TBW
512GB -> 288 TBW     500GB -> 200 TBW
1TB -> 576 TBW       1000GB -> 400 TBW

From the table, the 1TB Intel 600p can be fully written about 576 times, and the 1TB Samsung 960 EVO only about 400 times.
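The "written N times" figure is just rated TBW divided by capacity. Here is a minimal Python sketch of that arithmetic using the numbers from the table. It assumes decimal units (1 TB = 1000 GB) and treats the Intel "1TB" model as 1000GB, as the comparison above does; the function name is made up for illustration.

```python
def full_drive_writes(capacity_gb: float, tbw: float) -> float:
    """How many times the whole drive can be overwritten within its rated TBW.

    Assumes decimal units: 1 TB written = 1000 GB written.
    """
    if capacity_gb <= 0:
        raise ValueError("capacity_gb must be positive")
    return tbw * 1000 / capacity_gb


# (capacity in GB, rated TBW), copied from the table above.
INTEL_600P = [(128, 72), (256, 144), (512, 288), (1000, 576)]
SAMSUNG_960_EVO = [(250, 100), (500, 200), (1000, 400)]

for label, drives in (("Intel 600p", INTEL_600P), ("Samsung 960 EVO", SAMSUNG_960_EVO)):
    for capacity_gb, tbw in drives:
        writes = full_drive_writes(capacity_gb, tbw)
        print(f"{label} {capacity_gb}GB: {writes:.1f} full drive writes")
```

Normalising by capacity like this is what makes models of different sizes comparable. It says nothing about how the drives behave once the rated figure is passed.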

This article gives users totally wrong information for deciding between an Intel and a Samsung SSD based on TBW. It should be corrected.
 


They're the most reliable, statistically and anecdotally. I have their X25-M drives from 8 years ago going strong in hundreds of deployments. Agreed on the frustration, some didn't articulate why they "hated Intel SSDs" very well though and I just wanted to set the record straight on Intel's reliability in case people didn't know.



That's a great point, though I swear Intel had the wrong numbers posted themselves on ark. I checked it a week ago and I'm pretty sure they had 72TBW listed and now it's been updated to 576.
http://ark.intel.com/products/94926/Intel-SSD-600p-Series-1_0TB-M_2-80mm-PCIe-3_0-x4-3D1-TLC

The problem Intel has with these drives IMO is that it appears SS is going to beat them to market with their 960 1TB drives. I'm probably one of many that have long awaited 1TB M.2 NVME drives. I have no problem with the 600P 1TB if the price is right. It should be ~$360<$480<$630. Depending on priorities the 600P 1TB is the lowest end solution I'd go for today. It's ok, and it is Intel which helps its case. I'll probably get the 960 Pro 1TB myself. It's overkill for what I'm doing but Intel would have a strong case if they can get those available at $360 shipped.
 
First, the article reported the numbers provided by Intel at the time it was written. I saw them on ark.intel.com, with my own eyes. It was 72 TBW for all capacities.

Secondly, you cannot take the endurance curve from one SSD and extrapolate to a different model from a different vendor. They overprovision by different amounts, and that (combined with the wear profile) determines the endurance of the underlying flash. Then, to get a TBW estimate, you have to look at other factors that I won't bother to list, here.

Finally, there's the added confusion over what, exactly, the Intel drives do when the TBW limit is exceeded.

In all my years of reading this site, I've never seen any of the authors treat Intel unfairly. It's fair to question the article, but I think your accusations are out of line with the evidence you provided.

I see this is your first post. If you decide to stick around, I hope you'll try to take a more measured approach and present your concerns in a non-accusatory fashion.
 
** Intel's process for copying the data from a read-only SSD involves simply installing the drive as a secondary volume (non-OS) in a computer. **

That's fine, but many current model computers have only one M.2 slot.
 

Do this the same way you'd upgrade to a larger SSD:
1. Boot off a USB recovery drive if needed.
2. Use backup software to move the drive image to the same place you keep your backups (an external USB drive, network share, etc.). If you don't do backups, then the moment this SSD locks up is a painless time to learn that you want them, since no data is lost.
3. Restore the image to the new SSD.
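For anyone curious what step 2 amounts to underneath, here is a minimal Python sketch of a chunked, read-only copy with a checksum check. It is an illustration only, not a replacement for real backup software: the function names are made up, it copies raw bytes without understanding partitions or filesystems, and reading from a raw device path would need administrator rights and the correct device name for your system.

```python
import hashlib

CHUNK_SIZE = 4 * 1024 * 1024  # read 4 MiB at a time


def copy_image(src: str, dst: str, chunk_size: int = CHUNK_SIZE) -> str:
    """Copy src to dst in chunks and return the SHA-256 of the bytes read.

    The source is opened read-only, so this also works when the source
    refuses writes.
    """
    digest = hashlib.sha256()
    with open(src, "rb") as fin, open(dst, "wb") as fout:
        while True:
            block = fin.read(chunk_size)
            if not block:
                break
            fout.write(block)
            digest.update(block)
    return digest.hexdigest()


def sha256_of(path: str, chunk_size: int = CHUNK_SIZE) -> str:
    """Return the SHA-256 of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for block in iter(lambda: f.read(chunk_size), b""):
            digest.update(block)
    return digest.hexdigest()


def verified_copy(src: str, dst: str) -> bool:
    """Copy src to dst, then re-read dst and confirm the checksums match."""
    return copy_image(src, dst) == sha256_of(dst)
```

Calling `verified_copy(source_path, image_path)` returns `True` when the image on the backup location matches what was read from the source. Step 3 is the same copy in the other direction, onto the new SSD.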


 


My apologies to the author: many have confirmed that the problem was Intel's own, since they stated the wrong TBW in their earlier specs. I have updated my original comment accordingly.

I came to this article while looking for an affordable SSD for my next machine. Tom's Hardware is trustworthy, which is why I looked for a reference here. I couldn't believe my eyes when I saw such a big difference in TBW, and many of the comments are based on the TBW. I immediately checked Intel's website and saw a totally different story. I didn't think it was possible for Intel to make this kind of mistake, especially when queried by Tom's Hardware, because Intel would have been out of business long ago if this were the way they marketed their products. Intel's marketing team does need more technical training. On the other hand, this data needs to be corrected to reflect the truth.
 
I don't know how many people are still active on this thread, but does anyone know how Intel determines when "spare area drops below threshold", and what they mean by on continued use, "it will reach a point that it will be forced into read only mode".

All SSDs contain some form of Error Correction. So I'm guessing if a drive has to utilise error correction for a read command, it will mark the culprit cells as "failed", reassign the data elsewhere and reduce the "Available Spare" metric accordingly. Obviously when enough cells actually "fail", you run out of spare area and have no safe location to write new data to. If that's the point that Intel are locking the drive then I'm absolutely fine with that. The only other option in that case is to write data to known bad NAND cells, which is a clear risk and justifies setting the drive to read-only.

It's just that I'm piecing together the above. I might be wrong. The quote from Intel doesn't mention failed cells nor how the "available spare" metric is actually determined. Is Intel "failing" a cell once it reaches its pre-determined write limit? Because that's once again artificially limiting the life of the drive and I'm much less happy with that approach. The statement also vaguely talks about a "threshold" for SMART warnings, and then a further "point" when read only mode is applied. How are those "thresholds" and "points" actually determined?
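To pin down the guess about spare-area accounting, here is a toy Python model of it. Everything in it is an assumption for illustration: the class name, the 10% warning threshold, and the rule that read-only mode starts when no spare blocks remain are invented here and do not come from Intel or from any SSD firmware.

```python
from dataclasses import dataclass


@dataclass
class SpareTracker:
    """Toy model: failed blocks are remapped to spares until none are left."""

    total_spare_blocks: int
    warn_threshold_pct: int = 10  # invented value for the SMART-style warning
    used_spare_blocks: int = 0

    def __post_init__(self) -> None:
        if self.total_spare_blocks <= 0:
            raise ValueError("total_spare_blocks must be positive")

    @property
    def available_spare_pct(self) -> int:
        remaining = self.total_spare_blocks - self.used_spare_blocks
        return 100 * remaining // self.total_spare_blocks

    @property
    def state(self) -> str:
        if self.used_spare_blocks >= self.total_spare_blocks:
            return "read-only"  # nowhere safe left to write new data
        if self.available_spare_pct < self.warn_threshold_pct:
            return "warning"  # spare area has dropped below the threshold
        return "healthy"

    def retire_block(self) -> str:
        """Record one failed block being replaced by a spare."""
        if self.used_spare_blocks < self.total_spare_blocks:
            self.used_spare_blocks += 1
        return self.state


tracker = SpareTracker(total_spare_blocks=100)
for _ in range(100):
    state = tracker.retire_block()
    if state != "healthy":
        print(tracker.used_spare_blocks, tracker.available_spare_pct, state)
```

In this model the drive only locks because blocks actually failed. The alternative worry, retiring blocks at a pre-set write count, would look the same from the outside, which is why the open questions about the "threshold" and the "point" matter.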

The updated article, specs and statement from Intel drastically change my perspective, particularly with the corrected TBW ratings for the larger drives. We're back in the region where hitting the limit is very unlikely, and I'd like to soften some of my criticisms from earlier in the thread. But I'm still left scratching my head a little. Given the PR nightmare this must (or at least should) have been for Intel, a much clearer explanation of this locking "feature" is important IMHO.
 
I once read such a description of SSDs (or a particular SSD). However, they qualified it by stating that the number of bit errors per block had to exceed some threshold, implying that a certain number of bad bits per block are still tolerable.

Who knows what Intel actually does? They probably consider that proprietary information and would never divulge what they actually do.

I just want clarification on whether TBW is a hard limit or simply advisory & warranty-voiding.
 

I'm now much more satisfied about this. There is no 72TB limit, even an MWI (Media Wearout Indicator) that has reached 0 won't trigger read-only mode, and Intel has updated the TBW ratings to what they should be.
We still don't have accurate information about how many bad blocks will trigger it, or anything like that, but for which product could we ever know that? I suppose that is an industry secret. So I have nothing to worry about, at least no more than with other SSDs.
I assume that Intel had little interest in the TBW number itself on the spec sheet (that's why Intel hadn't disclosed TBW for years and defined a blanket rating for all capacities). But at last they've learned that it means a lot to consumers.
 

While I respect that some of this information needs to remain private to protect their IP, I think Intel has a responsibility now to go over and above to clear up some of the mess this release has caused. Basically this post-release article and the original TH review trashed this drive and Intel's integrity as an SSD company. It turns out that at least most of the concerns were baseless, founded on incorrect specs released by Intel and the failure of their PR team to respond to the questions raised by TH in time to address the (very serious!) concerns before publication.

The problem is that the reputation damage is done now. I've seen mainstream outlets like Linus Tech Tips pick up and share the TH article and its concerns. I'd be surprised if they bother to go back and correct the information like TH has. That's why I'm suggesting Intel needs to go above and beyond to set the record straight. In this case, I don't think vague "thresholds" and "points" are good enough.
 

Well, I agree about that. The review damaged Intel a lot.
At this point, it's Tom's responsibility to do something (which doesn't mean Intel shouldn't do anything). The first problem was Tom's, or I should say Chris', misunderstanding of TBW. There was nothing relating TBW to the read-only feature. That was the mistake. At the least, Tom's should post another article to apologize and correct the information.
Intel's mistake was that they didn't know Tom's writer had misread the TBW spec. Years ago, Micron's C300 and C400 had a blanket 72TB TBW rating too, but no one questioned it then. Nowadays, blanket ratings are not mainstream. But Tom's has long experience in PC reviews. It's not unreasonable for Intel to believe that Tom's knows what a blanket TBW means. A blanket TBW is not incorrect information. It's just old-fashioned and worthless. It's not even in an official product brief, although it was on ARK. So what happened was that Tom's found the worthless information and wrote an incorrect review based on it.
To be more critical of Intel, they should have known that Tom's was misunderstanding it when they received the first question. In the end, they caused this mess.
It would be better for Intel to do something, but to me, it seems cruel to say it's their responsibility.
 
He reported the information Intel published, and yet you blame the reporter? He dutifully followed up with them, in order to confirm what he interpreted and clarify the end-of-life scenario. He even published a new article, to indicate that Intel updated its spec. And yet you blame the reporter?

Do you have any sort of bias, in the matter? Otherwise, I can't make sense of your position.
 
Sorry about this. Honestly, I was watching the front page every day looking for updates, but I missed it (again). Good for Tom's; I apologise and withdraw the line about another article for correction.
I think that Tom's is the one who made the 72TB TBW into a problem, and is responsible for it. Maybe you'd say they are doing well now, but I say it's not enough. If you see this as bias, then yes.

These are my points.
1. Intel did nothing that deserved criticism.
2. Tom's made up the problem from nothing.

About the blanket TBW, Intel had a better choice available. But it can't be called wrong. Not the best, but not wrong at all.
In the review, Tom's wrote "The TBW rating means the drive can only absorb up to 72TB of data during its lifetime." That's wrong. They haven't corrected this misunderstanding yet. Even in the newest article, it looks like they haven't changed their mind. Based on this, Tom's assessed the 600p as a low-endurance SSD. A critical mistake.
TBW should be read as "at least" instead of "up to". Then this problem would not have occurred. It was all due to Tom's misunderstanding. This made Intel update their SSD's spec, which was unneeded work. Is it the right attitude for Tom's to report on it as a third party?
Tom's has damaged Intel's reputation for reliability. That's bad for both Intel and readers. No need for an apology? I don't think so. If the additional post was their way of making amends, it should be kept on the front page for some period. It will soon disappear from the top.
You know, Intel is completely innocent in this. But now they have a lot to do. Tom's should take the responsibility. This is my opinion.
The informational correction is done; what about the next step?


 
Everyone else seems to agree that it's an abnormally low rating, for all sizes but the smallest. They were right to point it out, as it will affect some users. It's more than just guidance - it affects warranty coverage and there was the issue about the drive going into read-only mode that Intel seemed to suggest would happen when TBW was reached.

I think it's due to Intel's communication of the issue. They contacted Intel to confirm their understanding. What more should they do?

If the spec was wrong, why was it unnecessary for them to update it?

It seems to me like you're trying to say that the only thing wrong with tech products is the journalists that report on their shortcomings. If we all just pretend that their negative points aren't actually bad, then we can be happy (ignorance is bliss). But then tech journalists come around and spoil the party, telling us what not to like about them.

I disagree with this point of view. I think it's healthy for competition, if products are compared and judged on their merits, and it lets users pick the right products for their needs and then to know what to expect.

For some specs, it's easy just to test and see what the real measurement is. Unfortunately, longevity testing is time-consuming and expensive. Not every product can be put to the test, and doing so takes a long time. So, reporters have little option other than to report what the manufacturer states. That said, to answer concerns about write endurance and end-of-life behavior, Chris is currently testing a 256 GB drive.
 
I want to clear this up first: the TBW rating has nothing to do with endurance and warranty. That's clear from Intel's response.
A high TBW may indicate high endurance, but a low TBW doesn't indicate low endurance. TBW is too incomplete to be an indicator.
I half agree with this. SSD makers should clarify it. But TBW is an issue for all SSDs, not only Intel's. Why didn't Tom's know about it? Doesn't Tom's have a responsibility to know more about important key terms?

Because a blanket TBW is not wrong information, as I wrote. Intel didn't have to update it. They did it because a bad rumor had spread.

I don't follow, but I've never criticised Tom's for reporting a bad point as a bad point. I'm saying that they wrote negatively about a point which actually isn't bad.
As it seems that you still misunderstand the warranty, I think Tom's should write an article to clarify which information has been updated, what was wrong, and what is correct.
I do respect the actual write test of the 600p, although its first aim (to determine whether the 72TB lock exists or not) is already accomplished.
 
Agree with Ryhsiam. Intel need to explain clearly what happens to user data at end-of-life.

According to techreport.com Intel SSDs will BRICK THEMSELVES (not just read only) when reaching extreme write limits.

"Intel doesn't have confidence in the drive at that point, so the 335 Series is designed to shift into read-only mode and then to brick itself when the power is cycled."
see:
http://techreport.com/review/27909/the-ssd-endurance-experiment-theyre-all-dead

We've attempted to get this information directly from Intel, but none is forthcoming.
Does anyone know the answer?
 