WD Plots A Course To 40TB HDDs With MAMR

Status
Not open for further replies.

It's probably a bit too much to ask for 100% reliability from a consumer device with tiny, ultra-precise moving parts that need to interact at a microscopic level with platters spinning at thousands of revolutions per minute. Maybe drives with reliability much closer to 100% could be manufactured, but they would likely be significantly more expensive, to the point where you might be better off just buying two or more drives for redundancy to accomplish the same thing.

And the reason the price of 500GB or 1TB drives doesn't go down is that they already use the minimum of one platter, along with the other hardware needed to make a functional drive. For capacities that fit on a single platter, increases in storage density won't lower the manufacturing cost, because those drives are already near the minimum cost it takes to build a drive. Current platter density allows up to around 1.5TB per platter in a 3.5 inch drive, and 1TB per platter in a 2.5 inch drive, so anything below that should use one platter, and there isn't really much room to cut costs from there. Now, when we get to the point where 2TB can fit on a single 3.5 inch platter in a couple of years, you may see 2TB drives get closer to this minimum cost.

As a side note, this minimum cost to build, package and ship a drive results in drives with the lowest capacities having the highest cost per gigabyte. Using a popular online store's prices for 3.5 inch drives as an example, 1TB drives start at a minimum of around 4.3 cents per GB, while 2TB start around 3.2 cents, 3TB start around 3 cents, and 4TB start around 2.9 cents. Past that point, the minimum prices pretty much remain within the 2.9 to 3.2 cent range up until you get to the highest capacities of 10TB or more, at which point they rise back up closer to 4 cents, likely because all of those drives are intended for professional use. So, if someone suspects that they might need additional capacity at some point in the future, it's currently probably best to go with at least a 2TB drive, since you may only need to pay around 50% more for double the capacity. And considering the slow rate of advancement for hard drives in recent years, you can be pretty sure that the pricing won't drop significantly for some years to come.
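The "around 50% more for double the capacity" figure can be checked from the quoted cents-per-GB numbers. This is a rough sketch that assumes 1TB = 1000GB and uses only the minimum prices quoted in the post; the function and variable names are made up for illustration.

```python
# Minimum cents-per-GB figures quoted in the post (3.5 inch drives).
# Keys are capacities in TB, values are cents per GB.
CENTS_PER_GB = {1: 4.3, 2: 3.2, 3: 3.0, 4: 2.9}


def drive_price_usd(capacity_tb, cents_per_gb):
    """Approximate drive price in dollars, assuming 1 TB = 1000 GB."""
    return capacity_tb * 1000 * cents_per_gb / 100


prices = {tb: drive_price_usd(tb, cents) for tb, cents in CENTS_PER_GB.items()}

# Extra cost of the cheapest 2TB drive relative to the cheapest 1TB drive.
premium_2tb_over_1tb = prices[2] / prices[1] - 1

for tb, price in prices.items():
    print(f"{tb}TB: about ${price:.0f}")
print(f"2TB costs about {premium_2tb_over_1tb:.0%} more than 1TB")
```

With these numbers the cheapest 1TB drive works out to about $43 and the cheapest 2TB drive to about $64, which is where the roughly 50% premium comes from.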
 

It's not going to happen. For one thing, mechanical components tend to be less reliable than solid state. But your real problem is that you're vulnerable to a single point of failure.

There's no solution other than RAID or backups (ideally, both). Sorry if you find that expensive or inconvenient, but that's just how it is. Even with SSDs, if you can't afford to lose some set of data, then you need redundancy.

Even RAID isn't always the panacea that some think it is. You need to scrub it (AKA consistency check), or have a controller that does this in the background.

https://en.wikipedia.org/wiki/Data_scrubbing

I consider RAID-5 to be too risky with more than 4 disks. Rebuild times are getting long enough that the chance of encountering an error on another drive is becoming significant.
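To put a rough number on that risk, here is a back-of-the-envelope sketch. It assumes a RAID-5 rebuild has to read every surviving drive in full, and that unrecoverable read errors are independent with a rate of 1 per 10^14 bits read. That rate is a commonly quoted spec-sheet value, not something from this thread, and real drives often do better, so treat the result as a pessimistic illustration rather than a prediction. The function name and the 4TB drive size are made up for the example.

```python
import math


def rebuild_ure_probability(num_disks, disk_tb, ure_per_bit=1e-14):
    """Chance of at least one unrecoverable read error during a RAID-5 rebuild.

    Assumes every surviving disk is read in full and that errors are
    independent, with a per-bit error rate of ure_per_bit (an assumed value).
    """
    surviving_disks = num_disks - 1
    bits_read = surviving_disks * disk_tb * 1e12 * 8
    # 1 - (1 - p)^n, computed with log1p/expm1 to stay accurate for tiny p.
    return -math.expm1(bits_read * math.log1p(-ure_per_bit))


for disks in (3, 4, 6, 8):
    p = rebuild_ure_probability(disks, disk_tb=4)
    print(f"{disks} x 4TB RAID-5: {p:.0%} chance of a read error during rebuild")
```

Under these assumptions the probability climbs quickly with both disk count and disk size, which is the usual argument for RAID-6 or for regular scrubbing on larger arrays.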
 

Let's not forget RAID = Redundant Array of Inexpensive Disks.

I think the alternative was half-seriously derided as something like Single, Large Expensive Disk.

I'd rather run a RAID-6 of used or refurb disks (but not at the end of their usable life & with a spare on hand) than trust high-value data to a single drive of even the best pedigree. I made that mistake once before (almost 20 years ago), although it was only my personal data.
 
The biggest problem is trying to manage all the data you can put on these mammoth sized disks. If the average file size is around 1/2 MB, that means you could put something close to 80 million files on just one of these drives.

File systems require metadata for each and every file. A fixed sized metadata record for every file is stored in the file table on each system. File systems like Ext3 have a 256 byte file record (inode) and NTFS seems to be the worst at 4K per record (FRS). Multiply that by 80 million and you have to read 20GB or 320GB of metadata from each system respectively before you can do simple operations like search. If you want to cache all that metadata so you don't have to re-read it in for every operation, then you need an enormous amount of RAM.
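The arithmetic behind those totals, as a quick sketch. The record sizes are the ones claimed in the post, not verified defaults for either file system, and "4K" is taken as 4096 bytes here, which gives roughly 328GB rather than the rounded 320GB.

```python
FILES = 80_000_000  # figure from the post: ~40TB at ~0.5MB per file

# Per-file record sizes as stated in the post (not verified defaults).
RECORD_BYTES = {
    "Ext3 inode": 256,
    "NTFS file record": 4096,
}


def metadata_gb(num_files, record_bytes):
    """Total size of fixed-size per-file records, in decimal gigabytes."""
    return num_files * record_bytes / 1e9


for name, size in RECORD_BYTES.items():
    print(f"{name}: {metadata_gb(FILES, size):.1f} GB")
```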

I am working on an object-based data manager that has a much smaller metadata footprint. The fixed-size metadata record is only 64 bytes per object, so I can do fast queries on a container with 100 million objects after reading and caching just 6.4GB of metadata. Caching the entire table for such a large container from a cold boot takes about 30 seconds with a fast HDD. An SSD is much, much faster, but very expensive for holding something like 100 million objects.

Each object is similar to a file so I can find things (e.g. all the photos) in such a container in just a second or two once the table is cached.
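As a sanity check on those figures, the sketch below works out the table size and the sequential read rate that a 30 second cold load would imply. It uses only the numbers from the post; the variable names are made up for illustration.

```python
OBJECTS = 100_000_000      # objects in the container (from the post)
OBJECT_RECORD_BYTES = 64   # fixed-size record per object (from the post)
LOAD_SECONDS = 30          # quoted cold-boot caching time on a fast HDD

table_bytes = OBJECTS * OBJECT_RECORD_BYTES
table_gb = table_bytes / 1e9

# Sustained read rate needed to load the whole table in the quoted time.
required_mb_per_s = table_bytes / LOAD_SECONDS / 1e6

print(f"Table size: {table_gb:.1f} GB")
print(f"Implied read rate: {required_mb_per_s:.0f} MB/s")
```

That works out to a little over 200 MB/s, which only holds if the table is read as one mostly sequential stream.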
 
Raid is never a replacement for a backup. No matter which raid you go with, you also need a backup if you have data you value.

Raid just means that when all your data is lost, you usually have an even bigger problem than a single disk failure.

Let's just keep that very clear for our users please. Thank you.
 


Your comment didn't offend me. Reader feedback is always appreciated, especially from long-time members such as yourself. Finding the balance is hard at times, but feedback (even if it's harsh) keeps me on an even keel.

However, needling me about our loss to a clearly inferior team is crossing the line, especially while I am still in the middle of the mourning process 😛

 

Yes, of course we'd always like backups. But they're two ways of mitigating against the same problem.

Backups aren't perfect - the backup media is susceptible to problems and there can be issues with the actual backup procedure, and backups are nearly always at least a bit out of date.

Furthermore, which to focus on actually depends a lot on the specifics.

In my case, I have a bunch of lossless CD rips. I haven't traditionally backed them up, because I could always re-rip them and it was originally stretching my budget just to build one RAID to hold them all. I was content just to mitigate against failure with the extra reliability afforded by a RAID.

Now, if you're talking about source code, that's extremely high-value, low-volume data. Given the trivial cost of doing so, you'd be nuts not to back it up - even off site. In fact, my working copy of source code usually resides on a single storage device (SSD, these days), because I'm content with source control as a sort of high-frequency backup method.

So, while everyone likes to sing the praises of backups, you have to recognize that people don't have infinite amounts of time and money to protect their data, and not all data is equally valuable or irreplaceable. The best way is to understand the risks, costs, and benefits of the different options and craft a (preferably blended) solution that's well-suited to each situation.
 

I wish your colleagues from THW France would learn from harsh feedback, especially when it is a response to their incomplete and sometimes even idiotic remarks. Over there, the norm is to close the thread's comments rather than analyse why those harsh comments were made.
 

Sorry to hear that, but one bit of praise I have for the authors on this site is their professionalism in the comments. I've almost never seen them get into flame wars with readers.

Even most of the mods seem not to abuse their authority (though I've seen a few exceptions - none recently).
 
QLC is only one side of the NAND threat. Stacked NAND is still developing, but I think developers have slowed down so as not to kill off profits.

Considering that Seagate made a 60TB SSD in the 3.5" form factor, 40TB is not as impressive. It will be much cheaper, however, since NAND prices have been stubbornly slow to drop.

This will be great for data storage farms so long as reliability is there as well.

All I am waiting for is an affordable 4 or 8TB SSD to finally replace the HDDs in my system as a game/data drive.
 
SSD may NEVER replace HDD for bulk storage. SSD prices have not fallen much at all over the past few years. They are still more than 10x as expensive as HDD per TB. Things like this will make it even harder, but even if HDD stays at its current state, I wonder if SSD will ever be cheaper.

Now, if the cloud did not exist and nobody had more than about 1 TB of data, then they might have a chance, but everybody is getting more and more data all the time.
 

You're accusing them of price-fixing? Or just acknowledging that a point of diminishing returns exists as you try to stack the stuff ever higher?


That's funny. I recall seeing them comparing vs. flash based on price - not density.
 

Actually, it might. More cost-effective HDD storage could have a slight reduction on demand for flash, which is one of the factors keeping prices high.

On the flip side: if HDD density stopped increasing today, you'd actually see flash prices get further elevated for quite a while. The industry simply cannot make the stuff fast enough to keep up with existing increases in demand, much less if a bunch of datacenters started accelerating their migration to flash.
 