IBM Builds Monster 120-Petabyte Data "Drive"

Uh yeah, DoDPA is on the right track. I can't believe this hasn't been mentioned yet. It's obviously a storage shed for government surveillance of citizens. That means you and me.

Seriously, isn't this obvious? What is wrong with you sheeple?
 
[citation][nom]Zanny[/nom]We have been pushing the limits of mechanical disk reading lasers. Blue spectrum is the smallest imprint we are going to get, and the data error limits on drives past 3 terabytes are really small, in that it is very likely to have a bad sector somewhere on the disk by that point.[/citation]

Hard drives don't use lasers, never have. Optical drives and sharks use lasers. Current-generation hard drives are somewhere in the 600 Gb/in² range, with current PMR techniques likely to top out around 1 Tb/in². Seagate has predicted that the next generation of hard drives (HAMR) will be able to push capacity into the 50 Tb/in² range. That's over 80 times denser than current drives. So we are nowhere near the limits of mechanical hard drive capacities.
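For anyone who wants to check those figures, a quick back-of-the-envelope in Python (using the densities quoted above, which are predictions, not official specs):

[code]
# Rough check of the areal-density claims above (figures as quoted, not official specs).
current_pmr = 600e9    # ~600 Gb/in^2, current-generation PMR
pmr_ceiling = 1e12     # ~1 Tb/in^2, predicted PMR limit
hamr_claim  = 50e12    # ~50 Tb/in^2, Seagate's long-term HAMR prediction

print(f"HAMR vs. today: {hamr_claim / current_pmr:.0f}x denser")   # ~83x
print(f"HAMR vs. PMR ceiling: {hamr_claim / pmr_ceiling:.0f}x")    # ~50x
[/code]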
 
[citation][nom]mcd023[/nom]I'm not the RAID expert, but would a RAID 5 be better for redundancy? I know that the 2 drives in a RAID 10 that have the same data failing are improbable, but I was just wondering.On another thought: think of how many drives they'll be replacing like, what, every day? I've heard of large arrays needing several replacement drives/wk. Imagine this![/citation]

RAID 6 is much better, as you get much more redundancy. Though with that many drives they must be using some sort of proprietary system to handle redundancy and all the read/write cycles those drives will be going through. I would speculate that the system is more likely thousands of RAID 60 arrays, with their GPFS filesystem treating each array as an individual section of the giant array. They likely developed some sort of RAID scheme designed specifically for handling massive numbers of hard drives.

It would just be too inefficient to break every file apart across 200,000 drives and then reassemble it. Not to mention you would severely limit the number of files that could be accessed at one time before slowing the array to a crawl.
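To illustrate the point about not striping every file across all 200,000 drives, here's a toy sketch; the block size and array names are made up for illustration and have nothing to do with how GPFS actually places data:

[code]
# Toy illustration: striping one file's blocks across a small stripe group
# of arrays instead of across every drive in the system.
BLOCK_SIZE = 1 << 20  # 1 MiB blocks (assumed, purely for illustration)

def place_blocks(file_size, stripe_group):
    """Round-robin a file's blocks across the arrays in one stripe group."""
    n_blocks = -(-file_size // BLOCK_SIZE)  # ceiling division
    return {i: stripe_group[i % len(stripe_group)] for i in range(n_blocks)}

# A 10 MiB file lands on a handful of arrays, not on all 200,000 drives.
layout = place_blocks(10 * BLOCK_SIZE, ["array-07", "array-12", "array-31", "array-44"])
print(layout)
[/code]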
 


Google must be about to hatch their master plan....
First: consolidate all the information they ever collected onto one machine
Next: Begin analyzing trends using variables and timetables from every known source
Lastly: Use this information to predict the future and take over the world

Still based on ideas of relevant information
 
My guess would be 1TB 2.5in drives in 6- to 8-drive RAID 6 configurations, with some overhead to manage all the RAID 6 arrays. 3.5in drives just don't cut it for the density requirements of such a massive storage solution.
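A rough sanity check on that guess, assuming decimal units and 8-drive RAID 6 groups (both my assumptions, not anything from the article):

[code]
# Rough sanity check: how many 1 TB drives would 120 PB take in 8-drive RAID 6 groups?
target_usable = 120e15          # 120 PB, decimal units (assumed)
drive_size = 1e12               # 1 TB drives
group_size = 8                  # drives per RAID 6 group (assumed)
usable_per_group = (group_size - 2) * drive_size   # RAID 6 loses two drives to parity

groups = target_usable / usable_per_group
print(f"{groups:,.0f} RAID 6 groups -> {groups * group_size:,.0f} drives")  # ~160,000 drives
[/code]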

Also to oparadoxical_: You are misrepresenting units quite a bit. Lowercase b is for bits, not bytes. Lowercase t is meaningless. I suppose you were trying to illustrate binary vs decimal units, in which case it'd be TiB vs TB, GiB vs GB etc.
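In case the binary-vs-decimal difference isn't obvious, a trivial example:

[code]
# Decimal (TB) vs. binary (TiB) units: why a "2 TB" drive shows up smaller in the OS.
TB  = 10**12          # terabyte, decimal
TiB = 2**40           # tebibyte, binary

drive = 2 * TB
print(f"2 TB = {drive / TiB:.2f} TiB")        # ~1.82 TiB
print(f"1 TiB = {TiB / TB:.3f} TB")           # 1.100 TB
[/code]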
 
GPFS = new? We've been using GPFS on many of our systems for decades. Sizing the old VideoCharger servers is an example.
 
2001 - first 160GB hard drive
2011 - 3 (4) TB HDD

That's a 25x increase in storage. Extrapolating linearly, we should see 100TB hard drives in the next 10 years, but maybe 160TB.

I like being young enough to appreciate all this new tech and old enough to see it unfold from an early point of view.
 
[citation][nom]CompTIA_rep[/nom]2001 - first 160GB hard drive; 2011 - 3 (4) TB HDD. That's a 25x increase in storage. Extrapolating linearly, we should see 100TB hard drives in the next 10 years, but maybe 160TB. I like being young enough to appreciate all this new tech and old enough to see it unfold from an early point of view.[/citation]
It seems like every time someone wants to draw a linear extrapolation, it gets reset. Remember the Pentium 4 and people saying it would get up to 5 GHz? The next generation we got dual cores at 2 GHz instead. Same thing here... rather than 10 TB mechanical drives we see 120 GB SSDs.
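For what it's worth, here is what the quoted 160GB-to-4TB jump implies if the same rate simply continued (just spelling out the extrapolation, not a prediction):

[code]
# The compound growth implied by the 160 GB (2001) -> ~4 TB (2011) figures quoted above.
start_gb, end_gb, years = 160, 4000, 10
annual_growth = (end_gb / start_gb) ** (1 / years)          # ~1.38, i.e. ~38% per year

print(f"Implied growth: ~{(annual_growth - 1) * 100:.0f}% per year")
print(f"Same rate for another decade: ~{end_gb * (end_gb / start_gb) / 1000:.0f} TB drives")  # ~100 TB
[/code]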
 
Which platform would you prefer to make use of 10 TB of storage space?

How fast would it have to spin to be reasonable?

Is it possible such a thing could be quieter than an old 74GB raptor?

As of right now, I could probably fill 2 TB with data if I really tried... like backing up all my optical media from the last 10 years to it, etc.

I suppose needing 10 TB of space is probably not too far off, especially once cameras go beyond 1080p video resolution.
 
Not sure why the contract was given to IBM... they probably gave the storage away for free just to get the contract on the computing services.

It's not going to be a single RAID type; it will be a mixture of 10, 5, and 6 depending on the application/requirements. It will also have a mixture of drive types; the best balance of density/performance/cost on the market is the 15K 600GB disks.
This is most likely a mixture of SS, FAS, SATA, and FC disks... not just a single type.

The article doesn't mention whether it is NAS as well... if it is, then it's a NetApp front end doing the heavy lifting.
 
[citation][nom]mcd023[/nom]I'm not the RAID expert, but would a RAID 5 be better for redundancy? I know that the 2 drives in a RAID 10 that have the same data failing are improbable, but I was just wondering.On another thought: think of how many drives they'll be replacing like, what, every day? I've heard of large arrays needing several replacement drives/wk. Imagine this![/citation]

[citation][nom]chickenhoagie[/nom]I think you're right. RAID 5 is the best RAID solution for mirroring/backup as far as I know. Certainly better than RAID 10[/citation]

No, for redundancy, the best possible RAID is RAID 16/61 (mirrored RAID 6). For excellent redundancy and fast access, RAID 10/01 is the way to go. In terms of redundancy, the order from least to most redundant is as follows (price also climbs as you add redundancy):

RAID 0 (stripe without parity)

JBOD (just a bunch of disks)

RAID 1 (mirroring)

RAID 5 (stripe with parity; can lose one disk and keep running, a second failure kills the array)

RAID 6 (works like RAID 5 but with more redundancy; can lose two disks, a third failure kills the array)

RAID 1+0 or 0+1 (mirrored stripe without parity; have to lose both drives in a mirror, otherwise could literally lose half the array and still be able to recover)

RAID 5+0 or 0+5 (parity stripe array across a stripe without parity)

RAID 6+0 or 0+6 (more redundant than 5+0/0+5)

RAID 5+1 or 1+5 (mirrored parity stripe)

RAID 6+1 or 1+6 (more redundant than 5+1/1+5)

Going above this level of RAID gets exorbitantly expensive without much benefit.

Overall, RAID 1+0/0+1 is probably the best of both worlds, with excellent redundancy and decent cost; a rough capacity comparison for a small group is sketched below.
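For the sake of concreteness, here's a rough comparison for a hypothetical 8-disk group of 2TB drives (illustrative numbers only, not from any specific product):

[code]
# Usable capacity and guaranteed fault tolerance for an 8-disk group of 2 TB drives.
n, size_tb = 8, 2

usable = {
    "RAID 0":  n * size_tb,          # pure stripe, no redundancy
    "RAID 5":  (n - 1) * size_tb,    # one disk's worth of parity
    "RAID 6":  (n - 2) * size_tb,    # two disks' worth of parity
    "RAID 10": (n // 2) * size_tb,   # half the disks are mirrors
}
# RAID 10 survives at least 1 failure, and up to n/2 if each failure hits a different mirror.
survives = {"RAID 0": 0, "RAID 5": 1, "RAID 6": 2, "RAID 10": 1}

for level in usable:
    print(f"{level:8s} {usable[level]:>2} TB usable, survives at least {survives[level]} failure(s)")
[/code]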

One thing that needs to be mentioned about RAID 5: for small arrays with a small number of disks, RAID 5 is fine for redundancy, but once you have an array of 8+ disks at 2TB per disk or larger, you run into an issue where a rebuild is very likely to fail. Due to the sheer number of sectors, you are almost guaranteed to hit an unrecoverable read error somewhere while rebuilding an 8+ disk array of 2TB+ drives, which is why RAID 6 has become the new standard for large arrays with 8+ disks that are 2TB+ in size.
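That's the usual back-of-the-envelope argument; here it is spelled out, assuming a consumer-class unrecoverable read error rate of 1 per 10^14 bits read (my assumption, not a figure from the post):

[code]
# Probability of hitting at least one unrecoverable read error (URE) while
# rebuilding a degraded RAID 5 group, assuming a URE rate of 1 per 1e14 bits
# (a common consumer-drive spec; this is an assumption, not from the post).
ure_rate = 1e-14                                  # errors per bit read
disks, size_tb = 8, 2
bits_to_read = (disks - 1) * size_tb * 1e12 * 8   # the surviving disks must be read in full

p_fail = 1 - (1 - ure_rate) ** bits_to_read
print(f"Chance of at least one URE during a RAID 5 rebuild: {p_fail:.0%}")   # ~67%
[/code]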
 
[citation][nom]oparadoxical_[/nom]Actually, 120 petabytes=122,880Tb which equals 61,440 individual 2Tb HDDs.Then, the 122,880tb=125,829,120Gb and if you divide that by 200,000, you get about 630gb per HDD.[/citation]Yeah, you get 640GB drives if you do the math like that. But more likely it's 1TB drives with a third of a terabyte of redundancy, or at least 750GB drives.
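Spelling that division out, in the binary units the quoted post used (whether IBM counts in decimal or binary isn't stated in the article):

[code]
# 120 PB spread over 200,000 drives, in binary units as the quoted post used.
total_tib = 120 * 1024                           # 122,880 TiB
per_drive_gib = total_tib * 1024 / 200_000       # 125,829,120 GiB over 200,000 drives
print(f"{per_drive_gib:.0f} GiB per drive")      # ~629 GiB -> hence the ~630 GB / 640 GB figures
[/code]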
 