What Does One Petabyte Of Storage (And $500K) Look Like?

Guest
They apparently only implemented write caching via the ZFS Intent Log (ZIL) and not read caching via an L2ARC. Adding that would likely drive IOPS way up.
Also, they didn't mention deduplication or compression, which are available on ZFS. Assuming they had the RAM or L2ARC space to hold the dedup hashes, they could get multiples of that 1PB for typical file storage.
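
For anyone curious, here's a rough sketch of what that tuning might look like on a ZFS box. The pool name ("tank") and device paths are placeholders, and you'd size the SSDs for your own workload:

[code]
# Hypothetical sketch: add an L2ARC read cache and a mirrored ZIL (SLOG),
# then turn on dedup and compression. Pool name and device paths are placeholders.
import subprocess

POOL = "tank"

def run(*args):
    """Run a zpool/zfs command, raising if it fails."""
    subprocess.run(args, check=True)

# L2ARC: read cache on a fast SSD
run("zpool", "add", POOL, "cache", "/dev/disk/by-id/ssd-cache0")

# ZIL/SLOG: mirrored log devices to absorb synchronous writes
run("zpool", "add", POOL, "log", "mirror",
    "/dev/disk/by-id/ssd-log0", "/dev/disk/by-id/ssd-log1")

# Dedup and compression are ZFS dataset properties
run("zfs", "set", "dedup=on", POOL)
run("zfs", "set", "compression=on", POOL)
[/code]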
 

therabiddeer

[citation][nom]Casper42[/nom]Let's see... I work for HP and we sell a bunch of these for various uses: http://www.hp.com/go/mds600 It holds 70 x 2TB (and soon to be 3TB) drives in a 5U footprint. You can easily fit 6 of these in a single rack (840 TB), and possibly a bit more, but at that point you have to actually look at things like floor weight. I am working on a project right now that involves video surveillance, and the customer bought 4 fully loaded X9720s, which have 912TB usable (after RAID 6 and online spares). The full 3.6PB takes 8 racks (4 of them have 10U free, but the factory always builds them a certain way). The scary part is that once all their cameras are online, if they record at the higher bitrate offered by the cameras, this 3.6PB will only hold about 60 days' worth of video before it starts eating the old data. They have the ability, and a long-term plan, to double or triple the storage. Another use: instead of 2TB drives you can put in 70 x 600GB 15K RPM drives. That's the basis for the high-end VDI reference architecture published on our web page. Each 35-drive drawer is managed by a single blade server that converts the local disk into an iSCSI node. Then you cluster storage volumes across multiple iSCSI nodes (known as Network RAID, because you are striping or mirroring across completely different nodes for maximum redundancy). And all of these are only considered mid-level storage. The truly high end ignores density and goes for raw horsepower, like the new 3PAR V800. So yes, I agree with haplo602. Not very high end when compared to corporate customers.[/citation]
I work for HP as well, and the 3PAR systems are ridiculous.

Honestly, when it comes to enterprise storage, you can't beat the stuff that HP sells.
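
For scale, the rack-density numbers quoted above check out; a quick back-of-the-envelope run, using only the figures from the quote:

[code]
# Quick check of the quoted MDS600 density figures.
drives_per_enclosure = 70    # 70 drives in a 5U MDS600
tb_per_drive = 2             # 2TB drives
enclosures_per_rack = 6      # "easily hold 6 of these in a single rack"

tb_per_enclosure = drives_per_enclosure * tb_per_drive    # 140 TB per 5U
tb_per_rack = tb_per_enclosure * enclosures_per_rack      # 840 TB raw per rack
print(tb_per_enclosure, tb_per_rack)                      # 140 840
[/code]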
 

HistoryBuff44

I haven't worked with this system, but I have used a Dell Compellent solution. It was a fantastic unit: three tiers of storage, with SSD, 15K SAS, and 2TB SATA drives. All writes land on RAID 10 and are then moved to RAID 5 later via a scheduled job. You can specify which data goes to which tier or let it decide on its own. Last I heard it was expandable to about 500 drives. We easily saturated our 4Gb fiber network with it without the controllers breaking a sweat.
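
That write-down behavior is easy to picture as a toy model. The sketch below is just an illustration of the idea, not Compellent's actual Data Progression code, and the one-week demotion threshold is made up:

[code]
# Toy model of scheduled tier-down migration: writes land on a fast
# (RAID 10) tier, and a periodic job demotes blocks that have gone cold.
import time
from dataclasses import dataclass, field

@dataclass
class Block:
    data: bytes
    last_access: float = field(default_factory=time.time)

class TieredStore:
    def __init__(self, demote_after_s=7 * 24 * 3600):   # one week, made up
        self.fast = {}   # SSD / 15K tier, think "RAID 10"
        self.slow = {}   # SATA tier, think "RAID 5"
        self.demote_after_s = demote_after_s

    def write(self, key, data):
        # All new writes land on the fast tier
        self.fast[key] = Block(data)

    def read(self, key):
        blk = self.fast.get(key) or self.slow.get(key)
        if blk is not None:
            blk.last_access = time.time()
            return blk.data
        return None

    def run_scheduled_migration(self):
        # The nightly job: move anything not touched recently to the slow tier
        now = time.time()
        cold = [k for k, b in self.fast.items()
                if now - b.last_access > self.demote_after_s]
        for key in cold:
            self.slow[key] = self.fast.pop(key)
[/code]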
 

alidan

[citation][nom]willard[/nom]Petabyte SSDs are probably never happening with current technology, for exactly the reasons you state. We'd need to increase the storage amount per cell pretty dramatically, and there are already problems with MLC drives compared to SLC. Magnetic disks, however, may well reach that capacity. There are some interesting technologies on the horizon, which I can't remember exactly what are called, but promise to greatly increase storage density on magnetic disks. Using lasers to heat up the platter before reading and writing to be able to manipulate smaller portions of the platter, for example. Assuming Moore's Law holds for magnetic disk storage density (and I have no idea if it's even holding now), we'd need to go from 3TB to 1000TB, ignoring the fact that current drives aren't really 3TB and 1000TB isn't really a PB. That's about eight and a half cycles, for lack of a better word, of Moore's Law. With each one being 18 months long, and rounding up to 9 cycles, we'd be looking at 1PB HDDs in 2Q 2025. There are a LOT of ifs involved, though. Honestly, I'd be surprised if magnetic disks were still in use in mainstream applications in 2025. We'll probably be eyeing the successor to SSDs by then.[/citation]

Well, heat-assisted recording is only going to help so much; the estimates are around 10-20x, somewhere in that range. I'm not sure if that also assumes better heads, or if that's with current technology.

But an SSD? That's honestly far more likely, because the process to make them 3D should be right around the corner, increasing capacity dramatically without increasing the physical size much, if at all. And if the 3D process can be done significantly faster than two single-layer wafers, that capacity increase will come at very little end-user cost. I really doubt we will see HDDs get much bigger than 100TB unless a new method of making them is found, or a way to make them give up less space to ECC.
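
For what it's worth, the doubling math in the quoted post is easy to check; a quick sketch, assuming an 18-month doubling period and counting from early 2012:

[code]
# How many 18-month doublings to get from 3 TB to 1000 TB?
import math

start_tb, target_tb = 3, 1000
doublings = math.log2(target_tb / start_tb)      # ~8.4 doublings
months = math.ceil(doublings) * 18               # round up to 9 cycles
print(f"{doublings:.1f} doublings -> ~{months} months (~{months / 12:.1f} years)")
# ~162 months (13.5 years) from early 2012 lands around mid-2025,
# which matches the quoted estimate.
[/code]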
 

didgetmaster

I remember seeing a similar ad about 11 or 12 years ago for a 1 TB system that was loaded with 1 or 2 GB hard drives. It likewise cost a ton of money and was only affordable by big businesses.

Today, you can get 1TB that fits in your shirt pocket for under $100 (or at least you could before the Thailand floods caused the current hiccup in prices).

Does that mean that in 2025 we will get that 1 PB in our shirt pocket for about $100? Personally, I doubt it since it would require some major technological and scientific breakthroughs, but given the track record of HDD manufacturers over the past 30+ years, I wouldn't bet against them.

One thing is for certain: in ten years we will have data storage systems with much higher capacity and at a much lower $/TB than we currently have. What will fit in our shirt pockets? 10 TB? 50 TB? 100 TB? Only time will tell.
 

scook9

I have to say, that is a pretty ugly rack. They really need to get the racks looking better like EMC does for that price ;)

-Biased RSA/EMC employee :D
 

QEFX

[citation][nom]anort3[/nom]"Honey, I downloaded the internet!"[/citation]
Why does the show The IT Crowd come to mind?
 

A Bad Day

[citation][nom]augie500[/nom]Too bad in 5 years it will be recycled.[/citation]

Indeed. By that time, you'll be able to get a much smaller rack with the same capacity that uses less energy (fewer spinning motors = lower power draw).
 

g00ey

[citation][nom]clownbaby[/nom]I remember marveling at a similar sized cabinet at the Bell Laboratories in Columbus Ohio that held a whole gigabyte. That was about twenty years ago, so I would suspect in another 20 we might be carrying PBs in our pocket.[/citation]
A guy told me how he was marveling 20 years ago over a 3.5" hard drive that held 4GB. Granted, he was working at a research facility that developed radar equipment for the military, so it's no wonder they had access to state-of-the-art technology that had not yet reached the regular sales channels.

So maybe there is some piece of storage no bigger than a deck of cards, holding 1PB of data, lying around in a secret lab somewhere...
 
[citation][nom]g00ey[/nom]...So maybe there is some piece of storage holding 1PB of data, not bigger than a deck of cards lying around somewhere in a secret lab...[/citation]

And it's from salvaged alien technology, like "hook and loop" fasteners and Tang!
 

Kewlx25

They need more memory on those systems. 48GB is NOT enough for 1PB and ZFS.

Recommended memory is 5GB per TB for *just* dedup, not counting file caching. Assuming 512TB per rack, that's 2.5TB of memory to run just the dedup optimally. They could help speed up their systems by using a very high-IOPS SSD to cache the dedup info.

Every time you want to insert new data, ZFS needs to look up the checksum of your new block to see if it already exists. About 5GB of checksums are created per TB.
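
Rough sizing with that 5GB-per-TB rule of thumb (treat the ratio as a ballpark, not an exact spec):

[code]
# Ballpark dedup-table sizing at ~5 GB of checksums per TB of unique data.
GB_OF_HASHES_PER_TB = 5

def dedup_ram_gb(pool_tb):
    return pool_tb * GB_OF_HASHES_PER_TB

for pool_tb in (512, 1000):
    need_gb = dedup_ram_gb(pool_tb)
    print(f"{pool_tb} TB pool -> ~{need_gb} GB (~{need_gb / 1024:.1f} TB) of hashes")
# 512 TB -> ~2560 GB (2.5 TB), versus the 48 GB actually installed.
[/code]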

They should really top off both boxes with memory and drop in 2TB of SSDs as an L2ARC read cache.

They should also use a few SSDs for write cache, as ZFS will make good use of them.

What's really going to suck is running the periodic disk scrubs. ZFS can fix some errors when it reads, but you still need to regularly scrub everything, as too much accumulated silent corruption can still kill the data; ZFS can only fix so much damage at once. At least it will let you know if something went bad.
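
If anyone wants to script those scrubs, a minimal sketch (the pool name is a placeholder, and this would need to run with root privileges on the storage heads):

[code]
# Kick off a scrub (ZFS re-reads every block, verifies checksums, and
# repairs from redundancy where it can), then check on it later.
import subprocess

POOL = "tank"   # placeholder pool name

subprocess.run(["zpool", "scrub", POOL], check=True)
subprocess.run(["zpool", "status", POOL], check=True)   # shows scrub progress/errors
[/code]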

Either way, this is super awesome.
 

g00ey

Kewlx25: Do you really need that much RAM to use dedup? Then I don't see the point of using it. The purpose of dedup is to save hard drive space, so if it requires lots of expensive extra RAM, that defeats the purpose; after all, hard drive space is still cheaper than RAM.

I run my ZFS pool with dedup disabled, as I see no personal benefit in using it. I don't have more than 4GB of RAM for my 12TB raidz2 pool and I have not yet seen the system use up all of that RAM. But sure, it wouldn't hurt to have a bunch of SLC drives for the L2ARC and ZIL.
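
If you ever do want to know whether dedup would pay off on a pool before committing the RAM, zdb can simulate it against the data you already have; a minimal sketch, assuming a pool named "tank":

[code]
# Estimate the dedup ratio a pool *would* get, without enabling dedup,
# via zdb's simulated dedup table. Pool name is a placeholder.
import subprocess

POOL = "tank"

result = subprocess.run(["zdb", "-S", POOL],
                        capture_output=True, text=True, check=True)
print(result.stdout)   # ends with a DDT summary and an estimated dedup ratio
[/code]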
 

stalker7d7

[citation][nom]clownbaby[/nom]I remember marveling at a similar sized cabinet at the Bell Laboratories in Columbus Ohio that held a whole gigabyte. That was about twenty years ago, so I would suspect in another 20 we might be carrying PBs in our pocket.[/citation]
That was 20 years ago. Today a single hard drive holds not just 1 gigabyte but 1-3 terabytes. If the same scaling happens again, in 20 more years we will see not just 1 petabyte but possibly 1-3 exabytes...
 

therabiddeer

[citation][nom]zulutech[/nom]My desktop has 1250 watt psu. This makes me wonder about the efficiency of my computer.[/citation]
Unless you run quad CrossFire or triple SLI, you aren't using anywhere close to 1250 watts. A 1250W PSU is complete overkill for most desktops.

[citation][nom]Kewlx25[/nom]
Recommended memory is 5GB per TB for *just* dedup, not for file caching. Assuming 512TB per rack, that's 2.5TB of memory to optimally run just the dedup. They can help speed up their systems using a very high IO SSD to help cache dedup info.[/citation]
It's not really that high, is it? A high-end server with 36TB (12 x 3TB) would need roughly 180GB of RAM, about 23 slots' worth of 8GB DIMMs, to hit that number, which would basically mean you can't fully utilize a 3TB-drive solution at the moment without a very large amount of SSDs to go along with the memory.
 

Kewlx25

ZFS creates about 5GB of hashes per 1TB of unique data. In order for dedup to work, one must:
1) hash new data
2) check current hashes to see if said hash already exists
3a) if new hash, insert
3b) if existing hash, map to pre-existing block

If you can't fit that 5GB/TB of hashes in memory, you're going to take a penalty to read from the HDs on every write. Hashes are pseudo-random, so the more memory you have, the less *chance* of going out to the HD. You don't need to fit all of the hashes in memory to benefit, but the more the better.
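
A toy, in-memory version of that lookup path, just to illustrate the steps above (real ZFS keeps SHA-256 block checksums in its dedup table, which is what spills to disk once RAM runs out):

[code]
# Toy dedup write path: hash the block, look it up, insert or remap.
import hashlib

dedup_table = {}   # checksum -> block address (the part that wants to live in RAM)
blocks = {}        # block address -> data
next_addr = 0

def write_block(data: bytes) -> int:
    """Store a block, reusing an existing copy if the checksum already exists."""
    global next_addr
    digest = hashlib.sha256(data).digest()   # 1) hash the new data
    addr = dedup_table.get(digest)           # 2) check whether the hash exists
    if addr is None:                         # 3a) new hash: store block, insert hash
        addr = next_addr
        next_addr += 1
        blocks[addr] = data
        dedup_table[digest] = addr
    return addr                              # 3b) existing hash: map to the old block

a = write_block(b"hello world")
b = write_block(b"hello world")
assert a == b   # duplicate data resolves to the same stored block
[/code]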

With 1PB of storage, 48GB of memory doesn't even begin to cover the hashes. Almost every write IO will require several read IOs to do the dedup lookups. Adding more memory will help, but with the bulk of your dedup lookups going out to the HDs, a nice high-IOPS SSD should make up for quite a bit of the missing memory.

48GB is only about $700 worth of high-speed, low-voltage, name-brand ECC memory. Compared to the cost of 1PB of drives, the enclosures, and the HD controllers, system memory is dirt cheap.
 