What Does One Petabyte Of Storage (And $500K) Look Like?


pedro_mann

Distinguished
Feb 23, 2010
143
0
18,680
I am old enough to remember flipping through Computer Shopper magazine, back when it was as thick as a phone book because of all the ads, and seeing racks like this that only held 1TB. And I am pretty sure you could have downloaded the entire internet at that time and stored it on one of those things :)
 

g00ey

Distinguished
Aug 15, 2009
470
0
18,790
Kewlx25: It seems like you've got it backwards: if you need 5GB of RAM per TB of hard drive storage, then you need 5TB of it for a PB of storage, and that is not "dirt cheap". That means you need to spend at least $75,000 on RAM (given that 48GB costs $700). The largest memory modules in production are 32GB; they are not easy to come by, nor are they cheap. HP charges $2,500 for such a module, and you would need at least 160 of them, which would add over $400,000 to the cost of the server. Also, I don't think there is any single system that can fit that many memory modules; the beefiest known quad-CPU motherboards hold 32 memory slots. That means you would need at least 5 motherboards, 20 additional Xeon (or Opteron, if you go AMD) CPUs, and the rack space and PSU amperage for those extra motherboards, just to fit that many memory modules into a system.
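To make the arithmetic explicit, here is a quick back-of-the-envelope sketch in Python. The 5GB-per-TB rule of thumb and the $700-per-48GB price are just the figures quoted above, not vendor numbers.

[code]
pool_tb = 1000                  # roughly 1 PB of usable pool space
ram_gb_per_tb = 5               # the dedup rule of thumb quoted above
price_per_48gb = 700            # the RAM street price quoted above

ram_needed_gb = pool_tb * ram_gb_per_tb            # 5,000 GB, i.e. about 5 TB of RAM
ram_cost = ram_needed_gb / 48 * price_per_48gb     # roughly $73,000 at that price

print(f"RAM needed: {ram_needed_gb} GB")
print(f"Approximate RAM cost: ${ram_cost:,.0f}")
[/code]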

Then there is the problem of how to optimally utilize this computational capacity to properly handle the I/O operations on the storage pool.

So if dedup requires as much RAM as you claim, then it is much better to forget about it.
 

theronconrey

Distinguished
Jan 29, 2012
5
0
18,510
@issuemonkey
Glad to see NetApp paying attention. Some accuracy is needed in your comments, however.

1) The caching: you could have 48GB of RAM or more, but this is only for your server to be able to handle the load of your I/Os and nothing more. It requires real caching to properly handle identical requests for specific files like Word, Excel, or even a VMDK.

With proper caching, like what we have on our gear, you can cache the couple of VMDKs used to boot-storm a full stack of virtual desktops: access them once on your disks, then serve them from fast SSD cache.

response: Actually, you're wrong. The reason you have a separate "cache" product is that your file system doesn't handle it natively, and you charge out the nose for it as a separate product. The RAM, or ARC (tier 1), cache works for both reads and writes. The next layer is split in two, with the write cache separate from the read cache, both on SSD or DRAM caching disks (tier 2). Both of these handle workload I/O for the different application sets you described. This removes the barrier of having to tier your disks to handle load, so your zpools can be sized to handle the working set presented to the ESX cluster or application X. Your solution, with expensive PAM cards and Flash Cache, does this as well, but both cost more and are not included in the initial cost of buying an array. Even then, PAM isn't as fast as the native RAM cache that Nexenta provides.
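To illustrate the read path being described, here is a deliberately simplified, hypothetical Python model of a two-tier cache (RAM first, SSD second) in front of spinning disk. It is only a sketch of the concept, not Nexenta or ZFS code, and the class and names are made up.

[code]
class TieredReadCache:
    def __init__(self, disk):
        self.ram_cache = {}    # tier 1: ARC-style cache in main memory
        self.ssd_cache = {}    # tier 2: L2ARC-style read cache on SSD
        self.disk = disk       # backing spindles, the slowest tier

    def read(self, block_id):
        if block_id in self.ram_cache:            # fastest possible hit
            return self.ram_cache[block_id]
        if block_id in self.ssd_cache:            # SSD hit; promote to RAM
            data = self.ssd_cache[block_id]
            self.ram_cache[block_id] = data
            return data
        data = self.disk[block_id]                # miss: touch the spindles once
        self.ram_cache[block_id] = data           # later reads come from cache
        self.ssd_cache[block_id] = data
        return data

# A boot storm of identical desktops reads the same VMDK blocks over and
# over: only the first read ever touches the disks.
cache = TieredReadCache(disk={"vmdk-block-0": b"golden image data"})
for _ in range(100):
    cache.read("vmdk-block-0")
[/code]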

2) Looking at the pictures, the RAID adapters are not battery-backed, so does that mean there is no protection of your data during a controller loss?

response: No. Please read about ZFS.

It is good to have dual "servers" to protect against failure, but if your last writes to your DB are lost... basically, you run into trouble. This is the equivalent of a FAS, not the servers attaching to it.

Enterprise storage uses an interconnect card between the controllers with some cache, what we call NVRAM; if one controller goes down, this cache is battery-backed and will be accessed by the remaining node to flush that set of data to disks.

response: Agreed; this is called the ZFS intent log, or ZIL. It lives on the shared trays of storage and can fail over to the other head. We do that and also think it's important. Anyone who doesn't do this shouldn't be in the business of enterprise storage.
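Here is a conceptual sketch of the intent-log idea under discussion: a synchronous write lands on stable, shared storage before it is acknowledged, and the surviving head replays whatever was never flushed to the main pool. The function names are made up for illustration; this is not the actual ZIL implementation.

[code]
shared_intent_log = []      # lives on shared SSD trays, visible to both heads
main_pool = {}              # the spinning-disk pool

def sync_write(key, value):
    # Record the write on stable storage before acknowledging the client.
    # (In normal operation it is later flushed to the pool and the log entry retired.)
    shared_intent_log.append((key, value))

def failover_replay():
    # The surviving head imports the pool and replays whatever the failed
    # head had logged but not yet flushed.
    for key, value in shared_intent_log:
        main_pool[key] = value
    shared_intent_log.clear()

sync_write("db-txn-42", "committed row")
failover_replay()
print(main_pool)    # {'db-txn-42': 'committed row'}
[/code]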

3) They do speak about a failover mechanism, and this is also scary. Is it done automatically, in a transparent way, for the different protocols?

response: Why is this scary? Does it scare you how NetApp does it? Yes, it's transparent. Yes, you're trollolololing. If you really want to see how enterprise failover works at scale, Nexenta has 45-day free trials. I'd be glad to show you why we outperform NetApp day in and day out.


4) There is no concept of tiering, performance, or workload type. This kind of setup will not fit everyone...

response: ZFS doesn't need special tiering software or hardware because it's baked natively into the file system. Please read about ZFS before making crazy comments. NetApp doesn't provide me software that I can run on commodity hardware. That's not bad, but the fact that Nexenta doesn't need expensive PAM cards or ridiculous "Fast Cache" products that cost piles extra doesn't mean it's not solving EXACTLY the same issues. Again, I have to point out: we do it in the file system itself, not with crazy bolt-on stuff. Not every enterprise array works for every scenario, NetApp included. The fact is, though, with a software-oriented Open Storage solution like Nexenta, I can take my software to other hardware if I need to. The same can't be said for NetApp, where I'm stuck with an overpriced solution and overpriced support.
 

theronconrey

Distinguished
Jan 29, 2012
5
0
18,510
Kewlx25, g00ey: If we're talking VMware, I'd rather leverage thin provisioning and compression to get my space savings and leave all the RAM for straight ARC caching. Dedupe works awesome, but if I can slice the pie a different way and get more cache, I'd do that.
 

theronconrey

Distinguished
Jan 29, 2012
5
0
18,510
[citation][nom]scook9[/nom]I have to say, that is a pretty ugly rack. They really need to get the racks looking better like EMC does for that price -Biased RSA/EMC employee[/citation]

I can agree with that one :) The EMC racks do look pretty.
 

theronconrey

Distinguished
Jan 29, 2012
5
0
18,510
[citation][nom]therabiddeer[/nom]I work for HP also, and the 3PAR systems are ridiculous. Honestly, when it comes to enterprise, you can't beat the stuff that HP sells.[/citation]

HP has gear on the Nexenta HSL as well.
 

theronconrey

Distinguished
Jan 29, 2012
5
0
18,510
One of the key points I think isn't being talked about here is that Aberdeen is putting together an enterprise-grade array at a FRACTION of the cost other storage providers are bringing to the table. Anyone in the market to buy should be doing their own homework and taking what the vendors say with a grain of salt. This is just another way for you to drive down your storage costs while leveraging software to do what was once the realm of only giant proprietary storage vendors. Thanks, Tom, for this write-up! Go Aberdeen! Go Nexenta!
 

sigma3d

Distinguished
Apr 27, 2009
38
0
18,530
[citation][nom]anort3[/nom]"Honey, I downloaded the internet!"[/citation]

Now upload it back to wherever you got it.
 
Guest
Given that these shelves have disks in both the front and back, how does that affect heat management? With more than 22k BTU/hr coming off this rack, how does it scrub its heat?
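For a rough sense of scale, here is that heat figure converted to watts (the >22k BTU/hr number is the one cited in the question, not a measured spec):

[code]
btu_per_hr = 22_000
watts = btu_per_hr / 3.412      # 1 watt is about 3.412 BTU/hr
print(f"{btu_per_hr:,} BTU/hr is roughly {watts / 1000:.1f} kW of heat to remove")
# -> roughly 6.4 kW, continuously, from a single rack
[/code]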
 
Guest
They should add another drive to make it a true petabyte. 963TB != 1PB
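For the record, here is the gap in both unit systems, treating the article's 963TB figure loosely:

[code]
raw_tb = 963                 # capacity quoted in the article
decimal_pb_tb = 1000         # 1 PB  = 1,000 TB (decimal)
binary_pib_tib = 1024        # 1 PiB = 1,024 TiB (binary)

print(f"Short of a decimal petabyte by {decimal_pb_tb - raw_tb} TB")
print(f"Short of a binary pebibyte by {binary_pib_tib - raw_tb} TiB")
[/code]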
 
Guest
First, remember that this is 1PB for only $500,000! Let us see NetApp / HP / EMC / IBM pricing on 1PB. You could buy 5 of these Aberdeen racks for the price of one PB from another vendor. :eek:) :eek:) :eek:)


Second, there are two servers here, each server holding 48GB RAM.


Third, to the HP guy who complained that the hardware RAID cards in this server have no battery backup:
There is no hardware RAID in this server, so there is no need for battery backup. The disks are basically connected directly to the motherboard. Hardware RAID is error-prone and might corrupt your data. CERN did a study and concluded that even big enterprise storage servers can corrupt your data without noticing. There is a lot of research on this (a toy sketch of the checksumming idea follows below); here is a starting point:
http://en.wikipedia.org/wiki/ZFS#Data_Integrity

Thus, don't trust any storage solution that uses hardware RAID. Just read the research papers and you will see why, and discard the HP solution that relies on hardware RAID.
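To show what that link is getting at, here is a toy Python sketch of end-to-end checksumming: every block is stored with a checksum and every read verifies it, so silent corruption is detected instead of handed back to the application. This is only an illustration of the concept, not ZFS code.

[code]
import hashlib

def write_block(store, block_id, data):
    # Store the data together with a checksum computed from it.
    store[block_id] = (data, hashlib.sha256(data).hexdigest())

def read_block(store, block_id):
    # Verify the checksum on every read; any mismatch means silent corruption.
    data, expected = store[block_id]
    if hashlib.sha256(data).hexdigest() != expected:
        raise IOError(f"checksum mismatch on block {block_id}: corruption caught")
    return data

store = {}
write_block(store, 0, b"important payroll data")
store[0] = (b"important pAyroll data", store[0][1])   # simulate a flipped bit on disk
try:
    read_block(store, 0)
except IOError as err:
    print(err)
[/code]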
 

billfor

Distinguished
Nov 2, 2010
2
0
18,510
I can remember when the first PC hard drive was 10 megabytes. How could you ever use that much storage? RAM was between 4-8K bytes, which was enough to run almost any software. It will be interesting to see where things are in ten years.
 
Guest
For all those who work for HP... sorry, but storage isn't the best thing you do, even though you buy great storage companies. 3PAR... meh. Once HP gets it, it slowly sinks. I work with multi-petabyte systems at work... actually, I choose and buy these systems.

Anyway, nothing new to see here; lots of companies offer this much space in the same footprint. For example, LSI Wembley assemblies hold a LOT of disks, probably the most in any footprint; I'm guessing HP is licensing these, since NetApp owns the patent on this now. :) NetApp bought this part of LSI recently, and they have the patent on the design. Who else offers a lot of space... Panasas, Isilon, etc., and they are not JBODs either. Panasas has been using 3TB disks for almost a year now... welcome to the party, HP. NetApp, I'm waiting for you to bring your game up.
 
Guest
The maximum data density possible for hard drives is roughly 10,000 times that of hard drives today. The reason is that current hard drives use about a million atoms per bit. I doubt we'll ever get down to 10-20 atoms per bit, because theoretically that won't be reliable.
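Spelled out, that is the following arithmetic (the atom counts are the commenter's assumptions, not measured figures):

[code]
atoms_per_bit_today = 1_000_000    # the figure claimed for current drives
atoms_per_bit_floor = 100          # a conservative practical floor, per the comment

density_headroom = atoms_per_bit_today / atoms_per_bit_floor
print(f"Theoretical density headroom: about {density_headroom:,.0f}x")   # ~10,000x
[/code]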
 

SirMasterboy

Distinguished
Jan 31, 2012
2
0
18,510
This storage pod is a pretty big ripoff IMO. 1 PB shouldn't cost anywhere near $500,000.

For example, Backblaze has built its own storage servers that store more than 16 PB right now; it costs them only $56,696 per PB, and takes up the same space as this pod.
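Putting those two quoted per-petabyte figures side by side (neither is independently verified here):

[code]
aberdeen_per_pb = 500_000     # price quoted in the article
backblaze_per_pb = 56_696     # price quoted in the Backblaze post linked below

ratio = aberdeen_per_pb / backblaze_per_pb
print(f"The Aberdeen rack costs roughly {ratio:.1f}x more per PB")   # ~8.8x
[/code]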

Performance-wise, it also handles thousands of users at a time without problems.

http://blog.backblaze.com/2011/07/20/petabytes-on-a-budget-v2-0revealing-more-secrets/
 

douglaskuntz

Distinguished
Feb 8, 2010
10
0
18,510
Funny thing is... I designed this exact setup over two years ago for another company (right down to using Nexenta). So, to answer a few questions:

No, the cards don't have battery backup. They're HBAs, not RAID cards. No caching is done on the cards themselves. The two control nodes have their RAM, which is considered the ARC in Nexenta/Solaris/OpenSolaris/etc., and that is the fast cache. Then you add in SSDs for Level 2 ARC (L2ARC), which is additional caching striped across X number of SSDs. Then you have the log drives, also SSD, in a mirrored pair (or mirrored triplet), which should be no larger than 2x RAM. This is good for database, rsync, NFS, or CIFS traffic. The cache helps everything, especially iSCSI-based traffic.

As they're probably offering the commercial version of Nexenta, it will handle failover easily. There's heartbeating between the two systems via network, serial, and/or a quorum disk. If machine X fails, machine Y takes over the virtual IP, imports the ZFS volumes, etc., and everything is good. When machine X comes back, it's told "you're now the standby," it goes "OK," and it waits for Y to fail (a rough sketch of that handshake follows below).

The L2ARC cache can live in each control node, since if that data is lost it just causes a slowdown. The log volumes (ZIL) need to be spread among several of the JBODs, because if they sit on one controller or the other, data in them that has not yet been flushed to disk will be lost or corrupted.
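Here is a rough Python sketch of that heartbeat-and-takeover handshake. The class and method names are made up for illustration; this is not Nexenta's actual HA code.

[code]
import time

HEARTBEAT_TIMEOUT = 10          # seconds of silence before the standby takes over

class StorageHead:
    def __init__(self, name):
        self.name = name
        self.active = False
        self.last_peer_heartbeat = time.time()

    def receive_heartbeat(self):
        # Called whenever the peer checks in over network, serial, or quorum disk.
        self.last_peer_heartbeat = time.time()

    def check_peer(self):
        # The standby takes over only after the peer has been silent too long.
        if not self.active and time.time() - self.last_peer_heartbeat > HEARTBEAT_TIMEOUT:
            self.take_over()

    def take_over(self):
        # Grab the virtual IP and import the ZFS pools from the shared JBODs.
        print(f"{self.name}: peer is silent, taking the virtual IP and importing pools")
        self.active = True

    def rejoin_as_standby(self):
        # A recovered head is told it is now the standby and simply waits.
        print(f"{self.name}: rejoining as standby")
        self.active = False

standby = StorageHead("machine-Y")
standby.last_peer_heartbeat -= HEARTBEAT_TIMEOUT + 1   # simulate a dead peer
standby.check_peer()                                    # -> machine-Y takes over
[/code]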
 
Guest
Yes, the 3PAR V800 offers amazing storage density and speed. The images above from Aberdeeninc.com are just scary in terms of uptime and support risk, and a big exposure for a company versus using a proven enterprise-class or near-enterprise-class solution. I would use Barracuda Networks before them. But currently you get great value for your money on IBM FAStT, HP 3PAR T-class, or NetApp 3000-class products if you know how to purchase them at a discount. Server storage has to last 5 years (3 in production, and 2+ more years repurposed for slower-performance QA/development, for example).
 
Guest
Cool article. You should revisit this story 5 years from now and make some comparisons.
 
Guest
This rack is actually 1.4m, and every one after that is 1.1m. One week to get it up and running in your house.
 

marciocattini

Distinguished
Sep 15, 2010
36
0
18,540
When you think about it, this is pretty cheap; Dell Compellent storage sells for about $270K with around 6TB of storage. Of course, the management software is top-notch and they almost never fail. This looks pretty powerful, though; I wonder why it doesn't cost more :O
 