How can blazing storage speeds be achieved?

eloric

Distinguished
Mar 13, 2010
Last August Tom's Hardware posted another record for I/O throughput: Another Record Broken: 6 Gb SAS, 16 SSDs, 3.4 GB/s!

From what I can tell this was about $14,000 worth of storage equipment. Quite impressive, I might add.


I used PassMark's results to research my first build. I was very pleased with the outcome, but when I benchmarked my new machine, disk performance was nowhere near the top rankings:

[screenshot: PassMark disk benchmark comparison]


The short, stubby lines at the bottom of each of the three graphs above are my Intel X25 SSD stats: reading at 225 MB/s and writing at a lowly 76 MB/s. The top PassMark systems are reading and writing at 3,000 to 4,000 MB/s.

That's faster than a whole flock of Velociraptors!

These numbers have to be jacked up artificially. Is it some wildly bloated block size? I can't believe all these people are just throwing a ton of money away for bragging rights.

Can anybody offer some hints or share their secret?
 
There's always going to be someone with a storage subsystem faster than yours. For all you know, those high benchmark scores came from some expensive enterprise server that was purpose-designed for throughput, or from some extreme benchmarking test such as the one you linked to.

Down that path lies madness...
 

eloric

Distinguished
Mar 13, 2010
'How do you know I'm mad?' said Alice.
'You must be,' said the Cat, 'or you wouldn't have come here.'

Actually, I wouldn't ask if it were just one rig. It appears to be several configurations with various builds, all non-server machines.

There is an enterprise solution called a Violin Memory appliance. It is so costly that they won't even mention the price in public: http://violin-memory.com
 

mattshwink1

Distinguished
Nov 20, 2009
Increasing storage speed is usually all about increasing spindles (more drives). In a RAID-0 config these can get very fast, as Tom's showed with the large number of drives. In fact, in some instances lots of slower drives (even SATA) can be faster than SSDs (and cheaper). IBM XIV uses this as a mid-level SAN solution (72 SATA drives per cabinet) to provide ~100,000 IOPS. So the number of drives is usually more important than their performance (of course, if the number of drives is equal, the higher-performing drives are usually better).

Of course, more drives means more space and more power, so those are the trade-offs. In a desktop/rackmount system you also have to be concerned with interface bandwidth: as you add drives you will saturate a PCIe x1 interface, so you need to go to x4 or x8 so that the bus is not a bottleneck.
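To get a rough sense of where the bus becomes the bottleneck, here is a back-of-the-envelope sketch in Python. The per-drive and per-lane figures are assumptions for illustration, not measurements of any particular hardware:

```python
# Rough assumptions, not measurements:
# ~120 MB/s sustained sequential per 7200 RPM SATA drive,
# ~250 MB/s usable per PCIe 1.x lane after protocol overhead.
DRIVE_MB_S = 120
PCIE_LANE_MB_S = 250

for drives in range(1, 9):
    array_mb_s = drives * DRIVE_MB_S                  # ideal RAID-0 scaling
    lanes_needed = -(-array_mb_s // PCIE_LANE_MB_S)   # ceiling division
    print(f"{drives} drives: ~{array_mb_s} MB/s, needs a x{lanes_needed} link")
```

With those assumptions, an x1 card tops out around two or three drives, which is why the controller's slot matters as much as the drive count.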

There are also some "extreme" solutions out there that use DIMMs (RAM) as a hard drive. Those also provide impressive performance.
 

vgdarkstar

Distinguished
May 12, 2008
While putting a bunch of HDDs in RAID0 sounds like it would be the best option, it's not. Probably the most important thing when looking at a drive's speed is not the throughput but the latency: how fast it can read one file and then move on to a different file at a different location on the platter.

You could relatively cheaply build a RAID0 array with 8 new Seagate or WD drives and achieve, depending on the controller, over 1000 MB/s read speeds. However, the latency is going to be higher than a single drive's (again, depending on the RAID controller).

I'm not familiar with that benchmark, so I don't know: do they show the latencies of the top scores? You could set up a large and relatively cheap RAID array with HDDs and get 3-4 GB/s transfer rates, but the latency will probably be in the 40-60 ms range, which is not good. That's only good for transferring large files sequentially.
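To put rough numbers on why latency matters more than throughput for small files, here is a toy calculation in Python. The latency and transfer figures are illustrative assumptions, not measurements of any specific drive or array:

```python
# Effective throughput on small random reads is dominated by access latency.
# Figures below are assumptions for illustration only.
request_kb = 4  # a typical small random read

for name, latency_ms, seq_mb_s in [
    ("big HDD RAID0 (high latency)", 50.0, 3500),
    ("single HDD",                   12.0,  120),
    ("single SSD",                    0.1,  225),
]:
    transfer_ms = request_kb / 1024 / seq_mb_s * 1000   # time moving the data
    total_ms = latency_ms + transfer_ms                 # plus time finding it
    effective_mb_s = (request_kb / 1024) / (total_ms / 1000)
    print(f"{name}: ~{effective_mb_s:.2f} MB/s on 4 KB random reads")
```

Under those assumptions, the huge array wins on big sequential copies but loses badly to a single SSD the moment the workload turns into lots of small, scattered reads.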

You also have I/O drives, which are very expensive. Heck, if all you're interested in is the numbers, you could download one of the many RAM drive programs and just make a small drive out of your RAM.

I have been asking myself the same question about storage speeds for years, and what I've settled on as the best approach is something like this:

The operating system goes on an SSD (you have one, and random reads/writes of many tiny files are what an SSD is good at). Put a pair or more in RAID0, as that will increase throughput; latency shouldn't be affected too much, depending on the controller. I imagine that putting multiple SSDs in RAID would also spread out the write wear they experience: if you have two drives, each one is being written to half as much.

For everything else, a decent RAID5 controller and multiple HDDs in RAID5 is the way to go. I've spent the last couple of years researching what I wanted and what gives the best performance for the price, and this is it, IMO.

It will cost you, though: anything worth calling a storage system is going to run between $1k and $2k at least. That's too much for some, but it sounds like you and I share the same requirements.

 
Solution
There are basically three dimensions to storage subsystem performance. It looks like the posted benchmark is only measuring transfer rates; you can maximize those simply by putting enough drives into a RAID-0 set and tuning the stripe size to best fit the application (or benchmark). There's really no rocket science to that, it's just a question of how deep your pocketbook is. At some point you start running into bus limitations, and then you need a REALLY deep pocketbook.

Access times are the second dimension. RAID really doesn't do much to improve them; what you need is a device with intrinsically less delay in finding data. SSDs (good) or battery-backed DRAM (very good) is the way to go here.

The third dimension is concurrent I/O rates. This isn't as important for desktop machines because they really don't have that much concurrent I/O going on most of the time. But it's extremely important for servers that are handling requests from numerous client machines. RAID helps a lot here - concurrency basically scales with the number of drives you add to a RAID volume. Again, not rocket science, just deep pockets.
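As a crude illustration of how those dimensions scale differently, here is a small Python sketch. The per-drive numbers are placeholder assumptions, and it idealizes RAID-0 scaling (no controller overhead):

```python
# Crude model: sequential throughput and concurrent I/O scale with drive
# count in RAID-0, but per-request access time does not improve.
SEQ_MB_S = 120    # assumed sequential transfer per drive
IOPS = 150        # assumed random I/Os per second per drive
ACCESS_MS = 12.0  # assumed average access time per drive

for n in (1, 2, 4, 8, 16):
    print(f"{n:2d} drives: ~{n * SEQ_MB_S:>4} MB/s, "
          f"~{n * IOPS:>4} IOPS, access still ~{ACCESS_MS} ms")
```

That matches the point above: deep pockets buy you the first and third dimensions, but only a different kind of device buys you the second.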
 

vgdarkstar

Distinguished
May 12, 2008
Agreed. Oh, and another dimension I forgot to mention: reliability. Backups are good but... you can also have a semi-redundant array, which is the point of RAID5.
 

eloric

Distinguished
Mar 13, 2010



This is the video that inspired Schmid and Roos to build their array with 16 Intel drives and two RAID controllers to reach 3.4 GB/s. See the link in the original post.
 

eloric

Distinguished
Mar 13, 2010
I have a definitive answer for this thread. Looking at the system specifics for the benchmarks with astronomical storage I/O numbers, I noticed that there was a difference between their physical RAM and their available RAM. I then discovered that PassMark permits users to specify which drive to benchmark, and it does not have to be a physical drive: it can be virtual.

Here is a link to download VSuite software that creates a virtual RAM drive:
http://www.romexsoftware.com/

PassMark also lets you specify a virtual DVD drive:
http://www.slysoft.com/en/virtual-clonedrive.html

I created the RAM drive and a virtual DVD drive, then loaded a disk image onto the RAM drive and mounted it as the DVD.

Here are the results in PassMark:
[screenshot: PassMark results with the RAM drive and virtual DVD drive]

For those of you with eyes like mine, those storage numbers are 3,527 MB/s for reads, 2,648 MB/s for writes, and the DVD reading at 278 MB/s.

These don't match the very best figures from the top PassMark systems, but they prove the concept for me.
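For anyone who wants to reproduce the effect without PassMark, here is a crude sequential-throughput check in Python. The R: drive letter and file sizes are just example assumptions; point it at whatever letter your RAM drive mounts as (and note the read pass may be served partly from the OS cache, which only exaggerates the same effect):

```python
import os
import time

# Crude sequential read/write check. Assumes the RAM drive is mounted as R:
# (hypothetical example path); adjust for your own setup. This is not
# PassMark's methodology, just a quick way to see similarly inflated numbers.
TEST_FILE = r"R:\throughput.tmp"
CHUNK = 4 * 1024 * 1024          # 4 MB per write/read
TOTAL = 256 * 1024 * 1024        # 256 MB test file

buf = os.urandom(CHUNK)

start = time.perf_counter()
with open(TEST_FILE, "wb") as f:
    for _ in range(TOTAL // CHUNK):
        f.write(buf)
    f.flush()
    os.fsync(f.fileno())
write_mb_s = (TOTAL / 1024 / 1024) / (time.perf_counter() - start)

start = time.perf_counter()
with open(TEST_FILE, "rb") as f:
    while f.read(CHUNK):
        pass
read_mb_s = (TOTAL / 1024 / 1024) / (time.perf_counter() - start)

os.remove(TEST_FILE)
print(f"write ~{write_mb_s:.0f} MB/s, read ~{read_mb_s:.0f} MB/s")
```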
 

bnot

Distinguished
Nov 17, 2007


My company tried out Violin's solutions a couple of months ago. They're like a Chinese Fusion/OCZ knockoff with the yearly cost of a RamSan farm. We switched to a more reliable supplier.