Do bigger HDs in RAID become slower than smaller HDs in RAID?

Spitfire7

Distinguished
Jan 18, 2007
770
10
18,995
Hey guys, quick question. If I had two 150GB 7200rpm HDs in RAID 0, would that be any faster in seek time, gaming, and everything else compared to having two 640GB 7200rpm HDs in RAID 0? I'm looking for speed for games here, so what do you think?

My other option is to go with a 300GB 10,000rpm VelociRaptor, but from my research two 7200rpm drives in RAID 0 are still slightly faster. So I am going to go with the 7200rpm drives in RAID, but should I get two smaller HDs, or am I safe to get two larger HDs without compromising any speed? Thanks.
 

sub mesa

Distinguished
RAID0 cannot lower the seek time of your disks; it simply can't. It can make it higher though, if created inefficiently.

A RAID0 of many 7200rpm disks increases both IOps and throughput (MB/s), but it cannot lower the latency of those disks. Single-threaded, blocking I/O workloads like games and booting would therefore benefit little from RAID0's potential. A faster disk like a Velociraptor or an SSD would help a lot more here.

If I had to choose between one Velociraptor and two ordinary 7200rpm disks in RAID0, I'd pick the Raptor.
 

Spitfire7

Thanks for the input, sub mesa. You are actually the first person I have heard say there would be more benefit from the VelociRaptor. Some charts I have seen even state that two 7200rpm HDs would have a combined speed of 14,400rpm, with much more storage space as well. So your input kind of puts me back at the drawing board. More input from everyone would be appreciated.
 

sub mesa

Two 7200rpm drives certainly do not give you a 14,400rpm drive. A RAID0 of two 7200rpm drives can perform two operations simultaneously, so that in the time one disk completes an operation, the array has completed two. This is called parallelism and is employed in many computing concepts (DDR, dual/triple channel, bank interleaving, multicore/SMP/SMT, PCIe multi-lane, Parallel ATA, etc). Parallelism works great if there is enough work to do, but it requires the application to use that parallel potential, or it goes to waste. A good analogy is single-threaded games and applications running on dual-core or quad-core CPUs: only one core can be used to process the game logic.

Translating this to RAID0: two 7200rpm disks would allow higher sequential speeds than one 10,000rpm disk, but for applications doing serial, non-sequential I/O, the 10,000rpm disk would beat the 7200rpm disks every time, no matter how many you pair in RAID0. True performance is not the MB/s you measure with ATTO, HDTune or HDTach, but the IOps performance of real applications. Those statistics are hard to establish on Windows-based systems, although some tracing benchmarks try to mimic real application behavior and make excellent realistic tests. Those tests show that a system disk benefits more from low latency (an SSD would be great, or an HDD with high rpm) than from high throughput (MB/s).
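To put rough numbers on the serial vs. parallel point, here is a minimal sketch of an idealised model. The function name `total_time_ms` and both access-time constants are made up for illustration and are not measurements. The model also assumes requests spread perfectly evenly over the disks, which real workloads will not do.

```python
import math

# Hypothetical average access times (seek + rotational latency), not measured.
ACCESS_7200_MS = 13.0
ACCESS_10K_MS = 7.0

def total_time_ms(requests, access_ms, disks, queue_depth):
    """Idealised time to finish `requests` random reads on a RAID0 array.

    Only min(disks, queue_depth) requests can be in flight at once, so an
    application that issues one blocking request at a time (queue_depth=1)
    gets no benefit from the extra disks.
    """
    in_flight = min(disks, queue_depth)
    rounds = math.ceil(requests / in_flight)
    return rounds * access_ms

# Serial, blocking I/O: the second disk sits idle.
serial_raid0 = total_time_ms(1000, ACCESS_7200_MS, disks=2, queue_depth=1)
serial_raptor = total_time_ms(1000, ACCESS_10K_MS, disks=1, queue_depth=1)

# Two requests outstanding: both disks work at the same time.
parallel_raid0 = total_time_ms(1000, ACCESS_7200_MS, disks=2, queue_depth=2)

print(serial_raid0, serial_raptor, parallel_raid0)
```

In this toy model the RAID0 pair only pulls ahead of the single faster disk once the application keeps more than one request outstanding.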
 

Spitfire7

Sub Mesa,

So just break it to me straight: on a fast system with a high-quality i7 CPU, what would give the best and quickest performance for the latest games?
 

sub mesa

A good SSD as a system disk, perhaps paired with a large 1-2TB WD Green for mass storage, or two of them in RAID-1 if you like. That would be a basic setup that gives you both speed for bootup/applications/gaming and storage space, plus the added reliability of the mirror for data you want to store safely.

Good yet affordable SSDs are Intel X25-M and OCZ Vertex.

If you cannot or don't want to invest in an SSD, a single Velociraptor without any RAID would be the best choice. You can put two of them in RAID0, though a single Velociraptor will do just fine. Likewise, you can pair multiple SSDs in RAID0 without much risk, since they do not fail the way mechanical drives do. The RAID layer itself, however, is another layer that can fail; with a single SSD or Velociraptor you don't have that additional risk. That matters all the more because Windows RAID drivers are often of poor quality compared to the solutions found in Linux and BSD. A lot of people just don't get all the potential that RAID0 can offer in the right circumstances.
 

jrst

Distinguished
May 8, 2009
83
0
18,630
The short answer is that the 640GB drives will be faster than the 150GB drives *if* all other drive performance characteristics are the same *and if* you're using the same amount of (contiguous) space. RAID-0 won't change that.

E.g., in the worst case, accessing 150GB on the 150GB drive requires a full-stroke seek (a seek across the entire drive); on the 640GB drive it requires a 150/640 ~= 1/4 stroke seek (a seek across 1/4 of the drive). That is sometimes referred to as "short-stroking". THG did an article on it a while ago. They limited the drive size in firmware, but you can do pretty much the same by properly partitioning your drive.

However, that doesn't mean the seek time of the 640GB drive (over 150GB) will be ~1/4 that of the 150GB drive. It's not linear because of acceleration, deceleration, and head settling time. (An indication is single-track or track-to-track seek time, which is the lower limit.) Also, access time is what counts, and seek time is only one component; the other component is rotational latency (which will of course be the same for drives at the same RPM).
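As a rough sketch of the arithmetic in the last two paragraphs: the average rotational latency is the time for half a revolution, and access time is seek time plus that latency. The function names and the seek figure in the example call are placeholders chosen for illustration. `stroke_fraction` only reproduces the 150/640 ratio and deliberately does not try to turn it into a seek time, since that relationship is not linear.

```python
def avg_rotational_latency_ms(rpm):
    """Average wait for the sector to come around: half a revolution."""
    ms_per_revolution = 60_000 / rpm
    return 0.5 * ms_per_revolution

def access_time_ms(seek_ms, rpm):
    """Access time = seek time + average rotational latency."""
    return seek_ms + avg_rotational_latency_ms(rpm)

def stroke_fraction(used_gb, capacity_gb):
    """Share of the drive's capacity the heads have to cover."""
    return used_gb / capacity_gb

print(avg_rotational_latency_ms(7200))   # same for every 7200rpm drive
print(avg_rotational_latency_ms(10000))
print(stroke_fraction(150, 640))         # roughly 1/4 of the drive
print(access_time_ms(8.5, 7200))         # 8.5 ms seek is a made-up figure
```

Because the rotational term depends only on RPM, short-stroking can shrink the seek component but leaves that part of the access time untouched.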

sub mesa has pretty much covered the 2x RAID-0 vs. Raptor question. I'd only add that if you were to short-stroke a much larger 7.2K drive (or a pair in RAID-0), you might get overall better performance than the Raptor. Key words are "might" and "overall" (i.e., more than just maximum transfer rate from a synthetic benchmark). I don't know of any credible tests.

Note: many of the tests you see involving RAID-0 vs. whatever are bogus. They don't account for the fact that, in many workload tests, the improvements are due to short-stroking-like behavior: the RAID-0 array is larger, so the heads don't have to move as far (and the same effect could be had cheaper and easier simply with a single larger drive).

In short, (and if SSD's aren't on the table) I'd go with: (1) Single Raptor. (2) A pair of decent >= 1TB 7.2K drives in RAID-1. (3) A single decent >= 1.5TB 7.2K drive. (4) If you must... a pair of decent >= 750GB 7.2K drives RAID-0 with a third (non-RAID) disk large enough for the OS and to backup anything important on the RAID-0 volume.

I'd create a dedicated partition for the stuff you'll use while gaming. Smaller is better--you want to effectively short-stroke the drive by limiting the distance the heads have to travel during gaming, and not having it end up spread across the drive (but leave enough space so you can keep it defrag'd). Use the rest of the space for something that won't be needed when gaming.
 

ZenGeekDad

Distinguished
Sep 18, 2009
16
0
18,510


So serial means "can't be run in parallel" and non-sequential means "random access" ... right?

Anandtech's landmark 2009 articles on SSDs convinced me that SSDs trounce any RAID for random access reads (and the better SSDs do so for random access writes too). But I hadn't run into your concept here that there is such a thing as a serial ("can't be made parallel") write. I thought that the RAID controller divvied up the data to write across the stripes of the RAID by its own logic.

E.g., "Put the first 128KB of this song on disc A sector 1, the next 128KB of this song on disc B sector 1, etc." Even if the writes are lots of little files, I would expect the RAID controller to follow the same principle: "Put the first 128KB of this series of little files on disc A sector 1, the next 128KB of this series of little files on disc B sector 1, etc." and that would still be something the RAID controller could do in parallel to all the discs in the RAID array.
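The round-robin placement described above can be sketched as follows. This is only an illustration of the "first 128KB on disc A, next 128KB on disc B" idea. The function names `locate` and `disks_touched` are hypothetical, and a real controller's layout may differ.

```python
STRIPE_BYTES = 128 * 1024  # stripe size taken from the example above

def locate(offset_bytes, stripe_bytes=STRIPE_BYTES, disks=2):
    """Map a logical array offset to (disk index, byte offset on that disk)."""
    stripe_index = offset_bytes // stripe_bytes
    disk = stripe_index % disks          # stripes alternate between disks
    row = stripe_index // disks          # how many stripes deep on that disk
    return disk, row * stripe_bytes + offset_bytes % stripe_bytes

def disks_touched(offset_bytes, length_bytes, stripe_bytes=STRIPE_BYTES, disks=2):
    """Which disks a single request of `length_bytes` has to involve."""
    first = offset_bytes // stripe_bytes
    last = (offset_bytes + length_bytes - 1) // stripe_bytes
    return sorted({s % disks for s in range(first, last + 1)})

print(locate(0))                        # disc A, start of the disk
print(locate(128 * 1024))               # disc B, start of the disk
print(disks_touched(0, 4096))           # a small request stays on one disk
print(disks_touched(0, 256 * 1024))     # a large request spans both
```

Under this layout a request smaller than one stripe usually lands on a single disk, so the other disk is free for a different request only if one is outstanding at the same time.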

So it seems there should be no such thing as a serial ("can't be made parallel") write. The notion of a serial ("can't be made parallel") read seems much more reasonable: the RAID controller ties physical sectors of the RAID array's HDDs together, so moving the read heads of the RAID array's distinct HDDs independently -- to different platter locations -- to read differently-located stripes in parallel would seem to violate the basic way that RAID controllers work. Right?

Just trying to understand this intriguing new technology. You seem to know your way around this topic, so hopefully you can clear this up for me. Thanks.

 

Scott2009

Distinguished
Sep 20, 2009
34
0
18,530
Remember to only 'benchmark' the first 150 GB of the 640 GB HDDs in RAID-0, vs the 2 x 150 GB RAID-0.

Otherwise it's not truly apples vs apples. (The first 72% of the surface of an HDD is going to outperform the last 28%, that's for sure, so YES, the 2 x 640 GB will be faster, even compared to a single 300 GB 10,000 rpm HDD.)

Surface density and throughput are more important to gamers than seek times.

If 300 GB is enough, consider waiting for the next generation of SSDs in sizes of 256 to 384 GB.

Typically games only *read* from a few LARGE files (which are similar to .WAD files in the original Doom), and don't perform 'that many' seeks.
 

Scott2009

If the typical read is going to be a sequential read of 512 - 768 KB of data, which is then stored in memory, the 'longer' seek times of a 7,200 rpm HDD (vs a 10,000 rpm one) will be negated by the throughput of 2 x 640 GB HDDs.
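The trade-off in that claim can be written out as access time plus transfer time. The sketch below uses placeholder figures only (13 ms and 200 MB/s for the RAID-0 pair, 7 ms and 120 MB/s for the 10,000 rpm drive). None of them are measurements, and the function names are made up for illustration.

```python
def read_time_ms(size_kb, access_ms, mb_per_s):
    """Time for one sequential read: one access plus the transfer itself."""
    transfer_ms = size_kb / 1024 / mb_per_s * 1000
    return access_ms + transfer_ms

def break_even_mb(access_a_ms, mb_s_a, access_b_ms, mb_s_b):
    """Read size (MB) at which setup A and setup B take equally long.

    Assumes A has the higher access time and the higher throughput.
    """
    extra_latency_s = (access_a_ms - access_b_ms) / 1000
    saved_s_per_mb = 1 / mb_s_b - 1 / mb_s_a
    return extra_latency_s / saved_s_per_mb

# Placeholder figures, purely for illustration.
raid0_ms = read_time_ms(768, access_ms=13.0, mb_per_s=200.0)
raptor_ms = read_time_ms(768, access_ms=7.0, mb_per_s=120.0)
print(raid0_ms, raptor_ms)
print(break_even_mb(13.0, 200.0, 7.0, 120.0))
```

Where the break-even lands depends entirely on the access times and transfer rates plugged in, so substitute measured values for your own drives before drawing a conclusion.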

Despite this I still recommend a 32 KB to 128 KB stripe size, unless the controller can't handle the I/O (if you go too small CPU usage may skyrocket).
 

Scott2009

As for the weird stuff above, remember that no two drives spin at exactly the same RPM: one will be at 7190 rpm, the other maybe at 7210 rpm. They do not spin in sync.

This is why NCQ and TCQ (SCSI) are such good ideas: the controller takes requests in serially, and the reads AND writes occur simultaneously, even to different files and/or partitions.

Also: the person above has 'sectors' and 'clusters within a stripe' slightly confused. Bear in mind that a cluster could technically span a stripe, but *usually* a stripe contains multiple clusters.