Hi guys, many thanks for the help building my graphics workstation and its huge RAID arrays. Now I've got another big project, this time for a company I'm working for, and I need a little help with some throughput issues. The issues are complex, so sorry for writing a novel. I do have a couple of quick questions, though, which I'll put up front.
Quicky #1: what's the best stripe size for serving large files over the network? I assumed it would be one of the smaller ones (e.g. 16k, which is what it was set to), because the machine is serving Ethernet-packet-sized chunks, but I don't know how caching and interrupt coalescing on the Ethernet card factor in. And how does stripe size factor into RAID-50? If you write 64k chunks to the three "logical" RAID-5 arrays that make up the RAID-0, are you really writing ~21k chunks to each disk? Or are you writing 64k chunks to each disk, making the effective chunk size for the RAID-0 array (of RAID-5 arrays) 192k?
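To make my confusion concrete, here's the arithmetic for the two interpretations, written out as a little Python-style scratch calculation. This is just a sketch of my own reasoning, assuming a 3-drive RAID-5 puts data on two disks and parity on the third for any given stripe; I have no idea which way the 3ware firmware actually lays it out.
<pre>
# Stripe-width arithmetic for a RAID-0 of three 3-drive RAID-5 groups.
# Assumption: this only illustrates the two interpretations above, not how
# the 3ware firmware actually distributes data.

stripe_kb = 64          # stripe size configured on the controller
r5_groups = 3           # three RAID-5 sub-arrays under the RAID-0
data_disks_per_r5 = 2   # a 3-drive RAID-5 holds data on 2 disks + parity on 1 per stripe

# Interpretation A: the stripe is what each RAID-5 group receives,
# and the group splits it across its data disks.
per_disk_a = stripe_kb / data_disks_per_r5      # 32 KB actually written per disk
full_stripe_a = stripe_kb * r5_groups           # 192 KB of data per full RAID-0 stripe

# Interpretation B: the stripe is what lands on each physical disk,
# so each RAID-5 group soaks up a bigger chunk.
per_group_b = stripe_kb * data_disks_per_r5     # 128 KB handed to each RAID-5 group
full_stripe_b = per_group_b * r5_groups         # 384 KB of data per full RAID-0 stripe

print(per_disk_a, full_stripe_a)    # 32.0 192
print(per_group_b, full_stripe_b)   # 128 384
</pre>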
Quicky #2: I had a hard time finding benchmarks, so which is faster, RAID-5 or RAID-50? RAID-50 supplies more redundancy than RAID-5, so it should be slower, correct? But one benchmark I found showed RAID-50 as much faster, and another showed them being about the same (though that was a very poorly done benchmark; I think it was limited by the PCI bus). Those two are the only benchmarks I've managed to find.
So, on to the big problem. A little background... We capture old 8mm/16mm film to digital video from multiple stations around the office, edit it in Premiere, and export it to DVD. Before I joined, they used sneaker-net with external FireWire hard drives (not a good system, but functional), so I took on the task of building them a centralized network storage machine that could handle the load of uncompressed video flying around everywhere.
So, on to the hardware... I built this system for them to handle the networking:
<b>Tyan S2720-533</b> (<A HREF="http://www.tyan.com/products/html/thunderi7501.html" target="_new">here</A>)
(<b>with dual gigabit ports</b> and a single 10/100 port. Upgradable to as many as 14 gigabit ports.)
<b>3Ware Escalade 9500S-12</b> (<A HREF="http://www.3ware.com/products/serial_ata9000.asp" target="_new">here</A>)
<b>Nine 250GB SATA drives</b> in hot-swap enclosures. Initially all nine drives were put into a single RAID-50 (i.e. a RAID-0 of three 3-drive RAID-5 arrays, I think, with a 16k stripe size; see the quick capacity math after this list).
<b>Windows XP Pro</b> (I know, I need to learn to use Linux for this stuff.)
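For reference, the usable space works out like this. This is just arithmetic on the layout above, assuming it really is three 3-drive RAID-5 groups striped together:
<pre>
# Capacity sanity check for the 9-drive RAID-50 described above.
drive_gb = 250
total_drives = 9
r5_groups = 3
drives_per_group = 3

# Each 3-drive RAID-5 gives (n - 1) drives of usable space; the RAID-0 just sums the groups.
usable_gb = r5_groups * (drives_per_group - 1) * drive_gb    # 1500 GB usable
parity_gb = total_drives * drive_gb - usable_gb              # 750 GB spent on parity
print(usable_gb, parity_gb)  # 1500 750
</pre>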
Here's the problem. The stations do everything in realtime, so the video each station works with has to be read and written in realtime during editing, and written in realtime during capturing. SiSoft Sandra reports that the array has PLENTY of random read and random write throughput, but the system gets hammered when the typical number of stations are working at the same time.

I don't think it's network bandwidth, because the office is divided into two gigabit segments, and I can have three stations doing stuff on segment 1 with no trouble, but one more station starting up on segment 2 after that kills it. The three network ports are bridged in Windows, which seems to be smart enough not to forward traffic to other ports if it isn't destined for that segment (at least, one switch can be blinking like nuts while the other is silent), so it's not the network bridge.

I don't much trust the Windows performance monitoring applet (the one in Administrative Tools), but it does say the hard drive array is running with only 5%-7% idle time once three stations are using it. It also says the array is spending 170% of its time on reads and another 130% on writes... so... yeah, thanks, Windows.
Last night I divided the single RAID-50 array into multiple smaller RAID-5 arrays plus a couple of single disks. I'm hoping that way we can use one array for capturing, one for exporting, etc., and get more total throughput out of the same number of drives, since RAID doesn't exactly scale performance linearly (especially for random reads and writes). But it would be much more convenient for us to have a single array.
So, does anybody have performance-improving suggestions to get the single-array setup working right? All we need is about 4 MB/sec per station: write-only for the two capturing stations, and both read and write for the three editing/exporting stations.
SiSoft Sandra said last night that the 9-drive RAID-50 array managed 110MB/sec random read but only 30MB/sec random write, and warned that the discrepancy between read and write performance might be due to a write-verify setting being on. I can't find that setting, if it exists; it's not in the 3ware card's BIOS or in its management software. I already looked there last night.
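For reference, here's the back-of-the-envelope load I'm designing for, next to those Sandra numbers. This assumes every station hits the array flat out at the same time and that each editing station reads and writes a full stream simultaneously, which may overstate the real load:
<pre>
# Rough aggregate load for the worst case: every station hammering the array at once.
# Assumption: each stream really is the full ~4 MB/sec figure above, in each direction.

per_station_mb = 4
capture_stations = 2    # write only
edit_stations = 3       # read + write during editing/exporting

write_mb = (capture_stations + edit_stations) * per_station_mb    # 20 MB/sec of writes
read_mb = edit_stations * per_station_mb                          # 12 MB/sec of reads
total_mb = write_mb + read_mb                                     # 32 MB/sec mixed traffic

# Sandra's figures for the 9-drive RAID-50, for comparison:
sandra_random_read_mb = 110
sandra_random_write_mb = 30

print(write_mb, read_mb, total_mb)  # 20 12 32
</pre>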
Another note: jumbo frames did not seem to help, for whatever reason. I specifically built the network to support them, but the problem actually gets worse when they are turned on.
Also, does anybody know of software that can monitor how much bandwidth each station is using? Something other than running the Windows performance monitor on every station, because while they are exporting or capturing they shouldn't be spending cycles refreshing that thing, and it would be hidden behind Premiere where you can't see it anyway.
Also, should I even bother trying a single RAID-5 if a single RAID-50 didn't work? RAID-0 is out, because the likelihood of a failure somewhere in a 9-drive array is too high, and suddenly losing all of our data would be pretty bad.
Thanks to all the hardcore people who read this far. Give me some hardcore advice so I can get full use out of this system!
<P ID="edit"><FONT SIZE=-1><EM>Edited by grafixmonkey on 08/31/04 02:08 PM.</EM></FONT></P>