Home Server used for data backups: RAID 5 or disks mounted independently of each other

howlingmad_wolf

Nov 11, 2013
I am building a home server and have the basic system up and running, but I am at the point where I need to decide how to configure the hard drives that will hold most of the data. I have not purchased the drives yet. If I go with RAID, I will be buying WD Red drives. If I decide to mount the drives independently, I think it would be better to get WD Green drives, as the Reds have time-limited error recovery (TLER), which may not be the best thing outside of a RAID (and the Greens are cheaper).

I guess the real question I want to ask is how often do parity blocks get errors and how often do file systems get corrupted on a RAID?

I have no intentions of doing a JBOD as that is just as dangerous as RAID 0 without the performance benefits (not doing RAID 0 either). RAID 10 costs more money than I want to pay.

Also, I was wondering: if I partitioned each drive in half to virtually double the number of drives and then configured the halves in a RAID 6 array, would I mitigate some of the risk of data corruption, or would I just be increasing it?

If I mounted the drives independently and one failed, the data on that disk would be "lost" and I would have to re-upload a new backup of it. If I do a RAID 5 array and I lose a drive and have a bad parity block, then the data can't be reconstructed; and if more than one drive fails and/or the entire file system is corrupted, then ALL the data is lost and I will have to re-upload all the backups and data from scratch... so what is the best option?
 
Well, it really depends on how many drives you have. I use a WD Red as a single drive and have had no issues, along with two in a RAID 0 (I have a backup, of course). With a RAID you have less of a chance of totally losing data and having to restore from your last backup.

Just note that with RAID, the more drives you have and the larger they are, the longer a rebuild takes, and the greater the chance of another drive failing while it runs.
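
To put a rough number on the rebuild time, here is a back-of-the-envelope sketch in Python. The 100 MB/s sustained rebuild rate is purely an assumption for illustration (real rates depend on the controller, the drives, and how busy the array is), and `rebuild_hours` is a made-up helper name.

```python
def rebuild_hours(drive_tb, rebuild_mb_per_s=100.0):
    """Hours needed to rewrite one whole drive at a sustained rate.

    rebuild_mb_per_s is an assumed figure, not a measurement.
    """
    drive_bytes = drive_tb * 1e12                    # drive makers use decimal terabytes
    seconds = drive_bytes / (rebuild_mb_per_s * 1e6)
    return seconds / 3600.0


for size_tb in (2, 3, 4):
    print(f"{size_tb} TB drive: about {rebuild_hours(size_tb):.1f} hours")
```

Under that assumption the time scales linearly with drive size, so a 4TB drive takes twice as long as a 2TB one, and the whole time the array is running without its normal redundancy.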

Personally, I don't like RAID cards that I can't read the SMART data off of. (My Dell SAS 5 RAID card lets me see that data, and I check it quite often.)

Also, on the whole "partitioning" thing for RAID 6: you have to understand that with RAID, partitions do not matter, only disks and disk size. If you do any kind of RAID, make sure the disks are all the same model and the same size. (And if you do use a RAID, I wouldn't use any disk that has an automatic shutoff or variable RPM, like the WD Green. That's only asking for problems.) Also, for RAID 5 and 6 they don't recommend using anything bigger than 2TB, because the rebuild could take a long time and another drive could fail in the meantime.


One thing I always do with all my drives is run Active Kill Disk (you can also use CCleaner, which has this built in and is free). I do at least a 3-pass wipe on the ENTIRE drive and then check the hard drive stats with Crystal Disk Info. Usually, if a drive is bad out of the box, it will fail the wipe and you can see it in the SMART status. This has come in handy, since one of my clients buys a lot of Dell servers; they do security (mainly video) and just need space. Last time we got them 3 WD Reds. I ran the program and one bombed within a few minutes. Returned it on Amazon and got a new one, which passed.

Just a few points and my opinion on RAID
 


I am planning on having four drives in the 3-4TB range. I plan on making this server available to family outside my local network, plus hosting some media content (that is why I am looking at larger-volume drives, and also why I do not want to use RAID 10 or 1).

Partitions don't matter if you're using a hardware-based RAID controller; however, you can create multiple partitions as separate volumes ("virtual drives") on a single drive, and a software-based RAID would see those volumes as drives, since the controller lives within the OS. There is a thread on this at: http://superuser.com/questions/116983/can-i-setup-a-software-raid-in-windows-7-using-virtual-hard-disks

My thinking was that RAID 6 has a fault tolerance of two drives: it keeps two independent parity blocks per stripe, so if one parity block is bad the other may not be. The array would still be vulnerable to more than one physical drive failure, because if one physical drive fails then two virtual drives have failed with it and the array is already at its limit! I thought of this as a way to implement RAID 6 without as much storage loss.
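
The trade-off in that idea can be worked out with a little arithmetic. Below is a minimal sketch in Python, assuming four 4TB drives as an example; `raid6_layout` is a hypothetical helper, and it only models capacity and how many whole physical drives can die, not performance or corruption.

```python
def raid6_layout(physical_drives, drive_tb, slices_per_drive=1):
    """Return (usable_tb, physical_failures_survived) for RAID 6 built on
    equal slices of each physical drive. Illustrative model only."""
    members = physical_drives * slices_per_drive
    member_tb = drive_tb / slices_per_drive
    # RAID 6 spends two members' worth of space on parity.
    usable_tb = (members - 2) * member_tb
    # RAID 6 survives any two member failures, but one dead physical
    # drive takes all of its slices down at the same time.
    physical_failures_survived = 2 // slices_per_drive
    return usable_tb, physical_failures_survived


print("whole drives :", raid6_layout(4, 4))
print("halved drives:", raid6_layout(4, 4, slices_per_drive=2))
```

With these example numbers the halved layout gains back one drive's worth of space but tolerates only one physical drive failure, which is the same capacity and the same tolerance as plain RAID 5 on the four whole drives.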

Another reason why I might want to use software RAID is that it removes the controller as a point of failure. If a hardware RAID controller fails you have to find the same model controller to replace it with or you will have to start from scratch. If a software controller fails, you just have to reinstall it for the most part.

I am aware that non-RAID-optimized drives are not safe to use in a RAID array. The main difference between a WD Red and any other mainstream WD desktop hard drive is the way they handle read errors and/or variable RPM (and other energy-saving features). If a drive in an array keeps rereading the sector where an error occurred in order to correct it (standard procedure on a normal desktop drive) and does this for too long, the drive falls out of the array and the RAID controller rebuilds the array. This hurts the performance of the array and puts it at risk of failing altogether. A NAS-optimized drive, on the other hand, gives up quickly and lets the RAID controller handle the read error. The RAID controller then rewrites the sector in question and the drive remains in the array.

“Also for Raid 5 and 6 they don't recommend you use anything bigger than 2TB because the rebuild time could take a long time and another drive could fail”
This is what I am concerned about. I was hoping to find out how often hardware and non hardware failures occur on average. If it is very often, I would rather deal with losing part of the data than losing all of it! On that note, if I decide not to use a RAID, why put up the extra money for WD Red drives?
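
One way to get a feel for the rebuild risk is the usual spec-sheet arithmetic. The sketch below is in Python and assumes an unrecoverable read error rate of 1 per 10^14 bits read, which is a commonly quoted datasheet figure for consumer drives and NOT a measured failure rate; the function name is made up for illustration. It only covers read errors during a rebuild, not whole-drive failures or file system corruption.

```python
import math


def p_read_error_during_rebuild(surviving_drives, drive_tb, ure_per_bit=1e-14):
    """Chance of at least one unrecoverable read error while reading every
    surviving drive in full. ure_per_bit is an assumed datasheet-style rate."""
    bits_read = surviving_drives * drive_tb * 1e12 * 8
    # 1 - (1 - p)^n, computed with log1p/expm1 so the tiny p is not rounded away
    return -math.expm1(bits_read * math.log1p(-ure_per_bit))


# RAID 5 on four drives: a rebuild has to read the three survivors end to end.
for size_tb in (2, 3, 4):
    chance = p_read_error_during_rebuild(3, size_tb)
    print(f"{size_tb} TB drives: {chance:.0%} on paper")
```

Treat the result as a pessimistic paper figure. What it does show is why the advice about drive size exists: the amount of data that must be read cleanly grows with both the number and the size of the drives.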

CCleaner is a great program and I use it on a regular basis; however, it is, to the best of my knowledge, strictly a Windows program. I am running Linux on this server (specifically an Ubuntu-based distribution). I think I can use GParted for the test you are describing.

P.S. I am running an AMD FM2 quad-core A8-6600K processor, 8 GB of RAM, an SSD boot drive and a 6 Gb/s SATA controller (if any of that means anything for RAID rebuild time).
 
Ah, I gotcha. It's not partitions you plan on using but virtual drives. That makes more sense, because even with software RAID, at least as far as I know with the Windows built-in RAID, you can't just use a partition. It has to be the whole disk.

If you plan on setting up a VHD, then whether the actual drives are in a RAID shouldn't matter.

And I know how it is, being afraid to lose your data. I have 15 years' worth of data: music, pictures, documents, etc. So I always take my backups seriously and have all my family members on backup as well. I manage about 20-25 servers for clients, plus one client who buys servers like crazy from us for video. Most of the time they are SAS drives, which, as you said in your 4th paragraph about non-SAS drives, have fewer issues. I've been running a RAID 0 on my Samsung drives for over a year now without issue, and this sucker is on 24/7. I have other clients running RAID 0 and 1 without issue, but that's not RAID 5 or 6. Most of the servers I deal with are RAID 1, 10, or 5, and yes, we have had clients with dead drives: toss a new one in, give it a day, and everything is back to normal.

Honestly, the main thing here is backup. As long as you have a backup, drive/RAID failure shouldn't be an issue. Like with me, if my RAID 0 goes, oh well. I think I've added a few anime episodes since my last backup, so it's not a big deal.

Also, you'd probably be better off getting some 2TB drives vs. 3-4TB. I hear a lot of bad things about the bigger ones, but that's just me. Again, if you plan on running virtual hard drives in a RAID, then it shouldn't matter whether the physical drives are in a RAID or not. I'd rather keep them separate if I were doing that, if possible. The only reason I don't like VHDs, though, is that if one does crash and you don't have a backup, it's harder to recover lost files from it.
 


I was thinking about the virtual drive idea and realized that I really would not be doing anything but making the array more complicated. I had not fully thought it through when I replied to your post. By the time I had done the virtual drive thing, I might as well have saved myself some time and effort by just putting the physical drives in a RAID 6 instead.

I have been thinking about the whole problem and I think I have an idea for a RAID alternative. I mount the drives as separate, independent drives. Then I use cron jobs to create compressed tarball backups that are distributed across the other three drives, so that each of the four drives holds, in compressed form, one third of the data from each of the other three. It's sort of like RAID 5, except the drives are not combined into one continuous file system, and therefore each drive does not require the others to function. Data loss can still occur, but it requires two or more drives to fail, and even then the loss is small.
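
The scheme above can be sketched roughly as follows in Python, which a cron job could call. Everything named here is an assumption for illustration: the `peer-backups` folder, the function names, and the choice to split each drive's top-level entries round-robin by name (which does not balance the thirds by size).

```python
import os
import tarfile

BACKUP_DIR = "peer-backups"  # hypothetical folder for copies held for other drives


def split_entries(source, shares):
    """Deal the top-level entries of a drive into `shares` groups, skipping
    the folder that holds other drives' backups so copies never nest."""
    entries = sorted(name for name in os.listdir(source) if name != BACKUP_DIR)
    return [entries[k::shares] for k in range(shares)]


def write_share(source, entries, destination, share):
    """Write one compressed tarball of `entries` from `source` onto `destination`."""
    out_dir = os.path.join(destination, BACKUP_DIR)
    os.makedirs(out_dir, exist_ok=True)
    label = os.path.basename(os.path.normpath(source))
    out_path = os.path.join(out_dir, f"{label}-part{share}.tar.gz")
    with tarfile.open(out_path, "w:gz") as tar:
        for name in entries:
            tar.add(os.path.join(source, name), arcname=name)
    return out_path


def backup_all(mounts):
    """For every drive, put one share of its data on each of the other drives."""
    written = []
    for i, source in enumerate(mounts):
        others = mounts[i + 1:] + mounts[:i]
        groups = split_entries(source, len(others))
        for share, (destination, entries) in enumerate(zip(others, groups)):
            written.append(write_share(source, entries, destination, share))
    return written


# Example call with hypothetical mount points:
# backup_all(["/mnt/disk1", "/mnt/disk2", "/mnt/disk3", "/mnt/disk4"])
```

One thing the arithmetic suggests: each drive ends up holding a third of three other drives, so before compression about half of the raw space goes to copies, compared with a quarter for parity in a four-drive RAID 5. Whether that is worth it depends on how well the data compresses.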

If I try this, is there any benefit to buying WD Red drives over WD Green drives that I may not be aware of? I know that WD Reds are optimized to function in a RAID in close quarters with other drives, that they give off less heat than a conventional desktop hard drive, and that they are supposed to produce less vibration. Except for the RAID optimization, WD Green drives appear to have the same, or very similar, specifications.

As this server is supposed to hold backups of data and/or disk images from other computers, movies (for viewing convenience), and some non-critical data, I am not so much concerned about losing data as I am about having to rebuild the backups of it. Some of the relatives I mentioned earlier are currently in another country with poor internet infrastructure, so it would be frustrating to have to rebuild their backups from scratch. I realize that drive failures and data corruption are things that will inevitably happen. I just want to minimize the loss of data on the server when they do. That being said, it really does suck to lose data! Most of the computers on my home network do not have backups, which is part of the reason for building this server.