disk array to fit my needs

bkrebs

Prominent
Jun 16, 2017
8
0
510
hi everybody. i have a bunch of files that i'd like to keep on a raid array, or something of the sort. the only issue is that i barely know anything about raid, and i've never done it before. here are the requirements i would prefer to meet:

- works with mac
- can use any size disk
- allow hot-swapping disks (if one disk goes bad, i can swap it out and still be fine without having to power off)
- supports mirroring with error checking via checksums (MD5, SHA1, SHA256, etc.)
- support for a large number of disks

any help is appreciated
 
Solution
You also might consider just getting a regular NAS box, instead of rolling your own.

I have a Qnap TS-453A, and it serves that function admirably.
Qnap and Synology are two good brands.

lfkfkfkffs

Admirable
It depends mostly on your budget and total storage needs. You could do something as simple as a 2-bay NAS with two 10TB IronWolf drives, which come with a free data recovery plan and replacement drive. Or you can get as crazy as buying a bunch more drives and doing something like RAID 5 or RAID 10.
 

bkrebs

i honestly don't know how much i'll need. it will eventually expand up to 10tb, i would say maybe by the end of the year. money is not an issue, but preferably i'll take the cheaper option :)
 

Paperdoc

Polypheme
Ambassador
My guess at first glance is that you will be buying a separate external box containing its own RAID controller and a way to connect to your computer with some fast interface. You MIGHT want it to connect only to a single computer, in which case you need to know what interface options are available on it. Or, you MIGHT want simply to have this new box be another device on a wired local network that can be accessed by several computers.

From your brief description, I suspect RAID5 or RAID6 is what you need. The difference there is in numbers of drives and failure recovery. A RAID5 array needs at least three drives; five is a common configuration. Roughly speaking, one drive's worth of space holds the data verification (parity) information, and the other four hold the actual data, so you get to use 80% of the total HDD space. (In practice the parity is spread across all the drives rather than kept on a single one.)

In the event of failure of a single drive, a good system will alert you that this has happened and keep running, albeit at reduced performance. Using its management tools (AND the MANUAL!) you can identify the failed unit, remove it, replace it with a new drive unit (ideally identical to the original) and tell the system to reconstruct all the data "lost" on the failed unit until the entire array is re-established. This can take a lot of time, but while it is being done you are still running, just at reduced speed. Whether you can do the removal and replacement of the failed unit "Hot" (not shutting down) or will need a short shut-down to do that depends on the features of the box you buy and its RAID management system.

A RAID5 system can recover from the loss of ONE drive unit, but if a second unit fails before you have replaced the first and completely restored the system, the array itself cannot be recovered. In that situation you MUST rebuild the RAID5 system hardware and then do a complete restoration of its data from a recent BACKUP.
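To make the single-drive recovery idea concrete, here is a minimal Python sketch of XOR parity, the basic trick behind RAID5. It is a toy illustration only: the four "drives" are just short byte strings, the names (`xor_blocks`, `data`, `lost_index`) are made up for the example, and a real controller also rotates the parity across drives and works on much larger stripes.

```python
from functools import reduce

def xor_blocks(blocks):
    """XOR equal-length byte blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))

# Four hypothetical data blocks, one per data drive (toy 4-byte "stripes").
data = [b"AAAA", b"BBBB", b"CCCC", b"DDDD"]

# The parity block is the XOR of all the data blocks.
parity = xor_blocks(data)

# Simulate losing one drive, then rebuild it from the survivors plus parity.
lost_index = 2
survivors = [blk for i, blk in enumerate(data) if i != lost_index]
rebuilt = xor_blocks(survivors + [parity])

print(rebuilt == data[lost_index])
```

Because XOR-ing everything that survives (including the parity) cancels out all the known blocks, what is left is exactly the missing block. With two blocks missing there is not enough information left, which is why a second failure during a rebuild is fatal for RAID5.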

A RAID6 system is similar in many ways, but different in one important one. In such a system you use one MORE drive unit than a RAID5 system, and two drives' worth of space is used for data integrity and restoration information, while the rest holds the data. Thus with six drives you use 67% of your total capacity. The big plus of RAID6 is that it CAN recover from the failure of TWO drive units at the same time.
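The 80% and 67% figures come from simple arithmetic, sketched below in Python. This assumes all drives are the same size, and `usable_fraction` is just a hypothetical helper name for the example, not part of any RAID tool.

```python
def usable_fraction(num_drives, parity_drives):
    """Fraction of raw capacity left for data, assuming equal-sized drives."""
    if num_drives <= parity_drives:
        raise ValueError("need more drives than parity drives")
    return (num_drives - parity_drives) / num_drives

# RAID5: one drive's worth of parity; RAID6: two drives' worth.
raid5 = usable_fraction(5, 1)
raid6 = usable_fraction(6, 2)

print(f"RAID5 with 5 drives: {raid5:.0%} usable")
print(f"RAID6 with 6 drives: {raid6:.0%} usable")
```

The same function shows how the overhead shrinks as you add drives: an eight-drive RAID6 would leave 6/8 = 75% of the raw space for data.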

A final reminder: no RAID system is a replacement for a good backup system that IS used regularly. Things Happen! I worked at a large manufacturing business that experienced this. Their main server ran a RAID5 system and had a good backup system with TWO backups made regularly: one stored on site as tapes, and one stored on a server in a different city. The RAID5 system failed. There was a delay because the local office of the hardware supplier did not have a matching HDD on site and had to fly one in. AFTER that was installed and they began restoration actions, they discovered that a SECOND drive also had failed. (I still don't understand how two failed without notice!) So with further delay they replaced that, and began to restore data from their backups. The system was down until both drive units had been replaced and it could be re-started. It ran slow for a while as the data restoration was done, and of course at first not all the data was available as the restoration proceeded. But everything was fully back to normal in just over 3 days.

Bottom line: make SURE your plans include the backup system you DO need!

Managing a RAID array and handling troubles with it requires that you know a fair amount. That means you'll have to learn it. IF you think that my suggestion of RAID5 or 6 is in the right direction, I suggest you start reading and learning, concentrating on those types of RAID. You won't need to know all the details up front. But as you do learn the essentials, you will begin to understand how to answer your own questions about what system to choose, and then narrow that down to exactly which pieces of hardware you need.
 

bkrebs

wonderful responses, guys! i think i'll stick with raid 6 (i'm a very paranoid person, and i've already lost some data from file corruption)

have any hardware suggestions? i plan on having this as a box that will be connected to my router and accessible throughout my house. ideally i could later buy another one, link it to the first one, and easily expand as my file collection grows. another perk would be decent speeds, as i would use it for everything (music storage, video storage, documents, game backups, etc.) just not ALL at once, though ;)

oh, and lastly, would you say a ZFS formatted array would be best? or can i just stick with HFS+ or APFS (once it comes out)?