Help with DAS/NAS DIY


BLACKBERREST3

Hello, I am looking to build a DAS/NAS. I have been researching the best solutions for performance and it has left me with many questions. I would like to use existing hardware if possible, which consists of: i7 6700k, 64GB DDR4 (non-ECC) Corsair RAM, Z170 Deluxe (20 lanes altogether). I will use SyncBackPro for scheduled backups. I want insane read/write speeds on the main work portion (around 8 terabytes) and 2 high-speed redundancies (I'll add more as I need it). I am looking towards using software RAID and either FreeNAS or Windows Server 2016. I also need my data to be byte-perfect with no degradation over time to preserve precious data. I plan to use this as a personal DAS/NAS, not something that would need to run all the time.

My questions are:
1. ZFS or ReFS, suggestions?
2. Can I use RAM as a non-volatile super cache or lazy read-writer if it is always powered and I keep redundancies to prevent data loss?
3. What is the best setup for performance that also lets me add more storage easily if I need it: RAID 0 SSDs + RAID 10 HDDs, tiering, or something else?
4. What SSD/HDD combos do you recommend? I am leaning towards Seagate for HDDs.
5. If a RAID array fails, does that mean I must replace only the drive that failed, or all drives because the failure damaged them somehow (only talking about hardware, not data)?
6. What is the best way to connect to this DAS/NAS: direct PCIe PC-to-PC, 40/100GbE, or something else?
7. How would I set up a 40/100GbE connection and what would I need?
8. Is there anything else I should know or consider for this?
 
Solution
PCIe Gen 4 is due next year. Gen 5 is due a couple of years later. 5 years from now, you won't be looking at a build like this at all. ThreadRipper is due in weeks. I wouldn't go for a build like this even then.

Regarding the workstation+storage server build, you don't need a crazy network. Just allocate half the drives to the storage server for backup purposes. Unless you need twice-daily backups, a simple gigabit network would suffice. All you'd need is a half decent switch that won't destroy the rest of the network while the backup is running, and a second port for RDA.
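
For a rough sense of the backup window over gigabit (a quick sketch; the ~112 MB/s usable rate and the 500 GB daily delta are assumptions, the 8 TB is the working set mentioned in the thread):

[code]
# Back-of-the-envelope backup window over gigabit Ethernet.
# ~112 MB/s is a realistic sustained rate on 1 GbE; the 500 GB daily
# delta is a made-up example size, not a figure from this thread.

GBE_MBPS = 112

def hours(size_gb, throughput_mbps=GBE_MBPS):
    """Transfer time in hours for size_gb gigabytes."""
    return size_gb * 1024 / throughput_mbps / 3600

print(f"Full 8 TB backup  : {hours(8 * 1024):.1f} h")   # ~20.8 h, one-off
print(f"Daily 500 GB delta: {hours(500):.1f} h")        # ~1.3 h overnight
[/code]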

Regarding "no bottleneck vs depends on networking", that's fairly naive. There's always a bottleneck. It's almost always a function of the workload, and usually best...
So which drive configuration would be the best for my situation? I thought I mentioned what I wanted to do with it already. I need more storage, plain and simple. I would also like it to act as a NAS so I can access these files remotely.
 


SSD(s) might speed up the file system overhead. But a 500MB/s SSD is still less than the combined bandwidth of the RAID6. Your sequential bandwidth will, at most, be about equal to a 10GbE link. Your 40GbE statements just show that you haven't done the basic research that was recommended in @Ralston18's post (that you voted down).
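
To put rough numbers on that (a sketch only; the 8 data spindles at ~200 MB/s each and ~95% usable network throughput are assumed figures):

[code]
# Order-of-magnitude comparison: RAID 6 sequential bandwidth vs. network links.

data_drives = 8
mb_per_drive = 200                      # assumed sustained sequential MB/s

array_mbps = data_drives * mb_per_drive        # ~1600 MB/s
ten_gbe    = 10_000 / 8 * 0.95                 # ~1188 MB/s usable
forty_gbe  = 40_000 / 8 * 0.95                 # ~4750 MB/s usable

print(f"RAID 6 sequential: ~{array_mbps} MB/s")
print(f"10 GbE usable    : ~{ten_gbe:.0f} MB/s  (roughly on par with the array)")
print(f"40 GbE usable    : ~{forty_gbe:.0f} MB/s (far beyond what the array can feed)")
[/code]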
 
I have a couple of comments/questions...

1) Can I ask why you need the "server" to be separate from your workstation if you're the only one ever accessing it? The DAS/NAS/SAN requirements add significant complexity and overhead compared to working with local storage. If your server needs to be physically secured, or there are other issues, you might be better off looking at a thin client/remote desktop type solution such that your IO is all contained within a single system. It's still going to be complicated - particularly if you want redundancy - but at least that would allow you to consider more off-the-shelf hardware.
I'm fully aware there may be complications in getting your software to run locally on the storage device. But I'm suggesting that, given you have Windows, FreeBSD, Linux, and even hypervisor options for the storage server, you may well find it easier and cheaper to overcome those problems than it is to solve the 40Gbps+ data transfer requirement you hit as soon as you separate your "server" and "workstation". A thin client setup would still allow you to work remotely if required.

2) I'm struggling to understand why you're so attached to the 6700K machine you already have. You've told us you need 100 TB+ of storage as well as 8 TB of as-fast-as-possible storage. You're looking at $10K as an absolute minimum just for the storage. There's no getting around that. So if you're already looking at that cost, surely it makes sense to consider alternative hardware too if that's going to simplify things, or even make it achievable?

3) Unless I misunderstand you, you have very high data integrity requirements over a long period of time. I think ECC RAM is absolutely fundamental to that requirement. I'll leave it to others to address this. But my understanding is that you simply can't have confidence in data integrity without ECC RAM.

It seems like @thenerd is able to give you some solid advice here. So I wish you well!
 


The only configuration that could be recommended is a RAID6. Ten 8 TB Red Pro drives in an 8+2 configuration would provide the 50+ TB of space you require. You will need a dedicated RAID controller card to support this.
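
As a quick capacity check (a trivial sketch; the drive size is the 8 TB Red Pro above):

[code]
# Usable capacity of a RAID 6 array: two drives' worth of space goes to parity.

def raid6_usable_tb(drives: int, drive_tb: float) -> float:
    return (drives - 2) * drive_tb

print(raid6_usable_tb(10, 8))   # 64 TB usable from ten 8 TB drives (8+2)
[/code]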
 
You guys have given me a lot to think about. I was naive at first and downvoted when I shouldn't have and I do regret that. I apologized to ralston18. Thank you guys for taking the time out of your day to help me, I really appreciate that.
 
I should state that, so far, my answers have been made assuming you actually need the kinds of speeds you're talking about. As others have pointed out, we still haven't established that.

If you've never put a storage server together, it is quite unlikely that your intended use for this system will actually need the performance that you've been asking for. Not only are the performance figures you've been quoting quite unrealistic in the situations you're referring to, it's not at all easy to get those performance figures even from six-figure servers. Normally it requires intimate knowledge of the workload, extensive pre-purchase testing, and custom software. Sometimes it even requires custom hardware.

I completely agree that your expectations should be recalibrated regarding what can and cannot be done with conventional hardware.

I also completely agree that your questions seem more like you've not done your homework than like you're trying to actually pull off the performance you've asked for.

I'm here to help you get what you want out of the system. I was there once. I made quite a few rookie mistakes. I'm not trying to get in arguments, and I'm happy to answer rookie questions. I admit that some of your responses have been quite frustrating, as you don't seem to want to listen to any advice that contradicts your understanding of how things should work.

@kanewolf - The WD Red Pros are okay, but in performance-sensitive situations or high-reliability servers, I strongly favor the WD Golds or REs. They've got a long history as extremely reliable drives with solid performance. They're also better in random workloads. I'd also recommend RAID 6E over RAID 6. It's one more drive, and it can make maintenance a bit easier: rebuilds start as soon as a drive goes down, and once the new drive comes in, you just swap out the bad one and you're off.
 


We're here to help. We know the lay of the land.

I'd be happy to walk you through a serious storage system that will give you whatever performance you want out of it. If you want the best for your money, I can do that too.
 
@the nerd 389, I just used the Red Pro as an easy to research example. The Gold may be better, I have not personally used them. RAID6 vs 6E is probably a personal thing. Purchasing a shelf spare and not putting hours on the spindle vs having rebuild start unattended. We could argue that one for days. I have some very smart engineer friends that would say that ZFS or BTRFS would be better than a dedicated RAID controller. Again, flame wars have been fought over less :) There are multiple solutions to this and each one has trade-offs.
 


Fair enough. I'm not exactly a fan of flame wars, and didn't intend to start one. I just thought I'd throw in my two cents is all.
 
Here is what I have so far: my next build will have 40-80 lanes (dual Xeons), ECC RAM, PCIe NVMe SSDs in RAID 0 (expensive, no doubt), and 2x Seagate SSHDs in RAID 0 (since I already have them) to match the NVMe SSDs, with enough memory bandwidth to not worry about even coming close to a bottleneck. My server case is going to be a Lian Li PC-D8000, which will house the Z170/6700K and 10-20 Seagate Archive HDDs (ST8000AS0002) that are rated for 800,000 hours, or 91.32 years. The memory bandwidth for the server is 34.128GB/s, and I don't know what kind of overhead that involves for each type of RAID setup, but I don't think 20 drives rated for a max of 190MB/s per drive (3800MB/s total) would come close to it, and hopefully my CPU can keep up too. I like the idea of RAID 6 in an 8+2 configuration because that would give me 2 parities, but I need to do a little more research on RAID from the enterprise side of things because I also need to know how to scale in the future. My questions this time are:

Is there such a thing as a passive RAID card, or just a plain PCIe x16 (or 2x PCIe x8) SATA 6Gb/s card?
If I were to use solely software RAID with such a passive card (if it exists), does anyone have experience with how much of a performance hit this puts on the CPU, using Windows Server 2016's built-in RAID solution or any software RAID for that matter?
Is there an article I could read to fill in the gaps on scaling, RAID configurations, and more?

I read something in another article that talked about different read/write ratios for different RAID configurations. The one I am going after would be an 80/20 read/write ratio for the cold storage. If I got something horribly wrong, or if you have already told me this before, please go easy on me :)
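
A quick sanity check of the numbers I quoted above (just arithmetic on the figures already mentioned, nothing new):

[code]
# Sanity check on the figures quoted above.

drives = 20
mb_per_drive = 190            # rated sequential MB/s of the ST8000AS0002
mem_bw_gbps = 34.128          # dual-channel DDR4 bandwidth from the post
mtbf_hours = 800_000          # rated MTBF of the drives

array_gbps = drives * mb_per_drive / 1000
print(f"20-drive aggregate: {array_gbps:.1f} GB/s")       # 3.8 GB/s
print(f"Memory bandwidth  : {mem_bw_gbps} GB/s")          # ~9x headroom over the drives
print(f"MTBF in years     : {mtbf_hours / 8760:.2f}")     # 91.32 years
[/code]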
 
I keep hearing great things about software RAID, but my searching only turns up hardware RAID cards. I don't know where to look.
edit: Finally found it. It's called an HBA card, or a RAID card flashed to IT mode.

I'm going to be looking for good candidates, like a PCIe 3.0 x16 card with 20 SATA ports or 2x PCIe 3.0 x8 cards with 10 each; let me know if anyone has suggestions while I am looking.
 


Anything more than x8 is almost impossible to make use of. The drives you'd use with it simply aren't fast enough to leverage 16 lanes, even with 20 drives.
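
A rough lane-budget check illustrates that (a sketch; ~985 MB/s per PCIe 3.0 lane is the usual post-encoding figure, and 190 MB/s per drive matches the Archive drives discussed above):

[code]
# PCIe 3.0 lane budget vs. 20 spinning drives.
# ~985 MB/s of usable bandwidth per Gen3 lane after 128b/130b encoding.

lane_mbps = 985
needed_mbps = 20 * 190                 # 3800 MB/s aggregate from the HDDs

for lanes in (8, 16):
    available = lanes * lane_mbps
    print(f"x{lanes}: {available} MB/s available vs {needed_mbps} MB/s needed")
# x8 already leaves ~2x headroom, so x16 buys nothing for spinning disks.
[/code]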

That said, I'd stick with either Adaptec or LSI RAID controllers. They're mature brands that have proven to be incredibly reliable.

You can get expansion cards if the standard 8xSATA/SAS connections aren't enough. You can expand them out to about 128 drives if needed. [strike]I'll fetch some links for you and update this post with them.[/strike]

Here's a solid option for a high-performance RAID card with 16 internal ports. Note that the card itself can deliver 6.2 GB/s sequential writes (the spec page has a typo and lists 5.2; the manufacturer's product page lists 6.2). If you need more than that, you're better off with a non-RAID approach.
https://www.newegg.com/Product/Product.aspx?Item=N82E16816103258

Here's an expander that adds 28 internal ports.
https://www.newegg.com/Product/Product.aspx?Item=09Z-01XU-00002&cm_re=82885T-_-09Z-01XU-00002-_-Product
 
Actually, that was one of the first questions I had. I just didn't know how to ask it. As far as I know there are lots of RAID setups that focus on different aspects, like different read/write ratios and IOPS/latency. I would like to have a read>write setup with a fault tolerance of at least 2, but I have no idea how to set that up. That's why I was hoping to find an article explaining scalability and long-term data management. Also, I found an HBA (LSI 9305-24i) instead of a RAID chip. It's less expensive than the RAID card, but I don't have any experience with software RAID, so I don't know if my CPU can keep up. If I believe hard enough, my CPU can do it...

Just to clarify: the main PC will have some type of NVMe SSD RAID 0, with 2x 4TB SSHDs in RAID 0 to match it, so that takes care of the main work space and temporary redundancy with decent speeds. It's the server that will have slow write and high read speeds for cold storage. I know we all like numbers and hard data, but sometimes it's all relative.
 
All forms of RAID will cost you some IOPS/latency, generally. Regarding "read/write ratios", I'll assume you mean read and write performance in sequential workloads. This is a direct function of how many non-redundant disks are in the array. For instance, RAID 5 with 4 disks has 1 disk of redundancy, and 3 disks of space. The 3 disks of space can work together and combine performance in sequential workloads. In other workloads, you're going to be limited by the slowest disk in the array, plus any processing overhead. Basically, RAID will widen the gap between random and sequential performance considerably.
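
As a concrete illustration of how the data-disk count sets the best-case sequential numbers (a sketch; 200 MB/s per disk is just an assumed round number):

[code]
# Best-case sequential throughput scales with the number of non-redundant
# (data) disks in the array.

PER_DISK_MBPS = 200

def data_disks(level: str, n: int) -> int:
    if level == "raid5":
        return n - 1          # one disk of parity
    if level == "raid6":
        return n - 2          # two disks of parity
    if level == "raid10":
        return n // 2         # half the disks are mirrors
    raise ValueError(level)

for level in ("raid5", "raid6", "raid10"):
    d = data_disks(level, 10)
    print(f"{level}, 10 drives: {d} data disks, ~{d * PER_DISK_MBPS} MB/s sequential")
[/code]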

If you want a fault tolerance of 2, your best bet is RAID 6. It will allow you to lose any two drives without loss of data. RAID 10 will let you lose any one disk, and it will let you lose two disks if the disks are not in RAID 1 with each other.
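
To put a number on that for a 10-drive RAID 10 (a small counting sketch, nothing more):

[code]
from itertools import combinations

# In a 10-drive RAID 10 (five mirrored pairs), a second failure is only
# fatal if it hits the first failure's mirror partner.

pairs = [{i, i + 1} for i in range(0, 10, 2)]      # the five mirror pairs
fatal = sum(1 for combo in combinations(range(10), 2) if set(combo) in pairs)
total = len(list(combinations(range(10), 2)))      # 45 possible two-drive failures

print(f"{fatal}/{total} two-drive failures kill the array (~{fatal/total:.0%}); "
      f"RAID 6 survives all {total}.")
[/code]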

Regarding NVMe drives in RAID, be aware that the processing overhead is considerably greater, and performance consistency of the drives plays a vital role in overall performance. NVMe is designed to improve SSD performance in random workloads, and RAID will eat into this advantage. In sequential workloads, you will be bound by the slowest drive in the array on a per-write basis. If the model of drive you use isn't very consistent, then you will take a severe performance hit when you put it in a RAID array. TLC drives are horrible choices for RAID, as they tend to be inconsistent at best. Only true enterprise SSDs will be suitable for use in a RAID array (with the possible exception of the Intel 750).

Your CPU can handle the work required to run two 10-drive HDD arrays at full speed in certain conditions. If you have any CPU-bound workloads, they will see reduced performance with software RAID. If you have workloads that involve random read or write conditions, the CPU will introduce more latency than hardware RAID. Software RAID is also vulnerable to instabilities in the CPU, RAM, and software.

I know you've requested we not talk about using RAID as a backup system. I will not say that your plan isn't sufficient for your needs, as I don't know precisely what your needs are. I will simply lay out the differences between a backup system and a RAID setup. You should decide if RAID is sufficient only after you completely understand what it can and cannot protect you from.

RAID: This is a redundancy scheme, and as such, is not intended to handle disaster situations. The goal of RAID is to protect the system from hard drive failure and reduce the amount of downtime due to system maintenance. The protective mechanisms of RAID stop there. This will not protect against a failing power supply, failing fan, failing software, fires, power surges, weather-related disasters, etc.

Backup: This is a disaster recovery scheme, and has the sole purpose of mitigating damage that results from component-, system-, facility-, city-, or region-scale disasters. Neither a broken air conditioner (facility-scale) nor a massive storm (city/region-scale) will put a properly designed backup system at risk.

Any server or storage system you build will be no more reliable than the power system, cooling system, software, or any other mechanism that would be a single point of failure.

Lastly, it sounds like the SSDs you plan to use will be subjected to extremely write-intensive workloads. This is a problem for most consumer SSDs. At full speed, you can eat through their endurance ratings in a matter of weeks. If you do a back-of-the-envelope calculation using the published specs of the 1 TB 960 Evo, it will last 2.4 days at the rated write speed. That's assuming it would be able to pull off 1900 MB/s throughout that time period, which is simply not the case in the real world. It would choke up a few minutes in and drop to something closer to 200-500 MB/s in sustained sequential write conditions. That would allow the drive to survive 9 to 23 days of sustained punishment.
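
That back-of-the-envelope calculation looks roughly like this (a sketch; the 400 TBW endurance rating is from the published 1 TB 960 Evo spec and should be treated as an assumption here):

[code]
# Days of continuous writing before a 1 TB 960 Evo's endurance rating is
# exhausted, at the write speeds discussed above.

TBW = 400                      # assumed rated endurance, terabytes written

def days_at(write_mbps: float) -> float:
    return TBW * 1e12 / (write_mbps * 1e6) / 86_400

print(f"At 1900 MB/s (spec sheet): {days_at(1900):.1f} days")   # ~2.4 days
print(f"At  500 MB/s (sustained) : {days_at(500):.1f} days")    # ~9 days
print(f"At  200 MB/s (sustained) : {days_at(200):.1f} days")    # ~23 days
[/code]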
 
In addition to what RAID of any type won't save you from...
Accidental file deletion, corruption, ransomware, other malware and viruses.

An accidental deletion of a file in a RAID array simply means it happens across multiple drives at the same moment.

This is what actual backups are for.
 


From what I understand, he/she's planning to keep one backup within the machine. That will protect against accidental deletion and corruption. It will not protect against the rest of it.
 
That was something I forgot to mention. I did learn about the difference between backups and redundancy, though not stated as straightforwardly as tn389 put it. I was debating having a backup on the same machine (10 drives for storage and 10 for backup), even though it doesn't protect me from local disasters. I think I will risk it for now until my collection gets bigger; then I will find an offsite location, but that won't be for a while. If I only use the CPU to handle RAID and nothing else, then it should be fine. It's the client devices that will work the files. The server is for sequential storage more than random read/writes. I only need to copy files to and from the server. I didn't know that SSDs did not play well with RAID. Most PCIe SSDs that I find are only 1-2 terabytes, and I need 8TB at once. I am after reduced latency and high IOPS, which you can only get with solid state. Would a JBOD be better than RAID for this? Also, forget what I told you not to tell me before; I was naive back then.
 
