[SOLVED] 2x NVMe in RAID 1 on a Consumer Motherboard

AndrewJacksonZA

Distinguished
Aug 11, 2011
576
93
19,060
Hello

(I have already read through this old post from four years ago: RAID 1 PCIe M.2 NVMe Card?)

We're trying out Google's approach of buying a consumer-grade machine, in this case as a replacement for a ten-year-old Xeon server. For software licensing reasons we need the maximum possible grunt from an 8-core CPU, and the Xeons and EPYCs with 8 cores just clock too low.

We're going to be running Windows Server 2019 on an i9-9900K on an MSI MAG Z390 Tomahawk with 128GB of RAM (4x 32GB of Corsair Vengeance LPX 2666 C16), clocked at 4.9 or 5GHz.

I've specified 1x Samsung 860 EVO SATA as the OS drive, and 2x Samsung 970 Evo Plus 1TB NVMe drives in RAID 1 as defined in the BIOS (not Windows) as the data drive. However, I'm having second thoughts and have some questions regarding how RAID might be handled on a consumer board:
  1. If one of the NVMe drives fails, should the one remaining drive carry on until we can replace the dead one, given that the RAID 1 is set up in the BIOS? On server hardware, if a drive dies, one walks into the server room, pulls the drive from the server's chassis, plugs in a new one, and the drive backplane takes care of the rest, all without downtime. I'm not yet sure how to handle this on a consumer motherboard.
  2. We're migrating from 5400RPM HDDs in RAID 1, so ANY boost in performance would be VERY welcome. Given that, for more "server"-like RAID behaviour in terms of array reconstruction, would an Intel RS3WC080 RAID card with SATA SSDs in RAID 5 be a better choice? In terms of TB written, it appears that about 70TB/year might be all we write, and drives like the Samsung 860 EVO 1TB SATA are rated for 600TBW. The MSI MAG Z390 Tomahawk's specs say that it supports RAID 5 over SATA, so I'm not sure whether the Intel RAID controller add-in card would just be a waste of money.
In case this matters, we're going to be running the CPU at full tilt every 30 minutes for about 10 minutes between 05:00 - 22:00, and writing to the drives at the same time. Reading from the drives will typically occur during those same intensive 10-minute slots, with less intensive reads scattered throughout the day.

Thank you
Andrew

EDIT: Added typical workloads to the end of the post.
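
For anyone wanting to check the endurance figures quoted above, here is a minimal sketch of the arithmetic. It uses only the two numbers from the post (600TBW rating, roughly 70TB written per year). The `write_amplification` parameter is a hypothetical knob, not something from the post, for modelling cases where a drive sees more writes than the host issues. In a two-drive RAID 1 each member receives the full write stream, so the per-drive figure is the same 70TB/year.

```python
# Rough endurance estimate; both constants are the figures quoted in the post.
RATED_TBW = 600           # Samsung 860 EVO 1TB rating as quoted
WRITES_TB_PER_YEAR = 70   # estimated host writes per year as quoted


def years_to_rated_tbw(rated_tbw, tb_per_year, write_amplification=1.0):
    """Years until the rated TBW is reached at a constant write rate.

    write_amplification is a hypothetical multiplier (1.0 = the drive
    writes exactly what the host sends).
    """
    if tb_per_year <= 0 or write_amplification <= 0:
        raise ValueError("write rate and amplification must be positive")
    return rated_tbw / (tb_per_year * write_amplification)


estimate = years_to_rated_tbw(RATED_TBW, WRITES_TB_PER_YEAR)
print(f"Rated TBW reached after about {estimate:.1f} years")
```

At the quoted numbers this works out to a little over eight and a half years, so rated write endurance is unlikely to be the limiting factor under these assumptions.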
 
USAFRet

Titan
Moderator
RAID 1 is for uptime, not data protection.
Any RAID 1 needs to be supplemented with an actual backup routine.

In theory, a RAID 1 will run in a degraded state if one member dies. It should notify you of this.
When you can schedule some downtime, that is when you replace the failed drive and let the system rebuild the array.

Do you have a need ($$ affected) for this continual uptime?
What is the use case for the data drive?
And what is your actual backup situation?
 
Solution

AndrewJacksonZA

Hi USAFRet, and thanks for the reply.

We literally don't have the budget to pursue more expensive "proper" options right now.

The use case for the data drive is writing perhaps 5GB of data every 30 minutes to the drive between 05:00 - 22:00 every day. Reading from the drives will typically occur during those same periods, and then random reads scattered throughout the day.

Backups are currently happening twice a week, and that should be increasing to every second day with the upgraded machine.
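
As a sanity check on the 70TB/year estimate from the first post, the workload described here can be turned into a yearly figure. This is only a sketch under stated assumptions: exactly 5GB per run, one run per 30-minute interval across the 05:00 - 22:00 window, every day of the year, and decimal units (1TB = 1000GB). The function names are made up for illustration.

```python
from datetime import time


def writes_per_day(start, end, interval_minutes):
    """Number of whole intervals that fit between start and end on one day."""
    window = (end.hour * 60 + end.minute) - (start.hour * 60 + start.minute)
    if window <= 0 or interval_minutes <= 0:
        raise ValueError("window and interval must be positive")
    return window // interval_minutes


def yearly_write_tb(gb_per_write, runs_per_day, days_per_year=365):
    """Yearly write volume in decimal TB."""
    return gb_per_write * runs_per_day * days_per_year / 1000


GB_PER_WRITE = 5  # "perhaps 5GB" per run, as stated in the post

runs = writes_per_day(time(5, 0), time(22, 0), 30)
daily_gb = GB_PER_WRITE * runs
yearly_tb = yearly_write_tb(GB_PER_WRITE, runs)
print(f"{runs} runs/day, {daily_gb} GB/day, about {yearly_tb:.0f} TB/year")
```

That gives 34 runs and 170GB per day, or roughly 62TB per year, which is in the same range as the 70TB/year estimate.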
 

USAFRet

Titan
Moderator
AndrewJacksonZA said:
We're running Tableau Server, and I want it so that if a drive dies, the entire machine doesn't have to be reloaded: we can wait until a replacement drive arrives, remove the defective one, and slot in the new one.
It will probably work.

But there is zero reason incremental or differential backups can't happen every night, or even 2-3 times a day.
Twice a week is not enough.
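
To illustrate what "incremental" means here, below is a minimal sketch that copies only the files modified since the previous backup run. It is not a recommendation for how to back up Tableau Server specifically, and a real backup tool also has to deal with open files, deletions, and verification. The function name `incremental_copy` and the folder names are hypothetical, and the sketch assumes the destination is not inside the source folder.

```python
import os
import shutil
import tempfile
from pathlib import Path


def incremental_copy(source, dest, since):
    """Copy files under source modified after `since` (epoch seconds) to dest.

    Relative paths are preserved. Returns the list of copied target paths.
    """
    copied = []
    for path in sorted(source.rglob("*")):
        if path.is_file() and path.stat().st_mtime > since:
            target = dest / path.relative_to(source)
            target.parent.mkdir(parents=True, exist_ok=True)
            shutil.copy2(path, target)  # copy2 keeps the modification time
            copied.append(target)
    return copied


if __name__ == "__main__":
    # Small demo in a throwaway directory with made-up timestamps.
    with tempfile.TemporaryDirectory() as tmp:
        src, dst = Path(tmp) / "data", Path(tmp) / "backup"
        src.mkdir()
        old_file, new_file = src / "old.txt", src / "new.txt"
        old_file.write_text("unchanged since last backup")
        new_file.write_text("changed since last backup")
        os.utime(old_file, (1_600_000_000, 1_600_000_000))
        os.utime(new_file, (1_700_000_000, 1_700_000_000))
        last_backup = 1_650_000_000
        print([p.name for p in incremental_copy(src, dst, last_backup)])
```

Because only changed files are copied, a run like this is usually far quicker than a full backup, which is what makes nightly or several-times-daily runs practical.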