RAID 1 failure, retrieve the data?

Status
Not open for further replies.
Hello,
I'm in a bit of a bind here and need some assistance. I have a small file server that I set up running RAID 1 via the motherboard controller. System Specs:
Mobo: ASUS M5A99FX Pro R2.0
CPU: FX-8320
GPU: ASUS EAH6450 Silent
RAM: G.Skill 16GB (4x4GB) 1600MHz - F3-14900CL9Q-16GBBXL
HDD: 2x WD 2TB WD2000F9YZ R (in RAID 1)
PSU: Corsair CX750M (I know it's not great but fit the budget and isn't really stressed)
Windows 7 Ultimate
System has been running flawlessly since November 2013...until today.

Here's what happened:
The system froze, and pressing Ctrl+Alt+Delete made it reboot. Instead of starting, it gave the error "Reboot and select proper boot device". The SATA setting in the BIOS is RAID (ports 1-4; port 5 is IDE), and a RAID 1 array is/was set up using the onboard controller. The BIOS startup sequence does not show any SATA drives other than the DVD drive on SATA 5. The 2 HDDs are on SATA 1 and 2 (I tried them on SATA 3 & 4, no change). In the RAID software both drives are shown; however, one shows as a "single drive" and the other as being in RAID 1, with the second drive in the array listed as "failed or disconnected". I connected the drives one at a time to SATA 5 and the motherboard did detect them, but it could not boot from them, giving the same error as before. With everything connected as before, I booted from the Windows DVD, but Windows setup could not read the RAID array or detect the HDDs.
There were no hardware changes and no recent updates.

I'm thinking that the on-board RAID controller is faulty, so I have a few questions:

1) Can I change the BIOS SATA setting from RAID to AHCI and boot the machine up? Or will this further damage the array and I will lose all my data? Or maybe connect it to SATA 5 to bypass the SATA ports connected to the RAID controller and boot from there?

2) If I can't boot it that way, how can I retrieve the data from one of the drives? I have an HDD dock; if I connect a drive there, should I be able to browse the files and retrieve what I need?

Unfortunately, despite my advice, there are no backups...

Sorry for the wall of text, but I'm in a real bind here, and ASUS so far refuses to help other than to help me set up a new array. Deleting the current array and making a new one would certainly lose all the data.
 
Solution
1. I do not think this would work. RAID is picky and changing from RAID to AHCI won't help. If anything, it should boot even with one drive. It almost looks like not only did the RAID fail but the HDD also lost data. Is there no option to rebuild the RAID?

2. This should work so long as a partition is still on the HDD you are trying.

Asus really can't help you much as they don't design the RAID chip or software, so they are out of their league. It is probably AMD RAID software, but their help won't be much either.
 
There is no option to rebuild the array; I've been all over the software searching. I don't know if the data has been lost, as so far there is zero access.

I was thinking that the controller itself went bad since only the RAID software can even see the drives, unless connected to a SATA port that is outside the RAID controller.

I'm going to try the HDD dock a little later, I'm at the office with the file server and the dock is at my house...

What would your advice be going forward? I'm going to replace the mobo (it's probably going to be a fight with ASUS to warranty it, though), but should I replace the HDDs as well? I am going to push for backups after this is sorted out, something like a system image stored at Carbonite.
 
I would replace the drives. More than likely one is failing and got kicked out of the RAID. Also make sure the drives you are using are not WD Greens or Blues, as they are not rated to be in any RAID and can cause issues.

For backup, I recommend both a local backup and a cloud-based backup, with imaging if possible, so you have a copy of the system as it is but also a backup of just the data.

You can use Carbonite to make an image and then have a backup job send the image and the data to a cloud-based backup solution.
 
Well at least you have a brain to use the proper drives. I can't tell you how many people I see using consumer drives in a RAID having failures.

In that case it could be the drives or it could be the board. I would say drives before board, I have not seen many failures at the controller level. The fact that the RAID firmware sees the drives means that it is working.

I would say get new drives and then RMA these and keep the RMA ones as backups in case something similar happens.

Personally I would suggest against a RAID unless you plan to buy a server or SAN and have 5-8 drives in a RAID 5/6, so you have good redundancy and need that much space. I would stick with one drive in your case, because with two drives all RAID is doing is doubling the odds that you have a drive failure to deal with.
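The "doubling the odds" remark can be checked with a bit of arithmetic. The sketch below is only an illustration: the 3% annual failure rate is an assumed figure, not one from this thread, and the functions assume the drives fail independently, which ignores shared causes such as a dying controller or PSU.

```python
def p_any_drive_fails(p: float, n: int) -> float:
    """Probability that at least one of n independent drives fails."""
    return 1.0 - (1.0 - p) ** n


def p_all_drives_fail(p: float, n: int) -> float:
    """Probability that all n independent drives fail (the mirror loses data)."""
    return p ** n


# Assumed per-drive annual failure rate, for illustration only.
ASSUMED_RATE = 0.03

print(f"single drive fails:         {ASSUMED_RATE:.4f}")
print(f"at least one of two fails:  {p_any_drive_fails(ASSUMED_RATE, 2):.4f}")
print(f"both of two fail:           {p_all_drives_fail(ASSUMED_RATE, 2):.4f}")
```

Under those assumptions, a two-drive mirror is roughly twice as likely to need a drive replaced, but far less likely to lose data to drive failure alone. A shared failure, like the controller in this thread, is outside what the model covers.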
 

indsup

RAID 1 is a mirror system. One drive exactly mirrors the other, so both drives will have exactly the same data. You should have no problem accessing the data, and you shouldn't have lost any of it.
 


I was trying to build this machine for longevity and fault tolerance by using quality parts (I didn't know about the PSU quality at the time, and reviews really are misleading). Oddly, I didn't think of the motherboard being an issue. It's disappointing that there is no option given to rebuild the RAID array, but you were right, it's an AMD RAID controller.
The attraction of RAID 1 was that the information would be copied up to the minute, as opposed to backups run nightly. I didn't expect the whole system to stop if an HDD failed. Either something wasn't working as I thought, or both drives are failing. Oddly, the RAID software says the drives are "healthy"...guess that isn't very accurate.

I'm thinking that when I RMA the drives and get the replacements from WD, I will likely just get a HDD enclosure and make it into an external drive for backups and a system image.
 
....The attraction for RAID 1 was that the information would be copied up to the minute as opposed to backups run nightly....
RAID is NEVER a replacement for a good backup. In the case of RAID 1, whatever is written to one drive is also written to the other, and this includes file deletions, whether deliberate or not. RAID is for use in systems that require 24/7/365 uptime; I never recommend it for home use.
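A toy model makes the point about deletions concrete. Everything here is hypothetical (the `Mirror` class, `nightly_backup`, and the file name are made up for illustration); it models drives as dictionaries and says nothing about how a real controller works.

```python
import copy


class Mirror:
    """Toy RAID 1: every write and delete is applied to both members at once."""

    def __init__(self):
        self.disk_a = {}
        self.disk_b = {}

    def write(self, name, data):
        self.disk_a[name] = data
        self.disk_b[name] = data

    def delete(self, name):
        self.disk_a.pop(name, None)
        self.disk_b.pop(name, None)


def nightly_backup(mirror):
    """Toy backup: a point-in-time copy that later changes do not touch."""
    return copy.deepcopy(mirror.disk_a)


mirror = Mirror()
mirror.write("ledger.db", b"customer records")
backup = nightly_backup(mirror)

mirror.delete("ledger.db")  # accidental deletion hits both members immediately

print("on mirror member A:", "ledger.db" in mirror.disk_a)
print("on mirror member B:", "ledger.db" in mirror.disk_b)
print("in last backup:    ", "ledger.db" in backup)
```

The mirror faithfully replicates the mistake, while the point-in-time copy still holds the file. That is the difference between redundancy and a backup.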
 

Paperdoc

So far what is glaringly missing is the MANUAL for your RAID system! You are using mobo-based RAID1. Normally there ought to have been a manual on the RAID system on the CD of utilities that came with the mobo. If you don't have that, look for a manual document for RAID on the website of your mobo's maker, and look specifically for the RAID doc for YOUR model of mobo.

In RAID systems I have used, there is a particular key you must push or hold down at the correct time during the boot process to enter the RAID Management utilities. (This is a little like entering BIOS Setup.) From there you usually have some tools to show you details of what RAID system(s) you have set up, and their status. For a failed RAID1 array, I would expect it will tell you which HDD unit has failed as an aid in replacing it. It also will give you a tool to BREAK the RAID1 array. This process separates the two HDD units of the array so that each now behaves like a single stand-alone HDD unit. With that done you should be able to boot from the one good HDD unit.

Then you replace the failed HDD with a suitable match to the good one. Then you use the RAID Management screens again to re-establish the RAID1 array and have it copy all the data from the good drive to the new one. Once that is done, your RAID1 mirror system is completely restored and you can use it again. The manual can help you a lot to understand the process and detail the exact menu choices you can make.
 


Unfortunately, all there is is the motherboard's user manual, and all that does is explain how to set up the array and nothing else. There is a severe lack of support for the RAID on these boards and a lack of options in the software. I've been over the manual about 5 times today hoping to see something I missed.

There is a key combo to access the array software, Ctrl+F, but there is nothing there like you described. I would have loved to recover the machine that way, as tomorrow I will have to spend the better part of the day getting this machine back up and back to business.
 

Paperdoc

AUGGHH! I see your pain! That BIOS appears to have no useful RAID Management tools!

Your mobo uses the AMD SB950 Southbridge chip. Best I could find by searching for AMD sb950 RAID Utility was this reply about the AMD software called RAIDXpert.

https://www.experts-exchange.com/questions/27785287/Help-I-cannot-find-AMD-SB950-controller-RAID-GUI.html

The second link calls up the manual in .pdf form. The first link wasn't a lot of help trying to locate the utility on the AMD site. But a further search for AMD RAIDXpert download found several versions of the utility, depending on which OS you have.

HOWEVER, that utility is an application you install on a working Windows machine to help you create and monitor RAID arrays. It can help you repair them, too. Your problem is you don't have Windows running on this machine, because the boot disk is the failed RAID1 array!

My only idea then would be to temporarily install a single spare HDD unit on an unused SATA port, disconnect all other HDDs, and install a Windows version on that so you can boot and run. Download and install the RAIDXpert software and manual. Then reconnect your HDDs to their original SATA ports and run the software to see what it can tell you about the failed array. Once you get it fixed you could remove the temporary HDD and let it run as it did originally.

Here's a clever idea I found in a reviewer's notes, not in the manual. The reviewer was working on how to repair an array such as RAID1. Now, in the software there is a way to select what repair action should be taken when an array is detected as failed. There is also a place where you can configure a single HDD unit on a SATA port to be a "RAID Standby" unit that is unused until it is needed for automatic repairs. So the reviewer proposed installing and configuring the spare, then shutting down and disconnecting the failed unit from the original RAID1 array. When the machine is rebooted and the RAIDXpert software used, you should be able to tell it to repair the array (which is now failed because one HDD unit is missing) using the new spare HDD. Once that is done, you could shut down and restore the original configuration. In fact, with the temporary drive removed, RAIDXpert would no longer exist, but the restored RAID1 should work.

Well, it should, they say! Maybe these ideas can help you. I've never had to deal with your specific problem, so I cannot guarantee.
 


I set this system up 2 and a half years ago and have been urging the owners to let me set up an offsite backup ever since. They wouldn't listen and put it off; they are listening now, though. I was hoping the RAID 1 would give at least some fault tolerance, but it was never intended to be a full backup. The file server has been running 24/7 all this time and this is the first issue it's had...and it proved the weakness of RAID 1.
 


The odds of both drives failing are small but possible. I had a customer once who had a system with a RAID 1 and the RAID failed and both drives died. To top it off he had two separate backups on two different external USB HDDs and both of those also failed. It was a catastrophic failure that I have never seen but is possible.

The RAID software is probably reading the SMART status and I have seen plenty of drives that pass SMART but will fail a full read or write test.
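The difference between a SMART check and a full read test can be sketched in a few lines. This is a minimal illustration, not a replacement for a vendor diagnostic: `read_scan` is a hypothetical helper, and it assumes `path` is a regular file or disk image that Python can open and seek (raw-device access varies by OS and usually needs admin rights).

```python
import os
from typing import List, Tuple


def read_scan(path: str, chunk_size: int = 1024 * 1024) -> Tuple[int, List[int]]:
    """Read `path` from start to end.

    Returns (bytes_read, offsets of chunks that raised a read error).
    """
    bad_offsets: List[int] = []
    bytes_read = 0
    # buffering=0 so each read() maps to one OS-level read request.
    with open(path, "rb", buffering=0) as f:
        size = f.seek(0, os.SEEK_END)
        offset = 0
        while offset < size:
            want = min(chunk_size, size - offset)
            f.seek(offset)
            try:
                data = f.read(want)
            except OSError:
                # Record the failing chunk and skip past it.
                bad_offsets.append(offset)
                offset += want
                continue
            if not data:
                break  # unexpected end of file
            bytes_read += len(data)
            offset += len(data)
    return bytes_read, bad_offsets
```

A drive can report a passing SMART status and still return errors here, because this touches every chunk instead of reading a summary attribute. A scan like this only exercises reads; it does not test writes.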
 
I was able to use the HDD dock to browse and back up the databases (thank God), and am now running the WD Data Lifeguard Tools extended test to see if there was any damage. While I was in the RAID management program, the mobo gave out: the CPU fan ramped up and the system turned off, with no boot, nothing. I installed a new mobo and it's running well. I'm hoping the RAID controller failure didn't damage the HDDs, assuming that the controller was just the first part to fail. I ended up RMAing the motherboard, but because of how ASUS RMA works it was going to take too long, so I ended up buying a replacement. They will have a backup mobo once the RMA completes.

If the HDDs check out, I'm planning on turning them into external HDDs and doing backups on them. If anyone knows of a better HDD scan program than WD Data Lifeguard, please let me know. I want to do the best possible to make sure these are still solid.

Thanks for all the help. I won't be going with RAID again, just system images and regular backups. The owners are much more receptive to this now.
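For anyone repeating the dock-based rescue described above, a copy can be double-checked by hashing each file on both sides. This is a rough sketch under assumptions: `copy_and_verify` and `sha256_of` are hypothetical helper names, and the source and destination are assumed to be ordinary mounted folders.

```python
import hashlib
import shutil
from pathlib import Path
from typing import List


def sha256_of(path: Path, chunk_size: int = 1024 * 1024) -> str:
    """Hash a file in chunks so large databases do not need to fit in RAM."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def copy_and_verify(src_dir: Path, dst_dir: Path) -> List[Path]:
    """Copy every file under src_dir into dst_dir.

    Returns the relative paths whose source and copy hashes differ.
    """
    mismatched: List[Path] = []
    for src in sorted(src_dir.rglob("*")):
        if not src.is_file():
            continue
        rel = src.relative_to(src_dir)
        dst = dst_dir / rel
        dst.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(src, dst)  # copies contents plus timestamps
        if sha256_of(src) != sha256_of(dst):
            mismatched.append(rel)
    return mismatched
```

An empty result means each copy matched what was read from the source. It does not prove the source itself was undamaged, and the second read may be served from the OS cache instead of the disk, so a full drive scan is still worth running.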
 

Paperdoc

WD Data Lifeguard is really good, but it is particularly designed to work on HDD's made by WD, and refuses to do some of its jobs on other drives. So IF your HDD's are from WD, you already have a very good tool for your work.

I have a personal preference on this issue. WD Data Lifeguard (and others by other HDD makers) comes in a few versions. One is a Windows app that you run under a working Windows OS. But I prefer to get the "For DOS" version. It is placed on a bootable CD disk. You boot from that disk in your optical drive and it loads a mini-OS into RAM, creates a Virtual Disk in RAM for storing brief reports you can save to an external device like a USB memory stick, then presents you with a menu of tools to use. It runs perfectly with NO good HDD in your system, and NO other OS to boot. So even if your HDD's all are not working, this version can boot from the CD and run, allowing you to use all its tools.

To get the "For DOS" version you download that item from the website, and it is a .iso image file. You then need a blank CD-R disk, a CD burner, and a CD burning utility that CAN burn an .iso image to a CD. Nero is one that can, but there are others. Once you've made your disk it IS a bootable CD that will run the diagnostics suite for you with or without any HDD's working in your system.

By the way, RAID has lots of difficulties and users need to learn how to use and manage such systems. But I will comment that the particular implementation of on-board RAID you have been using is missing important management tools that other versions of RAID include. The tools I outlined earlier in my Apr 14 post are all there and work on two different machines I've used. Plus, since the tools are part of the BIOS code and not an application that runs under Windows, they are available when the HDD's in the system are not working.

Glad to hear your supervisors are now listening to you about the importance of backups, and the distinctions between RAID and a real backup program.
 