4 disk (JBOD) dynamic array - 1 disk fails..what now?

confuxion

I have four 1tb drives, all WD drives but different models, configured in a simple, spanned dynamic array. In other words, my Windows 7 x64 OS sees it as one big 4tb drive. There aren't many files in number, but every file is very large, and about 3.2tb is taken up on the array.

One of the drives appears to have died, and I have no experience in how to handle things from here. If someone could please list out the steps I should take in order to try to recover as much of the data as possible, I would greatly appreciate it. Note that I have yet to try to access each of the four drives separately to find out which is the bad one, but I am going to do that now.

I don't know if it's possible to get some, if not all, of the files back from this dynamic array. Oh, and any software recommendations that would help me in this process are welcome, too. Thanks in advance!
 
So if you had it all in a RAID 0, then you are in some serious trouble.

Need some more details. What motherboard are you using? Do you use Onboard RAID or have an Add-in Raid Card?

With all 4 drives plugged in, download and install CrystalDiskInfo from the link in my signature (grab the Standard Installer). Let's see if it can see the drives (it may or may not) and what their health status is. If it can see them, it should report the health status. That will help us see if we can recover the drives.

As for checking files on each drive itself, forget about it. A RAID 0 saves a piece of each file on each drive. No single drive will have an entire file, so that is pointless.

A few pointers: 1) RAID is NOT a backup. It's only a fail-safe (not in your case, though, as RAID 0 has ZERO fault tolerance: when one drive dies, the whole thing dies), and 2) ALWAYS have a backup of your important data for this kind of situation. If none of it were important, I would say just get a new drive and remake the array, but it seems as if you need files off of there.
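A rough sketch of the striping point above, and of how a spanned (concatenated) volume differs from it. This is a simplified model, not the actual on-disk layout Windows uses for dynamic disks: the function names, the disk sizes, and the stripe size are all made up for illustration.

```python
from typing import Callable, List, Set, Tuple


def locate_spanned(offset: int, disk_sizes: List[int]) -> Tuple[int, int]:
    """Spanned layout: disk 0 is filled first, then disk 1, and so on."""
    for disk, size in enumerate(disk_sizes):
        if offset < size:
            return disk, offset
        offset -= size
    raise ValueError("offset is past the end of the volume")


def locate_striped(offset: int, disk_count: int, stripe_size: int) -> Tuple[int, int]:
    """RAID 0 layout: consecutive stripes rotate across the disks."""
    stripe_index, within_stripe = divmod(offset, stripe_size)
    disk = stripe_index % disk_count
    disk_offset = (stripe_index // disk_count) * stripe_size + within_stripe
    return disk, disk_offset


def disks_touched(start: int, length: int,
                  locate: Callable[[int], Tuple[int, int]]) -> Set[int]:
    """Set of disks that hold any part of a contiguous run of data."""
    return {locate(off)[0] for off in range(start, start + length)}


# Toy numbers: four disks of 100 units each, stripe size of 4 units.
sizes = [100, 100, 100, 100]
spanned = disks_touched(0, 10, lambda off: locate_spanned(off, sizes))
striped = disks_touched(0, 10, lambda off: locate_striped(off, 4, 4))
print(spanned, striped)
```

In this toy model a small contiguous file sits on a single disk when the volume is spanned, but is spread over several disks when it is striped. Real file systems fragment files, and the file system's own metadata may live on the failed disk, so this says nothing definite about what is recoverable in practice.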
 
Accessing the drives in the JBOD set individually will let you find the bad one fairly quickly, but what you can recover from the failure depends a lot on the type of failure.

Can you tell us how you set up the JBOD? RAID controller, Windows, or motherboard JBOD? Also your system specs and OS.
 
Thanks very much for the advice and willingness to help, drtweak. I'll address your questions:

The mobo I'm using is the Asus P6T. As for what type of RAID I used, I'm not really sure how to answer (pardon my storage ignorance). I don't have an add-in Raid card, and I didn't configure Raid in any intentional way. I literally went to Disk Management, converted the 4 drives to dynamic, then spanned them into a single volume, effectively turning my 4, 1tb drives into about 3.2(?)tb of usable storage, which Windows 7 saw as a single drive.
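A quick sanity check on the size Windows would show for the span. This is only arithmetic under two assumptions: each drive holds exactly 10^12 bytes (the decimal "1 TB" on the label), and file-system overhead is ignored. The function name is made up for illustration.

```python
def reported_tib(drive_count: int, bytes_per_drive: int = 10**12) -> float:
    """Total capacity in binary terabytes (TiB), which Windows labels as "TB"."""
    total_bytes = drive_count * bytes_per_drive
    return total_bytes / 2**40


print(round(reported_tib(4), 2))  # 3.64
```

So four 1 TB drives spanned together would show up as roughly 3.64 "TB" in Windows, before any formatting overhead.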

As for these files being mission-critical, I should have mentioned that I semi-regularly backed up the contents (my ripped movie collection) to an NAS. Having said that, my goal is to recover the 4tb span on my local PC (the one where one disk has failed), as the NAS backup hadn't been run in several months, and doesn't include about 50 of the 450 or so movies. In other words, it would be great if recovery were an option, but not the end of the world if all was lost on the 4tb span (I'd just have to re-rip those 50 or so movies).

As for diagnostics, I should have screencapped the error messages I saw in Disk Management so I could report them accurately here (I'm a donkey). I do remember that all four disks were showing up as individual disks (Disk 0, Disk 1, etc.), that one of them had a red X next to its title, and that it said something along the lines of "disk failed to initialize" when I right-clicked the title of the disk (e.g. Disk 4). I can't remember exactly what options were available upon right-clicking that title or the section to the right of it (which has the name of the disk, i.e. "MEDIA", and a graphical representation of it), but I do remember there being only a few options available, diagnostic in nature, and I proceeded to research what they meant and carry them out. I wish I could remember what the resulting error messages were upon selecting these options, but I can't recall them now (sorry!).

I'm going to take your advice and download CrystalDiskInfo and see what the health status of those 4 drives is. I'll report back here. I remember my PC taking forever to start with those 4 disks powered on, but there's no way to get around that.

Thanks again!



 
popatim, thanks for your reply. I basically addressed your questions with my reply to drtweak, but I'll add the following in case it's pertinent:

Windows 7 Ultimate x64, 12gb RAM, Asus P6T mobo.

You mentioned accessing each of the 4 drives separately to root out which one is bad. I have one of those tools that allows you to connect any bare HDD to your PC quickly via USB, so I understand that part. If you have anything to add in terms of what methods and/or software I might use for the diagnostics, feel free to chime in.



 



Yeah, sounds like you are running a software RAID 0 on those 4 drives through Windows.

Have you run CrystalDiskInfo yet?

 
UPDATE: 8/19/2014

I hooked up power to those 4 drives last night and booted up. I got to the login screen, entered my password, clicked login, and my system proceeded to hang for the next 16 hours (I left for work and hoped it would be resolved when I got back). So I simply rebooted and was able to login and get to my desktop thereafter. However, running CDI proved futile, as I tried to simply launch it 3 separate times, and each time, my PC crashed after hanging for about 5 minutes. So I was never able to get CDI running. Fortunately, I have a disk health program installed called Hard Disk Sentinel (HDS), which I ran and took some screenshots for your review. HDS displayed 3 of the 4 drives in the spanned volume, so it's obvious there's a big problem with one of the drives. The only thing worth mentioning about HDS, while a completely different subject, is something I saw on the first screenshot, titled "hdsentinel-disk0.png". Look at the passage appearing to the right of where the disks are listed that talks about one of the healthy disks being in "PIO mode" and such. It states that the health of this disk is only at 50% b/c of this PIO issue. Could you shed any light on what that means and what I might want to do to correct it? This is just a sidenote, so don't worry if you can't help me with that particular item.

hdsentinel-disk0.png

hdsentinel-disk1.png

hdsentinel-disk3.png


Anyway, after looking at what HDS had to say, I opened up Windows' built-in Disk Management console and took a screenshot of what I saw for you to review. Oddly enough, Disk Management can detect that 4th HDD, but as you can see, it says that it is "Missing." The options available when I right-click the box where the title of the disk is (i.e. "Disk 0", etc.) and the longer box that shows a graphical representation of the disk's space are "Reactivate Disk" and "Reactivate Volume" or "Delete Volume", respectively.

disk-management-static.png


So what would your recommendations be from here? Thanks again for your willingness to help me with this!

Coby

-------------------------------------------------------------------------------------------------------------------------------------------
Thanks for the quick reply, drtweak.

I'll run CDI tomorrow and post my results here. Have to open up box and plug power back into the 4 drives, not to mention wait upwards of 30 minutes for my PC to boot up afterwards (which I'm surmising means the drive that's having issues is majorly screwed).



 
drtweak, I just updated my last post with the results from my attempted run of Crystal Disk Info. I wasn't sure if you'd be notified of it b/c I technically just updated a post instead of making a new one, so I wanted to make sure you were aware if you didn't get notified. Thanks!
 
Yea haha you don't get notifications on edits XD and yesterday i was super busy so didn't get to take a look.

Yeah, it's not even showing one of the drives.

One thing you could try is to take one HDD out of the loop and see if it boots up any faster. More than likely, the one whose removal lets it boot faster is the bad one. Or boot up and check whether it still shows 3 of the drives with the 4th missing: if you remove a drive and the other 3 are still there, you know which one is acting up. OR you can remove all of them and plug them in one by one with a USB adapter and check the health status of each. This way you don't have to reboot every time.
 
Sorry, drtweak, I should have been more specific about my findings. First of all, you're wrong about one of the four drives not showing in the Disk Management Console (if that's what you were talking about). All four of them are showing, but the last one at the bottom shows a red X with the word "missing" next to it. So Disk Management can see all four, but the 4th one apparently is missing. There was only one option available ("Reactivate Disk") when I right-clicked where it said "missing", so I selected it to see what would happen. Basically, I just get a progress circle for 10 seconds, which then goes away and there are no error messages presented at all. So I guess Reactivating the disk isn't an option.

The other thing I should've been more clear about was the fact that I already know which disk is the bad one. That was easy to find out b/c the disks have different long-descriptions, and I just read that of the disk that shows up with the "missing" text next to it. So when I asked you what you thought I should do next, I wasn't referring to trying to identify which drive was bad. Instead, I wanted your advice on what my options might be for recovering the data off of this span, which, I'm guessing, would start with figuring out a way to reactivate that one bad disk. Any suggestions?

Also, I thought I read somewhere that you can convert dynamic disks back to basic without losing any data. If I remembered it correctly, it said you had to use some special type of software or something. Any thoughts on that? Any thoughts, ideas, and suggestions are welcome, and thanks for sticking with me this far!



 
Well, the fact that it says "Missing" means it's not seeing it. It's still showing it because you have a spanned set of drives that this drive is a part of, which is why it still shows up. If you go into Device Manager, it should list all 4 drives. If you're only seeing 3, then one of the drives isn't even being seen by the BIOS. It's more a question of what is wrong with the failed drive. We need to see the SMART status of the drive if possible, but if the BIOS isn't seeing it at all, then we could have a complete failure or an issue with that SATA port on your motherboard. If it's just a matter of a few bad sectors or something like that, we may be able to make a backup image of that drive with some software, restore it to a new drive, and hopefully get it back. It's not a matter of getting it to be a basic disk or not. Again, because it's a spanned drive, data is written to ALL 4 drives, so you won't find any full files on that drive, only bits of them, and even then trying to recover them can be a huge pain.

Also, in Disk Management, Disk 0 isn't always the SATA 0 port either. The whole Disk 0, 1, 2, etc. numbering reflects the order in which Windows saw the disks and/or whether they're on an external or internal RAID, etc. On my PC, for example, the SD card reader built into my monitor is Disk 3-6, so any new hard drive I add, regardless of where I plug it in, will be Disk 7. So just because the other 3 drives are Disk 0, 1, and 3 doesn't mean they are the drives plugged into SATA ports 0, 1, and 3. It could be, but not always.

But we still need to find which of the 4 drives is the bad one. If you go into the BIOS, it should list which drives are connected to which SATA port, and from there we can use the process of elimination to find the failed drive and then see about recovering the data if possible.
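To make the "backup image" idea a bit more concrete, here is a minimal sketch of what imaging software does with a drive that has a few bad sectors: copy block by block, and when a block can't be read, fill it with zeros and move on instead of aborting. It only applies if the drive is detected at all. Everything here is an assumption for illustration: `image_with_skips` and `FlakyDisk` are hypothetical names, and the "drive" is simulated in memory. Dedicated tools such as GNU ddrescue handle this far more carefully (retries, smaller reads around bad areas, a log of what was skipped).

```python
import io
from typing import BinaryIO, Iterable, List


class FlakyDisk(io.BytesIO):
    """Hypothetical stand-in for a failing drive: reads that start at a
    listed offset raise OSError, like an unreadable sector would."""

    def __init__(self, data: bytes, bad_offsets: Iterable[int]):
        super().__init__(data)
        self.bad_offsets = set(bad_offsets)

    def read(self, size: int = -1) -> bytes:
        if self.tell() in self.bad_offsets:
            raise OSError("simulated unreadable sector")
        return super().read(size)


def image_with_skips(src: BinaryIO, dst: BinaryIO, total_size: int,
                     block_size: int = 4096) -> List[int]:
    """Copy src to dst block by block. Unreadable blocks are zero-filled
    so later data keeps its offset. Returns the offsets that failed."""
    bad_offsets = []
    offset = 0
    while offset < total_size:
        want = min(block_size, total_size - offset)
        try:
            src.seek(offset)
            data = src.read(want)
        except OSError:
            data = b""
            bad_offsets.append(offset)
        # Pad failed (or short) reads with zeros to keep the image aligned.
        dst.write(data.ljust(want, b"\x00"))
        offset += want
    return bad_offsets


source = FlakyDisk(b"AAAA" + b"BBBB" + b"CCCC", bad_offsets={4})
image = io.BytesIO()
print(image_with_skips(source, image, total_size=12, block_size=4))
```

The point of the zero-filling is that everything after a bad spot still lands at the right position in the image, which matters when the image is later restored to a replacement drive.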

Even look at the program you used to get the SMART status there. How many WDC1000's are listed? 3. So that means one drive is NOT being read. We need to find which one it is.
 
Thank you for the very thorough and informative response, drtweak, much appreciated. I'm fairly confident I can identify which drive is the one that failed b/c I used 3 different models of drives in the spanned volume. Two of them are WD1001FALS and one is WD10EADS. I know this b/c that's how they are identified in the Device Manager. And I'm pretty sure the 4th, failed drive is a Seagate model. If it ends up being one of the two models I listed above, then I'm assuming my only recourse for finding out which one is bad is to hook each one of them up to the SATA-to-USB adapter that I have and whichever one doesn't launch when attached to my PC is the dead one.

And to answer your question, yes, the BIOS can only see 3 of them. I went into my BIOS at startup and visited the Disk Configuration (or whatever it's called) section, and in one of the SATA ports, it said "No Disk" (or something similar). Hopefully that 4th disk is in fact a Seagate so I'll be able to identify the bad one easily. If it isn't, then I'll do what you mentioned by seeing which SATA port has the "No Disk" listing.
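The process of elimination here boils down to comparing the drives that should be present with the drives that are actually detected. A small sketch of that comparison follows; the function name is made up, and "SEAGATE-UNKNOWN" is a placeholder because the exact Seagate model isn't known at this point.

```python
from collections import Counter
from typing import Iterable, List


def missing_drives(expected: Iterable[str], detected: Iterable[str]) -> List[str]:
    """Models that were expected but not detected, counting duplicates."""
    # Counter subtraction handles two drives of the same model correctly.
    remaining = Counter(expected) - Counter(detected)
    return sorted(remaining.elements())


expected = ["WD1001FALS", "WD1001FALS", "WD10EADS", "SEAGATE-UNKNOWN"]
detected = ["WD1001FALS", "WD1001FALS", "WD10EADS"]
print(missing_drives(expected, detected))
```

If the missing model turns out to be one of the two identical WD1001FALS drives, the model name alone can't say which physical unit it is, which is where the BIOS port listing or the one-at-a-time USB test comes in.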

After I determine which one is bad, I'm assuming the next step is to attach that drive to my PC separately (using the SATA-to-USB cable) and run chkdsk on it, yes? During the first run, I won't select the option that says "Fix Bad Sectors" or whatever; I'll just run it to find out whether there are bad sectors and how many. I realize that if there are a lot, it will take forever for Windows to correct them, so I'll wait to do that only if you recommend it. If there's some other step I should be doing first, please let me know. Also, what if the drive won't be recognized (by Windows and Device Manager) at all when I attach it separately? Is there anything I can do at that point, other than chuck it in the garbage?

Thanks again for your continued help!



 
Well, a disk check will not help us here. That is only meant for partitions. Again, because this is a spanned volume (a software RAID 0), you need ALL 4 drives to be online for that partition to work properly.

When you find the bad drive (which is probably the drive you are talking about, since there are in fact 3 1TB WDs in there from the screenshot), then yes, you are correct in trying to use a SATA-to-USB adapter or something similar. If we can read it, great! If it still doesn't come up even after that (I'd also try a different SATA port), then the drive just might be dead. It could just be the PCB on the hard drive, and if you had the EXACT same model hard drive you could possibly swap the boards and get it to work, but I can't say for sure. It might only make it worse. But if we can't get a PC to see the drive, then there isn't much more I can help you with. You would have to send all 4 drives in to a pro and see what they could do.
 
Awesome. Thanks for the knowledge! I'm going to take the bad drive out and attach it separately this afternoon, so I should have some answers for you soon. That's interesting what you mentioned about swapping out the PCB board if the drives are, in fact, identical. I suppose there's a chance that 4th drive is another one of those WD's I have two of. I'll find out when I crack open my case. I'll be back shortly with whatever results I find. Thanks again!

 
Sorry it took me so long to reply, drtweak. I finally took all 4 drives out and accessed them separately using a SATA-to-USB connector I have. It was easy to determine which drive was bad, and it turned out it was a Seagate Barracuda 7200.11 1tb (model ST31000340AS). When I connected it to my PC and powered it on, it made the sound that a normally-functioning HDD makes when first turned on (it "wound" up). However, after winding up for about 5 seconds, it made a clicking sound, followed by a "sliding" sound (sort of a "click-woosh, click-woosh"). It did this 3-4 times before the drive "wound" down and stopped making any noise, as though it was shutting itself off. Nothing on my PC indicated it was recognized (no driver loading message, nothing showed up in Device Manager), so obviously there's a major problem. When I powered it off, there was no sound at all, as if it had already shut itself off after the set of "click-wooshes." No winding-down noise like a healthy drive would make upon killing the power to it.

Now for the (hopefully) good news: I have an exact duplicate of this Seagate drive that's fully functional, so perhaps your idea of swapping the PCB would work, and allow the bad drive to power on normally. Is it literally just a matter of unscrewing the PCB off the good drive and screwing it onto the bad one? I've never done this, so any tips/guidance you could offer would be greatly appreciated! If you need me to post pics of the two, identical drives, I certainly can (if you need to compare the identifying information on each drive to confirm that they're, in fact, identical). They literally have the same firmware and date code, so I'm hopeful the PCB-swap idea is one we can pursue in this case.

If there's anything else I need to know prior to attempting this, please let me know. I'll await your reply before venturing into this. Thanks again for your attentiveness to this on-going saga!
 
Whoa! Not as easy as I thought it would be. How in the sam hell does one take off the screws that are used to hold the PCB onto the HDD casing? They're like the smallest heads I've ever seen, and they're sort of star-shaped, like an asterisk. I don't have anything that small to unscrew those with, and I've got one of those multi-tools with the flip-out rods that you'd think would cover every screw head possible! Any insights as to what I need to get those screws off?
 
oh yea haha sorry it took a while for me to reply. for some reason i didn't get a notification of your previous post.

But yea they are Torx screws. Probably like a T6 size screw. having a small kit like this

http://www.microcenter.com/product/321772/Precision_Screwdriver_Set_-_32_pcs (I have this exact one)

Helps A LOT with the small stuff. I have a Seagate ST31000528AS in front of me and its a T6.

Again, be careful and do this at your own risk. Make sure they are the same EXACT model. The thing is, because it's a clicking sound, it may not be the PCB; it could be the heads, and the swap may just not work. So again, if you swap them and screw everything up even worse, then yeah, don't blame me lol. I have successfully swapped PCBs in the past and had them work.
 
No sweat, drtweak. I did a lot of reading last night and discovered just how bad these 7200.11 drives are from Seagate. It all seemed to point to two issues that seem to happen to a lot of them, but they're firmware issues, and from what I read, the 11 clicking noises (which is what mine does) indicate something wrong with the head. I tried the PCB swap to no avail.

It's hard to know what to do now. I couldn't find exactly what you're supposed to do when this issue arises. I saw a few videos that showed you how to supposedly correct a stuck head, but they involved opening up the drive and exposing the platters, which I'm not really prepared to do without a cleanroom and the right tools. The thing that sucks is that when this issue occurs, the data is usually fine, at least from what I could find. I don't know what the typical cost is for sending a drive to a data recovery place, but I suspect it's prohibitive for me. I saw a few posts where folks upgraded the firmware to the one Seagate released back in 2009 to address all the drives that were having these problems, and it fixed the issue completely, but I couldn't find anything about how one goes about upgrading the firmware when the drive is inaccessible. Any ideas there?

Regardless, thanks for all your help. I think I'm just out of luck here. If you have any further suggestions, though, I'm open to them.
 
Yeah, I have an old Seagate 1.5 TB 7200.11, and the first one lasted a year. Then 6 months, 3 months, 2 months, then 1 month. At that point I had spent more on shipping it to Seagate than what I paid for the drive. Even though I still had 3 years of warranty left, I tossed the drive. They wouldn't give me another model or a new one after I'd shipped it to them 4 times already, so I gave them the bird and swore to never buy a Seagate again... well, the only one I bought was the SSHD they have, for my laptop, which honestly was a waste of money. SSHDs should be destroyed because it's an 8 GB cache, not an 8 GB SSD you can use. The only other time we get Seagates is their enterprise drives for servers at my work. Those are the only ones I'd ever get.

But yeah, sounds like it's just toast. And it's probably just the heads and not the platters. Up to you how much you want to spend on getting any data off of there.