Marvell 88SE6145 RAID controller seems blown. What are my options?

ambush

Distinguished
Jan 13, 2002
136
0
18,690
While booting up my system, I was surprised to discover that suddenly the mobo's on-board Marvell 88SE6145 RAID controller BIOS reported that it could find no RAID array definition for my two-disk RAID 0 array (which fortunately doesn't hold my boot partition). When I looked closer under Windows, a third non-RAID NTFS disk that was connected to the same controller was reported to be "unallocated".

In reality, however, the disks and the data appear to be perfectly fine. SMART testing reported everything was in good shape and the WD diagnostics confirmed this. I could read data off both drives in the RAID 0 array with a disk editor. Also, when I connected the non-RAID disk that was reportedly "unallocated" to the other on-board SATA disk controller, the NTFS volumes there turned out to be perfectly normal with no apparent loss or damage of any kind.

Question 1: Does this mean that the Marvell controller is shot? Or did it just lose the array definition, say because of a CMOS/EEPROM failure of some kind? If the latter, is there any way to restore just the definition? (I have an old bit-for-bit cloned copy of the partitions on a separate backup disk which might well contain the definition somewhere if I knew where to look).

If the controller is shot, the very expensive PCI-X mobo is no longer under warranty and it's also out of production, so I can pretty much forget about replacing it.

Question 2: If I purchase an external SATA RAID controller card, for example this one, can I pretty much pick right up from where I was with no data recovery required? Or does any replacement card have to have the same Marvell controller chip for that to work?

Question 3: I don't remember the parameters I used to create the array (other than that it was RAID 0). How big a problem is that? Again, I have an old bit-for-bit copy (made with Acronis Disk Director) of the partitions which, with detailed instructions, I might be able to recover (direct disk editing doesn't scare me).

I saw a reply elsewhere in this forum which claimed that one can do this sort of thing without data loss, but I'm skeptical. What's the definitive answer? I tried asking Asus tech support, but after several attempts, I can't even get them to understand the question!

Mobo: ASUS P5WDG2 WS Professional

System BIOS: Latest (0905)
Marvell RAID controller BIOS: 1.1.0.26 (can this be updated?)

OS: Windows XP Pro / SP3

Gurus, I need your help, which is why I came to Tom's!
 

sub mesa

Distinguished
Tip1: don't do anything drastic; you may kill your ability to recover your data this way!

Tip2: try Ubuntu RAID recovery procedure:

step1: download Ubuntu Linux livecd .iso
step2: boot from it and click Try out Ubuntu (without changing my computer)
step3: click menu Places and select Home
step4: in the window that opens, on the left side there is a panel. It should include a line for your RAID volume, for example "...GB Filesystem" or the name you've assigned to it. Click it to reveal your data.

If this works, you don't need to buy anything, and the procedure should be very safe. Commercial options like RAID Reconstructor should also work, though I suggest you try the Ubuntu method first.

Tip3: invest in an external backup solution; 2TB disks are rather cheap now, buy one and put it in an external casing. Cheap and simple backup to save your data next time!
 

ambush



Many thanks for your reply, sub mesa. It was generous of you to offer your time like this.

Regarding Tip 2, Ubuntu didn't see anything Windows didn't. The Marvell RAID BIOS still didn't detect an array definition, so it didn't really surprise me that Ubuntu didn't either. Also, the Marvell RAID monitor app (an Apache process) that allows you to create and modify array definitions from within Windows can't find a definition either.

As for Tip 3, it appears you didn't understand my questions any better than Asus did. I repeat: There is absolutely no reason to believe any disk is damaged or any general data lost whatsoever. I do NOT need to recover ANYTHING other than the array definition data -- a few KB maximum -- AND I still need the answers to my actual questions from my OP:

Question 1: Does this mean that the Marvell controller is shot? Or did it just lose the array definition, say because of a CMOS/EEPROM failure of some kind? If the latter, is there any way to restore just the definition? (I have an old bit-for-bit cloned copy of the partitions on a separate backup disk which might well contain the definition somewhere if I knew where to look).

If the controller is shot, the very expensive PCI-X mobo is no longer under warranty and it's also out of production, so I can pretty much forget about replacing it.

Question 2: If I purchase an external SATA RAID controller card, for example this one, can I pretty much pick right up from where I was with no data recovery required? Or does any replacement card have to have the same Marvell controller chip for that to work?

Question 3: I don't remember the parameters I used to create the array (other than that it was RAID 0). How big a problem is that? Again, I have an old bit-for-bit copy (made with Acronis Disk Director) of the partitions which, with detailed instructions, I might be able to recover (direct disk editing doesn't scare me).

I saw a reply elsewhere in this forum which claimed that one can do this sort of thing -- i.e., just connect the same drives to the new external card -- without data loss, but I'm skeptical. What's the definitive answer?

Thank you.
 

ambush


Thanks, ahnilated.

Yep, the Marvell is set to RAID. That hasn't changed. Nor is there any reason to believe anything's wrong with any disk, since as I pointed out in my OP, all tests and diags report no problem whatsoever.

That leaves us with the Marvell controller having failed, so...

Question 2: If I purchase an external SATA RAID controller card, for example this one, can I pretty much pick right up from where I was with no data recovery required? Or does any replacement card have to have the same Marvell controller chip for that to work?

Question 3: I don't remember the parameters I used to create the array (other than that it was RAID 0). How big a problem is that? Again, I have an old bit-for-bit copy (made with Acronis Disk Director) of the partitions which, with detailed instructions, I might be able to recover (direct disk editing doesn't scare me).

I saw a reply elsewhere in this forum which claimed that one can do this sort of thing -- i.e., just connect the same drives to the new external card -- without data loss, but I'm skeptical. What's the definitive answer?

Thanks again.
 

sub mesa


Then your disks have lost their metasector, stored in the last 512 bytes (sector) of each HDD that is a member of a RAID array.
You can only perform manual data recovery beyond this point.
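
For anyone who wants to check this on a raw image of a member disk, here is a minimal Python sketch that reads the final 512-byte sector and reports whether it is all zeros. It assumes 512-byte sectors and a plain image file; `read_last_sector` and `looks_blank` are illustrative names, and a non-blank sector does not by itself prove that valid Marvell metadata is present.

```python
import os

SECTOR_SIZE = 512  # assumption: 512-byte logical sectors


def read_last_sector(path, sector_size=SECTOR_SIZE):
    """Return the final sector of a raw disk image."""
    with open(path, "rb") as f:
        size = f.seek(0, os.SEEK_END)  # seek() returns the new absolute position
        if size < sector_size:
            raise ValueError("image is smaller than one sector")
        f.seek(size - sector_size)
        return f.read(sector_size)


def looks_blank(sector):
    """True if every byte is zero, i.e. nothing appears to be stored there."""
    return not any(sector)


# Hypothetical usage on an image file you made yourself:
#   sector = read_last_sector("member_disk.img")
#   print("blank" if looks_blank(sector) else "contains data")
```

Work on an image or a read-only copy rather than the original disk, in line with Tip1 above.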

ambush wrote: "As for Tip 3, it appears you didn't understand my questions any better than Asus did. I repeat: There is absolutely no reason to believe any disk is damaged or any general data lost whatsoever. I do NOT need to recover ANYTHING other than the array definition data -- a few KB maximum"
So do you need the data on the RAID volume or not? If you don't need the data, simply create a new RAID array and restore from backup (which I assume from your reply that you have?).

If you do not have a proper backup, then I assume you DO wish to recover data from your RAID array, and not just the 512-byte metasector. Why would you need only that if you don't need the data on the disks/RAID? That doesn't make sense to me at all; please enlighten me!

ambush wrote: "Question 1: Does this mean that the Marvell controller is shot? Or did it just lose the array definition, say because of a CMOS/EEPROM failure of some kind?"
You probably lost the RAID array the way thousands of other people using FakeRAID do: the drivers that power your solution don't handle disk timeouts properly. They consider any disk whose recovery time exceeds 10 seconds as failed and update the metadata sector to reflect this. This causes 'broken' or 'split' arrays and is very common when using onboard Windows driver RAID.

ambush wrote: "If the latter, is there any way to restore just the definition?"
You can use RAID Reconstructor (software) with the disks connected to a non-RAID controller, or you can pay somebody to do it for you under Linux/BSD. Your data should still be there.

Re-creating the array on your controller or another brand's controller may also work, but it has to use exactly the same parameters, some of which you cannot influence directly. These include:
- RAID level
- stripesize
- offset
- disk count
- disk order
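
To show why each of these parameters matters, here is a minimal Python sketch of how a plain RAID 0 layout maps a byte offset in the array to a member disk. It assumes simple round-robin striping with the same data offset on every disk; the function name and the 64 KiB stripe size are illustrative only, not values taken from the Marvell controller.

```python
def raid0_locate(logical_offset, stripe_size, disk_order, data_offset=0):
    """Map a byte offset in the array to (disk name, byte offset on that disk)."""
    # Which stripe the byte falls in, and how far into that stripe it is.
    stripe_index, within = divmod(logical_offset, stripe_size)
    # Stripes are dealt out to the disks in order, one row at a time.
    row, column = divmod(stripe_index, len(disk_order))
    return disk_order[column], data_offset + row * stripe_size + within


STRIPE = 64 * 1024  # hypothetical stripe size

print(raid0_locate(0, STRIPE, ["disk0", "disk1"]))                # ('disk0', 0)
print(raid0_locate(STRIPE, STRIPE, ["disk0", "disk1"]))           # ('disk1', 0)
print(raid0_locate(2 * STRIPE + 10, STRIPE, ["disk0", "disk1"]))  # ('disk0', 65546)
# Swapping the disk order sends the same byte to the other disk:
print(raid0_locate(STRIPE, STRIPE, ["disk1", "disk0"]))           # ('disk0', 0)
```

A wrong stripe size, offset, or disk order changes where every block is looked up, which is why a re-created array with different parameters shows garbage even though the data on the disks is untouched.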
 

ambush

Outstanding work, sub mesa! -- Now THIS is why I come to Tom's!

I still have more very important questions though, or I'd have chosen this as the best answer. I've edited your reply to focus on the most relevant parts...
At last: the location of the array definition! Thanks!

I understand your frustration, and sympathize. Let me try to explain it differently: First, forget about recovering the metasector/array-definition for now. Then, consider the fact from my OP that when an ordinary, non-RAID drive is connected to the Marvell RAID controller (which I did only because it was the last available SATA connector left and I needed it), Windows and all of my disk tools report that the drive is completely unallocated, or in other words, all data is missing. But, when I connect the very same drive to a SATA port on a different controller, everything's perfect and all the data is still there!

That tells us with absolute certainty that it's the Marvell that's at fault rather than the drive or the data, and therefore, there is no good reason to believe that the data's missing from the two RAID drives, either! Follow me?

(Yes, it's theoretically possible that there might be some data loss on the RAID drives, but so far there's absolutely no evidence of any data loss because the Marvell is known to be non-functional and that fault falsely made perfectly good data seem to have been lost when it actually wasn't.)

So, given the logical train of thought I just elaborated, there's very strong reason to believe that if I could magically fix the Marvell controller on the mobo, I would find that everything's perfect and not one single byte has actually been lost! This would make backups, reconstruction, or data recovery entirely irrelevant.

Unfortunately, there's no practical way to fix the Marvell controller or replace the mobo. Thus, I'm left with trying to find out if I can buy an external RAID card to replace the Marvell and pick right up where I left off, since apparently no data has been lost.

[Note: I've moved part of this discussion to another post in an attempt to reduce the complexity....]
Okay... Tell me what you think: All the information in that list is somewhere in the metasector/array definition, correct? Assume for the moment that I can set up those values on a replacement external RAID controller; do you have, or can you point me to, a detailed data-structure map of the array definition within the "metasector stored in the last 512 bytes (sector) of each HDD that is a member of a RAID array"? For example, the stripesize might be stored, say, in bytes x & x+1 in that sector. All I'd need to know is the value of x to know the original stripesize. Then I'd need x for the other items in your list too, of course.

If I knew precisely where to look, I could connect the two RAIDed drives to a working SATA connector and read the binary data from there using a disk editor. Then I could enter the values I'd extracted to establish the new array definition on the new controller.
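
As a rough illustration of that idea, here is a Python sketch that hex-dumps a 512-byte sector and reads a 16-bit value from bytes x and x+1. The offset x, the little-endian byte order, and the field meaning are all assumptions; the actual layout of the Marvell metadata is not known here.

```python
import struct


def hexdump(data, width=16):
    """Format bytes as offset, hex columns, and printable ASCII."""
    lines = []
    for start in range(0, len(data), width):
        chunk = data[start:start + width]
        hex_part = " ".join(f"{b:02x}" for b in chunk)
        text = "".join(chr(b) if 32 <= b < 127 else "." for b in chunk)
        lines.append(f"{start:04x}  {hex_part:<{width * 3 - 1}}  {text}")
    return "\n".join(lines)


def read_u16_le(sector, x):
    """Interpret bytes x and x+1 as a little-endian unsigned 16-bit value."""
    return struct.unpack_from("<H", sector, x)[0]


# Demonstration on a made-up sector: pretend a stripe size of 128 lives at x = 0x40.
sector = bytearray(512)
sector[0x40:0x42] = (128).to_bytes(2, "little")
print(hexdump(sector[0x40:0x50]))
print(read_u16_le(sector, 0x40))  # 128
```

Comparing the dumps from both member disks side by side is one way to spot which bytes differ per disk (likely disk order) and which are shared (likely array-wide settings), though that remains guesswork without a documented format.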

That way, I would need absolutely no "RAID reconstructor" (which would set me back $100 on top of the cost of the new RAID controller) or data-recovery service whatsoever. What say you?
 

ambush

I've moved this part of sub mesa's response here to avoid over-complexity. I'll pick up right here....
I've read about the difference between hardware and software RAID, but apparently I'm still confused. My system has a separate Marvell hardware RAID controller on the mobo that has its own hardware BIOS with which you create and control RAID arrays entirely during boot time. This doesn't appear to have anything to do with Windows whatsoever. Why isn't that real/hardware RAID rather than fake/software RAID?

The way I've always understood it, software/fake RAID does not have any RAID hardware component and instead, everything is done within Windows.

Sure, Windows has to have a driver to work with the RAID controller, but every hardware device has to have a Windows driver too. This doesn't make a hardware RAID system any less a hardware RAID system. Or does it?

For example, on this very same system I have an unbelievably fast external PCI-X Adaptec SCSI RAID HBA controller with superfast 15K RPM SCSI drives in a RAID 0 array which hold my system partition (and others, mostly for video rendering). Is this considered FAKE/software RAID too?

If not, why is the Marvell considered so? Please help me understand.
 

sub mesa


Are you really sure? It could be as you say, but your data should not be 'fine'. A non-RAID controller may see a partition and an NTFS filesystem, but it should list the partition as larger than the disk, e.g. a 1000GB partition on a 500GB disk. This is what the first disk in a RAID0 would look like on a non-RAID controller.

If your data was really on RAID0, you cannot access it using a single disk. It could be either RAID1 (mirror) or JBOD.

If you really had a RAID0 with two disks of, say, 500GB each, then you would have had 1000GB of total storage space. If you can access 'all data' on just one disk, then you are probably not running RAID0.
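
The "partition larger than the disk" check can be sketched in Python as follows. This assumes a classic MBR partition table (four 16-byte entries at byte 446, boot signature 0x55AA at byte 510) and 512-byte sectors; the function names are illustrative, and disks using GPT or no partition table would need different handling.

```python
import struct

SECTOR_SIZE = 512  # assumption: 512-byte sectors


def mbr_partitions(sector0):
    """Return (type, start LBA, sector count) for each used MBR entry."""
    if len(sector0) < 512 or sector0[510:512] != b"\x55\xaa":
        raise ValueError("no MBR boot signature found")
    parts = []
    for i in range(4):
        entry = sector0[446 + 16 * i: 446 + 16 * (i + 1)]
        # status, CHS start, type, CHS end, LBA start, sector count
        _status, _chs_a, ptype, _chs_b, lba_start, sectors = struct.unpack(
            "<B3sB3sII", entry)
        if ptype != 0:  # type 0 marks an unused entry
            parts.append((ptype, lba_start, sectors))
    return parts


def partitions_exceeding_disk(sector0, disk_bytes):
    """List partitions whose end lies beyond the physical disk size."""
    return [p for p in mbr_partitions(sector0)
            if (p[1] + p[2]) * SECTOR_SIZE > disk_bytes]
```

If the first sector of one 500GB member disk yields a non-empty result from `partitions_exceeding_disk`, that is consistent with the disk having been the first member of a striped array.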

ambush wrote: "I've read about the difference between hardware and software RAID, but apparently I'm still confused. My system has a separate Marvell hardware RAID controller on the mobo that has its own hardware BIOS with which you create and control RAID arrays entirely during boot time. This doesn't appear to have anything to do with Windows whatsoever. Why isn't that real/hardware RAID rather than fake/software RAID?"
That "Marvell hardware RAID controller" is actually a Marvell SATA controller with an optional BIOS ROM that allows you to create a RAID array etc.; this utility writes 512 bytes to each disk to store the chosen RAID config. The BIOS ROM can also boot by reading from the right disks.

BUT: all the RAID functionality performed by this 'RAID controller' is done by Windows-only drivers. If you boot Linux or any other OS, it is seen as a SATA controller with separate disks on it. If you boot Windows, it will show up as a 'SCSI controller' for each array you made and requires a driver to work. Windows Vista/7 may already include such a driver, just as they do for Intel software RAID on ICHxR controllers.

So this is Onboard RAID / FakeRAID / HybridRAID / FirmwareRAID / Software RAID - all the same. True software RAID has no BIOS utility, so fakeRAID's only hardware is the BIOS ROM; it's 99.5% software. Once you're in Windows, there is no hardware at work other than a normal SATA controller. The BIOS ROM is required to make booting from RAID0/RAID5 possible, for example. Other OSes can do this natively; Windows cannot, so it needs hardware support.

There's nothing 'bad' about software RAID though; in many respects software RAID is superior to hardware RAID. But many people think their onboard RAID controller is hardware RAID; that is not true.

ambush wrote: "Sure, Windows has to have a driver to work with the RAID controller, but every hardware device has to have a Windows driver too. This doesn't make a hardware RAID system any less a hardware RAID system. Or does it?"
A real hardware controller will have a tiny driver, around 8 or 16 kilobytes. It will only be used to transfer data and logic from the controller to Windows. A "fakeRAID" driver will implement all RAID logic and use your CPU to calculate anything it needs, particularly on RAID5 arrays.

ambush wrote: "For example, on this very same system I have an unbelievably fast external PCI-X Adaptec SCSI RAID HBA controller with superfast 15K RPM SCSI drives in a RAID 0 array which hold my system partition (and others, mostly for video rendering). Is this considered FAKE/software RAID too?"
Not sure about Adaptec/LSI, but it is likely hardware RAID, especially if it has a fan and heatsink, meaning it has a processing chip like an Intel IOP3xx; then it is true hardware RAID. A hardware RAID controller will not allow its physical disks to be seen directly by the OS, any OS. The OS sees arrays, not disks. With fakeRAID, the driver hides the physical disks and only displays the virtual software RAID array, but it is accessing the disks directly.

So that means when doing RAID1 (mirroring) and writing a 100GB file, we write 100GB to disk1 and 100GB to disk2. Software/Fake RAID needs to send 200GB to the controller. But with hardware RAID you only send 100GB to the controller; the controller does the work of sending 100GB to each connected disk. So this is a key difference.
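
The arithmetic above can be written out as a tiny Python sketch. The function name is illustrative, and the model deliberately ignores metadata, caching, and protocol overhead.

```python
GB = 10**9  # decimal gigabytes, as used in the example above


def bytes_sent_to_controller(file_bytes, mirrors, hardware_raid):
    """Bytes the host must push to the controller for one RAID1 write."""
    # Hardware RAID: the host sends one copy and the card duplicates it.
    # Software/fake RAID: the driver sends one copy per mirrored disk.
    return file_bytes if hardware_raid else file_bytes * mirrors


print(bytes_sent_to_controller(100 * GB, 2, hardware_raid=False) // GB)  # 200
print(bytes_sent_to_controller(100 * GB, 2, hardware_raid=True) // GB)   # 100
```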

Cheers,
-sub
 

ambush

With utterly genuine respect, good sir sub mesa, I must admit that I am flabbergasted by the truly perplexing and frustrating difficulties I keep facing in trying to be understood! Perhaps people are just speed-reading/skimming instead of reading carefully?

Let me try to explain it differently yet again:

(1): I had 3 SATA disk drives connected to the Marvell controller.

(2): Two of those 3 drives were identical 500GB drives in a RAID 0 configuration. Let's call these 2 drives ALPHA and BETA.

(3): One of those 3 drives was a normal 2TB NON-RAID NTFS-formatted drive. Again, this was NEVER a RAID volume. Let's call this drive GAMMA.

(4): The Marvell controller died the other day.

(5): Since the Marvell died, it no longer recognizes the RAID 0 array of ALPHA & BETA.

(6): However, all disk diagnostics and SMART reports indicate that the physical disk drives ALPHA & BETA are in perfect condition. There might have been some data loss, but at this stage we can't tell yet.

(7): Since the Marvell died, it also reports that the NON-RAID drive GAMMA has lost all data and is totally unallocated.

(8): BUT! When I connect drive GAMMA to any SATA port other than the Marvell controller, EVERY BIT OF DATA IS PERFECT! Not one byte of data has been damaged or gone missing! It shows a 2TB NTFS partition with every bit of data that should be there perfectly intact. And yes, I am sure. THIS IS KEY!

(9): Logical conclusion: We know with certainty that the Marvell will report at least one drive (i.e., GAMMA) to be damaged and worthless even though it's actually PERFECT!

(10): Given point (9), since the Marvell is giving a false report about GAMMA, it could very easily be that the Marvell is also giving a false report about ALPHA & BETA, too!

(11): The smartest tactic to begin with is to assume that ALPHA & BETA are not damaged in any way, in exactly the same way that GAMMA was not damaged in any way even though the Marvell controller falsely claimed that it was.

Now, given those facts and assumptions, do you see how your reply is off-point? Let's look at it...

This is off-point because you're focusing there on the ALPHA & BETA drives rather than the GAMMA drive, which needed to be grasped first. My logic depends on understanding FIRST that the Marvell is giving false damage reports about GAMMA, which in turn gives us the presumption that there's likely nothing wrong with ALPHA & BETA either since the Marvell is probably mis-reporting their status too!

So that leaves me with the questions 2 & 3 from my OP again along with the bottom part of my previous post. But maybe it's best to pause here and make sure we're on the same page now...


Best regards,

- ambushed

 

ambush

I do hope I haven't offended you, kind sir sub mesa. If I have, I genuinely apologize. I was just baffled as to why my situation and questions were apparently so very puzzling to people...

I just learned that I already own a RAID repair tool: ZAR RAID Recovery

I might not need it, but at least I can obtain the array definition/parameters with it if necessary, even if I don't need to recover any other data (which I think will be the case).

I plan to purchase the external PCI-X SATA RAID HBA I've mentioned and plug the drives in there. Please let me know if you have any advice.


Thanks!
 

bluscarab

Distinguished
Jan 20, 2011
3
0
18,510
So how did it go? I'm guessing from experience that the data on ALPHA & BETA showed up as rubbish with your new PCI-X card. Don't fret. This happens even with "true hardware" raid cards.