RAID Array Issues on my server

mesarectifier

Distinguished
Mar 26, 2006
2,257
0
19,780
Hello, cheers for looking, hope you can help.

I run this Compaq ProLiant 1600 server:
- 2x Intel Pentium III 550MHz (Katmai, 512KB L2 cache each)
- 512MB SDRAM (PC100/133)
- 4x 9.1GB Ultra2 SCSI, 10k RPM
I'm having some major issues with the RAID array and the hard disks.

My server had been on for about a week (24/7). I woke up in the morning to find it had BSOD'd. Pretty unusual, as it's usually so well behaved, but it is Windows so it's bound to happen every so often, nothing you can do.

Anyway, I restarted the server and the drives started spinning up. Except this time, 2 of the drives presented errors. I should point out at this stage that Drive 4 has had a 'drive failure' light on for about 3 months, and still worked perfectly well (passed disk tests). Now drives 2+4 weren't spinning up and drive 2 had no lights on at all.

I switched around the caddies, tried changing the order of the drives, etc., but to no avail. (I restored it to its original order, btw.)

I went into Windows setup and it recognised both logical drives on the RAID array (there's one of about 4GB for Windows and one of about 30GB for backup; it's a 'vital files' backup server for my network). However, it said it couldn't access either of them.

So I thought I might just take out the two drives presenting failures and try the two remaining good ones in a smaller array. However, when I did that, drive 1 presented a failure light. Now I have 2 drives presenting failure lights, one drive presenting nothing at all, and 1 drive that still works.

What do I do next?
 

Pain

Distinguished
Jun 18, 2004
1,126
0
19,280
Sounds like the controller or the backplane might have gone bad. Not much to do now, except maybe try one disk at a time. You might also try to eliminate the backplane by running a SCSI cable directly to the disks, but since they are hot-swap disks you may not be able to find a cable to fit them.
 

mesarectifier

I think I've got an adapter somewhere to connect the disks directly to a cable, I'll give it a go. I've got an Adaptec PCI SCSI card inside as well, is it worth trying to connect the array to that?

(Just so I know exactly what you're saying, Backplane is the board on which the drives connected to the array are mounted?)
 

mesarectifier

I didn't actually set the array up myself (factory) but as far as I can tell it's RAID 0.

It's 4 physical disks set up as 2 logical drives of different sizes, one 4GB and one 30GB. I'm not sure whether that's one drive with 2 partitions or 2 separate drives, but it does say 2 logical drives, so I guess it's 2 drives.
 

rippleyaliens

Distinguished
Mar 7, 2006
62
0
18,630
1. DID you do all the Compaq updates? PRIORITY
2. Latest firmware and drivers.

The ProLiant 1600s from back then need the latest and greatest patches, firmware and such.

What version of SmartStart are you using, BTW? ALSO, did you install Compaq Insight Manager to see what is causing the problems, i.e. a possible bad controller or bad drives?
 

Pain

It's more than likely a RAID 5 [added: Oh, no, if you have 34GB of disk space then it's not a RAID 5, so you're probably right that it's a stripe set]. Yes, the backplane is the board at the back of the drive cage that the disks slide into.

Oh, and you do know that your array is now toast, so once you get the problem sorted you'll have to reload the OS? As soon as you started moving disks around and taking disks out then you very likely blew the array. I'm just saying that in case you intend to recover that old array. If you just want to start over then cool. :)
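For anyone following the capacity reasoning above, here is a rough sketch of the arithmetic in Python. It assumes four identical 9.1GB disks and a single RAID level across the whole array, which is an assumption and not something confirmed in this thread; the function names are made up for illustration.

```python
# Usable capacity per RAID level, assuming identical disks and one RAID
# level across the whole array (an assumption, not confirmed in the thread).

def usable_gb(level, disks, disk_gb):
    """Return usable capacity in GB for the given RAID level."""
    if level == "RAID 0":      # striping, no redundancy
        return disks * disk_gb
    if level == "RAID 5":      # one disk's worth of space goes to parity
        return (disks - 1) * disk_gb
    if level == "RAID 0+1":    # mirrored stripes, half the raw space
        return (disks // 2) * disk_gb
    raise ValueError(f"unknown level: {level}")


def levels_that_fit(observed_gb, disks, disk_gb):
    """List the levels whose usable capacity can hold the observed total."""
    levels = ["RAID 0", "RAID 5", "RAID 0+1"]
    return [lv for lv in levels if usable_gb(lv, disks, disk_gb) >= observed_gb]


if __name__ == "__main__":
    for lv in ("RAID 0", "RAID 5", "RAID 0+1"):
        print(f"{lv}: {usable_gb(lv, 4, 9.1):.1f} GB")
    # About 4GB + 30GB of logical drives were reported, so roughly 34GB.
    print(levels_that_fit(34, 4, 9.1))
```

With roughly 34GB visible, only the RAID 0 figure (36.4GB) is large enough under these assumptions. A controller that allows a different RAID level per logical drive would not fit this simple model.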
 

mesarectifier

Pain: like I said, it's a backup server, and as long as I've got the originals it's cool, I'm not really bothered about losing the data. I wasn't hotswapping the drives if that helps.

Rippleyaliens: Nope, not the latest updates (I try to steer clear of firmware updates until they're necessary) but I'll try. I can't actually boot the system to run the Compaq array manager. I'm not sure what version of SmartStart I have (or what SmartStart is, to be honest!)

I'll try the firmware. Keep the suggestions coming, this is great, thanks.
 

Pain

I think that's the best, and probably the only, thing you can do right now: update all the firmware, etc.

It won't matter whether you hot swapped the disks or not. If they were written to after you moved them around or removed some of them completely, then the array is gone. If nothing was written to them during that time then it may still work, but since you have all your data I would certainly start over from scratch when/if you get it running, and reformat/reload.
 

mesarectifier

I've got some more info if it helps.

When the system starts up it says (after RAM/CPU initialization):

Adaptec AHA-2940 present
(this is what I use for my HP DDS2 drive)

then the array controller initializes:

COMPAQ SMART ARRAY 221 (Rev. A v. 4.16) 2 Logical Drives
1780 Slot 4 Drive array failure
The following SCSI drive(s) should be replaced:
SCSI Port 1: SCSI ID(s) 0, 4

I looked for some files on the HP website in relation to my server. I found:

A SmartStart bootable (I think) CD-ROM, and some bootable firmware floppies for the 5300 array controller, which it says is compatible with my server, though mine says "SMART ARRAY 221" but I guess it just won't work if it's the wrong file.
 

Pain

If you still want to play with this then go into the array control menu and force the failed disks back on-line.

Assuming that this machine has been running fine for years, I have a hard time believing that it is due solely to outdated firmware. I would still check for any updates and apply them first. I would also open the machine up and blow all the dirt out of the backplane fans, from around the disks, etc.
 

mesarectifier

Okay, I've booted to the SmartStart bootable CD and I've done some diagnostics. The drive which isn't lighting up at all is not being detected. I may try another slot.

It says one of the 'logical drives' is RAID 5 (strange!). I'm going to move some disks around and build a new array, will report back any problems.
 

mesarectifier

Ritey, I've averted most of the hardware problems with the SmartStart CD (I wish they made CDs like that for desktop PCs). I'm being given options for my new array:

RAID 5

RAID 0+1

RAID 0

What do you reckon is best?

I'm going to be setting up 2 logical drives (one for OS and one for data). Given the limited (but adequate) capacity of the disks, I'm thinking RAID 0, but any comments would be helpful.
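To put the reliability side of that choice in rough numbers, here is a small Python sketch. It assumes each disk fails independently with the same probability p over some period, and that RAID 0+1 means two stripe sets mirrored against each other; both are simplifying assumptions, and the value of p below is purely hypothetical.

```python
# Chance that an array survives a period in which each disk independently
# fails with probability p. Independence and the value of p are assumptions.

def survival_raid0(n, p):
    """RAID 0 survives only if every one of the n disks survives."""
    return (1 - p) ** n


def survival_raid5(n, p):
    """RAID 5 survives zero failures or exactly one failure."""
    return (1 - p) ** n + n * p * (1 - p) ** (n - 1)


def survival_raid01(n, p):
    """RAID 0+1 (two mirrored stripe sets of n/2 disks each) survives
    if at least one stripe set is completely intact."""
    if n % 2 != 0:
        raise ValueError("this model of RAID 0+1 needs an even disk count")
    stripe_ok = (1 - p) ** (n // 2)
    return 1 - (1 - stripe_ok) ** 2


if __name__ == "__main__":
    n, p = 4, 0.05  # four disks; p is a made-up figure for illustration
    print(f"RAID 0:   {survival_raid0(n, p):.4f}")
    print(f"RAID 5:   {survival_raid5(n, p):.4f}")
    print(f"RAID 0+1: {survival_raid01(n, p):.4f}")
```

Under these assumptions RAID 0 is the only option where a single failed disk takes the whole array down, which is the price paid for getting all of the raw capacity.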
 

mesarectifier

Spooky. I'll try changing the SCSI ID of the last remaining failed drive and try adding it to my new array. I went for RAID 0 by the way, I figured I need all the disk space I can get with the low capacity drives and I've just proved that it can afford to be down for a few days.
 

mesarectifier

Well, of the four drives one was showing it was failing, so I used the Compaq SmartStart tools on the bootable CD-ROM to reset all system settings and drives, and then built a new array out of the 3 remaining good drives. I'm currently exploring the possibilities of Linux with the help of linux_0, but it's not working too well at the moment so I may just have to switch back to Windows.