RAID Guru Help Required....


Whizzard9992

Distinguished
Jan 18, 2006
1,076
0
19,280
Loaded the latest and greatest BIOS/firmware package from IBM: 7.12.12

http://www-304.ibm.com/jct01004c/systems/support/supportsite.wss/docdisplay?lndocid=MIGR-62540&brandind=5000008

HD Tach now completes without error at a whopping 54 MB/s average with a burst of 90 MB/s.

The LSI adapter to the external array is pulling down 145 MB/s average with a burst of 179 MB/s.

I'd really like to hear from IBM about this performance.

Thanks for going through all of this trouble to help :)

Have you tried IO Meter? I've read that HD Tach can report incorrect results.

Unfortunately we can't contact IBM directly, and I'm still trying to convince our local techs that 80MB/s is unacceptable performance for a $10,000 SCSI array. I'll let you know as soon as we get feedback from IBM, though. We're running BIOS 7.12.02. Our device drivers are reporting version 7.10.18.

Check to make sure the controller is in the PCI-X 133 slot.

Looking at the manufacturer's specs, there are three PCI-X slots: two are 100 MHz and the other is 133 MHz. This probably isn't causing the bottleneck, but you never know.

That's a good catch and I was actually thinking the same thing. I came across some posts saying that the 6M didn't like anything other than 133MHz. I might have to take a drive to the data center just to check this out. Me and this box need to get to know each other better, anyway :)

When I first saw this in the specs, my first thought was to check the hardware manager ('cause I don't have easy access to the physical box itself). This was peculiar:

My Adaptec AIC7902B is on the PCI-X bus via PCIe Bridge. That makes sense. My 6M is on the same bus, but it's on ANOTHER "PCI-Standard to PCI bridge". PCI-Standard says "66MHz" to me when I read it, which doesn't make sense to me because there are no standard PCI slots on the board. I assumed it was a bad driver name for the bridge, but it's perplexing nonetheless.

My device manager looks like this:

-> Intel E7525 PCI Express Root Port 0
--> Intel 80332 PCI Express-to-PCI bridge A (0330)
---> Adaptec AIC7902B - U320 SCSI
---> Adaptec AIC7902B - U320 SCSI
--> Intel 80332 PCI Express-to-PCI bridge B (0332)
---> PCI Standard PCI-to-PCI Bridge
----> IBM ServeRAID 6M controller

My controller is on PCI Slot 6 (PCI bus 5, device 8, function 0). I'm not able to translate that to a physical slot. Is that possible somehow?


(Oh, and I did the obvious thing and checked for IRQ conflicts and such. That was one of the first things I checked.)
 

Whizzard9992

Distinguished
Jan 18, 2006
1,076
0
19,280
6. I cannot stand piece-by-piece servers. I have to maintain a list of support contracts and numbers, then every time an upgrade comes around, I have to make sure the whole kit works. I would think you would have applied your cost logic to this and see that assembling an entire server is not as cost effective as getting a vendor to do it (HP, Dell, whatever). However, this one makes me understand your warranty dilemma. I just go to the one vendor and say "XYZ is broken. Send a tech out." and the problem is fixed.

I agree. There's something to be said about having 100 servers with all of the same hardware. That's why blade servers are so popular.

I can't imagine having a datacenter full of white-box servers. In fact, I'm pretty sure I had a nightmare about that once :)
 

steelspy

Distinguished
Jan 2, 2007
15
0
18,510
Hey,

I don't wanna flame anyone, but...

Supreme, start a new topic if you like, we're trying to work through something here.

Whizzard9992 has a specific problem that we're interested in resolving.

Supreme, No offense intended.

I think, based on my results, that the database isn't a piece of the puzzle, since I am not running any DB on my box.

The problem is RAID performance. Since HD Tach and my manual tests have similar results, I am gonna trust that HD Tach is in the ballpark.

Same server, same RAID controller, differing performance (both poor).

I have the 6M in one of the 100 MHz PCI-X slots. My LSI controller is in the 133 MHz slot. I really don't see that as being the problem.

I am tempted to hook the internal RAID drive backplane to the LSI card for giggles.

There is a NIC in the other PCI-X 100 MHz slot. I am tempted to take that out and test as well.

No IRQ conflicts found.

My specific part number for the 6M is 32P0033.

I saw the page on the IBM site about 33 MHz. Everywhere else on the site it refers to PCI-X up to 133 MHz, so I gotta believe that one page is mistaken. 64-bit 33 MHz is kinda backward.

http://searchstorage.techtarget.com/magItem/0,291266,sid35_gci870758,00.html

states that the PCI-X bus will be slowed down if any slower cards are present.
Will test and advise.
 

sandmanwn

Distinguished
Dec 1, 2006
915
0
18,990
I have the 6M in one of the 100 MHz PCI-X slots. My LSI controller is in the 133 MHz slot. I really don't see that as being the problem.

What slot is the PCI-X 133 in your IBM server?

Whizzard has his in slot 6.
 

Whizzard9992

Distinguished
Jan 18, 2006
1,076
0
19,280
> PCI-Standard says "66MHz" to me when I read it, which doesn't make sense to me because there are no standard PCI slots on the board.


"standard PCI" historically means 32 bits @ 33 MHz = 133 MB/second MAX
(this is the same number as found in ATA-133 to describe PATA HDDs)

Thanks, but in the server-world, standard PCI has been 64-bit for a while, which is generally 66MHz.

Please take your degree from the wiki-google university and tout it somewhere else.

Apparently you saw the "RAID guru help required...." subject and felt compelled to prove yourself as such. The problem is that I would spend more time rebutting the garbage you're spewing than addressing my original problem.

No one cares what you've done in the past, and no one here needs a tutorial on PCI-X, RAID, databases, or the differences between SCSI and SATA. In fact, half the stuff you say is completely wrong anyway.

Let me put this simply: you're not a guru. You've established that already and there's nothing you can do to change that. While you're at it, stop calling yourself a webmaster. That title went out of style like criss-cross and MC Hammer. My 12-year old cousin can code HTML and call himself a webmaster.

Offense intended. Please leave. You obviously cannot provide any meaningful help. You're not welcome here.


@steelspy: Cool. Thanks for checking that. My HD Tach benches came in close to my IO Meter benches as well. I found a cool tool written in .NET called DiskBench that creates a file from memory, eliminating source-side bottlenecks. It's a pretty straightforward synthetic: it just creates a file and writes garbage to it as fast as possible. It's hard to tell what's going on under the hood of HD Tach and IO Meter, so DiskBench was helpful for confirming that the problem exists even with elementary applications.
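For anyone who wants to reproduce that kind of elementary test without DiskBench, here is a minimal Python sketch of the same idea: build a buffer in memory once, then write it to a file as fast as possible and report MB/s. This is not DiskBench's actual implementation; the function name, file name, and default sizes are hypothetical, chosen only for illustration.

```python
import os
import tempfile
import time


def sequential_write_mb_per_s(path, total_mb=64, block_kb=64):
    """Write total_mb of in-memory data to path and return the rate in MB/s."""
    block = os.urandom(block_kb * 1024)  # "garbage" buffer, built once in memory
    blocks = (total_mb * 1024) // block_kb
    start = time.perf_counter()
    with open(path, "wb", buffering=0) as f:
        for _ in range(blocks):
            f.write(block)
        os.fsync(f.fileno())  # ask the OS to flush the data to the device
    elapsed = time.perf_counter() - start
    return total_mb / elapsed


if __name__ == "__main__":
    # Hypothetical target file; point this at the array under test.
    target = os.path.join(tempfile.gettempdir(), "seqwrite.tmp")
    try:
        print(f"{sequential_write_mb_per_s(target):.1f} MB/s")
    finally:
        if os.path.exists(target):
            os.remove(target)
```

With a small total_mb the controller and OS caches will dominate the result, so a file several times larger than the controller cache is probably needed for a meaningful number.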

Is it possible for you to try putting your 6M in your PCI-X 133MHz slot? I don't have that luxury because my server is currently in production, and I remember reading something about the 6M not liking PCI-X 100MHz somewhere....

Thanks for that link. I wonder if there's a misbehaving card on my PCI-X bus dragging it down... I didn't even consider that before.
 

steelspy

Distinguished
Jan 2, 2007
15
0
18,510
Removed the NIC and no change....

My 133 MHz PCI-X slot is slot 6 as well. Server model is the x236 P/N 8841-01U tower with a 6-bay hot-swappable drive enclosure built into the front of the box.

Slots four and five are PCI-X 100 MHz.

Slots two and three are PCI-Express.

Slot 1 is 33 MHz 32-bit PCI.
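For reference, the theoretical peak of each parallel slot type in that list can be worked out from bus width and clock. A minimal Python sketch, assuming the PCI-X slots are 64 bits wide and ignoring protocol overhead (the function name and slot labels are only for illustration):

```python
def peak_mb_per_s(width_bits, clock_mhz):
    """Theoretical peak of a parallel bus: bytes per transfer times MHz."""
    return width_bits / 8 * clock_mhz


# (width in bits, clock in MHz); the 64-bit width for PCI-X is an assumption
SLOTS = {
    "slot 1 (PCI, 32-bit, 33 MHz)": (32, 33.33),
    "slots 4-5 (PCI-X, 64-bit, 100 MHz)": (64, 100.0),
    "slot 6 (PCI-X, 64-bit, 133 MHz)": (64, 133.33),
}

for name, (width, clock) in SLOTS.items():
    print(f"{name}: {peak_mb_per_s(width, clock):.0f} MB/s")
```

That works out to roughly 133, 800 and 1067 MB/s. Even the 100 MHz PCI-X figure is far above the 54 MB/s average reported earlier in the thread, which fits with the slot choice not being the bottleneck.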

I emailed IBM. We'll see what response, if any, I receive.

Put my 6M into slot six???? Why not....

I do not intend to reveal to you
why I am participating in this forum.
Supreme, Go work on your own forum, it appears to be read only!
The following is directly from SL's site:
The Supreme Law Forum is presently available in READ ONLY mode,
during a period of extended database updates and routine maintenance.
 

Whizzard9992

Distinguished
Jan 18, 2006
1,076
0
19,280
Wait, let me try. No, Whizzard gets pissed because someone came into the thread trying to show how SCSI was more expensive than SATA, which DOESN'T help with his problem. When asked to either help or go away, the poster did neither. Did I get the problem right, Whizzard???

You're right on-point.


Regarding the RAID config, having 3 arrays on one card is a pretty standard DB config, as long as the card is beefy enough.

One array is the system array. It's only 2 drives, so since it's not enough to saturate the channel, the second mirrored array was placed on the same channel. Since they're mirrors they don't really tax the controller at all.

The RAID 5 array is on its own channel for obvious reasons :) It taxes the hell out of the controller.

The first mirror is a system mirror, which contains the OS and installed programs. The second mirror is for transaction logs, which offers additional database redundancy and performance. Since our temp DB also lives there, it gives us a little more performance for parallel queries.

RAID 5 was a good choice because we do mostly reads, and since this is an analysis database, size is a major factor. We perform VERY few random write operations on the database.

I didn't personally configure the server, but I do agree with it.
 

Whizzard9992

Distinguished
Jan 18, 2006
1,076
0
19,280
Removed the NIC and no change....

My 133 MHz PCI-X slot is slot 6 as well. Server model is the x236 P/N 8841-01U tower with a 6-bay hot-swappable drive enclosure built into the front of the box.

Slots four and five are PCI-X 100 MHz.

Slots two and three are PCI-Express.

Slot 1 is 33 MHz 32-bit PCI.

I emailed IBM. We'll see what response, if any, I receive.

Put my 6M into slot six???? Why not....


I do not intend to reveal to you
why I am participating in this forum.
Supreme, Go work on your own forum, it appears to be read only!
The following is directly from SL's site:
The Supreme Law Forum is presently available in READ ONLY mode,
during a period of extended database updates and routine maintenance.

hrm. It's interesting that you have the same series machine that we have. I still wonder if there's some sort of architectural bottleneck; i.e. a bridge on the motherboard without enough throughput to handle the data to/from the controller, or some on-board device soaking up all of the bandwidth.

I'm going to head back to google for a bit and see what I can dig up on this controller/server pair....
 

sandmanwn

Distinguished
Dec 1, 2006
915
0
18,990
I did some Google searching on the 6M, and it seems that there are some older (much older, as in first-gen 6M) controllers that are severely limited in bandwidth. Maybe check the part numbers and see if any of these match yours.

Oddly enough one of my searches for IBM 6M bottleneck returned this very post. 8O
 

steelspy

Distinguished
Jan 2, 2007
15
0
18,510
6M P/N is 32P0033

God I hope the 0033 doesn't mean 33 MHz...lol

Actually, I already checked IBM site and they indicated that the specific part number I have is of the 133 variety.
 

Whizzard9992

Distinguished
Jan 18, 2006
1,076
0
19,280
Oddly enough one of my searches for IBM 6M bottleneck returned this very post. 8O

Haha. That's awesome.

You know, that makes sense. I wonder if my GUI will give me the P/N or serial. Like I said before, we've had a large number of problems with this box. I know the card in there now was not the original, and neither was the backplane. IBM's solution has been, thus far, to just replace all of the parts until the server works again :roll:

I wonder if we got a refurb that just couldn't take the heat.
 

Whizzard9992

Distinguished
Jan 18, 2006
1,076
0
19,280
UPDATE

Ok so I think I'm on to something here....

I looked at my RAID UI for the ServeRAID controller, and it showed an Adaptec "back-end". Apparently, the ServeRAID controller uses the Adaptec chip for the SCSI interface. I don't think I have another actual card in the case....

I managed to get a copy of the invoices and we didn't purchase an additional controller for the tape drive. We can thank the IBM reps for that :roll:

Since we're using an external enclosure for the RAID 5 array, I'm assuming an IBM lackey plugged the tape drive into the only open SCSI connector inside of the machine. Unfortunately, the RAID 5 array is on that same channel, connected to the external SCSI plug on the OUTSIDE of the card.

I didn't consider this because the RAID utility showed all of the drives in the box, and omitted the tape drive. The tape drive appeared to be on another controller, which would be an accurate assumption. I'm assuming now that the UI only shows drives loaded into the RAID BIOS, and non-raid devices appear on a different driver entirely (which makes sense).

I'll be taking a drive down to the data center tonight. I'll post my results tonight or in the AM.
 

Whizzard9992

Distinguished
Jan 18, 2006
1,076
0
19,280
Bah.

The tape drive is connected to the onboard SCSI controller. The adaptec chips just happen to be the same.

The ServeRAID card is in the PCI-X 133MHz slot (Slot 6).

I've not found a way to determine the channel speed because I'm using a backplane :( The UI won't report it for some reason if it's on a backplane. The BIOS was no help.

Back to square 1.

I'm starting to think it's a bottleneck on the board or the card at this point.
 

sandmanwn

Distinguished
Dec 1, 2006
915
0
18,990
Any idea when the system was purchased?
Perhaps you have one of those older 6M models with the 64-bit/66 MHz interface.

It would be nice to have part numbers and manufacture dates. Bummer about the remote location.
 

belvdr

Distinguished
Mar 26, 2006
380
0
18,780
Bah.

The tape drive is connected to the onboard SCSI controller. The adaptec chips just happen to be the same.

The ServeRAID card is in the PCI-X 133MHz slot (Slot 6).

I've not found a way to determine the channel speed because I'm using a backplane :( The UI won't report it for some reason if it's on a backplane. The BIOS was no help.

Back to square 1.

I'm starting to think it's a bottleneck on the board or the card at this point.

I'm no expert on IBM's range of RAID cards, but I am quite familiar with HP's. Specifically, HP provides onboard controllers like the 5i or 6i. You can upgrade to better cards, like the 6400 series. Even though both cards connect to the disks over Ultra320, the 6400 blows the 5i and 6i away.

This makes me wonder if the 6M is a lower end card, and the bottleneck is in the card. I'm still wondering where that 6,200 I/Os / second came from.
 

steelspy

Distinguished
Jan 2, 2007
15
0
18,510
The ServeRAID 6M does use the Adaptec chip (AIC7902?)
But the motherboard has SCSI on it as well. There are SCSI ports on the motherboard for peripherals such as tape drives.

I would be shocked if the tape drive is plugged into the RAID controller, let alone the same channel. My understanding has always been that a channel can be used internally or externally, not both. Please correct me if that is incorrect.

I am kinda surprised by the negativity about IBM. Forget the Deathstar drives (consumer-grade stuff) for a minute. With the exception of the 6M, I've had excellent success with their server-grade equipment.
 

belvdr

Distinguished
Mar 26, 2006
380
0
18,780
The ServeRAID 6M does use the Adaptec chip (AIC7902?)
But the motherboard has SCSI on it as well. There are SCSI ports on the motherboard for peripherals such as tape drives.

I would be shocked if the tape drive is plugged into the RAID controller, let alone the same channel. My understanding has always been that a channel can be used internally or externally, not both. Please correct me if that is incorrect.

I am kinda surprised by the negativity about IBM. Forget the Deathstar drives (consumer-grade stuff) for a minute. With the exception of the 6M, I've had excellent success with their server-grade equipment.

Okay, I wasn't sure if the 6M was an add-on equivalent of the basic onboard RAID. I would have to read the docs on that card, but some cards have dual controller ICs which allow you to use all ports.

Actually, I've had great success with the DeskStar drives. I'm not sure how I avoided problems there, but I'll chalk it up to luck. I had one issue with an IBM ServeRAID (back in the P2-400 days), and the array controller destroyed the array. Oh what fun that was.
 

steelspy

Distinguished
Jan 2, 2007
15
0
18,510
The PCI-X slots are, for all practical purposes, equal.

I removed the LSI card from the 133MHz slot and put the 6M into that slot.

Ran tests with similar results to before.

I then installed the LSI card into the 100 MHz slot that the 6M had occupied.
The external array that the LSI supports worked at the same speed it did before.
 

Whizzard9992

Distinguished
Jan 18, 2006
1,076
0
19,280
(I'm just going to start ignoring the stupid posts. If no one replies, they'll eventually go away. SupremeLaw a.k.a. WizardOZ feeds off of conflict. Remove the conflict and he'll leave.)

The ServeRAID card I have uses the 7902, which is the same as the onboard controller, so there was no way for me to tell remotely into which controller my tape drive was connected. The RaidMan.bat UI will show you what SCSI chip you're using (SCSI Backend property).

It was indeed the on-board controller, as one would expect.

I've inherited this server/problem, so all I know of IBM support @ this point is what I've heard from the Help Desk. They want to avoid making the call because IBM is going to want to take the server down so they can start replacing hardware (which is what they've done in the past).

Just out of curiosity, I ran IO Meter on my friend's new computer running 2 7200.10 drives on Intel's Matrix RAID in RAID 0. A single 16K sequential read load yielded ~5700 IOPS and ~75MB/s. That's within 10% of what my ServeRAID is doing.

Each drive should provide about 60MB/s on sequential reads. Even if we had crappy drives at 40MB/s, we should be seeing > 200 MB/s transfer speeds with 6 spindles. I expect to saturate the bus and cap out at > 300MB/s for sequential reads.
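To put rough numbers on that expectation, here is a back-of-the-envelope Python sketch. It assumes sequential reads scale linearly with the number of drives that contribute data and are capped by a single Ultra320 channel at its nominal 320 MB/s. The function and the non_data_drives parameter are hypothetical, and treating a 5EE array as having two drives' worth of parity and spare space is a pessimistic simplification, not a measured property of the 6M.

```python
ULTRA320_MB_PER_S = 320  # nominal peak of one Ultra320 SCSI channel


def sequential_read_estimate(drives, per_drive_mb_s, non_data_drives=0,
                             channel_cap=ULTRA320_MB_PER_S):
    """Upper-bound estimate: contributing drives times per-drive rate, capped by the channel."""
    useful = drives - non_data_drives
    return min(useful * per_drive_mb_s, channel_cap)


if __name__ == "__main__":
    for rate in (40, 60):
        best = sequential_read_estimate(6, rate)
        worst = sequential_read_estimate(6, rate, non_data_drives=2)
        print(f"{rate} MB/s per drive: between {worst} and {best} MB/s")
```

Even the pessimistic case at 40 MB/s per drive comes out at 160 MB/s under these assumptions, well above what either server is actually delivering.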
 

steelspy

Distinguished
Jan 2, 2007
15
0
18,510
I don't think you can multiply 6 drives times 40 MB to get over 200MB/s with RAID 5. But I'm not a doctor or anything. I am running 5EE, which might be worse.

I will use the LSI card with the array that is connected to the 6M and see what results I get.

I intend to call IBM as well, but I might as well have some more fun first

Are you still averaging 55 to 60 MB/sec?
 

Whizzard9992

Distinguished
Jan 18, 2006
1,076
0
19,280
I don't think you can multiply 6 drives times 40 MB to get over 200MB/s with RAID 5. But I'm not a doctor or anything. I am running 5EE, which might be worse.

I will use the LSI card with the array that is connected to the 6M and see what results I get.

I intend to call IBM as well, but I might as well have some more fun first

Are you still averaging 55 to 60 MB/sec?



I'm running 5EE as well.

I'm running an 8k stripe, so a 16k sequential read should span 2 drives and give me > 100MB/s raw. With read-ahead I should get more than that.
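The stripe arithmetic can be sketched like this in Python, assuming a plain round-robin layout and ignoring how 5EE rotates parity and spare blocks (the function name and defaults are just for illustration):

```python
def drives_touched(offset, length, stripe_unit=8 * 1024, drives=6):
    """Count the drives hit by one read in a simple round-robin stripe."""
    first = offset // stripe_unit                # stripe unit holding the first byte
    last = (offset + length - 1) // stripe_unit  # stripe unit holding the last byte
    units = last - first + 1
    return min(units, drives)  # a read cannot touch more drives than exist


if __name__ == "__main__":
    print(drives_touched(0, 16 * 1024))     # aligned 16k read
    print(drives_touched(4096, 16 * 1024))  # same read, misaligned by 4k
```

An aligned 16k read spans two 8k stripe units, so two drives; a read that starts mid-unit can spill onto a third.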

I just tried running IO Meter and it failed :( I installed some Windows updates last night, and I wonder if that fubar'd my IO Meter :(