Question RAID 5 - Intel Raid controller RS3DC040 Failed drive replacement help

massop

Distinguished
Aug 19, 2015
Good evening guys.

I have been working on this for 5 days and I am now at a complete loss...

I have a custom-built FreeNAS server with 4 x HGST 10TB SAS drives (12Gb/s) set up in RAID 5 using the Intel RAID controller RS3DC040.

For almost 2 years this has been working perfectly.

Recently one of the drives in the RAID array failed. No big deal: all the data is intact and the array still works in its degraded state.

I replaced the faulty drive with an identical one and it shows up in the Intel RAID controller BIOS, however for the life of me I cannot get it to work within the original array.

I placed the new HDD in the exact same SAS port as the failed one, thinking the array would just automatically rebuild onto it, but the drive only ever shows as UG (Unconfigured Good) in the RAID BIOS. If I set the HDD as a hot spare I get the error "warning none of the existing vds will be covered by this hot spare", which I guess makes sense as it was created after the virtual disk.
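Since the RS3DC040 is a MegaRAID-family controller (the changelogs below even reference storcli64), an Unconfigured Good drive can usually be inserted into the degraded drive group from the StorCLI command line instead of the BIOS. This is only a sketch: the controller/enclosure/slot IDs (c0, e252, s3) and the dg/array/row numbers are placeholders — read the real values from the TOPOLOGY and PD LIST sections of the first command before running the rest.

```shell
# Placeholders throughout: c0 = controller, e252/s3 = enclosure/slot of the
# new drive, dg=0 array=0 row=3 = position of the failed member. Read the
# real values from the output of the first two commands.
storcli64 /c0 show                                 # note DG, Array and Row of the missing member
storcli64 /c0/e252/s3 show                         # confirm the new drive reports UGood
storcli64 /c0/e252/s3 insert dg=0 array=0 row=3    # slot the drive into the degraded array
storcli64 /c0/e252/s3 start rebuild                # kick off the rebuild if it does not auto-start
storcli64 /c0/e252/s3 show rebuild                 # check rebuild progress
```

If the controller's auto-rebuild setting is off, a swapped drive will sit at Unconfigured Good exactly as described; `storcli64 /c0 set autorebuild=on` makes future same-slot swaps rebuild automatically.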

I have read the Intel manual on this, and the options the manual lists for these tasks are greyed out for me; see the screenshots below.


[Screenshots of the RAID BIOS menus: hUTa6Ac.jpg, szcvesR.jpg, pdbqQOX.jpg, qcoiLZ3.jpg]


I thought about updating the controller firmware, but I am only 2 revisions behind and none of the patch notes mention anything that could explain my problem (I am on Firmware Package: 24.21.0-0091 (MR 6.14 Point Release)).
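For what it's worth, if a firmware update does become necessary, it can be done from StorCLI without touching the BIOS at all. A rough sketch — the .rom file name here is hypothetical; substitute the package file actually downloaded from Intel for the RS3DC040:

```shell
# "mr3108fw.rom" is a placeholder name for the downloaded firmware image.
storcli64 /c0 show all | grep -i 'fw package'   # confirm the current 24.21.0-0091 package
storcli64 /c0 download file=mr3108fw.rom        # flash the new firmware image
storcli64 /c0 show all | grep -i 'fw package'   # verify the new package (after a reboot)
```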

Firmware Package:24.21.0-0132 (MR 6.14 Point Release 6)
Firmware 4.680.00-8527
ROMENV 1.10
BootBlock 3.07.00.00-0004
NVDATA 3.1705.00-0021
UEFI_Driver 0x06180203 (SIGNED)
Hii 03.25.05.13 (SIGNED)
FCODE 4.19.08.00
BIOS 6.36.00.3
Ctrl-R 5.19-0603
PCLI 1.07.05

Bug Fixes and Enhancements:
===========================
Firmware:
* DCSG00230289 (Closed): MR FW 6.14 - Integration of 3108 Phase 14 MR PL FW Version - 14.00.15.00
* DCSG00077493 (Closed): 9361 (3108) MR6.14: support for sanitize command
* DCSG00136418 (Closed): MR FW 6.x - Integration of 3108 Phase 14 MR PL Version - 14.00.13.00
* DCSG00134675 (Closed): MR6.14: Crytoerase Sanitize Type is failing on SED Drive
* DCSG00072635 (Resolved): storcli return with error when set patrolread mode
* DCSG00068474 (Closed): Creating VD for Direct Connect fails
* DCSG00102246 (Resolved): Dev defect to investigate CLCA code review finding for DCSG00051973 - possible loss of snap dump during abrupt power loss
* DCSG00074489 (Resolved): SEP discovery issue after reset / enclosure reset - RMDP-528
* DCSG00129646 (Closed): MR6.14: Drive state change to Sanitize after crypto erase operation completion "storcli64.exe /c0/e42/s20 start erase crypto"
* DCSG00129198 (Closed): MR6.14:WG: Sanitize operation does not show remaining time in progress bar notification/Event logs
* DCSG00146728 (Resolved): MR 6.14:While clearing the configuration sanitize running drives state has been changed from “SANITIZE_IN_PROGRESS(4) to UNCONFIGURED_GOOD(0)”.


Firmware Package:24.21.0-0126 (MR 6.14 Point Release 5)
Firmware 4.680.00-8519
ROMENV 1.09
BootBlock 3.07.00.00-0003
NVDATA 3.1705.01-0013
UEFI_Driver 0x06180203 (SIGNED)
Hii 03.25.05.12 (SIGNED)
FCODE 4.19.08.00
BIOS 6.36.00.3
Ctrl-R 5.19-0603
PCLI 1.07.05

Bug Fixes and Enhancements:
===========================
Firmware:
DCSG00069702 - Fix to make allowBootWithPinnedCache sticky through reset
DCSG00051973 - MR_6.14: observe “!!! Firmware halted. Fault code = 0x83 !!!” while running double DIP cache offload test on 3108
DCSG00067176 - SAS9341-8i board encounters 'Previous configuration completely missing at boot' error
DCSG00048845 - Cache offload failing to complete on 3108 resulting in L3 initialization failing
DCSG00051973 - Fix to incomplete onfi register configuration leading to FW HALT
DCSG00025975 - Fix to update VD guid info in pinned cache to resolve import foreign config failure
DCSG00063321 - Fix to fast-path task queue look ups


I don't want to mess with updating the firmware if I don't have to.

This has been driving me crazy. Does anyone have any suggestions on how I can get the replacement drive working in this array?

Thanks in advance.
 
Small update: I managed to get the data off the array, and then I deleted the virtual disk to rebuild from scratch.

This didn't work, because I now realise the "identical" replacement drive I got is not actually identical: same brand, model, and speed, but its sector format is 512e whereas all the drives in the array are 4Kn.

So this was the problem all along.

I am checking to see if I can change settings to allow a RAID 5 to be built with mixed 512e and 4Kn drives. If not, then I'll have to get another drive...
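For anyone hitting the same wall: the 512e-vs-4Kn mismatch can be spotted before buying a replacement. A sketch of two ways to check, assuming a Linux host with smartmontools — the megaraid device number (9) and `/dev/sda` are placeholders, found via `smartctl --scan`:

```shell
# The SeSz column separates 512B (512n/512e) from 4 KB (4Kn) drives at a glance.
storcli64 /c0/eall/sall show

# Per-drive detail through the controller's passthrough; a 512e drive shows
# a 512-byte logical and 4096-byte physical block size, a 4Kn drive shows 4096 for both.
smartctl -i -d megaraid,9 /dev/sda | grep -i 'block size'
```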