Weird hard drive behavior, Intel X25-M RAID 0


csflame4

Recently my hard drives have been acting weird. This is going to be hard to explain, but I'll do my best.

Every now and then my hard drives will just start doing something: the hard drive activity LED on my case stays on solid, with no flicker. When this happens I can't really run any programs. iTunes will not start, some of my games won't start, and even the Windows system health report doesn't work correctly; it just hangs on the page where it says it is collecting information for 60 seconds. I've looked through Task Manager and I don't see anything unusual, and I checked Performance Monitor and there's nothing unusual in there either. Even when I try to shut the computer down, it sits on the shutdown screen for longer than normal. Normally the computer turns off when the monitor turns off, but instead it stays on and the hard drive LED is still lit. Is this Intel's version of garbage collection? It never did this when I had just one SSD.

This is extremely frustrating and I'm about ready to reformat, even though this Windows installation is not even a week old. I have applied all of the recommended tweaks for an SSD to run correctly, so I cannot figure this out.

If I just leave it on and let it do its thing, I still cannot run some programs when it finally finishes. Sometimes if I restart it will not do it again, and then I can play my games and run iTunes; other times, when I restart after it has finished, it does it again once the computer is back on.
 
I don't think any of the tweaks could lead to the kind of behavior you described; and you also said something comparable happened during Windows setup/install.

Could you tell me exactly what you have in your system? CPU, mobo, disks, PCI cards, everything! Also, do you have any USB devices plugged in besides the keyboard and mouse? Can you confirm you ran Memtest86+ for at least 24 hours and used Prime95 to test the stability of the CPU? Prime95 can be found with Google and runs in Windows, unlike Memtest86+, which you have to burn to a CD.
 
Yes, that is supposed to be a 2 instead of an S. I have run Prime95 and Memtest for 24 hours. Everything is stable.
Core i7 950 @ 3.83 GHz
Mushkin Redline DDR3-1600 @ 1603 MHz, 12 GB; BTW, it is running on the XMP profile with the default timings, which are 6-7-6-18
EVGA SLI Classified mobo with the most current BIOS (BIOS 51)
Creative X-Fi Titanium Fatal1ty Pro, which uses the small PCI-E slot
No other PCI cards
Lite-On Blu-ray disc drive
Lite-On DVD-RW drive
2 Intel X25-M SSDs
1 WD Caviar Black 1TB
1 WD external 1TB My Book, which is the only thing plugged into a USB port besides the mouse and keyboard
1 Nvidia GTX 295
Thermaltake 1200 watt PSU
I think that's everything

Another thing that may be useful: before I was running RAID, I only had one X25-M. Every now and then my computer would randomly stop detecting one of the drives. Sometimes it would be the SSD, and when that happened the computer would crash; sometimes it would be the WD Caviar; sometimes it would be one of the optical drives. The only fix that worked was to switch the SATA ports. Sometimes it wouldn't happen for weeks, sometimes it would happen multiple times a week. I pretty much narrowed that down to the SATA controller. The hanging problem I'm having now never happened before I set up RAID; I bought a second X25-M and eventually started having it. Today, for the first time, my mobo didn't detect one of the SSDs and Windows would not boot. I had to switch the SATA ports to fix it, and then I switched them back and it booted up.

So could this maybe be a mobo problem?
 
It really sounds to me like you're having some sort of motherboard or controller problem that's likely not the fault of the Intel SSD. Have you checked your system event logs to see if there are any errors being reported?

It would also be worth checking the SMART attributes to see if there are any DMA errors. Unfortunately my Intel SSD doesn't seem to have some of the SMART error counts that my hard drive does, but it does have an "End to End Error" count that might reveal connection issues.
 
I haven't checked the logs yet, but I will. What exactly do I check for in the logs? Is there another way to check the SMART attributes? I know the Intel Toolbox checks that, but the Toolbox does not support RAID.



 
Look for what I call "red bullets" in the event logs - events that have a red error icon next to them. Not all of them are really all that bad, but if you find stuff related to the disks then it might provide you with a clue.

I suspect with RAID you may be out of luck in accessing the SMART data, unless the RAID configuration screens themselves have that capability.
 
Didn't see any errors related to the disks. I also couldn't find anything to check the SMART attributes in the Matrix console; the only thing I saw was that it can verify the RAID array, so I did that. It took about five minutes and came back with no errors.

BTW, my mobo stopped detecting one of the SSDs again, so my system froze and then got a BSOD. I had to switch the ports again to fix it.
 
Perhaps it is simpler to boot an Ubuntu Linux CD and access the SMART information from there. If you want to try this, download the latest Ubuntu desktop ISO from ubuntu.com, burn it, boot from it and click Applications -> Accessories -> Terminal.

Then install smartmontools:
sudo apt-get install smartmontools

Then check what kind of disks you have:
ls -l /dev/sd*
dmesg | grep GiB

Then retrieve SMART data from your HDDs (smartctl needs root, hence sudo):
sudo smartctl -a /dev/sda

You should get a lot of output. Specifically, check the RAW VALUE of the following attributes:
- Reallocated Sector Count
- Current Pending Sector (this NEEDS to be zero or you have a problem)
- UDMA CRC Error Count (this indicates cabling errors)

Do this for all HDDs/SSDs; though the output of the SSDs may look very different. If possible, post all data by copy-pasting it while still in Ubuntu; you should be able to get Firefox going and login and create a post from there.

Remember: with the Ubuntu live CD, nothing is installed to your hard drive. That also means everything is reset after a reboot, so you would need to install smartmontools again. Downloading the package also requires an internet connection (just plug in the Ethernet cable).
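
If you have several disks, the check above can be looped instead of typed once per drive. This is only a rough sketch: it assumes your drives show up as /dev/sda, /dev/sdb and so on, and that smartctl lists the attributes under the usual smartmontools names (Reallocated_Sector_Ct, Current_Pending_Sector, UDMA_CRC_Error_Count). The helper names smart_summary and check_all_disks are made up for this example, and the Intel SSDs may not report all three attributes.

```shell
# smart_summary: hypothetical helper, not part of smartmontools.
# Reads `smartctl -A` output on stdin and prints "attribute raw_value"
# for the three attributes listed above (raw value is the 10th column).
smart_summary() {
    awk '$2 ~ /^(Reallocated_Sector_Ct|Current_Pending_Sector|UDMA_CRC_Error_Count)$/ { print $2, $10 }'
}

# check_all_disks: hypothetical helper that runs the filter against every
# whole disk matching /dev/sd? (assumed naming: sda, sdb, ...).
check_all_disks() {
    for disk in /dev/sd?; do
        echo "== $disk"
        sudo smartctl -A "$disk" | smart_summary
    done
}
```

Paste both functions into the terminal and then run check_all_disks. A disk that prints nothing under its name simply does not report those attributes under those names, so fall back to the full smartctl -a output for that one.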
 
Well, I was replying to your post about 15 minutes ago until my computer started going crazy.

I will start with what I was originally going to write. I haven't had much time lately to run Ubuntu, but I will try. Today when I turned on my computer it booted up noticeably slower, so I restarted it. Instead of going to the black screen that says Windows 7, it just went to a black screen with nothing on it and the hard drive LED on. I let it run for about five minutes, then pressed the restart button. When the RAID setup screen came up, it showed an error on the second disk in the RAID array, so once it booted I had to go into the Matrix console and set that disk back to normal. I got an error on the RAID array a few days ago and I'm 90% sure it was the same disk; I have also done a clean install of Windows since that first error.

First, just to make sure I'm doing this right: when I do a fresh install of Windows, I normally delete the RAID array, then go into Windows repair and run a full clean on both drives, then go back and create the RAID array, then boot into my second Windows partition on my storage disk and do a full format on the RAID array.

Now the fun begins. As I was writing this post the first time, the computer would fully freeze for about two seconds, and every time it froze the hard drive LED would come on. This happened about once every 5-10 seconds. Finally, after about five minutes of this, the Matrix console reported an error on the RAID array; once again it was the second disk. When I opened the console, it didn't even show the second disk. Then it froze again, only this time it stayed frozen and then got a BSOD. It booted up and did not detect the second disk; I had to switch the ports for it to detect it again. Now every time I boot from the RAID array, I get a BSOD about a minute after it has booted. I took a screenshot of it.

photo.jpg


I am writing this post on my second Windows install and everything is running smoothly: no freezing, and none of the weird hard drive behavior I mentioned before.

I think I am just going to use this partition until I can fix this issue. That way I will also be able to tell whether the problems carry over to this partition.
 
If you can get into Windows with the drives connected to a non-RAID controller, you can use HD Tune to access the SMART data by looking at the Health tab. Then please post the SMART data here for all the disks it finds. It is still a mystery to me too.
 
Ok, I will do that when I get home. What is the best way to get SSDs back to like-new condition? I don't mean speed, because I know they wear down; I'm just wondering whether I'm doing something wrong when I perform a fresh install and maybe that is causing some of these issues. I've tried HDDErase but I can't figure out how to get it working; I'm not very good at DOS apps.
 
No, I have not replaced the SATA cables; I will do that when I get home. I'm not sure if I have any new ones, though. My motherboard isn't even a year old yet. Do SATA cables go bad that easily?
 
Yes, it has happened to me several times. You should be able to identify cabling issues by looking at the UDMA CRC Error Count SMART value. Replacing the cables just in case can't do any harm, though I would check the SMART output first.
 
This is kind of a dumb question, but how do I check all of the things you mentioned in HD Tune? I see where it says Info, but it doesn't show anything else. Also, I'm assuming the SSDs need to be single disks in order to test them, not set up in RAID?
 
Here are some screenshots of my results with HD Tune and Intel Toolbox. I replaced all of the cables with brand new ones still in their packaging. The Western Digital drive was tested with each cable in every SATA port on the mobo, and each time it returned an error count of 45.

First screenshot with one SSD

Untitled1.jpg


Second screenshot with the other SSD

Untitled2.jpg


Third screenshot with western digital drive

Untitled3.jpg


Here is one screenshot with two Intel Toolbox windows running, one for each SSD

Untitled4.jpg


EDIT: I just did a full format on both drives and one of them now has a bad sector...
 
Seems you indeed had cabling issues on your WD drive; that could have caused Windows to switch to PIO mode (temporarily) and cause very low performance across the board even for infrequent accesses to the WD drive.

The number should not rise, so you may want to keep an eye on it. If it stays at 45 now that you have replaced the cables, you should be fine and can rule cabling out as a cause of any issues you encounter from this point on.
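
One way to keep an eye on it is to compare the current raw value against the 45 you recorded. The following is a rough sketch only: it assumes you are in a terminal with smartmontools available (for example the Ubuntu live CD), that the attribute appears under the usual smartmontools name UDMA_CRC_Error_Count, and the helper name crc_changed is made up for this example.

```shell
# crc_changed: hypothetical helper, not part of smartmontools.
# Reads `smartctl -A` output on stdin and compares the raw
# UDMA_CRC_Error_Count (10th column) with the baseline passed as $1.
crc_changed() {
    baseline=$1
    current=$(awk '$2 == "UDMA_CRC_Error_Count" { print $10 }')
    if [ -z "$current" ]; then
        echo "attribute not reported"
    elif [ "$current" -gt "$baseline" ]; then
        echo "rising: $baseline -> $current"
    else
        echo "not rising: $current"
    fi
}
```

Usage would look like: sudo smartctl -A /dev/sdc | crc_changed 45, where /dev/sdc is only a guess at the WD drive's device name. Check which device it really is with ls -l /dev/sd* first.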

In the last sentence you said you formatted both drives (I assume the SSDs in RAID 0?) and that one of them has a bad sector. Do the SMART values say anything about that? Reallocated Sector Count? Where do you see the bad sector?

Also, I can see a high number of "unsafe shutdowns" reported by the Intel Toolbox utility. That could be due to a bug where Windows shuts down too early, not giving the SSD/HDD enough time to flush everything to disk, but I'm not aware of any such bugs in Windows 7. It could also be the result of cabling issues, for example if one of the SSDs had a bad SATA power cable or converter cable.

Still, based on the data gathered so far, the most likely suspect for your symptoms is cabling.
 
Hmm, I never noticed the unsafe shutdown part. I'm not sure what would be causing that. Could any of the SSD tweaks be causing it? Remember, I linked the sites that I use for the tweaks.

Yes, I do format the SSDs in RAID 0. I didn't take any screenshots of HD Tune when it reported the bad sector. The Toolbox said everything was ready to use, and HD Tune reported only the one bad sector.

Since then I have done a secure erase on both drives and have installed Windows and all the drivers. I am writing this post while booted from the RAID array and so far everything is good.
 
I looked at your tweak links; they appear to be harmless.

The reported bad sector could also have been a glitch caused by a cabling issue: because that one sector took so long to read, HD Tune may have flagged it as bad.

So I would keep an eye on the SMART info on all your drives (SSDs + HDDs) to see whether the cabling issues are gone now.
 
Ok, I will monitor my drives. I think the next step will be to buy new cables; if that doesn't fix it, I will try running the drives by themselves for a while and see how that works. If they're fine, I suppose it may be a RAID controller problem and I will probably just RMA the motherboard.

I really appreciate the help
 
Well, I just noticed something interesting. Everything was running normally on my fresh install right up to the point where I put a game disc in my Blu-ray drive to install it. Now it is doing the thing I have been having problems with. I left it on the main install screen for a couple of minutes, and the drive was spinning at maybe half speed or so, fast enough that I could hear it, with the green activity light flickering. I then exited the install screen, and the drive is still spinning and the green light is still flickering, even though nothing from the disc is open.

Maybe this really is a faulty SATA controller. I am still using the new SATA cables.

EDIT: I just ejected the disc and the hard drive LED turned off. I put the disc back in and the hard drive LED came on again, even on a normal install screen where you just hit Next. Now it is sitting on the setup screen with no progress, and it will probably stay like that for a few minutes.

It finally started installing, but extremely slowly. Not just slow for an SSD; even my Caviar Black can install twice as fast as this.
 
Just got some interesting results with HD Tune and AS SSD Benchmark

Untitled1-1.jpg


and this one is after I clicked OK on the error in the previous screenshot

Untitled2-1.jpg


EDIT: I just ran the AS SSD benchmark again and it worked correctly. I opened up the Matrix console and noticed that the volume write-back cache was disabled, so that may have been the problem. I enabled it and it worked fine.