MemTest Failed but I suspect the Harddrive or Virus

gmatt

Mar 8, 2009
Hi,

I purchased a Dell system a short while ago, and when playing games I occasionally got corrupted data file errors. These errors rarely cropped up at first, but they have become more and more frequent of late. I suspected my RAM might be the culprit, so I ran MemTest http://hcidesign.com/memtest/ . I have 8 GB of RAM and I ran 3 copies, testing a total of 7.5 GB at once. After about 5 hours it completed one 100% pass and found one error. I thought that I had indeed found a memory problem, but then I ran several other memory tests to confirm faulty RAM (Vista Memory Diagnostic tool, Memtest86, Dell Diagnostic tool), none of which could find a similar error (the RAM passed each with flying colors). Note that the difference between the latter 3 tools and the first is that the first runs while Vista is active and reading/writing to the hard drive, while the latter 3 run without an OS. In fact, I read bits of the MemTest manual and came across this excerpt:

"NOTE: If you run MemTest and it only checks a few % of RAM over the period of an hour, this means you told it to allocate more RAM than is available. When this happens, almost all of the testing time is taken reading from the hard disk swap, which is a reasonable hard disk check, but not very useful for checking RAM. Select less RAM to check and try again. "

So it seems that the error was most likely caused by the hard drive, since my initial test left only 500 MB of RAM free while testing 7.5 GB.

My question to anyone willing to comment: would such an error more likely be a hardware error with the hard disk or a software error (like a driver issue, or a virus; I ran several downloaded executables that were at high risk of carrying a virus without checking them)?
 
Download Ultimate Boot CD and run memtest86+. You will need to burn the ISO to CD with ISO Recorder or other ISO burning software. Run it overnight for at least a couple of passes, unless it fails of course. If memtest86+ doesn't fail then your RAM is good.

Don't run a RAM test in the OS, especially if the OS is getting flaky; you obviously can't trust the results.

Download the HD manufacturer's tools and test the drive. You can also download SpeedFan, pull the SMART data from the drive, and compare it to their online database.

You can also run HDTune and see if the drive is performing to spec.
 


Thanks for the reply Zorg. I did run memtest86+ for a couple of passes and it reported 0 errors, but I'm flogging it some more just to be sure.

I was just going to ask you if you knew of any software to test HD instability but it looks like you already answered that =) I will try those things you suggested to test the HD.
 



Thanks for the opinion. Do you have any suggestions on how to isolate a failing drive, other than wait until it fails? =P
 
Zorg mentioned HDTune, which will work. Typically if a drive is failing a simple check will report errors, and if those are repaired the next check reports more.

I had not clicked on your memtest link before. The one you want is here:
http://www.memtest.org/
and is run from a booted CD, not from Windows. I think the program in the boot CD Zorg gave is the same or very similar.

In Vista, when you right-click on a drive and select Properties, you'll have a tab called Tools. In there you'll have the option to error check the drive.

 
just an aside:

I was getting memtest errors due to a failing PSU. I tried known-good RAM and still got errors, both running Vmemtest (memtest for video; I might have the name wrong) and running other tests. I tried a new PSU and everything worked.
 
A bad/junk PSU can throw all kinds of errors. It's probably the most important, and most neglected, piece of hardware in the computer.

Not to mention it will also eat hardware.
 



Thanks for commenting. This lends credence to the theory that not all memtest failures are caused by faulty RAM.
 
A bad mobo can also cause memtest86+ to fail, but it's more likely going to be the RAM.

Anyone that buys a junk PSU is already screwed. It's a good way to eat hard drives as well.

I guess we should say, "If you have a junk PSU then replace it and post back." :lol:
 



Actually I think I narrowed down the problem to a WiFi card (D-Link DWA-556, which uses PCIe) that I recently installed. It seems to be the most logical explanation because:

1) I never had any corrupt file issues before installing this card (recently)
2) When I select shutdown in Vista x64 it proceeds to shut down, but then the computer reboots instead!
3) When I remove this WiFi card, Vista x64 shuts down properly!
4) From what you guys have mentioned about the PSU being a common source of errors, it doesn't seem far-fetched that the WiFi card is wreaking havoc in subtle ways, maybe changing voltages or something; if it has the power to cause a reboot instead of a shutdown, it seems like it could very well do anything else it pleases
5) I ran memtest86+ for almost 2 days with this card removed and I had no errors in RAM
6) I checked my hard drives using the manufacturer's software and they were reported to be stable and healthy (not failing)
7) I have since installed the most recent drivers for my motherboard (nForce 650i) and the computer now properly shuts down instead of rebooting (from Vista), even with the WiFi card installed


So as I said in 7), I have since installed the most recent nForce drivers and that seems to have corrected the behaviour of the rampant WiFi card. I'm going to rerun the same test I performed earlier and see if I still get an error. Does anyone know if a WiFi card like this can damage other hardware? Can I sue D-Link if it does?
 
Anything is possible, but you won't get anywhere trying to hold D-Link responsible. I doubt the card damaged anything; it just made things screwy. I should have told you to strip everything out of the machine to begin with, but I didn't think you had any PCI cards. I'll be more thorough in the future. Always look for a recent change when the machine acts up.

A friend had me working on his Dell with a 3G Prescott in it. I reloaded it and ran every test under the sun and it performed flawlessly. He took it home and it started acting up again. It turned out to be a cheap USB wireless dongle. It took me a while to figure it out, because he never told me he was installing it and never gave it to me with the machine.

I had a problem with what appeared to be a 74G Raptor on my machine. The machine ran fine with a PATA drive but not properly with the Raptor. I swapped out the drive with another SATA and found out that the P35-DQ6 mobo had gotten flaky. The really odd thing was that the SATA drives would run slow on boot and occasionally in use, but the on board ethernet would fail after 10-15 minutes and the PATA drives created no problems at all.

It's not always easy to diagnose a problem.
 



You're absolutely right, this was by no means an easy diagnosis, but I'm gaining confidence that it is the correct one. In the future I will know to strip the PC down to isolate the problem much faster (especially removing recently added hardware). Thanks for all your help.

Since my last post I have run Memtest86+ for about 4 hours with the WiFi card I mentioned in my previous post installed. Memtest86+ detected an error on pass #1 (which is the second pass, since Memtest86+ starts counting from 0). I am now about 95% certain that the WiFi card is screwing with my memory. The only other possible explanation is that Memtest86+ simply missed this error when I tested the first time without the card, but that is extremely unlikely because that run was much longer than the one with the WiFi card. To make this explanation even less likely, I will now run Memtest86+ for 4 passes or more without the WiFi card. I'll report back my findings.

PS. I have written down the details of the error that Memtest86+ found on a piece of paper. Is it possible to test directly whether I can reproduce the error without having to run all the other tests? The pattern it failed was 00010000 at memory address 00098456f14 (2436.4 MB). Do you know if memory addresses are static when mapped to the hardware, or do they change every time you run software/Memtest86+? If they are static, can I just write a really quick and dirty C program that attempts to write to this address and then read from it? For now I am just doing it the naive way and re-running Memtest86+.
 
I would just run Memtest86+ again and go to sleep. There is no guarantee that the error would occur at the same address, assuming the WiFi card is to blame. It is probably random.
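
On the "quick and dirty program" idea: the address Memtest86+ prints is a physical address, while an ordinary program running under Vista only sees virtual addresses that the OS maps to whatever physical pages it likes. So a normal user-mode program can't pick physical address 00098456f14 to poke at. What it can do is write a pattern into a buffer it allocated and read it back. Below is a minimal sketch of that idea, in Python rather than C for brevity. The function names and buffer size are made up for illustration, and the only thing taken from the thread is the 00010000 pattern. A clean result here says very little, since the buffer may never land on the suspect region.

```python
import struct

PATTERN = 0x00010000  # the failing pattern quoted in the post above


def fill(buf, pattern):
    """Overwrite buf (length must be a multiple of 4) with a repeated 32-bit pattern."""
    word = struct.pack("<I", pattern)
    buf[:] = word * (len(buf) // 4)


def find_mismatches(buf, pattern):
    """Return the byte offsets of every 32-bit word that does not match the pattern."""
    word = struct.pack("<I", pattern)
    return [off for off in range(0, len(buf), 4) if buf[off:off + 4] != word]


def pattern_test(size_bytes, pattern=PATTERN):
    """Allocate a buffer, write the pattern, read it back, and report bad offsets."""
    buf = bytearray(size_bytes - size_bytes % 4)  # round down to whole words
    fill(buf, pattern)
    return find_mismatches(buf, pattern)


if __name__ == "__main__":
    bad = pattern_test(16 * 1024 * 1024)  # 16 MiB; size chosen arbitrarily
    print(f"{len(bad)} mismatched words")
```

The offsets it reports are positions inside the program's own buffer, not physical addresses, so they can't be compared with what Memtest86+ shows. For actually confirming or clearing the RAM, re-running Memtest86+ from the boot CD is still the way to go.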