Very odd behavior with BSOD & Memtest86+

thephatp

Distinguished
Jul 6, 2007
52
0
18,630
Hi everyone,

I apologize for the long post, but I'd like to give as much relevant info as possible up front. And I desperately need help, so please hang with me.

I just built a new system, and I'm having some very frustrating and debilitating problems. I'll try to be as concise as possible, but there's a lot to it.

Background: I've run both Windows XP Pro SP2 32-bit and Vista Business 64-bit with similar results. Currently, I'm running XP.

I started experiencing random BSODs about 6 weeks ago. I have since tried to decipher a large number of minidump files (unfortunately, I didn't save any from Vista), and I'm not sure what to think. About half of them are "IMAGE_NAME: memory_corruption" but not all of them. From reading through some of the dumps, I thought it was my sound card (which was an old SB Audigy 2 Platinum card), because of many `kmixer.sys` problems in the minidumps. So, I finally replaced it, thinking it would solve the problem, but to no avail.

The trend is very strange. I shut down (read, completely turn off) my PC. Give it a few minutes. Turn it back on. Now I get about 3 days to 1 week with no problems, no BSOD. Finally, it crashes somewhere in that timeframe. So, I reboot (read, do NOT shut down or turn off), and then come back up and check out the dump. Now I get about another day or so (1-2 days) before another BSOD. Again, reboot, keep working, and then BSOD in some number of hours (say 6-12 hrs). Again, reboot, keep working, and then BSOD in about 1 hour. The time between crashes continues to shrink until I completely shut down the PC. If I shut it all off, give it a few minutes, then power it back on, I "reset" my time horizon and can get another 3 days to 1 week, but the process just repeats.

Now, a couple of weeks ago (when I was switching back and forth between Vista and XP Ghost images), I was getting to a point where I'd try to load Ghost or the Vista repair CD, but both would provide an error saying "Kernel is corrupt, or missing". It took quite a while, but I finally got around this. FYI, all of my problems started happening after I downloaded and installed a number of CS3 trial apps (Photoshop, Illustrator, Acrobat, Dreamweaver). None of the trials worked; each gave an error. Mind you, this is on a brand new hard drive, and I started from scratch with Windows installs (then made Ghost images afterward), so there was never any Adobe software installed before. I've been really tempted to blame my problems on Adobe, but I'm trying to be rational. ;)

Ok, back to memory testing...When I was having crashes in Vista and experiencing the "kernel" errors above...Once I got past the kernel error such that I could run the Windows Memory Diagnostic tool, it would report "hardware issues" within about the first 30 seconds of testing. This was VERY consistent with this tool throughout about 2 weeks of crashing and testing. At the same time, Memtest86+ never reported a single error.

More recently...Last night I experienced my first crash after replacing my sound card. I rebooted and immediately ran Memtest86+ and lo and behold, it reported a large number of errors. I didn't know how to read the display and unfortunately, I didn't pay much attention to the errors except that I saw a lot of "ffffffff" passed but "0fffffff" (or something other than 'f' in the first place) failed. I thought, well, I'll just remove the 4 sticks (4x1GB) and test each one individually. I let each one go through at least the first pass (in the first DIMM slot) and no errors. It was really late, so I put them all back in and ran the test overnight, and 5 hours later, no problems whatsoever.
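
For anyone reading a Memtest86+ error line like the one described above, the failing bits can be found by XOR-ing the expected pattern with the value actually read back. Here is a minimal Python sketch using the approximate values recalled in the post (`ffffffff` expected, `0fffffff` read back). The exact values were not recorded, so treat them as illustrative, and note that `failing_bits` is a made-up helper name.

```python
def failing_bits(expected, actual):
    """Return the bit positions (0 = least significant) where the two values differ."""
    diff = expected ^ actual  # a 1 bit marks a position that read back wrong
    return [bit for bit in range(diff.bit_length()) if (diff >> bit) & 1]


# Illustrative values only: the post recalls roughly this pattern.
expected = 0xFFFFFFFF
actual = 0x0FFFFFFF

print(f"error mask: {expected ^ actual:08x}")
print("failing bits:", failing_bits(expected, actual))
```

If the same bits keep failing at different addresses, that is often taken as a hint toward a single chip or data line rather than scattered bad cells, but that is a rule of thumb and not a diagnosis.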

So, here's my list of questions:

■Has anyone seen the consistent process of BSODs progressively occurring more often as I explained above?
■Does restarting completely flush the RAM, or is power still supplied such that only a hibernate or shutdown would flush it? (I ask b/c perhaps this is why it gets progressively faster until I kill power.)
■Could the answer to the question above relate to why I saw errors in memtest86+ immediately after I rebooted, but never after a shutdown (which is obviously required for removing RAM...ie., memory is flushed, so my time horizon starts over)?
■Can I determine (from the output) which RAM module is faulty if memtest86+ shows an error? Or do I then have to start testing each individual stick?
■Is there any other stress test I can use to try to test the RAM? (I'm tired of waiting days in between crashes to troubleshoot...this is getting very drawn out.)
■What else (what other hardware) could be causing this problem? And how would I test such hardware?
■Would posting any of my numerous minidump files be of any benefit?
■Any other suggestions on how to go about troubleshooting this issue?
■Could Adobe really have anything to do with this? (I've thought about zeroizing my HD, but that would be a huge pain as well, and I have no real evidence to support trying that right now.)
■Anything that you can think of that I didn't ask! ;)


Also, another FYI...I updated my video card drivers (posted Jan 8, 2008), as well as my sound card drivers (beta version posted yesterday, Feb 13, 2008). I'll list my setup below.

Any help/suggestions/comments are GREATLY APPRECIATED!!! (And sorry for the long post.)

Thanks,

Chad


Core 2 Duo E6750
4GB (4x1GB) Crucial Ballistix DDR-800 RAM
Gigabyte P35-DS3R v2.0
PowerColor Radeon x1950pro 512MB
Creative X-Fi Fatal1ty Platinum Sound Blaster
Seagate Barracuda 250GB SATA 7200 RPM
PC Power & Cooling Silencer 610 EPS12V 610W Continuous
Antec Solo case w/ 3 fans + Zalman CNPS9500AT
 

night_wolf_in

Distinguished
Jan 7, 2007
702
0
18,980
It seems you have a RAM issue.

I'm not sure, but maybe one of the sticks is faulty. You'll have to do extensive testing with Memtest86+.

And as said above, are you giving your RAM enough juice? Meaning, enough voltage to meet the manufacturer's specification.
 

coldneutron

Distinguished
Oct 13, 2006
106
0
18,680
I agree that, at first glance, it seems like a memory problem. By "(read, completely turn off)" are you switching off the PSU or otherwise removing all power to the system? If so, and you have to do that for the system to operate at all, that would implicate the memory controller or mobo, IMHO.
 

thephatp

Distinguished
Jul 6, 2007
52
0
18,630


Not switching off the psu, but simply doing a shutdown from windows or manually killing it with the power button on the case, not the psu switch (although, I did unplug the psu power when changing out the RAM). In this case, would you still suspect something like the mobo? I'm really hoping it's not a bad DIMM slot or something, because replacing the mobo is the worst of all. :(
 

thephatp

Distinguished
Jul 6, 2007
52
0
18,630


I'm not really sure where to look for this info. On the "PC Health Status" screen, I see the following in relation to voltage:

Vcore 1.332V
DDR18V 1.888V
+3.3V 3.328V
+12V 12.048V
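
To put those readings in context, each one can be compared with its nominal rail value. The sketch below assumes nominals of 1.8V for DDR2 (the standard DDR2 voltage), 3.3V and 12V for the labeled rails, and 1.35V for Vcore (the "Normal CPU Vcore" figure the BIOS reports). The names `readings` and `deviation_percent` are made up for illustration.

```python
# BIOS "PC Health Status" readings from the post, paired with assumed nominals.
readings = {
    "Vcore": (1.332, 1.35),    # nominal taken from "Normal CPU Vcore" in the BIOS
    "DDR18V": (1.888, 1.80),   # assumed nominal: standard DDR2 voltage
    "+3.3V": (3.328, 3.30),
    "+12V": (12.048, 12.00),
}


def deviation_percent(measured, nominal):
    """Signed deviation of a measured voltage from its nominal value, in percent."""
    return (measured - nominal) / nominal * 100.0


for rail, (measured, nominal) in readings.items():
    print(f"{rail:7s} {measured:7.3f}V  {deviation_percent(measured, nominal):+.2f}%")
```

On these numbers the DDR rail reads roughly 4.9% above 1.8V, while the other rails sit within about 1.5% of nominal. BIOS sensor readings are only approximate, so a multimeter would give a more trustworthy figure.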



In the MB Intelligent Tweaker (MIT) screen, I see the following:

DRAM Timing Selectable (SPD)      [Auto]
******** Standard Timing Control ********
x CAS Latency Time                5     4
x DRAM RAS# to CAS# Delay         5     4
x DRAM RAS# Precharge             5     4
x Precharge delay(tRAS)           18    12
******** Advanced Timing Control ********
x ACT to ACT Delay(tRRD)          3     Auto
x Rank Write To READ Delay        3     Auto
x Write to Precharge Delay        6     Auto
Refresh to ACT Delay              52    0
x Read to Precharge Delay         3     Auto
x Static tRead Value              6     Auto
x Static tRead Phase Adjust       1     Auto
******** System Voltage Optimized ********
System Voltage Control            [Manual]
DDR2 OverVoltage Control          [Normal]
PCI-E OverVoltage Control         [Normal]
FSB OverVoltage Control           [Normal]
(G)MCH OverVoltage Control        [Normal]
CPU Voltage Control               [Normal]
Normal CPU Vcore                  1.35000V



I've tried setting "DDR2 OverVoltage Control" to something higher than the default, but I haven't tested with it yet. The only options are +0.1V, +0.2V, ... , +0.7V, and I don't like only being able to step in 0.1V increments. I'm not really sure where to go from here, how to find out what my RAM voltage really is, etc. Any suggestions on how to view the actual voltage for the RAM and other basic hardware?
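
As a rough guide to what those steps mean: the offsets are presumably added to the 1.8V DDR2 baseline, so the target voltages would work out as in the sketch below. That baseline is an assumption, and `RATED_VOLTAGE` is only a placeholder to be replaced with the figure from the module label or the manufacturer's spec sheet. It is not a value taken from this thread, and the function names are made up.

```python
BASE_DDR2_VOLTAGE = 1.8  # assumption: the board's "Normal" DDR2 setting targets 1.8V
RATED_VOLTAGE = 2.0      # placeholder only: replace with the figure for your modules


def overvoltage_options(base, max_tenths=7):
    """Map each BIOS offset label (+0.1V to +0.7V) to the resulting target voltage."""
    return {f"+0.{n}V": round(base + n / 10, 1) for n in range(1, max_tenths + 1)}


def smallest_sufficient_offset(options, rated):
    """Return the first offset whose target voltage reaches `rated`, or None."""
    for label, volts in options.items():
        if volts >= rated:
            return label
    return None


options = overvoltage_options(BASE_DDR2_VOLTAGE)
for label, volts in options.items():
    print(f"{label} -> {volts:.1f}V")
print("smallest offset reaching the rated value:",
      smallest_sufficient_offset(options, RATED_VOLTAGE))
```

Since the DDR18V reading quoted above is already 1.888V at the "Normal" setting, the real voltage at each step may sit somewhat above these targets.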

Again, I'm not sure where to go from here. Any guidance?
 

Ed-B

Distinguished
Dec 28, 2006
24
0
18,510
I have had similar problems and traced them back to Windows Updates. I deleted the latest updates and now everything is OK. I think Microsoft is trying to sabotage XP users into buying Vista.
Ed
 

vegeta

Distinguished
Nov 23, 2007
6
0
18,510
Mate, I have had a similar problem to yours, and I have put it down to RAM temperature. I have 2x1GB OCZ and 2x1GB GeIL, and after a while of use I felt the top of the RAM and it was seriously HOTTTT. I purchased a RAM cooler made by OCZ but haven't had a chance to use it properly since I did a BIOS flash and rooted my P5B Deluxe :-|
If I increased the voltage from 1.8V to past 1.9V it would error quicker, as it was producing more heat. If I had each stick in separately then they would work fine. Put them all back in and they would error after a while. They SEEMED to work fine at 266MHz, which is in line with the FSB of my CPU (E6400), but at 333MHz, which is what the RAM is rated at, they would error.
It isn't anything to do with your OS or updates/patches, as when you're running Memtest from the CD it's separate from your OS ;-) If I were you I would try the RAM at the slowest speed (333MHz) and relax the timings right down to 5-5-5-15 with 1.8V and test it out. If you have probs, just try 2 sticks, or get a nice strong fan in there aiming RIGHT at your RAM. But trying 2 sticks first is the easiest way, as that'll keep them A LOT cooler. Like I said, mine seriously woulda almost been catchin on fire. Freakin HOTTTTTTTT
If anyone knows what the safe temp of RAM is, that'd be good to know! :p
Let me know how you go lad! I'm still waitin on me new mobo to start testing and finding how things go for me with this new RAM cooler.
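
To make the "slower speed, relaxed timings" suggestion concrete: a timing given in clock cycles becomes real time once divided by the memory clock. Here is a small sketch, assuming the memory clocks mentioned in this thread (266MHz and 333MHz) plus 400MHz for DDR2-800. `cycles_to_ns` is a hypothetical helper name.

```python
def cycles_to_ns(cycles, clock_mhz):
    """Convert a DRAM timing in clock cycles to nanoseconds at the given memory clock."""
    return cycles / clock_mhz * 1000.0


# CAS latency values that appear in the thread's BIOS listing: 4 and 5.
for clock_mhz in (266, 333, 400):
    for cas in (4, 5):
        print(f"{clock_mhz}MHz CL{cas}: {cycles_to_ns(cas, clock_mhz):.1f} ns")
```

The same CL5 setting gives the chips about 15 ns per access at 333MHz versus 12.5 ns at 400MHz, so dropping the clock relaxes the real-time demands on the RAM even if the cycle counts stay the same.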