The Undiagnosable Rig - An open challenge to all experts

cococabana

Honorable
Jul 24, 2012
Hey guys,

I wanted to ask for help with a rig that has left me stumped, perplexed and simply flummoxed. This rig defies the basic elimination method and simply refuses to be troubleshot. The problem is erratic and varies, but generally it's boot failure and unstable Windows operation with intermittent lockups and freezes. Things I have tried so far:

Reinstalled Windows (Multiple Times)
RMA-ed the Seagate HDD (1 time).
Replaced the current RAM with known-good sticks and got the same problem; tested the current RAM in another PC without problems.
RMA-ed the Mobo.
Tested PSU 5V and 12V rails using a multimeter (5.12 V, 12.2 V)
Bought new HDD (Western Digital Blue EZEX)
Verbally Abused the system considerably.

Interesting symptoms:
When Windows was working, a Linpack burn using LinX gave the same answer every time (no errors).
BIOS and POST are very, very slow.
I have tried removing the sound card, thinking it might be shorted.

Specs:
Asrock Extreme6 TB4 Z77
Core i7 2600k
Xigmatek Tauro 600W
G.Skill RipjawsX 8GB (4x2)
GTX 580 Zotac 1.5GB
Seagate 1TB (no longer) Western Digital Blue EZEX 1TB

If any of you guys can figure this out, you would be a lifesaver. Honestly.

I don't know what to think now. I think the only things I have truly eliminated are the RAM and HDD. Could it be a faulty CPU? Mobo faulty again? PSU?



 
Assuming temps are good, and that the older low-end Xigmatek PSU you're using is actually providing good power (which I doubt), have you tried resetting the BIOS to defaults?

I really think you should try another PSU. Xigmatek isn't known for making high-quality power supplies, and even if the voltages look good on a multimeter, there's also ripple and a whole host of other PSU-related things that can make a PC unstable. On top of that, it's an older PSU, so the caps have aged and output has been reduced.
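
For reference, the multimeter readings from the first post (5.12 V and 12.2 V) can be checked against the ±5% window the ATX spec allows on the +5 V and +12 V rails. This is only a minimal sketch, and `rail_within_tolerance` is a made-up helper name. Passing a DC check like this says nothing about ripple or behaviour under load, which is exactly the point above.

```python
# Rough DC tolerance check for PSU rails (ATX allows +/-5% on +5 V and +12 V).
# rail_within_tolerance is a hypothetical helper, not part of any real tool.

def rail_within_tolerance(nominal_v, measured_v, tolerance=0.05):
    """Return True if measured_v is within +/-tolerance of nominal_v."""
    low = nominal_v * (1 - tolerance)
    high = nominal_v * (1 + tolerance)
    return low <= measured_v <= high

# Readings reported in the original post (nominal rail -> measured volts).
readings = {5.0: 5.12, 12.0: 12.2}

for nominal, measured in readings.items():
    deviation_pct = (measured - nominal) / nominal * 100
    status = "OK" if rail_within_tolerance(nominal, measured) else "OUT OF SPEC"
    print(f"+{nominal:g} V rail: {measured} V ({deviation_pct:+.1f}%) {status}")
```

Both readings land inside the window, so the multimeter test neither clears nor condemns the PSU; a swap with a known-good unit is still the more telling test.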

Also, are you running 4x2GB or 2x4GB RAM? There's a small chance that the mobo you're using just doesn't like having all its DIMM slots loaded at the same time. You could always try running with just 2 DIMMs installed to see if that alleviates some of your issues.
 
I would honestly check whether you have a faulty copy of your operating system first. Try running a Linux OS and see if you run into any problems.

Do you have it overclocked at all? A bad overclock can cause some serious problems. Even an old overclock that was once stable may not remain stable forever.
 


I clear CMOS every time I do a step, and I am using a 2x 4GB configuration for RAM designed specifically for Sandy Bridge :/ (RipjawsX). Hmm, seems I need to get another PSU now.

My friend has successfully used the same USB boot stick to install the same copy of Windows on his PC with no problems. No overclock. Can't even boot now.

 
Specs:
Asrock Extreme6 TB4 Z77 (This motherboard doesn't have the best track record in RMAs from my research. You say you've RMA'd the motherboard: was your original repaired or just replaced? Do you even know? I always mark my RMA motherboards somewhere so I at least know if I'm getting mine back or a replacement.)
All the problems you're listing, with the tests you've done, point in my own experience to intermittent motherboard failure. Unfortunately that is one of the most difficult problems to get a clear diagnosis on, because the board may need to reach a certain temperature before it fails.

Core i7 2600k (Personally, after all the problems ASRock encountered with the fiasco of flashing Sandy Bridge motherboards with Ivy Bridge BIOS updates, I'm not really sure why you would even buy a motherboard that wasn't designed specifically for a Sandy Bridge CPU. Many say that doesn't make a bit of difference because the motherboard's CPU support list covers the 2600K, but that may not really be 100% problem free, as the motherboard has advanced features that the Sandy Bridge CPU cannot take advantage of, like PCI-E 3.0 for one.)

Xigmatek Tauro 600W (The only thing I can say is that it at least has a single 36A 12V rail to handle the GTX 580, but it could still be the problem. Can you swap in the power supply from the machine you did the memory test in?)

G.Skill RipjawsX 8GB (4x2) (This is interesting. Does this memory have a model #, speed, timings, and voltage? Are you overclocking the 2600K's memory controller just to run this memory? Many don't realize they're even doing it. There's a little missing information here.)

GTX 580 Zotac 1.5GB (I don't see this as the problem)

Seagate 1TB (no longer) Western Digital Blue EZEX 1TB (Well, obviously, with the HDDs completely switched out, even brand-wise, that's not the problem.)
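
To put a rough number on that 36A rail: 36 A at 12 V is 432 W. The sketch below compares that with nominal TDP ratings for the two big 12V consumers. The TDP figures (244 W for a GTX 580, 95 W for a stock 2600K) are vendor ratings used here as an assumption, not measured draw on this system, and the helper names are made up for illustration.

```python
# Back-of-envelope 12 V budget. The TDP figures are nominal vendor ratings
# (an assumption for illustration), not measured draw on this system.

RAIL_AMPS = 36.0   # single 12 V rail rating quoted above
RAIL_VOLTS = 12.0

NOMINAL_TDP_W = {
    "GTX 580": 244,        # rated TDP
    "Core i7-2600K": 95,   # rated TDP at stock clocks
}

def rail_capacity_w(amps, volts):
    """Power available on a rail: P = I * V."""
    return amps * volts

def headroom_w(capacity_w, loads_w):
    """Capacity left after subtracting the listed loads."""
    return capacity_w - sum(loads_w)

capacity = rail_capacity_w(RAIL_AMPS, RAIL_VOLTS)
spare = headroom_w(capacity, NOMINAL_TDP_W.values())
print(f"12 V capacity: {capacity:.0f} W")
print(f"Nominal GPU + CPU load: {sum(NOMINAL_TDP_W.values())} W")
print(f"Spare on paper: {spare:.0f} W")
```

On paper the rail covers the GPU and CPU with some room left for drives and fans, but that assumes the unit still delivers its label rating, which an aged budget PSU may not.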
 
Solution


Wow, that's a good point, and possibly I stand corrected!

What really matters here is sharing known problems, possible fixes, and the experiences we've encountered.

Just because I didn't see it as a problem doesn't mean it isn't one. I was not dismissing your post, and I back your claim 100%.

 


Sometimes it would help us if, when they repaired something, they actually told you what the problem was.

You know what I mean?

In all my past experience with motherboard RMAs, they never said what actually caused the failure.

Of course in every RMA situation I've encountered regarding a motherboard, I've never gotten back the same one I sent in.

It's always been a replacement.

 

Yes, one of the first things I did was see if running it on the iGPU produced the problem, and it did.


> Yeah, I am pretty sure I got a new one, because my previous one was licensed to ASRock (in the BIOS) while this one states American Megatrends. Plus one of the capacitors is bent (not by damage, by manufacture). So I'm pretty sure it's a new one.

> I went with this CPU/mobo combo because it was the one that offered the best value (for OC), and I asked around in Tom's forums a year ago to make sure there wouldn't be any incompatibility problems, and it didn't appear there were.

> Yeah, I am currently leaning towards the PSU being the problem (before I turn to the only part I have not yet tested/replaced/changed: the CPU). I will try swapping the PSUs to see if it works, but that will take me some time because (new symptom) I figured out that the boot failure is occurring because the HDD data is being corrupted, and the MBR itself is being corrupted, leading to the slow BIOS POST. If I put in a new HDD or repair the HDD's MBR, the BIOS resumes normal speed, but it gets corrupted again within 1-2 shutdowns.

> Running RipjawsX at Sandy Bridge stock frequency, not overclocking the IMC. I made sure to disable all XMP profiles and revert to safe stock settings and timings. Besides, the problem still occurs with different RAM. RAM and HDD are two of the only things I can safely say are not at fault.

The problem is, I just can't do basic elimination because after every boot failure, the HDD gets corrupted. So it's not as simple as just swapping components continuously. I have a feeling GParted might be doing harm to the new HDD.
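
One way to catch the MBR corruption described above as it happens, rather than at the next failed boot, is to fingerprint the first 512-byte sector before and after each shutdown and compare. The following is a minimal sketch under stated assumptions: the function names are hypothetical, the device path in the comment is only an example, and reading a raw disk normally needs root/admin rights (e.g. from a live USB). It only reads the disk, never writes.

```python
# Sketch: fingerprint the first 512-byte sector (the MBR) so two readings
# taken at different times can be compared. Function names are hypothetical.
import hashlib

SECTOR_SIZE = 512

def mbr_fingerprint(sector):
    """Return (has_boot_signature, sha256_hex) for a 512-byte MBR sector."""
    if len(sector) != SECTOR_SIZE:
        raise ValueError(f"expected {SECTOR_SIZE} bytes, got {len(sector)}")
    # A valid MBR ends with the boot signature bytes 0x55 0xAA.
    has_signature = sector[510:512] == b"\x55\xaa"
    return has_signature, hashlib.sha256(sector).hexdigest()

def read_first_sector(path):
    """Read the first sector from a disk device or image file (read-only)."""
    with open(path, "rb") as disk:
        return disk.read(SECTOR_SIZE)

# Example call (assumed device path, adjust for the actual system):
#   ok, digest = mbr_fingerprint(read_first_sector("/dev/sda"))
```

If the hash changes across a shutdown with no repartitioning in between, or the boot signature goes missing, the corruption happened in that window, which narrows down whether a given component swap made any difference.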
 
I'd replace the PSU as well. Also, it seems unlikely that a replacement motherboard, which you've verified is not the one you sent in, would have the same problem. Were you OCing the CPU? Excessive voltage? Also, do you have any powered USB hubs plugged into the motherboard? I've recently read that certain ones can "leak" voltage and cause system instability.
 
If the MBR is being corrupted, I see 2 possible causes: the PSU having bad outputs and spiking the HDD, or a boot-level virus. Those tend to survive most anti-virus programs, formats, memory replacements etc., as they float around in RAM, temp memory, cache etc., and are almost impossible to run down because when you fix one part, it gets cloned from another.
 
A virus can't stay in volatile memory once the system has been shut down.

 
In my hands I had it coupled with a Noctua U14S at cold ambient temps and hadn't gotten around to a proper overclocking session yet. I bought it second hand, though, and the previous owner had it paired with a non-Z mobo, so a past OC is out. But I have never heard of a CPU being bad like this. It either doesn't work, period, or works fine? Plus it's giving the same answer on Linpack generic (LinX 0.6.5), so no errors, it would seem...

I need GParted to reconstruct the MBR.

Nah, nothing like that.



That is very interesting. So a bad CPU cannot cause MBR corruption? If not, then the answer has to be a bad PSU (99.99% sure it isn't a low-level virus).

 
I can't do too much without the computer in front of me for analysis, but based on what you are describing, my initial reaction is to suspect the power supply. Power supplies can cause all kinds of downright weird issues.
 
I would also point out that maybe the CPU is not sitting in the socket properly, or maybe the heatsink isn't making good solid contact with the top of the chip. Is the proper amount of thermal paste used? Is it evenly distributed on the chip? Is it a good thermal paste? I have had heatsinks that weren't seated quite right cause similar problems. Are you using the stock cooler or an aftermarket one?
 
You guys won't believe what the actual problem was. Turns out the motherboard I have has the PCI-E slots slightly offset, so when you secure the GPU to the chassis, the connectors actually misalign a bit. Same for the sound card. I now have the GPU and sound card secured not straight but tilted at an angle, and it seems to be working just fine. This is probably one of the weirdest troubleshoots ever. I am definitely staying away from ASRock for my next build. Thank you, everyone.
 


It's far more likely that your case has warped, or even that the mobo has shifted on its stand-offs, than that the slots on the mobo were misaligned from the factory.

 


Well, the case is an Antec 902, one of the more rugged cases I have seen in my dabblings. Plus it's brand new.

 
Try the mobo in a different case, or a different mobo in your Antec 902, just for the sake of diplomacy. ASRock isn't exactly known for great quality control, but an issue like misaligned PCI-E slots is pretty strange coming from a mobo.

It's not at all uncommon for cases to become warped through packing/shipping damage.
 


A Gigabyte mobo works fine. And I don't think this is the kind of misalignment that results from case warping; the only thing I can think of is the slots. Here, take a look at it yourself, it's the least I can do. The red line is where the card sits when perfectly secured to the chassis as it's meant to be; you can feel it click into place. The green line is where it is now, slightly tilted at an angle (relative to the board).

qxxK2Nr.jpg