[Maybe solved] Sporadic freezes that force restarts: is it the mainboard, PSU or CPU?

arthurmimo

Reputable
Aug 2, 2015
11
0
4,510
Hello everyone,

Lately my computer has been freezing at random more and more often during gaming, and each freeze requires a hard restart. Windows logs these events only as Event ID 41 (unexpected shutdown), so unfortunately the event log does not tell me what went wrong before I pressed the reset button.

I built the computer roughly 9 months ago, and it seems to be getting increasingly unstable. The freezes sometimes occur just a couple of minutes apart, but at other times I can spend an entire weekend gaming without a single crash. This apparent randomness makes it difficult for me to home in on the problem.

This is how far I have come with narrowing down the issue myself:
According to a hardware log by hwinfo64, temperatures and load are fine (not at peak) across the entire PC when those crashes occur, so I do not think it is an overheating issue.

PSU voltages are spot-on according to hwinfo64, Asus' AI Suite and the BIOS (I do not have a PSU tester yet to verify all the connectors).
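
In case it helps anyone reading along, this is roughly how the last readings before a freeze can be pulled out of a sensor log. It is only a minimal Python sketch, assuming the log was exported as a CSV file with a header row. The function name `last_samples` and the column names in the usage comment are hypothetical, so check the header of your own log.

```python
import csv
from collections import deque


def last_samples(log_file, columns, n=10):
    """Return the last n rows of a CSV sensor log, keeping only `columns`.

    log_file is any open text file (or file-like object) with a header row.
    Columns that do not exist in the log come back as None.
    """
    reader = csv.DictReader(log_file)
    # A bounded deque keeps only the final n rows while streaming the file.
    tail = deque(reader, maxlen=n)
    return [{name: row.get(name) for name in columns} for row in tail]


# Usage (file name, encoding and column names are assumptions):
# with open("crash_log.csv", newline="", encoding="latin-1") as f:
#     for row in last_samples(f, ["Time", "CPU Package [°C]", "+12V [V]"]):
#         print(row)
```

The last rows show what the sensors reported right before logging stopped, which is the part of the log that matters for a freeze.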

According to Crystal Disk and Samsung's Magician, my SSD is healthy as well.

Memtest86 ran a full pass without errors, so the RAM is probably okay (I still have to reseat the modules and/or run a long-term test).

I did a stress test on the GPU months ago but should repeat that to be sure that it's not that.

The CPU was tested with Intel's diagnostic tool, which gave a pass - so the CPU might not be it.
Update 1: I also did a burn-in test with the Intel Processor Diagnostic Tool. There were no errors and temperatures stayed at 60 °C, so the CPU does not seem to be it.

However, when I run Hot CPU Tester, which supposedly tests both mainboard and CPU, my computer restarts somewhere in the middle of the test (which takes 6 hours). I have done this twice now, and neither time did the computer create a report file for the test - so I assume it must have crashed. Therefore, at this point I suspect that the issue is either the motherboard or the CPU.

This is my system:
CPU: Intel Core i7 4790K
Motherboard: Asus Maximus VII Gene (Z97)
RAM: 2x 8 GB G.Skill (DDR3-2400, PC3-19200, CL10)
SSD: Samsung SSD 840 EVO, 512 GB
GPU: MSI GTX 970
PSU: 500 Watt be quiet! Straight Power E9 Non-Modular 80+ Gold
Audio: Soundblaster USB SBX pro studio
OS: Windows 7 64 bit

Is there a way I can discriminate between these competing hypotheses? That is, rule out one culprit in 'favor' of the other (CPU vs. PSU vs. mainboard)? What other things can I try to get a clearer picture?



All the best,
Arthur
 

arthurmimo



I tried ruling out the PSU by verifying that its voltages never stray from what they should be (they stay within the 5% tolerance and hardly vary at all). Can I run additional tests on the PSU without buying a tester?
 

arthurmimo



Hi Paul,

I just did a burn-in test on the CPU and it passed flawlessly (temps stayed at 60 °C the whole time, no errors). I think if I could verify that the PSU is working correctly, I could maybe put the blame on the mobo. Is there any way to test the PSU without removing it? (Removal is very tricky in my Bitfenix Phenom M case.)

 

arthurmimo



Paul,

Sorry for spamming - on reflection you are of course right: I do reset the computer after it locks up, so in that sense the error 41 is easily explained. However, the problem of interest is the freeze that forces me to reset the computer in the first place. I have adjusted the thread title and text to make that more obvious.

None of the tests I ran could conclusively point to one source for the problem. I'll be picking up one of those Kill A Watt devices to see how much power my PC draws when it crashes; maybe you're right and the PSU can't deliver. However, I am not very optimistic: the PSU ought to provide around 200 W more than the maximum power consumption, and it would have to have degraded rapidly in the last few months to no longer suffice. Also, the system can run for hours before failing.

If you or anyone else has a tip as to what I can test, I would be very thankful. At this point I'm considering just buying a new PSU and mobo and being done with it.
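
One thing to keep in mind with a wall meter: it reads the AC draw at the socket, which includes the PSU's own conversion losses, so the DC load on the PSU is lower than the reading. Below is a rough Python sketch of that arithmetic. The 90% efficiency figure and the 330 W reading are assumptions for illustration only (not measurements from this system), and `psu_headroom` is a made-up helper name.

```python
def psu_headroom(wall_watts, rated_watts=500.0, efficiency=0.90):
    """Estimate the DC load and remaining headroom from a wall-meter reading.

    wall_watts  : AC power shown by the wall meter
    rated_watts : rated DC output of the PSU (500 W for the unit in this thread)
    efficiency  : assumed PSU efficiency at that load (a guess, not a measurement)
    """
    dc_load = wall_watts * efficiency   # power actually delivered to the components
    return dc_load, rated_watts - dc_load


# 330 W is a hypothetical reading, used only to show the calculation.
dc, spare = psu_headroom(wall_watts=330.0)
print(f"Estimated DC load: {dc:.0f} W, headroom: {spare:.0f} W")
```

If the estimated headroom stays large at the moment of a freeze, an overloaded PSU becomes a less likely explanation, although this says nothing about transient spikes the meter is too slow to catch.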
 

arthurmimo



Not that easy, unfortunately. As said in the OP, Memtest86 can't find any errors, and reseating or removing modules does not appear to make a difference :(
 

arthurmimo

Not trying to bump shamelessly, but I want to keep this one updated in case someone experiences similar problems.

Following Paul's suggestion I ran Memtest86 a third time, this time for 12 hours (4 passes), and it still could not find any errors.



I found out that OCCT has a power supply stress test, so I ran it for 20 minutes. Everything seems fine except the 5 V rail, which slightly exceeds the ATX spec at two neighbouring sampling points (the voltage rises to 5.27 V - that's 5.4% above nominal, whereas the ATX spec only allows 5%).
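
For anyone who wants to check their own readings the same way, the tolerance arithmetic can be sketched in a few lines of Python. The function name `rail_deviation` is made up, and the flat 5% tolerance is simply the figure quoted above.

```python
def rail_deviation(nominal_v, measured_v, tolerance=0.05):
    """Return (relative deviation, within_tolerance) for one rail reading."""
    deviation = (measured_v - nominal_v) / nominal_v
    return deviation, abs(deviation) <= tolerance


# The 5 V rail peak reported by OCCT in this post.
dev, ok = rail_deviation(5.0, 5.27)
print(f"{dev:+.1%} from nominal, within spec: {ok}")
# prints: +5.4% from nominal, within spec: False
```

With a 5% tolerance the allowed window for the 5 V rail is 4.75 V to 5.25 V, so 5.27 V is only 0.02 V outside it.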

Very doubtful that this can explain the erratic behaviour, but I will keep testing. I'm this close (gestures expressively) to buying a new MoBo and PSU altogether, though.
 

arthurmimo

Just a short update: I had read in a lot of threads that systems can become unstable and freeze as a result of driver problems (in particular, when on-board devices conflict with others of the same kind). So I disabled every on-board device that came to mind (both in the BIOS and in Windows), and even did a clean install of Windows on a formatted partition, but neither helped - the system still fails the Hot CPU Tester stability test after some hours.
 

arthurmimo

So I have some preliminary information, maybe it's useful for others.

Last week I contacted both Asus and Intel support, given that all the diagnostics so far point to a fault at their end (i.e. that GPU, SSD, RAM, temps and CPU all seem to be okay).

Intel has not responded yet, whereas ASUS responded within hours. Given that all the other components seemed stable, they said that the mainboard could indeed be causing the problems and that I should either update the BIOS and reset the CMOS, or send the board in for testing. I updated the BIOS (I had never done that before, because as a kid I learned that you shouldn't due to the risks) and cleared the CMOS. Since then the system has run ~18 hours of Hot CPU Tester diagnostics without freezing, and I could play games for an entire evening without problems.

I'll have to keep testing to see whether stability is indeed as rock-solid now as it ought to be, but for now I'll consider this solved. So if anyone else has problems with freezes, consider updating the BIOS before sending in the mobo.