[SOLVED] Confused! My 2nd ever built PC reboots under high CPU load and I've exhausted my investigation :(

Jul 3, 2020
10
1
25
I'll admit that I'm a bit lost and would thoroughly appreciate some guidance as to what I can do next.

Context

I've just built my 2nd ever PC (read: amateur) after deciding to upgrade most of the components of my first build from 6 years ago. All parts (except the 6-year-old PSU) were purchased less than 2 weeks ago, with the latest drivers/software installed and the BIOS updated. Here's what I have:
  • CPU: AMD Ryzen 5 3600 with stock cooler but using Arctic MX-4 paste
  • DRAM: Corsair Vengeance RGB Pro, 2 x 8 GB, 3200 MHz
  • GPU: Sapphire Pulse RX 580
  • PSU: Corsair CS 550M
  • Mobo: MSI B450M Mortar Max
  • 2 x Samsung 840 Evo 120 GB SSD (Dual booting between a fresh Windows 10 installation and an up-to-date and stable Arch Linux installation, one on each SSD)
  • 2 x Samsung 860 Evo 500 GB SSD
  • Case: Cooler Master Q300L
  • 3 x Corsair AF120 LED 120 mm case fans
Built everything over a week ago. I booted up and did not touch any BIOS settings related to overclocking. I even left XMP off, as it was by default (so the DRAM was at 2133 MHz instead of 3200). Everything seemed fine: I installed Windows, reinstalled the Arch bootloader, booted into both several times and did various admin things, and everything still seemed fine.

The issue

Then, I tried gaming. Opened battle.net, started playing Call of Duty: Warzone, set the graphics to auto but increased a couple of settings a bit. Got into an online game just fine, getting ~75 fps. About 2 minutes into the game, the PC turns off and reboots (as if I'd pressed the reboot hardware button).

I immediately check CPU temp using HWMonitor, it's at about 70 degrees C.

I check Event Viewer, there's nothing in there for the <10 minutes previous to the surprise reboot, and no Errors for a while, but, timed after the reboot, there's an Error: "The previous system shutdown at <hh:mm:ss> on <dd/mm/yyyy> was unexpected." with Event ID: 6008. There's also a "Critical" entry with ID: 41 saying "The system has rebooted without cleanly shutting down first. This error could be caused if the system stopped responding, crashed, or lost power unexpectedly."
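
For anyone who would rather pull those two entries from a script than click through Event Viewer, here is a minimal Python sketch. It builds an XPath filter for the System log and passes it to Windows' built-in `wevtutil` tool. The helper names `build_event_query` and `recent_unexpected_shutdowns` are made up for illustration; the IDs 41 and 6008 are the ones quoted above.

```python
import subprocess


def build_event_query(event_ids):
    """Build an XPath filter matching any of the given System event IDs."""
    clauses = " or ".join(f"EventID={int(i)}" for i in event_ids)
    return f"*[System[({clauses})]]"


def recent_unexpected_shutdowns(count=10):
    """Return the newest matching System log entries as text (Windows only)."""
    cmd = [
        "wevtutil", "qe", "System",
        f"/q:{build_event_query([41, 6008])}",
        f"/c:{count}",   # limit how many events come back
        "/rd:true",      # read newest first
        "/f:text",       # human-readable output instead of XML
    ]
    result = subprocess.run(cmd, capture_output=True, text=True, check=True)
    return result.stdout
```

Calling `print(recent_unexpected_shutdowns())` on the affected machine should list the same entries Event Viewer shows. Like Event Viewer, it only confirms that power or the system state was lost, not why.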

My troubleshooting

I couldn't find other clues in Event Viewer. I tried again to play Warzone, this time watching my CPU/GPU/case temps as I played. Roughly 2 minutes in, the PC rebooted; at that point the CPU was about 75 deg C, the GPU about 70 and the case about 40. CPU voltage didn't go above about 1.4 V. Fans all seemed to be working fine and the CPU fan had hit ~2000 RPM.
Since then, I've tried a number of experiments:
  • Turning on and off XMP, and trying various DRAM clock speeds
  • Updating BIOS to latest version
  • Completely rebuilding and rewiring PC from scratch
  • Reducing Warzone graphics settings (a lot)
  • Reapplying stock cooler using new paste
  • Ensuring there's no dust in the case
  • Reinstalling Windows and Reinstalling Arch. (btw the problem happened once whilst I was using Arch too)
  • Updating SSD firmware
  • Moving DRAM to different slots
  • Customising Page File size
  • Turning off Windows game boost
  • Resetting CMOS
I also deduced that this wasn't game specific since the issue seems to occur generally when the CPU is under high load. I've run prime95 on "SmallFFTs" and also "Blend" modes and both seem to make my PC reboot after about 1 minute. I ran HWInfo the last time I tried this - here's the csv which covers about a min before the test starts, the min of test running all the way until the time the PC rebooted. In general, CPU temp never seems to go over about 80 degrees.
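
A sensor log like that can be scanned for peak readings with a few lines of Python. This is only a sketch: the column headers in the sample are hypothetical and not taken from the actual log, and `column_peaks` is an illustrative helper name. It matches columns by substring so it should cope with whatever headers the logging tool writes.

```python
import csv
import io


def column_peaks(csv_text, name_contains):
    """Return {header: highest numeric value} for headers containing the substring."""
    reader = csv.DictReader(io.StringIO(csv_text))
    wanted = name_contains.lower()
    peaks = {}
    for row in reader:
        for header, value in row.items():
            if header is None or wanted not in header.lower():
                continue
            try:
                number = float(value)
            except (TypeError, ValueError):
                continue  # skip blank or non-numeric cells
            peaks[header] = max(peaks.get(header, number), number)
    return peaks


# Hypothetical sample rows, NOT the real log from this thread.
sample = """Time,CPU Temp [C],CPU Core Voltage [V]
12:00:01,45.0,1.10
12:00:02,78.5,1.38
12:00:03,80.1,1.40
"""
print(column_peaks(sample, "temp"))
```

For a real file, read it with `open(path, newline="")` and pass the text in. The last few rows before the log stops are usually the interesting ones, since they show what the sensors read just before the reboot.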

In terms of software, I have AMD Radeon Software, Ryzen Master, Corsair iCUE, Samsung Magician all up to date and working seemingly happily.

Also, with heavy daily usage (e.g. multitasking, working, tons of internet tabs, music playing, stuff downloading, watching hi res youtube videos, driving a 1440p + a 1080p monitor etc.) performance seems absolutely fine and I don't get reboots. Also lower end games seem to run fine.

It's just prime95, Warzone and 10 "While True; Wend" .vbs scripts running that seem to make my pc reboot.

My conclusions so far
  • I thought it was temperature but AMD say the r5 3600 can go up to 95 degrees C, right? I'm not getting that close. Also, I'm not overclocking.
  • I thought it was DRAM but the issue persists with them even as low as 2133 MHz with XMP off.
  • I thought it was wiring but I rebuilt and rewired cautiously.
  • I thought it was PSU, but my total power draw even with boost should surely be lower than 550 W with decent headroom?
  • I thought it was GPU but the vbs script doesn't use the GPU yet still reboots my PC.
  • My current thinking is that it could be a faulty CPU, some dodgy MSI BIOS settings that I don't generally understand, or my stupidity somewhere, but I have no idea what to try next except for maybe a Memtest86, since I've seen some recommend that to others.
  • I've also run the Windows Memory Diagnostic Tool (passed, no errors).

    Please, please help - any suggestions welcome!
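
The PSU headroom point in the list above can be sanity-checked with some rough arithmetic. This is a sketch with assumed figures, none of them measured on this build: 88 W is the default package power limit for a 65 W Ryzen part as I understand it, 185 W is AMD's typical board power for a reference RX 580 (partner cards may draw more), and the 60 W for everything else is a plain guess.

```python
# Rough, ASSUMED peak draws in watts; none of these were measured on this build.
ASSUMED_PEAK_WATTS = {
    "Ryzen 5 3600 (default package power limit)": 88,
    "RX 580 (reference typical board power)": 185,
    "motherboard, RAM, 4 SSDs, fans, RGB (guess)": 60,
}


def headroom(psu_watts, loads):
    """Return (total load, spare watts, load as a fraction of the PSU rating)."""
    total = sum(loads.values())
    return total, psu_watts - total, total / psu_watts


total, spare, fraction = headroom(550, ASSUMED_PEAK_WATTS)
print(f"estimated peak {total} W, spare {spare} W, {fraction:.0%} of rating")
```

Under those assumptions the build sits well inside a 550 W rating. That only checks the budget on paper; it says nothing about whether a 6-year-old unit still delivers its rated output cleanly.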
 
Solution
I RMA'd the motherboard, the retailer confirmed there was a fault but did not specify. I've been sent a replacement which seems to work perfectly with no issues :) Thanks for all your help!
Sounds like a PSU issue to me. Replace with a high quality unit would be my suggestion.

Thanks for the suggestion - what makes it sound like a PSU issue? Could you recommend one for my current components? I'm not looking to change or add anything for several years. I'm not a serious gamer
 
Rebooting under load makes it sound like a PSU issue. It's a shame they are expensive now with the human malware. I will assume you are in the USA I guess.
I suggest this EVGA G3:
https://www.newegg.com/evga-g3-seri...&ranSiteID=8BacdVP0GFs-_M1E7SRxllkR3W1Eys0_Zw
 

I'm in the UK. I'll keep testing a few things and trying different settings but I'm also deciding on a new PSU anyway. Thanks for the recommendation, looks good. Do you have any thoughts on this Seasonic Focus Plus Gold? https://www.cclonline.com/product/2...-Fully-Modular-ATX-Power-Supply-Unit/PSU1723/
 
so... hooked up the brand new Seasonic Focus Plus Gold 550W and I'm still (!) having the same issue - seemingly identical. Whenever I go into an intense game or open loads of videos, the PC just reboots! I'm all out of ideas now, I've tried a bunch of different BIOS settings and even restored BIOS to default but nothing seems to stop the rebooting. Any further ideas?
 
The rebooting happens in both Linux and Windows so that points towards a hardware problem. That was partly why I thought the PSU was to blame.

I think the next thing to do is to try with just the minimal hardware required to boot so just one storage device, one stick of ram, just keyboard mouse and monitor connected.
 
Sorry the new PSU didn't solve your issue, I thought it would. The rebooting during an infinite loop seems really strange, can't be many instructions to process. How many iterations does it complete before it reboots?

you mean Prime95? I think it didn't manage one iteration - I'm not sure, I'll check.

Yeah, it seemed reasonable to be the old PSU!

Sure, I can try minimal hardware - what do you think that will tell us?
 
Actually I meant your Visual Basic script. Seems like the CPU is at fault if it cannot run the most simple program.

Probably nothing but I guess I was thinking an issue with an older SSD? I remember someone once who never mentioned having an old usb card reader connected to the system, which turned out to be the cause of the problem.
 

punkncat

Polypheme
Ambassador
Before you go replacing parts....

I have had a Ryzen that would do a thermal shutdown when it got to an indicated 70C. I suspect that in this particular case there was something wrong either with the sensor or the program reading it, something....anyway.

Before you replace anything try the open case with a fan method and see if your shutdown persists. If it doesn't get a better cooler or consider air flow in the case as culprit.
 
Ah, yeah, I just found the vbs online somewhere when Googling for a CPU stress test. I guess it goes through A LOT of iterations before rebooting. I could amend the script to log how many it gets through. Anyway, I've tried a few more BIOS things so now I'm getting to testing with min hardware
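
For the "log how many it gets through" idea, here is a rough Python stand-in for the busy loop (not the original .vbs script). The function name `burn` and the log file layout are my own assumptions. The key detail is forcing each progress line to disk, so the last count survives a hard reset.

```python
import os
import time


def burn(seconds, log_path, report_every=1_000_000):
    """Spin one core for `seconds`, recording the iteration count as it goes."""
    deadline = time.monotonic() + seconds
    count = 0
    with open(log_path, "w") as log:
        while time.monotonic() < deadline:
            count += 1
            if count % report_every == 0:
                log.write(f"{count}\n")
                log.flush()
                os.fsync(log.fileno())  # push to disk so the line survives a reset
        log.write(f"{count}\n")  # final count if the run finishes normally
    return count
```

Running one copy per core (for example by launching the script several times, like the ten .vbs scripts) should load the CPU in a similar way. If the PC reboots mid-run, the last line in each log shows roughly how far that copy got.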

I have been trying it with case open and it doesn't seem to help. I'm using the stock cooler with new MX-4 thermal paste. I've already tried refitting the cooler with new paste once - I could try again? I've also got OK cable management and 3 additional Corsair AF120 case fans that seem to stay around 1000, 1000 and 700 rpm, and those are in addition to the stock Cooler Master case fan. I guess there could be something wrong with the sensor or whatever is reading it. Do you reckon there's anything I can do in the BIOS to mitigate this?
 
Loading the optimised default BIOS settings and enabling D.O.C.P. (or A-XMP, as I think it is called in the MSI BIOS) is all you should really need to do in the BIOS.

If you have a large house fan handy (I have a 2-foot fan blowing at me right now lol), then opening the side panel and pointing it at the motherboard is a quick and easy way of diagnosing overheating issues.

If you try with one stick of RAM, then swap so it's still one stick but the other one, that removes faulty RAM from the list of possible causes (unless both are bad, which seems highly unlikely).

If you already refitted the cooler once then I don't think doing it again is needed.

I'm not sure what else to suggest
 

mamasan2000

Distinguished
BANNED
I would run a memory test https://www.hcidesign.com/memtest/ 100-400% coverage.
Then OCCT's CPU test for a couple minutes if memtest passes. 5-10 min. Should equalize temps.
Faulty RAM can cause a reboot but so can too little voltage to CPU. But before you touch CPU voltage, know what the limit for it is. Google 'FIT'. It's a test you run. Should be around 1.25v for 3000 series under full load but every chip is different.
 
WTF! So, I removed one stick of RAM and everything seems to be fine. No reboots so far. I've tried it with XMP on and off. Does this mean I had faulty RAM, or maybe the BIOS didn't handle the dual channel well? Or the CPU didn't like it? Should I stick them both back in and run memtest86 then?
 

Fiorezy

Notable
Jul 3, 2020
376
86
890
Can you try to switch between the sticks and run a test?
 
Interesting update:

My mobo has four DIMM slots from left to right: DIMMA1, DIMMA2, DIMMB1, DIMMB2.
  • DIMMA1 is physically blocked by the stock AMD cooler -_- and I don't think I have case space for a liquid cooler
  • either memory stick inserted into DIMMA2 boots the computer up fine, with no reboot issues at all
  • either memory stick inserted into either DIMMB1 or DIMMB2 now no longer boots the computer properly (hangs with black screen and with the MSI EZDebug light corresponding to "indicates DRAM is not detected or fail" permanently lit)
  • with either memory stick inserted into DIMMA2 AND the other memory stick inserted into either DIMMB1 or DIMMB2, same no boot issue as above.
  • I've run memtest86 on both sticks of memory separately overnight and there were no errors.

Note that yesterday the PC still booted most of the time with two memory sticks in, but then rebooted under high load (this was my main issue). Now it doesn't seem to boot at all with two sticks in.
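
The slot results listed above can be written down as data to see what they point at. This is just a sketch of the elimination logic, with `suspect_slots` as an illustrative name. Since either stick gave the same result in every slot and both passed memtest86, only the slots are tracked.

```python
# (slots populated, did it boot?) as reported above; either stick behaved the same.
OBSERVATIONS = [
    ({"DIMMA2"}, True),
    ({"DIMMB1"}, False),
    ({"DIMMB2"}, False),
    ({"DIMMA2", "DIMMB1"}, False),
    ({"DIMMA2", "DIMMB2"}, False),
]


def suspect_slots(observations):
    """Slots that appear in failing configurations but never in a working one."""
    good = set().union(*(slots for slots, ok in observations if ok))
    bad = set().union(*(slots for slots, ok in observations if not ok))
    return sorted(bad - good)


print(suspect_slots(OBSERVATIONS))
```

The failures follow the B slots rather than a particular stick, which points towards the motherboard (or possibly the CPU's memory controller) rather than the RAM. It doesn't prove which of the two, and DIMMA1 stays untested because the cooler blocks it.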

Given that I really cba to return and replace the mobo, and that I don't care about dual channel memory, can I just return my 2 x 8 GB sticks and get a 1 x 16 GB stick? Or is there something better I can do?
 
I RMA'd the motherboard, the retailer confirmed there was a fault but did not specify. I've been sent a replacement which seems to work perfectly with no issues :) Thanks for all your help!
 