[SOLVED] Weird restart issue after upgrades

Sep 17, 2021
I replaced my ASUS TUF motherboard with the higher-end ASUS ROG and built the system into a new case with better cooling, as I was getting high temperatures. The PC runs fine until I put some load on it.

The difference on the CPU (Ryzen 9 5900X) is massive:

Before:
Clock speed: 3700-3800MHz
Temperature: 65-75 degrees C

After:
Clock speed: 4300-4500MHz
Temperature: 35-45 degrees C

Both cases are with the "Optimized Defaults" settings from ASUS and no custom overclocking, only letting the CPU boost as usual.

While this looks like an excellent improvement, whenever the PC is under a high load for a few minutes, it just restarts instantly - nothing in the event logs except the power loss, no sound, nothing. Persists after Windows reinstall and also happens with Linux. Nothing gets logged, PC just cuts out as if I pulled the power cord, then starts back up again and runs fine until there's high load. Temperatures don't exceed 43-44 degrees C at the time of the restart, and drop a bit shortly after.
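Since nothing gets logged at the moment of the cut, one way to see what the clocks and sensors were doing right before it is to write them to disk every second and force each line out with fsync. Below is a minimal Python sketch for the Linux side, not something I actually ran at the time. It assumes the kernel exposes clocks in /proc/cpuinfo and temperatures under /sys/class/hwmon (in millidegrees C); the function names and the restart_probe.log file name are made up for illustration.

```python
import os
import time


def read_cpu_mhz(path="/proc/cpuinfo"):
    """Return per-core clock speeds in MHz, or [] if they cannot be read."""
    speeds = []
    try:
        with open(path) as f:
            for line in f:
                if line.lower().startswith("cpu mhz"):
                    speeds.append(float(line.split(":", 1)[1]))
    except (OSError, ValueError):
        pass
    return speeds


def read_temps_c(root="/sys/class/hwmon"):
    """Return {sensor: degrees C} from hwmon temp*_input files, or {} if none."""
    temps = {}
    try:
        chips = sorted(os.listdir(root))
    except OSError:
        return temps
    for chip in chips:
        chip_dir = os.path.join(root, chip)
        try:
            names = sorted(os.listdir(chip_dir))
        except OSError:
            continue
        for name in names:
            if name.startswith("temp") and name.endswith("_input"):
                try:
                    with open(os.path.join(chip_dir, name)) as f:
                        # hwmon reports millidegrees Celsius
                        temps[f"{chip}/{name}"] = int(f.read().strip()) / 1000.0
                except (OSError, ValueError):
                    continue
    return temps


def format_sample(timestamp, mhz, temps):
    """Build one log line with the highest clock and temperature seen."""
    max_mhz = max(mhz) if mhz else float("nan")
    max_temp = max(temps.values()) if temps else float("nan")
    return f"{timestamp:.1f} max_mhz={max_mhz:.0f} max_temp_c={max_temp:.1f}"


def append_durably(path, line):
    """Append a line and fsync so it survives a sudden power loss."""
    with open(path, "a") as f:
        f.write(line + "\n")
        f.flush()
        os.fsync(f.fileno())


def log_forever(path="restart_probe.log", interval_s=1.0):
    """Sample once per interval until the machine restarts or you stop it."""
    while True:
        sample = format_sample(time.time(), read_cpu_mhz(), read_temps_c())
        append_durably(path, sample)
        time.sleep(interval_s)
```

Calling log_forever() in one terminal while running the load in another should leave the last reading before the restart as the final line of the file. It won't show a PSU or VRM fault directly, but it would at least tell whether the clocks or temperatures spiked just before the cut.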

Perhaps it's due to messing with the fans? My previous CPU cooler (120mm AIO) had a cable from the pump to the CPU fan socket on the motherboard. The new one doesn't and is one of those pumps that just always runs at 100% (not a big deal, it's not noisy), so I plugged a chassis fan into that socket, both to stop the POST from complaining about a missing CPU fan and because I needed somewhere to plug it in.

Besides the new CPU cooler, I have one more chassis fan than before and I used liquid metal instead of traditional thermal paste.

Disabling the CPU Boost in the BIOS stops the problem from happening, but clock speeds go down by about a GHz. It could be a faulty CPU, I guess. It's rather new, but it never ran at these higher clock speeds before because of the cooling issue, and at the lower speeds it worked fine.

To make matters even stranger, the same thing happens when the GPU (RTX 3070) is under high load, even with low CPU load. Could my new motherboard be faulty? Or perhaps my temperature sensor isn't working correctly and it's pushing higher than it should - 45 degrees under load feels quite low to me.

System specifications:
ASUS ROG Crosshair VIII Impact Motherboard
AMD Ryzen 9 5900X CPU
2x 32GB G.Skill Trident Z RAM
EVGA nVidia GeForce RTX 3070 XC3 Ultra
Gamdias Kratius P1-750G 750W PSU
Cougar Aqua 240 CPU Cooler
3x Cougar HPB 120 Case Fans
2x Phanteks 120mm Case Fans
Phanteks Eclipse 200A Case
Focusrite Scarlett Studio Audio Interface
Dell AW3420DW Monitor
Razer Blackwidow Keyboard
Razer Basilisk V2 Mouse
Logitech c922 Pro Webcam
Logitech 3D Extreme Pro Joystick
Oculus Rift S

Operating systems tested:
Windows 10 Pro for Workstations
Ubuntu 21.04
 
Thank you for your response. I have updated the full specs above, some of it perhaps unnecessary, but I didn't want to overlook anything.

The PSU is a Gamdias Kratius P1-750G 750W unit that is 6 months old and in excellent condition as far as I know, although its actual condition is not easy to judge.

PC is used mostly for Unreal Engine game development. Two of the previous restarts happened during a packaging build.
 
I believe I might just have solved this issue.

I updated the firmware on the motherboard as it was quite old despite the board being brand new. I restored the exact same settings with the boost enabled again.

Outcome:
- Could not reproduce the problem again
- CPU clock speeds seem slightly lower
- CPU temperatures seem slightly higher

I did the packaging builds, ran CPU-Z stress tests as well as several runs of Cinebench R23 and it seems to still be going strong.

I suspected something might be wrong with the PSU given the symptoms, but I couldn't find a replacement locally and had to order one online - guess I'm also upgrading my PSU next week. I might need to replace it anyway as I've been having power issues with my Oculus Rift S even before doing this upgrade, although I could reproduce the problem earlier with the Rift S disconnected.

In the end, I suspect ASUS had a firmware bug that messed with the sensors somehow, perhaps allowing the board to push the CPU further than it could handle.