[SOLVED] Computer hard shutdown/crash under load in very specific cases?

Aug 1, 2021
See title. This particular issue stems from GTA5; every other game I've played, save RDR2/RDO, has no issues. Whenever the map zooms in and out too quickly during loading screens, or after a few minutes on Cayo Perico, the monitors instantly cut out, the CPU error light on my mobo comes on, and my GPU's LEDs turn off. I can only turn my computer back on once I switch the power supply off and back on.

I've already swapped my RAM (which I believe went bad) for different sticks, and also swapped my PSU from an EVGA 750W G+ to an EVGA SuperNOVA 1000 G6, and the issue still persists. Thermals on my GPU, CPU, and even my VRM MOS never reach a point that would cause concern. Benchmarking my CPU and GPU, with thermals reaching the lower 70s, doesn't cause this issue either, and neither does any other game I play.

As far as I recall, I had to turn my settings down in GTA5 to prevent this problem. It has now come back since I recently upgraded from a 2070 Super to a 3090, having figured the issue was resolved. Bear in mind I've had this issue as far back as early 2020, when I first put the computer together.

At this point I'm mostly stumped, as most people say it's either overheating or the power supply, and as far as I can tell it's neither. I want to guess it's either the motherboard or the CPU, which was originally at 1.480 volts on auto in the BIOS and is now manually set to 1.3, but fluctuates between 1.0 and 1.4 at boot and then eventually stays at 1.4 volts on all cores according to OCCT. These sudden hard crashes behave like a safety measure responding to too much power draw, overheating, or power fluctuations.

Also, to clarify the previous mention of RDR2/RDO: the issues I have in that game come from USB devices randomly disconnecting and reconnecting while it runs. They may be related. At this point I'm very tempted to just put money down on a new mobo and cross my fingers.

Specs:
AMD Ryzen 9 3950X
32.0GB Corsair Vengeance LPX 32GB DDR4
B450 TOMAHAWK MAX
NVIDIA GeForce RTX 3090
4TB Hitachi HDD, 2TB Crucial SSD, 512GB Silicon Power NVMe
EVGA SuperNOVA 1000 G6, 80 Plus Gold


EDIT: I should also add that the power supply is not in eco mode, and that all my USB ports are in use, along with a non-externally-powered USB-C hub, which provides more USB connections and is also full.
 
Look in Reliability History for error codes and warnings that correspond with the times of the hard crashes.

Question: "usb connections also filled": what USB devices are installed? Unplug any not needed for gaming.

Try an externally powered USB hub.
 

Reliability History displays the shutdowns as "Windows was not properly shut down" and doesn't provide any further information. I should also add that Event Viewer lists these shutdowns as event 41 and also does not record any kind of error code.
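
For anyone wanting to line these event 41 entries up against crash times, here is a minimal Python sketch. It assumes the System log was exported from Event Viewer as CSV with the column headers shown in the sample; the sample rows are made up for illustration and are not from my machine. Adjust the column names if your export differs.

```python
import csv
import io


def kernel_power_events(csv_text, event_id="41", source_suffix="Kernel-Power"):
    """Return the timestamps of rows whose Event ID and Source match.

    Assumption: the CSV has "Date and Time", "Source" and "Event ID"
    columns, as in the hypothetical SAMPLE below.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    hits = []
    for row in reader:
        same_id = row.get("Event ID", "").strip() == event_id
        same_source = row.get("Source", "").strip().endswith(source_suffix)
        if same_id and same_source:
            hits.append(row.get("Date and Time", "").strip())
    return hits


# Hypothetical sample rows, for illustration only.
SAMPLE = """Level,Date and Time,Source,Event ID,Task Category
Critical,8/1/2021 9:14:02 PM,Microsoft-Windows-Kernel-Power,41,(63)
Information,8/1/2021 9:14:10 PM,Microsoft-Windows-Kernel-General,12,(1)
Error,8/1/2021 9:14:11 PM,EventLog,6008,None
"""

if __name__ == "__main__":
    for when in kernel_power_events(SAMPLE):
        print("Unexpected power loss logged at", when)
```

This only tells you when Windows noticed it had lost power without a clean shutdown. It cannot tell you why, which matches what I'm seeing.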


In terms of the USB disconnect issue: I'm inclined to agree that an externally powered hub is the solution, hence why I mentioned it in passing. The bigger issue is the shutdown problem, not the USB disconnects.
 
Not at all unusual for crashes to generate "not properly shutdown" error messages. Most likely as a result of the end user needing to switch off power in some manner versus the normal Windows shutdown icon.

What happens is that such shutdowns can and do cause file corruption which in turn causes even more problems.

Try running "sfc /scannow" and "dism" via an elevated (administrator) Command Prompt to clean up corrupted files.
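
If you would rather script the two repair tools than type them, a minimal Python sketch follows. The DISM switches shown (/Online /Cleanup-Image /RestoreHealth) are an assumption on my part, since the suggestion above only says "dism"; the function name and the injectable runner are hypothetical conveniences. Run it from an elevated prompt, as both tools need administrator rights.

```python
import platform
import subprocess

# Assumption: these are the usual switches; the advice above only names the tools.
# DISM is listed first so that sfc runs against a repaired component store.
REPAIR_COMMANDS = [
    ["DISM", "/Online", "/Cleanup-Image", "/RestoreHealth"],
    ["sfc", "/scannow"],
]


def run_repairs(commands=REPAIR_COMMANDS, runner=subprocess.run):
    """Run each command in order and return (command line, exit code) pairs."""
    results = []
    for cmd in commands:
        completed = runner(cmd)
        results.append((" ".join(cmd), completed.returncode))
    return results


if __name__ == "__main__":
    if platform.system() == "Windows":
        for line, code in run_repairs():
            print(f"{line} -> exit code {code}")
    else:
        print("sfc and DISM only exist on Windows; nothing was run.")
```

A non-zero exit code just means the tool reported a problem; read its own output for the details.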

References:

https://www.lifewire.com/how-to-use-sfc-scannow-to-repair-windows-system-files-2626161

Fix Windows Update errors via DISM or System Update Readiness tool - Windows Server | Microsoft Docs

Likewise the Windows built in troubleshooters may find and fix something.

However, cleaning up is just an interim step.

The power problem (USB or otherwise) may be the real root of the problem(s).

An externally powered USB hub may work per se but appear to fail due to corrupted files.

Try an externally powered USB hub, install but hold off on connecting any unnecessary USB devices.

Then clean up the files.

Determine if the system will boot and remain stable thereafter.

Next start adding back USB devices one by one. Ensure that all drivers are up to date. Manually download updated drivers via the applicable manufacturer's websites. Reinstall and reconfigure. No third party driver installers.

Allow some time between device installs or other changes that you make. Idea being that only "one thing" at a time is changed.

That will help identify if the problem is one particular USB device or its supporting software.
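
The "one thing at a time" approach above can be sketched in a few lines of Python. This is only an illustration of the logic: the device names and the is_stable check are hypothetical stand-ins for you plugging a device in, using the system for a while, and noting whether it stays up.

```python
def first_unstable_device(devices, is_stable):
    """Add devices one at a time; return the first one whose addition breaks stability.

    devices:   names in the order you plan to reconnect them (hypothetical).
    is_stable: callable that takes the list of connected devices and returns
               True if the system stayed stable with that set (hypothetical).
    Returns None if every device could be added without trouble.
    """
    connected = []
    for device in devices:
        connected.append(device)
        if not is_stable(list(connected)):
            return device
    return None


if __name__ == "__main__":
    # Made-up example: pretend the webcam is the troublemaker.
    plan = ["keyboard", "mouse", "webcam", "headset"]
    culprit = first_unstable_device(plan, lambda now: "webcam" not in now)
    print("First device that caused instability:", culprit)
```

Changing only one thing per step is what makes the result meaningful: if two devices are added together and the system crashes, you cannot tell which one was responsible.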
 

There'll probably be a bit of time between now and the next time I post to follow up on what happens with the USB hub, as I have work and I'll also have to order the aforementioned hub. But I don't see how a single USB device on its own could cause enough instability to make a computer, let alone mine, hard crash.

It's not like a GPU driver failing or a bluescreen; it's practically like a component in the computer loses power, or gets too hot and shuts off, and I subsequently have to turn the power supply off and on just to turn the computer back on. It doesn't automatically restart despite that being enabled, and no memory dump is created. There are absolutely no hints that tell me what this is besides what Google results give me, none of which has resolved the problem.

This is a distinct type of shutdown, unlike a GPU driver crash or a bluescreen. It would be like yanking the power to the GPU or the CPU power connector mid-operation. My only guesses are that I have too many USB devices drawing power from the computer and something trips, that the motherboard has a defect, or that my CPU has a defect.

I should also add that when these shutdowns happen, every single fan in the computer is still running and the motherboard LEDs still have power. The power button does nothing until the power supply is turned off and on, and then I am able to turn the computer on normally.
 
Remember that computers use three different voltages (3.3, 5, and 12 V) amongst varying components.

However, all components are linked one way or another, so a failure in or with one component is likely to have a domino effect.

Plus, the meaning of a lit LED may simply be that power is present, but not necessarily that there is enough power. A fan may spin but perhaps not fast enough to really cool as expected.

Much like your car headlights may turn on - the car battery works but there is not enough battery power to actually start the car. Eventually the headlights go dim as well.....

Another thing to keep in mind is that a computer is doing/processing many things (computations) all at once. All very fast and quantifiably measured in different ways.

The processing is a continual thread of actions, often in parallel. However, any interruption in a thread or threads breaks the thread, and any downstream dependencies subsequently fail. This happens very quickly, but in many cases the computer can recover. Error logs (Event Viewer) can reveal many such situations: harmless errors that are often used by scammers to fool people into thinking their computer was infected, broken, etc. as a shakedown tactic to solicit "repair" monies.

It does take time and effort to troubleshoot crashes: hardware? software? configuration? Even worse when intermittent errors occur.

And the error logs (Event Viewer, Reliability History) may or may not prove directly helpful.

If some bit of code is expecting input "X" and receives "Y" instead the program may crash unless the developer provided code to handle other possible inputs. And there are likely thousands of "inputs" taking place.

And failing to receive an expected input due to a power issue can have much the same effect.

So a USB device with some power problem may not provide the expected or necessary input, provide the wrong input, or perhaps provide no input at all. The software not knowing "what to do" simply stops. May or may not take the OS down with it (aka "bluescreen", "crash").

Sometimes a simple reset will recover the system. Other times a hard power down is needed.

For the moment the problems seem to be power related, and more specifically related to USB device power requirements.
 
After a bit of speculation and more Google digging, I believe my issue may lie in the mobo chipset overheating from a sort of runaway thermal situation, seeing as most X570 boards have active chipset cooling (a fan over the chipset), while the B450 Tomahawk Max doesn't.

I noticed this because, funnily enough, a review for a motherboard I was considering as a replacement stated that the reviewer's chipset wasn't properly cooled due to thermal padding issues on their board, and as a result either using the BIOS or putting load on the system (much like my case) induced a crash. Considering my CPU error LED lights up and my GPU LEDs cut off when it crashes, that makes sense. I found that the PCH sensor in HWiNFO shows temps going up fairly quickly under certain loads but coming down very slowly afterwards, a lot slower than the VRMs, GPU, CPU, etc. do. It also probably doesn't help that a 3090 covers the chipset heatsink completely, which is likely choking its airflow.
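
To put a rough number on "up quickly, down slowly", here is a minimal Python sketch that compares the average heating rate up to the peak with the average cooling rate after it. The readings are hypothetical placeholders, not values from my HWiNFO logs, and the function name is my own.

```python
def heat_and_cool_rates(samples):
    """Return (rise, fall) in degrees C per second.

    samples: list of (seconds, temp_c) tuples in time order.
    rise is the average rate from the first sample to the peak;
    fall is the average rate from the peak to the last sample.
    """
    peak_index = max(range(len(samples)), key=lambda i: samples[i][1])
    t_start, c_start = samples[0]
    t_peak, c_peak = samples[peak_index]
    t_end, c_end = samples[-1]
    rise = (c_peak - c_start) / (t_peak - t_start) if t_peak > t_start else 0.0
    fall = (c_peak - c_end) / (t_end - t_peak) if t_end > t_peak else 0.0
    return rise, fall


# Hypothetical PCH readings: (seconds since start, degrees C).
PCH_LOG = [(0, 50.0), (60, 62.0), (120, 74.0), (360, 68.0), (600, 62.0)]

if __name__ == "__main__":
    rise, fall = heat_and_cool_rates(PCH_LOG)
    print(f"heating: {rise:.3f} C/s, cooling: {fall:.3f} C/s")
```

A cooling rate that is a small fraction of the heating rate would be consistent with a heatsink that can't shed heat well, though on its own it doesn't prove the chipset is what's crashing the system.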

At this point I'll likely be ordering a new mobo to see if this issue continues, but I am starting to think I may have determined the source of my problem here. I may try a stress test to see the results.
 
Bumping this thread in case old threads with no activity get auto-locked, because I don't want someone to suffer through a similar issue and get met with the classic "fixed it" with no explanation. The motherboard arrived a few days ago, and I'm waiting on some new thermal paste and a powered USB hub. I also got the Meshify 2 Compact today; I figure I'm pretty much revamping my entire computer at this point with some upgrades. I still have a very distinct feeling it's the chipset, but I haven't tried to crash my computer by recreating the issue's conditions because I don't want to damage anything. It's the component that for sure doesn't have the best cooling, and the shutdown definitely acts like overheating. Will continue to post updates.
 
Welp. My hunch about it being the mobo was 100% right. I'm not exactly sure if it was the chipset, but it was definitely the motherboard. I recreated both scenarios that caused crashes and there were 0 crashes whatsoever on the new board. My bet is that my Tomahawk Max was faulty, but thankfully I believe the issue is resolved. It's not amazing that I essentially had to find out through trial and error and burned through a couple hundred on a 1000W PSU that did nothing to fix the issue, but that means my 750 G+ will serve as a backup. I would still highly recommend swapping the PSU before the mobo, just to see if it could be the PSU, because rebuilding a computer takes more time than a PSU swap. I see the upgrades I've done as fairly necessary (the Meshify 2 Compact is IMHO a few leaps better than the Meshify C), so I won't complain; I'm just happy that everything seems to be back in working order. Will post in this thread again if anything comes up.
 