Question: PC crashes to desktop or reboots when gaming - I tried everything I could think of to fix it

DMs_ZC

Distinguished
May 31, 2013
16
0
18,510
Hi all,

I hope I'm posting this in the correct place. I've come here because I don't know what else to try. I'm having an issue where my computer is crashing while gaming. I recently upgraded my motherboard, CPU, and memory. Everything was running great for a few weeks. Starting yesterday though, my PC crashes to desktop, freezes and reboots, or bluescreens after an indeterminate amount of time playing. It seems to be based on resource load: the less intensive the game, the longer I can play. More intensive games crash almost immediately, within roughly 1-7 minutes. Think MSFS in VR or a AAA game with nice graphics and ray tracing enabled. Less resource-heavy games like BattleBit Remastered seem to run fine. My PC seems to be running perfectly in every other capacity.

While gaming I usually crash to the desktop and my computer acts weird for a few minutes - everything is frozen and my toolbar/taskbar disappears. Other times though, it freezes on the current screen then reboots or I get the bluescreen and reboot. The type of crash seems to vary based on the game.

On bluescreen crashes I've gotten the “DPC_WATCHDOG_VIOLATION” error.

------------------------------------------------------------------------------------------------------------------------------------

Possible issues and troubleshooting I've tried so far:

Outdated or broken GPU driver - NVIDIA released a new driver yesterday. I installed it and rebooted but am still crashing. New drivers don't necessarily mean good drivers, so I completely wiped all traces of any NVIDIA drivers with DDU (Display Driver Uninstaller) and installed a driver that was known to be stable with MSFS2020. This didn’t solve my problem either.

My CPU and/or GPU are getting overstressed and overheated - Neither are overclocked. I've been using HWiNFO64 to monitor temps. My CPU stays cool around 60-70degC even in extreme situations and my GPU is staying around 50-70degC. I cannot see if temps spiked right before the crash because my whole screen freezes. I was recently able to capture a picture right at the crash. None of my temps or power draws seem to be out of the ordinary. I did run one log through HWiNFO64 that I saved but I'm unsure if it caught the issue. I'm unsure how to read the report.
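On reading the HWiNFO64 log: the logging feature writes a CSV file, so the last few rows before a freeze can be pulled out with a short Python script. This is only a rough sketch, not an official HWiNFO tool. The column names in the sample (`GPU Temperature [°C]` and so on) are assumptions and will differ per system, so check them against the header row of the actual log.

```python
import csv
import io


def last_readings(log_text, wanted, n=5):
    """Return the last n data rows of a HWiNFO-style CSV log, limited to
    columns whose header contains one of the `wanted` substrings."""
    rows = list(csv.reader(io.StringIO(log_text)))
    if not rows:
        return []
    header = rows[0]
    # Keep the first two columns (assumed to be Date and Time) plus matches.
    keep = [i for i, name in enumerate(header)
            if i < 2 or any(w.lower() in name.lower() for w in wanted)]
    # Skip blank or short lines and any row that repeats the header.
    data = [r for r in rows[1:] if r != header and len(r) >= len(header)]
    return [{header[i]: r[i] for i in keep} for r in data[-n:]]


if __name__ == "__main__":
    # Hypothetical sample; a real log has far more columns and rows.
    sample = (
        "Date,Time,CPU (Tctl/Tdie) [°C],GPU Temperature [°C],GPU Power [W]\n"
        "1.1.2024,20:00:01,62.1,55.0,310.5\n"
        "1.1.2024,20:00:03,64.8,61.2,402.7\n"
        "1.1.2024,20:00:05,66.0,68.9,447.3\n"
    )
    for row in last_readings(sample, ["GPU"], n=2):
        print(row)
```

For a real file, read it with `open(path, encoding="latin-1").read()` if the default encoding fails, then pass the text in. A sudden jump or drop in the final rows would be worth a closer look, though a log sampled every second or two can easily miss a very short spike.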

I have attached screenshots of CPU and GPU temps while running UserBenchmark.

Memory Issue - I found someone with a similar issue to mine. His XMP profile ended up being the issue. Once he turned off XMP his issue was solved. My DDR5 memory was running stable at 6000 speeds. I reset my motherboard to all default settings and my memory is now running at its default 4600 speeds. This has not solved my issue. Last night I ran a memory test. Through the command line I ran "mdsched" and let it go overnight. It found no errors. I also ran Memtest86. The process took over seven hours to complete but everything passed, zero errors.

I have included Memtest86 photos in my Imgur link.

My power supply isn't big enough - I have a 1000W power supply. I plugged all my components into PC Part Picker and it suggested I needed less than 800W to run my setup. I know PCPP isn't infallible but it should get me in the ballpark at least and by their calculations I'm more than adequate with 1000W.

Transient Power Spikes - Looking into this one now. Apparently with newer 30/40 series GPUs this has become a problem because they draw so much power with their base/usage loads. Apparently spikes can be 2x their normal power draw. I'm trying to find a program that can monitor transient spikes through crashes and give me a report. From my research, it seems that transient spikes that your system can’t handle cause everything to shut down. This has not been the case in my situation - it's always a crash to desktop or a reboot, never a power loss.
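To put rough numbers on the spike question, here is a back-of-the-envelope sketch. It assumes the RTX 4090's rated 450 W board power and the 7800X3D's 120 W TDP, takes the 2x spike figure at face value, and uses a guessed 100 W for everything else. None of these are measurements from this system.

```python
def worst_case_watts(gpu_watts, cpu_watts, other_watts, gpu_spike_factor):
    """Peak draw if only the GPU spikes by gpu_spike_factor; the rest stays flat."""
    return gpu_watts * gpu_spike_factor + cpu_watts + other_watts


GPU_W = 450     # RTX 4090 rated board power
CPU_W = 120     # Ryzen 7 7800X3D TDP (actual package power can differ)
OTHER_W = 100   # assumption: board, RAM, drives, fans, USB peripherals
PSU_W = 1000    # label rating of the power supply

sustained = worst_case_watts(GPU_W, CPU_W, OTHER_W, 1.0)
spike = worst_case_watts(GPU_W, CPU_W, OTHER_W, 2.0)

print(f"sustained estimate: {sustained:.0f} W")
print(f"2x GPU spike estimate: {spike:.0f} W")
print(f"spike exceeds PSU label rating: {spike > PSU_W}")
```

Under these assumptions the sustained load sits well below 1000 W, while a full 2x GPU spike would briefly go above it. How well a PSU rides through such short spikes is not captured by its label wattage, so this narrows the question rather than answering it.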

Bad SSD or out-of-date SSD Drivers - I do not believe this is the issue. Games have been crashing on both my SSDs. I need to troubleshoot this more but device manager says all my HDD/SSD drivers are up-to-date.

Surge Protector - I found another post with someone having similar issues. He said that his surge protector was causing the issue. Plugging directly into the wall seemed to solve the problem. I tried this and it did not solve anything.

BIOS out of date - My BIOS is updated to the most recent version.

------------------------------------------------------------------------------------------------------------------------------------

Current Specs:

  • Windows 10
  • ASRock X670E Steel Legend MOBO
  • AMD Ryzen 7 7800X3D
  • DeepCool AK620
  • ZOTAC Gaming GeForce RTX 4090 Trinity (Non-OC version)
  • G.Skill Trident Z5 NEO RGB (AMD Expo) 64GB - DDR5 6000 CL30-40-40-96 - 1.40V
  • EVGA SuperNOVA 1000GT, 80 Plus Gold 1000W
  • HP Reverb G2
  • Various flight sim peripherals via USB to MOBO/USB Hub to MOBO - HOTAS, Collective, Yoke, Rudder Pedals

------------------------------------------------------------------------------------------------------------------------------------

I generally have nothing overclocked except for RAM and only to their advertised speeds. Regardless though, I've reset all settings to default in my BIOS. I’ve been at this for 36 hours now and I genuinely have no idea what the issue could be. I’ve tried everything I can think of but I’m at a loss, any help or suggestions would be greatly appreciated. I'm not extremely tech savvy but I'm trying to learn. If I can provide any more information please let me know. Thanks for taking the time to read!

User Benchmark Link:

https://www.userbenchmark.com/UserRun/62449960

Imgur Link to all Photos and Screenshots:

View: https://imgur.com/a/P81woeW


Photo of HWiNFO64 and HWMonitor at time of crash. I don't see any temps or power draw out of the ordinary:
View: https://imgur.com/a/LnCmN7u
 

Misgar

Commendable
Mar 2, 2023
1,497
395
1,590
An extremely thorough description. Well done. Some posts are unintelligible.

I'm not into gaming but your system crash is similar to the one I was experiencing during video processing. The software I use is constantly changing, with weekly updates to improve performance and fix instabilities.

I'm using a 7950X with 64GB RAM at 4800MHz (no DOCP/XMP) and an RTX 3060 GPU. I considered a 4090 which would have rendered my videos two and a half times faster, but at five times the cost. I run the system overnight so time is not an issue.

When I started using the video software last December, I'd come back in the morning to find the render had crashed. HWInfo didn't show anything untoward, all powers and temperatures looked normal, but the system still crashed.

I'm using NVidia Studio drivers because they're supposed to be more stable.

After some experimenting, I settled on limiting GPU max. power in MSI Afterburner and set it to 95%. This "fixed" the problem at the time and I haven't bothered to see if improvements in the video software make the Afterburner tweak redundant.

Some forums suggested undervolting the GPU, but the Afterburner "fix" was much easier and required less experimenting. The video software runs the GPU flat out for hours on end.

I'm not sure if Afterburner can calm the 2x current spikes in the 4090, but if you can sacrifice some gaming performance, give it a try. 1200W and 1500W PSUs are expensive.
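A similar cap can also be set without Afterburner. As a hedged sketch: NVIDIA's `nvidia-smi` command-line tool can report a card's default power limit and set a new one with `-pl`, which needs administrator rights and may not be permitted on every card. The helper below only works out the wattage and builds the command. The 95% figure mirrors the Afterburner setting, and the 450 W in the example is the 4090's rated board power, an assumption rather than a value read from a real card.

```python
import subprocess


def query_default_limit_watts():
    """Ask nvidia-smi for the first GPU's default power limit in watts.
    Raises if nvidia-smi is missing or the card reports no numeric value."""
    out = subprocess.run(
        ["nvidia-smi", "--query-gpu=power.default_limit",
         "--format=csv,noheader,nounits"],
        capture_output=True, text=True, check=True,
    ).stdout
    return float(out.strip().splitlines()[0])


def capped_limit_command(default_watts, percent=95):
    """Build (but do not run) the nvidia-smi command for a percentage cap.
    The target is rounded down to a whole watt."""
    target = int(default_watts * percent / 100)
    return ["nvidia-smi", "-pl", str(target)]


if __name__ == "__main__":
    # 450.0 is an assumed default; on a real system use
    # query_default_limit_watts() instead.
    print(" ".join(capped_limit_command(450.0, percent=95)))
```

Keeping the query and the command builder separate means the number can be checked before anything is applied to the card.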
 

DMs_ZC

Hey man, thanks for the reply! Undervolting or underclocking isn't something I even considered, but it makes sense. I'm going to go try that right now. If it does end up "solving" the problem, at least I will know that it's all power related. If that does end up being the issue, I think I'll just have to bite the bullet and buy a 1600W PSU (absolutely ridiculous). Lol, I feel like it's unethical to buy such a monstrosity of a card as the 4090 and then limit its power. Thanks again for the suggestion, I'll report back!
 

DMs_ZC

Well guys, the issue has been solved. I purchased a 1600W PSU today and plugged it in. Not only did this not solve the issue, but the crashes became more frequent and ultimately I started seeing artifacts after reboot. I guess my GPU is on its last legs. Really unfortunate, this card is less than a year old and has never been overclocked. Not sure what ultimately caused the issue. Just bad luck I guess. Time to find a new card and move on. Thanks for the help Misgar.
 
Just informational: your 1000W PSU is plenty for that card (assuming the PSU is not dying of course). If you had an ATX 3.0 PSU you could even safely drop down to 850W with no concerns (there's a minimum double power transient requirement for ATX 3.0).

As for the issue, while I think the GPU is a reasonable assumption based on the number of things you've tried, there are a couple of things I'd do:
1) Run LatencyMon while doing something that usually leads to a crash, just to see if it picks up anything that may not be the video card.
2) Unplug all peripherals that aren't the keyboard/mouse and see if that makes any difference.

I doubt either will lead to a different conclusion, but worth trying before doing anything regarding the card.
 

gekraider

Distinguished
Jul 11, 2012
1
0
18,510
It's simple bro, your card is dying. If your video card is still under warranty, send it back to the shop. If not, try limiting the GPU memory clock with an offset of anywhere from -100 to -800 MHz (find the best value by stepping down one step at a time). I used NVIDIA Inspector on my old video card to make the frequency downshift permanent.
As a stability test you can use 3DMark Time Spy; it is the best GPU memory benchmark I've seen. If your card gets through 2-3 runs in a row and you see no artifacts, it is pretty stable.
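The step-by-step downshifting can be written out as a simple loop. This is only a sketch of the procedure: `is_stable` is a hypothetical stand-in for "apply the offset by hand, run Time Spy 2-3 times in a row, watch for artifacts", and the 100 MHz step size is an assumption.

```python
def find_memory_offset(is_stable, start=-100, stop=-800, step=-100):
    """Try offsets from start down to stop (MHz) and return the first one
    that passes is_stable, i.e. the smallest reduction that works.
    Returns None if no offset in the range is stable."""
    offset = start
    while offset >= stop:
        if is_stable(offset):
            return offset
        offset += step
    return None


if __name__ == "__main__":
    # Pretend card that only behaves once memory runs at least 300 MHz slower.
    def pretend_card(offset_mhz):
        return offset_mhz <= -300

    print(find_memory_offset(pretend_card))
```

If the loop runs all the way to -800 MHz without a stable result, lowering the memory clock is probably not going to rescue the card.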
 