Question: 3090 Ti shutdown under load

Status
Not open for further replies.
Feb 22, 2023
TL;DR: 3090 Ti is shutting down under load; I suspected PSU based on troubleshooting, but recent symptom has me doubting the diagnosis

Build:
  • AMD Ryzen 7 3800X cpu
  • Wraith Prism LED Cooler
  • MSI MPG X570 GAMING PLUS mobo
  • G.Skill Trident Z RGB Series 64GB (2x 32GB) ram
  • ASUS TUF Gaming GeForce RTX 3090 Ti 24GB gpu
  • 2TB Samsung 980 Pro NVMe SSD w/ factory heatsink as main drive
  • 4TB Samsung 870 QVO SSD as storage drive
  • Corsair HX1000 psu
  • OS: Windows 10 64bit
Issue: GPU shutting down while under a load

Specifics: While I was playing Fallout 76 yesterday evening, my GPU suddenly shut down: the RGB on the spine and the fans shut off, the red LED next to the power connector came on, and all 3 attached displays went dark. I believe it is important to note that the game's ambient audio and music kept playing. To recover, I held the power button until the PC shut down entirely, which is how I have recovered from every instance of the issue. Thinking there might be a driver issue, I took troubleshooting step 1. When I attempted to resume play, the same issue arose. I also noticed that when turning the PC back on shortly after shutting it down, the red LED on the GPU would sometimes stay lit. In these instances, the system would not boot with any display output, and I would be forced to repeat the recovery.

After this happened twice, I decided to run GPU Z while playing Fallout 76 to see if anything suspect showed up in the logs. Other than the card running pretty warm (steady 81-82°C, last dusted when the GPU was installed 4 months ago), nothing seemed out of the ordinary, but the same GPU shutdown did occur. I will also note that when the GPU shutdown occurs, all readings from the GPU appear to cease immediately, with two exceptions: Memory Used [MB], which took nearly an additional minute to zero, and the Power Draw readings, which all read 0.0 and took the same amount of time to disappear from the GPU Z log. This is pretty consistent, though it is sometimes cut short by me forcing a manual shutdown of the PC. Before anything else, I took troubleshooting step 2, as I wanted to continue on the last driver with which I did not have this issue.
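For anyone wanting to locate the moment of shutdown in a sensor log rather than scrolling for it, here is a minimal sketch. It assumes the log is comma-separated text with a header row and space-padded cells; the column names and every value in SAMPLE are made up for illustration and are not taken from my actual log.

```python
import csv
import io

def load_log(text):
    """Parse comma-separated log text into a list of dicts with trimmed keys and values."""
    reader = csv.reader(io.StringIO(text))
    header = [h.strip() for h in next(reader)]
    rows = []
    for raw in reader:
        cells = [c.strip() for c in raw]
        if not any(cells):
            continue  # skip blank lines
        rows.append(dict(zip(header, cells)))
    return rows

def to_float(cell):
    """Return the cell as a float, or None if it is blank or not numeric."""
    try:
        return float(cell)
    except ValueError:
        return None

def find_dropout(rows, column):
    """Index of the first row where `column` reads 0 or blank after a nonzero reading."""
    seen_load = False
    for i, row in enumerate(rows):
        value = to_float(row.get(column, ""))
        if value is not None and value > 0:
            seen_load = True
        elif seen_load:
            return i
    return None

# Hypothetical sample: header names and readings are placeholders, not real data.
SAMPLE = """Date , GPU Temperature [C] , Board Power Draw [W] , Memory Used [MB]
t0 , 80.0 , 430.0 , 9000
t1 , 81.0 , 445.0 , 9100
t2 ,  , 0.0 , 9100
t3 ,  , 0.0 , 9100
"""

rows = load_log(SAMPLE)
idx = find_dropout(rows, "Board Power Draw [W]")
print("dropout row:", idx)
print("last good sample:", rows[idx - 1])
```

To use it on a real log, read the file into a string, pass it to load_log, and give find_dropout whichever power column the log actually contains.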

This led to troubleshooting step 3. Both of these runs caused a GPU shutdown after approximately 6 minutes. After the first, I waited an additional 6 minutes to see if the system would recover on its own, which it did not, but GPU Z did log that the CPU temp and System Memory used remained consistent, so the issue appears to be isolated to the GPU. I then tried a less intensive game (World of Tanks), but had the GPU shut down while I was tabbed out watching a YouTube video, maybe 15-20 minutes after I had launched the game. At this point I read about the "single" vs "multi" rail setting on the HX1000, and decided to check in the morning whether that or any loose connections had anything to do with my issues.

This morning I completed troubleshooting step 4. While checking connections, I noticed that I had actually daisy chained two of the 8 pin PCIe connectors feeding the 3 to 1 pigtail that supplies the GPU. I am not entirely sure why I did this, unless I was unable to locate my spare modular cables when I installed the GPU. Regardless, I rectified the issue; 3 independent 8 pin PCIe power cables now run from the PSU to the 3 to 1 pigtail for the GPU. After everything was buttoned back up, I took troubleshooting step 5. This led to another GPU shutdown, again in about the same 6 minute timeframe as before. I decided to research these symptoms further before continuing.
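To show why the daisy chain was worth fixing, here is a rough worst-case sketch of the load per PSU cable. The 150 W figure is the PCIe rating for one 8 pin connector. The 450 W board power is the reference figure for a 3090 Ti and is only an assumption for this particular ASUS card, as are the even split across connectors and the default of nothing being drawn through the slot.

```python
PCIE_8PIN_RATING_W = 150.0  # PCIe rating for a single 8 pin connector
BOARD_POWER_W = 450.0       # assumption: reference 3090 Ti board power, not measured

def per_cable_load(connectors_per_cable, board_power=BOARD_POWER_W, slot_w=0.0):
    """Watts carried by each PSU cable, assuming an even split across all connectors.

    connectors_per_cable: e.g. [2, 1] means one cable feeds two connectors
    (a daisy chain) and a second cable feeds one.
    slot_w: power assumed to come through the motherboard slot (0 = worst case).
    """
    total_connectors = sum(connectors_per_cable)
    per_connector = (board_power - slot_w) / total_connectors
    return [n * per_connector for n in connectors_per_cable]

daisy_chained = per_cable_load([2, 1])    # how it was originally wired
independent = per_cable_load([1, 1, 1])   # how it is wired now

for label, loads in (("daisy chained", daisy_chained), ("independent", independent)):
    over = [w for w in loads if w > PCIE_8PIN_RATING_W]
    print(f"{label}: {loads} W per cable, {len(over)} cable(s) above one connector's rating")
```

Under these assumptions the daisy chained cable carries about twice the load of a single connector, while three independent cables each sit at the rating. Whether a given PSU cable is built to tolerate the doubled load is up to its manufacturer, so this only shows why the wiring was suspect, not that it caused the shutdowns.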

Picking back up, I conducted troubleshooting step 6. The first run of the OCCT test brought a GPU shutdown in just over 2 minutes. With the power limit lowered to 95%, it lasted just over 4 minutes; at 90%, about 6. At 80% it ran a whole 14 minutes, so I really thought I was getting somewhere. I cut another fifth off the power and dropped to 66%, and this stress test lasted a whole 46 minutes. At this point I was convinced that lowering it further would make it stable in a stress test for over an hour, so I just dropped it to 50% and decided that since the PSU does better the less stress I put on it, it must be the PSU. After all, that seems to be what most people who experience these types of loaded GPU shutdowns need to fix. So I ordered a new PSU and went on my way. I considered testing my HX1000 with the power supply tester I have (Thermaltake Dr. Power II), but the tester itself says it is compatible with up to ATX12V 2.3 and the PSU is ATX12V 2.4, so I am not sure the tester would give accurate data.
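The trend from those runs can be tabulated with a short sketch. The times are the approximate minutes quoted above, rounded down where I wrote "just over", and treating the first run as the 100% power limit is an assumption, since I did not write that setting down.

```python
# (power limit %, approximate minutes until GPU shutdown) as described above.
# Assumption: the first OCCT run was at the default 100% power limit.
RUNS = [(100, 2), (95, 4), (90, 6), (80, 14), (66, 46)]

def is_monotonic(runs):
    """True if every reduction in power limit gave a strictly longer run."""
    ordered = sorted(runs, reverse=True)  # highest power limit first
    times = [minutes for _, minutes in ordered]
    return all(a < b for a, b in zip(times, times[1:]))

def growth_factors(runs):
    """For each step down in power limit, how many times longer the run lasted."""
    ordered = sorted(runs, reverse=True)
    return [
        (lower, round(t_lower / t_higher, 2))
        for (_, t_higher), (lower, t_lower) in zip(ordered, ordered[1:])
    ]

print("longer at every lower limit:", is_monotonic(RUNS))
for limit, factor in growth_factors(RUNS):
    print(f"down to {limit}%: ran {factor}x as long as the previous step")
```

Each step down in power limit bought a longer run, and the gain grew at the lower settings, which is what pushed me toward blaming the PSU.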

Then I decided to try Fallout 76 again. I played maybe 10 minutes and had another GPU shutdown. This was after about 2.5 hours of uptime in which, outside of Fallout 76, the most GPU-intensive tasks were YouTube videos, all at that 50% GPU power limit. As I finish writing this, I am again at about 2.5 hours of uptime, with most of it spent here or looking at logs.

Troubleshooting Steps Taken:
  1. Updated Nvidia driver from version 528.24 to 528.29 via GeForce Experience.
  2. Created System Restore Point, uninstalled Nvidia Driver with DDU, and installed last known good version, 528.02.
  3. Two runs of FurMark GPU Stress Test, Fullscreen, Resolution 3840x2160. Logged with GPU Z.
  4. Opened and cleaned case, checked connections, changed PSU from "multi" rail setting to "single", addressed daisy chain.
  5. One run of FurMark GPU Stress Test, Fullscreen, Resolution 3840x2160. Logged with GPU Z.
  6. Ran a System > Power test with OCCT until GPU shutdown, recovered, reduced the power limit through MSI Afterburner, and repeated. Logged with GPU Z.

Thanks in advance.

Edit: missed a step.
 
Last edited:
Feb 22, 2023
It's all quality components. And since ONLY the GPU is shutting down and the PC stays on, I'm going to say a failing GPU.

The easiest/cheapest thing to try is another PSU of at least 850 W.

Well, that's unfortunate. But it's new enough that I should be able to RMA it through the ASUS warranty, I believe. I've never had to before.

I have an HX1500i that will be here Friday. Every other PSU in the house is 760 W or lower. If the PSU isn't the problem, I will probably order a new GPU before warrantying this one, as the only backup we have is a 970.
 