[SOLVED] Unexpected Shutdowns after installing new GPU (3080)

Jun 18, 2021
5
1
15
Hi, I bought a new, factory-sealed 3080 yesterday. I upgraded from a 1080 Ti. Reinstalled the Nvidia driver just to be sure.
I am currently using this power supply, which is a little over a year old: https://www.amazon.de/-/en/gp/product/B01N9HIND1/ref=ppx_yo_dt_b_search_asin_title?ie=UTF8&psc=1

The first thing I did after installing the GPU was to check whether it was working correctly.
To do that, I first ran dxdiag to check for obvious errors, but it turned out fine. I then installed FurMark and ran some presets, which worked just fine, before I switched to stress tests. Even after letting a stress test run for ~20 minutes, everything was fine and the GPU temperature did not rise above 79°C.

I then played a little bit of WoW, which ran perfectly fine. I switched to Final Fantasy 14 and played for around 10 minutes before I entered a new location (one I had never been in before) and tried to accept a new quest.
At that point my computer just shut down as if the power had been cut. I could only reboot the computer after unplugging the power cable, waiting a few seconds and replugging it.
Of course this triggered some alarms, and I went back to check everything I could think of.
Reinstalled drivers. Ran another stress test. Started monitoring GPU and CPU under load.

Everything was fine; the GPU never went above 79°C. The CPU spiked to 84°C, but as it's a 5900X this is apparently normal: https://www.techradar.com/reviews/amd-ryzen-9-5900x
I tried a few more games, and everything ran fine. I then decided to test FF14 again. Booted up, ran around for 15 seconds, tried to accept the same quest as before, and again a complete shutdown.
I restarted and decided to try to reproduce this exactly. I started FF14 up again, and this time it crashed even before I got to accept the quest (but while I was still standing right next to the quest giver). I had monitored GPU temp, CPU temp and GPU power consumption while this happened, and right up until the crash everything seemed completely fine (power consumption according to HWMonitor at around 92-96%, CPU temp at 80°C, GPU temp at 72°C).
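One way to make this kind of monitoring more useful is to log the sensor readings to a file and look at the last samples before the shutdown. The following is only a minimal sketch: it assumes a hypothetical CSV log with the columns `time_s`, `gpu_temp_c`, `cpu_temp_c` and `gpu_power_pct`, which is not the export format of HWMonitor or any other specific tool, and the sample values are invented to roughly echo the readings above.

```python
import csv
import io

# Hypothetical log excerpt; column names and values are illustrative only.
SAMPLE_LOG = """time_s,gpu_temp_c,cpu_temp_c,gpu_power_pct
0,70,78,92
1,71,79,94
2,72,80,96
"""

def summarize(log_text, tail=2):
    """Return per-column peaks and the last `tail` rows before the log ends."""
    rows = [{k: float(v) for k, v in row.items()}
            for row in csv.DictReader(io.StringIO(log_text))]
    columns = [c for c in rows[0] if c != "time_s"]
    peaks = {c: max(r[c] for r in rows) for c in columns}
    return peaks, rows[-tail:]

peaks, last_rows = summarize(SAMPLE_LOG)
print("peaks:", peaks)
print("last samples before the log stops:", last_rows)
```

If the peaks stay well below the thermal limits right up to the final sample, overheating becomes a less likely explanation for the shutdown.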

So I started testing all of that again, and I no longer have any idea what to do, because I can now reliably reproduce this issue, but only in FF14. I don't even care that much about not being able to play exactly that game, but it is very weird that this started right after I installed the new GPU.

In the evening I tried again after changing my power cable and plugging into a different outlet in my flat. Instead of staying in the same zone, I quickly moved somewhere else before anything happened and stayed there. I was able to stay there for around 10 minutes, spent 5 minutes walking around and actually PLAYING the game, went back to the zone from before, stayed for 5 minutes, then went afk for 2-3 minutes. When I came back, the PC had shut down.

I know that this might be connected to the PSU, that it's slowly dying and FF14 is just triggering it in EXACTLY the right way, but honestly, this feels really, really unlikely. The main reason is that, as far as I know, the 1080 Ti and the 3080 both use 320 W, which means that I am not suddenly using more than before, and it was working fine just a day ago. I know the 3080 might spike and consume more power, but then again, I am already using a 750 W PSU.

I should also mention that everything worked just fine literally a day ago (before I installed the new GPU).

I would of course really like to figure this out, preferably without completely reinstalling Windows or having to disassemble everything and try a new PSU when I am not at least 90% sure it really IS the PSU. The horror scenario would of course be that it's actually the GPU, but how can I figure that out?
Can you people help me, please?

Full system specs:
Nvidia RTX 3080
AMD Ryzen 9 5900X
RAM: https://www.mindfactory.de/product_...arz-DDR4-3600-DIMM-CL16-Dual-Kit_1333992.html
Power supply: https://www.amazon.de/-/en/gp/product/B01N9HIND1/ref=ppx_yo_dt_b_search_asin_title?ie=UTF8&psc=1
1 NVME + 2 SSDs (NVME: https://www.amazon.de/gp/product/B07M64QXMN/ref=ppx_yo_dt_b_search_asin_title?ie=UTF8&psc=1)
Cooler: https://www.amazon.de/gp/product/B07GKSFC6J/ref=ppx_yo_dt_b_search_asin_title?ie=UTF8&psc=1
 
Solution
Couple of things, and keep in mind that "reinstalling drivers" is NOT the same thing as COMPLETELY removing them using the DDU and THEN installing the latest version. FWIW I believe your PSU is good quality and has sufficient capacity for that card. Whether or not there might be something wrong with your PSU could be a different story, but I'd try all of these things first. Using the DDU and making sure you have the MOST up-to-date, NON-beta motherboard BIOS version are probably the most important.

If there are any steps listed here that you have not already done, it would be advisable to do so if for no other reason than to be able to say you've already done it and eliminate that possibility.



First,

Make sure your motherboard has the MOST recent BIOS version installed. If it does not, then update. This solves a high number of issues even in cases where the release that is newer than yours makes no mention of improving graphics card or other hardware compatibility. They do not list every change they have made when they post a new BIOS release. In cases where you DO already have the latest BIOS version, simply resetting the BIOS as follows has a fairly high percentage chance of effecting a positive change in some cases so it is ALWAYS worth TRYING, at the very least.


BIOS Hard Reset procedure

Power off the unit, switch the PSU off and unplug the PSU cord from either the wall or the power supply.

Remove the motherboard CMOS battery for about three to five minutes. In some cases it may be necessary to remove the graphics card to access the CMOS battery.

While the CMOS battery is out of the motherboard, press and hold the power button on the case for 15-30 seconds in order to deplete any residual charge that might be present in the CMOS circuit. Once the three to five minutes are up, reinstall the CMOS battery, making sure to insert it with the correct side up, just as it came out.

If you had to remove the graphics card you can now reinstall it, but remember to reconnect your power cables if there were any attached to it as well as your display cable.

Now, plug the power supply cable back in, switch the PSU back on and power up the system. It should display the POST screen and the options to enter CMOS/BIOS setup. Enter the bios setup program and reconfigure the boot settings for either the Windows boot manager or for legacy systems, the drive your OS is installed on if necessary.

Save settings and exit. If the system will POST and boot then you can move forward from there including going back into the bios and configuring any other custom settings you may need to configure such as Memory XMP, A-XMP or D.O.C.P profile settings, custom fan profile settings or other specific settings you may have previously had configured that were wiped out by resetting the CMOS.

In some cases it may be necessary when you go into the BIOS after a reset, to load the Optimal default or Default values and then save settings, to actually get the hardware tables to reset in the boot manager.

It is probably also worth mentioning that for anything that might require an attempt to DO a hard reset in the first place, IF the problem is related to a lack of video signal, it is a GOOD IDEA to try a different type of display as many systems will not work properly for some reason with displayport configurations. It is worth trying HDMI if you are having no display or lack of visual ability to enter the BIOS, or no signal messages.

Trying a different monitor as well, if possible, is also a good idea if there is a lack of display. It happens.


Second,

Go to the product page for your motherboard on the manufacturer website. Download and install the latest driver versions for the chipset, storage controllers, audio and network adapters. Do not skip installing a newer driver just because you think it is not relevant to the problem you are having. The drivers for one device can often affect ALL other devices and a questionable driver release can cause instability in the OS itself. They don't release new drivers just for fun. If there is a new driver release for a component, there is a good reason for it. The same goes for BIOS updates. When it comes to the chipset drivers, if your motherboard manufacturer lists a chipset driver that is newer than what the chipset developer (Intel or AMD, for our purposes) lists, then use that one. If Intel (Or AMD) shows a chipset driver version that is newer than what is available from the motherboard product page, then use that one. Always use the newest chipset driver that you can get and always use ONLY the chipset drivers available from either the motherboard manufacturer, AMD or Intel.
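The "use whichever one is newer" rule for chipset drivers can be sketched as a simple version comparison. The version strings below are made up for illustration, and real chipset driver packages may use a different numbering scheme, so treat the parsing as an assumption.

```python
def parse_version(text):
    """Split a dotted version string such as '2.13.27.501' into integers."""
    return tuple(int(part) for part in text.split("."))

def pick_newest(candidates):
    """Return the (source, version) pair with the highest version number."""
    return max(candidates.items(), key=lambda item: parse_version(item[1]))

# Hypothetical version numbers for illustration only.
available = {
    "motherboard vendor page": "2.10.13.408",
    "chipset developer page": "2.13.27.501",
}
source, version = pick_newest(available)
print(f"Install {version} from the {source}")
```

Comparing the parts as integers matters: as plain strings, "2.9" would sort after "2.10" even though it is the older release.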


IF you have other hardware installed or attached to the system that are not a part of the systems covered by the motherboard drivers, then go to the support page for THAT component and check to see if there are newer drivers available for that as well. If there are, install them.


Third,

Make sure your memory is running at the correct advertised speed in the BIOS. This may require that you set the memory to run at the XMP profile settings. Also, make sure you have the memory installed in the correct slots and that it is running in dual channel, which you can check by installing CPU-Z and checking the Memory and SPD tabs. On all modern motherboards with dual channel memory architectures, from the last ten years at least, if you have two sticks installed they should be in the A2 (called DDR4_1 on some boards) and B2 (called DDR4_2 on some boards) slots, which are ALWAYS the SECOND and FOURTH slots over from the CPU socket, counting TOWARDS the edge of the motherboard, EXCEPT on boards that only have two memory slots total. In that case, if you have two modules it's not rocket science, but if you have only one, then install it in the A1 or DDR4_1 slot.
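The slot rule above can be written down as a small lookup. This is just a sketch of the cases described here; the function name is made up, the "B1" label for the second slot on a two-slot board is an assumption, and any other combination should be checked against the motherboard manual.

```python
def recommended_slots(total_slots, modules):
    """Slot choice for the cases described above; other combinations are not covered."""
    if total_slots == 4 and modules == 2:
        return ["A2", "B2"]  # second and fourth slots counting from the CPU socket
    if total_slots == 2 and modules == 2:
        return ["A1", "B1"]  # both slots; the "B1" name is an assumption
    if total_slots == 2 and modules == 1:
        return ["A1"]
    raise ValueError("combination not covered here; check the motherboard manual")

print(recommended_slots(4, 2))
```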



Fourth (And often tied for most important along with an up-to-date motherboard BIOS),

A clean install of the graphics card drivers. Regardless of whether you "already installed the newest drivers" for your graphics card or not, it is OFTEN a good idea to do a CLEAN install of the graphics card drivers. Just installing over the old drivers OR trying to use what Nvidia and AMD consider a clean install is not good enough and does not usually give the same result as using the Display Driver Uninstaller utility. This has a very high success rate and is always worth a shot.


If you have had both Nvidia and AMD cards installed at any point on that operating system then you will want to run the DDU twice. Once for the old card drivers (ie, Nvidia or AMD) and again for the currently installed graphics card drivers (ie, AMD or Nvidia). So if you had an Nvidia card at some point in the past, run it first for Nvidia and then after that is complete, run it again for AMD if you currently have an AMD card installed.



And last, but not least, if you have never done a CLEAN install of Windows, or have upgraded from an older version to Windows 10, or have been through several spring or fall major Windows updates, it might be a very good idea to consider doing a clean install of Windows if none of these other solutions has helped. IF you are using a Windows installation from a previous system and you didn't do a clean install of Windows after building the new system, then it's 99.99% likely that you NEED to do a CLEAN install before trying any other solutions.


How to do a CLEAN installation of Windows 10, the RIGHT way
 

mamasan2000

Distinguished
BANNED
How is the CPU that hot? Shouldn't it be 50-60 C in games? You are not pushing all threads at 100%, right? I would look at CPU cooler situation.

A bios/cmos reset doesn't take minutes. If you short the 2 pins on mobo, it's 10 secs. I do that all the time. My Reset button on my case is connected to those 2 pins. That's how often I use it.
Anything metallic on those 2 pins will do. Most people use a screwdriver.

As above poster said, use DDU.
 
Jun 18, 2021
5
1
15
Stock cooler most likely. Also, PBO likely enabled. Solution is to get a better cooler, for that issue anyhow.

Actually, I am using this one:
https://www.amazon.de/gp/product/B07GKSFC6J/ref=ppx_yo_dt_b_search_asin_title?ie=UTF8&psc=1
Added it to the first post too

How is the CPU that hot? Shouldn't it be 50-60 C in games? You are not pushing all threads at 100%, right? I would look at CPU cooler situation.

As above poster said, use DDU.

I will go through those suggestions, using DDU / updating the BIOS etc.

That aside, no, I am not pushing threads at 100%, but at least from my research those temperatures are not really strange for a 5900X. I'm still going to check the cooler once more and re-apply thermal paste just to be 100% sure. However, once again, even the max spike at 80° should still be far away from emergency shutdown territory, right? The issue also occurs at lower temperatures, e.g. 76°.
 
The 3080 & 3090 are known to cause problems for even some high-quality PSUs that just can't respond quickly enough to the spikes. It's not their wattage but their transient response performance that's important. I must say I don't know if your PSU falls into this category, but that is my first thought.
 
Jun 18, 2021
5
1
15
What I did today:

  • DDU clean install of all drivers
  • installed the AMD chipset drivers (which I didn't have before)
  • Updated my BIOS to the most recent version
  • changed the power connection for the GPU from a single cable with 2x4 pins to 2 separate cables. I only learned today that, apparently, for modern graphics cards you should not feed the GPU from a single port on the PSU but from 2 separate ports instead.
  • While changing the cables I checked all physical connections again
Ran dxdiag, everything fine. Did 20 minutes of FurMark stress testing, everything fine. Temperatures of 70° CPU, 76° GPU.

Booted up FF14 (still in the same location as before), nearly instant shutdown once again.

The 3080 & 3090 are known to cause problems for even some high-quality PSUs that just can't respond quickly enough to the spikes. It's not their wattage but their transient response performance that's important. I must say I don't know if your PSU falls into this category, but that is my first thought.

This actually sounds like the most plausible reason at the moment. I found a lot of posts about this from people having the exact same problem as me, some even mentioning FF14 in their problem description.
I have already ordered a new PSU that should be able to handle this, and it will arrive on Monday. I will replace it then and keep you updated, in case somebody else encounters these problems.
 

Zerk2012

Titan
Ambassador
Actually, I am using this one:
https://www.amazon.de/gp/product/B07GKSFC6J/ref=ppx_yo_dt_b_search_asin_title?ie=UTF8&psc=1
Added it to the first post too



I will go through those suggestions, using DDU / updating the BIOS etc.

That aside, no, I am not pushing threads at 100%, but at least from my research those temperatures are not really strange for a 5900X. I'm still going to check the cooler once more and re-apply thermal paste just to be 100% sure. However, once again, even the max spike at 80° should still be far away from emergency shutdown territory, right? The issue also occurs at lower temperatures, e.g. 76°.
A much shorter answer.

I could only reboot the computer after unplugging the power cable, waiting a few seconds and replugging it.

That is because the protection on the power supply tripped; unplugging it resets it.
The RTX 3080 can have power spikes of almost 500 watts!
A lot of power supplies can't handle these power spikes and shut down.
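A rough back-of-the-envelope check shows why the wattage rating alone does not predict this. The GPU figures (320 W sustained, spikes of almost 500 W) and the 750 W rating come from this thread; the CPU and rest-of-system figures are assumptions chosen only for illustration.

```python
PSU_RATED_W = 750       # from the thread
GPU_SUSTAINED_W = 320   # from the thread
GPU_SPIKE_W = 500       # "almost 500 watts", from the thread
CPU_W = 142             # assumption: approximate package power limit of a 5900X
REST_OF_SYSTEM_W = 50   # assumption: drives, fans, RAM, motherboard

def total_load(gpu_w):
    """Sum the GPU draw with the assumed CPU and rest-of-system draw."""
    return gpu_w + CPU_W + REST_OF_SYSTEM_W

sustained = total_load(GPU_SUSTAINED_W)
spike = total_load(GPU_SPIKE_W)
for label, watts in (("sustained", sustained), ("spike", spike)):
    print(f"{label}: {watts} W = {watts / PSU_RATED_W:.0%} of rated capacity")
```

Under these assumptions both totals stay below the rating, so a simple wattage sum would not flag a problem. What matters is how the PSU and its protection circuits react to very short spikes, and a sum like this cannot capture that.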
 
Jun 18, 2021
5
1
15
That RMx is a pretty darn good PSU, so I'd be really surprised if you still had problems after swapping that in unless the PSU wasn't the problem to begin with.
I mean, as you stated yourself, my previous PSU also was a very good one and STILL it was not able to handle the 3080.

I installed the new PSU today, which was EXACTLY as annoying as I thought it would be, but it seems like it was worth the effort. I did the usual stress tests and then spent 60 minutes in FF14 in the location that had caused the issues. I know this is not a perfect way to figure out if it really worked, but since I was able to reproduce the issue 100% of the time with this game and I am no longer able to reproduce it, I think it's the best I can do at this point.

So yeah, it seems like it really was the power spikes.
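To put a rough number on that reasoning: if the old setup crashed after a few minutes on average, a full hour without a crash would be very unlikely to happen by chance. The sketch below assumes crashes arrive at a constant average rate (an exponential model, which is a simplification) and uses approximate times-to-crash estimated from the earlier attempts in this thread (about 10 minutes, 15 seconds, and roughly 22 minutes).

```python
import math

# Rough times-to-crash in minutes, estimated from the attempts described in this thread.
times_to_crash_min = [10, 0.25, 22]
session_min = 60  # crash-free session on the new PSU

mean_time = sum(times_to_crash_min) / len(times_to_crash_min)
# Exponential model: P(no crash in t minutes) = exp(-t / mean time to crash)
p_no_crash = math.exp(-session_min / mean_time)
print(f"mean time to crash: {mean_time:.2f} min")
print(f"chance of a crash-free hour if nothing had changed: {p_no_crash:.2%}")
```

With only three data points this is not a proper statistical test, but it does suggest that one clean hour is reasonable evidence that something changed.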
Completely insane issue.

Thanks for the help everyone, an updated bios and chipset drivers can never hurt anyway :)
 