[SOLVED] Is my MSI RTX 3080 going bad? GPU mostly fails AFTER running something intensive, and with artifacts. GPU overall acting very strange.

Jan 20, 2021
3
0
10
This GPU is an RMA-issued replacement card, as my first MSI 3080 was faulty. It is an MSI RTX 3080 Gaming Z Trio. It worked fine from the time I got it 3 months ago until last week. Now my screen will sometimes go black, and then the PC restarts. A few times, it has displayed artifacts while crashing. It doesn't do this during games, however; in games it performs flawlessly.

The best way for me to recreate the symptom, however, is with Furmark. The test itself runs fine, and nothing is overheating; the displayed temps are normal. The trouble starts AFTER I close the test: the PC usually hangs and crashes a few seconds later, typically showing artifacts as it does. The crashes can also sometimes be replicated by closing an intensive game, such as Forza Horizon 5, after it has been running for a long time. Only rarely does Furmark crash my PC when the stress test starts up.

Sometimes after a crash, the PC will boot up with horizontal artifacts, and sometimes with regular artifacts. One time, Afterburner wasn't even detecting the GPU anymore, and neither were my games. A restart fixed this, as restarting clears the artifacts too.

Oftentimes after a crash, the GPU will also not be detected during POST, and the VGA light on my motherboard will light up. Oddly, this can be fixed by simply reseating my RAM sticks. Also of note, reseating my GPU will sometimes stop the crashing for about a day. After that, the crashes start happening more frequently again, including after a Furmark test, which is strange as well.
----------------------------------------------------------------------------------------------------
These are the most common entries from my dump files. The most frequent ones are VIDEO_TDR_ERROR (0x116).

On Thu 12/9/2021 2:39:07 AM your computer crashed or a problem was reported
This was probably caused by the following module: nvlddmkm.sys (0xFFFFF807459D6350)
Bugcheck code: 0x116 (0xFFFF868D4A6E5460, 0xFFFFF807459D6350, 0xFFFFFFFFC000009A, 0x4)
Error: VIDEO_TDR_ERROR
file path: C:\WINDOWS\System32\DriverStore\FileRepository\nvmdi.inf_amd64_422d4a8d182d8330\nvlddmkm.sys
product: NVIDIA Windows Kernel Mode Driver, Version 497.09

On Wed 12/8/2021 5:37:19 AM your computer crashed or a problem was reported
This was probably caused by the following module: nvlddmkm.sys (nvlddmkm+0x7fe4f2)
Bugcheck code: 0x133 (0x0, 0x501, 0x500, 0xFFFFF8010F905330)
Error: DPC_WATCHDOG_VIOLATION
file path: C:\WINDOWS\System32\DriverStore\FileRepository\nvmdi.inf_amd64_422d4a8d182d8330\nvlddmkm.sys
product: NVIDIA Windows Kernel Mode Driver, Version 497.09

On Wed 12/8/2021 3:22:14 AM your computer crashed or a problem was reported
This was probably caused by the following module: amdppm.sys (amdppm+0x3a82)
Bugcheck code: 0x133 (0x1, 0x1E00, 0xFFFFF8051C705330, 0x0)
Error: DPC_WATCHDOG_VIOLATION
file path: C:\WINDOWS\system32\drivers\amdppm.sys
product: Microsoft® Windows® Operating System
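
For anyone sorting through a longer list of these entries, here is a rough Python sketch that tallies crashes by module and bugcheck code. It assumes the report text follows the same layout as the entries above (a "following module:" line followed by a "Bugcheck code:" line); the function name `tally_crashes` and the `SAMPLE` string are just for illustration, and other dump tools may format their output differently.

```python
import re
from collections import Counter

# Matches lines like "Bugcheck code: 0x116 (0x..., 0x..., 0x..., 0x4)"
BUGCHECK_RE = re.compile(r"Bugcheck code:\s*(0x[0-9A-Fa-f]+)")
# Matches lines like "...the following module: nvlddmkm.sys (nvlddmkm+0x7fe4f2)"
MODULE_RE = re.compile(r"following module:\s*(\S+)")


def tally_crashes(report: str) -> Counter:
    """Count (module, bugcheck code) pairs in a crash report text.

    Assumes each entry lists the module line before the bugcheck line,
    as in the entries quoted in this post.
    """
    counts = Counter()
    module = None
    for line in report.splitlines():
        module_match = MODULE_RE.search(line)
        if module_match:
            module = module_match.group(1)
            continue
        bugcheck_match = BUGCHECK_RE.search(line)
        if bugcheck_match:
            counts[(module, bugcheck_match.group(1).lower())] += 1
            module = None  # reset so a stray bugcheck line is not misattributed
    return counts


# Module and bugcheck lines copied from the three entries above.
SAMPLE = """\
This was probably caused by the following module: nvlddmkm.sys (0xFFFFF807459D6350)
Bugcheck code: 0x116 (0xFFFF868D4A6E5460, 0xFFFFF807459D6350, 0xFFFFFFFFC000009A, 0x4)
This was probably caused by the following module: nvlddmkm.sys (nvlddmkm+0x7fe4f2)
Bugcheck code: 0x133 (0x0, 0x501, 0x500, 0xFFFFF8010F905330)
This was probably caused by the following module: amdppm.sys (amdppm+0x3a82)
Bugcheck code: 0x133 (0x1, 0x1E00, 0xFFFFF8051C705330, 0x0)
"""

if __name__ == "__main__":
    for (module, code), n in tally_crashes(SAMPLE).most_common():
        print(f"{module:15} {code:6} x{n}")
```

With only three entries the tally is not very informative, but pasting in the full report should show whether the crashes cluster on the NVIDIA driver or are spread across unrelated modules.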
-----------------------------------------------------------------------------------------------------------
These are the things I've tried to fix this issue.

Checked temps of all parts; temps are fine. Ran DDU on the GPU driver. Reinstalled Windows 10, and even switched to Windows 11; same issue. Made sure the GPU and CPU are at stock and not overclocked; same issue. Changed the HDMI cable. My motherboard BIOS is the most recent one.
Sadly, I do not have a spare GPU to test with.

My parts - https://pcpartpicker.com/list/MBVCGq

Artifacting sometimes during crash - https://cdn.discordapp.com/attachments/878543524163887147/918055833911308309/20211208_032150.jpg

Artifacts after crash - https://cdn.discordapp.com/attachments/878543524163887147/918084413672288296/20211208_051753.jpg

Afterburner not detecting GPU one time - https://cdn.discordapp.com/attachments/878543524163887147/918086136440688650/20211208_052617.jpg

I've considered that it may be the PCIe slot on the MB, but I doubt it. Could it be a problem with the GPU's voltage control, maybe? Can anyone help?
 
Jan 20, 2021
3
0
10
Hey there,

Do you have the most up-to-date BIOS? I'd start there, as seemingly a lot of issues with RTX cards and MSI mobos can be resolved with an updated BIOS.

Yes, it is the most recent BIOS for my B550 Tomahawk. I was thinking about rolling back to the version before the one I have now, though.

However, a couple of minutes ago, the crash happened again and the VGA light on my MB came on again. The PC would not POST, and this time a RAM reseat did not fix it; I had to reseat the whole GPU. Does the VGA light always indicate an issue with the GPU? Could the PCIe slot be bad? The card still lit up and was getting power, though, so I am sort of lost. I am betting it's the GPU and that it is going to require an RMA.
 
Dec 30, 2021
2
0
10
I have the same symptoms you describe with a Dell RTX 3060 Ti. It is your GPU. Here is why: I have two Dell 3940 PCs, one with an RTX 2060 Super and the other with the RTX 3060 Ti. When I swap cards, the problem travels with the RTX 3060 Ti. Dell is sending a replacement for the RTX 3060 Ti.
 
