GPU Crashing, Causing Freezes and Restarts

Mar 10, 2018
Over the last couple days, I have been experiencing GPU crashes. I have a GTX 680 and it’s about 5-6 years old now.

I first noticed it while gaming, specifically in a graphics-intensive environment. With no warning there were a few black screens in quick succession, followed by a total system freeze, and finally a reboot. Upon checking Event Viewer I found about 100 errors, all along the lines of graphics shader errors.

I ignored it for a while and just avoided that particular place in the game, and everything was fine until this morning. It happened again in a different, moderately graphics-intensive place. This time I noticed localized parts of the game starting to blink or become distorted (artifacting?). Then more black screen flickering.

I restarted again and noticed artifacting on the Windows logo screen, which tells me that this is most certainly a GPU problem.

I thoroughly dusted out my card, took the plate and cover off, and made sure the heat sink was clean. The same problem kept happening, and now it's happening even when I launch some programs (Battle.net, Discord, etc). It has happened when trying to watch a movie in VLC and once while going through Event Viewer. I've monitored temps while this is happening and they are well within normal ranges.
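For anyone who wants a record of temps at the moment of a crash rather than eyeballing a monitoring tool, here is a minimal sketch. It assumes an NVIDIA driver with `nvidia-smi` on the PATH and that the card reports these fields (older cards may return "[N/A]" for some of them). The function names and the 90°C limit are just illustrative choices, not anything official.

```python
import subprocess
import time

# Fields requested from nvidia-smi; the order here is the order in its CSV output.
QUERY = "temperature.gpu,fan.speed,utilization.gpu"


def parse_sample(line):
    """Turn one CSV line such as '72, 45, 98' into a dict keyed by field name.

    Values the card does not report (e.g. '[N/A]') become None.
    """
    names = QUERY.split(",")
    fields = [f.strip() for f in line.split(",")]
    return {n: (int(v) if v.isdigit() else None) for n, v in zip(names, fields)}


def read_gpus():
    """Run nvidia-smi once and return one parsed sample per GPU."""
    out = subprocess.run(
        ["nvidia-smi", f"--query-gpu={QUERY}", "--format=csv,noheader,nounits"],
        capture_output=True, text=True, check=True,
    ).stdout
    return [parse_sample(l) for l in out.splitlines() if l.strip()]


def too_hot(sample, limit_c=90):
    """True if the reported temperature is at or above limit_c (assumed limit)."""
    temp = sample["temperature.gpu"]
    return temp is not None and temp >= limit_c


def log_forever(interval_s=2):
    """Print a timestamped sample every interval_s seconds until interrupted."""
    while True:
        for sample in read_gpus():
            flag = " HOT" if too_hot(sample) else ""
            print(time.strftime("%H:%M:%S"), sample, flag, flush=True)
        time.sleep(interval_s)
```

Call `log_forever()` and redirect the output to a file before launching the game; the last lines written before the freeze show what the card was reporting at that point.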

Ran one pass of Memtest with no errors.

Now this card has generally been running hot for the past 2 years (60°C idle, up to 90°C under load), which is just laziness on my part. I think its life has been spent.

I really think I’m out of options and just need a new card but wanted to be sure if there was anything else I could check.

Edit: Bump again. Now the system freezes on the Windows logo boot screen for a few seconds, the GPU fan kicks on to 100%, and the logon screen displays in a low resolution (no drivers?). I haven't noticed a single crash or artifact when booting in Safe Mode. Do I need a new card or what?

Thank you.
 
Sounds like a heat problem.

It is an old card. Try putting fresh thermal grease on the GPU. Use some that is non-conductive, like MX-4. It could very well be that the current thermal grease has dried out after all those years of use.

Clean the GPU surface with some isopropyl alcohol, 70 to 90%. Do it with a coffee filter if you can get some, since these will neither scratch the surface nor leave any fibers behind. Just put a tiny amount of thermal grease in the middle. The heatsink will spread it when you tighten it back down. It is very important that you use non-conductive thermal grease.
 


Temps appear to be fine. Before all this happened I was running 3 monitors: one at 120Hz and the other two at 60Hz. Apparently if you have dissimilar refresh rates across multiple monitors the temps will be significantly higher. Since this happened I've switched to only one monitor at 60Hz. Temps went down drastically and idle around 40°C now. Any other ideas?

Also still wondering: If it really is a hardware problem and the transistors/capacitors have been irreparably damaged by long term heat exposure, why does booting in Safe Mode always work fine?
 
Not really, sorry, guess it is time for a new card. Probably a better idea to buy a used 970 on eBay, the way the prices are nowadays. Or maybe even a 960. Buying a new card can be painful with the current prices.

I was lucky I got mine before prices went through the roof. Now I could sell it for more than I paid for it. Though I won't.

The 970 is almost as fast as the 1060, as you can see on this chart. And you can pick one up for a lot less.

https://www.anandtech.com/bench/product/1743?vs=1771
 
In case anyone runs across this post in 2 years, I just want to say that getting a new card fixed everything. These were all classic signs of a GPU going bad.

Things I've learned:

1) This was NOT a heat problem.

2) Dusting out a PC/card ROUTINELY is extremely underrated. It can add years to the lifetime of your card. HOWEVER, if you wait 5 years and dust it out for the first time, it can cause your card to start artifacting overnight. I've noticed this with 2 cards now. My GUESS is that, over a long period of time, the dust stagnates around a circuit, gets burned onto it, and begins corroding the circuit. At the same time, the dust buildup might actually become its own circuit (dead skin cells are certainly conductive). Once you blow all that out of the way, you're left with only the corroded circuit, which can make problems apparent. Just my theory.

3) The failing card worked in safe mode and without drivers because there is little to no stress on the GPU under these circumstances.