Weird Graphics Problem (TDR)

SpetsnazBeaver

Honorable
Mar 25, 2015
142
0
10,680
Hi guys. Two years ago I built a new PC from scratch with a GTX 660, an i7-4790K and 16 GB of RAM. Ever since, I have been plagued by a TDR issue whenever I play any game: eventually the game crashes, I get a "driver failed and restarted" message, the resolution drops from 1080p, and the mouse cursor starts moving in laggy jolts or even disappears. I have to disable and re-enable the GPU in Device Manager to get back to smooth desktop operation. It got so bad that I gave up on fixing it, but here I am again, trying to fix this once and for all.

Please help. I have tried wiping the drivers clean (including deleting NVIDIA files and registry entries), changing the power settings in the NVIDIA Control Panel and in Windows power options, and adding a TdrDelay registry key with a value of 8. None of that helped.
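
For anyone retracing the TdrDelay step, here is a minimal Python sketch (not from the thread) for reading the setting back. It assumes the location Microsoft documents for it, `HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\GraphicsDrivers`, where `TdrDelay` is a `REG_DWORD` measured in seconds and Windows falls back to 2 seconds when it is absent. The function names `read_tdr_delay` and `effective_tdr_delay` are made up for this example.

```python
import sys

# Documented location of the TDR settings (under HKEY_LOCAL_MACHINE).
GRAPHICS_DRIVERS_KEY = r"SYSTEM\CurrentControlSet\Control\GraphicsDrivers"
DEFAULT_TDR_DELAY_SECONDS = 2  # Windows default when TdrDelay is not set


def read_tdr_delay():
    """Return TdrDelay in seconds, or None if it is unset, is not a
    REG_DWORD, or the script is not running on Windows."""
    if sys.platform != "win32":
        return None
    import winreg  # standard library, Windows only

    try:
        with winreg.OpenKey(winreg.HKEY_LOCAL_MACHINE, GRAPHICS_DRIVERS_KEY) as key:
            value, value_type = winreg.QueryValueEx(key, "TdrDelay")
    except FileNotFoundError:
        return None  # key or value does not exist
    if value_type != winreg.REG_DWORD:
        return None  # wrong type, so do not treat it as a valid setting
    return value


def effective_tdr_delay(raw_value):
    """Timeout Windows would be expected to use for a given registry reading."""
    return DEFAULT_TDR_DELAY_SECONDS if raw_value is None else raw_value


if __name__ == "__main__":
    print("Effective TdrDelay (s):", effective_tdr_delay(read_tdr_delay()))
```

If this reports 2 after the change, the setting probably did not land as a DWORD value under that key. Changes to the TDR settings only take effect after a reboot.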
 

Off the top of my head, I would say TDR errors can come from bad power being supplied to the GPU, bad drivers, or a failing GPU.

Easiest to test should be drivers. Go back a year or more, to before you started getting TDR errors, and use a driver release from that time.

Next, if possible, swap in a known good power supply.

Finally, the card may be aging badly and in need of RMA work or outright replacement.
 

SpetsnazBeaver

I don't know how to verify this, but I doubt the PSU is the issue. I'm running an 850 W EVGA B2 Bronze power supply.

There was no time when I didn't have TDR errors. I built the PC 2 years ago and the issue came up immediately, so there is no known-stable driver to go back to.

There was one issue that came up when I was assembling the PC back then. When I installed the CPU cooler, it peeled off the material around the mounting holes on the MOBO (Gigabyte Gaming 5), which caused the metal mount to short the traces going to the RAM slots and made me lose a channel. I quickly fixed this by putting in a rubber washer to stop the shorting, and both sticks work now.
 
The GTX 660 was already old 2 years ago, so going back in time driver-wise for that card would mean looking at drivers from, say, 2013 or so. The 7xx series was released in the 2013 time frame, so by 2014 the 6xx series cards would have been nearing the end of any significant driver improvements.

The issue with power supplies is not always the amount of power, but the quality of the power being delivered to your GPU. Some power supplies cannot hold a stable output when the 12 V rail, the one your GPU is fed from, is hit with bursts of high amperage draw. The power draw of a video card goes up and down depending on the scene demands of the game you are playing.
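
To put rough numbers on that, here is a small Python sketch (not from the thread). It assumes the ATX limit of ±5% on the +12 V rail (11.4 V to 12.6 V) and NVIDIA's published 140 W TDP for the GTX 660. The voltage readings are invented sample data, and the helper names are hypothetical. Software sensor readings are imprecise and too slow to catch short transients, so treat a result like this as a hint at most.

```python
NOMINAL_12V = 12.0
RAIL_MIN_VOLTS = 11.4  # 12 V minus 5% (ATX tolerance, assumed here)
RAIL_MAX_VOLTS = 12.6  # 12 V plus 5%


def rail_report(readings_volts):
    """Summarise a list of +12 V readings and flag any outside the tolerance."""
    out_of_spec = [
        v for v in readings_volts if not (RAIL_MIN_VOLTS <= v <= RAIL_MAX_VOLTS)
    ]
    return {
        "min": min(readings_volts),
        "max": max(readings_volts),
        "out_of_spec": out_of_spec,
    }


def amps_at_12v(watts):
    """Current drawn from the 12 V rail for a given power draw (I = P / V)."""
    return watts / NOMINAL_12V


if __name__ == "__main__":
    # Invented sample readings logged while a game is running.
    sample = [12.1, 11.9, 11.2, 12.0]
    print(rail_report(sample))
    # GTX 660 at its 140 W TDP; short spikes can go above this.
    print(round(amps_at_12v(140), 1), "A")
```

A dip below 11.4 V under load, like the 11.2 V sample above, is the kind of thing that points at the power supply rather than the card.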

Here is a great, in-depth thread where a lot of potential issues and solutions are offered for TDR errors:

https://answers.microsoft.com/en-us/windows/forum/windows_8-update/video-tdr-failure/bb314f70-827d-4b07-a172-1d76de1b51e6
 
The damage to the motherboard could be the cause. A great place to start with that would be extensive memory tests.

Run the tests overnight or longer, unless you get errors, at which point there's no need to continue testing.

Since each test is slightly different, you can sometimes catch errors with one program that another misses. It doesn't hurt to run more than one program.

MemTest86

Memtest86+

Windows Memory Diagnostic
 

SpetsnazBeaver



Had another TDR crash last night and went to bed. After waking up today I disabled and re-enabled the 660 and ran FurMark stably at 4K windowed with 8x MSAA and dynamic camera (it doesn't let me run fullscreen). It averaged 22 fps with temps maxing out at 62°C. With dynamic background and Xtreme burn-in at 4K windowed I got the same 62°C at an average of 16 fps. No artefacting, just some stutter, but I assume that's 4K stressing the 660 out.

UPDATE: Ran System File Checker with no issues coming up
 

Neur0nauT

Admirable
TDR errors could point to several of the things mentioned here. On the occasions that I experienced them, it was overheating VRMs on the graphics card, which eventually killed the card. So if you are overclocking, you should remove any overclocks. Even try undervolting the GPU VRAM and lowering the clocks to see if it settles down. If it does, then there's a good chance that the graphics card is on its way out.
 

SpetsnazBeaver



Haven't tinkered with any of my components. My GPU temps are at most 62°C.
 

Neur0nauT

Admirable


Your GPU temps may look stable, but the VRMs (voltage regulator modules), which are not monitored, could be overheating. If they die, your GPU dies with them.

 
Solution