Dreaded TDR Error Crashing PC Worse Than A Budget Airline

BallisticChickn

Honorable
Aug 23, 2013
Oh goody, another TDR thread, right? Yeah. So I've tried to troubleshoot this and follow as much advice as I could find, but I'm having trouble coming up with anything conclusive. I'm not very experienced at troubleshooting hardware other than a hard drive, so I apologize in advance for any foolish oversights.

Setup: Fully stable machine (specs below) for about a year. I downloaded Saints Row IV, and after about ten minutes of play the system hard locked with audio stutter and my poor graphics card suddenly shifted to 100% fan speed. I powered off at the switch and restarted. I updated the Nvidia drivers and tried again: same deal. Taking the hint, I uninstalled the game and tried playing SR 3, which has always been stable, and THAT caused the same crash. The crash ONLY happens with games and seems to happen only with Steam games. The only non-Steam game I really have is ARMA 2. On full settings it hasn't crashed at all, but I'm not sure if that's a meaningful comparison.

System Specs: Intel Core i5-3570K @ 3.40 GHz - Win 7 64-bit - 8 GB RAM (2 x 4 GB G.Skill) - EVGA Nvidia GTX 660 - PSU: Rosewill Hive 750 - MB: ASUS P8Z77-V LK.

Log Info from WhoCrashed:

crash dump file: C:\Windows\Minidump\082213-26145-01.dmp
This was probably caused by the following module: nvlddmkm.sys (nvlddmkm+0x8F06BC)
Bugcheck code: 0x116 (0xFFFFFA800AA18010, 0xFFFFF8800530B6BC, 0xFFFFFFFFC000009A, 0x4)
Error: VIDEO_TDR_ERROR
file path: C:\Windows\system32\drivers\nvlddmkm.sys
product: NVIDIA Windows Kernel Mode Driver, Version 320.49
company: NVIDIA Corporation
description: NVIDIA Windows Kernel Mode Driver, Version 320.49

crash dump file: C:\Windows\Minidump\082213-30139-01.dmp
This was probably caused by the following module: nvlddmkm.sys (nvlddmkm+0x8E9ED0)
Bugcheck code: 0x116 (0xFFFFFA8006793010, 0xFFFFF880061C6ED0, 0xFFFFFFFFC000009A, 0x4)
Error: VIDEO_TDR_ERROR
file path: C:\Windows\system32\drivers\nvlddmkm.sys
product: NVIDIA Windows Kernel Mode Driver, Version 320.49
company: NVIDIA Corporation
description: NVIDIA Windows Kernel Mode Driver, Version 320.49

They all basically look the same- obviously I can furnish more info if needed.
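
For anyone comparing their own dumps against these, here is a minimal Python sketch that pulls the numbers out of a WhoCrashed report. It assumes the report lines look exactly like the ones above; `parse_report` and the dictionary keys are hypothetical names chosen for illustration. Microsoft's documentation for bug check 0x116 (which it calls VIDEO_TDR_FAILURE) describes the second parameter as a pointer into the responsible driver module and the third as the error code of the last failed operation, so the sketch subtracts the reported module offset from the second parameter and masks the third down to a 32-bit NTSTATUS. The value 0xC000009A is documented as STATUS_INSUFFICIENT_RESOURCES.

```python
import re

# Patterns match the two report lines shown in the post; other WhoCrashed
# versions may format them differently (assumption).
BUGCHECK_RE = re.compile(r"Bugcheck code:\s*(0x[0-9A-Fa-f]+)\s*\(([^)]*)\)")
MODULE_RE = re.compile(r"module:\s*(\S+)\s*\((\w+)\+(0x[0-9A-Fa-f]+)\)")


def parse_report(text):
    """Extract the bug check code, module offset, and derived values from one report."""
    code_match = BUGCHECK_RE.search(text)
    module_match = MODULE_RE.search(text)
    if code_match is None or module_match is None:
        raise ValueError("report does not contain the expected lines")
    params = [int(p.strip(), 16) for p in code_match.group(2).split(",")]
    offset = int(module_match.group(3), 16)
    return {
        "bugcheck": int(code_match.group(1), 16),
        "module": module_match.group(1),
        "offset": offset,
        # Parameter 2: address inside the driver that was flagged.
        "fault_address": params[1],
        # Subtracting the offset should give the driver's load address.
        "module_base": params[1] - offset,
        # Parameter 3 is sign-extended to 64 bits; keep the low 32 bits.
        "ntstatus": params[2] & 0xFFFFFFFF,
    }


SAMPLE = """\
This was probably caused by the following module: nvlddmkm.sys (nvlddmkm+0x8F06BC)
Bugcheck code: 0x116 (0xFFFFFA800AA18010, 0xFFFFF8800530B6BC, 0xFFFFFFFFC000009A, 0x4)
"""

info = parse_report(SAMPLE)
print(hex(info["module_base"]))  # 0xfffff88004a1b000
print(hex(info["ntstatus"]))     # 0xc000009a
```

The computed base address comes out page-aligned, which is consistent with the second parameter pointing inside nvlddmkm.sys. That confirms which driver was running when the timeout hit. It does not by itself say whether the driver or the card is at fault.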

What I've Already Done:

1) System restore back to a happier time = Didn't work. Not so happy.

2) GPU stress test (OCCT) (run 2x because why not?) = No Errors Found

3) Hard disk check = No Errors Found

4) Driver Clean & Change - Here I uninstalled the drivers, cleaned the registry with Driver Sweeper, installed the new/old driver and tested the games again. I should note that I did NOT wipe the chipset drivers during this process, since I was under the impression that would be a "bad thing." Drivers tried: 305.27, 310.90, 320.18, 320.49, 326.80. All of them failed miserably to fix things.

5) Windows mem test = No Errors Found

6) Windows file check (sfc /scannow) = No Errors Found

7) Anti-Virus (super long scan) = No Errors Found

8) CPU Linpack Stress Test (OCCT) = No Errors Found

9) Steam uninstall/reinstall = Didn't Fix Anything.

10) DXDiag = No Errors Found, Current Version Installed

11) Update Windows = Didn't Fix Anything.

12) Power Settings Set to High Performance = Didn't Fix Anything.

13) RAM/CPU Stress Test (Prime 95) = No Errors Found.

14) Physical Re-Seating of the Card (Spoke soothingly to it) = Didn't Fix Anything

15) Video Memory Test (Video Memory stress Test v1.7) = No Errors Found

16) Power Supply Voltage Check = No Errors Found

17) Furmark Benchmark test:
2734 points - 45 FPS avg, 44 FPS min, 49 FPS max - Max GPU temp 87 C - Resting temp 40 C - Resting fan 30% - Resolution 1280 x 720.

18) Furmark Burn In: (I've searched all over my computer & can't find the screenshot BUT here's the info I wrote down... sorry):
The temperature started to level off at 1:50 at around 90 C, then slowly rose to 93 C by the 8-minute mark. I stopped the test there, but I think temps would probably have kept rising. This is the only thing that seems out of order to me.
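
To put a number on "slowly rose", here is a rough Python sketch of the arithmetic using only the two readings written down above. The function names are hypothetical, and the straight-line extrapolation is an assumption: real GPU temperatures tend to flatten out under a steady load, so treat the projected time as a crude estimate. The 97 C limit in the last line is a placeholder to show the call, not a figure taken from this thread.

```python
def rise_rate_c_per_min(t0_s, temp0_c, t1_s, temp1_c):
    """Average temperature rise between two readings, in degrees C per minute."""
    if t1_s <= t0_s:
        raise ValueError("second reading must come after the first")
    return (temp1_c - temp0_c) / (t1_s - t0_s) * 60.0


def seconds_until(limit_c, temp_now_c, rate_c_per_min):
    """Linear extrapolation: seconds until limit_c, or None if not rising."""
    if rate_c_per_min <= 0:
        return None
    return max(0.0, (limit_c - temp_now_c) / rate_c_per_min * 60.0)


# Readings from the burn-in notes: about 90 C at 1:50 (110 s), 93 C at 8:00 (480 s).
rate = rise_rate_c_per_min(110, 90.0, 480, 93.0)
print(f"{rate:.2f} C/min")  # 0.49 C/min

# Placeholder limit (assumption); substitute the figure for your own card.
eta = seconds_until(97.0, 93.0, rate)
```

So the card was still gaining roughly half a degree per minute when the test was stopped.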

Concluding thoughts:
I haven't tried the registry edit to basically kill off the TDR process, and I haven't nuked Windows from orbit while sticking pins into a Steve Ballmer doll (available at the MSFT HR office) at midnight before reinstalling the OS, but those are the only other things I can think of. There might be other drivers to try, I suppose, but after trying 5 of them I gave up.
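
For reference, the registry edit in question is presumably the pair of values Microsoft documents under the GraphicsDrivers key: `TdrLevel` (0 turns timeout detection off, 3 is the default recover-on-timeout behavior) and `TdrDelay` (seconds before a timeout is declared, default 2). The Python sketch below only renders the text of a `.reg` file and never touches the registry. `render_tdr_reg` is a hypothetical helper name, and the values passed in are examples rather than recommendations.

```python
# Key path as given in Microsoft's TDR registry key documentation.
TDR_KEY = r"HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\GraphicsDrivers"


def render_tdr_reg(tdr_level=None, tdr_delay_s=None):
    """Return .reg file text setting TdrLevel and/or TdrDelay as DWORD values."""
    values = {"TdrLevel": tdr_level, "TdrDelay": tdr_delay_s}
    lines = ["Windows Registry Editor Version 5.00", "", f"[{TDR_KEY}]"]
    for name, value in values.items():
        if value is None:
            continue  # leave this value out of the file entirely
        if not 0 <= value <= 0xFFFFFFFF:
            raise ValueError(f"{name} must fit in a DWORD")
        lines.append(f'"{name}"=dword:{value:08x}')
    return "\r\n".join(lines) + "\r\n"


# Example: render a file that would switch timeout detection off.
print(render_tdr_reg(tdr_level=0))
```

Turning detection off or lengthening the delay changes how Windows reacts to a hung GPU, not whether the GPU hangs.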

I lack experience with GPU hardware, so I'm not really sure if those temp results are way out of whack or not really worrisome. I'm pretty much totally out of ideas, though, so I would be extremely grateful for any help you can spare.

Look, if we don't stop this error I won't be able to play games... that might even force me to... go outside. Can you imagine a worse fate? The sun is out there. Water just falls from the sky. And all that fresh air can't possibly REALLY be healthy for you right? 😉 So again, any help you can give me would be awesome.

 
Solution
You're quite correct, DX can't be uninstalled, only updated, although I think there is a repair option but I can't be sure. Anyone else out there care to comment?
Either you didn't say or I missed it, but try uninstalling SR4, restart, then run CCleaner (include the registry sweep option) and manually delete any folders related to SR4 before a full 'power down and wait a bit' restart.
The only other thing I can suggest is that you contact Steam technical support. Like you, I find it very strange that the problem seems to be limited to Steam games, so perhaps the issue is at their end rather than yours.


Looking over the list, I see I forgot to mention running Memtest86. To be fair, I only let it run for 18 hours, and I see some forums where people claim you need a full 24 to be sure of anything. I figured that in combination with everything else, 18 was enough, unless I'm totally wrong on that of course. But it didn't find anything either. Basically hoping EVGA can figure this out.

Oh yeah, and my other computer's MOBO just died (it's a really old one used by my fiancee to play casual games/surf online, so it's not a shock... But STILL). So is there an application you can run to get rid of gremlins? Because apparently I need one.
 


Yup, tried that too without error. I finally managed to borrow a much much much lower quality, basically obsolete video card, to try one of the games with, and it didn't encounter any errors.

EVGA support is RMA'ing the card so I'll update when that process is through.
 
Ok, here's the final update (I hope).

1. EVGA replaced the card, there was definitely something wrong with it after all.
2. Just to be sure the gremlins were all killed (and because I needed to upgrade my HD and had the case opened up anyway...) I also reinstalled Windows and everything else onto a new hard drive.

So far, I have the card reinstalled and haven't had anything bad or glitchy happen and certainly haven't had any more TDRs. So I think (and hope) we can declare victory over this problem. No longer will I be stuck "doing things" outside, I can finally return to the natural and healthy indoor lifestyle that represents the pinnacle of millions of years of evolution and thousands of years of human invention, ingenuity and sacrifice. Or something like that anyway.

I'd really like to thank both coozie7 and ingtar33 for taking the time to help troubleshoot this monster, which was in reality *my* problem. Seriously, this was beginning to drive me a little crazy, and your input was really helpful in both solving the problem and keeping me sane (ish). So big thanks to both of you.


 
congratz! glad the problem is licked. sorry we beat around the bush a bit, but it's always better to be sure of the problem before you start to buy new stuff or return the stuff you got. glad returning it turned out to fix the issue. and i'm glad EVGA was accommodating 😀
 
Ok, I've no idea why the troubleshooting tip waaaay above is marked as "best solution"... It was a good problem-solving suggestion, but as evidenced by the MASSIVE amount of work everyone did after that point, it wasn't the Best Solution. The actual answer is that this was a graphics card going bad and causing subtle TDRs everywhere. So if you're reading this with a similar problem: no, it had nothing to do with Steam (seriously, WHO flagged that as best solution?!). Follow the checklists for checking your graphics card/memory/etc. until you find out where the problem is.

It's a pain. It's a headache. It's a nightmare. But that's just the nature of the beast: it lacks any neat, single silver-bullet answer that will kill it for you.