Display driver nvlddmkm stopped responding and has successfully recovered.

Brucey_Bonus

Honorable
Aug 16, 2012
84
0
10,640
Hello everyone!

I have been getting these driver crashes every so often when I put my graphics card under heavy load.
The drivers tend to crash when the card is at 98% - 99% usage.

System specifications:
CPU: i5 3570K stock clocks
RAM: Corsair Vengeance LP [8GB] 2×4GB 1600MHz running at 1333MHz
Motherboard: ASUS P8Z77-V PRO
Graphics card: eVGA GTX 670 FTW stock clocks
PSU: Corsair HX650 650W Power supply

I have used the latest non-beta 314.22 WHQL drivers from nVIDIA and the crashing still occurs.
I reverted to 310.70 and the crashing still occurs.
I had Windows Update detect the most recent drivers I could install for the graphics card. It only found the 311.06 drivers, which still crashed.

During a crash with the 310.70 drivers, my display turns black and Windows reports: "display driver nvidia windows kernel mode driver, version 310.70 stopped responding and has successfully recovered."
During a crash with the 311.06 or 314.22 drivers, my display shows artifacts and Windows reports the same message with the matching version number: "display driver nvidia windows kernel mode driver, version 311.06 stopped responding and has successfully recovered." or "display driver nvidia windows kernel mode driver, version 314.22 stopped responding and has successfully recovered."

For each driver reinstallation I used the Manual > Clean installation options to minimise any driver conflicts.

I have uninstalled any software I think may be conflicting with the drivers, including Lucid Virtu MVP 1.0. I have not uninstalled eVGA Precision X, as it includes the fan control features I need to prevent my graphics card from overheating.

The graphics card is at around 55°C when the crashes happen. As far as I know, that is well below its maximum operating temperature, so it is not overheating.

The CPU is at around 30°C when the crashes happen, and never in my system's life has the CPU temperature exceeded 40°C.

If anyone knows of a fix for this problem, any help is much appreciated!
 

Brucey_Bonus

I ran the eVGA OC SCANNER X Furry eVGA test and got:
1 minute 14 seconds
23 artifacts
59°C
Load 97%
Power 98.4% TDP
FAN 68%


The drivers crashed, resulting in artifacting, and I got the error:
NVIDIA OpenGL Driver
The NVIDIA OpenGL driver lost connection with the display
driver due to exceeding the Windows Time-Out limit and is unable to
continue.
The application must close.

Error code: 7
Would you like to visit
http://nvidia.custhelp.com/cgi-bin/nvidia.cfg/php/enduser/std_adp.php?p_faqid=3007 for help?

I'm obviously going to follow this link for help, but could an OpenGL issue be causing these constant driver crashes?
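
The "Windows Time-Out limit" in that message refers to Timeout Detection and Recovery (TDR), which resets the driver if the GPU does not respond in time. As a rough sketch, the timeout currently in effect can be checked by reading the documented `TdrDelay` registry value (default 2 seconds when unset). The function name `read_tdr_delay` is just an illustrative helper, and the snippet only reads the value, it does not change anything.

```python
import sys

# Documented registry location for the TDR settings (under HKEY_LOCAL_MACHINE).
TDR_KEY = r"SYSTEM\CurrentControlSet\Control\GraphicsDrivers"
DEFAULT_TDR_DELAY_SECONDS = 2  # Windows default when TdrDelay is not set


def read_tdr_delay():
    """Return the TdrDelay value in seconds, or the default if it is unset.

    Hypothetical helper: on non-Windows systems there is no registry,
    so the default is returned.
    """
    if sys.platform != "win32":
        return DEFAULT_TDR_DELAY_SECONDS
    import winreg  # standard library, Windows only

    try:
        with winreg.OpenKey(winreg.HKEY_LOCAL_MACHINE, TDR_KEY) as key:
            value, _value_type = winreg.QueryValueEx(key, "TdrDelay")
            return int(value)
    except FileNotFoundError:
        # Key or value missing: Windows falls back to the default.
        return DEFAULT_TDR_DELAY_SECONDS


if __name__ == "__main__":
    print(f"TDR timeout: {read_tdr_delay()} second(s)")
```

If the card shows artifacts before the reset, as it does here, a longer timeout is unlikely to be a real fix. The value is mainly useful for ruling out a non-default setting.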
 

Brucey_Bonus

I installed the 306.97 drivers and even downgraded PhysX from 9.12.1031 to 9.12.0604.
I ran the eVGA OC SCANNER X FURRY eVGA test again.
Around 2 minutes in, I got a BSOD.

WhoCrashed analysis:
On Sun 05/05/2013 11:24:17 GMT your computer crashed
crash dump file: C:\Windows\Minidump\050513-30388-01.dmp
This was probably caused by the following module: nvlddmkm.sys (nvlddmkm+0x14BA88)
Bugcheck code: 0x116 (0xFFFFFA80072F2010, 0xFFFFF8800F3E3A88, 0xFFFFFFFFC00000B5, 0xA)
Error: VIDEO_TDR_ERROR
file path: C:\Windows\system32\drivers\nvlddmkm.sys
product: NVIDIA Windows Kernel Mode Driver, Version 306.97
company: NVIDIA Corporation
description: NVIDIA Windows Kernel Mode Driver, Version 306.97
Bug check description: This indicates that an attempt to reset the display driver and recover from a timeout failed.
A third party driver was identified as the probable root cause of this system error. It is suggested you look for an update for the following driver: nvlddmkm.sys (NVIDIA Windows Kernel Mode Driver, Version 306.97 , NVIDIA Corporation).
Google query: NVIDIA Corporation VIDEO_TDR_ERROR
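
The bug check parameters above can be cross-checked with a little arithmetic. This is only a sketch: it assumes Microsoft's documented meaning of the 0x116 parameters (parameter 2 points into the responsible driver module, parameter 3 is the status code of the last failed operation), and `parse_bugcheck` is a hypothetical helper, not part of WhoCrashed.

```python
import re

# Copied from the WhoCrashed report above.
BUGCHECK_LINE = ("Bugcheck code: 0x116 (0xFFFFFA80072F2010, 0xFFFFF8800F3E3A88, "
                 "0xFFFFFFFFC00000B5, 0xA)")
MODULE_OFFSET = 0x14BA88  # from "nvlddmkm+0x14BA88" in the report


def parse_bugcheck(line):
    """Return (code, [p1, p2, p3, p4]) as integers from a WhoCrashed line."""
    values = [int(h, 16) for h in re.findall(r"0x[0-9A-Fa-f]+", line)]
    return values[0], values[1:]


code, params = parse_bugcheck(BUGCHECK_LINE)

# Parameter 2 minus the reported offset gives the implied load address
# of the module that the faulting address falls in.
module_base = params[1] - MODULE_OFFSET

# Parameter 3 is a sign-extended 32-bit status code; keep the low 32 bits.
status = params[2] & 0xFFFFFFFF

print(f"bug check   : {code:#x}")
print(f"module base : {module_base:#018x}")
print(f"page aligned: {module_base % 0x1000 == 0}")
print(f"status      : {status:#010x}")
```

The implied base address comes out page-aligned, which is consistent with parameter 2 lying inside nvlddmkm.sys at the offset WhoCrashed reports. The status value 0xC00000B5 should correspond to STATUS_IO_TIMEOUT, in line with the timeout description in the report.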

Windows Reliability Monitor analysis:
Source
Windows

Summary
Video hardware error

Date
05/05/2013 12:25

Status
Not reported

Description
A problem with your video hardware caused Windows to stop working correctly.

Problem signature
Problem Event Name: LiveKernelEvent
OS Version: 6.1.7601.2.1.0.768.3
Locale ID: 2057

Files that help describe the problem
WD-20130505-1224.dmp
sysdata.xml
WERInternalMetadata.xml

Extra information about the problem
BCCode: 117
BCP1: FFFFFA80072F2010
BCP2: FFFFF8800F3E3A88
BCP3: 0000000000000000
BCP4: 0000000000000000
OS Version: 6_1_7601
Service Pack: 1_0
Product: 768_1

I'm out of ideas, and I'm soon going to put these issues down to a hardware problem with the graphics card and RMA it.

If anyone else has any ideas on how I can solve this problem, please chip in if you can.

Thanks
 

Brucey_Bonus

Alright guys, I have decided to RMA the card anyhow, but I noticed a peculiar issue: when I enable K-Boost, my GPU clock jumps to 1188MHz and the voltage to 1174mV, and the card is completely stable. But when I stress the card with K-Boost enabled, it downclocks back to 1005MHz and 1024mV and eventually crashes again. I also think the crashes may be related to the card's video memory, which is perhaps unstable, but can video memory just become unstable?

I would like to find out why my card is defective.
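
To put the K-Boost downclock in proportion, here is a quick calculation using only the clock and voltage figures quoted above. `percent_drop` is just an illustrative helper.

```python
def percent_drop(before, after):
    """Percentage decrease going from `before` to `after`."""
    return (before - after) / before * 100


clock_drop = percent_drop(1188, 1005)    # MHz, K-Boost idle vs. under stress
voltage_drop = percent_drop(1174, 1024)  # mV, K-Boost idle vs. under stress

print(f"clock drop  : {clock_drop:.1f}%")
print(f"voltage drop: {voltage_drop:.1f}%")
```

That works out to roughly a 15% lower clock and a 13% lower voltage under load than the K-Boost values.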
 

Junkyard Willie

Honorable
Jul 29, 2013
1
0
10,510
I simply uninstalled the driver (in Device Manager > Display adapters > right-click and uninstall), then restarted the machine and used the generic video driver that Windows 7 installs automatically. Either NVIDIA's current driver is crap or it's good but corrupted on your machine. I tend to think the former.