Another "display driver nvlddmkm stopped responding and has recovered"

DasT123

Jan 28, 2015
Hi everyone!

I would really appreciate any suggestion on how to get rid of this damn display driver issue.
I tried every method I could find that was successful for other folks, so even if I don't get any closer to a solution, perhaps some people will benefit from my "Steps taken so far" section.

---------------------------------------------------------------------------

The Problem:

The display driver "nvlddmkm" stopped responding and has recovered. It happens exclusively in-game, in every game I have tried so far,
after a varying amount of play time, ranging from 2 minutes to 1 hour, mostly after about 5 minutes.
Event ID 4101, with no further event data or details to go on.

Details on Crash:
The screen goes black and recovers, as is typical for a TDR issue. Some games cope and I can keep on playing, some games crash.
But if a game copes and I keep on playing, it happens more and more frequently and ultimately results in a complete system freeze.
Then the sound loops on whatever it was playing and the system freezes.
Sometimes the system carries on for a bit at something like 1 frame per half a minute, but not for long.
-> BSOD! BlueScreenView tells me ntoskrnl.exe is the culprit, with something like this:

Dump File: 012715-20904-01.dmp
Crash Time: 27.01.2015 23:20:06
Bug Check Code: 0x00000101
Parameter1: 00000000`00000031
Parameter2: 00000000`00000000
Parameter3: fffff880`02f63180
Parameter4: 00000000`00000002
Caused by Driver: ntoskrnl.exe
Caused by Address: ntoskrnl.exe+76e80
File Description: NT Kernel & System
Product Name: Microsoft® Windows® Operating System
Company: Microsoft Corporation
File Version: 6.1.7601.18700 (win7sp1_gdr.141211-1742)
Processor: x64
Crash Address: ntoskrnl.exe+76e80
Full Path: C:\Windows\Minidump\012715-20904-01.dmp
Processor Count: 4
Major Version: 15
Minor Version: 7601
Dump File Size: 328.848
Dump File Time: 27.01.2015 23:23:02

That is essentially the system process, so it doesn't help much. And I can't read those crash parameters.
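For anyone else stuck on those parameters: in Microsoft's bug check reference, code 0x00000101 is CLOCK_WATCHDOG_TIMEOUT, meaning a processor core did not handle a clock interrupt within the allotted interval. Parameter 1 is the timeout interval in nominal clock ticks and Parameter 3 is the address of the processor control block (PRCB) of the unresponsive core. Parameter 4 is described in newer debugger output as the index of the hung processor; treat that reading as an assumption here. The following is only a minimal Python sketch for turning the BlueScreenView fields into readable values. The function names and the small lookup table are made up for illustration, and the table is far from complete.

```python
# Bug check names taken from Microsoft's bug check code reference.
# Only a few codes relevant to display-driver threads are listed.
BUG_CHECK_NAMES = {
    0x101: "CLOCK_WATCHDOG_TIMEOUT",
    0x116: "VIDEO_TDR_FAILURE",
    0x117: "VIDEO_TDR_TIMEOUT_DETECTED",
}


def parse_hex(text: str) -> int:
    """Convert '00000000`00000031' or '0x00000101' style text to an int."""
    return int(text.replace("`", ""), 16)


def summarize(fields: dict) -> dict:
    """Summarize the bug check fields copied from BlueScreenView."""
    code = parse_hex(fields["Bug Check Code"])
    return {
        "code": code,
        "name": BUG_CHECK_NAMES.get(code, "not in this small table"),
        # For 0x101: timeout interval in nominal clock ticks.
        "timeout_ticks": parse_hex(fields["Parameter1"]),
        # For 0x101: PRCB address of the unresponsive processor.
        "prcb_address": hex(parse_hex(fields["Parameter3"])),
        # Assumption: index of the hung processor (see text above).
        "parameter4": parse_hex(fields["Parameter4"]),
    }


# Values copied from the dump listing in this post.
DUMP = {
    "Bug Check Code": "0x00000101",
    "Parameter1": "00000000`00000031",
    "Parameter2": "00000000`00000000",
    "Parameter3": "fffff880`02f63180",
    "Parameter4": "00000000`00000002",
}

if __name__ == "__main__":
    for key, value in summarize(DUMP).items():
        print(f"{key}: {value}")
```

Read that way, the stop code itself is about a CPU core that stopped answering clock interrupts, not about the display driver as such.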

---------------------------------------------------------------------------

System: Desktop PC

OS : Windows7 Professional 64Bit SP1 with Latest Updates (including optional)

Mainboard : Gigabyte GA-MA770T-UD3P rev1.0
- Bios: Award Modular BIOS v6.00PG (F12 at GigaSupportPage; Newest available there)
- AMD Chipset Driver 8.641 (Newest available on GigaSupportPage, too)
- Realtek Function driver for Realtek Azalia audio chip 5.10.0.6642 (Newest available on GigaSupportPage, too)
- Realtek LAN Driver 7.049.0927.2011

Processor : AMD Phenom II x4 955 3.2GHz

Graphics : Gigabyte GV-N560OC-1GI (GTX 560 Ti, factory OC with 900/1800/2004 MHz for core/shader/memory frequency); GeForce Game Ready Driver 347.25 (DxDiag says Display Driver Version 9.18.13.4725); PhysX 9.14.0702 (without Legacy); currently without GeForce Experience, Nvidia Audio Drivers and Nvidia Vision Drivers.

Memory : 2x Corsair XMS3 DDR3 2GB (TW3X4G1333C9A ; Running at Dual Channel)

2 Samsung displays (currently running only 1)

---------------------------------------------------------------------------

Steps taken so far:

- System recovery from a previous Windows recovery point
- Rollback to several older Nvidia driver versions (using the DDU removal tool in Safe Mode, as admin). ! The latest version in the mainboards section at Gigabyte.com is 310.90, which is too old for the game I'm currently testing with.
- Tried with and without Phys-X, installed and uninstalled Phys-X Legacy, Set PhysX Device to GPU in Nvidia Control Panel
- Tried with and without Nvidia Sound Drivers. (Without means with the Windows Default Drivers)
- Unplugged the secondary display, because you never know.
- Ran Malwarebytes, Adw and some other tools against infection, from Windows as well as from Safe Mode. I read somewhere that a trojan or a bitcoin-mining virus could cause TDR timeouts.

- Format C: -> Clean Windows7 Install ;
! 1 point of concern here: the 1 TiB internal HDD is split between C: (638 GiB) and D: (292 GiB) and I only quick-formatted C:. Don't know if that was wrong, even though as far as I know the system should only use C:

- Installed all VC++ redistributables (2005, 2008, 2010, 2012, 2013, all x86 and x64) and others with AIO-Runtimes v2.2.0 (German, but as it's just libraries it shouldn't add driver issues?)
- Verified DirectX with both DxDiag and Webinstaller; Installed DirectX 9c
- Windows Classic as desktop theme; power settings set for optimal performance in the Windows power options as well as in Nvidia Control Panel.

- Registry edit as proposed by Microsoft Support: http://support.microsoft.com/kb/2665946
HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\GraphicsDrivers
New DWORD "TdrDelay" with a value of 8 (creating it as a 64-bit QWORD didn't work; the 32-bit DWORD did, even though it's a 64-bit OS). Result: freeze for several seconds before the display driver recovery (the value does its job, but the problem remains)
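For anyone who wants to repeat the TdrDelay edit without clicking through regedit, here is a rough Python sketch that only builds the text of a .reg file for the key and value named above. The function name is made up for illustration. The key path, the value name and the value 8 come from this post; Microsoft documents TdrDelay as a REG_DWORD counted in seconds with a default of 2. Note that .reg files write DWORD values in hexadecimal, so 10 seconds would appear as 0000000a.

```python
# Key path as given in the post above.
GRAPHICS_DRIVERS_KEY = (
    r"HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\GraphicsDrivers"
)


def tdr_delay_reg(seconds: int) -> str:
    """Return .reg file text that sets TdrDelay to the given seconds."""
    if not 0 <= seconds <= 0xFFFFFFFF:
        raise ValueError("TdrDelay must fit into a 32-bit DWORD")
    lines = [
        "Windows Registry Editor Version 5.00",
        "",
        f"[{GRAPHICS_DRIVERS_KEY}]",
        # DWORD values in .reg files are 8 hexadecimal digits.
        f'"TdrDelay"=dword:{seconds:08x}',
        "",
    ]
    return "\n".join(lines)


if __name__ == "__main__":
    # Prints the file content; save it as e.g. tdrdelay.reg and import it
    # as administrator, then reboot for the change to take effect.
    print(tdr_delay_reg(8))
```

As noted above, a longer delay only postpones the driver reset; it did not remove the cause here.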

- Downloaded and installed the following drivers for my mainboard, all of them the latest version on the Gigabyte page for my mainboard: BIOS update from "F5" to the latest version "F12"; AMD Chipset Driver update to 8.641; Realtek "High Definition Audio System Software Ver: R2.70" (wanted to avoid this originally because I read somewhere it can cause problems); Realtek LAN Driver 7.049.0927.2011 (wanted to avoid this too, for the same reason)

- Downclocked the factory-OC values (900/1800/2004 MHz) to the default card values (822/1644/2004 MHz) with MSI Afterburner 301
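To put that downclock into numbers, here is a small Python sketch using only the clock values quoted above. The helper name is made up for illustration. It also checks that the shader clock is twice the core clock in both sets of values, which matches how these two clocks are linked on this card generation.

```python
def percent_change(old_mhz: float, new_mhz: float) -> float:
    """Relative change from old to new clock, in percent."""
    return (new_mhz - old_mhz) / old_mhz * 100


# Clock values in MHz as quoted in the post.
FACTORY_OC = {"core": 900, "shader": 1800, "memory": 2004}
REFERENCE = {"core": 822, "shader": 1644, "memory": 2004}

if __name__ == "__main__":
    for domain, old in FACTORY_OC.items():
        new = REFERENCE[domain]
        print(f"{domain}: {old} -> {new} MHz ({percent_change(old, new):+.1f}%)")
    for name, clocks in (("factory OC", FACTORY_OC), ("reference", REFERENCE)):
        ratio = clocks["shader"] / clocks["core"]
        print(f"{name}: shader/core ratio = {ratio:.1f}")
```

So going back to the reference clocks removes roughly 8.7% from core and shader clock and leaves the memory clock untouched.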

! - The memory frequency of DDR3 should be 666 MHz, SIW says it's running at 669.7 MHz. Problem?
! Also the system says I have CM3X2G1333C9 instead of the TW3X4G1333C9A I actually have. As both have the same specs it should be fine?
Timing values are all right (9-9-9-24)
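To judge how far off 669.7 MHz really is, here is a quick Python sketch of the arithmetic. DDR3-1333 nominally runs at 1333.33 MT/s, i.e. a 666.67 MHz memory clock. The second calculation is an assumption, not something measured: it supposes the memory clock is derived from the board's 200 MHz reference clock with a fixed ratio, so the same relative deviation would apply to that reference clock. The function names are made up for illustration.

```python
NOMINAL_MEMORY_MHZ = 2000 / 3   # DDR3-1333: 1333.33 MT/s / 2 = 666.67 MHz
NOMINAL_REFERENCE_MHZ = 200.0   # assumed reference clock of the board


def deviation_percent(measured_mhz: float) -> float:
    """Deviation of the measured memory clock from nominal, in percent."""
    return (measured_mhz - NOMINAL_MEMORY_MHZ) / NOMINAL_MEMORY_MHZ * 100


def implied_reference_mhz(measured_mhz: float) -> float:
    """Reference clock implied by the reading, assuming a fixed ratio."""
    return NOMINAL_REFERENCE_MHZ * measured_mhz / NOMINAL_MEMORY_MHZ


if __name__ == "__main__":
    measured = 669.7  # value reported by SIW in the post
    print(f"deviation: {deviation_percent(measured):+.3f} %")
    print(f"implied reference clock: {implied_reference_mhz(measured):.2f} MHz")
```

That works out to a deviation of under half a percent, which would correspond to a reference clock of about 200.9 MHz under the assumption above.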

- FurMark 1.15 benchmark running fine at both Fullscreen(1920*1080) and Windowed (GPU Hottest at 73°C; FAN at 81%)
- Ran Video Memory Stress Test 1.7.116 with No Errors Found
- Still Running MemTest 4.1 with No Errors

- Monitoring voltages with SIW 4.4.0514e, CPU-Z 1.67.1 and MSI Afterburner 301: CPU 1.33V-1.34V; GPU 0.95V-1.00V (0.95V SIW, 1.037 MSI-A); memory 1.5V (CPU-Z) - 1.6V (SIW)
! Noticed in SIW that +5VCCH (the one for mainboard and USB when power is switched off) is not constant, ranging from 3.79-4.30V
Monitoring voltages for memory, processor and GPU while in-game (on the second display), they seem to be carved in stone.
- Monitoring temperatures, they are just fine. Cleaned out dust anyway.
- Replacing nvlddmkm.sys with the nvlddmkm.sy_ from the current Nvidia driver via "expand.exe nvlddmkm.sy_ nvlddmkm.sys"
- Downloaded and installed GPU BIOS from Gigabyte Site "NVIDIA Source BIOS Version : 70.24.11.00.01/70.24.21.00.02"


---------------------------------------------------------------------------

Steps not taken so far:

- Borrow a graphics card and memory from a friend and swap them in. Wanted to avoid this. Also, hardware shouldn't be the issue? The crash appears only in-game, and FurMark and all other tests ran just fine?
- Test with a game that runs with the Nvidia 310.90 drivers, as those are the latest drivers on the Gigabyte page for my card.
- Deactivating single features in 3D-Settings of Nvidia Control Panel

What can I say, I really don't know what else I could try. What am I missing?
I would be so happy about pointers, and even happier about a solution ;-)

Greetings!
DasT123
 
Solution
- You might go to Control Panel, open Device Manager and disable the unused High Definition Audio devices for your GPU if you don't get sound to your monitor via your video cables. This can help reduce load on the GPU.

- You might also consider underclocking the GPU by 100 MHz to see if it resolves your problem.
(Make sure your BIOS is not overclocking the bus on the motherboard; it should be at 100 MHz.)

I think "the GPU recovered" means that Windows decided the GPU was taking too long and issued a reset to the device, and the device responded to the reset within the timeout. If it had not responded within 30 seconds, Windows should bugcheck and write a memory dump. I think the card is either not seeing a signal sent to it or is just taking too long to process something (like sound that is not going to a speaker).




 
So I wanted to take more load off the GPU and shift it to the CPU.

- Deactivated all possible 3D settings in Nvidia Control Panel,
except the settings for "count of prerendered frames" and "previous virtual reality frames", both of which I maxed to 4. Sorry if that isn't translated correctly.
-> Result: nvlddmkm crashed as soon as the game loaded a save, I didn't even have time to look around.

- Restored the Nvidia Control Panel settings and set both values to 1, their minimum.
- BIOS:
Set CPU multiplier to 15x200MHz -> 3000MHz (instead of 16x200MHz); disabled K8 Cool n Quiet; disabled AMD C1E support; enabled HDD S.M.A.R.T. (unrelated, must have been turned off by the BIOS update)
-> Could play for nearly one and a half hours until nvlddmkm crashed. Looking at the event log, I noticed "\Device\Video5 Graphics Exception: Shader Program Header 1 Error" (also with header numbers 2, 3, 9, 14, 18), followed by "\Device\Video5 Graphics Exception: ESR 0x405840=0x8304420e" repeating 1 to 3 times. This occurred at intervals of 1 min to 40 min.

- Turned the shader cache off in Nvidia Control Panel to see what happens.
-> No more shader header errors, could play for an hour before nvlddmkm crashed.

Then I read your answer, johnbl!
- Underclocked the video card core clock to 800MHz and deactivated all Nvidia sound devices in Device Manager. (Could have thought of the second one myself :)
-> Seems that's what was missing, I have been playing for 2 hours with no error. I will keep testing (playing) and check back if I encounter more errors.

Much appreciated! Thanks very much!
 
Another nvlddmkm error, at 1:54 pm!
3 hours without one is a new record though!

I noticed I got 3 of the shader header errors again, all in the same second (at 11:03 am, so I don't know if it's related).

\Device\Video5 Graphics Exception: Shader Program Header 11 Error
0000000002003000000000000D00AAC0000000000000000000000000000000000000000000000000

\Device\Video5 Graphics Exception: Shader Program Header 18 Error
0000000002003000000000000D00AAC0000000000000000000000000000000000000000000000000

\Device\Video5 Graphics Exception: ESR 0x405840=0xa0040800
0000000002003000000000000D00AAC0000000000000000000000000000000000000000000000000
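If anyone wants to tally these entries instead of reading them one by one, here is a rough Python sketch that sorts the message texts into shader header errors and ESR entries. The function and pattern names are made up for illustration, and it assumes the messages have already been copied out of the event log as plain text. It does not try to interpret what the ESR register values mean.

```python
import re
from collections import Counter

# Patterns for the two message shapes quoted in this thread.
HEADER_RE = re.compile(
    r"\\Device\\Video(\d+) Graphics Exception: "
    r"Shader Program Header (\d+) Error"
)
ESR_RE = re.compile(
    r"\\Device\\Video(\d+) Graphics Exception: "
    r"ESR (0x[0-9a-fA-F]+)=(0x[0-9a-fA-F]+)"
)


def classify(message: str):
    """Return (kind, detail) for one event message text."""
    match = HEADER_RE.search(message)
    if match:
        return ("shader_header", int(match.group(2)))
    match = ESR_RE.search(message)
    if match:
        # (register, value) as integers, left uninterpreted.
        return ("esr", (int(match.group(2), 16), int(match.group(3), 16)))
    return ("other", None)


# Message texts copied from the post above.
MESSAGES = [
    r"\Device\Video5 Graphics Exception: Shader Program Header 11 Error",
    r"\Device\Video5 Graphics Exception: Shader Program Header 18 Error",
    r"\Device\Video5 Graphics Exception: ESR 0x405840=0xa0040800",
]

if __name__ == "__main__":
    counts = Counter(classify(m)[0] for m in MESSAGES)
    print(dict(counts))
    headers = sorted(d for kind, d in map(classify, MESSAGES)
                     if kind == "shader_header")
    print("header numbers:", headers)
```

Counting per play session this way makes it easier to see whether a change (shader cache off, lower clocks) actually reduces the errors.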

Any more suggestions?
And how do I change the thread state back to unresolved?

Greets
DasT123

Edit1:
- Increased GPU core voltage to 1060mV as proposed in some OC forums
-> Still nvlddmkm errors, more frequently again.
 
Thank you for unresolving the thread! And thanks for your suggestions!

Disabled the Nvidia Streaming Service in Administrative Tools/Services, set service start from automatic to disabled. Restart.
-> nvlddmkm error after roughly 20 sec in-game; didn't want to keep reproducing that error at that moment, so I switched it on again.

Since then I have started disabling things in Device Manager:
- Disabling Realtek Ethernet Adapter

-> A crash I had not encountered so far:
System-wide freeze, with the game display going to a black screen and the other screen just freezing.
Could still move the mouse for about 20 more seconds (on both displays, could not click anywhere though) before it froze too.
System remained in this state for 20+ minutes. No BSOD, no reboot.
-> No event log entry (the only thing in the log was a host process related to Windows Search not responding, which I fixed by reindexing)
-> Tried to reproduce the crash and ended up with one classic nvlddmkm error in 12 hours of gaming!

Since it's fairly stable now I will carry on like that until I have more time or it gets worse again.
Still not resolved, it seems..
 
The issue is still occurring!
But as it only occurs in-game, and only in games that put a heavy load on the system, I have come to believe it's simply because the system is so old.
It's 5 years old, and though it's very nice that I can even run Witcher 3, I guess the system is hitting its limits.
So either the games aren't coded to run on systems that old, or the driver doesn't handle high-end demands well.
Either way, I didn't find the cause.
Some games handle the crash well and I don't even have to restart them; some games just don't recover and I have to restart.
Good luck with your display driver issues!