Question Error 4101, Display: Display driver nvlddmkm stopped responding and has successfully recovered.

Status
Not open for further replies.

kortex

Distinguished
Nov 9, 2007
76
0
18,630
I'd like some help troubleshooting my gaming issues. I can't be in some games for more than a few seconds before the game crashes. I've been building and diagnosing computer problems for around 15 years and this problem is really confusing and frustrating me. I apologize for the long read but I've tried quite a few things. I've been trying to play the following three games and all of them crash. I'm baffled. I've read quite a bit on this error online but none of the fixes seem to work for me. If anyone has any ideas, I'd love to hear them.

UserBench: https://www.userbenchmark.com/UserRun/42082176

System Specs:
  • ASUS Maximus XI Hero WIFI
  • i9-9900k (not OC)
  • 2080ti (not OC)
  • EVGA SuperNova 1200W
  • G.SKILL F4 DDR4 3200 C14 4x8GB. XMP Enabled.
  • Samsung 970 Pro 512GB
  • 2x Acer XB271HU
  • Cooling
    • two large radiators (don't make me look them up)
    • dual d5 pumps
    • Heatkiller IV GPU block
    • Heatkiller IV CPU block
I. What's going wrong

Prey

Initially unplayable. After I disabled V-Sync and set the game to borderless window, it worked great on the highest settings, and I played about half the game. Now it freezes when loading a saved game or seconds after the game has loaded. I've tried every in-game setting and no combination makes it work. I get two Event Viewer entries. One is the error "Application Hang" (it specifies prey.exe). The other is Event ID 4101 with the message in the title of this post.
Far Cry 5

I got through the intro just fine and then the game crashed. Now the game crashes whenever I load the save, and I can't start a new game either. I've messed with the graphics settings and nothing has helped. I get error 4101 within seconds of entering the game or while loading a saved game.

II. What I've tried
  1. Switched PCIe Slot for 2080ti
  2. Verified integrity of game files in steam.
  3. Prime95 torture test. Ran for 20 mins or so. No problems.
  4. Unreal Heaven on extreme. Ran through all scenarios. No issues.
  5. Removed RAM and tried each stick individually
  6. Memtest. Everything passed.
  7. Removed display drivers with DDU (I followed the instructions) and re-installed latest version
  8. Removed display drivers and installed versions of the driver dating all the way back to March 2020.
  9. Uninstalled/reinstalled games in steam
  10. Modified settings: G-SYNC, VSYNC, all detail settings to low
  11. Unplugging one monitor.
  12. "Clean Booting" (disabling all non necessary services and applications) and trying to run the games
  13. Uninstalling/reinstalling DirectX iterations
  14. Performed chkdsk. No errors.
  15. Crystal Disk Info: 100% Health
  16. Running System File Checker (sfc /scannow). Corrupted files were found and repaired, but there was no change in the games (or the system, for that matter).
  17. Downclocking RAM. Enabling XMP. Enabling XMP II.
  18. Formatting my hard drive and re-installing windows. I installed the bare minimum drivers--not even sound drivers. The games still crashed.
    1. I didn't install armory crate or corsair software or ANYTHING except bare minimum drivers.
  19. Adding TdrDelay values to the registry of 10 and then 20.
  20. Disabled all RGB effects in bios (this was recommended)
  21. Downclocking video card. I tried to do this... I was able to downclock the memory but not the clock speed. I have no idea why.
  22. Switching the power cables from my PSU to the video card.
  23. Switching the ports on the PSU that are powering the video card.
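
For item 19, here is a minimal sketch of what setting TdrDelay can look like as a .reg file, generated in Python. It assumes the value is a REG_DWORD measured in seconds under the GraphicsDrivers key that Microsoft documents for TDR settings; the post does not say which key or value type was actually edited, and `build_tdr_reg` is a hypothetical helper name.

```python
# Sketch: build the text of a .reg file that sets TdrDelay.
# Assumption: TdrDelay is a REG_DWORD (seconds) under the GraphicsDrivers key.
# The original post does not state which key or type was used.

TDR_KEY = r"HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\GraphicsDrivers"


def build_tdr_reg(delay_seconds: int) -> str:
    """Return .reg file text that sets TdrDelay to delay_seconds."""
    if not 0 <= delay_seconds <= 0xFFFFFFFF:
        raise ValueError("TdrDelay must fit in a 32-bit unsigned DWORD")
    lines = [
        "Windows Registry Editor Version 5.00",
        "",
        f"[{TDR_KEY}]",
        # .reg files encode DWORD values as 8 hexadecimal digits.
        f'"TdrDelay"=dword:{delay_seconds:08x}',
        "",
    ]
    return "\r\n".join(lines)


if __name__ == "__main__":
    # The two values tried in the post: 10 and 20 seconds.
    for seconds in (10, 20):
        print(build_tdr_reg(seconds))
```

Note that 10 and 20 seconds become `0000000a` and `00000014` in the file, since the DWORD is written in hexadecimal. A reboot is generally needed before a changed TDR value takes effect.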
III. Questions that have been asked (I posted a while ago on Reddit)
  1. Where did you buy your graphics card?
    1. RTX 2080ti bought directly from NVIDIA.
  2. How is your system cooled?
    1. Liquid. Custom loop.
  3. BIOS up to date?
    1. You bet.
  4. Chipset drivers up to date?
    1. Undoubtedly
  5. Last windows install?
    1. April 23rd 2021
  6. Do the games crash or the system?
    1. The games crash. The system hasn't crashed once.
  7. Does your system bluescreen?
    1. Not once.
  8. GPU Temp during Unreal Heaven or Userbench?
    1. 102F. Yes, F.
  9. Do any games run ok?
    1. Valheim, max settings runs perfectly
    2. CS:GO, max settings runs perfectly
  10. Has my card been used for mining?
    1. Nope. Bought new, don't mine.
  11. Power setting in NVIDIA control panel?
    1. I switched it to high performance and it didn't change anything.
  12. Power settings for PCI in windows?
    1. Off (no power saving)
 

Ralston18

Titan
Moderator
Check Reliability History for error codes, warnings, and even informational entries that correspond with the gaming crashes.
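
As a rough illustration of what "correspond with" can mean in practice, the sketch below pairs each crash time with any kernel event logged shortly after it. The timestamps are hypothetical and assumed to have been copied out by hand; `events_after` and the 5 second window are illustrative choices, not anything Reliability History provides.

```python
from datetime import datetime, timedelta


def events_after(crash_times, kernel_times, window_seconds=5):
    """Map each crash time to the kernel events logged within the window after it."""
    window = timedelta(seconds=window_seconds)
    return {
        crash: [k for k in kernel_times if timedelta(0) <= k - crash <= window]
        for crash in crash_times
    }


if __name__ == "__main__":
    # Hypothetical timestamps, typed in by hand for illustration only.
    crashes = [datetime(2021, 4, 26, 14, 4, 10)]
    kernel = [datetime(2021, 4, 26, 14, 4, 11), datetime(2021, 4, 26, 15, 30, 0)]
    for crash, matches in events_after(crashes, kernel).items():
        print(crash, "->", matches)
```

Entries that never land inside the window for any crash are probably unrelated noise.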

Noted "sfc /scannow" was run. What about "dism"?

The three games that crash: Prey, Far Cry, and ?__ . Source?

Unreal/ Unigine Heaven(?)

Use either Task Manager or Resource Monitor to observe system performance. Use both but only one at a time.

First watch after booting while the system simply idles. Then watch while doing light work or browsing. Last watch while gaming.

Determine what system resources are being used, to what extent (%), and what is using the resource. You may spot some pattern that leads to the crashes.

Also take a deeper look into what all is running via Task Scheduler and Process Explorer. Look for anything that only appears with respect to the games that crash.

All in all:

#21 "Downclocking video card. I tried to do this... I was able to downclock the memory but not the clock speed. I have no idea why. "

Very much seems to be a point of uncertainty.

FYI:

https://forums.evga.com/2080ti-memory-overclock-even-worth-it-m3035338.aspx

Saw a few other links that circled around the matter - not linking any of the other links - may go too far astray.

May need more digging into things there. However, you are in the best position to decide about that.
 

kortex

Swap out your PSU.

It looks like you've covered all of the other bases but I don't see any info on your PSU being checked out. You've run Prime and Heaven, but individually they don't load the PSU up like a game will.

I ran the OCCT power test. It got the processor hotter than it's ever been and pushed everything as hard as it could. My system didn't crash. Do you still think it's the PSU? Not saying you're wrong, it totally could be. Wouldn't the system crash, though? Maybe not, if the power just wasn't being delivered to the video card. Maybe only the display would crash. Also of interest: I've watched the video card pull 260W continuously and the display hasn't crashed. Thoughts?
 

kortex

Check Reliability History for error codes, warnings, and even informational entries that correspond with the gaming crashes.

I feel somewhat embarrassed to say I didn't do this. There ARE some kernel events directly following (1s or so) some of the crashes.
Fault bucket , type 0
Event Name: LiveKernelEvent
Response: Not available
Cab Id: 0

Problem signature:
P1: 141
P2: ffffbc0610e24050
P3: fffff800664f4ec0
P4: 0
P5: 20c4
P6: 10_0_19042
P7: 0_0
P8: 256_1
P9:
P10:

Attached files:
\\?\C:\Windows\LiveKernelReports\WATCHDOG\WATCHDOG-20210426-1404.dmp
\\?\C:\Windows\TEMP\WER-456125-0.sysdata.xml
\\?\C:\ProgramData\Microsoft\Windows\WER\Temp\WERFE58.tmp.WERInternalMetadata.xml
\\?\C:\ProgramData\Microsoft\Windows\WER\Temp\WERFE68.tmp.xml
\\?\C:\ProgramData\Microsoft\Windows\WER\Temp\WERFE67.tmp.csv
\\?\C:\ProgramData\Microsoft\Windows\WER\Temp\WERFE78.tmp.txt

These files may be available here:
\\?\C:\ProgramData\Microsoft\Windows\WER\ReportArchive\Kernel_141_7d640848a3f786ca869c519d2cc5305a9fe1_00000000_cab_95835267-6b66-4ecf-beb8-3eed6a38ff62

Analysis symbol:
Rechecking for solution: 0
Report Id: 95835267-6b66-4ecf-beb8-3eed6a38ff62
Report Status: 2049
Hashed bucket:
Cab Guid: 0
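
To make the report above easier to read, here is a small sketch that pulls the P1..P10 fields out of a pasted "Problem signature" block. It assumes P1 is the bug check code in hexadecimal and P6 is the OS version with underscores for dots, which is how LiveKernelEvent entries are commonly read; the function names are hypothetical. The code names in `KNOWN_CODES` are the documented Windows bug check names for those values.

```python
import re

# Documented Windows bug check names for a few video-related codes.
KNOWN_CODES = {
    0x116: "VIDEO_TDR_FAILURE",
    0x117: "VIDEO_TDR_TIMEOUT_DETECTED",
    0x141: "VIDEO_ENGINE_TIMEOUT_DETECTED",
}


def parse_signature(text):
    """Extract the P1..P10 fields from a WER 'Problem signature' block."""
    fields = {}
    # [ \t]* (not \s*) so an empty field such as "P9:" does not swallow the next line.
    for match in re.finditer(r"^(P\d+):[ \t]*(.*)$", text, flags=re.MULTILINE):
        fields[match.group(1)] = match.group(2).strip()
    return fields


def describe(fields):
    """Summarize the signature, assuming P1 is a hex code and P6 the OS version."""
    code = int(fields["P1"], 16)
    name = KNOWN_CODES.get(code, "unknown")
    build = fields.get("P6", "").replace("_", ".")
    return f"0x{code:X} {name} on Windows build {build}"


if __name__ == "__main__":
    sample = "P1: 141\nP2: ffffbc0610e24050\nP5: 20c4\nP6: 10_0_19042\nP9:\nP10:\n"
    print(describe(parse_signature(sample)))
    # prints "0x141 VIDEO_ENGINE_TIMEOUT_DETECTED on Windows build 10.0.19042"
```

Read this way, the report points at a GPU engine timeout, which is consistent with the 4101 "display driver stopped responding" event rather than a separate problem.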

I'm getting some of these too, but they don't seem to be related to the time of the display driver crash. I'm not sure what to make of them.

The application-specific permission settings do not grant Local Activation permission for the COM Server application with CLSID
{2593F8B9-4EAF-457C-B68A-50F6B8EA6B54}
and APPID
{15C20B67-12E7-4BB6-92BB-7AFF07997402}
to the user DESKTOP-E6LNB67\myname SID (S-1-5-21-2271528218-3128320148-1138339636-1001) from address LocalHost (Using LRPC) running in the application container Unavailable SID (Unavailable). This security permission can be modified using the Component Services administrative tool.

Noted "sfc /scannow" was run. What about "dism"?
Both passed without error.
View: https://imgur.com/D4Iwwmt

View: https://imgur.com/EiRW28w


The three games that crash: Prey, Far Cry, and ?__ . Source?
Fallout 4. Apologies.

Unreal/ Unigine Heaven(?)
Unigine Heaven. Brainfart.

Use either Task Manager or Resource Monitor to observe system performance. Use both but only one at a time.

First watch after booting while the system simply idles. Then watch while doing light work or browsing. Last watch while gaming.

Determine what system resources are being used, to what extent (%), and what is using the resource. You may spot some pattern that leads to the crashes.

Also take a deeper look into what all is running via Task Scheduler and Process Explorer. Look for anything that only appears with respect to the games that crash.
Maybe there's some type of logging that would help? I haven't attempted to do that yet. I've watched, just not logged.
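
One possible way to log instead of watch, sketched under assumptions: `nvidia-smi` can write one CSV sample per second with something like `nvidia-smi --query-gpu=timestamp,power.draw,temperature.gpu,clocks.sm --format=csv,noheader,nounits -l 1`, redirected to a file. The Python below only parses such a log afterwards; the sample rows in it are hypothetical, and `parse_log` and `summarize` are illustrative names.

```python
import csv
from io import StringIO

FIELDS = ("timestamp", "power_w", "temp_c", "sm_clock_mhz")


def parse_log(text):
    """Parse nvidia-smi CSV rows (noheader, nounits) into dicts of readings."""
    rows = []
    for raw in csv.reader(StringIO(text), skipinitialspace=True):
        if len(raw) != len(FIELDS):
            continue  # skip blank or truncated lines, e.g. a row cut off mid-write
        timestamp, power, temp, clock = raw
        try:
            rows.append({
                "timestamp": timestamp,
                "power_w": float(power),
                "temp_c": float(temp),
                "sm_clock_mhz": float(clock),
            })
        except ValueError:
            continue  # skip rows with non-numeric readings
    return rows


def summarize(rows, tail=3):
    """Return the peak power draw and the last few samples in the log."""
    return {
        "peak_power_w": max(r["power_w"] for r in rows),
        "last_samples": rows[-tail:],
    }


if __name__ == "__main__":
    # Hypothetical log contents for illustration only.
    sample = (
        "2021/04/26 14:04:07.000, 251.30, 38, 1905\n"
        "2021/04/26 14:04:08.000, 259.80, 39, 1920\n"
    )
    print(summarize(parse_log(sample)))
```

The last few samples before a crash are the interesting part: a clock or power spike right before the 4101 event would be a pattern worth reporting.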


All in all:
#21 "Downclocking video card. I tried to do this... I was able to downclock the memory but not the clock speed. I have no idea why. "

Very much seems to be a point of uncertainty.

FYI:

https://forums.evga.com/2080ti-memory-overclock-even-worth-it-m3035338.aspx

Saw a few other links that circled around the matter - not linking any of the other links - may go too far astray.

Yes. That's concerning to me. The card DOES boost itself under load--I have seen that. Adjusting the clock speed in MSI Afterburner or Asus GPU Tweak III does nothing. I didn't have them both installed at the same time. I can't over or underclock it by myself, which is weird. I'm honestly not trying to overclock the memory or clock at all. I had read that some users found more stability by reducing clock speed.

May need more digging into things there. However, you are in the best position to decide about that.

I feel like I'm losing my mind to be honest.
 

Ralston18

Titan
Moderator
Task Manager/Resource Monitor: about all you can do is watch.

One thing to remember is that you will need to play the games while "watching" via another open window or a second monitor. Either way, you will probably not be able to win.

All in all, it is likely that the games that cause problems are demanding too much power over all or hitting some peak power demand that causes the PSU to falter.

And, as @dorsai correctly pointed out we have not really checked out the PSU.

PSUs are a critical component, and they do degrade over time, especially with heavy use (gaming, video editing, bit-mining) and through natural progression toward the PSU's designed-in EOL (End of Life). In addition, they provide three voltages to different components, so the resulting problems can and do vary depending on the voltage(s) involved.

Seeing increasingly numerous and varying errors is, in my mind, a sign of a failing PSU.

Two links of interest:

https://www.lifewire.com/how-to-manually-test-a-power-supply-with-a-multimeter-2626158

If you, a family member, or a friend has a multimeter and knows how to use it, then you can do some testing on the PSU.

https://www.tomshardware.com/reviews/best-psus,4229.html

Not with immediate intent to go purchase a new PSU. Just to add some insight with respect to PSU's and power requirements. Apply the calculators to your system.

And if you can swap in another known working PSU - all the better.
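
A back-of-the-envelope version of what such a calculator does, as a sketch only. The 260 W GPU figure is the draw reported earlier in the thread; every other wattage below is an assumed placeholder, and the 20% safety margin is an arbitrary illustrative choice.

```python
# Rough power budget. Only the 260 W GPU figure comes from the thread;
# every other number is an assumed placeholder for illustration.
COMPONENTS_W = {
    "RTX 2080 Ti (observed draw)": 260,
    "i9-9900K (assumed load draw)": 200,
    "two D5 pumps (assumed)": 50,
    "12 case fans (assumed)": 36,
    "motherboard, RAM, SSD (assumed)": 75,
}


def headroom(psu_watts, components, margin_percent=20):
    """Return (total draw, spare watts) after reserving a safety margin."""
    total = sum(components.values())
    budget = psu_watts * (100 - margin_percent) / 100
    return total, budget - total


if __name__ == "__main__":
    total, spare = headroom(1200, COMPONENTS_W)
    print(f"Estimated draw: {total} W, spare after margin: {spare} W")
```

With these placeholder numbers, a 1200 W unit has plenty of steady-state headroom, so a sum like this cannot rule out a failing PSU or short transient spikes; it only rules out an undersized one.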
 

kortex

OK. It’s as I feared. I broke out my multimeter and tested the power supply according to the article you sent. All voltages normal if not slightly above. Nothing lower. I went ahead and tested the VGA power cables. Three pins on each should be 12V. They all are.
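
For anyone repeating the multimeter test, here is a small sketch of the pass/fail check. It assumes the usual ATX tolerance of plus or minus 5% on the +12 V, +5 V and +3.3 V rails; the readings in the example are hypothetical, since no actual numbers were posted.

```python
# Sketch: check multimeter readings against ATX rail tolerances.
# Assumption: +/-5% on the +12 V, +5 V and +3.3 V rails.
# The readings in the example are hypothetical; the post gives no numbers.

TOLERANCE = 0.05


def rail_range(nominal):
    """Return the (low, high) acceptable voltages for a rail."""
    return (nominal * (1 - TOLERANCE), nominal * (1 + TOLERANCE))


def within_spec(nominal, measured):
    """True if the measured voltage is inside the tolerance band."""
    low, high = rail_range(nominal)
    return low <= measured <= high


if __name__ == "__main__":
    readings = {12.0: 12.15, 5.0: 5.08, 3.3: 3.34}  # hypothetical values
    for nominal, measured in readings.items():
        low, high = rail_range(nominal)
        status = "OK" if within_spec(nominal, measured) else "OUT OF SPEC"
        print(f"{nominal} V rail: {measured} V (allowed {low:.2f} to {high:.2f}) {status}")
```

Readings slightly above nominal pass this check. It is still an unloaded measurement, so a pass says nothing about how the rails hold up while gaming.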

Even though that article says the PSU should be good after these tests, that doesn’t (IMHO) mean it operates that way under load. I’m probably going to have to switch it out and try another. Anyone have a spare PSU sitting around (kidding)? It’s so confusing to me why it would be a PSU if during power supply torture test from OCCT, nothing goes wrong.

12 case fans, two d5 pumps and then the rest of the normal components mean a lot of power supply cables. Hereeeeeere we go.

Any other thoughts btw? I sincerely appreciate your help, even if I don’t find a solution.
 

Ralston18

Titan
Moderator
You are correct: the tests are not under load.

And it also could be a matter of some threshold set of conditions. All works until some specific combination of power demands and voltages is needed. Could involve temperatures and heat.

Then the PSU falters.

As for "torture test" - think about it.

First, the test is likely to run for a relatively short (subjective) duration, not the hours and hours that gamers and bit-miners put in. Second, any test sufficient to match real-world conditions is likely to start breaking things. How long would folks keep using such a test?

Testing labs, etc. plan to break things. So their testing is likely to be more realistic.
 
Even though that article says the PSU should be good after these tests, that doesn’t (IMHO) mean it operates that way under load.

My thoughts are you've done so many tests and nothing pans out... one of the most prominent common factors is the power supply. And you are correct that voltages measured at idle are not the same as voltages measured at load. You could have a single weak component in the PSU that only faults when it heats up. The OCCT test may just not be hitting the system's components all at the same time the way a game does. It's worth a shot swapping out the supply, and it certainly won't hurt anything, so why not try it? On the plus side, I think those 1200W EVGA supplies have a 10-year warranty.
 

kortex

Thank you that’s a great point. So, I borrowed a Corsair ax1200i from a friend. Here’s where it gets interesting. Prey lasted longer than usual (2 mins or so) before a crash. Far Cry 5 didn’t but DEFINITELY had higher FPS (I just run the benchmark because it’s going to crash anyway) before the crash. Both crashes were the same as before in terms of what’s reported by the system.

Motherboard, video card... something.
 

kortex

Thank you for the reply again. That makes sense. I just replied to Ralston with what happened when I switched the PSU out. I’m going to go pound sand.
 

kortex

Is it possible that peak wattage required by the system is higher than the 1200 watt PSU is capable of delivering? What are all of the components in the system and their wattage requirements?

Hi, I did add them up and it's fine. Like... really fine :). Big update. I procured a 1080ti from a friend and... no crashes. HOWEVER:

  1. The 1080ti is NOT using the riser cable that the 2080ti is using.
  2. I can't easily hook the 2080ti up directly, without the riser cable, unless I drain my loop.
  3. What should my next steps be?
edit: I had no idea about riser cables as I hadn't built a computer since 2009. I didn't even feel it was worth mentioning :|
 

kortex

I wish I knew you had been using a riser cable...removing it would have been my first suggestion as many of them are junk. If it was my system I would rebuild it in a standard configuration with the GPU directly in the PCI-E slot.

Well, this is my first time using one. The GPU is vertically mounted and the cable is from Caselabs. I have hard tubing and it’s going to be a bitch to try and move it. Are there “good” cables?
 