Hello,
I think my GPU (Asus ROG Strix RTX 4090 OC Edition) is crashing. Usually, after I play a game for about 3-5 minutes, my display stops responding (HDMI no signal), the GPU fans ramp to 100%, and I'm unable to do anything except reboot. Sometimes it doesn't happen until after I have gamed for hours, and then it occurs within 3-5 minutes of the PC sitting idle. Could it be an issue with my graphics card entering or exiting a certain mode? Strangely, the issue has twice gone away on its own for over a month and then returned.
Event Viewer usually logs the following two nvlddmkm errors together:
“The description for Event ID 14 from source nvlddmkm cannot be found. Either the component that raises this event is not installed on your local computer or the installation is corrupted. You can install or repair the component on the local computer.”
If the event originated on another computer, the display information had to be saved with the event.
The following information was included with the event:
\Device\Video3
0000(0000) 00000000 00000000
Element not found
A separate, identical error message is almost always logged at the same time for Event ID 0.
Note: I have also seen this error with the following information included in the event:
\Device\000000ff
Variable String too Large
\Device\000000ff
Error occurred on GPUID: 100
\Device\Video3
CMDre 00000002 00003ffc ffffffff 00000007 00ffffff
\Device\00000107
Row Remapper: New row marked for remapping, reset gpu to activate.
\Device\Video3
UCodeReset TDR occurred on GPUID:100
Element not found
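To make the different event payloads above easier to compare from one crash to the next, they could be tagged by the phrases they contain. This is only a rough sketch: the `tag_payload` helper and the tag names are made up for illustration, and the phrases are simply the ones quoted above, not an official list of nvlddmkm messages.

```python
# Hypothetical helper: tags nvlddmkm event payload lines by phrases quoted in this post.
# The tag names on the right are illustrative labels, not NVIDIA terminology.
PHRASES = {
    "Row Remapper": "row_remap",
    "TDR occurred": "tdr",
    "Error occurred on GPUID": "gpu_error",
}


def tag_payload(lines):
    """Return a sorted list of tags whose phrase appears in any payload line."""
    tags = set()
    for line in lines:
        for phrase, tag in PHRASES.items():
            if phrase in line:
                tags.add(tag)
    return sorted(tags)


# Payload lines copied from the event details above.
sample = [
    r"\Device\00000107",
    "Row Remapper: New row marked for remapping, reset gpu to activate.",
    r"\Device\Video3",
    "UCodeReset TDR occurred on GPUID:100",
]
print(tag_payload(sample))
```

Counting how often each tag shows up per crash might show whether the row remapping message only appears alongside the TDR entries or also on its own.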
I have also seen this error in Event Viewer:
Display driver nvlddmkm stopped responding and has successfully recovered.
*Note: For a while, when I was getting this error, my game would just freeze, but my GPU fans didn't ramp to 100% and I didn't have to reboot. The video driver did seem to recover successfully back then, but lately I get this error even though the driver doesn't actually recover.
A couple of times a .dmp file has been created. Someone on another forum analyzed it and said: “It's the GPU driver, but the reason for the crash sounds more software related. The crash error is ‘An attempt was made to release a semaphore such that its maximum count would have been exceeded.’”
After researching these error codes, it appears that nvlddmkm errors are very generic and could indicate a variety of different problems.
Here are the troubleshooting steps I have already tried based on researching this issue:
• Removed the drivers with DDU in safe mode, then tried both the latest driver and an older driver that previously worked for me.
• Clean reinstall of Windows 11
• Updated GPU BIOS
• Disabled XMP
• Completely uninstalled all third-party lighting controllers (Asus Armoury Crate, Corsair iCUE, Razer Synapse)
• Checked PSU power consumption. I've never seen it go over 700 W, and it's a 1000 W PSU.
• Changed power management mode to “Prefer maximum performance.”
• Changed Hardware accelerated GPU scheduling (HAGS) to off.
• Tried using a DisplayPort cable instead of HDMI and different HDMI ports.
• Scanned SSDs with Samsung Magician. Both are in good health.
• Checked CPU and GPU temperatures while gaming, and both are normal (under 70 °C).
• Changed Link State Power Management to OFF
• Changed ECC state to On
• Power limited GPU to 80%
• Manually flipped the switch on the graphics card that sets it to Quiet instead of Performance vBIOS profile
• Changed user permission of nvlddmkm.sys to full control
• Disabled "Fast startup" for Windows 11
• Disabled G-Sync
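Since the temperature and power checks above were spot checks, one further option would be to log GPU telemetry continuously so the last readings before a crash survive the reboot. Below is a minimal sketch, assuming `nvidia-smi` is on the PATH. The choice of fields, the `gpu_log.csv` file name, and the helper names are assumptions for illustration only.

```python
import os
import subprocess

# Field names accepted by `nvidia-smi --query-gpu`; which ones matter here is a guess.
FIELDS = ["timestamp", "pstate", "temperature.gpu", "power.draw",
          "clocks.gr", "clocks.mem", "utilization.gpu", "fan.speed"]


def build_command(interval_s=1):
    """Build an nvidia-smi command that prints one CSV row every interval_s seconds."""
    return ["nvidia-smi", "--query-gpu=" + ",".join(FIELDS),
            "--format=csv,noheader", "-l", str(interval_s)]


def parse_row(line):
    """Split one CSV row from nvidia-smi into a dict keyed by field name."""
    values = [v.strip() for v in line.split(",")]
    return dict(zip(FIELDS, values))


def log_forever(path="gpu_log.csv", interval_s=1):
    """Append rows to path, syncing to disk each time so a hard crash loses little."""
    proc = subprocess.Popen(build_command(interval_s),
                            stdout=subprocess.PIPE, text=True)
    with open(path, "a") as out:
        for line in proc.stdout:
            out.write(line)
            out.flush()
            os.fsync(out.fileno())

# To use: call log_forever() in a separate console before starting a game,
# then inspect the last rows of gpu_log.csv after the next crash.
```

The final rows before a crash should show whether the card was changing performance state (`pstate`) or clocks at that moment, which would help test the idea that the crash is tied to entering or exiting a certain mode.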
About 90% of the time this crash occurs within the first 3-5 minutes of gaming, but it has also happened a few times while I was away from my computer with nothing running. A couple of other times I was able to game for hours, and the same crash occurred within 5 minutes of stopping. A few times it occurred right after booting into Windows following a previous crash.
Interestingly, I'm able to run games with low GPU usage without issue.
I have had this PC for almost a year, and to the best of my knowledge my issues started only after I updated to the latest version of ASUS Armoury Crate back in June.
I would really appreciate your thoughts or suggestions.
Thanks in advance!
Link to full PC specs:
https://pcpartpicker.com/b/DYNPxr