Question The well-documented "nvlddmkm.sys" error and how it's costing me my sanity !

Oct 29, 2024
Evening all,

Apologies, I am going to rant a bit to give some context. Feel free to skip ahead to where I give full system details, the issues, the troubleshooting steps I've taken, and links to my minidump files. Anyway, context:

Over the last few weeks I have started collecting new parts to upgrade and downsize my case to a MFF Lian Li A3 build.

I decided to take the opportunity to upgrade to AM5, using some of my existing components but a fair few brand new ones.

The System:
  • CPU: 7800x3d (new)
  • Motherboard: ASRock B650M Pro RS (new)
  • GPU: 3090 FE (my existing GPU that I have had since launch)
  • RAM: Corsair Vengeance DDR5 32GB (2x16GB), 6000MHz CL30
  • NVMe: Samsung 980 2TB (from old system but with fresh Windows install)
  • PSU: Thermaltake Toughpower SFX1000 ATX 3.1 Platinum (new)

The Issue:

As the title alludes to, I am constantly getting black screens followed by restarts (sometimes after an hour, sometimes five minutes after booting), with no BSOD that I can see. Sometimes it shows up in Event Viewer as an nvlddmkm.sys error; sometimes there is just a bug report with a dump file that does seem to point to nvlddmkm.sys.

This error is very well documented, and I have literally spent the last 5 days reading forums and Reddit and watching videos on it. I feel like I could tell anyone what to do to fix this error, but for me it just won't work.

I have done all the usual advised steps: DDU in safe mode, clean driver installs, disabling automatic driver updates when loading back into Windows, etc. No luck. I have changed power settings. I have given user accounts full permissions on the actual file in System32. All the things everyone says should fix it. I even changed some settings in MSI Afterburner, including slightly downclocking the 3090. That was a stab in the dark though, as again it is 100% fine in my AM4 system and nothing changed between the systems cooling-wise.

Done it all. I've done full clean Windows installs and installed almost nothing else, to really give this the best shot. No luck.


This all got me thinking it was hardware. So, to note:

  • The 3090 works fine in my old AM4 system. I tested this and it continues to be fine.
  • PSU - When I first started having issues, before looking into the dump files, I thought this was the culprit. The way it would reset without a BSOD and with nothing in Event Viewer looked like a PSU issue. I got a replacement PSU; still happens. I used my Corsair HX1000 from my old systems; still happens. So I have ruled out the PSU.
  • Motherboard - I thought that was the issue, but I tried both this board and a Gigabyte board. Still happens.
  • NVMe - maybe?? I have used it both as a fresh install and also, just for fun, as a clone of my other system's main drive. Still happens.
  • RAM - ran memtests with no problems and have tried just one stick of RAM. Still happens.


So hardware-wise I have either changed or been able to test in another system everything except the RAM and NVMe, and I keep coming back to this being hardware somewhere, since with clean Windows installs and all the other advice out there (such as clean GPU driver installs with DDU) I've covered all the 'software' issues.

Oh, and regarding the CPU: nothing obviously wrong there. Temps are perfect, never really getting above 60°C when gaming (for the brief periods I can), so it seems to be cooled fine, and a physical inspection showed nothing such as bent pins on the board.


I have some other RAM turning up tomorrow to try and rule that out, and I have another NVMe I can try tomorrow as well. I'm just getting so hung up on this. Can someone give me a sense check and look at my dump files to make sure I'm not missing any other glaring issue?


Oh, and I've also tried the usual as well: not running PBO, disabling the iGPU, running RAM at stock speeds (not using EXPO), updating the BIOS, etc. I have also completely rebuilt the system about 3 times, ensuring everything is seated and connected properly.

Link to some minidump files from today:
Compressed Mini Dump.zip


Thank you!




 
DDU in safe mode
When you run Safe Mode, do the issues appear there as well? Or do they only appear when booted normally into the OS?

CPU: 7800x3d (new)
Remove the GPU from the system, plug the monitor into the motherboard, and see if the issues go away.

The 3090 works fine in my old AM4 system - tested this continues to be fine.
In some rare instances, an older GPU has issues working in a newer system.

To confirm or rule that out, use a second, known-working GPU, preferably an RTX 40-series, and see whether the system works fine.
 
Thank you for responding.


SAFE MODE: Didn't have any issues when doing the driver removals etc. in safe mode, but I haven't tested long enough to see if it still reproduces there. Will add it to my next round of testing.

REMOVE GPU: I did run it initially on the motherboard's graphics output for setup. It seemed OK, but again it was probably not tested long enough.

OLDER GPU: Tougher for me to test as I don't have another GPU on hand.

EDIT: Just ordered a 4080 Super. I didn't want to upgrade my GPU as it's really not needed, but I'm desperate to get this system working; I have invested too much time and money into it. Plus, the friend who is buying my old rig (literally case/fans/PSU/cables/CPU/RAM) might take the 3090 as well, as we can confirm it's still fine in that system.


Appreciate the advice, have a few more things to look at.
 
The maximum GUARANTEED RAM speed that your CPU will accept is 5200MHz (with DDR5 RAM). That doesn't mean that you can't clock the RAM at its stated speed of 6000MHz, but it does mean that it's not guaranteed to work with that CPU.

I would suggest removing all RAM overclock (via EXPO/DOCP/XMP) and running at the native SPD speed, which is probably 4800MHz. If the system is stable at that speed, then overclock it to 5200MHz and see whether the system is still stable. If it is, then it's actually working as designed, but you can experiment with increased speed until you find the point where it becomes unstable again.
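
The stepping procedure described here can be written down as a small loop. This is only a sketch: `is_stable` is a hypothetical placeholder for whatever manual stress test you run at each setting (it is not a real tool or API), and the list of speeds is an example, not a recommendation.

```python
from typing import Callable, Optional, Sequence


def highest_stable_speed(
    speeds: Sequence[int],
    is_stable: Callable[[int], bool],
) -> Optional[int]:
    """Walk the speeds from lowest to highest and stop at the first failure.

    Returns the last speed that passed, or None if even the lowest failed.
    """
    best = None
    for speed in sorted(speeds):
        if not is_stable(speed):
            break  # first unstable setting: stop raising the speed
        best = speed
    return best


# Made-up example: pretend the system passes every test up to 5600.
steps = [4800, 5200, 5600, 6000]
print(highest_stable_speed(steps, lambda speed: speed <= 5600))
```

If the result is 5200 or higher, the CPU is meeting its guaranteed speed; a result of None would point away from the RAM overclock as the cause.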
 
Appreciate that. I have tested running the RAM at stock speeds (no EXPO overclock) though.
 
So new RAM arrived.

Installed it and left OCCT running for an hour, just to hit it while I'm working. It has made it 20 minutes so far without a crash (last time, with the old RAM, it crashed with the same errors in the minidump at 15 minutes).

Fingers crossed!

I have a 4080 Super GPU arriving tomorrow but I'm trying everything I can so I won't have to use it and be able to return it unopened.


UPDATE: New RAM didn't fix it; I still got crashes with the same error. Running at standard speeds, not EXPO.


Getting more and more nervous that even a new GPU won't solve this, as in my constant research I'm seeing the same issues from people who have 40-series cards.
 
11: kd> !error 0xc000009a
Error code: (NTSTATUS) 0xc000009a (3221225626) - Insufficient system resources exist to complete the API.

What is globmerger.sys?
Kerb3961Kernel.sys?
Part of some encryption/decryption?
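
For anyone reading the `!error` output above, the hex value and the decimal value in brackets are the same 32-bit NTSTATUS number. A minimal sketch of how the bit fields break down, assuming the standard NTSTATUS layout (2 severity bits, a customer bit, a reserved bit, a 12-bit facility, a 16-bit code); the `NtStatus` and `decode_ntstatus` names are made up for this example:

```python
from typing import NamedTuple


class NtStatus(NamedTuple):
    severity: str
    customer: bool
    facility: int
    code: int


# Severity lives in the top two bits of the 32-bit value.
_SEVERITY = ("success", "informational", "warning", "error")


def decode_ntstatus(value: int) -> NtStatus:
    """Split a 32-bit NTSTATUS value into its bit fields."""
    value &= 0xFFFFFFFF
    return NtStatus(
        severity=_SEVERITY[value >> 30],
        customer=bool((value >> 29) & 1),
        facility=(value >> 16) & 0xFFF,
        code=value & 0xFFFF,
    )


print(0xC000009A)                   # 3221225626, the decimal form shown by !error
print(decode_ntstatus(0xC000009A))
# NtStatus(severity='error', customer=False, facility=0, code=154)
```

So 0xc000009a is an error-severity status with no special facility and code 0x9A, which is the "insufficient system resources" status the debugger printed.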
 
Thanks, I just did a bit of googling on that one and got a bit lost. All I could see was to make sure there is enough space on the hard drive and maybe expand the pagefile size?

Then I saw quite a few threads still talking about it being related to graphics drivers.

Do you have anything to add? Given you spotted it, you already know more than me 😀

Thank you
 
Sorry, I've seen you edited now. No idea what those others are; this is based on a clean install of Windows with minimal things installed due to all my troubleshooting. So no idea what is causing them.

I'm actually going to do another fresh install with a different NVMe tomorrow and try with an AMD card. I've seen this error far too much, and it makes me worry even more about upgrading to a 40 series.


(For reference, I just took my 3090 out of the new system and put it back in the old system, and it's been running like a dream.)
 
Note: it looks like Microsoft updated the symcryptk library on Sept 5, 2024 to make the library resist attack by quantum computers. The caller of the library has to do the memory allocations, so you might be running something old that is calling into the new library incorrectly, i.e. a build mismatch of some sort.


I saw an encryption library in your driver list (symcryptk.dll). The system had bugchecked after 8 seconds.
The internal error indicated an out-of-resources error. You might clear pagefile.sys and reboot.
Other than that, you would have to provide a kernel memory dump file (memory.dmp), which would contain the proper debug info.
Here is the library: https://github.com/microsoft/SymCrypt
I have looked at a lot of memory dumps and this is the first one where I have seen these drivers being used.
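
To check which kind of dump Windows is currently set to write, you can read the `CrashDumpEnabled` value under the `CrashControl` registry key. A rough sketch, assuming the documented values (0 none, 1 complete, 2 kernel, 3 small/minidump, 7 automatic); the function names are made up for this example, and the registry read only runs on Windows:

```python
import sys

# Registry location of the crash dump settings (under HKEY_LOCAL_MACHINE).
CRASH_CONTROL_KEY = r"SYSTEM\CurrentControlSet\Control\CrashControl"

# Meaning of the CrashDumpEnabled value.
DUMP_TYPES = {
    0: "none",
    1: "complete memory dump",
    2: "kernel memory dump",
    3: "small memory dump (minidump)",
    7: "automatic memory dump",
}


def describe_dump_setting(value: int) -> str:
    """Translate a CrashDumpEnabled value into a readable name."""
    return DUMP_TYPES.get(value, f"unknown ({value})")


def read_crash_dump_setting():
    """Return CrashDumpEnabled from the registry, or None when not on Windows."""
    if sys.platform != "win32":
        return None
    import winreg  # Windows-only standard library module

    with winreg.OpenKey(winreg.HKEY_LOCAL_MACHINE, CRASH_CONTROL_KEY) as key:
        value, _value_type = winreg.QueryValueEx(key, "CrashDumpEnabled")
    return value


setting = read_crash_dump_setting()
if setting is not None:
    print(describe_dump_setting(setting))
```

If this reports a small memory dump, switching to a kernel memory dump in the Startup and Recovery settings is what produces the memory.dmp file asked for above.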
 
Ahh, thanks for that. I'll capture a full kernel dump file if I'm still having issues.

Interesting about SymCrypt. I'd never heard of it till now, but from my quick search it seems to be pretty baked into Windows. I assume it's something that is usually there?


https://techcommunity.microsoft.com/t5/security-compliance-and-identity/microsoft-s-quantum-resistant-cryptography-is-here/ba-p/4238780#:~:text=SymCrypt is Microsoft's main cryptographic,Stack HCI, and Azure Linux.

I think some app on your computer has installed these libraries and kernel files, maybe to access Azure? They are not on normal machines after a Windows install.
I do not see anything special about your machine.
11: kd> !sysinfo machineid
Machine ID Information [From Smbios 3.4, DMIVersion 0, Size=1900]
BiosMajorRelease = 5
BiosMinorRelease = 32
BiosVendor = American Megatrends International, LLC.
BiosVersion = 2.08
BiosReleaseDate = 03/01/2024
SystemManufacturer = ASRock
SystemProductName = B650M Pro RS WiFi
SystemFamily = Default string
SystemVersion = Default string
SystemSKU = Default string
BaseBoardManufacturer = ASRock
BaseBoardProduct = B650M Pro RS WiFi

It looks like some of these files were released to selected Windows Insider clients using the Canary Channel build in June of 2023.

Some ASUS machines were blocked from getting the build because of boot issues.

You had this driver running:
win32kbase_rs.sys
This relates to Rust in the Windows kernel: "Specifically, win32kbase_rs.sys contains a new implementation of GDI region. While this is a small trial, we will continue to increase the usage of Rust in the kernel."
 
Hmm, OK. Well, I'm going to do a fresh install of Windows today anyway as I have a new NVMe turning up. I use the Windows Media Creation Tool to make a bootable USB, so unless this is included in that, I will make sure to pause Windows updates as soon as it's loaded.

As for everything else, I am going to really limit what I install until I'm comfortable with the stability, to make troubleshooting easier.

Funny thing is I can probably name everything I installed.

HWINFO
OCCT
MSI Afterburner
DDU
CHROME
WINDBG
L-Connect
G-HUB
Battle.net + Overwatch 2
EA APP + BF4 and BF2042
 
That's very wise. Test it as much as you can with just Windows, all updates, and all drivers installed. I quite often see people do a reinstall that includes all their third-party stuff - and they quite often end up reinstalling the problem.
 
Been trying to limit stuff for sure, but there is going to be some stuff that will generally be needed.


So far with this build, as of today I have replaced every piece of hardware to narrow things down. I'm going from Nvidia to AMD for the GPU and am basically going to build it from scratch today, doing all the usual troubleshooting steps. Fingers crossed this is the last time!
 
Rather than installing everything in one go, install Windows and see if it is stable. Then install one program at a time, making sure Windows stays stable. If issues happen again, you know the culprit is the latest change you made between being stable and being unstable.
Good shout, that is the plan.
 
So little update.

I think it's done. I think it's all good. Been running with the new hardware and so far everything is fine. I'm still going to take it slow setting it all up software-wise to keep testing, but yeah, I think it's done.

Now the sad part is I know someone else is going to have this error, and they are going to be like me, googling and reading everything they can on it, trying to fix it themselves.

This thread will be found and it may not even help them, as I didn't 'fix it'; I changed to an AMD GPU lol

Man, this 'little upgrade/case swap' really snowballed and I've ended up with a 100% new PC. The only thing from the old PC in it is one of my old NVMe drives, but I still have a brand new NVMe as well.

Oh well, thanks for everyone's help. Much appreciated.
 