Question: DPC_WATCHDOG_VIOLATION BSOD & Event ID 153, 14, 0 nvlddmkm.sys crashes?

Jul 31, 2025
Hello! I have pretty much been going through this hell of an issue for almost a month now. I got a new build around a month ago. For the first 2-3 weeks, everything was working flawlessly: gaming at ultra settings, no stutter, no lag, and most importantly, no crashes. Then one day I got a DPC_WATCHDOG_VIOLATION BSOD around 5 minutes into a game. Fast forward to today, and I can't run any game stably for more than 10 minutes. The same thing happens when I try to export a video. In Event Viewer I get errors from nvlddmkm, especially Event ID 0, 14 and 153, which I will go into in more detail below.

The system specs are as listed below:

CPU: Intel Core i9-14900k
GPU: ZOTAC RTX 4090 AMP Extreme Airo
RAM: 32 GB DDR5-4800, Kingston KF552C40-32 + 32 GB DDR5-4800, Kingston KF552C40-32 (64 GB RAM in total)
SSD: CT2000P5PSSD8, 1863.02 GB
All-in-One Cooler for the CPU
PSU: be quiet! Pure Power 12M 1000W
Motherboard: MSI PRO B760M-P
OS: Windows 11, 24H2
Monitor: Dell S2419HGF, 1920x1080, 144Hz

I have noticed the same issue being reported many times, especially for 4090. Here is what I have done so far:

  1. Enabled & disabled XMP in BIOS (issue persists);
  2. Disabled & enabled Hyperthreading and the CPU Turbo settings (issue persists, no difference);
  3. Used DDU to uninstall the driver in safe boot mode and installed an older version [537.58] (this made a few games more stable, but other games I tested would still crash with Event ID 0);
  4. Changed permissions for the nvlddmkm.sys file to Full Control for Users (issue remains);
  5. Turned on Debug Mode in Nvidia Control Panel;
  6. Switched to "Prefer Maximum Performance" in Nvidia Control Panel (no difference);
  7. Disabled Hardware Accelerated GPU Scheduling in Windows settings (This caused no BSODs but the crashes remain);
  8. Used MSI Afterburner to lower the GPU core & memory clocks by around 52 MHz (no difference);
  9. Changed PCIe Gen Mode in BIOS to both 4.0 and 3.0 (no difference);
  10. Uninstalled programs like G-Hub and Wallpaper Engine, switched HAGS off for the other programs that supported it (no difference);
  11. Disabled the integrated GPU in Device Manager (issue still persists);
  12. Uninstalled NVIDIA HD Audio in Device Manager (yet again no difference);
  13. Disabled High Precision Event Timer (others said it was the only workaround; no difference whatsoever);
  14. Ran OCCT tests for every single component, even at extreme. What crashed into the BSOD was VRAM. Could this indicate a hardware issue?
I also used DDU before installing the latest drivers; it did not change anything. I've also seen reports of 566.36 being the most stable driver for the 4090, but that did not change anything either. As for the errors in Event Viewer, I get these 3-4 specific errors from source nvlddmkm:



The description for Event ID 153 from source nvlddmkm cannot be found. Either the component that raises this event is not installed on your local computer or the installation is corrupted. You can install or repair the component on the local computer.

If the event originated on another computer, the display information had to be saved with the event.

The following information was included with the event:

\Device\Video3
Error occurred on GPUID: 100

The message resource is present but the message was not found in the message table.



The description for Event ID 14 from source nvlddmkm cannot be found. Either the component that raises this event is not installed on your local computer or the installation is corrupted. You can install or repair the component on the local computer.

If the event originated on another computer, the display information had to be saved with the event.

The following information was included with the event:

\Device\Video3
badfbadf(badfbadf) 00000000 00000000

The message resource is present but the message was not found in the message table



The description for Event ID 0 from source nvlddmkm cannot be found. Either the component that raises this event is not installed on your local computer or the installation is corrupted. You can install or repair the component on the local computer.

If the event originated on another computer, the display information had to be saved with the event.

The following information was included with the event:

\Device\Video3
Error occurred on GPUID: 100
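
For anyone comparing drivers, it can help to count which Event IDs actually show up in the pasted log text. Below is a minimal Python sketch, assuming the events are copied as plain text with the same "The description for Event ID ..." header shown above. The function name `tally_events` and the `sample` text (including the repeated Event ID 14) are made up for illustration.

```python
import re
from collections import Counter

# Header line Event Viewer prints when the message text for an event is missing.
EVENT_HEADER = re.compile(
    r"The description for Event ID (\d+) from source (\S+) cannot be found"
)

def tally_events(log_text: str, source: str = "nvlddmkm") -> Counter:
    """Count how often each Event ID from `source` appears in pasted log text."""
    counts = Counter()
    for match in EVENT_HEADER.finditer(log_text):
        event_id = int(match.group(1))
        if match.group(2) == source:
            counts[event_id] += 1
    return counts

# Hypothetical sample, not a real export from this system.
sample = """
The description for Event ID 153 from source nvlddmkm cannot be found.
The description for Event ID 14 from source nvlddmkm cannot be found.
The description for Event ID 0 from source nvlddmkm cannot be found.
The description for Event ID 14 from source nvlddmkm cannot be found.
"""

for event_id, count in sorted(tally_events(sample).items()):
    print(f"Event ID {event_id}: {count}")
```

Running the same tally on logs captured under different driver versions would show whether, for example, Event ID 153 really only appears with the newer drivers.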



As for the BSODs, the bugcheck error in Event Viewer is:

The computer has rebooted from a bugcheck. The bugcheck was: 0x00000133 (0x0000000000000001, 0x0000000000001e00, 0xfffff80784dc43b0, 0x0000000000000000).
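
The bugcheck line can be split into its code and four parameters. Here is a rough Python sketch, assuming the message format shown above. The helper name `parse_bugcheck` is made up, and the lookup tables only cover the 0x133 case from this thread, with the parameter 1 meanings taken from Microsoft's bug check reference for 0x133.

```python
import re

# Only the bug check seen in this thread; not a complete table.
BUGCHECK_NAMES = {0x133: "DPC_WATCHDOG_VIOLATION"}

# Meaning of parameter 1 for bug check 0x133.
DPC_WATCHDOG_PARAM1 = {
    0: "a single DPC or ISR exceeded its time allotment",
    1: "the system cumulatively spent an extended period at DISPATCH_LEVEL or above",
}

def parse_bugcheck(message: str):
    """Return (code, [param1..param4]) from a 'rebooted from a bugcheck' line."""
    match = re.search(r"bugcheck was: (0x[0-9a-fA-F]+) \(([^)]*)\)", message)
    if match is None:
        raise ValueError("no bugcheck code found in message")
    code = int(match.group(1), 16)
    params = [int(part.strip(), 16) for part in match.group(2).split(",")]
    return code, params

message = (
    "The computer has rebooted from a bugcheck. The bugcheck was: 0x00000133 "
    "(0x0000000000000001, 0x0000000000001e00, 0xfffff80784dc43b0, 0x0000000000000000)."
)

code, params = parse_bugcheck(message)
print(hex(code), BUGCHECK_NAMES.get(code, "unknown"))
if code == 0x133:
    print("parameter 1:", DPC_WATCHDOG_PARAM1.get(params[0], "unrecognized"))
```

In this case parameter 1 is 1, so the watchdog fired because of cumulative time spent at high IRQL rather than one runaway DPC. That alone does not say which driver is responsible; a memory dump would be needed for that.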



One thing to note is that I get the Event ID 153 error only on the latest drivers. Either way, I have been pulling my hair out trying to find a solution. If anyone has any idea or has had the same problem, I'd really appreciate any help! Thank you so much for your time!
 
RAM: 32 GB DDR5-4800, Kingston KF552C40-32 + 32 GB DDR5-4800, Kingston KF552C40-32 (64 GB RAM in total)
You mixed ram? Remove the stick that isn't part of the kit.
If you desire 64GBs, buy a kit of 64GBs and consider selling off the old kits to recoup some of the cost.

CPU: Intel Core i9-14900k
Motherboard: MSI PRO B760M-P
That's a pretty bare-looking board to pair with a 14900K, and the fact that problems show up within 10 minutes makes me believe heat is one of the issues.
The i9 should be paired with a motherboard that at least has heatsinks over the VRM area, or else the cpu won't run at its best all the time.

Hello! I have pretty much been going through this hell of an issue for almost a month now. I got a new build around a month ago. For the first 2-3 weeks, everything was working flawlessly...
Ambient temperatures have gone up since then, right?
The AC is running?
 
Both of the RAM sticks are the exact same model, and they both came with a 64 GB kit (2x32 GB). I am gonna try to remove one stick and test the issue, then report back.

As for the CPU and motherboard, MSI claims that the PRO B760M-P supports 12th, 13th and 14th generation Intel CPUs, but I am not sure it's a good match for an i9. What motherboard would be a balanced match for the i9? I do have an All-in-One liquid cooler for the CPU, and the temps usually never get extreme (around 55-70 degrees under load).

And as for the ambient temperatures, I keep my room pretty cooled so I think we can keep that ruled out.

I ran heavy benchmark tests using OCCT for every single component. The CPU was perfectly fine. RAM, perfectly fine. PSU, no issues. The GPU also had no issues. What sent the system into a BSOD (DPC_WATCHDOG_VIOLATION specifically) was VRAM: the system crashed less than 40 seconds into the VRAM test.

I am starting to believe that this is a GPU hardware problem, a motherboard problem, or a problem with the connection between the GPU and PSU, though I think that last one is pretty unlikely in this situation.

Either way, thank you for your reply!
 
Both of the RAM sticks are the exact same model, and they both came with a 64 GB kit (2x32 GB). I am gonna try to remove one stick and test the issue, then report back.
AH. With the way the specs were typed out, I thought the sticks were packaged separately. My bad!

As for the CPU and motherboard, MSI claims that the PRO B760M-P supports 12th, 13th and 14th generation Intel CPUs, but I am not sure it's a good match for an i9. What motherboard would be a balanced match for the i9? I do have an All-in-One liquid cooler for the CPU, and the temps usually never get extreme (around 55-70 degrees under load).
Empty words from MSI; they and the other vendors are not going to flat out tell the truth. The associate probably just looked at the cpu support list and said, "Yep, it's there!". You have to find out the hard way when it's in the build and running:
Old article, but it's relevant here.

[Images: 12900-stock-s.png, 12900-nolimit-s.png]


-Some boards throttle according to manufacturer spec.
-Some boards have their limits preset higher than they can reasonably handle.
-Some boards have their limits preset lower than default, because they know the thing can't handle more than an i3/i5.


You can monitor things such as:
-core clocks. By default, the maximum frequency is supposed to scale up/down depending on how many cores are active. Fewer cores, higher maximum. Many/all cores, lower maximum.
-power limits. 125 W base, 253 W turbo boost for the 14900K.

If the motherboard is thermal/power throttling, it can look like something's wrong with the cpu, even if there isn't.
 

Thank you for the reply!

Yeah, I agree with you; MSI and all the other vendors don't ever wanna tell the truth. Honestly, seeing these issues and how much bloatware MSI installs on its own makes me want to swap this motherboard (I reinstalled Windows, and the MSI "driver utility", which is enabled by default in BIOS, updates the drivers but also gives you a bunch of bloatware).

As for the CPU, the last time I tried the default Intel settings in BIOS for the cores, I noticed that it was capped at 153W. I tried setting the limit to 253W, and that only made Windows crash every time it loaded the desktop. Forgive me for my incompetence, I am not very good when it comes to tweaking these kinds of things. I tried setting a 53x limit for the P-Cores in BIOS (default was 57x), but there were still crashes. The last time I ran a physics benchmark that somebody recommended to me, I found that the CPU performance climbed very fast instead of steadily.

These are default BIOS settings: https://drive.google.com/drive/folders/1mGRUuOGx7K3TaQFDNm0ngzGq6VCRAhZf?usp=sharing
Do you perhaps have any sort of guide on how to properly tweak the CPU in BIOS? I'd honestly really appreciate it!
 
How high do gpu core, hot spot, and memory junction temperatures get?
Have you tried completely uninstalling/reinstalling the GPU?

As for the CPU, the last time I tried the default Intel settings in BIOS for the cores, I noticed that it was capped at 153W. I tried setting the limit to 253W, and that only made Windows crash every time it loaded the desktop. Forgive me for my incompetence, I am not very good when it comes to tweaking these kinds of things. I tried setting a 53x limit for the P-Cores in BIOS (default was 57x), but there were still crashes. The last time I ran a physics benchmark that somebody recommended to me, I found that the CPU performance climbed very fast instead of steadily.

These are default BIOS settings: https://drive.google.com/drive/folders/1mGRUuOGx7K3TaQFDNm0ngzGq6VCRAhZf?usp=sharing
Do you perhaps have any sort of guide on how to properly tweak the CPU in BIOS? I'd honestly really appreciate it!
With a turbo power limit that low, it's very likely that the board throttles and forces the cpu to drop in frequency too... shouldn't crash though; might just get a little choppy, or seem sluggish at times.
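
To put that cap in perspective, here is a rough Python sketch that compares a board's configured limit with the 125 W base / 253 W turbo figures quoted earlier in the thread. The function names and the category labels are made up for illustration; this is simple arithmetic, not a measurement of what the board can actually sustain.

```python
BASE_POWER_W = 125.0   # 14900K base power, as quoted in this thread
TURBO_POWER_W = 253.0  # 14900K turbo power limit, as quoted in this thread

def classify_power_limit(limit_w: float) -> str:
    """Place a configured power limit relative to the base and turbo figures."""
    if limit_w < BASE_POWER_W:
        return "below base power"
    if limit_w < TURBO_POWER_W:
        return "between base and turbo power"
    return "at or above turbo power"

def turbo_range_fraction(limit_w: float) -> float:
    """Fraction of the base-to-turbo range the limit allows (0.0 = base, 1.0 = turbo)."""
    return (limit_w - BASE_POWER_W) / (TURBO_POWER_W - BASE_POWER_W)

cap_w = 153.0  # the cap reported with the default BIOS settings
print(classify_power_limit(cap_w))
print(f"{turbo_range_fraction(cap_w):.0%} of the base-to-turbo range")
```

A 153 W cap sits only about a fifth of the way from base to turbo power, which fits the expectation that the CPU gets held well below its maximum boost on this board.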

Unfortunately, I don't have a guide, but instead of changing the core ratio, try a negative core voltage offset (you can see it near the bottom of your 3rd snapshot), or search for undervolt guides. Set core voltage offset mode to (-), and core voltage offset to 0.05 V.

Disabled Hardware Accelerated GPU Scheduling in Windows settings (This caused no BSODs but the crashes remain)
What kind of crashes, exactly? Crash to desktop? Screen just goes black, or screen goes black, but can still hear sounds? Screen just freezes?
 
Thank you once more for the reply!

I ran an extreme stability test on the GPU to see its temperatures, and I found out that the GPU core temperature sits from 70°C to 75°C, the memory junction is around 69°C to 72°C, and the hotspot varies from 80°C to 87°C.

I can't be certain whether this is a motherboard and CPU mismatch issue, but I do just hope that it's not an issue related to the GPU (even though the chances of it being a GPU problem are the highest here 😭).

Basically, before I reformatted Windows 11, games (e.g., Cyberpunk 2077, Forza Horizon 5, etc.) would run for less than 5 minutes at best, and with Hardware Accelerated GPU Scheduling enabled they would end in a DPC_WATCHDOG_VIOLATION Blue Screen of Death. Disabling HAGS did not remove the BSODs completely, but it did reduce their number. Without HAGS, I would again last less than 5 minutes in games: the system would freeze while audio kept playing in the background, the screen would go black after about 10 seconds of freezing, and I would get kicked back to my desktop. Checking Event Viewer for the cause, I found errors regarding nvlddmkm.sys, specifically Event ID 153 & 14 (with the newer drivers) and Event ID 0 (with the older drivers).

When I ran into BSODs, Event Viewer would give this description:



The computer has rebooted from a bugcheck. The bugcheck was: 0x00000133 (0x0000000000000001, 0x0000000000001e00, 0xfffff804bbfc43b0, 0x0000000000000000).



From the research I've done, the code 0x133 means DPC_WATCHDOG_VIOLATION, which seems to point to a problem with the GPU & its drivers (at least in this case).

The Event ID 153 & 14 errors from nvlddmkm.sys would give these descriptions:



The following information was included with the event:

\Device\Video3
Reset TDR occurred on GPUID:100



The following information was included with the event:

\Device\Video3
0228349c 00000000 00000000 2028f6d8 2028f4e2 2026f9ce 202900cc 2026bafa



I powered the PC off completely and reseated the GPU in its PCIe slot. The issues still persist. I tried reinstalling drivers of all versions to see which would be most stable; all would crash, but the most stable one was 537.58. I am honestly so worn out, because it has been less than a month since I got this PC brand new, and seeing it run into such errors in that time makes you question your choices. 🥲

Thank you for the time, though; I really appreciate all your effort!
 
Yeah, sorry I've not been of much help.

I ran an extreme stability test on the GPU to see its temperatures, and I found out that the GPU core temperature sits from 70°C to 75°C, the memory junction is around 69°C to 72°C, and the hotspot varies from 80°C to 87°C.
Those are all very good.

I can't be certain whether this is a motherboard and CPU mismatch issue, but I do just hope that it's not an issue related to the GPU (even though the chances of it being a GPU problem are the highest here 😭).
Well, there might be multiple things happening, which would be a headache to try and resolve.

When I ran into BSODs, Event Viewer would give this description:
[things]
The first one is memory. As for all the ones after it, even after checking other threads I can't tell what the hell's going on, what with the different individual solutions (or never-resolved ones).