Question BSOD DPC Watchdog Violation

Apr 3, 2025
Hello,
For several months I've had recurring DPC Watchdog Violation blue screens without understanding the cause. They had largely calmed down for several months but then resumed, which led me to completely reinstall Windows. However, shortly after the reinstall, everything started up again as before.
I started collecting the .dmp files, and below you'll find all the drivers implicated by the WinDbg analysis.
My first thought was a hardware cause, but I ran all the hardware tests offered in the BIOS. I also ran long stress tests on the chipset, processor and dedicated graphics card, with no apparent anomaly.
I can't pin down a precise cause: it happens just as easily in a PDF reader, Word, Firefox, or even with no software running.

At first, I thought it might be an interaction between the graphics card drivers (see my configuration below). However, I don't think there's any connection in the end. I've disabled the Nvidia graphics card in the BIOS and uninstalled its driver with DDU, but the problem still persists.

For your information, here's the list of drivers implicated by WinDbg, as well as the WinDbg analysis results: https://www.swisstransfer.com/d/788b9dd9-782c-468f-ba2c-004018e7972f
All drivers are up to date.

I know it's a lot, but if anyone has any ideas, I'd love to hear from you. Of course I'm at your disposal if you need more info.


Many thanks in advance!

--------------------------

Configuration
Windows 11 24H2
Lenovo Yoga Pro 7
Ryzen 7 7840HS with AMD 780M
Nvidia GeForce 3050 6GB


28/02/25

MODULE_NAME: nt

IMAGE_NAME: ntkrnlmp.exe

01/03/25

MODULE_NAME: msgpioclx

IMAGE_NAME: msgpioclx.sys

07/03/25

MODULE_NAME: amdgpio2

IMAGE_NAME: amdgpio2.sys

17/03/25: Nvidia GPU disabled from Lenovo control panel

MODULE_NAME: amdgpio2

IMAGE_NAME: amdgpio2.sys

21/03/25: Nvidia GPU disabled from Lenovo control panel

MODULE_NAME: dxgmms2

IMAGE_NAME: dxgmms2.sys

22/03/25: Nvidia GPU disabled from BIOS, Lenovo-provided chipset driver reinstalled.

MODULE_NAME: msgpioclx

IMAGE_NAME: msgpioclx.sys

24/03/25: Nvidia GPU disabled from BIOS, Lenovo-provided chipset driver reinstalled.

MODULE_NAME: dxgmms2

IMAGE_NAME: dxgmms2.sys

30/03/25: uninstalled Lenovo drivers and installed AMD's recommended WHQL drivers, Nvidia disabled in BIOS

MODULE_NAME: amdgpio2

IMAGE_NAME: amdgpio2.sys
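
To see at a glance which modules come up most often, the MODULE_NAME lines can be tallied with a short script. This is only a sketch: the function name `tally_modules` is made up for illustration, and the input is just the module names from the list above rather than the full WinDbg output.

```python
import re
from collections import Counter

# MODULE_NAME values copied from the list above, in chronological order.
ANALYSIS_TEXT = """
MODULE_NAME: nt
MODULE_NAME: msgpioclx
MODULE_NAME: amdgpio2
MODULE_NAME: amdgpio2
MODULE_NAME: dxgmms2
MODULE_NAME: msgpioclx
MODULE_NAME: dxgmms2
MODULE_NAME: amdgpio2
"""


def tally_modules(text: str) -> Counter:
    """Count how often each module appears on a MODULE_NAME line."""
    names = re.findall(r"^MODULE_NAME:\s*(\S+)", text, flags=re.MULTILINE)
    return Counter(names)


if __name__ == "__main__":
    for module, count in tally_modules(ANALYSIS_TEXT).most_common():
        print(f"{module}: {count}")
```

On these eight dumps the two GPIO modules (amdgpio2 and msgpioclx) account for five of the crashes, which is at least a hint about where to look first.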
 
Note: if I were to guess at a fix, I would think you have a motherboard sound driver corrupting a GPU HD sound driver and overflowing its stack, causing the GPU driver to fail via a timeout. The fix would be to update the motherboard sound driver, then disable the sound devices that you do not use (motherboard HD sound, GPU HD sound, and CPU HD sound). I would be looking for an old Realtek HD sound driver.

Note: there are also cases where people plug Android phones into the GPU's USB port and get problems like this.
------------

From these bugchecks, I would run this command in the debugger:
lmiftsm
It will list all of the drivers in alphabetical order.
I would look for overclocking drivers, or two copies of an overclocking driver.

General fix: update the BIOS, go to the motherboard vendor's website and update the drivers for the CPU chipset, and check whether there are any firmware updates for your machine. I would remove any USB device that is connected to your GPU. Update any old Realtek sound driver on your motherboard. Update the GPU driver (including the GPU sound driver), then disable any sound device that does not have a speaker/headphone connected to it.

After that I would change the memory dump type to kernel, then provide the actual memory.dmp file so it can be looked at in a Windows debugger (kernel dumps are much larger than minidumps).
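
To double-check which dump type is currently configured, the setting can be read from the registry. The sketch below assumes the usual location, the CrashDumpEnabled value under HKLM\SYSTEM\CurrentControlSet\Control\CrashControl; the helper names are made up for illustration, and the registry read only runs on Windows.

```python
import sys

# Documented meanings of the CrashDumpEnabled registry value.
DUMP_TYPES = {
    0: "None",
    1: "Complete memory dump",
    2: "Kernel memory dump",
    3: "Small memory dump (minidump)",
    7: "Automatic memory dump",
}

CRASH_CONTROL_KEY = r"SYSTEM\CurrentControlSet\Control\CrashControl"


def describe_dump_type(value: int) -> str:
    """Translate a CrashDumpEnabled value into a readable name."""
    return DUMP_TYPES.get(value, f"Unknown ({value})")


def read_dump_type():
    """Return the current CrashDumpEnabled value, or None when not on Windows."""
    if sys.platform != "win32":
        return None
    import winreg  # Windows-only standard library module

    with winreg.OpenKey(winreg.HKEY_LOCAL_MACHINE, CRASH_CONTROL_KEY) as key:
        value, _value_type = winreg.QueryValueEx(key, "CrashDumpEnabled")
    return value


if __name__ == "__main__":
    current = read_dump_type()
    if current is None:
        print("Not running on Windows; nothing to read.")
    else:
        print(f"CrashDumpEnabled = {current}: {describe_dump_type(current)}")
```

A value of 2 corresponds to the kernel dump asked for here. The same setting is available in the Startup and Recovery dialog, so the script is only a convenience check.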

-------------
march 30 txt file:

Looked at the first one: the stack was corrupted, and the device had a strange timeout value (7680 clock tick timeout).
Normally you would issue this command in the debugger:
!dpcwatchdog
For the corrupted stack you would run this command:
dqs fffff807`493f2a08-300 fffff807`493f2a08+300
and see if you can see a symbol on the raw stack
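
For anyone who wants the literal addresses of that window, the arithmetic is easy to script. This is a sketch with made-up helper names; it assumes the offsets are hexadecimal (the debugger's default radix) and, as far as I know, that a start/end range is written with the lower address first.

```python
def parse_windbg_address(text: str) -> int:
    """Parse a 64-bit address written WinDbg-style, e.g. fffff807`493f2a08."""
    return int(text.replace("`", ""), 16)


def format_windbg_address(value: int) -> str:
    """Format an integer as a 64-bit address with the backtick separator."""
    digits = f"{value:016x}"
    return f"{digits[:8]}`{digits[8:]}"


def dqs_window(center: str, span: int = 0x300) -> str:
    """Build a dqs command covering `span` bytes on each side of `center`."""
    address = parse_windbg_address(center)
    low = format_windbg_address(address - span)
    high = format_windbg_address(address + span)
    return f"dqs {low} {high}"


if __name__ == "__main__":
    # prints: dqs fffff807`493f2708 fffff807`493f2d08
    print(dqs_window("fffff807`493f2a08"))
```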
-----------

march 24.txt
Also a watchdog timer going off, most likely in a USB device, likely one attached to a GPU (500 tick timer).
The stack is also corrupted at another point.
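
To put the two tick counts in perspective, they can be converted to seconds. The sketch below rests on two assumptions that are not in the dumps themselves: that one clock tick is 15.625 ms (the common 64 Hz default on Windows) and that the counts quoted here are decimal. If the debugger printed them in hex, the results would be different.

```python
# Assumption: one clock tick is 15.625 ms (a 64 Hz system timer), which is
# the usual default on Windows but can differ on a given machine.
ASSUMED_TICK_MS = 15.625


def ticks_to_seconds(ticks: int, tick_ms: float = ASSUMED_TICK_MS) -> float:
    """Convert a watchdog tick count to seconds for the assumed tick length."""
    return ticks * tick_ms / 1000.0


if __name__ == "__main__":
    for ticks in (7680, 500):
        print(f"{ticks} ticks ~ {ticks_to_seconds(ticks):.1f} s")
```

Under those assumptions, 7680 ticks is 120 seconds and 500 ticks is a little under 8 seconds.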
-----------
march 22.txt
Another watchdog timer going off; bugchecked in the Windows side of the Windows Driver Framework
(an interface to some drivers; ignore this one, Firefox was running).
 
Thanks so much for your quick reply!
Ok, so here's a summary of what I did:
- Uninstalled all the graphics card drivers cleanly and reinstalled everything to the latest version
- Uninstalled the Realtek sound driver, and reinstalled it at the latest version I could find (see screenshot here https://www.swisstransfer.com/d/906e3230-04d5-4b37-aafd-5935d72442ea)
- Disabled AMD HD audio and AMD streaming Audio Device.
- Changed dmp type to kernel
Concerning USB peripherals: I'm on a laptop and I rarely use any. Sometimes I plug it into an external monitor and use an external keyboard/mouse, and also an Ethernet/USB-C adapter. But I haven't noticed the crashes being more common under these conditions; I'm not even sure I've ever had one like that.

I haven't found any duplicate drivers, and I don't overclock my graphics card. The lmiftsm results are here for info: https://www.swisstransfer.com/d/c2004440-ff96-4315-805f-4fa2ce8b0b69

I ran the commands you asked for; unfortunately, interpreting the output is beyond my level, sorry. Here are the results (plus screenshots of what I reinstalled and disabled): https://www.swisstransfer.com/d/906e3230-04d5-4b37-aafd-5935d72442ea

Many thanks again!

PS: BSODs usually happen once or twice a week, so I think 2 weeks without anything would already be a good sign, but I'll let you know if there's anything new.
 
Looks like the problem is coming from something directly on your motherboard. I would update the BIOS and drivers from the motherboard vendor's website. Be sure to look for updates that include firmware. After the updates, I would disable hardware devices in the BIOS that you are not using, and turn off any special tuning options the BIOS might have. Something is running outside of any driver, at kernel address FFFFF80776E372E6 (firmware, or a BIOS program).

For example, Asus Armoury Crate can use the BIOS to put a file on your drive before Windows boots, which can cause various problems.
 
Your system has a lot of new AMD drivers that I have never seen before.
Generally I don't like seeing this driver installed:
AtihdWT6.sys Tue Apr 30 16:51:17 2024
You might go into Windows Device Manager, find this sound device, and see if you can change the driver to the Microsoft generic driver. This is just a guess at a fix for your problem.
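
When scanning a long driver listing for old entries, the timestamps can be parsed instead of read by eye. This sketch assumes each line has the same shape as the AtihdWT6.sys line quoted here (file name, then weekday, month, day, time, year); the real lmiftsm output has more columns, so the function names and the input format are illustrative only.

```python
from datetime import datetime

TIMESTAMP_FORMAT = "%a %b %d %H:%M:%S %Y"


def parse_driver_line(line: str):
    """Split 'name.sys Tue Apr 30 16:51:17 2024' into (name, datetime)."""
    name, _, stamp = line.strip().partition(" ")
    return name, datetime.strptime(stamp.strip(), TIMESTAMP_FORMAT)


def age_in_days(stamp: datetime, reference: datetime) -> int:
    """Whole days between the driver timestamp and a reference date."""
    return (reference - stamp).days


if __name__ == "__main__":
    name, stamp = parse_driver_line("AtihdWT6.sys Tue Apr 30 16:51:17 2024")
    # Reference date: the day this thread was opened (Apr 3, 2025).
    print(name, stamp.date(), age_in_days(stamp, datetime(2025, 4, 3)), "days old")
```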

The bugcheck was in the general-purpose input/output (GPIO) drivers, which handle slow devices on your motherboard. Sound and microphone would be such devices.
You also need to make sure your BIOS is updated to match the drivers that are installed.
You can look here and see if you can find the updated AMD GPIO package:
https://www.amd.com/en/support/download/drivers.html
Your version looks pretty current: Aug 20 08:25:33 2024
Note: look in the BIOS and see if there is a setting for HDMI chipset configuration. You might disable it.
(Sorry, this is just another guess at what could be using the IO on the motherboard and corrupting the stack.)
 
I looked up AtihdWT6.sys: it is used by the AMD High Definition Audio Service, which I have disabled.

I'll take a look at the drivers and let you know.

The BIOS options are rather limited; I didn't find anything about an HDMI chipset. I have the “BIOS self-healing” option enabled and BIOS back flash disabled. What I did think of, however, is that there is an option for UMA graphics memory. It was set to 2 GB by default; I set it to 1 GB to recover RAM, knowing that the VRAM of the dedicated graphics card takes over when I need more graphics power. If possible I'd like to keep it as it is, but do you think this could be the root of the problem?

And considering that updating the BIOS carries a risk, is it worth waiting to see whether the bug is already fixed by the changes so far, in your opinion? Or do you think we'll have to go that way anyway?
 
Generally, updating the BIOS is not the problem it was years ago. The BIOS needs to match the version of the drivers that you have installed; if they do not match, and the interface changed between the one used by the BIOS and the one used by the drivers, you get stack corruption and buffer overflows. Best to do all the updates at the same time: firmware, then BIOS, then all the motherboard drivers, then turn off things in the BIOS that you do not use. This is the best method, since if you bugcheck, your only real fix is this same process.

Newer systems route a lot of the motherboard traffic through the PCIe bus. Now you can get something like a bad USB driver causing timeouts on your video, even though years ago they would have been unrelated systems.
I have even looked at a system that would bugcheck the video driver if you just kept shaking the mouse for about 5 seconds. The mouse would move, but the video would time out if it did not get control for 2 seconds: the mouse (slow device) held up the video (fast device). It was pretty funny, and was caused by someone moving a Windows drive into a new system without reinstalling Windows.
 
Wow, I didn't think things could be so intertwined!
Thanks a lot, I'll do that and then afterwards I guess I either have to wait for the next BSOD or be happy that they don't appear anymore. I'll let you know how it goes.
And as for the memory of the integrated graphics, is it worth increasing or not?

Thanks again!
 
Things get pretty ugly now that the Windows sleep functions save the state of everything in pagefile.sys and reload it on reboot.
It is great if everything is working, but with bugs and malware it causes a lot of problems. Often, telling people to google how to make Windows delete the pagefile on reboot, and having them make the registry change, can make a system much more stable.
(Bad RAM and bad drivers often take many hours to bugcheck your system; saving the corruption to disk and reloading it on reboot just means that you will eventually get a bugcheck or failure.)
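
The registry change usually meant here is, as far as I know, the ClearPageFileAtShutdown value under the Session Manager's Memory Management key. The sketch below only generates the text of a .reg file for it and does not touch the registry; the function name is made up for illustration.

```python
MEMORY_MANAGEMENT_KEY = (
    r"HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control"
    r"\Session Manager\Memory Management"
)


def clear_pagefile_reg(enabled: bool = True) -> str:
    """Return .reg file text that sets ClearPageFileAtShutdown to 1 or 0."""
    value = 1 if enabled else 0
    return (
        "Windows Registry Editor Version 5.00\n"
        "\n"
        f"[{MEMORY_MANAGEMENT_KEY}]\n"
        f'"ClearPageFileAtShutdown"=dword:{value:08x}\n'
    )


if __name__ == "__main__":
    print(clear_pagefile_reg())
```

Importing the resulting file needs administrator rights, and shutdown can take noticeably longer with the setting enabled. Calling `clear_pagefile_reg(False)` produces the text that turns it back off.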