[SOLVED] Strange DPC WATCHDOG VIOLATION at random intervals with barely any consistency.

Status
Not open for further replies.

tamalero

Distinguished
Oct 25, 2006
898
4
18,985
1
Hello folks!
I consider myself kinda tech savvy, but this issue is baffling me because it has been ongoing since I upgraded to the TRX40 platform from AMD.

First.. my Specs:

AMD Threadripper 3960X with Noctua TR40 NH14
Gigabyte TRX40 Aorus PRO WIFI
4x DDR4 4000 @ 3600 Kingston HyperX Samsung Bdie modules. (also had tried Corsair Samsung Bdie 3800 modules as well)
Zotac RTX 2070 Super
Asus XONAR DX (now replaced with a Sound Blaster Z from CreativeLabs)
3x NVMe (250GB, 500GB, 1TB from different brands). 2 of them are PCIe v4.
1x SSD 500GB
1x 3TB HDD SATA
SolarFlare SFN6122F 2-Port 10Gb/s PCI-E 3.0 Ethernet (crashes used to happen before I got this card as well).
All together in a Thermaltake X9 case.


Now the issue:

BSOD with a DPC WATCHDOG VIOLATION on BOOT or on SLEEP recovery.
Now, before you say anything, this is the strange part:
There is NO REAL consistency in the crashes.
The computer can crash ANYTIME on cold boot or on recovering from sleep.
But I've sometimes gone up to 2 weeks with no crash (and I use the computer consistently for more than 12 hours a day). On the other hand... sometimes it can BSOD more than 4 times in a single day!

The only consistent data is:
  1. Crashing on COLDBOOT (the most common). The computer will boot, reach the Windows login screen, and then slowly start to freeze. First the login screen stops reacting, then the keyboard stops working, then the mouse, until it freezes completely. Seconds later it will BSOD with DPC WATCHDOG VIOLATION.
  2. Crashing on COLDBOOT. After booting and logging into Windows, everything loads normally (windows, screens, programs, etc.). Then, after a few seconds, Explorer windows stop responding. CTRL+ALT+ESC will work once or twice and then stop working. One after another, applications and Windows system apps stop working, until the keyboard and mouse stop working too. Seconds later it will BSOD with a DPC WATCHDOG VIOLATION. The crashes are NEVER instant.
  3. Upon Sleep. Similar to crash #2. As soon as the computer is out of sleep, it takes a few moments to start stalling before crashing.
If the computer does not crash in the first 5 minutes after booting or returning from sleep, everything will be fine. The system will be very stable, apart from some very rare game hard crashes (hello Cyberpunk... or Vermintide) and the VERY RARE KMODE crash (only 5 crashes in more than a year, and I'm a heavy user, running games while editing video with 3 screens).
The Windows logs say NOTHING. Just that the system had not shut down properly.
I've done countless memtests ranging from 4 hours to a full 36 hours. No errors.
My CPU is NOT overclocked. I have also tried manual timings on my RAM, automatic timings, XMP and the Ryzen memory calculator's results.
I have disabled most of the onboard crap so my other gear works fine (sound card, network card). The crashes happen regardless of whether they are enabled or not.
I have made multiple gear swaps over the months and still no go. Same errors. I have also tried multiple drivers for my video card, network card and chipset.
The only other issue, random but kinda consistent over the months, is randomly getting a "Your battery is low" warning from my UPS even when it is at 100% charge. Unlike the BSOD, this one can appear anytime, anywhere in Windows.

The only thing I have not done yet is to reinstall Windows from zero again and replace my video card, then hope for the best (which I will do as soon as my 3080 FTW3 arrives).

Could I have a defective CPU? or a DEFECTIVE Motherboard?
Is there a way to get information other than the generic and pretty much useless BSOD code screens from Windows 10?
When I use BLUESCREENVIEW it always points at ntoskrnl.exe with entries like:
  Filename: ntoskrnl.exe
  Address In Stack: ntoskrnl.exe+1f4c1b
  From Address: fffff8071fc00000
  To Address: fffff807206b5000
  Size: 0x00ab5000
  Time Stamp: 0x9b5f0a62
  Time String: 8/7/2052 8:30:58 PM
  Product Name: Microsoft® Windows® Operating System
  File Description: NT Kernel & System
  File Version: 10.0.18362.1256 (WinBuild.160101.0800)
  Company: Microsoft Corporation
  Full Path: C:\Windows\system32\ntoskrnl.exe
No other file is ever mentioned.

One thing that I noticed clearly was that the dates in the TIME STRING column go beyond my current date. Some go beyond 2050!!!
The file dates always go bonkers after this file (the last normal one): C:\Windows\system32\drivers\wfplwfs.sys
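
Those future dates are not necessarily a sign of corruption. As far as I know, recent Windows 10 binaries are built reproducibly, and the timestamp field in the file header holds a hash-derived value rather than a real build time, so the Time String can land in any year. A minimal Python sketch, assuming the Time Stamp value is the usual 32-bit count of seconds since 1970 (UTC) and that the machine's clock was at UTC-6 (an assumption inferred from the output, not stated anywhere), reproduces the date shown for ntoskrnl.exe:

```python
from datetime import datetime, timedelta, timezone

# Time Stamp value copied from the BlueScreenView entry for ntoskrnl.exe.
raw = 0x9B5F0A62

# Assumption: the field counts seconds since 1970-01-01 00:00:00 UTC.
epoch = datetime(1970, 1, 1, tzinfo=timezone.utc)
utc_time = epoch + timedelta(seconds=raw)

# Assumption: the local clock was at UTC-6 when BlueScreenView rendered the date.
local_time = utc_time.astimezone(timezone(timedelta(hours=-6)))

print(utc_time.isoformat())    # 2052-08-08T02:30:58+00:00
print(local_time.isoformat())  # 2052-08-07T20:30:58-06:00
```

The local result matches the "8/7/2052 8:30:58 PM" Time String, so the tool is decoding the field faithfully; the value itself is just not a real date.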


Using WinDbg:

Code:
DPC_WATCHDOG_VIOLATION (133)
The DPC watchdog detected a prolonged run time at an IRQL of DISPATCH_LEVEL
or above.
Arguments:
Arg1: 0000000000000001, The system cumulatively spent an extended period of time at
    DISPATCH_LEVEL or above. The offending component can usually be
    identified with a stack trace.
Arg2: 0000000000001e00, The watchdog period.
Arg3: fffff80720171358, cast to nt!DPC_WATCHDOG_GLOBAL_TRIAGE_BLOCK, which contains
    additional information regarding the cumulative timeout
Arg4: 0000000000000000

ADDITIONAL_XML: 1

OS_BUILD_LAYERS: 1

BUGCHECK_CODE:  133

BUGCHECK_P1: 1

BUGCHECK_P2: 1e00

BUGCHECK_P3: fffff80720171358

BUGCHECK_P4: 0

DPC_TIMEOUT_TYPE:  DPC_QUEUE_EXECUTION_TIMEOUT_EXCEEDED

and

Code:
STACK_TEXT:
ffffbd81`aaaf9b08 fffff807`1fdf4c1b     : 00000000`00000133 00000000`00000001 00000000`00001e00 fffff807`20171358 : nt!KeBugCheckEx
ffffbd81`aaaf9b10 fffff807`1fc35107     : 000028ca`68580e14 ffffbd81`aaad9180 00000000`00000286 ffff8785`01d24e30 : nt!KeAccumulateTicks+0x1c18bb
ffffbd81`aaaf9b70 fffff807`1fb5f291     : 00000000`00000000 ffffe08f`3b028500 ffff8785`01d24eb0 ffffe08f`3b0285b0 : nt!KeClockInterruptNotify+0xc07
ffffbd81`aaaf9f30 fffff807`1fcfdaa5     : ffffe08f`3b028500 00000000`00000000 00000000`00000000 ffffb19b`e553b141 : hal!HalpTimerClockIpiRoutine+0x21
ffffbd81`aaaf9f60 fffff807`1fdc55aa     : ffff8785`01d24eb0 ffffe08f`3b028500 ffffe08f`5f167480 ffffe08f`3b028500 : nt!KiCallInterruptServiceRoutine+0xa5
ffffbd81`aaaf9fb0 fffff807`1fdc5b17     : ffff8785`01d250f8 ffff8785`01d24eb0 ffffe08f`3b028500 fffff807`1fdc5ba4 : nt!KiInterruptSubDispatchNoLockNoEtw+0xfa
ffff8785`01d24e30 fffff807`1fc2d3dd     : 00000000`00000010 00000000`00040282 ffff8785`01d24fe8 00000000`00000018 : nt!KiInterruptDispatchNoLockNoEtw+0x37
ffff8785`01d24fc0 fffff807`1fe070c6     : 00000048`063200e6 00000000`00000004 ffffbd81`aaaee000 00000000`00000003 : nt!KeYieldProcessorEx+0xd
ffff8785`01d24ff0 fffff807`1fc434ad     : 00000000`00000000 00000000`00000000 00000000`00000000 00000000`b4fd8b8d : nt!IopCompleteRequest+0x1a1a16
ffff8785`01d250e0 fffff807`1fdc7e40     : ffffbd81`aaad9100 00000000`00000000 00000000`00000000 ffffe08f`5f167480 : nt!KiDeliverApc+0x19d
ffff8785`01d251a0 fffff807`1fc64edb     : 00000000`00000000 fffff807`00000000 ffffe08f`4df4fa38 fffff807`22f386ff : nt!KiApcInterrupt+0x2f0
ffff8785`01d25330 fffff807`1ff6e0a9     : ffff8785`01d20061 ffff8785`01d254a8 00000000`00000004 01000000`00100000 : nt!ExFreeHeapPool+0x12b
ffff8785`01d25450 fffff807`1dd12011     : 00000000`c00000b5 fffff807`1dd17120 00000000`00000000 ffffe08f`67b04600 : nt!ExFreePool+0x9
ffff8785`01d25480 fffff807`1dd1c193     : 00000000`00000001 ffffbd81`b07f3580 00000000`00000208 00000000`00000000 : hidusb!HumGetDescriptorRequest+0x241
ffff8785`01d254f0 fffff807`1dd1145f     : 00000000`00000000 ffffe08f`6b2dcdc0 ffffe08f`555d52c0 00000000`0000000e : hidusb!HumGetStringDescriptor+0xe3
ffff8785`01d25570 fffff807`6c333939     : 00000000`00000004 00000000`00000001 ffffe08f`555d52c0 ffffbd81`b07f3580 : hidusb!HumInternalIoctl+0x44f
ffff8785`01d25860 fffff807`6c343c4c     : 00000000`0000000f ffffe08f`555d5978 ffffe08f`555d52c0 ffffe08f`555d5978 : HIDCLASS!HidpCallDriver+0xb9
ffff8785`01d258d0 fffff807`6c346683     : ffffe08f`555d52c0 ffff8785`00000000 00000000`0000000f ffffe08f`528db1d0 : HIDCLASS!HidpCallDriverAsynchronous+0x2c
ffff8785`01d25900 fffff807`6c337212     : 00000000`000b01be ffffe08f`56b27620 ffffe08f`528db1d0 00000000`00000000 : HIDCLASS!HidpGetDeviceString+0xd7
ffff8785`01d25950 fffff807`6c3325c5     : ffffe08f`56b27600 ffffe08f`555d52c0 00000000`00000001 00000000`00000001 : HIDCLASS!HidpIrpMajorDeviceControl+0xb22
ffff8785`01d25a50 fffff807`1fc37159     : 00000000`00000002 ffffe08f`54e9adb0 00000000`c00000bb 00000000`00000000 : HIDCLASS!HidpMajorHandler+0x195
ffff8785`01d25ae0 fffff807`69701a33     : ffffbd00`68a07e50 ffffe08f`54e9adb0 00000000`00000002 81000005`7d207867 : nt!IofCallDriver+0x59
ffff8785`01d25b20 fffff807`1fc37159     : ffffe08f`5de36080 ffffe08f`555d52c0 00000000`00000000 00000000`00000000 : HidBatt!HidBattIoControl+0xe3
ffff8785`01d25b60 fffff807`201f2a95     : ffffe08f`555d52c0 00000000`00000000 00000000`00000000 ffffe08f`5d5ae670 : nt!IofCallDriver+0x59
ffff8785`01d25ba0 fffff807`201f28a0     : 00000000`00000002 00000000`00000000 ffffe08f`5d5ae600 ffff8785`01d25ec0 : nt!IopSynchronousServiceTail+0x1a5
ffff8785`01d25c40 fffff807`201f1c76     : 000000d1`40fc8570 00000000`000079d1 00000000`00000000 00000000`00000000 : nt!IopXxxControlFile+0xc10
ffff8785`01d25d60 fffff807`1fdd5358     : ffffe08f`5f167480 000000d1`40fc8558 ffff8785`01d25de8 00000000`00000000 : nt!NtDeviceIoControlFile+0x56
ffff8785`01d25dd0 00007fff`f91fc7d4     : 00000000`00000000 00000000`00000000 00000000`00000000 00000000`00000000 : nt!KiSystemServiceCopyEnd+0x28
000000d1`40fc8528 00000000`00000000     : 00000000`00000000 00000000`00000000 00000000`00000000 00000000`00000000 : 0x00007fff`f91fc7d4




And finally...

Code:
SYMBOL_NAME:  hidusb!HumGetDescriptorRequest+241

MODULE_NAME: hidusb

IMAGE_NAME:  hidusb.sys

STACK_COMMAND:  .thread ; .cxr ; kb

BUCKET_ID_FUNC_OFFSET:  241

FAILURE_BUCKET_ID:  0x133_ISR_hidusb!HumGetDescriptorRequest

OS_VERSION:  10.0.18362.1

BUILDLAB_STR:  19h1_release

OSPLATFORM_TYPE:  x64

OSNAME:  Windows 10

FAILURE_ID_HASH:  {c13f5902-b00b-40e7-ef23-edbfd49d76bc}

Followup:     MachineOwner
---------
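
For reference, the arguments in the first WinDbg block can be decoded by hand. A minimal Python sketch, assuming Arg2 is the watchdog period expressed in clock ticks (which is how Microsoft documents bug check 0x133 when Arg1 is 1) and assuming the common default tick of 15.625 ms, which this dump does not confirm:

```python
# Arguments copied from the DPC_WATCHDOG_VIOLATION (133) output above.
arg1 = int("0000000000000001", 16)
arg2 = int("0000000000001e00", 16)

# Assumption: one clock tick is 15.625 ms (a common Windows default).
TICK_MS = 15.625

# Arg1 == 1 is the "cumulative time at DISPATCH_LEVEL or above" case.
cumulative = arg1 == 1
period_ticks = arg2
period_seconds = period_ticks * TICK_MS / 1000

print(cumulative, period_ticks, period_seconds)  # True 7680 120.0
```

Under those assumptions the watchdog only fires after roughly two minutes spent at elevated IRQL, which would fit the crashes never being instant and the system stalling piece by piece first.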

After checking deeper, it seems to be related to the LOGITECH USB drivers? (I have a Logitech G500s.)
This is kinda strange, as I used the same mouse with my prior Threadripper system with no crashes.
I've also reviewed all the minidumps and they all point to the same hidusb.sys.
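
Note that hidusb.sys is the generic Microsoft USB HID driver, so on its own it does not single out the Logitech mouse. Listing every module in the stack, not only the faulting frame, gives more context. A minimal Python sketch using a few frames copied from the STACK_TEXT above (the regular expression and the helper names are hypothetical, not part of WinDbg):

```python
import re

# A few frames copied verbatim from the STACK_TEXT above (innermost first).
STACK_TEXT = """\
ffffbd81`aaaf9b08 fffff807`1fdf4c1b     : 00000000`00000133 00000000`00000001 00000000`00001e00 fffff807`20171358 : nt!KeBugCheckEx
ffff8785`01d25480 fffff807`1dd1c193     : 00000000`00000001 ffffbd81`b07f3580 00000000`00000208 00000000`00000000 : hidusb!HumGetDescriptorRequest+0x241
ffff8785`01d25900 fffff807`6c337212     : 00000000`000b01be ffffe08f`56b27620 ffffe08f`528db1d0 00000000`00000000 : HIDCLASS!HidpGetDeviceString+0xd7
ffff8785`01d25b20 fffff807`1fc37159     : ffffe08f`5de36080 ffffe08f`555d52c0 00000000`00000000 00000000`00000000 : HidBatt!HidBattIoControl+0xe3
ffff8785`01d25c40 fffff807`201f1c76     : 000000d1`40fc8570 00000000`000079d1 00000000`00000000 00000000`00000000 : nt!IopXxxControlFile+0xc10
"""

# Matches the trailing "module!function+0xoffset" part of a frame line.
FRAME = re.compile(r":\s*(?P<module>\w+)!(?P<func>\w+)(?:\+0x[0-9a-fA-F]+)?\s*$")


def modules_in_stack(text):
    """Return module names in order of first appearance, without repeats."""
    seen = []
    for line in text.splitlines():
        match = FRAME.search(line)
        if match and match.group("module") not in seen:
            seen.append(match.group("module"))
    return seen


def non_core_modules(text, core=("nt", "hal")):
    """Drop the kernel and HAL so only driver modules remain."""
    return [name for name in modules_in_stack(text) if name not in core]


print(non_core_modules(STACK_TEXT))  # ['hidusb', 'HIDCLASS', 'HidBatt']
```

HidBatt sits above HIDCLASS and hidusb in the call chain. As far as I know, HidBatt.sys is the Windows HID battery driver used for USB-connected UPS units, so the request that stalled seems to come from a battery/UPS device rather than from the mouse.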

Full dump file: https://drive.google.com/file/d/1zk2mp_5wkaDD0Rwb0biCWl_4MbybLP5b/view?usp=sharing
 

tamalero

UPDATE: Still getting BSODs...

After changing the power settings so that USB devices never sleep when plugged in, I now get a KMODE exception error instead of the DPC watchdog violation.

I have unplugged my CANON scanner and my UPS (which I think is the culprit, as I get random "your battery is very low" warnings even when the UPS correctly registers 100% with new batteries).

This only leaves my HUION drawing screen tablet, my CORSAIR keyboard and my LOGITECH mouse.

I also deactivated iCUE. We'll see if that was causing the issue.
 

tamalero

Update, after massive, extensive testing and multiple OS re-installations:
It turns out that my UPS's USB connector was giving bad data to my PC. It was constantly switching the power state (I believe it was nonstop sending wrong data that it was on battery, then on direct connection, on and off), causing my power plans and other things to go haywire.
Removing the cable and removing the drivers left my system 100% stable.
 