[SOLVED] Clock_watchdog_Timeout error help

Demon of Elru

Distinguished
Nov 20, 2015
48
0
18,530
I am getting this on a brand new build. Not sure what's causing it, since the CPU isn't overclocked yet and temps are normal. I have a Core i7 9800X on an ASRock Fatal1ty motherboard. I've posted the crash dump to help find the problem. It happens at random times, but mostly when I'm just looking at a website or the system is idle.

https://www.dropbox.com/s/hsq7sbw839tnq1z/010119-12578-01.dmp?dl=0

EDIT: Just finished an 8-pass Memtest with 0 errors.



 
One of your CPU cores hung and stopped responding.

You would have to change the memory dump type to kernel to see what was running at the time the core stopped responding. Sometimes it is Plug and Play trying to install a driver.

The memory dump was pretty much unusable; the module list was corrupted.
I could see that the system had been running for 4 hours 8 minutes before the core stopped responding, but I could not read the BIOS info or the list of loaded drivers.

You should boot into Windows, go to your motherboard vendor's website, and update the BIOS and any motherboard drivers. If your BIOS is already current, reset it to defaults and save the configuration.

You might move your drive's data cable to a different port and controller. Generally, the slowest controller will have the best drivers, since they are updated through Windows Update (unless you use RAID or the Intel drivers).

The kernel memory dump will show what was running on each core. Whatever was not working was communicating between two cores of your CPU.
If you have any overclocking software (GPU or CPU), remove its drivers.
 
All I see is a minidump. Kernel dumps are stored in a different location and file:
c:\windows\memory.dmp
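
A quick way to check which dump files actually exist is a small script like the one below. It is only a sketch: `find_dumps` is a made-up helper name, and it assumes the default locations (MEMORY.DMP directly under the Windows folder, minidumps in its Minidump subfolder). If the dump path was changed in the Startup and Recovery settings, the files will be somewhere else, and you may need an elevated prompt to read these folders.

```python
from pathlib import Path


def find_dumps(windows_dir):
    """Return (kernel_dump_path_or_None, list_of_minidump_paths).

    Assumes the default layout: <windows_dir>/MEMORY.DMP for the kernel
    dump and <windows_dir>/Minidump/*.dmp for the small dumps.
    """
    root = Path(windows_dir)
    kernel = root / "MEMORY.DMP"
    minidump_dir = root / "Minidump"
    # Only glob when the folder exists; otherwise report no minidumps.
    minidumps = sorted(minidump_dir.glob("*.dmp")) if minidump_dir.is_dir() else []
    return (kernel if kernel.is_file() else None), minidumps


if __name__ == "__main__":
    kernel, minidumps = find_dumps(r"C:\Windows")
    print("kernel dump:", kernel if kernel else "not found")
    for dump in minidumps:
        print("minidump:", dump.name, dump.stat().st_size, "bytes")
```

If the kernel dump shows up as "not found" after a crash, the dump type is probably still set to small memory dump.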

You might want to google "how to force a memory dump using a keyboard", make the registry settings, and force a kernel dump while the system is working.

The minidumps you provided are corrupted; the list of modules and the BIOS info cannot be read.
Maybe the dump will be OK if it is taken while the system is working. I can still take a look and see if there are issues before your system crashes.
 

Demon of Elru

Oh, forgive me. When the next blue screen happens I'll upload the memory dump. Also, is it normal for the crash dump to hang at 25%? It seems to hang during a BSOD and never actually resets, so I have to reset manually.

 
Nope, the memory dump should not hang.
It could if there was not enough space on the drive, or if there is some problem with the storage subsystem.
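
To rule out the free-space explanation, you can compare the free space on the system drive with the size of dump you expect. A minimal sketch, assuming Python is installed; `has_room_for_dump` is a hypothetical helper, and the size you pass in is your own estimate, since the real size of a kernel dump depends on the machine.

```python
import shutil


def has_room_for_dump(drive, needed_bytes):
    """Return True if `drive` has at least `needed_bytes` of free space."""
    free = shutil.disk_usage(drive).free
    return free >= needed_bytes
```

For example, `has_room_for_dump("C:\\", 16 * 1024**3)` asks whether drive C: has 16 GiB free; the 16 GiB figure is only a placeholder. If this comes back True and the dump still hangs, the storage subsystem is the more likely suspect.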



 
I looked at the NVIDIA driver release notes (ver 416.81):
Windows 10 issue with NVIDIA GTX 1080 Ti: watchdog violations on certain systems with
motherboards with PLX chips (not fixed).
I'm not sure what NVIDIA card you have, but I thought I would look it up.
--------
The debugger reported that you had an unusual amount of hardware, but when I tried to see what the hardware was, the info was invalid. We are talking about low-level hardware, like a PCI/e bus, that the system thinks should be there but would not start. Is there any chance that this Windows installation was migrated from another motherboard? The debugger thinks you have about 3 times as much hardware as any other system I have looked at, but almost all of the functions I would use to debug this in a kernel memory dump are not working. The data is not being stored where the debugger thinks it should be located.

Here is what I would try next:
First, go into the BIOS and toggle any setting, i.e. change something, change it back, and save the results. This will force the BIOS to rescan the hardware, reassign the hardware resources, and rebuild the database of settings it sends to Windows Plug and Play.

Second, you have an NVIDIA card, and I think there was some kind of bug in the current drivers involving some interrupt multiplexer hardware. I don't know if it has been fixed or if you have this hardware, but you might read the release notes for the GPU driver to see if it applies (or go back to an older build of the GPU driver).

I would also dump any nonstandard driver, like the old 2011 USB charger driver you had installed.
It is used to override the max current on a USB port so that Apple devices can charge faster. This is a bad idea, and your old driver has bugs that can mess up the USB ports. Since USB ports on newer boards run through the PCI/e bus, it can screw up almost all of your hardware.

The debugger also seemed to show that some items that should have been in the memory dump were not there but were being stored on disk for some reason. I have seen this happen with certain viruses.
You might want to turn off virtual memory to delete pagefile.sys, then reboot and turn it back on. It can help dump malware that is hiding in the pagefile.

You should also run cmd.exe as an admin, then run
dism.exe /online /cleanup-image /restorehealth
sfc.exe /scannow

This will try to fix any Microsoft core files on disk. Then run
a full shutdown to force Windows to reload all of the files from disk on the next boot
(rather than using the compressed copies saved as an image):
shutdown /s /f /t 0

After that I would google "how to make a memory dump using a keyboard", make the registry settings, and trigger a kernel memory dump while the system is working OK.
Then put up the working memory dump and I can see if the debugger works correctly and if I can read what hardware is not working.
I think the timeout is going to be some hardware that is not getting set up correctly.
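
For the registry settings mentioned above, here is a rough sketch that only builds the text of a .reg file; it does not touch the registry itself. The helper name `build_reg_text` is made up for this example. It assumes a USB keyboard by default (the `kbdhid` service, with `i8042prt` for PS/2) and sets `CrashDumpEnabled` to 2, which selects a kernel memory dump. Review the output before importing it.

```python
SERVICES = r"HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services"
CRASH_CONTROL = r"HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\CrashControl"

# Keyboard driver service that owns the CrashOnCtrlScroll value.
KEYBOARD_SERVICE = {"usb": "kbdhid", "ps2": "i8042prt"}


def build_reg_text(keyboard="usb"):
    """Return .reg text that enables the keyboard crash and kernel dumps."""
    service = KEYBOARD_SERVICE[keyboard]
    lines = [
        "Windows Registry Editor Version 5.00",
        "",
        f"[{SERVICES}\\{service}\\Parameters]",
        '"CrashOnCtrlScroll"=dword:00000001',
        "",
        f"[{CRASH_CONTROL}]",
        # 2 selects a kernel memory dump.
        '"CrashDumpEnabled"=dword:00000002',
        "",
    ]
    return "\r\n".join(lines)


if __name__ == "__main__":
    print(build_reg_text("usb"))
```

Save the output to a .reg file, import it, and reboot. After that, holding the right Ctrl key and pressing Scroll Lock twice should force the bugcheck and write the dump.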

You might go into Device Manager and turn off any high definition sound sources that do not have a speaker attached to them.

Again, I think the problem is related to a PCI bus that the system thinks is not actually starting up. The question is whether it is really there or not. If it is, then you might need a custom driver from the motherboard vendor's website.

I will look and see if I can find the problem info for the NVIDIA driver
that relates to PCI multiplexing devices. (These devices kind of fake the number of PCI/e lanes that the machine has.)

You also have an overclock driver running. It should be removed; these tend to tweak voltages even if you don't ask them to. Changing the voltages can change the timing windows in the electronics to the point where a hardware signal may occur before the hardware is ready to read it. In these cases the driver will wait for a signal that it has already missed, then time out and cause a bugcheck.


I don't know if the problems I am having with the debugger are due to some change that Microsoft made to Windows.


You can also consider a reinstall of Windows and of the current motherboard drivers. I would skip all of the motherboard utilities. Skip third-party virus scanners, do all of the Windows updates, and see if the system is stable.
You had a specialized gaming rig, and it used many unique drivers. Some of these network drivers make some very nonstandard changes, like expediting incoming packets and delaying outgoing packets to give an advantage in gaming (getting your opponent to shoot at your old location rather than your current location). Anyway, your NDIS driver seemed to have been modified, and I could not look at it in the debugger.

You should also look at the kernel memory dump setting; maybe it is not dumping all of the required info? The size of the file seemed about right, so I don't know.








 
Solution

Demon of Elru

Yes, this is a brand new build. I'm coming from an X79 system and had a thought about old hardware drivers being installed with Windows. I currently have a GTX 1080 (non-Ti). I upgraded my Windows 7 Pro to Windows 10 Pro during the free upgrade period, so my MS account has a digital license, though I had to use my old Win 7 key since I couldn't activate it. Maybe that's an issue too?

So should I uninstall all the utility programs, such as ASRock F tune?

EDIT: Here is a fresh memory dump; maybe this will have more info. I didn't have time to make the corrections you suggested, since this BSOD happened fairly quickly.

https://www.dropbox.com/s/32nv3sa9b1ugw72/MEMORY.DMP?dl=0

EDIT: Alright, I removed my old PC from my MS account. I think it may have been downloading files and settings for both setups, since I kept the same PC name. If it blue screens again I will try a fresh install of Windows, so this time it shouldn't download old files.
 

Demon of Elru

Apologies for the late post, but I have come to the conclusion that the CPU itself is bad. Running Prime95 on stock settings with my custom loop (one 360mm rad and one 140mm rad), CPU temps hit 90°C within the first 5 minutes. This seems way too high, especially for a custom water cooling loop on stock settings. I have since gotten a replacement CPU but have not yet had a chance to install it. The blue screens still happen, though not often, but those together with those temps under 100% load tell me there is something wrong with the CPU. Also, in addition to the clock watchdog errors I have gotten a few WHEA_UNCORRECTABLE_ERROR blue screens; both point to the CPU. Once the replacement is installed I will keep you posted.
 
Mar 26, 2019
I had the same problem with my laptop and fixed it by doing a system restore, reinstalling the VGA and chipset drivers, and, to be sure, updating Windows 10 to the latest version. Hope that helps!