
Question Trying to diagnose a BSOD that only happens when a Windows 11 Update occurs

gnomicidal

System specs:
CPU: AMD RYZEN 5 5600X WRAITHS
MOBO: MSI B550-A PRO ATX
GPU: MSI GeForce RTX 2070 DirectX 12 RTX 2070 ARMOR 8G 8GB
SSD: SAMSUNG 980 PRO SERIES 1TB
HDD: Seagate BarraCuda ST3000DM008 3TB 7200 RPM 64MB Cache SATA 6.0Gb/s 3.5" Hard Drive Bare Drive
RAM: (2 sets of) G.SKILL TridentZ RGB Series 16GB (2 x 8GB) 288-Pin DDR4 SDRAM DDR4 3200 (PC4 25600) Desktop Memory Model F4-3200C16D-16GTZR

Here's what happens:
1. A Windows update will install, usually overnight.
2. On reboot, it will throw a BSOD, usually one of two: CLOCK_WATCHDOG_TIMEOUT or CRITICAL_PROCESS_DIED.
3. It will be unable to boot back into Windows, even in any version of Safe Mode.
4. It will be unable to repair.
5. It will be unable to go back to a previous Restore Point.
6. I cannot locate or read any dump files since I cannot get into Windows 11.
7. The only way to fix it is to completely reinstall Windows 11 Home.

At first I only had an inkling that it was always occurring on Windows Updates, but now I know for sure, because I delayed Windows Updates purposely, and this morning when I noticed one was available, I decided to test my theory and let it 'restart immediately'. To my dismay it happened right away.

So, how many times has this happened? At least 4 or 5 now. Time between BSODs varies from 1 day to months. My last successful period before testing my theory lasted from March 12th until today, April 10th. So now here I am again on a fresh install and I want to figure this out. Is my CPU bad? My PSU? My MOBO? Last time my thought was that it was my old crusty Intel NVMe, so I replaced it with the above Samsung and it still happened.

What have I tried already?
1. Updated my BIOS to the most recent version
2. Manually upgraded every hardware driver I could on the system
3. Ran a check on both storage drives to make sure they are good (I forget the name of the tool)

Happy to hear any suggestions. I will run any program I can to diagnose. I will even rerun any of the things I already tried. I really would like to keep this system up and going if I can.

Between BSODs it runs like a dream. Zero issues, and zero BSODs for any reason. It plays games and runs anything I want it to with great performance, which is why this is so confusing. So, shot in the dark: has anyone come across anything like this and have any ideas? I Googled probably over 100 different previous responses and watched probably 25-30 different YouTube videos on suggested fixes, and thus far nothing has worked.

I can also clarify anything if something needs it, let me know.
 
As a starting point, look in Reliability History/Monitor for any error codes, warnings, or informational events being captured just before or at the time of the BSODs.

There are other tools that can be used; however, Reliability History/Monitor is end user friendly and may reveal some pattern leading to the eventual BSOD.

= = = =

Create a new bootable USB. The original USB may be buggy or corrupted.

= = = =

On the system run "dism" and "sfc /scannow" to find and fix any buggy or corrupted files.

https://www.windowscentral.com/how-use-dism-command-line-utility-repair-windows-10-image

https://www.lifewire.com/how-to-use-sfc-scannow-to-repair-windows-system-files-2626161
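
For anyone who would rather script it, here is a minimal sketch of the usual order (DISM first, then SFC). The DISM and SFC switches are the standard ones; the helper name `run_repairs` and its `runner` parameter are just made up for this example. It only does anything useful on Windows, from an elevated (Administrator) prompt.

```python
import subprocess

# Usual repair order: DISM checks and repairs the component store first,
# then SFC verifies protected system files against it.
REPAIR_COMMANDS = [
    ["DISM", "/Online", "/Cleanup-Image", "/CheckHealth"],
    ["DISM", "/Online", "/Cleanup-Image", "/ScanHealth"],
    ["DISM", "/Online", "/Cleanup-Image", "/RestoreHealth"],
    ["sfc", "/scannow"],
]


def run_repairs(commands=REPAIR_COMMANDS, runner=subprocess.run):
    """Run each command in order; return a list of (command line, exit code).

    `runner` is a hypothetical hook so the function can be tried without
    actually launching DISM or SFC.
    """
    results = []
    for cmd in commands:
        completed = runner(cmd)  # output goes straight to the console
        results.append((" ".join(cmd), completed.returncode))
    return results
```

Calling `run_repairs()` and printing the result gives a quick pass/fail overview. The details of what SFC found still end up in CBS.log.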
 
The CLOCK_WATCHDOG_TIMEOUT BSOD is typically caused by a CPU problem, although it can be a bad driver. The CRITICAL_PROCESS_DIED BSOD is almost always a hardware problem. The two together might make you look more closely at the CPU.

The CLOCK_WATCHDOG_TIMEOUT BSOD can only be fully analysed with the full kernel dump - and there is only ever one of those, the dump is overwritten with each new BSOD. If you see another CLOCK_WATCHDOG_TIMEOUT BSOD please immediately copy the file C:\Windows\Memory.dmp to another location (a temp folder for example) to prevent it being overwritten by another BSOD. Then upload that kernel dump to a cloud service with a link to it here.
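
If the system does boot after a BSOD, the copy step could be scripted along these lines. This is only a sketch: the destination folder `C:\Temp\dumps` and the function name `preserve_dump` are assumptions for illustration, and reading the dump will most likely need an elevated prompt.

```python
import shutil
from datetime import datetime
from pathlib import Path


def preserve_dump(dump_path=r"C:\Windows\Memory.dmp", dest_dir=r"C:\Temp\dumps"):
    """Copy the kernel dump to dest_dir under a timestamped name.

    Returns the path of the copy, or None if there is no dump file.
    """
    src = Path(dump_path)
    if not src.is_file():
        return None
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    # Name the copy after the dump's modification time so a later dump
    # never overwrites an earlier copy.
    stamp = datetime.fromtimestamp(src.stat().st_mtime).strftime("%Y%m%d-%H%M%S")
    target = dest / f"{src.stem}-{stamp}{src.suffix}"
    shutil.copy2(src, target)  # copy2 also keeps the original timestamps
    return target
```

Kernel dumps can be several GB, so make sure the destination has room before running it.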

Since we already have reason to suspect the CPU it might be wise to run Prime95 on it...
  1. Download Prime95 and a CPU temperature monitor (CoreTemp will do).
  2. Keep the temperature monitor running all the time you run Prime95. Your CPU will get hot!
  3. Run each of the three Prime95 tests (smallFFTs, largeFFTs, and Blend) one after the other for a minimum of 1 hour per test, 2 hours per test would be better.
  4. If Prime95 generates error messages, if the system crashes/freezes/BSODs, or if your CPU temp reaches 95°C (Tmax for your CPU), then stop Prime95 and let us know what happened.
Note that a properly cooled and stable CPU should be able to run all Prime95 tests pretty much indefinitely.

FYI: The small FFT test stresses the CPU more than RAM. The large FFT test stresses RAM more than the CPU. The Blend test is a mixture of the two.
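
If you jot down temperatures during each test, a few lines like the following can summarise a run against the 95°C limit mentioned above. It is a rough sketch only: `review_run` is a made-up helper, and the readings are whatever you note down from your temperature monitor, not something Prime95 or CoreTemp exports in this form.

```python
def review_run(samples, tmax_c=95.0):
    """Summarise temperature readings taken during one stress test.

    samples: iterable of (minutes_elapsed, temp_c) pairs noted by hand.
    Returns (peak_temp, minutes_when_tmax_first_reached), where the second
    value is None if the limit was never reached.
    """
    peak = None
    first_over = None
    for minutes, temp in samples:
        if peak is None or temp > peak:
            peak = temp
        if first_over is None and temp >= tmax_c:
            first_over = minutes
    return peak, first_over
```

A run where the second value is not None is one to stop and report, same as an error message or a crash.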
 
Thanks

How do I copy this over if I can't boot?

Is there a way to grab this file through the command line and copy to a usb drive or something of the sort?
 
DISM and SFC:

https://www.dropbox.com/scl/fi/jax8...0-45.png?rlkey=17r0x9vtpk8y623mu0ktnlto5&dl=0

https://www.dropbox.com/scl/fi/5hdl...3-54.png?rlkey=y30alp5f8wyr5wzsxjm3op60g&dl=0

https://www.dropbox.com/scl/fi/o56j...1-45.png?rlkey=tngj5tyrifhllfpi6beqq2mwa&dl=0

https://www.dropbox.com/scl/fi/iz0p...4-00.png?rlkey=gxxmmi0f3zkd6ahjx3nu5c3ug&dl=0

Log:

https://www.dropbox.com/scl/fi/zn779n22rx9c3xjt799ij/CBS.log?rlkey=2z7eqva0v7y9wnp5xdtypv6fl&dl=0

Also created a new bootable USB.
 
Open Reliability History/Monitor and mainly look for errors (red X circles) but do not ignore any other error types.

Select the days/dates with multiple errors in order to list out the details below the timeline.

Take some screenshots and post the screenshots here via imgur (www.imgur.com).

Also look in Update history. The log indicates problems with updates and installs. Are there any failed or problem Windows updates?
 
Last 3 days of reliability monitor:

https://www.dropbox.com/scl/fi/5bzj...4-25.png?rlkey=1hlb15pl6znlu34huyl57nvs1&dl=0

https://www.dropbox.com/scl/fi/1h4w...4-31.png?rlkey=sozv8dqetcbqaxgxggcfavos2&dl=0

https://www.dropbox.com/scl/fi/as3j...4-38.png?rlkey=eqm1mpbudrh6ht95wcnlzwmwi&dl=0
 
For the entries with "failed" view the technical details.

And probably start with Event Viewer for further delving.

Event Viewer requires more time and effort to navigate and understand.

To help with Event Viewer:

How To - How to use Windows 10 Event Viewer | Tom's Hardware Forum (tomshardware.com)

Just explore first to get a sense of how the tool works and the data being presented.

Look for Event IDs, especially those that are most common and/or appear in groups or patterns.
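
If scrolling through the log gets tedious, one option is to save the events as CSV from Event Viewer and count them. A minimal sketch, assuming the export has header columns named "Source" and "Event ID" (check the first line of your own file, the names may differ); `count_event_ids` is just an illustrative helper.

```python
import csv
from collections import Counter


def count_event_ids(csv_lines, top=10):
    """Count (Source, Event ID) pairs in an Event Viewer CSV export.

    csv_lines: any iterable of text lines, for example an open file.
    Assumes header columns named "Source" and "Event ID".
    Returns the `top` most frequent pairs with their counts.
    """
    reader = csv.DictReader(csv_lines)
    counts = Counter(
        (row["Source"], row["Event ID"])
        for row in reader
        if row.get("Source") and row.get("Event ID")
    )
    return counts.most_common(top)
```

Typical use would be `with open("events.csv", newline="", encoding="utf-8-sig") as f: print(count_event_ids(f))`, where `events.csv` is whatever name you saved the export under. The pairs that show up most often, or cluster right before a BSOD, are the ones worth looking up.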
 
Getting an about:blank on that link, but I will keep my eye on it, start tracking it, and try to understand it.

Running the CPU stress testing that @ubuysa suggested right now

Really appreciate everyone's assistance so far
 
Yeah it worked now, thank you both.
 
Ran Prime95 today on all 3 tests, 2 hours each.

No errors, all passed.
 
Boot the Windows install media, select Repair your computer, and navigate to the Command Prompt. Then locate your system drive (note that drive letters may be different in this WinRE system) and copy the file C:\Windows\Memory.dmp to the install media USB drive. You can then upload that from another PC.

Whilst you're there, copy all the dumps in C:\Windows\Minidump to the USB drive as well and upload those.
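
WinRE has no Python, so the following is only a cheat-sheet generator to run beforehand on another PC: it prints the Command Prompt lines for the copy described above. The drive letters and the `dumps` folder name are placeholders (assumptions), since WinRE often assigns different letters. Check each letter with `dir` first.

```python
def winre_copy_commands(system_drive="C", usb_drive="E", dest_folder="dumps"):
    """Build the Command Prompt lines for copying dumps to a USB drive.

    The drive letters are placeholders: confirm in WinRE which letter
    holds the Windows folder and which one is the USB stick.
    """
    win = f"{system_drive}:\\Windows"
    dest = f"{usb_drive}:\\{dest_folder}"
    return [
        f"dir {win}\\Memory.dmp",  # confirm the kernel dump exists
        f"mkdir {dest}",
        f"copy {win}\\Memory.dmp {dest}\\",
        # /E copies subfolders, /I treats the destination as a folder
        f"xcopy {win}\\Minidump {dest}\\Minidump\\ /E /I",
    ]
```

For example, `print("\n".join(winre_copy_commands("D", "F")))` lists the lines to type if Windows turns out to be on D: and the USB stick on F:.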

Meanwhile, if it won't boot at all, then remove one RAM stick and try again. Swap RAM sticks until you've tried with each stick out, or until you get it to boot.

If that fails, and if you can get into the BIOS setup, then disable C-States for the CPU and see whether that lets you boot. (This stops the processors entering a lower power state when idle. Some CPUs struggle coming out of low power C-States).
 
Got it, now just have to wait for it to happen again. Just never know when it will.
 
A new Windows update popped up today. I figured if the problem was going to show up again it would be now, so I manually did the update, restarted, and yup - same issue.

Ok, time to try and get answers if there is a dump file, but I'll need to wait until after work.
 
I'd suspect your RAM config.
4 RAM modules from different kits can cause compatibility issues.

Can you show screenshots from CPU-Z - Memory and SPD sections?
(upload to imgur.com and post link)

Make sure DDR voltage is set to 1.35V in BIOS and Command Rate is set to 2T.

If that still doesn't fix the issue, then try with 2 RAM modules from the same kit (remove the second kit).