[SOLVED] How to debug a PC that sometimes fails to boot (& enters a boot loop)?

Jan 3, 2022
I've been battling with my machine for weeks now and am hoping for the community's ideas or experience.

My PC usually fails on boot and enters a restart loop. Typically, it passes the ASUS (mobo) screen and shuts down during the Windows load screen with the wheel spinning. The PC auto-restarts shortly after shutting down and repeats the failure. The shutdown isn't exclusive to the Windows load screen; it has also occurred before reaching the Windows screen or after entering the ASUS UEFI. Generally, it shuts down 3 to 15 seconds after startup.

The caveat is this: about 10% of the time, it boots and works completely fine. The Windows load screen has come to serve as a "gate" for me: if it doesn't fail here, it's going to work. The PC has never shut down after reaching the Windows user log-in screen. Everything looks normal after successful boots (temps, hardware installed, etc.)

Here is what I've already tried:
  • Cleaned / removed dust from interior, components
  • Reseated components & wires (GPU, PSU, RAM x2, wireless card, hard drives x2)
  • Reapplied thermal compound to CPU, reseated
  • Updated mobo drivers & BIOS
  • Updated GPU drivers
  • Reset CMOS (removed battery, reseated)
  • Reset motherboard overclocks
Specs:
  • Windows 10 Home 64-bit
  • ASUS Z97-A
  • Crucial 128 GB SSD (boots from this drive)
  • GeForce GTX 970
  • i7 4790k
A motherboard beep speaker is installed; it gives one short beep at startup, which according to ASUS means the device is OK and booted normally.
Some ideas I'm still considering:
  • Reseating everything again
  • Replacing the CMOS battery with a new one
  • Updating more drivers (hard drive, etc.)
  • Reformatting / reinstalling Windows
Thank you for any help & consideration.
 
Solution
Some thoughts:

1. A 128 GB SSD is very small. If there are insufficient free NAND blocks left, I could see problems.

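2. Try booting in safe mode, which loads Windows with a minimum of drivers. (On Windows 10 the F8 key is disabled by default, so safe mode is normally reached through the recovery environment, e.g. Shift + Restart or after several failed boots in a row.)
If that works, there is probably a bad driver somewhere.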

3. Thoroughly test the RAM.

Run MemTest86 or Memtest86+.
They boot from a USB stick and do not use Windows.
You can download them here:
If you can run a full pass with NO errors, your RAM should be OK.

Running several more passes will sometimes uncover an issue, but it takes more time.
Probably not worth it unless you really suspect a RAM issue.
It appears you have done many of the things that would be suggested.

Memtest?

Perhaps a clean install of OS or at least the DISM and SFC commands?
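
For reference, these two checks are usually run from an elevated prompt as `DISM /Online /Cleanup-Image /RestoreHealth` followed by `sfc /scannow`. Below is a minimal Python sketch that runs them in that order and reports the exit codes. It assumes Python is installed on the affected machine; the names `REPAIR_COMMANDS`, `run_repairs`, and the `runner` parameter are made up for illustration.

```python
import subprocess

# Commands as commonly documented for Windows 10; both need an elevated
# (Administrator) prompt and only exist on Windows.
REPAIR_COMMANDS = [
    ["DISM", "/Online", "/Cleanup-Image", "/RestoreHealth"],
    ["sfc", "/scannow"],
]


def run_repairs(commands=REPAIR_COMMANDS, runner=subprocess.run):
    """Run each command in order; return (command line, exit code) pairs.

    `runner` is a hypothetical hook so the function can be exercised
    without actually invoking the Windows tools.
    """
    results = []
    for argv in commands:
        completed = runner(argv)
        results.append((" ".join(argv), completed.returncode))
    return results


# Usage on the affected machine (elevated prompt):
#     for line, code in run_repairs():
#         print(f"{line} -> exit code {code}")
```

A non-zero exit code from either tool is only a hint that something was found or could not be repaired; the tools' own output and logs are the place to look for details.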

Do you see anything in Event Viewer, particularly Critical or Error entries and, to a lesser degree, Warnings?
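
If clicking through Event Viewer is tedious, the same entries can be pulled from the command line with `wevtutil`. The sketch below only builds the command for the most recent Critical (level 1) and Error (level 2) entries in the System log; the function name `build_event_query` and its defaults are assumptions for illustration. After an unclean shutdown, a Kernel-Power event with ID 41 is a common entry, though it only records that the shutdown was unexpected, not why.

```python
def build_event_query(levels=(1, 2), count=20, log="System"):
    """Build a wevtutil command listing the newest events at the given levels.

    Levels follow the Windows event schema: 1 = Critical, 2 = Error,
    3 = Warning. Returns an argv list suitable for subprocess.run().
    """
    level_expr = " or ".join(f"Level={n}" for n in levels)
    xpath = f"*[System[({level_expr})]]"
    return [
        "wevtutil", "qe", log,
        f"/q:{xpath}",   # XPath filter on the event's System section
        f"/c:{count}",   # maximum number of events to return
        "/rd:true",      # newest first
        "/f:text",       # human-readable output
    ]


# Usage on Windows:
#     import subprocess
#     subprocess.run(build_event_query(levels=(1, 2, 3), count=50))
```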
 
2-15-22 Update

Thank you everyone for the suggestions. Here is what I've tried:
  • Updated Windows (it was actually more than 2 years behind because of a broken updater)
  • Windows internal scans ("sfc /scannow", DISM)
  • RAM test with MemTest86+
  • Cleaned the SSD with a disk drive cleaner (I believe it was CrystalDiskInfo, accessed via Hiren's BootCD)
At this point, the machine was failing less frequently, but still failing occasionally. I finally replaced the SSD with a new one (Acer SA100 240GB) and reinstalled Windows. And the result is...it still sometimes fails.

The only other clue I know of is that the machine never successfully resumes from a Sleep state. If the machine is working properly and is put to Sleep, it always fails almost immediately when woken. Sometimes it boots successfully after 5 loops or so, but most of the time I need to switch power to the machine off completely and try again later for it to work.
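
Since the fault is intermittent, one way to tell whether a change actually helped is to tally boot attempts per scenario (cold boot, restart, wake from Sleep) rather than going by feel. A rough sketch of that bookkeeping is below; the function names and the sample log are hypothetical, not real measurements.

```python
from collections import defaultdict


def failure_rates(attempts):
    """Map each scenario to its observed failure fraction.

    `attempts` is an iterable of (scenario, succeeded) pairs noted by hand.
    """
    totals = defaultdict(int)
    failures = defaultdict(int)
    for scenario, succeeded in attempts:
        totals[scenario] += 1
        if not succeeded:
            failures[scenario] += 1
    return {s: failures[s] / totals[s] for s in totals}


def chance_of_streak(p_success, k):
    """Probability of k successful boots in a row if each boot independently
    succeeds with probability p_success (independence is an assumption)."""
    return p_success ** k


# Made-up example log, for illustration only:
sample_log = [("cold", False), ("cold", True), ("sleep", False), ("sleep", False)]
rates = failure_rates(sample_log)
```

With the roughly 10% success rate described in the first post, `chance_of_streak(0.1, 3)` is about 0.001, so three clean boots in a row after a change would be unlikely to be luck alone.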
 