[SOLVED] Unusual system freeze that leaves no logs?

Jan 23, 2023
5
0
10
Mostly looking for recommendations on how to proceed with diagnosing a tricky issue. I'm running out of things to try.

---------------------------------
Observed Behavior:

The workstation will enter a "freeze" state with no clear trigger. At the start of the freeze, programs still run normally for some tasks, such as typing, changing tabs, moving a 3D viewport, or switching the active software. Internet services cut out. Other actions, such as saving, closing or opening software, opening the Start menu, or closing full-screen windows, will freeze the program involved. This continues for each program that is closed.

The freeze is temporarily resolved by a reboot. Any files modified during this freeze state will be 0kb. This means any and all logs, system files, preferences, or scene files will all be 0kb upon the next restart. The computer otherwise has no signs that it had an emergency shutdown.

This doesn't occur on a fixed schedule; it can be anywhere from multiple times a day to once a month. So far it's occurred around 20 times. No single piece of software was running during all occurrences.

---------------------------------
Hardware:

Windows 10 Pro 21H2 19044.2364
Intel Xeon E5-1660 V3 3.0GHz CPU
ASRock X99 Extreme6/3.1 Motherboard
Crucial 32GB (4 x 8GB) Registered DDR4-2133 RAM /or/ GSkill 32GB (2 x 16GB) unregistered DDR4-3200
NVIDIA 3090 GPU
Intel PCIe SSD boot drive /or/ Samsung 980 NVMe boot drive
EVGA SuperNOVA 1000 P2 1000W PSU

---------------------------------
Attempts to solve:

  • Memtest86 passed.
  • Replaced registered RAM with new unregistered RAM.
  • wmic checks ok.
  • CHKDSK ok.
  • CHKDSK /f /r ok.
  • C drive error checking passed.
  • Full Windows updates.
  • Virus checks clear.
  • Updated Wacom drivers.
  • Cleared up 150 GB on boot drive.
  • Replaced PCIe boot drive with new M.2 boot drive.
  • Removed unneeded/suspected installed software.
  • Cleared out software that runs on boot.
  • Ran DISM and SFC, passed.
  • Removed and re-inserted RAM.
  • Updated graphics drivers.
  • Removed and re-inserted GPU.
  • Successful FurMark test.
  • Removed and re-installed graphics drivers using Wagnardsoft DDU and NVIDIA's driver installer.
  • Tried both Studio and gaming drivers for GPU.
  • BIOS updated.
  • Confirmed cables were fully inserted.
  • Switched browsers from Chrome to Firefox.
  • Disconnected from power for a week.
Temperatures seem great on all components.
The event viewer has had no suspicious activity, and no logs were created after the start of any freeze.

---------------------------------
None of these attempts had an effect; the freeze still occurs.

I can't reliably force the freeze to occur for testing. While the freeze has occurred under various conditions, I find I can somewhat "force" it by running 4K YouTube, foobar, and Twitch simultaneously. This seems to have the highest chance of getting the freeze to occur. To note, it has so far not occurred during gaming, and it has occurred with only Audible open.

---------------------------------
Thoughts:

Any idea what else it could be, or what to try for more information? I've already replaced the boot drive and memory. Now it's down to the CPU, motherboard, GPU, PSU, operating system, or some individual software. My current plan is to swap the GPU for a 980 Ti to see if it's the GPU, then replace the CPU and motherboard, but I'd prefer not to sink the cost into that if possible. I'm worried I'll rebuild the whole computer and it'll turn out to be something simple with the operating system.
 
Solution
I went from Intel to AMD for CPU, and ASRock to ASUS for the motherboard.
For this, did you do a full OS reinstall?

I know you said above that other people have said that is not the fix, but in the context of a whole platform change...a full OS reinstall IS strongly recommended.

Ralston18

Titan
Moderator
RAM:

"Crucial 32GB (4 x 8GB) Registered DDR4-2133 RAM /or/ GSkill 32GB (2 x 16GB) unregistered DDR4-3200" Matched kits?

CMOS:

Try a new/fresh CMOS battery as a matter of elimination.

DRIVERS:

Manually download directly from the applicable manufacturer's website. Install, and reconfigure. No third party tools or installers.

EVENT VIEWER:

Also look in Reliability History. Pay attention to all error codes, warnings, and even informational events. Look for a pattern of any sort that precedes the freezes.

Noted: "The freeze is temporarily resolved by a reboot. Any files modified during this freeze state will be 0kb" More information/explanation needed. Which logs, and who/what is modifying files during the freeze?

Look in Task Manager and Task Scheduler for any signs of some app or utility that may be clearing the logs.

Go into Event Viewer and ensure that the logs are not being cleared in some manner. Check log sizes, etc.

PSU:

EVGA SuperNOVA 1000 P2 1000W PSU - how old? History of heavy use for gaming, video editing, bit mining? Installed using only the cables that came with the PSU - correct?

If possible test the PSU with a multimeter.

https://www.lifewire.com/how-to-manually-test-a-power-supply-with-a-multimeter-2626158

Any voltages out of tolerance?
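
As a rough aid for judging multimeter readings, here is a minimal Python sketch that compares measured values against the commonly cited ATX tolerances (±5% on +3.3V, +5V, +12V and +5VSB, ±10% on -12V). The names ATX_RAILS, check_rail, and out_of_tolerance are made up for illustration, and the readings in the usage comment are invented, not measurements from this thread.

```python
# Nominal ATX rail voltages and allowed deviation as a fraction of nominal.
# Assumption: the commonly cited ATX figures (5% on positive rails, 10% on -12V).
ATX_RAILS = {
    "+3.3V": (3.3, 0.05),
    "+5V": (5.0, 0.05),
    "+12V": (12.0, 0.05),
    "-12V": (-12.0, 0.10),
    "+5VSB": (5.0, 0.05),
}


def check_rail(rail, measured):
    """Return True if the measured voltage is inside the rail's tolerance band."""
    nominal, tol = ATX_RAILS[rail]
    # sorted() keeps low <= high even for the negative rail.
    low, high = sorted((nominal * (1 - tol), nominal * (1 + tol)))
    return low <= measured <= high


def out_of_tolerance(readings):
    """Return the names of rails whose readings fall outside tolerance."""
    return [rail for rail, volts in readings.items() if not check_rail(rail, volts)]


# Example with invented readings:
# out_of_tolerance({"+12V": 12.1, "+5V": 4.6, "+3.3V": 3.3})
```

A reading taken at idle can look fine while the rail sags under load, so a pass here does not rule the PSU out.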
 
Jan 23, 2023
Thanks for your thoughts! Here's some clarification, with important updates underlined:

RAM:
Those are the two kits I tried separately as matched kits. They were not installed in the motherboard simultaneously. I bought the GSkill RAM as a replacement for the Crucial RAM to see if that was the cause.


Drivers:
I believe you're referring to the GPU drivers. My first attempt was to update using GeForce Experience. When that didn't work, I uninstalled the driver using DDU and reinstalled using the installer from the NVIDIA drivers website.


Reliability Monitor:
There we go! I hadn't heard of the Reliability Monitor prior to this. Definitely some information to be gained here. I only have data back to 12/30, but it does look like there's consistent behavior. I'm seeing the same sort of logs as in this post on the Microsoft forums. This does show the "Windows was not properly shut down" event lining up with an informational event:
Microsoft GameInput Successful application reconfiguration

This occurs at every instance of the freeze; however, it seems to occur during the first log-in after the freeze. It also occurs repeatedly during the day with no freezes. According to the shutdown notice, no events line up with the estimated time of shutdown. That said, the time of shutdown in the log does not match the time it actually occurred; sometimes the time is off by hours, other times by 20 minutes.
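
Since the logged shutdown time can't be trusted, one way to pin down when a freeze really starts is a small heartbeat writer. The Python sketch below is only an idea, not something tested in this thread. It assumes that heartbeat files written during the freeze come back as 0kb like every other file, so each beat goes into its own file and the newest non-empty one marks the last known-good moment. The names write_beat and last_good_beat and the directory in the usage comment are hypothetical.

```python
import os
import time


def write_beat(directory, now=None):
    """Write one heartbeat as its own small file and ask the OS to flush it to disk."""
    now = time.time() if now is None else now
    os.makedirs(directory, exist_ok=True)
    path = os.path.join(directory, f"beat_{int(now)}.txt")
    with open(path, "w") as f:
        f.write(time.strftime("%Y-%m-%d %H:%M:%S", time.localtime(now)))
        f.flush()
        os.fsync(f.fileno())
    return path


def last_good_beat(directory):
    """Return the epoch second of the newest non-empty heartbeat file, or None."""
    stamps = []
    for name in os.listdir(directory):
        if name.startswith("beat_") and name.endswith(".txt"):
            if os.path.getsize(os.path.join(directory, name)) > 0:
                stamps.append(int(name[len("beat_"):-len(".txt")]))
    return max(stamps) if stamps else None


# Hypothetical usage, left running in the background:
# while True:
#     write_beat(r"C:\heartbeat")
#     time.sleep(30)
```

After the next reboot, last_good_beat on the same directory gives a lower bound for the start of the freeze, accurate to the sleep interval. One file per beat is used instead of appending to a single log because a single log would itself be truncated to 0kb.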

For now, I'm acting on this lead. Searches led me to this post, which contains most of the proposed solutions to this problem. The first test is to uninstall the Xbox app and GameInput, then remove Gaming Services through an admin PowerShell:
Get-AppxPackage *gamingservices* -allusers | remove-appxpackage -allusers

For users there, it appears that a full re-install of Windows does not help, and the problem exists in both Windows 10 and 11. Looks like it started last summer. I bet I held off on updates until September, which is when my problem started. There is some indication that if the removal of this software doesn't work, it may be tied to something called Ntkrnlmp.exe found in the minidumps in WinDbg Preview. This may be tied to problems with the PSU or how power is being supplied to the CPU. Some users report a new CPU did the trick. So far I don't see any minidump files to check on my machine. There's also some discussion on Reddit about an alternate way of removing the Gaming Services components, which I will try next.

Thanks again for the Reliability Monitor tip!


0kb clarification:
To clarify, any file at all that is modified while the computer is in the frozen state will be 0kb. If I save an open Photoshop document, the scene file will still exist, but be 0kb and have no data. If I close Houdini, autosaved preferences will be 0kb. Any active downloads will be 0kb. Foobar libraries try to save when the software is closed, so those will all be 0kb. During the freeze a save can be started, but it never completes.
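
To see everything that got hit, a short script can list the empty files left behind after a freeze. This is a minimal Python sketch under the assumption that affected files keep a modification time from around the freeze. find_zero_byte_files is a made-up helper name, and the 24-hour window in the usage comment is arbitrary.

```python
import os
import time


def find_zero_byte_files(root, since_epoch):
    """Return (path, mtime) pairs for empty files under root modified at or after since_epoch."""
    hits = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                st = os.stat(path)
            except OSError:
                continue  # unreadable or vanished file; skip it
            if st.st_size == 0 and st.st_mtime >= since_epoch:
                hits.append((path, st.st_mtime))
    # Oldest first, so the first hit approximates when writes started failing.
    return sorted(hits, key=lambda hit: hit[1])


# Hypothetical usage: empty files under the user profile from the last 24 hours.
# for path, mtime in find_zero_byte_files(os.path.expanduser("~"), time.time() - 24 * 3600):
#     print(time.strftime("%Y-%m-%d %H:%M:%S", time.localtime(mtime)), path)
```

The earliest timestamp in the output is a rough estimate of when the freeze began, and the paths show which programs were writing at the time.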

Thanks for the tip on the Task Scheduler. It doesn't appear that anything is amiss, but it was good to turn off some things.


PSU:
7 years old. Not overclocked anywhere, and it shouldn't have been used too heavily on average. Yes, using only the included cables.
I hadn't considered testing it with a multimeter, will certainly give that a go as my next test if the GameInput fix doesn't work. Will need to get a multimeter first. From discussions linked above, it does sound like Ntkrnlmp.exe acting up could be related to it. Thanks for the link!

----------------
Thanks again! These are some great tips, and if it's the PSU I would have never guessed. I'll update if it freezes again.


EDIT:
What do you know, froze again. This time it may have been "Microsoft Desktop AppInstaller" according to the Reliability Monitor.

Installation Successful: Windows successfully installed the following update: 9NBLGGH4NNS1-Microsoft.DesktopAppInstaller

Will do some more research


EDIT 2:
Doubt anyone would check this post to help at this point, updating anyways.

The "minidumps" mentioned above seem to be created on bluescreen error events. Since I'm not having those, I do not have minidumps to check. The whole Ntkrnlmp.exe idea seems to be a bust for my issue.

I tried replacing the graphics card. Downgraded from a 3090 to a 980 Ti. Freeze occurred again. Now it's down to the CPU, motherboard, PSU, operating system, or some individual software.

EDIT 3:
Found some more reports with very similar behaviors on this post. A user links to a post with a possible solution involving setting C-States to disabled in the motherboard's BIOS. The suggestion assumes the problem has something to do with the motherboard not providing enough power to the CPU. I tried it myself and it didn't work in my case. Reliability Monitor only shows the unexpected shutdown, no other events.
 
Jan 23, 2023
It seems I can no longer edit that previous message, so here are some updates.

First, the new CMOS battery and full CMOS reset didn't fix it. I thought they had for a while, but that wasn't it.

Second, it doesn't appear to be a hardware issue. I replaced the PSU, motherboard, and CPU. The problem seemed to be fixed, as it went more than a week without re-occurring. It did occur again today. That marks completely new hardware throughout.

I went from Intel to AMD for CPU, and ASRock to ASUS for the motherboard.

Will next try a new surge protector cable, but it's probably Windows or peripheral related now...
 
Jan 23, 2023
I have not yet, but it's now next on my plan. It sounds like the most likely solution at this point. Hopefully getting the software all back up and running won't be too bad. Will update here with results.
 
Jan 23, 2023
Updating for future searchers:

Tried an in-place reinstall of the OS, a "repair install." That didn't work.

Due to licence restrictions, I couldn't try a clean copy of Windows 10. I also missed the cutoff for new Windows 10 licences. Bummer.

A fresh install of Windows 11 did the trick on both old and new hardware. No repeats of the issue so far.

Thanks USAFRet and Ralston18! Your advice was valuable in finding a solution. The cause still remains a mystery. The problem is in the OS somewhere, but I really have no idea where.