[SOLVED] PC Randomly Shuts Off (Critical Error: Kernel-Power)

Status
Not open for further replies.
Apr 10, 2022
Ever since roughly when I upgraded from Windows 10 to Windows 11, my PC has been randomly shutting off. I would also like to note that before having this issue, I installed an additional SSD (for game storage) and an HDD (for misc storage); my OS drive has been completely untouched. The PC most commonly shuts off when I'm playing any kind of game, but it also shuts off during less demanding tasks like browsing YouTube on Chrome (I can only consistently reproduce the crash with specific games, like Mount & Blade II on the character creation screen).

I've contacted NZXT support (who I bought the PC through) and they had me update my BIOS and drivers, but that has not helped. At one point my PC wouldn't turn back on, and I had to reset the CMOS battery on my motherboard to get it running again. I've also run every stress test in the OCCT application (as recommended by NZXT), but I simply cannot reproduce the crash that way (I can only reproduce it with specific games).

Every single crash comes with the same critical error in my event viewer (with no errors before it), which I'll detail below:
Log Name: System
Source: Microsoft-Windows-Kernel-Power
Date: 4/10/2022 10:24:03 AM
Event ID: 41
Task Category: (63)
Level: Critical
Keywords: (70368744177664),(2)
User: SYSTEM
Computer: DESKTOP-AF0GBF5
Description: The system has rebooted without cleanly shutting down first. This error could be caused if the system stopped responding, crashed, or lost power unexpectedly.

My System Specs:
CPU: Intel Core i9-10850K 10-Core 3.6GHz
GPU: NVIDIA GeForce RTX 2080 SUPER
Motherboard: NZXT N7 Z490 (Wi-Fi)
RAM: Team T-FORCE Delta RGB 3200MHz 16GB (x2)
PSU: NZXT C850 Gold
Cooling: NZXT Kraken M22

I really need help; I cannot wrap my head around this. I just ordered a UPS (Uninterruptible Power Supply) to see if it's my house's power that's the issue (another thing an NZXT "help" article suggested).

Minidump File: https://drive.google.com/file/d/1BAQR47WLQ82WkA_k9vAWFQkh-ioMs4mH/view?usp=sharing
 
Solution
You should put your minidump from the c:\windows\minidump directory onto a cloud server, share the files for guest access, and post a link.
Then someone with the Windows debugger can take a look to see why Windows shut down with a bugcheck (or give you a likely cause of the bugcheck).

The error log you are looking at is generated when the system boots back up after the crash. Most of the time there should be a .dmp file that indicates the cause of the problem.

You can also download whocrashed.exe or bluescreenview to take a quick look at these memory dumps.

If the machine just goes to a blank screen, check for overheating and overclocking drivers; on an older machine, blow the dust out of the fans and make sure the CPU is being cooled correctly. It's generally best to have someone look at the memory dumps.
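
To see quickly whether Windows actually wrote any dumps, something like the following could list the .dmp files in the minidump folder, newest first. This is only a sketch: the helper name `list_dumps` is made up for illustration, and reading C:\Windows\Minidump may need an elevated prompt.

```python
from datetime import datetime
from pathlib import Path


def list_dumps(dump_dir):
    """Return (name, size_bytes, modified) for each .dmp file, newest first.

    `list_dumps` is a hypothetical helper, not part of any Windows tool.
    """
    folder = Path(dump_dir)
    if not folder.is_dir():
        # No folder usually means no minidump has been written yet.
        return []
    dumps = [p for p in folder.iterdir()
             if p.is_file() and p.suffix.lower() == ".dmp"]
    dumps.sort(key=lambda p: p.stat().st_mtime, reverse=True)
    return [(p.name, p.stat().st_size,
             datetime.fromtimestamp(p.stat().st_mtime)) for p in dumps]


if __name__ == "__main__":
    # Default minidump location mentioned in the post above.
    for name, size, modified in list_dumps(r"C:\Windows\Minidump"):
        print(f"{modified:%Y-%m-%d %H:%M:%S}  {size:>10}  {name}")
```

An empty list after a crash is itself a useful data point, since it suggests the dump was never written.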
 
I just added the minidump file to the original post.
 
Basically, the system bugchecked when one core of your CPU tried to talk to another core and did not get a response. The CPU figures the core is hung and reboots the machine with this bugcheck.

You need to change the memory dump type to kernel and provide the kernel memory dump: c:\windows\memory.dmp

This will show what is running on the second core. Often this turns out to be a plug and play device trying to start up on one core while another core is trying to use the device. (The first core fails to load the driver and tries over and over; the second core figures the first core is hung and calls a bugcheck to stop the system.)

Go into the Windows Control Panel, find Device Manager, and look for devices that are not working or are disabled, then enable them or update the driver.

The kernel memory dump will have the info for the plug and play system.
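
For checking which dump type is currently configured, here is a rough sketch that reads the CrashControl registry key. The `CrashDumpEnabled` and `DumpFile` values are standard Windows settings; the function names are invented for this example, and the registry read only works on Windows.

```python
import sys

# Registry location of the Windows crash dump settings.
CRASH_CONTROL_KEY = r"SYSTEM\CurrentControlSet\Control\CrashControl"

# Meaning of the CrashDumpEnabled DWORD.
DUMP_TYPES = {
    0: "None",
    1: "Complete memory dump",
    2: "Kernel memory dump",
    3: "Small memory dump (minidump)",
    7: "Automatic memory dump",
}


def describe_dump_type(value):
    """Translate a CrashDumpEnabled value into a readable name."""
    return DUMP_TYPES.get(value, f"Unknown ({value})")


def read_dump_settings():
    """Windows only: return (dump type, dump file path) from the registry."""
    import winreg  # imported here so the module still loads on other systems
    with winreg.OpenKey(winreg.HKEY_LOCAL_MACHINE, CRASH_CONTROL_KEY) as key:
        enabled, _ = winreg.QueryValueEx(key, "CrashDumpEnabled")
        dump_file, _ = winreg.QueryValueEx(key, "DumpFile")
    return describe_dump_type(enabled), dump_file


if __name__ == "__main__":
    if sys.platform == "win32":
        print(read_dump_settings())
    else:
        print("CrashControl settings can only be read on Windows.")
```

The setting itself is normally changed through System Properties > Advanced > Startup and Recovery rather than by editing the registry by hand.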
 
Thank you for the response. I don't know how to view or read DMP files, but from looking at my Device Manager, there are 4 devices listed under "Other Devices" with yellow notice icons next to them. Trying to update them doesn't yield any results, but this is what they're called:
  • Base System Device
  • PCI Data Acquisition and Signal Processing Controller
  • PCI Device
  • SM Bus Controller
Are these my issue, and how do I go about fixing them?

Edit: I just did an "optional" update and "Base System Device" is no longer showing up.
 
There is a good chance. These are going to be low-level hardware devices on your motherboard. You should generally go to your motherboard vendor's website and install the drivers they provide (not the utilities).

The bottom of this page has a BIOS update and driver updates.
I would most likely install the Intel chipset update as a first step.
N7 Z490 | NZXT
 
  • PCI Data Acquisition and Signal Processing Controller
is most likely Intel thermal management. If you don't find it among the installs from the NZXT website, you can pick it up directly from the Intel support site.
Intel® Driver & Support Assistant

(It's best to get your drivers from your motherboard vendor, in case they are custom to your machine. Also, some drivers depend on the BIOS version installed, so you might update the BIOS as well.)

Note: it looks like you already did the BIOS update. If the drivers are not getting installed, you can try the Intel installer, then check again.

The thermal management drivers might be in the Intel Management Engine installer. Basically, it's a driver to read the temp sensors in the CPU and the motherboard.
 
I've managed to get all the drivers installed, but the crashing still persists. This time, however, 2 new errors called "volmgr" and "WHEA-Logger" came with the critical error. These are all of the errors in the order they're listed in the event viewer:
Level | Date and Time | Source | Event ID | Task Category | Description
Error | 4/11/2022 10:59:59 AM | WHEA-Logger | 18 | None | A fatal hardware error has occurred. Reported by component: Processor Core. Error Source: Machine Check Exception. Error Type: Internal Unclassified Error. Processor APIC ID: 5. The details view of this entry contains further information.
Error | 4/11/2022 10:59:57 AM | Kernel-EventTracing | 28 | Provider | Error setting traits on Provider {8444a4fb-d8d3-4f38-84f8-89960a1ef12f}. Error: 0xC0000001
Error | 4/11/2022 10:59:59 AM | Eventlog | 1101 | Event Processing | Audit events have been dropped by the transport. 0
Critical | 4/11/2022 10:59:54 AM | Kernel-Power | 41 | (63) | The system has rebooted without cleanly shutting down first. This error could be caused if the system stopped responding, crashed, or lost power unexpectedly.
Error | 4/11/2022 10:59:54 AM | volmgr | 161 | None | Dump file creation failed due to error during dump creation.
Error | 4/11/2022 10:59:59 AM | EventLog | 6008 | None | The previous system shutdown at 10:31:31 AM on 4/11/2022 was unexpected.
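
The entries above are not listed in time order, so it may help to sort them by timestamp to see which were logged first after the reboot. A minimal sketch, with the timestamps, sources, and event IDs copied from the list above (the function name `sort_events` is just for illustration):

```python
from datetime import datetime

# (timestamp, source, event ID) copied from the event list above.
EVENTS = [
    ("4/11/2022 10:59:59 AM", "WHEA-Logger", 18),
    ("4/11/2022 10:59:57 AM", "Kernel-EventTracing", 28),
    ("4/11/2022 10:59:59 AM", "Eventlog", 1101),
    ("4/11/2022 10:59:54 AM", "Kernel-Power", 41),
    ("4/11/2022 10:59:54 AM", "volmgr", 161),
    ("4/11/2022 10:59:59 AM", "EventLog", 6008),
]

TIME_FORMAT = "%m/%d/%Y %I:%M:%S %p"


def sort_events(events):
    """Return events oldest first; entries with equal timestamps keep their order."""
    return sorted(events, key=lambda e: datetime.strptime(e[0], TIME_FORMAT))


if __name__ == "__main__":
    for stamp, source, event_id in sort_events(EVENTS):
        print(f"{stamp}  {source:<20} {event_id}")
```

All of these timestamps are from the boot after the crash; only the 6008 entry refers back to the time of the unexpected shutdown itself (10:31:31 AM).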
 
Basically, it says that it could not create a memory dump, most likely because it could not access the drive for some reason. This happens if the drive is full or there is a problem with the drive controller driver. It will also happen on certain bugchecks where the CPU is just too confused to work correctly. In these cases you want to go into the BIOS, reset it to defaults, and reboot. If you are using a SATA drive, check your BIOS for a setting that enables hotswap for the SATA port the drive is using. There are cases where the drive disconnects and cannot reconnect because hot swap is not enabled. (Then you figure out why it disconnected.)

You can also google "how to force a memory dump using a keyboard".
Make the registry changes, then force a kernel memory dump of the working system: let it run a few minutes and force the dump. I can take a look and see if it shows any problems before the system actually crashes.
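
For reference, the registry change in question is the `CrashOnCtrlScroll` value under the keyboard driver's Parameters key (kbdhid for USB keyboards, i8042prt for PS/2). The sketch below only builds the text of a .reg file so it can be reviewed before importing; the function name and output file name are made up for this example.

```python
# Services key that holds the keyboard driver settings.
SERVICES_KEY = r"HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services"

# kbdhid = USB keyboards, i8042prt = PS/2 keyboards.
KEYBOARD_SERVICES = ("kbdhid", "i8042prt")


def build_reg_file(services=KEYBOARD_SERVICES):
    """Return .reg file text that enables the keyboard-initiated crash."""
    lines = ["Windows Registry Editor Version 5.00", ""]
    for service in services:
        lines.append(f"[{SERVICES_KEY}\\{service}\\Parameters]")
        lines.append('"CrashOnCtrlScroll"=dword:00000001')
        lines.append("")
    return "\r\n".join(lines)


if __name__ == "__main__":
    # Hypothetical output name; import it by double-clicking, then restart.
    with open("enable_keyboard_crash.reg", "w", newline="") as fh:
        fh.write(build_reg_file())
```

After importing the file and restarting, holding the right Ctrl key and pressing Scroll Lock twice should trigger the bugcheck and write the dump. It is worth saving any open work first, since this really does crash the system.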

You could also change the location where the memory dump gets saved to be on another drive.

I would also run crystaldiskinfo.exe to read the SMART data from the drives and check the drive health.
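
Since a full drive is one of the reasons a dump can fail to be written, a quick free-space check on the dump drive is easy to script. This is a sketch only: the helper name is invented, and the 32 GB figure is an assumption based on the installed RAM in the specs above (a kernel dump is usually much smaller than total RAM).

```python
import shutil


def free_space_report(path, needed_bytes):
    """Report total/free bytes for the drive holding `path`.

    `needed_bytes` is whatever the caller considers a safe margin
    for the dump file; it is an assumption, not a Windows requirement.
    """
    usage = shutil.disk_usage(path)
    return {
        "total": usage.total,
        "free": usage.free,
        "enough": usage.free >= needed_bytes,
    }


if __name__ == "__main__":
    # Assumed upper bound: 32 GB, matching the installed RAM.
    report = free_space_report(".", 32 * 1024**3)
    print(f"free: {report['free'] / 1024**3:.1f} GiB, enough: {report['enough']}")
```

Pointing `path` at the drive that holds the dump file (C:\ by default) gives the number that matters here.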
 
My OS is on a SATA drive (Western Digital SN550 1TB), but it's already set to AHCI (which I assume is hotswap), as opposed to "RAID mode". Also, my BIOS is already set to all defaults; I've even tried turning XMP on and back off. As of right now, is my goal to acquire an updated dump file?
 
Nope, hotswap may be an option in the BIOS for individual SATA ports.
Note that some motherboards assign special functions to certain SATA ports. You may want to look in your motherboard manual to avoid the special ports.
Hotswap just means that a port might get disconnected while the OS is running.
Windows can detect problems with a drive, and it can reset the SATA port to attempt to fix the problem. This can disconnect the drive, but sometimes a BIOS SATA port setting can prevent it from being reconnected. I looked at a machine that had this happen, and 4 hours later some critical process could not run and it bugchecked.
 
Sorry for the late reply, but I've looked through every single option in my BIOS and nothing is labeled "hotswap". I've googled it and it only comes back as AHCI, which is what's already selected in my BIOS. My PC has gotten worse as well: now it's shutting off from using Chrome by itself for around an hour. Is there anything else that could be causing it?
 
For a machine that shuts off while you are using it, I would look for an overheated CPU tripping the motherboard thermal protection circuits. Monitor CPU temps. Remove any CPU overclocking drivers.

The GPU can pull too much power from the motherboard PCIe slot. The motherboard will detect this and reset the CPU to correct the problem. You get a black screen. Sometimes this happens if your GPU has secondary power connections that are not connected or have bad connections. It can happen if the GPU is overclocked, or if the GPU fans are clogged with dust or are not running at max speed during heavy use.

Check the power voltages in the BIOS and make sure they are within 5% of their correct values.
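
The 5% check is simple arithmetic: for the +12V rail, for instance, the acceptable window is 11.4 V to 12.6 V. A small sketch of that calculation follows; the readings in the example are made up, and the rail names and helper name are assumptions for illustration, not values from this machine.

```python
# Nominal voltages of the main PSU rails shown in most BIOS hardware monitors.
NOMINAL_RAILS = {"+12V": 12.0, "+5V": 5.0, "+3.3V": 3.3}


def check_rails(readings, tolerance=0.05):
    """Return {rail: (deviation in percent, within tolerance?)}."""
    results = {}
    for rail, measured in readings.items():
        nominal = NOMINAL_RAILS[rail]
        deviation = (measured - nominal) / nominal
        results[rail] = (round(deviation * 100, 2), abs(deviation) <= tolerance)
    return results


if __name__ == "__main__":
    # Hypothetical readings typed in from the BIOS hardware monitor page.
    sample = {"+12V": 11.3, "+5V": 5.1, "+3.3V": 3.3}
    for rail, (percent, ok) in check_rails(sample).items():
        print(f"{rail}: {percent:+.2f}%  {'OK' if ok else 'OUT OF RANGE'}")
```

In this made-up sample the +12V rail is about 5.8% low, which is the kind of reading that would point toward the power supply.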

AIO coolers that have USB connectors can hit bugs in the USB subsystem and have issues. These coolers can stop cooling because of this, causing the CPU temps to spike. They can also develop vapor bubbles in the lines.

You can change the memory dump type to kernel, then when you get the next bugcheck provide the file c:\windows\memory.dmp

This is a larger memory dump that will also include the stack traces for all of the CPU cores and all of the internal logs for the various components. Sometimes these will show what is going on.

Even a minidump will have the bugcheck code, along with the parameters that indicate the cause or which component called the bugcheck. It used to be that these were always called by the CPU, but now any device that has access to the PCIe bus can call this bugcheck. This includes USB devices that you have unplugged from the system. (The driver is only hidden when you unplug, not removed.)
I have seen AIO cooler USB connections that were sending hundreds of thousands of USB packets each minute cause strange problems. (The system overheated because the person was worried the cooler was not working, so they had the software check at the max speed.)
 