[SOLVED] PC Randomly Shuts Off (Critical Error: Kernel-Power)

HexJK · Apr 10, 2022

Since roughly when I upgraded from Windows 10 to Windows 11, my PC has been randomly shutting off. I should also note that before the issue started, I installed an additional SSD (for game storage) and an HDD (for misc storage); my OS drive has been completely untouched. The PC most commonly shuts off when I'm playing any kind of game, but it also shuts off during less demanding tasks like browsing YouTube in Chrome (I can only consistently reproduce the crash with specific games, like Mount & Blade II on the character creation screen).

I've contacted NZXT support (who I bought the PC through) and they had me update my BIOS and drivers, and it has not worked. At one point my PC wouldn't turn back on, and I had to reset the CMOS battery on my motherboard to get it running again. I've also run every stress test with the OCCT application (as recommended by NZXT), but I simply cannot reproduce the crash (I can only reproduce it with specific games).

Every single crash comes with the same critical error in Event Viewer (with no errors before it), which I'll detail below:
Log Name: System
Source: Microsoft-Windows-Kernel-Power
Date: 4/10/2022 10:24:03 AM
Event ID: 41
Task Category: (63)
Level: Critical
Keywords: (70368744177664),(2)
User: SYSTEM
Computer: DESKTOP-AF0GBF5
Description: The system has rebooted without cleanly shutting down first. This error could be caused if the system stopped responding, crashed, or lost power unexpectedly.

My System Specs:
CPU: Intel Core i9-10850K 10-Core 3.6GHz
GPU: NVIDIA GeForce RTX 2080 SUPER
Motherboard: NZXT N7 Z490 (Wi-Fi)
RAM: Team T-FORCE Delta RGB 3200MHz 16GB (x2)
PSU: NZXT C850 Gold
Cooling: NZXT Kraken M22

I really need help, I cannot wrap my head around this. I just ordered a UPS (Uninterruptible Power Supply) to see if it's my house's power that's the issue (another thing an NZXT "help" article suggested).

Minidump File: https://drive.google.com/file/d/1BAQR47WLQ82WkA_k9vAWFQkh-ioMs4mH/view?usp=sharing

johnbl · Apr 10, 2022

You should put the minidump from the c:\windows\minidump directory onto a cloud server, share the file for guest access, and post a link.
Then someone with the Windows debugger can take a look to see why Windows shut down with a bugcheck (or give you a likely cause of the bugcheck).

The error log you are looking at is generated when the system boots back up after the crash. Most of the time there should also be a .dmp file that indicates the cause of the problem.

You can also download whocrashed.exe or bluescreenview to take a quick look at these memory dumps.

If the machine just goes to a blank screen, you want to check for overheating and remove any overclocking drivers. On an older machine, blow the dust out of the fans and make sure the CPU is being cooled correctly. Generally it is best to have someone look at the memory dumps.
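
To see whether any dump files were written at all, something like the following could help. It is only a rough sketch: it assumes Python is installed, and `list_dumps` is a made-up helper name, not part of any tool mentioned in this thread. It lists the .dmp files in a folder, newest first.

```python
from datetime import datetime
from pathlib import Path


def list_dumps(dump_dir):
    """Return (name, size_bytes, modified) for each .dmp file, newest first."""
    folder = Path(dump_dir)
    if not folder.is_dir():
        return []  # a missing folder usually means no minidump was ever written
    dumps = [p for p in folder.iterdir() if p.is_file() and p.suffix.lower() == ".dmp"]
    dumps.sort(key=lambda p: p.stat().st_mtime, reverse=True)
    return [
        (p.name, p.stat().st_size, datetime.fromtimestamp(p.stat().st_mtime))
        for p in dumps
    ]


if __name__ == "__main__":
    # Default minidump location; reading it may need an elevated prompt.
    for name, size, modified in list_dumps(r"C:\Windows\Minidump"):
        print(f"{modified:%Y-%m-%d %H:%M}  {size:>10}  {name}")
```

If the newest file is older than the last crash, that crash probably did not produce a dump.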

HexJK · Apr 11, 2022

I just added the minidump file to the original post.

johnbl · Apr 11, 2022

Basically, the system bugchecked when one core of your CPU tried to talk to another core and did not get a response. The CPU figures the core is hung and reboots the machine with this bugcheck.

You need to change the memory dump type to kernel and provide the kernel memory dump: c:\windows\memory.dmp

This will show what is running on the second core. Often it turns out to be a plug and play device trying to start up on one core while another core is trying to use the device (the first core fails to load the driver and tries over and over; the second core figures the first core is hung and calls a bugcheck to stop the system).

Go into Windows Control Panel, find Device Manager, and look for devices that are not working or are disabled, then enable them or update their drivers.

The kernel memory dump will have the info for the plug and play system.
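
The dump type is normally changed under System Properties > Startup and Recovery, but the current setting can also be read from the registry. The sketch below assumes Python on Windows; the key and the `CrashDumpEnabled` / `DumpFile` values are the standard CrashControl settings, while the function names are just illustrative. It only reads the setting and changes nothing.

```python
import sys

CRASH_CONTROL_KEY = r"SYSTEM\CurrentControlSet\Control\CrashControl"

# Meanings of the CrashDumpEnabled DWORD value.
DUMP_TYPES = {
    0: "none",
    1: "complete memory dump",
    2: "kernel memory dump",
    3: "small memory dump (minidump)",
    7: "automatic memory dump",
}


def describe_dump_type(value):
    """Translate a CrashDumpEnabled value into a readable name."""
    return DUMP_TYPES.get(value, f"unknown ({value})")


def read_dump_settings():
    """Read the current dump type and dump file path (Windows only)."""
    import winreg  # standard library, but only present on Windows

    with winreg.OpenKey(winreg.HKEY_LOCAL_MACHINE, CRASH_CONTROL_KEY) as key:
        dump_type, _ = winreg.QueryValueEx(key, "CrashDumpEnabled")
        dump_file, _ = winreg.QueryValueEx(key, "DumpFile")
    return describe_dump_type(dump_type), dump_file


if __name__ == "__main__":
    if sys.platform == "win32":
        print(read_dump_settings())
    else:
        print("This check only works on Windows.")
```

A result of "small memory dump (minidump)" would mean only the minidump is being written, which is the case described above.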

HexJK · Apr 11, 2022

Thank you for the response. I don't know how to view or read DMP files, but from looking at my Device Manager, there are 4 devices listed under "Other Devices" with yellow notice icons next to them. Trying to update them doesn't yield any results, but this is what they're called:

~~Base System Device~~
PCI Data Acquisition and Signal Processing Controller
PCI Device
SM Bus Controller

Are these my issue, and how do I go about fixing them?

Edit: I just did an "optional" update and "Base System Device" is no longer showing up.

johnbl · Apr 11, 2022

There is a good chance. These are going to be low-level hardware devices on your motherboard. You should generally go to your motherboard vendor's website and install the drivers they provide (not the utilities).

The bottom of this page has a BIOS update and driver updates.
I would install the Intel chipset update as a first step.
N7 Z490 | NZXT

johnbl · Apr 11, 2022

PCI Data Acquisition and Signal Processing Controller

is most likely Intel thermal management. If you don't find it in the installs from the NZXT website, you can pick it up directly from the Intel support site.
Intel® Driver & Support Assistant

(It is best to get your drivers from your motherboard vendor, in case they are customized for your machine. Also, some drivers depend on the BIOS version installed, so you might update the BIOS as well.)

Note: it looks like you already did the BIOS update. If the drivers are not getting installed, you can try the Intel installer, then check again.

The thermal management drivers might be in the Intel Management Engine installer. Basically, it is a driver that reads the temperature sensors in the CPU and the motherboard.

HexJK · Apr 11, 2022

I've managed to get all the drivers installed, but the crashing persists. This time, however, two new errors from "volmgr" and "WHEA-Logger" came with the critical error. These are all of the errors, in the order they're listed in Event Viewer:

Level	Date and Time	Source	Event ID	Task Category	Description
Error	4/11/2022 10:59:59 AM	WHEA-Logger	18	None	A fatal hardware error has occurred. Reported by component: Processor Core Error Source: Machine Check Exception Error Type: Internal Unclassified Error Processor APIC ID: 5 The details view of this entry contains further information.
Error	4/11/2022 10:59:57 AM	Kernel-EventTracing	28	Provider	Error setting traits on Provider {8444a4fb-d8d3-4f38-84f8-89960a1ef12f}. Error: 0xC0000001
Error	4/11/2022 10:59:59 AM	Eventlog	1101	Event Processing	Audit events have been dropped by the transport. 0
Critical	4/11/2022 10:59:54 AM	Kernel-Power	41	(63)	The system has rebooted without cleanly shutting down first. This error could be caused if the system stopped responding, crashed, or lost power unexpectedly.
Error	4/11/2022 10:59:54 AM	volmgr	161	None	Dump file creation failed due to error during dump creation.
Error	4/11/2022 10:59:59 AM	EventLog	6008	None	The previous system shutdown at 10:31:31 AM on 4/11/2022 was unexpected.
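
Event Viewer lists these rows newest-first and several share a timestamp, so it can help to sort them and pull out the hardware-reported ones. The sketch below is only an illustration: it assumes Python, uses the rows above with descriptions shortened, and the helper names are made up. WHEA stands for Windows Hardware Error Architecture, so a WHEA-Logger entry is an error reported by the hardware itself rather than by a driver.

```python
from datetime import datetime

# Rows copied from the list above (tab-separated, descriptions shortened).
EXPORT = """\
Error\t4/11/2022 10:59:59 AM\tWHEA-Logger\t18\tNone\tA fatal hardware error has occurred.
Error\t4/11/2022 10:59:57 AM\tKernel-EventTracing\t28\tProvider\tError setting traits on Provider.
Error\t4/11/2022 10:59:59 AM\tEventlog\t1101\tEvent Processing\tAudit events have been dropped.
Critical\t4/11/2022 10:59:54 AM\tKernel-Power\t41\t(63)\tThe system has rebooted without cleanly shutting down first.
Error\t4/11/2022 10:59:54 AM\tvolmgr\t161\tNone\tDump file creation failed.
Error\t4/11/2022 10:59:59 AM\tEventLog\t6008\tNone\tThe previous system shutdown was unexpected.
"""


def parse_events(text):
    """Turn tab-separated Event Viewer rows into dictionaries."""
    events = []
    for line in text.strip().splitlines():
        level, when, source, event_id, category, description = line.split("\t", 5)
        events.append({
            "level": level,
            "time": datetime.strptime(when, "%m/%d/%Y %I:%M:%S %p"),
            "source": source,
            "id": int(event_id),
            "category": category,
            "description": description,
        })
    return events


def chronological(events):
    """Oldest first; rows with equal timestamps keep their listed order."""
    return sorted(events, key=lambda e: e["time"])


def hardware_errors(events):
    """Entries logged by WHEA-Logger, i.e. reported by the hardware."""
    return [e for e in events if e["source"] == "WHEA-Logger"]


if __name__ == "__main__":
    for e in chronological(parse_events(EXPORT)):
        print(f'{e["time"]:%H:%M:%S}  {e["source"]:<20} {e["id"]:>5}  {e["description"]}')
    print("Hardware-reported:", [e["id"] for e in hardware_errors(parse_events(EXPORT))])
```

All of these entries are written while the system boots back up, so the timestamps show logging order, not the moment of the crash.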

johnbl · Apr 11, 2022

Basically, it says that it could not create a memory dump, most likely because it could not access the drive for some reason. This happens if the drive is full or there is a problem with the drive controller driver. It also happens on certain bugchecks where the CPU is just too confused to work correctly. In these cases you want to go into the BIOS, reset it to defaults, and reboot. If you are using a SATA drive, check your BIOS for a setting that enables hot swap for the SATA port the drive is using. There are cases where the drive disconnects and cannot reconnect because hot swap is not enabled (then you figure out why it disconnected).

You can also google "how to force a memory dump using a keyboard".
Make the registry changes, let the system run for a few minutes, then force a kernel memory dump of the working system. I can take a look and see if it shows any problems before the system actually crashes.

You could also change the location where the memory dump gets saved to be on another drive.

I would also run crystaldiskinfo.exe to read the SMART data from the drives to see the drive health.
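
For the keyboard-forced dump, the registry change usually meant is the `CrashOnCtrlScroll` value under the keyboard driver's `Parameters` key. The sketch below assumes Python and only builds the text of a .reg file for review; it does not touch the registry, and `forced_dump_reg` is a made-up helper name. Which service applies (`kbdhid` for USB, `i8042prt` for PS/2) depends on the keyboard, so treat the choice as an assumption to check.

```python
# Keyboard driver services that honor CrashOnCtrlScroll.
KEYBOARD_SERVICES = {
    "usb": "kbdhid",    # USB keyboards
    "ps2": "i8042prt",  # PS/2 keyboards
}


def forced_dump_reg(kind="usb"):
    """Build .reg file text that enables the keyboard-forced bugcheck."""
    service = KEYBOARD_SERVICES[kind]
    key = "\\".join([
        "HKEY_LOCAL_MACHINE", "SYSTEM", "CurrentControlSet",
        "Services", service, "Parameters",
    ])
    lines = [
        "Windows Registry Editor Version 5.00",
        "",
        f"[{key}]",
        '"CrashOnCtrlScroll"=dword:00000001',
        "",
    ]
    return "\r\n".join(lines)


if __name__ == "__main__":
    # Review the output, save it as a .reg file, import it, then reboot.
    print(forced_dump_reg("usb"))
```

After importing the file and rebooting, holding the right Ctrl key and pressing Scroll Lock twice should trigger the bugcheck and write the dump.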

HexJK · Apr 11, 2022

My OS is on a SATA drive (Western Digital SN550 1TB), but it's already set to AHCI (which I assume is hotswap), as opposed to "RAID mode". Also, my BIOS is already set to all defaults; I've even tried turning XMP on and back off. As of right now, is my goal to acquire an updated dump file?

johnbl · Apr 11, 2022

Nope, hot swap may be an option in the BIOS for individual SATA ports.
Note that some motherboards also assign special functions to certain SATA ports. You may want to look in your motherboard manual to avoid the special ports.
Hot swap just means that a port might get disconnected while the OS is running.
Windows can detect problems with a drive and reset the SATA port to attempt to fix the problem. This can disconnect the drive, and sometimes a BIOS SATA port setting prevents it from being reconnected. I looked at a machine that had this happen, and 4 hours later some critical process could not run and it bugchecked.

HexJK · Apr 19, 2022

Sorry for the late reply, but I've looked through every single option in my BIOS and nothing is labeled "hotswap". I've googled it and it only comes back as AHCI, which is what's already selected in my BIOS. My PC has gotten worse as well: now it's shutting off after using only Chrome for around an hour. Is there anything else that could be causing it?

johnbl · Apr 19, 2022

For a machine that shuts off while you are using it, I would look for an overheated CPU tripping the motherboard thermal protection circuits. Monitor CPU temps. Remove any CPU overclocking drivers.

The GPU can pull too much power from the motherboard PCIe slot. The motherboard will detect this and reset the CPU to correct the problem. You get a black screen. Sometimes this happens if your GPU has secondary power connections that are not connected or have bad connections. It can also happen if the GPU is overclocked, or if the GPU fans are clogged with dust or are not running at max speed during heavy use.

Check the power voltages in the BIOS and make sure they are correct to within 5%.
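
As a quick worked example of the 5% rule, the sketch below assumes the usual 12 V, 5 V and 3.3 V rails, which gives allowed ranges of 11.40 to 12.60 V, 4.75 to 5.25 V and 3.135 to 3.465 V. The readings in the example are made up for illustration and are not from this machine.

```python
NOMINAL_RAILS = (12.0, 5.0, 3.3)


def allowed_range(nominal, tolerance=0.05):
    """Lowest and highest acceptable voltage for a rail."""
    return nominal * (1 - tolerance), nominal * (1 + tolerance)


def within_tolerance(nominal, measured, tolerance=0.05):
    """True if the measured voltage is inside the allowed range."""
    low, high = allowed_range(nominal, tolerance)
    return low <= measured <= high


if __name__ == "__main__":
    for rail in NOMINAL_RAILS:
        low, high = allowed_range(rail)
        print(f"{rail:>4} V rail: {low:.3f} V to {high:.3f} V")

    # Hypothetical BIOS readings, for illustration only.
    readings = {12.0: 11.8, 5.0: 5.1, 3.3: 3.1}
    for rail, measured in readings.items():
        status = "ok" if within_tolerance(rail, measured) else "out of range"
        print(f"{rail} V rail reads {measured} V: {status}")
```

BIOS readings are taken at idle, so a rail that looks fine there can still sag under gaming load.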

AIO coolers that have USB connectors can hit bugs in the USB subsystem and have issues. These coolers can stop cooling because of this, causing the CPU temps to spike. They can also develop vapor bubbles in the lines.

You can change the memory dump type to kernel, then when you get the next bugcheck provide the file c:\windows\memory.dmp

This is a larger memory dump that also includes the stack traces for all of the CPU cores and the internal logs for the various components. Sometimes these will show what is going on.

Even a minidump will have the bugcheck code, along with the parameters that indicate the cause or which component called the bugcheck. It used to be that these were always called by the CPU, but now any device that has access to the PCIe bus can call this bugcheck. This includes USB devices that you have unplugged from the system (the driver is only hidden when you unplug, not removed).
I have seen AIO cooler USB connections that were sending hundreds of thousands of USB packets each minute cause strange problems (overheating, because the person was worried the cooler was not working, so they had the software check at the max speed).
