Question I'm getting multiple BSODs ?

Jun 16, 2024
I've been having multiple different blue screens for the last couple of months. A lot of them were solved once I found out that my old motherboard and most of my old parts were fried during a surge, but some keep persisting, mostly pointing back to ntoskrnl.exe. I have taken the PC to a tech a few times and he cannot seem to get it to blue screen at all. He had it for almost a week last time running stress tests and had no issues.

I bring it home and a few hours later it has another crash. I have only one original part from the fried motherboard era and that is my CPU, which the tech is certain has no issues. He thinks it might be environmental, but I got an uninterruptible power supply on his recommendation because I'm currently living with a not so good electrical system (landlord ties his electric fence into the same power as the house). As of now I have no idea what is wrong. I'm not really that good at reading minidumps and nothing I do seems to fix it or find a root issue.

The blue screen codes I get the most of recently are:
DPC_WATCHDOG_VIOLATION and IRQL_NOT_LESS_OR_EQUAL with a smattering of SYSTEM_SERVICE_EXCEPTION, SYSTEM_THREAD_EXCEPTION_NOT_HANDLED and a single MEMORY_MANAGEMENT.
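
For anyone cross-referencing these names against the minidumps, here is a minimal sketch that maps the bugchecks listed above to their documented stop codes. The helper names `normalize` and `stop_code` are made up for illustration, and the table only covers the five codes mentioned in this post.

```python
# Stop codes for the five bugchecks named in this post (not a complete table).
BUGCHECK_CODES = {
    "DPC_WATCHDOG_VIOLATION": 0x133,
    "IRQL_NOT_LESS_OR_EQUAL": 0x0A,
    "SYSTEM_SERVICE_EXCEPTION": 0x3B,
    "SYSTEM_THREAD_EXCEPTION_NOT_HANDLED": 0x7E,
    "MEMORY_MANAGEMENT": 0x1A,
}


def normalize(name: str) -> str:
    """Upper-case a name and collapse '-' or repeated '_' into single underscores."""
    parts = name.upper().replace("-", "_").split("_")
    return "_".join(p for p in parts if p)


def stop_code(name: str) -> str:
    """Return the stop code as an 8-digit hex string, or 'unknown' if not in the table."""
    code = BUGCHECK_CODES.get(normalize(name))
    return f"0x{code:08X}" if code is not None else "unknown"


if __name__ == "__main__":
    for name in BUGCHECK_CODES:
        print(name, stop_code(name))
```

The hex value is what shows up in the dump header, so it can be easier to search for than the name.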

My system specs are:
MB: MSI MAG Z790 Tomahawk Max Wifi
CPU: Intel Core i5-12600KF
CPU Cooler: Noctua NH-U12S redux with NF-P12 redux - 1700pwm
RAM: Teamgroup TForce Vulkan Alpha ddr5-5600 32GB (dual channel)
GPU: Nvidia GeForce RTX 3060 Ti
Power: Segotep 750w
Drives: Crucial P3 4TB, Crucial T500 2TB (C drive)

Minidumps:
https://drive.google.com/file/d/1Khk0LYhd0vzl0bcjDl7xryZkvDLsOhbR/view?usp=sharing
https://drive.google.com/file/d/1YSxOYtPOWlrPtHJhaKrM_bJq0SmRWy8M/view?usp=sharing
https://drive.google.com/file/d/1dskVfBivoYiY6vxJ0sJI7ZFcNUB_5zkh/view?usp=sharing
https://drive.google.com/file/d/1_S7G7JU-GcTsXxwEhiSREq-DUu1ldZ_R/view?usp=sharing
https://drive.google.com/file/d/1SXV4S6GPh1whzn1_IXjM4rWnL_VgzeQj/view?usp=sharing
https://drive.google.com/file/d/13qVF9RM0Btrnl_D-RjlMfM1jR-TeUO0A/view?usp=sharing
 
So, you hold down the shift key and restart the computer, select Troubleshoot on the menu that comes up, then Startup settings, and hit the Restart button. That brings up a boot menu from which you can choose Safe Mode with Networking. Did you get a FAT_FILE_SYSTEM error at some point in that sequence?
 
Yes. It happened after hitting the restart button after startup options.
 
I went back into safe mode after checking whether the ctfmon error would pop up in normal mode. The FAT_FILE_SYSTEM BSOD didn't happen this second time but the ctfmon.exe error is still happening.
 
Does msconfig show Normal startup selected in the General tab dialog? Have you manually disabled any services or is everything default?

It's weird but I don't think it has anything to do with the original problem. My guess is there's some dependency for ctfmon.exe that isn't being satisfied in safe mode. You might be able to use procmon to see what the system is doing in regards to ctfmon.exe when the error appears. It can take some effort to figure out what to filter out as far as the output, though.
 
Msconfig shows normal startup. No, I don't have anything disabled. Everything is default.

I still have the diagnostic tool running in safe mode and so far no BSOD. Should I let it run overnight?

Also, do you think I need to be looking at getting a new CPU or is it still too early to call?
 
I wouldn't run it overnight. I was curious to see if the tool would run in safe mode and whether or not it would crash, but there are quite a few differences between safe mode and normal mode. Power management is basically turned off in safe mode and that seems to be where issues can show up in modern CPUs.

I can't say with certainty the CPU is faulty but I'm leaning that way. If you haven't already done so, I'd like you to install and run the Intel Driver and Support Assistant to see if it offers any updates for your system. Some of the drivers loading look a bit outdated, perhaps.
 
@cwsink I need to head to sleep. If you've got anything else you'd like me to do or test then post it and I'll get to it just before I head to work.

Thank you for all the help so far. My buddy, my tech and I have been driving ourselves crazy over this PC for months.

Thank you to all the others in the thread as well. It's been a big relief getting these issues narrowed down more and more. Hopefully it will be solved soon.
 
Recently one of my games has started to give me this "Exception: Access violation. Illegal read" error. I'm unsure if it's related or connected to my BSOD issues or not but figured there's no harm in posting about it. Will put the logs for this here.

Crash Reporter: https://drive.google.com/file/d/1wGxuW38jH6DX-PJYg9puANuxhk4n50Pg/view?usp=sharing
RPT File: https://drive.google.com/file/d/1hhh6T8aZhk-ZrtHqqdbRli6xys5_lUNQ/view?usp=sharing
Game Minidump: https://drive.google.com/file/d/1HuZCiG58D9C3kUb293R0aBz7joNw8nUV/view?usp=sharing
 
It looks like the driver gna.sys was updated but I don't see any other 3rd party driver loading that is newer than those in the earlier dump files. That doesn't seem to have made any difference, unfortunately.

The bugcheck codes are all in the set of those which can happen with faulty memory. Problems paging data in from a drive can look like faulty memory and 6 of the crashes (possibly 7) show a callstack involving NTFS function calls. I'm not seeing any of the other bugcheck codes I would normally expect to see if there was a problem with a drive, though. (KERNEL_DATA_IN_PAGE_ERROR, UNEXPECTED_STORE_EXCEPTION)

All 4 of the most recent dump files show the crash happening on logical core 8 again and would normally make me suspect faulty memory. That it so frequently seems to crash on the same physical core seems like it would be meaningful, to my mind. I'd normally be expecting to see WHEA_UNCORRECTABLE_ERROR or MACHINE_CHECK_EXCEPTION crashes with a faulty Intel CPU (on older architectures, anyway - maybe that's changed with Alder Lake and newer.) If those were showing up I'd be more confident about it being a faulty core but I'm having trouble explaining the frequency of the crashes happening on the same physical core. If it was core 0 then easy enough - that's usually the busiest core on a Windows system. But such a high percentage of crashes happening on the same physical core that isn't core 0 seems highly unlikely.
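
To put a rough number on "highly unlikely", here is a back-of-the-envelope sketch. It assumes each crash lands on one of the 16 logical processors of an i5-12600KF (6 P-cores with two threads each plus 4 E-cores) uniformly at random. That assumption is not realistic, since core 0 tends to be busier as noted above, and the function name is made up for illustration, so treat the result as an order-of-magnitude figure only.

```python
from math import comb


def prob_at_least_k_on_one_core(n: int, k: int, cores: int) -> float:
    """Chance that at least k of n crashes land on one particular core,
    if every crash picked a core uniformly at random (binomial tail)."""
    p = 1.0 / cores
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


LOGICAL_CORES = 16  # assumption: 6 P-cores x 2 threads + 4 E-cores

if __name__ == "__main__":
    # All 4 of the most recent dumps on the same logical core.
    print(prob_at_least_k_on_one_core(4, 4, LOGICAL_CORES))
```

Under that uniform assumption, four out of four crashes on one given logical core works out to (1/16)^4, or about 1 in 65,000. Chance alone is a poor explanation even if the real distribution is far from uniform.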

With Ryzen CPUs you can use Ryzen Master to disable specific cores. Does anyone know if there's a tool that allows that on an Alder Lake CPU? My search results suggest not but it's not something I've ever needed to do on an Intel CPU.
 
I'm fairly certain he's had WHEA errors, but not within the last month. It's also worth noting that these bluescreens seem to have been escalating.

As for the core, it can be done via his UEFI, or via Process Lasso. I believe he's installing Lasso as I write this. He'll update further later.
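
For reference, affinity tools generally work with a bitmask that has one bit per logical processor. Here is a minimal sketch of building a mask that leaves out the suspect core. It assumes 16 logical processors and that logical 9 is the hyper-threading sibling of logical 8, which should be checked on the actual system; the function name is made up for illustration.

```python
def affinity_mask(total_logical: int, excluded: set) -> int:
    """Build a bitmask with one bit per logical processor, leaving the excluded ones cleared."""
    mask = 0
    for cpu in range(total_logical):
        if cpu not in excluded:
            mask |= 1 << cpu
    return mask


if __name__ == "__main__":
    # Assumption: logical processors 8 and 9 share the suspect physical core.
    print(hex(affinity_mask(16, {8, 9})))
```

Keep in mind that process affinity only steers that process's threads. Kernel work such as interrupts and DPCs can still run on the excluded core, so disabling the core in UEFI is probably the cleaner test.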