Question: PC is riddled with different BSODs?


ubuysa

Distinguished
I would be very wary of trying solutions that worked for others. I'd really want to see whether it remains stable in Safe Mode because that will tell us whether to look at hardware or software.
 
Sep 17, 2024
19
0
10
Welp, it just crashed in safe mode with a good ol' PAGE_FAULT_IN_NONPAGED_AREA as I clicked on a program. 1 RAM stick was in.

...Scratch that, I swapped the other stick too and it crashed almost immediately in safe mode, again PAGE_FAULT_IN_NONPAGED_AREA. So it's hardware. 🤷‍♂️
 

ubuysa

Distinguished
That it BSODs in Safe Mode does point at a hardware cause. Is there any chance you could upload the two dumps from those BSODs?

That it BSODs on either RAM stick does probably eliminate the RAM, although that RAM is not on the QVL for the board and CPU and that's always a concern with these sorts of BSODs.

I think as a 'belt-and-braces' test it might be worth downloading a Linux distro (Mint is good) onto a USB drive (on another PC of course). Most Linux distros (like Mint) can run off the USB drive and don't need installing. Try booting a Linux USB then and see whether the system remains stable.
 
Sep 17, 2024
19
0
10
Here's the dumps, I grabbed what was in the folder at the time and right after booting the PC this time, it was hit with a "REGISTRY ERROR" BSOD, and then another PAGE_FAULT_IN_NONPAGED_AREA right when I plugged the USB in. I added those too.

And yeah, unfortunately I am aware that the RAM is not on the MB QVL, but while I was doing the research of what was compatible for this system I saw many people state things like that shouldn't really be an issue, and that similar types are on the QVL so this one should be fine as well etc... I'm still going back to the fact it worked for a year before experiencing these constant BSOD so it makes no sense to me. It's like it was compatible for a year and then randomly decided to not be compatible anymore. 🤷‍♂️

What I'm kinda noticing is that the single stick that's in right now seems to be BSODing more often and more frequently than the other one, but they both still BSOD. Is there a need to swap them anymore, should I add the 2nd one in and keep them both in from now on, or...?
 

ubuysa

Distinguished
The thing about the QVL is that RAM on there has been tested and verified as fully compatible. Other RAM probably will work, but there is no guarantee - and that's a bit of a gamble. It also raises flags for us when you later get BSODs that look RAM related. It may not be an issue at all, but on the other hand we can't just ignore it.

Why now? Components age and their characteristics change very slightly as they do. It's not impossible that the RAM was just about compatible when new but in ageing it's now just out of compatibility.

All that said, the two latest dumps (in Safe Mode I think?) are useful.

One is a 0x51 REGISTRY_ERROR bugcheck, which is interesting. I can see that the failure happened early on in a search of the registry hive (it's just at the "Category" level). The registry hives are loaded from the system drive at boot time and then referenced only in memory from then on. That means there are two potential failure areas here...
  • Bad RAM that is fouling up the registry hive contents loaded into it (we've looked hard at RAM)
  • Bad system drive that is loading corrupted registry data (we haven't looked here at all yet)
There is another feature of this dump that is generally caused by bad RAM but which could also be a bad drive: a checksum mismatch...
Code:
*** WARNING: Check Image - Checksum mismatch - Dump: 0x23298, File: 0x26bf9 - W:\Symbols\crashdmp.sys\F69BF9371e000\crashdmp.sys
What that tells us is that the checksum of the driver in the dump does not match the Microsoft checksum for that driver. That is typically caused by bad RAM corrupting the driver in the dump but could potentially be caused by the driver being corrupted as it's loaded. That driver (crashdmp.sys) is not always loaded and is loaded on demand, so perhaps this is a system drive issue?
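To make the comparison in that warning concrete, here is a minimal Python sketch that pulls the two checksums out of such a line and compares them. The function name `parse_checksum_warning` and the regex are illustrative assumptions based on the single sample line above; this is not part of any debugger API and it only reads the warning text.

```python
import re

# Pattern for the WinDbg "Checksum mismatch" warning text quoted above.
# The regex is an assumption derived from that one sample line.
_WARNING_RE = re.compile(
    r"Checksum mismatch - Dump: (0x[0-9a-fA-F]+), File: (0x[0-9a-fA-F]+) - (.+)$"
)


def parse_checksum_warning(line):
    """Return (dump_checksum, file_checksum, path), or None if the line does not match."""
    match = _WARNING_RE.search(line.strip())
    if match is None:
        return None
    dump_hex, file_hex, path = match.groups()
    return int(dump_hex, 16), int(file_hex, 16), path


line = (r"*** WARNING: Check Image - Checksum mismatch - Dump: 0x23298, "
        r"File: 0x26bf9 - W:\Symbols\crashdmp.sys\F69BF9371e000\crashdmp.sys")
dump_sum, file_sum, image_path = parse_checksum_warning(line)

# The two values differ, which is what the debugger is warning about.
print(hex(dump_sum), hex(file_sum), dump_sum == file_sum)
```

A mismatch by itself does not say whether RAM or the drive corrupted the image; it only confirms that the copy in the dump differs from the reference file.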


The other dump also might point at the system drive. It's a 0x50 PAGE_FAULT_IN_NONPAGED_AREA bugcheck, which means the system referenced memory that should always be resident (it is never paged out to disk) but the address was invalid. Since this happened in Safe Mode we know that this is not a third-party driver error, so bad RAM is a distinct possibility.

However, in the call stack of this dump we can see disk accesses taking place: the ntfs.sys driver is called. We also see the fltmgr.sys driver called. This is the filter driver manager, and usually that might indicate a flaky third-party filter driver, but again we're in Safe Mode, so that's not the case.

We're looking at a bad RAM reference whilst reading (or writing) to the disk and we know this cannot be a bad driver. That means it's either RAM (which we've looked at) or it's a drive (which we've not looked at), and that's probably the system drive since we're in Safe Mode.

Based on these two dumps I would suggest you now look at the system drive (and the boot drive if they are different). That will be the Kingston 250GB NVMe drive I think? Download the Kingston SSD Manager and see whether that reports any issues with the drive, especially with the SMART data, or whether there is a firmware update for it? If that tool has a diagnostic function then use it to test the drive.
 
Sep 17, 2024
19
0
10
Ah, to be clear, 2 of the dumps are in safe mode and different RAM sticks, and 2 from today are not in safe mode (registry error and the last page fault). The minidumps are appearing very weirdly in that folder, I can't tell which one appears when and some get overwritten since only 5 ever show up at once. They don't seem to always be chronologically listed. Some are also 0 bytes so I usually skip those. So I THINK the 2 latest ones are NOT in safe mode. The safe mode ones should be from yesterday.
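For anyone fighting the same confusing folder listing, a small script can put the dumps in order. This is a minimal Python sketch, not a definitive tool: the function names `list_dumps` and `usable_dumps` are made up for illustration, and it assumes the dumps have a `.dmp` extension and that the file modification time reflects when the crash was written.

```python
from datetime import datetime
from pathlib import Path


def list_dumps(folder):
    """Return (modified_time, name, size_in_bytes) for each .dmp file, oldest first."""
    entries = []
    for path in Path(folder).glob("*.dmp"):
        info = path.stat()
        entries.append((datetime.fromtimestamp(info.st_mtime), path.name, info.st_size))
    # Tuples sort by their first element, so this orders by modification time.
    return sorted(entries)


def usable_dumps(folder):
    """Same listing, but without the 0-byte files that contain no crash data."""
    return [entry for entry in list_dumps(folder) if entry[2] > 0]
```

Calling `usable_dumps(r"C:\Windows\Minidump")` from an elevated prompt would list the non-empty dumps from the default minidump folder in chronological order; adjust the path if the dumps are written somewhere else.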

That said, I tried the Linux USB thing, and it might be too soon to say, but it hasn't crashed at all yet and I've been trying to force it in a few ways. I wanna keep it as bare as possible so I'm not connecting it to the internet or installing any programs yet. But it's on a single RAM stick at the moment, the one I mentioned seemed to BSOD more frequently. Now I'm back to thinking it's Windows-related, LOL. Make it make sense, please... 😐

As far as the system drives go, I now have both the 500GB and 250GB NVMEs loaded with Windows, both of which were BSODing. And I've had Hard Disk Sentinel for a while, and the drives were always showing as healthy with no errors really being reported, however after a year the one that originally had the Windows installation (250GB) had its health dropped to about 97% but was still saying it's healthy. The 500GB and the HDD were always on 100%. I can check this further, then.
 
Sep 23, 2024
25
0
30
Having similar issues. Less than 1 year old build: Core i9 13900K, ASUS ROG Z790 Maximus Hero, maxed out RAM with 128GB Corsair Dominator, EVGA 3080, Corsair 1000 watt PSU. Always ran pretty cool. Started freezing, video TDR failure. Replaced GPU with ASUS 4090 TUF, replaced CPU with Core i9 14900K, replaced CPU fan with Corsair liquid cooling, clean install Windows 11. Constant BSOD errors. Replaced SSD, clean install, same thing. Upgraded BIOS, same thing. Replaced mobo and RAM with same as before. Pretty much a whole new system. Same errors after a clean install. I'm at my wits' end. From the 1st build to the rebuild, almost $7k spent and it's worse than ever before. I'm afraid to attempt another build. I tried EVERY "fix" I can find. sfc /scannow always finds corrupt files and repairs them. Even when all seems to be working fine, I cannot run any installer for drivers or software/apps. Upgraded the BIOS on the new mobo and same issues. I see they seem to be pretty common issues. Is this the end?
 

ubuysa

Distinguished
If Linux crashed as well as Safe Mode then your problem is definitely hardware. Unplugging all external devices and removing anything internally that you can do without to give you the most basic system you can is the way to go. You could even remove all drives and test it with just that Linux USB - because we know that's a pristine OS. Eventually, and unfortunately, it will come down to swapping components, and it's not impossible that the motherboard may be the problem.
 
Sep 17, 2024
19
0
10
But there's no way of figuring out what exactly it is without just replacing everything and hoping it works? Like I said before, every part is still under warranty so I can return it, but I can't do it for the whole setup, only individual parts. Based on all the information so far, it's still pointing towards a RAM issue? ...But then it also could be CPU or MB, I guess? If there's any more tests or stuff I could try let me know if there's any way to narrow it down further. A thing I COULD try is get a new RAM set that's actually on the QVL and the CPU is ok with (so I guess I'd better not go above 3200), but making sure that's actually available for me to find will be a pain in the ass... From my understanding, I thought Corsair is the most reliable option, who's better than them for RAM?

What about any BIOS settings I could try messing with? I was thinking I could try manually modifying the RAM settings (like instead of either default 2100 or XMP 3600, set it to around 3200?), but is there anything in BIOS that could potentially alleviate issues like these?
 

ubuysa

Distinguished
Replacing the RAM with QVL supported RAM would be a good place to start. All the RAM vendors on the QVL are good.

You might be able to sell the RAM you have to someone else?
 
Sep 23, 2024
25
0
30
As an added bonus, I can't wake the system if it goes to sleep. The mouse and keyboard don't work unless I do a hard restart, which yields more BSOD errors until Windows starts. Cannot update Windows; the updates won't install. When I type in the search bar, it randomly disappears after a couple of seconds so I have to type very fast and click on a result before it drops out. I can't run installers for the keyboard or mouse drivers. When I open Edge, I get random "page cannot be displayed" errors; I refresh and it's fine. If I play a game, BSOD. Remember, I replaced EVERYTHING! I even tried another clean install of Windows from my own USB and it wouldn't install. This was after a complete format and deleting the partitions. Looks like I'll be replacing everything again.
 
Sep 17, 2024
19
0
10
...Why does no one ever tell you when you run the Memory Diagnostics to pick the EXTENDED option? I was just running it on default settings before and it always passed, but now I found a discussion where it was specifically stated to press F1 once the diagnostics start and pick the Extended option, and if the test freezes on 21% for more than 2 hours, the memory is the issue. Lo and behold I ran this and it's been frozen on pass 1 of 2 at exactly 21% for like 5 hours now. Does this FINALLY determine the RAM is the issue? If so, I'm ordering a new set immediately and replacing it.
Your issue sounds different from mine, because I can pretty much do everything, but when the PC feels like hitting me with a BSOD, it just does. I can install Windows, I can run programs, I can boot into BIOS and past BIOS, but it just sporadically hits me with a BSOD whenever it wants. I'd suggest opening a new thread so you can get specific assistance and suggestions? You'd need to provide more details as well.
 
Sep 23, 2024
25
0
30
Thanks. Your issues were the only thing similar to mine that I have found. They started like yours and they persisted as I started replacing hardware. Now EVERYTHING is new and the same issues are even more abundant. I really have no idea what to do as I can't even diagnose anything without a freeze up and constant BSOD errors. This was a perfectly working machine until a few weeks ago. Now I have 2 rigs both with the same issues, the oldest rig less than 1 year old and the newest is brand new. I tried EVERY fix suggested. Nothing is working.
 
Sep 17, 2024
19
0
10
...Alrighty, so I got a new RAM kit that's actually on the MB QVL and doesn't OC the CPU, tried to do another fully clean Windows install on the 250GB NVMe, and the piece of shit bluescreened again, multiple times. It did so 3 times shortly after the installation and the dumps are all 0 bytes, which I don't really get. Then I tried installing drivers and programs and it BSOD'd when I tried to install the audio driver, and then once more when I ran RGB Fusion. Seeing as how I don't have $7k to blow on 3 new PCs as SOME people do, I need to get to the bottom of this cancer and RMA the actual faulty part while I still can. What's next? 😒
https://www.mediafire.com/file/1zt8kt1bfzoh2m6/Minidumps.7z/file

For some more context, 2 potential culprits, the CPU and MB, both had some weird things going on before I ran into the BSOD walls. The CPU cooler started making grinding noises on its fans for a while before I replaced it, could that have damaged the CPU so severely it became faulty? As for the MB, back when I first installed RGB Fusion and tinkered with the settings before I resorted to OpenRGB, the MB seemed to have permanently absorbed the RGB settings and every time I reinstalled Windows or even cleared CMOS the CPU cooler RGB defaulted to an orange color, a setting I mistakenly added the very first time I used Fusion. I could never reset it back to default A-RGB colors ever since. But CMOS should have at least cleared this, so something funky might be going on with the MB.

Literally what do I do now? I can't RMA everything so I have to figure out what EXACTLY is causing this.
 
remove totalcmd64.exe and retest for failure.
----------------
I looked at memory.dmp from Sept 17.
Any chance your copy of TOTALCMD64.EXE is leaking nonpaged pool?
(I would remove the tool or do a find-string search on KLOG; I will look up the proper command.)

I am not sure who owns this entry
KLOG 66 21483344 18 2731424 UNKNOWN pooltag 'KLOG', please update pooltag.txt

pooltag= KLOG

edit: klog is the windows volume manager
c:\Windows\System32\drivers>findstr /m /l KLOG *.sys
volmgrx.sys
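
What that findstr command does can also be sketched in Python, which is handy if you want to search several tags in one go. This is only a rough equivalent under stated assumptions: the helper name `files_containing` is hypothetical, and like `findstr /m /l` it does a literal, case-sensitive search of the raw file contents and reports only the file names.

```python
from pathlib import Path


def files_containing(folder, tag, pattern="*.sys"):
    """Return sorted names of files in `folder` whose raw bytes contain `tag`."""
    needle = tag.encode("ascii")
    hits = []
    for path in Path(folder).glob(pattern):
        try:
            if needle in path.read_bytes():
                hits.append(path.name)
        except OSError:
            # Skip files that cannot be read (locked or access denied).
            continue
    return sorted(hits)
```

For example, `files_containing(r"C:\Windows\System32\drivers", "KLOG")` would search the same folder as the command above. A hit only means the four characters appear somewhere in the binary, so treat it as a hint about who owns the tag rather than proof.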

edit: https://mskb.pkisolutions.com/kb/298102


22: kd> !poolused /t 10 4
........................
Sorting by Paged Pool Consumed

                NonPaged               Paged
Tag    Allocs       Used   Allocs       Used

MmSt        0          0     6573   28695328  Mm section object prototype ptes , Binary: nt!mm
MmRe        0          0     1853   18653920  ASLR relocation blocks , Binary: nt!mm
CM25        0          0     3092   14446592  Internal Configuration manager allocations , Binary: nt!cm
Ntff       20       7360     8503   11972224  FCB_DATA , Binary: ntfs.sys
FMfn        4       1216    22418    9598144  NAME_CACHE_NODE structure , Binary: fltmgr.sys
RX00       78      54464       15    6922656  ATI video driver
MINI     2330   13172512     1184    5913856  UNKNOWN pooltag 'MINI', please update pooltag.txt
Vi54        0          0      696    5701632  Video memory manager PTE array , Binary: dxgmms2.sys
RvaL        0          0     1692    5191664  UNKNOWN pooltag 'RvaL', please update pooltag.txt
MPsc        0          0     6264    4710528  UNKNOWN pooltag 'MPsc', please update pooltag.txt
Toke        0          0     2411    4320960  Token objects , Binary: nt!se
KLOG       66   21483344       18    2731424  UNKNOWN pooltag 'KLOG', please update pooltag.txt
NtfF        0          0     1664    2662400  FCB_INDEX , Binary: ntfs.sys
CM16        0          0      540    2641920  Internal Configuration manager allocations , Binary: nt!cm
hibr        1         64        1    2281472  UNKNOWN pooltag 'hibr', please update pooltag.txt
SW09       10       7456        2    2105344  Software Bus Enumerator

TOTAL  225349  433522848   244481  172889312
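
To show why KLOG stands out in that listing, here is a minimal Python sketch that parses a few of the rows and re-sorts them by nonpaged bytes. The parser, the regex and the names `parse_poolused` and `SAMPLE` are illustrative assumptions, not a WinDbg feature. Keep in mind the debugger output was sorted by paged pool and cut off after 16 tags, so tags with heavy nonpaged use but little paged use may not appear in it at all.

```python
import re

# A few rows copied from the !poolused output above.
# Columns: tag, nonpaged allocs, nonpaged bytes, paged allocs, paged bytes, description.
SAMPLE = """\
MmSt 0 0 6573 28695328 Mm section object prototype ptes , Binary: nt!mm
Ntff 20 7360 8503 11972224 FCB_DATA , Binary: ntfs.sys
MINI 2330 13172512 1184 5913856 UNKNOWN pooltag 'MINI', please update pooltag.txt
KLOG 66 21483344 18 2731424 UNKNOWN pooltag 'KLOG', please update pooltag.txt
TOTAL 225349 433522848 244481 172889312
"""

ROW_RE = re.compile(r"^(\S+)\s+(\d+)\s+(\d+)\s+(\d+)\s+(\d+)\s*(.*)$")


def parse_poolused(text):
    """Return (rows, total): one dict per tag, plus the TOTAL line if present."""
    rows, total = [], None
    for line in text.splitlines():
        match = ROW_RE.match(line.strip())
        if match is None:
            continue  # headers, blank lines, progress dots
        tag, np_allocs, np_bytes, p_allocs, p_bytes, desc = match.groups()
        row = {
            "tag": tag,
            "nonpaged_allocs": int(np_allocs),
            "nonpaged_bytes": int(np_bytes),
            "paged_allocs": int(p_allocs),
            "paged_bytes": int(p_bytes),
            "description": desc,
        }
        if tag == "TOTAL":
            total = row
        else:
            rows.append(row)
    return rows, total


rows, total = parse_poolused(SAMPLE)
by_nonpaged = sorted(rows, key=lambda r: r["nonpaged_bytes"], reverse=True)
top = by_nonpaged[0]
share = top["nonpaged_bytes"] / total["nonpaged_bytes"]
print(top["tag"], top["nonpaged_bytes"], f"{share:.1%}")
```

Among the listed tags KLOG holds the most nonpaged pool, at roughly 5% of the nonpaged total, which is why it is worth asking who owns it.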
 
Sep 17, 2024
19
0
10
Total Commander? Uhhh I don't think so, it's been bluescreening even before I managed to install TCMD. Unless my MB is literally saving every previous installation of Windows on its hardware and then pulling the software out of its ass, if that's even possible, and then it results in some incompatibilities with later reinstallations, TCMD itself can't be the culprit. That would've been just 1 random instance of it.

What else could those issues point to? I'm pretty convinced it's hardware at this point, but it could be remnants of software in some way that don't clear even when I fully reset everything? Like I said, my MB literally defaulted its RGB settings from way back when I first used RGB Fusion a year ago, even after clearing CMOS.
 
There was also a memory leak reported from rt640x64.sys. I think it is the Realtek network driver; you might check for an update.
22: kd> !wdfdriverinfo rt640x64.sys
----------------------------------
Default driver image name: rt640x64
\SystemRoot\System32\drivers\rt640x64.sys Thu May 23 23:57:12 2024

Unable to load image \SystemRoot\System32\drivers\rt640x64.sys, Win32 error 0n2
WDF library image name: Wdf01000
FxDriverGlobals 0xffff990c31bd1a70
WdfBindInfo 0xfffff8018e7db8c8
Version v1.15
Library module 0xffff990c278b08a0
ServiceName \Registry\Machine\System\CurrentControlSet\Services\Wdf01000
ImageName Wdf01000
----------------------------------


this might be the updated version (at least the file date is newer)
https://www.realtek.com/Download/List?cate_id=584

Just FYI: the debugger also indicates that you have two devices disabled: Bluetooth USB and the mtkwlex service (MediaTek wireless LAN controller).


WARNING: More than 2000 WDF objects detected. Your driver might be leaking objects/memory.

From the memory dump it looks like all of the USB devices were waiting for a wake-up signal, i.e. the USB ports were suspended. You might tell Windows not to suspend the ports.
There were various errors in the USB internal logs.
Plug and Play had a lock, but the physical device was set to zero and indicated a duplicate.
It might be related to the disabled devices trying to load a driver. You might want to install the proper driver and enable the devices.

Note: also look in the BIOS for settings that automatically suspend the USB. This option was required years ago by the EU. Some devices may not wake up correctly. I would tell the BIOS not to suspend the USB to save power, then maybe tell Windows not to suspend certain devices via Device Manager in the Windows Control Panel (it is on one of the tabs for the device).

You might want to disable your USB light and the Gigabyte Control Center utility, just for testing.
 