Question: PC is riddled with different BSODs?

Sep 17, 2024
5
0
10
I'll try to make my post short but also give as many details as I can think of.

My PC specs:

  • AM4 Gigabyte B550 AORUS ELITE AX V2
  • AMD Ryzen 5900X
  • Arctic Liquid Freezer 2 240 A-RGB
  • 2x16GB Corsair LPX Vengeance DDR4 3600 C18
  • ASUS ROG STRIX 1000G 1000 W
  • Sapphire Nitro+ RX 6800XT
  • Kingston A2000 M.2 NVME 250GB + Kingston A2000 M.2 NVME 500GB
  • Toshiba P300 7200RPM 3TB HDD
  • Windows 10 Pro 22H2
The PC was assembled around April last year, and I would say everything was fine except for 1 thing that I can recall. It would randomly freeze up after showing the spinning mouse circle, usually when browsing the Internet but sometimes when opening random programs too. After spinning for a while, the whole PC would freeze, taking Windows Explorer down with it, and then come back to life after about a minute.

This occurred every once in a while at first, then more often after some months. BSODs eventually popped up on rare occasions, and then they became more frequent as well about 3-4 months ago. That's when I started looking up ways to try and resolve the issue, since it's a self-built PC with parts from all over the place.

The BSODs I've experienced so far are as follows, from most to least frequent:

  • PAGE_FAULT_IN_NONPAGED_AREA
  • IRQL_NOT_LESS_OR_EQUAL
  • SYSTEM_THREAD_EXCEPTION_NOT_HANDLED
  • SYSTEM_SERVICE_EXCEPTION
  • KMODE_EXCEPTION_NOT_HANDLED
  • DRIVER_IRQL_NOT_LESS_OR_EQUAL
  • CLOCK_WATCHDOG_TIMEOUT
  • ATTEMPTED_WRITE_TO_READONLY_MEMORY
  • CACHE_MANAGER
  • DPC_WATCHDOG_VIOLATION
And these files were only sometimes listed as the cause of the crash:

  • fltmgr.sys
  • ntfs.sys
  • amdkmdag.sys
  • ntkrnlmp.exe
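
For anyone cross-referencing the stop names listed above with what dump tools and Event Viewer report, here is a small lookup sketch. The hex values are the standard Windows bug check codes for those names; the helper names (`format_stop_code`, `name_for_code`) are only illustrative and not part of any tool.

```python
# Bug check (stop) codes for the BSOD names listed in this thread.
STOP_CODES = {
    "PAGE_FAULT_IN_NONPAGED_AREA": 0x50,
    "IRQL_NOT_LESS_OR_EQUAL": 0x0A,
    "SYSTEM_THREAD_EXCEPTION_NOT_HANDLED": 0x7E,
    "SYSTEM_SERVICE_EXCEPTION": 0x3B,
    "KMODE_EXCEPTION_NOT_HANDLED": 0x1E,
    "DRIVER_IRQL_NOT_LESS_OR_EQUAL": 0xD1,
    "CLOCK_WATCHDOG_TIMEOUT": 0x101,
    "ATTEMPTED_WRITE_TO_READONLY_MEMORY": 0xBE,
    "CACHE_MANAGER": 0x34,
    "DPC_WATCHDOG_VIOLATION": 0x133,
}


def format_stop_code(name: str) -> str:
    """Return the code zero-padded to eight hex digits, e.g. 0x00000050."""
    return f"0x{STOP_CODES[name]:08X}"


def name_for_code(code: int) -> str:
    """Reverse lookup: map a numeric bug check code back to its name."""
    for name, value in STOP_CODES.items():
        if value == code:
            return name
    return "UNKNOWN"


if __name__ == "__main__":
    for stop_name in STOP_CODES:
        print(f"{format_stop_code(stop_name)}  {stop_name}")
```

Having the numeric form handy makes it easier to match a crash reported only as a hex code against the list above.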
What I've tried so far:

  • Reinstalled Windows about 3-4 times now, on both the 250GB NVME and most recently the 500GB one, which is where I'm currently at. The BSODs occurred even during some of the earlier Windows installations, so that didn't fix it.
  • Removed and swapped around my RAM sticks and tested them with Memtest86 and Windows Memory Diagnostic. The BSODs kept occurring, and the tests didn't report a single issue no matter how many passes they ran.
  • Removed the HDD and NVMEs, BSODs kept occurring.
  • Updated BIOS with the newest update available on the MB site, BSODs kept occurring.
  • Downloaded every single bit of driver software I could scrounge up from both the specific manufacturer and MB sites, BSODs kept occurring.
  • Cleared CMOS on the MB, and that came the closest to a fix, since BSODs were barely occurring shortly after I did this. Then they started occurring again, so this didn't work either.
The BSODs occur most often when Windows is booting, either during the spinning circle on the BIOS Del/F12 logo screen, or on the Windows welcome page, or just shortly after Windows has started. But they also sometimes occur randomly when installing something (most recently I noticed it with audio drivers and Gigabyte's RGB Fusion) or just when opening any program on the PC.

Now as far as crash dumps go, the PC only keeps 5 files, and sometimes it created several at once so older ones got overwritten. I have 2 sets of them: one from 2 months ago, when I first reinstalled Windows on the 250GB NVMe SSD and it BSODed a million times during that process (this set is on the 250GB NVME that I currently have removed, let me know if it's needed as well), and one from the most recent installation on the 500GB NVME. I just threw them all in and hopefully they show absolutely anything worthwhile. https://www.mediafire.com/file/48tu5s3sb23xe4e/Minidumps.7z/file
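
Since older dumps get overwritten, one way to keep them all is to copy them somewhere else after each crash. The following is a minimal sketch, assuming the dumps sit in the default `C:\Windows\Minidump` folder; the function name `backup_minidumps` and the backup folder are hypothetical choices, and reading that folder usually needs an elevated prompt.

```python
import shutil
import time
from pathlib import Path


def backup_minidumps(src: Path, dest: Path) -> list[Path]:
    """Copy every *.dmp file from src into dest without overwriting.

    Each copy is prefixed with the dump's modification time so that
    files from different crashes never collide. Returns the new copies.
    """
    dest.mkdir(parents=True, exist_ok=True)
    copied = []
    for dump in sorted(src.glob("*.dmp"), key=lambda p: p.stat().st_mtime):
        stamp = time.strftime(
            "%Y%m%d-%H%M%S", time.localtime(dump.stat().st_mtime)
        )
        target = dest / f"{stamp}_{dump.name}"
        if not target.exists():  # already backed up on an earlier run
            shutil.copy2(dump, target)  # copy2 preserves the timestamps
            copied.append(target)
    return copied
```

A call such as `backup_minidumps(Path(r"C:\Windows\Minidump"), Path.home() / "MinidumpBackup")` could be run after every BSOD; running it twice in a row copies nothing the second time.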

I'm going insane, because even though I still have warranty on all the parts for another 6 months or so, I have absolutely no idea what the issue even is, so I can't return just that 1 specific part in the hope that it resolves it. If anyone has any clue what the problem is and what to do about it, please let me know, since the only other last resort I have is taking it to a random repair shop, but I have no idea what they'll do and how much worse it'll end up being. Lemme know if anything else is needed.
 
Try the built in Windows troubleshooters - the troubleshooters may find and fix something.

Also run "dism" and "sfc /scannow".

https://www.windowscentral.com/how-use-dism-command-line-utility-repair-windows-10-image

https://www.lifewire.com/how-to-use-sfc-scannow-to-repair-windows-system-files-2626161

Failing that then delve into Reliability History/Monitor and Event Viewer.
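
The "dism" and "sfc /scannow" suggestion above can be sketched as a fixed command sequence. This is only an illustration under assumptions: the DISM switches shown are the commonly used `/Online /Cleanup-Image` health options, the order (DISM first, then SFC) is a common recommendation rather than a requirement, and `run_repairs` is a hypothetical helper that defaults to a dry run. Actually executing the commands needs an elevated prompt on Windows.

```python
import platform
import subprocess

# Commonly used repair sequence: check and repair the component store
# with DISM, then let SFC verify the protected system files.
REPAIR_COMMANDS = [
    ["DISM", "/Online", "/Cleanup-Image", "/ScanHealth"],
    ["DISM", "/Online", "/Cleanup-Image", "/RestoreHealth"],
    ["sfc", "/scannow"],
]


def run_repairs(dry_run: bool = True) -> list[str]:
    """Return the command lines in order.

    The commands are only executed when dry_run is False and the
    script is running on Windows.
    """
    lines = []
    for cmd in REPAIR_COMMANDS:
        lines.append(" ".join(cmd))
        if not dry_run and platform.system() == "Windows":
            subprocess.run(cmd, check=False)
    return lines


if __name__ == "__main__":
    for line in run_repairs():  # dry run: just print what would be run
        print(line)
```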
Ah, I did the DISM and sfc /scannow commands too, forgot to mention that, but on the old NVME 2 months ago. From what I recall, the process ended by saying "corrupt files were fixed" and I still kept having BSODs afterwards. I can try again now on this installation, but it seems this is not the fix.

As for the reliability monitor, I'm looking at it now and it's telling me the times it shut down unexpectedly and that it created a MEMORY.dmp file which is about 1GB in size. Should I zip this up and upload it too?

As for Event Viewer, sorry to say I'm not sure what I'm looking at there.

EDIT: Did the DISM, showed no problems at all, and SFC just said "found corrupt files and fixed them". It said the same before and I still had BSODs so it's likely not this.

And here's the Memory.dmp file in case it means anything. https://www.mediafire.com/file/sdsgq17k4ynzaku/MEMORY.7z/file
 

ubuysa

Distinguished
I've looked at the dumps from 17th and 18th and this looks very much like a RAM problem to me. Test your RAM with Memtest86...
  1. Download Memtest86 (free), use the imageUSB.exe tool extracted from the download to make a bootable USB drive containing Memtest86 (1GB is plenty big enough). Do this on a different PC if you can, because you can't fully trust yours at the moment.
  2. Then boot that USB drive on your PC, Memtest86 will start running as soon as it boots.
  3. If no errors have been found after the four iterations of the 13 different tests that the free version does, then restart Memtest86 and do another four iterations. Even a single bit error is a failure.
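
To illustrate why a memory tester runs several different tests and treats even one flipped bit as a failure, here is a toy simulation. It is only a sketch: it works on a Python `bytearray`, cannot exercise physical RAM, and does not reproduce Memtest86's actual algorithms. The `fault` parameter is a hypothetical way to simulate a bit stuck at 1.

```python
def count_bit_errors(buffer: bytearray, pattern: int) -> int:
    """Count the bits in buffer that differ from the expected byte pattern."""
    return sum(bin(byte ^ pattern).count("1") for byte in buffer)


def run_pattern_test(size, patterns=(0x00, 0xFF, 0xAA, 0x55), fault=None):
    """Write each pattern, read it back, and pass only with zero bit errors.

    fault is an optional (index, bitmask) pair simulating a bit that is
    stuck at 1 regardless of what was written.
    """
    for pattern in patterns:
        buffer = bytearray([pattern]) * size
        if fault is not None:
            index, mask = fault
            buffer[index] |= mask  # the stuck bit always reads back as 1
        if count_bit_errors(buffer, pattern) != 0:
            return False  # a single wrong bit fails the whole run
    return True
```

In this simulation a bit stuck at 1 is invisible under the 0xFF pattern and only shows up when a pattern with a 0 in that position is written, which is one reason a single test pattern is not enough.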
 
Thanks for looking into it, and the more BSOD errors that were popping up the more I thought it would be a RAM issue too, but something just feels different. Especially since I tested the PC with both sticks together and separately (they're a kit) and both Memtest86 and Memory Diagnostic didn't show even a single error when I ran those. I also ran XMP from when the PC was assembled last year, but since reinstalling Windows I have had it turned off and it's still giving BSODs. The frequency of them doesn't really change whether XMP is on or off, if that means anything.

I can try Memtest again multiple times, and if there are still no errors showing on any combination of the sticks, how trustworthy is that? I just need to be absolutely certain they're the problem before I have to replace them. Unfortunately I don't have any other ones to test the configuration with at the moment, either (to see if issues persist with a new set).
 

ubuysa

Distinguished
If Memtest86 runs clean twice then remove the RAM overclock (DOCP/XMP) and run the RAM at its native (SPD) speed. See whether everything is stable then.

It's also worth noting that the maximum guaranteed RAM clock speed for that CPU is 3200 MHz (see https://www.amd.com/en/products/processors/desktops/ryzen/5000-series/amd-ryzen-9-5900x.html). Typically most CPUs can safely exceed this speed - but stability is not guaranteed if you do.
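
For context on what these speed ratings mean, here is a short arithmetic sketch comparing the kit in this thread (DDR4 3600 C18) with a 3200 rating. It assumes the usual DDR convention that the advertised figure is the transfer rate and the real memory clock is half of it; the function names are illustrative.

```python
def memory_clock_mhz(transfer_rate: int) -> float:
    """DDR moves data twice per clock, so the clock is half the rated figure."""
    return transfer_rate / 2


def cas_latency_ns(transfer_rate: int, cas_cycles: int) -> float:
    """First-word CAS latency in nanoseconds: cycles divided by the clock."""
    return cas_cycles * 2000 / transfer_rate


if __name__ == "__main__":
    for rate, cl in [(3600, 18), (3600, 16), (3200, 16)]:
        print(
            f"DDR4-{rate} CL{cl}: clock {memory_clock_mhz(rate):.0f} MHz, "
            f"CAS latency {cas_latency_ns(rate, cl):.2f} ns"
        )
```

By this arithmetic a 3600 C18 kit and a 3200 CL16 kit have the same CAS latency in nanoseconds, so dropping to the rated 3200 speed mainly costs bandwidth rather than latency.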

You might also want to download the SysnativeBSODCollectionApp and save it to the Desktop. Then run it and upload the resulting zip file to a cloud service with a link to it here. The SysnativeBSODCollectionApp collects all the troubleshooting data we're likely to need. It DOES NOT collect any personally identifying data. It's used by several highly respected Windows help forums (including this one). I'm a senior BSOD analyst on the Sysnative forum where this tool came from, so I know it to be safe.

You can of course look at what's in the zip file before you upload it, most of the files are txt files. Please don't change or delete anything though. If you want a description of what each file contains you'll find that here.
 
.........MAN, ok that really pisses me off now, I spent like 2 months before I bought it trying to figure out all the configurations of every component I was assembling and everything turned out compatible with each other, but I didn't see that specific bit of information. 🤦‍♂️ Thanks for pointing it out, now I'm starting to believe it's the RAM after all. It's like the XMP was slowly killing it over the course of the previous year and now it's become too corrupted or something. Since the Memtest that I did showed no errors, it's making me think the sticks themselves are still fine, but their connection with my system (either the CPU or MB) is what's corrupted. You'd think since I turned the XMP back to default it would run fine again but there's still BSODs so it's just weird.

I'm actually seeing so many people online state that 3600MHz CL16 is like a "sweet spot" for the 5900X so I'm just even more confused. It's like a roulette at this point. :rolleyes:

I'll try some more Memtests then and Sysnative soon, but it's looking like I need to consider getting a 3200MHz set instead to comply with the CPU. 🤷‍♂️
1. Have you run a chkdsk on the drive?
2. Is anything overclocked ?
3. Does it do it in windows Safe Mode?
1. I did, can't remember at which point exactly but it didn't result in any errors.
2. Other than XMP being on for a year before the first Windows reinstallation, no. The GPU MIGHT be pre-OC'd though, but there's no way for me to confirm since the sellers of that specific GPU at that time all had "OC" in its description.
3. I haven't checked, but it's doing it so sporadically now that I wouldn't know how to try and "trigger" it at this point. I could try doing a few restarts in succession and see if anything pops up.
 

KingLoki

Great
Jul 10, 2024
101
12
85
If the GPU is factory overclocked it shouldn't cause a problem. XMP has been OK, so there's no reason for it to cause a random problem unless the RAM becomes faulty. Being at random times does make it harder to nail down. You have had help from others on here and no luck. Could easily be a faulty driver, a Windows service process at the time or even possibly a Windows update causing it ??
 
Annnnnd it BSODed during the Sysnative process, PAGE_FAULT_IN_NONPAGED_AREA - https://www.mediafire.com/file/4zld713z9y5698c/Dump.7z/file

I'll finish backing up my stuff in preparation for the worst case scenario then continue the Sysnative process and then some Memtests. In the meantime if anyone figures something out let me know and I'll get back to you with the testing stuff. Thank you for the assistance so far!
Could easily be a faulty driver, a Windows service process at the time or even possibly a Windows update causing it ??
I did go through pretty much every single driver I have installed and got the latest updates, including Windows ones. I have no idea which ones might be causing issues anymore, if any.
 

Ralston18

Titan
Moderator
Will defer to @ubuysa et al. regarding memory dumps and RAM.

If possible take a couple of screenshots of the Reliability History/Monitor window.

Expand the window so all can be seen. Select a couple of representative dates that show errors, warnings, etc.

Take the screenshots and post them here via imgur (www.imgur.com).

= = = =

As for Event Viewer: it does require more time and effort to navigate and understand.

However, the information can be very helpful and clicking any given entry can provide further details. The details may or may not be helpful.

FYI:

How To - How to use Windows 10 Event Viewer | Tom's Hardware Forum (tomshardware.com)
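
If navigating Event Viewer by hand is daunting, the same System log can be filtered from the command line with the built-in `wevtutil` tool. The sketch below only builds the command; `build_event_query` is a hypothetical helper, and the event IDs are ones commonly checked after crashes: 41 (Kernel-Power, the system rebooted without shutting down cleanly), 1001 (BugCheck) and 6008 (unexpected shutdown).

```python
def build_event_query(event_ids, max_events=20):
    """Build a wevtutil command listing the newest matching System events."""
    id_filter = " or ".join(f"EventID={event_id}" for event_id in event_ids)
    xpath = f"*[System[({id_filter})]]"
    return [
        "wevtutil", "qe", "System",  # qe = query events from the System log
        f"/q:{xpath}",               # XPath filter on the event ID
        f"/c:{max_events}",          # maximum number of events to return
        "/rd:true",                  # newest events first
        "/f:text",                   # plain-text output
    ]


if __name__ == "__main__":
    # Print the command; on Windows it could be passed to subprocess.run().
    print(" ".join(build_event_query([41, 1001, 6008])))
```

The text output can be saved to a file and posted alongside the Reliability Monitor screenshots.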

Yes: seconding the requirement to have good backups. Ensure that the backups are recoverable and readable.