Question BSOD - MEMORY_MANAGEMENT and PAGE_FAULT_IN_NONPAGED_AREA

Status
Not open for further replies.

sati-edimax

Hello

In the last few days my system started throwing BSODs: 3 in one day (I ran memtest scans in between), then after a few days of memory testing (info below) everything worked OK for one day, and ofc the next day another 2 BSODs happened.

4 of those 5 BSODs were MEMORY_MANAGEMENT and the last was PAGE_FAULT_IN_NONPAGED_AREA (1st MEMORY, 2nd MEMORY, 3rd MEMORY, 4th MEMORY, 5th PAGE). The PAGE one also happened back in January, but I didn't have any other BSODs after that until now, so I didn't do anything about it, assuming it was a one-time thing and that everything went back to normal after the reboot.

Here are the minidumps: https://we.tl/t-ywNk6oby0B
And the large one from last BSOD: https://we.tl/t-R4kfVeq3ML

Some info about my PC (initial build around 2013):
CPU i7-4700
Gigabyte GA-Z97-D3H (updated with latest available drivers & BIOS that Gigabyte released)
GTX 770 (didn't update drivers after installing them as I didn't have any reason to update those, everything was working good for my needs)
4x8GB Goodram DDR3 GY1600D364L10/8G (all the same model, same specs, only difference is that one pair of 2 was bought in 2013 during PC setup, second pair was added around 2016/2017 and right after install until now those never caused any issues)
2 SSDs (one for OS, second for data&files)
1 HDD (data&files)
PSU: Chieftec 800W, don't remember the exact model
OS: Windows 10 (installed late 2020/start of 2021), last time updated in mid May, I didn't use Windows Update for updating drivers, only for the system updates
PC runs connected to a UPS so it never had any shutdowns caused by power surges

I never overclocked anything or changed parameters in BIOS, and there's no XMP on the memory. The last hardware change was the addition of the 2nd SSD, which came during the OS installation, so this setup was working with zero issues until January, when the very first BSOD happened, and then for another 5 months until more BSODs.

BSOD occurrences:
1st (MEMORY) - while I was browsing the internet on Firefox (with YT live stream in one of the tabs if that matters) the current active tab crashed, I wanted to restart Firefox and when I clicked close BSOD happened. OS was on for like 3 hours.
2nd (MEMORY) - right after above BSOD and BSOD-reboot the 2nd BSOD happened when I wanted to open Firefox (clicked the app icon and BSOD). OS was on for like 1 minute?
3rd (MEMORY) - right after BSOD-reboot no2 but this time I've opened Opera (in hopes that maybe a faulty plugin or something just broke Firefox), it showed me start page and then it BSODed. OS was on for like 1 minute?
4th (MEMORY) - again while browsing in Firefox, just like no 1, the active tab crashed; this time I tried to force close the Firefox process from Task Manager and when I clicked "Close/End task" the BSOD happened. OS was on for like 8 hours.
5th (PAGE) - right after BSOD-reboot no 4 I had to open Excel (had it opened during BSOD no 4 so I had to save my work/changes/try to restore them from autosave if Excel managed to do it), saved stuff in it and the BSOD happened when I closed Excel. OS was on for like 5 minutes (if it wasn't for Excel I would close the PC right away)

Maybe an interesting point is that during the above I managed to properly open/close Paint, Control Panel/Windows Settings (my first guess was "maybe Windows did an update and that's why"), Task Manager and folders, and I was able to copy the dump files. So it's not that Windows BSODs on everything...

Between the 1-3 and 4-5 BSODs I tested all my memory sticks for few days with memtest86 v4.3.7. Results:
1st run 4 sticks - 2 errors on TEST 7 (didn't take a photo of it :/ to post here)
each stick separately - no errors (each stick 4-5 full passes)
2nd run 4 sticks - no errors (4 passes)

I was kind of surprised that memtest was showing that each 8GB stick had 7.6GBs of memory so after googling I checked my BIOS and disabled Intel GPU (I wasn't even aware I had it on lol) and redid all the test again:
1st run 4 sticks - no errors (2 passes)
each stick separately - no errors (each stick 4 passes)
2nd run 4 sticks - no errors (2 passes)

I also used memtest86 v5 (did just 2 passes on all 4 sticks, no errors, CPU temp during test was about 50C all the time) and Windows Memory Diagnostics v0.4 (again all 4 sticks, 2 passes with extended tests, no errors).

After all those tests I booted my PC into Windows and it worked for like 2 hours with Firefox open all the time (Opera was also open for some time) - everything worked fine. I checked my drives with CrystalDiskInfo (everything good, no warnings), updated Firefox, updated Opera. I booted the next day - it worked for like 7-8 hours and only after that amount of time did the 4th BSOD happen. I also booted into a Linux live CD, used GSmartControl and tested every drive = no errors on any test. And now I'm here searching for help.

Any ideas what can be the reason of those BSODs? Or what to check, what to do? (I know my PC is ~10 years old and it might be hardware dying on me but it would be cool to point out which as maybe I can find a "cheap" replacement and use this PC for next few years and not just buy a new PC now :) )

Thank you.
 
There are events in the dump files that typically indicate a failing drive.

Code:
[Microsoft-Windows-StorDiag/Class /Completion of request. ]Completing a failed non-ReadWrite SCSI SRB request

What make/model is your system drive?
 
What make/model is your system drive?

Ofc I had to forget to write this, sorry, my bad 😉

OS SSD: Goodram Iridium Pro 240GB MLC (SSDPR-IRIDPRO-240) from 2017-08
2nd SSD: Goodram CX-400 512GB (SSDPR-CX400-512-G2) from 2021-08
HDD: WD Black 1TB (WD1003FZEX) from 2013 (this disk had "issues" https://forums.tomshardware.com/threads/wdc-black-uncorrectable-sector-count.3307447/ but after zeroing it and using MHDD REMAP function it went back to being full green in CrystalDisk Info)
 
Together these would normally suggest faulty memory - especially with the MEMORY_CORRUPTION_ONE_BIT failure bucket ID showing up in most of them.

All of them show the system using 4 DIMMs set for DDR3-1333. Have you ever had a crash while running just 2 DIMMs dual channel? Have you tried 2 DIMMs at DDR3-1600 and/or DDR3-1866 dual channel?
 
Yes, each time I booted into Windows and got those BSODs I had all 4 sticks at their default settings (didn't touch anything in BIOS, everything was working well so why change stuff :) ) As a side note, while this PC was running on 2x8GB there were no crashes for 3-4 years (until the additional 2x8GB were added).

I didn't check now in memtest or in OS working on only 2 sticks in dual channel (either pair 1&3 or 2&4 DIMM slots on MOBO), and didn't try changing their frequency to 1600/1866.

But would changing the frequency help when it was working at 1333 for the past 7-10 years without issues? Wouldn't it be the other way around, so that if I had issues at 1600/1866 then lowering it to 1333 could help? (sorry for the stupid questions, but I'm not very good at this stuff at all)

Currently I don't use this PC at all (to not get more BSODs while working on it), I'm trying to find out what to check/verify and then try that stuff out to check if it helps...

Thanks
 
I agree that flaky RAM looks like the most likely cause, especially if it's been running for the last 7-10 years. Trying it with two DIMMs at a time is an excellent way to check your RAM.
 
I'm not asking you to run memory diagnostics with 2 DIMMs. I just want to know if it has ever crashed during normal use when running 2 DIMMs in dual channel. Your CPU is officially supported by Intel up to DDR3-1600 which is what I would consider a good baseline since all of your other components should also be able to handle DDR3-1600 settings.

Honestly, I'm having flashbacks to my last system (Z97 motherboard/4770K CPU) that would intermittently crash with 4 DIMMs installed which often resulted in a MEMORY_CORRUPTION_ONE_BIT failure bucket ID. It would pass any and all memory diagnostics with 2 DIMMs no matter which DIMMs or slots. Very rarely memory diagnostics with 4 DIMMs would show errors.
 
Ah sorry, maybe I read your post in a wrong way when I wrote my reply :)

As for running on 2 sticks - after I upgraded to 4 sticks I never had any crashes, memory issues or BSODs until January (1st ever) and now. So I never had any occasion to run on only 2 sticks.

I will check it. I assume I should check both pairs?

Can getting a BSOD damage my drives? What I mean is that I would like to use the PC in the "normal" way, not just boot it and leave it running in Windows, and for that I need the other drives as I have 99% of all my files/stuff there. But I don't want to damage the 2nd SSD or the HDD just by having them plugged in if another BSOD happens.
 
If possible, I'd use the older pair in dual channel set for DDR3-1600 for what you'd consider sufficient time to know whether or not the crashes have stopped. Then do the same with the other pair instead.

I'm not sure if the latest dump was generated before you updated the motherboard BIOS but it shows the version as F8. There's an F9 version, if this is the correct product support page. I doubt it will make a difference but, at the very least, your motherboard will be as stable/compatible/performant as Gigabyte managed to make it.
 
A BSOD doesn't damage anything. In fact, it's designed to protect your data. Because something unexpected has happened, the Windows kernel halts the system via a BSOD to prevent data corruption.
 
I had some time last weekend and here's some info on what I did+results:

1. admin CMD: chkdsk /f /r
Result: Windows has scanned the file system and found no problems. No further action is required. (scan on boot, I found the results in Event Viewer)
2. admin CMD: DISM /online /cleanup-image /restorehealth
Result: The restore operation completed successfully. The operation completed successfully.
3. admin CMD: sfc /scannow
Result: Windows Resource Protection found corrupt files and successfully repaired them. For online repairs, details are included in the CBS log file C:\Windows\Logs\CBS\CBS.log. Log uploaded to: https://we.tl/t-sOiJWdpOuc
4. Disable, reboot, enable page file size (managed by Windows, didn't change the suggested size)
5. OCCT power supply test
Ran it for like 20 seconds, then the mobo set off the CPU temperature alarm - I panicked and stopped the test. Should I just ignore the alarm and try to run the test longer?
6. Windows updated itself

7. BIOS update
At the moment I'm not confident enough to do it: a BSOD during the process (while it's running in Windows, that is) would probably screw something up badly :/ Will try it later
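
For anyone repeating steps 1-3, the three commands could be chained in a small script. This is only a sketch: the commands are copied exactly as typed above, while the names `REPAIR_COMMANDS`, `run_repairs` and the `dry_run` flag are made up for illustration. It assumes an elevated (admin) prompt, and on the system drive chkdsk will most likely ask to schedule the scan for the next boot, so it is not fully unattended.

```python
import platform
import subprocess

# Commands exactly as typed in steps 1-3 above. The variable and function
# names, and the dry_run flag, are hypothetical and exist only in this sketch.
REPAIR_COMMANDS = [
    ["chkdsk", "/f", "/r"],
    ["DISM", "/online", "/cleanup-image", "/restorehealth"],
    ["sfc", "/scannow"],
]


def run_repairs(dry_run=True):
    """Run (or only list) the repair commands in order.

    Returns a list of (command line, exit code) tuples; the exit code is
    None when the command was not actually executed.
    """
    results = []
    for cmd in REPAIR_COMMANDS:
        line = " ".join(cmd)
        if dry_run or platform.system() != "Windows":
            # Nothing is executed in dry-run mode or on a non-Windows OS.
            results.append((line, None))
            continue
        completed = subprocess.run(cmd)
        results.append((line, completed.returncode))
    return results


if __name__ == "__main__":
    for line, code in run_repairs(dry_run=True):
        print(line, "->", "not run" if code is None else code)
```

With `dry_run=True` it only prints the commands in the order they would run, which is a safe way to check the list before running anything for real.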

After doing above (so chkdsk, dism+sfc, "recreating" page file and the windows update thing) I wanted to test the PC for a bit and what I noticed:
- while idle (literally nothing open aside from Task Manager; I wanted to leave the PC idle for some time to check whether it would BSOD, and since I didn't have the monitor on all the time I needed something on the screen to be able to see "oh, it crashed") the OS after a while started to run defragsvc on every drive (first the HDD, which went to 100% usage, and then the OS C drive, same 100% usage)
- whenever I moved my mouse the above usage dropped to 0%, so I could easily use the OS
- the PC didn't BSOD during about 1 hour of idle
- but then when I wanted to open a video file (~1GB, which always opened within seconds) from the HDD it took like 30-60 seconds just to open Media Player Classic and another 30-60 seconds to even start the movie (it showed the 1st frame but didn't move past that), and when I moved the progress bar to around the middle nothing happened for like 1 minute. Then I closed MPC (or tried to, as it wasn't responding) and decided to shut down the PC (it was late and I decided to do a backup first and then go back to testing) - but this was the first time I had such "lags" on the HDD since I "fixed" it with MHDD remap
- next day I used partedmagic and did a full backup of the HDD. Overall there were no issues aside from just 1 file that at first didn't want to be copied (Read Error while copying file xxx). After I copied everything else (took like 6 or 8 hours, I was doing it folder by folder, not just ctrl+a/c/v 😉) I went back to that "missing" file and it copied successfully but SUPER slowly (copy speed was in KB/s and the file was only a few MB, while other stuff was copying at 40-70MB/s)

Currently I plan to check Windows without that HDD. Will write anything after I've tested it for few days / until I get another BSOD. (in case of a BSOD I'll try switching RAM / trying higher freq / updating BIOS).

Thanks for current help 😉
 
Very good work. :)

Just so I'm clear, is the page file on the system drive (the Windows default setting) or a secondary drive?
 
I'm 99% sure it's on the OS drive as I've selected the "System managed size" and the "Automatically manage paging file size for all drives" setting on the "Virtual memory" tab.
 
Hi, quick update:

I didn't encounter any BSOD after ~1.5 weeks of running the PC with only the OS SSD 👍. So I've plugged in the 2nd SSD and am currently testing it this way. If nothing happens I'll plug in the data HDD.

One thing that I don't really know at the moment is whether I should try to use the funky HDD (the one that had problems while copying/backing up data from it) and, in case of a BSOD, maybe try a different SATA cable/SATA port, or whether I should just stick a label on it saying "last resort drive, use only in dire situations" and plug in a different HDD instead (same capacity, same model, just fewer working hours and it never had any SMART warnings). I'm currently leaning toward the second option as I still have 3 more 1TB WD Black drives in my drawer...
 
Eh 🙁

Sadly I just got a BSOD (didn't change anything, still running on 2 SSDs, still doing pretty much the same stuff) - this time a PAGE_FAULT_IN_NONPAGED_AREA on dxgmms2.sys while watching YT in Firefox (same scenario - the tab crashed and the BSOD happened while trying to close Firefox; as a side note, the other tabs were working correctly lol)

Here's the minidump: https://we.tl/t-dPGGDLMjAt
And the full memory dump: https://we.tl/t-oZNzQjeLFa

After reboot everything seems ok, didn't get another one while trying to open anything (FFx, etc., not like with the memory management ones when it always bsod after the reboot + opening the app).

Should I try to update the GPU driver? (dunno maybe in some specific cases the combination of current GPU driver, the Windows version, YT and browser version just crashes lol) Or maybe some other suggestion (Windows reinstall maybe?)
 
I doubt it will make a difference but it does look like there's a newer BIOS (F9) available for your motherboard. I've seen BIOS updates fix so many mysterious crashes that it's usually the first thing I ask someone to do when troubleshooting - assuming the system is stable in the BIOS settings menus. I'm not sure which revision you have of the Z97-D3H but all three show a BIOS version F9 available. Can you try updating to the latest BIOS and then see if the crashes continue?

The latest crash has the same bugcheck code (PAGE_FAULT_IN_NONPAGED_AREA) but the callstack looks different. The latest looks more like a memory management error while the previous suggested disk I/O problems.
 
Well, that one dump you just uploaded clearly fingers your Nvidia graphics driver nvlddmkm.sys as the culprit. In the call stack excerpt below you can see nvlddmkm.sys called immediately prior to the bugcheck...
Code:
ffff9d01`a7860bd8  fffff803`4000bad8 nt!KiPageFault+0x358
ffff9d01`a7860be0  00000000`c1d00001
ffff9d01`a7860be8  00000000`ff010000
ffff9d01`a7860bf0  ffffa284`49233138
ffff9d01`a7860bf8  fffff803`5b5a052f nvlddmkm+0x80052f
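
As a quick cross-check of that last stack entry, the `nvlddmkm+0x80052f` offset can be reproduced from the raw address. This is a minimal sketch, assuming the module start and end addresses reported for nvlddmkm in the `lm` output further down in this post; the helper names `parse_kd_address` and `module_offset` are hypothetical.

```python
def parse_kd_address(text):
    """Convert a WinDbg-style 64-bit address (with a backtick) to an int."""
    return int(text.replace("`", ""), 16)


def module_offset(address, module_start, module_end):
    """Return the offset of address inside [module_start, module_end).

    Returns None when the address does not fall inside the module.
    """
    addr = parse_kd_address(address)
    start = parse_kd_address(module_start)
    end = parse_kd_address(module_end)
    if start <= addr < end:
        return addr - start
    return None


# Address from the stack excerpt above, range from the lm output below.
offset = module_offset(
    "fffff803`5b5a052f", "fffff803`5ada0000", "fffff803`5d1fe000"
)
print(hex(offset))  # 0x80052f
```

The same helper returns `None` for the `nt!KiPageFault` address on the first line, since that one lies outside the nvlddmkm range.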

The version of nvlddmkm.sys that you have installed is (in graphics driver terms) ancient, dating from 2021...
Code:
2: kd> lmDvmnvlddmkm
Browse full module list
start             end                 module name
fffff803`5ada0000 fffff803`5d1fe000   nvlddmkm T (no symbols)           
    Loaded symbol image file: nvlddmkm.sys
    Image path: \SystemRoot\System32\DriverStore\FileRepository\nv_dispi.inf_amd64_5d5c294bb8d17217\nvlddmkm.sys
    Image name: nvlddmkm.sys
    Browse all global symbols  functions  data
    Timestamp:        Mon Jul 12 10:07:57 2021 (60EBEA4D)
    CheckSum:         023BE5B7
    ImageSize:        0245E000
    Translations:     0000.04b0 0000.04e4 0409.04b0 0409.04e4
    Information from resource tables:
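
The hex value in parentheses on the Timestamp line can be decoded by hand to confirm the driver's build date. A minimal sketch, assuming the value is a standard timestamp in seconds since the Unix epoch (UTC); the helper name `pe_timestamp_to_utc` is made up for this example.

```python
from datetime import datetime, timezone


def pe_timestamp_to_utc(hex_stamp):
    """Decode a hex timestamp (seconds since 1970-01-01 UTC) to a datetime."""
    return datetime.fromtimestamp(int(hex_stamp, 16), tz=timezone.utc)


built = pe_timestamp_to_utc("60EBEA4D")
print(built.isoformat())  # 2021-07-12T07:07:57+00:00
```

The debugger output above shows 10:07:57 for the same value, presumably because it prints the time in the local time zone of the machine doing the analysis. Either way the driver dates from July 2021.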

You might want to look for an update for your Nvidia graphics card driver.
 
@cwsink - I did the BIOS update to F9. The update had no issues, PC boots, Windows booted correctly after the installation/required power shutdown. The update re-enabled the Intel GPU so I went to BIOS settings and disabled it and it kind of broke the OS 😉

The welcome screen in Windows (the one that shows username + welcome + the animated circle) froze, then I had a totally broken image that after 10 seconds or so changed into my desktop, but overall the OS was freezing for like 2-5 minutes, then working for like 10-15 seconds, and again freezing for 2-5 minutes. I managed to reboot it the normal way (ctrl+alt+del -> power icon -> reboot, so I didn't have to use the reset button on the PC case) and went back to BIOS to enable the Intel GPU with 64 MB of memory (that's the only thing I changed in BIOS after the update as I didn't notice any other changes - XMP still disabled, AHCI on SATA), and with this the OS is working correctly, no freezes. I can turn off the Intel GPU via Control Panel > Device Manager. I did turn it off and the OS didn't start to freeze and was running OK, but I decided to run with it on for some time.

@ubuysa - I will update the GPU driver, don't really have anything to lose with a newer version I guess (but I want to run the PC for few days on the new BIOS, so will do the GPU driver update later).
 
Update (previous post was written before I've powered the PC):

It seems that the F9 BIOS is "funky" - after passing the BIOS splash screen nothing happens, there's no signal on the monitor (from either the Nvidia or the Intel GPU) and the system doesn't want to start. After waiting ~10 minutes with nothing happening (not even the lights on the front panel were working/blinking) I had to turn the PC off (the reset button didn't do anything, only holding the power button for a few seconds powered the PC off).

After another try the OS booted into Windows 10 recovery, I skipped it by using "reboot and continue" option, I went into BIOS, everything seemed OK, exit without saving and after that the OS booted properly. Running currently for 3.5 hours.

I'll see if this happens again and if yes I'm going to downgrade the BIOS to the previous F8 version.
 
Hmm... keep in mind the older memory.dmp file was showing what I interpret to be possible disk I/O problems and numerous error events suggesting the same. Drive problems can be intermittent. I'm keeping an eye out for possibly more.

As suggested by ubuysa, updating to the latest Nvidia driver is a good idea. Though, it's possible you already have the latest for a GTX 770.

Please make any new dump files available for comparison if the crashes continue.

edit: I'd be curious to know if Reliability Monitor is showing any critical error events involving disk I/O.
 
After booting yesterday the system was working OK - no BSOD. I'll check today if it boots properly. If yes, I'm going to update the display driver for sure.

Here are the screenshots and xml export from the Reliability Monitor: https://we.tl/t-T8bg4TE394 Each day is split into individual images, so "2023-07-08.png" is the summary and "2023-07-08 X.png" is the first/second/third/etc. error counting from the TOP. If I should translate something from the individual errors please let me know (Windows is in Polish, I tried to translate the summary pages). I guess the 2023-07-09 summary is related to the issue where the OS didn't want to boot? Nothing else happened yesterday.

From what I read hardware errors have 141, 117 and 1a1 codes (141 and 117 suggest display driver issues from what I read). From 2023-06-10 until 2023-07-08 there were only Windows Update information. It seems that the display driver/hardware error only appeared after updating BIOS to F9 (and especially when I disabled Intel GPU).

I've also searched Event Viewer https://we.tl/t-caLF72Rs5w and it seems that when I had the Intel GPU disabled (so the problem around 20:00/8PM 2023-07-08) the Event Viewer logged a few "display driver nvlddmkm stopped responding and has successfully recovered" events.
 
Thank you for making the effort to provide so much information. It's much appreciated. :)

The 117, 141, and 1a1 codes are live kernel dumps (LKDs) and they are supposed to generate a dump file without crashing the system. Does the folder C:\Windows\LiveKernelReports exist? If so, are there any dump files in it or its subdirectories? By default, that's where dump files from live kernel dumps get saved - when it works.
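
Checking that folder by hand works fine, but a short script can also list every dump under it. This is only a sketch: the folder path is the one named above, the function name `find_live_kernel_dumps` is made up, and it assumes the dumps use the usual .dmp extension and that the script runs with enough rights to read the folder.

```python
from pathlib import Path


def find_live_kernel_dumps(root=r"C:\Windows\LiveKernelReports"):
    """Return every .dmp file under root, including subfolders, sorted by path.

    Returns an empty list when the folder does not exist.
    """
    base = Path(root)
    if not base.is_dir():
        return []
    return sorted(p for p in base.rglob("*.dmp") if p.is_file())


if __name__ == "__main__":
    for dump in find_live_kernel_dumps():
        print(dump, dump.stat().st_size, "bytes")
```

An empty result means either the folder is missing or no live kernel dump was saved, which matches the "when it works" caveat above.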

The Intel and Nvidia GPUs and their drivers are supposed to be able to coexist without problems. There might be a BIOS setting to manually select which to use as the primary adapter but only using the display ports on the Nvidia GPU should have the same effect - assuming that's the GPU you want to use. When I had it, I disabled the integrated GPU on my Intel 4770K just to avoid it being accidentally used by any software on the system and potentially prevent unnecessary heat. I don't recall having problems with both enabled, though.

If the GPU related errors continue a good troubleshooting step is to remove the Nvidia GPU and use the integrated GPU to see if the problems stop in that configuration. It depends if the LKDs all point to the Nvidia GPU/drivers as the culprit.
 
Sadly the boot problem is still here - this time it managed to boot to the Windows welcome splash and then the screen froze (the animated circle just got stuck), after a few minutes it showed a broken combination of my desktop + the welcome screen and after another few seconds the monitor showed a "no signal input" message and turned itself off. I tried changing ports / cables (used HDMI, DVI, DP) on the Nvidia card and tried switching to the Intel GPU - no effect. The keyboard was unresponsive (pressing NumLock/CapsLock didn't toggle the lights).

After reboot I went straight into BIOS to downgrade it and while entering Q-splash it froze again lol. After another reboot I managed to downgrade BIOS to F8 - it seems F9 doesn't work with my hardware (or the mobo is actually funky and just doesn't accept a newer BIOS version).

I'm currently using Nvidia GPU, Intel GPU is enabled (from the task manager Intel sits on 0%, Nvidia jumps from 0 to 1% while idle), system booted properly (no freezes).

I've upgraded the GPU driver (went from 471.41 to 474.44, rebooted after the upgrade) but it seems the problem still occurs - just now, while I wanted to check Google Maps, the screen froze and the Event Viewer logged 1 nvlddmkm error (id 14, \Device\Video3 0c83(1800) 00000000 00000000) and 4 warnings that the "display driver nvlddmkm stopped responding and has successfully recovered" 🙁 The PC unfroze itself and is now working, but it seems the newer GPU driver didn't help. Reliability Monitor shows the same hardware errors (codes: 3x141, 1x117) during the above "Google Maps" freeze as the ones I uploaded previously.

As a side note the same warning and error happened during the 1st boot today before the BIOS downgrade (both in Event Viewer and the Reliability Monitor).

I've also checked the LiveKernelReports folder: https://we.tl/t-Mn3dvNvDXT (only the WATCHDOG folder had any files in it; the others, C:\Windows\LiveKernelReports\PoW32kWatchdog and USBHUB3, were empty). No other dump got generated during the "Google Maps freeze".

It seems I went from memory problems to a GPU problem (maybe the GPU is also nearing its EOL after ~10 years of usage, 7 of them as a video editing/render PC/GPU in Adobe software?)... The next step would be to unplug the Nvidia card and try to run only on the Intel GPU - if I do it, do I have to remove the whole GPU or is unplugging the power cables enough? (so the GPU would still be in its mobo slot)

Ah forgot to mention - until I've disabled the Intel GPU for the memory tests (as memtest was showing lowered amount of memory) I never noticed any issues in games or similar and every time the software/game settings showed that it was using the Nvidia GPU, that's why I wasn't even aware that Intel GPU was enabled in BIOS.
 
You'd need to physically remove the Nvidia GPU as it does get power from the PCIe slot. Any power cables are for additional power beyond what the PCIe slot can provide and if the GPU doesn't get enough power you'd have issues. Typically, very similar to the issues you're currently having with the GPU as far as crash codes.

The WATCHDOG LKDs all involve the Nvidia driver (2 are 0x141, 1 is 0x117). I'd next want to try the system without the Nvidia GPU to see if the crashes stop in that configuration.
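
For reference, the codes that have come up in this thread can be mapped to their symbolic names with a small lookup. This is just a sketch covering the five codes seen here, not a complete bugcheck table; the helper name `describe` is made up, and the tally at the end uses the three WATCHDOG dumps mentioned above.

```python
from collections import Counter

# Only the codes seen in this thread; this is not a complete table.
BUGCHECK_NAMES = {
    0x1A: "MEMORY_MANAGEMENT",
    0x50: "PAGE_FAULT_IN_NONPAGED_AREA",
    0x117: "VIDEO_TDR_TIMEOUT_DETECTED",
    0x141: "VIDEO_ENGINE_TIMEOUT_DETECTED",
    0x1A1: "WIN32K_CALLOUT_WATCHDOG_LIVEDUMP",
}


def describe(code):
    """Return the symbolic name for a code given as an int or a hex string."""
    if isinstance(code, str):
        code = int(code, 16)  # accepts "141", "0x141" or "1a1"
    return BUGCHECK_NAMES.get(code, "unknown (not in this small table)")


# The three WATCHDOG live kernel dumps: two 0x141 and one 0x117.
tally = Counter(describe(c) for c in ["0x141", "0x141", "0x117"])
for name, count in tally.items():
    print(count, "x", name)
```

Both 0x117 and 0x141 are display timeout codes, which fits with the dumps involving the Nvidia driver.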
 