[SOLVED] Help with BSOD

Status
Not open for further replies.
Jul 17, 2019
Hey guys. I’ve been having a really difficult time narrowing down a problem with my rig, as I can’t seem to find any consistency, and I’m hoping someone might have some ideas. This is a pretty long post, so please bear with me.
Here is my build as it currently stands:
Motherboard: Gigabyte Z370 Aorus Gaming 7, BIOS F7
CPU: Intel 8700K, not overclocked
RAM: 16GB Corsair Vengeance DDR4 3000MHz (2x8GB), currently at JEDEC default 2133MHz
GPU: 8GB Nvidia GTX 1070 Founders Edition, not overclocked, outputting to 3 of 4 displays
2GB Nvidia GTX 760 secondary, used for PhysX and outputting to 1 of 4 displays
Sound: Sound Blaster Recon3D PCIe
Storage: 1 Samsung 970 EVO NVMe SSD
5 mechanical HDDs used as storage and mechanical redundancy

I rebuilt and overhauled the system a year ago, upgrading to the Gaming 7 board, an Intel 8700K, 16GB (2x8GB) of Corsair Vengeance at 2666MHz set to XMP (the kit I had at the time), and a 500GB Samsung SSD connected via SATA. The CPU was never overclocked, the RAM was only overclocked per its XMP profile, and the GTX 1070 was overclocked +125 core/+400 mem. The system ran fine for 10 months with no issues.
In February I decided to upgrade my storage to a Samsung 970 EVO NVMe. This would free a SATA port for another HDD to give me some mechanical backup for my data. I also applied the new BIOS, F13, to make sure I had the best compatibility for the NVMe.
During the fresh install of Windows I got a BSOD, “Page Fault in Non Paged Area, reason Win32kbase.sys”, during a restart while removing some of the bloatware for my printer (HP DeskJet 1660) after installing the driver. I figured it was no big deal since the system never BSODs, so I moved on, ran SFC and CHKDSK just to be safe, and imaged the install when done.
Over the next 4 months the system would randomly (1-3 of every 20 starts) BSOD during Windows sign-in with the same Page Fault in Non Paged Area error pointing to Win32kbase.sys (80%) or Win32kfull.sys (20%). The dump was never created; it would just sit at 0% even though my dump file settings were correct. If the system made it past sign-in without crashing, it would operate normally. I could browse the web in Chrome, watch movies, and do hours of heavy gaming with 100% load on my 1070 and high CPU/memory usage and never get a BSOD. I tried removing the GPU overclock in MSI Afterburner and ran SFC, CHKDSK, and DISM several times (all returned no errors or corruption), but I kept getting the BSOD.
At this point I figured I had some sort of corruption in the Windows install even after passing SFC and CHKDSK, so I decided to do a fresh install, and this is where the problems mounted. Anything in “()” below is what I was thinking at the time, and the BSOD is always Page Fault in Non Paged Area unless noted otherwise.
  • During the first reinstall I got the BSOD after a restart while Windows Update was installing .NET and my sound card driver. (Maybe my sound card is bad?)
  • Reinstalled Windows again to see if it would repeat and got a BSOD after the restart from installing the first driver, which was the chipset. (OK, maybe not my sound card?)
  • Reinstalled Windows again and got a BSOD while formatting the SSD partitions in Windows setup. (Is there something for the NVMe I am forgetting?)
  • Took the tower apart and cleaned all contacts. Made sure I had good connections and moved the memory to the other DIMM slots. Restarted and installed Windows. Made it through installing all the drivers. After reading that this BSOD usually means a memory problem, I ran Windows Memory Diagnostic, and it came back with no errors. I didn’t have a lot of free time to test the RAM extensively, so I picked up a new kit at Best Buy. With the NVMe drive being the only change to the system, I figured I must be missing something there as well. Did some reading and found out Samsung provides a driver for the drive.
  • Installed the new RAM, set XMP, and reinstalled Windows again. After the chipset driver, I installed the Samsung NVMe driver. Got a BSOD two restarts later, after installing the Intel RST driver.
  • Did some more reading and extracted the driver so I could load it during Windows setup. Loaded setup, formatted the drive, did a clean using DISKPART, then installed the NVMe driver. Continued and installed Windows. Got a BSOD several restarts later. At this point I began to think there might be something wrong with the NVMe (even though SMART checked out good), the motherboard slot, or possibly even my power supply (as it was 13 years old). At least this time it created the dump file, which pointed to ntoskrnl 0x50.
  • Purchased a new power supply (850 watt EVGA G3) and a new 970 EVO. Installed both, but this time used the 3rd M.2 slot. I had been using the 2nd, as the 1st one knocks out two of my SATA ports. Also restored all BIOS settings to default, left the RAM at non-XMP 2133MHz, and disconnected EVERYTHING except both graphics cards, sound card, NVMe, keyboard and mouse.
  • Reinstalled Windows and got a “memory management” BSOD during setup finalization at “getting devices ready”. (OK, maybe it really is my sound card, as it’s the only attached device?)
  • Was forced to reinstall Windows since the installer crashed. Made it all the way through drivers and Windows updates, but this time got a BSOD after the restart from installing the motherboard apps RGB Fusion (to control the LEDs) and SIV (to set up fan profiles).
  • Reinstalled Windows without the motherboard apps but got a BSOD on the restart after installing the printer driver, rather than while uninstalling the unnecessary printer bloatware. (Can’t be the printer software, as other crashes happened before the printer was installed or connected?)
  • At this point I was trying to reverse any other changes made since it was stable, so I reverted to the BIOS I ran before the NVMe, which is F7.
  • While doing more reading on the BSOD crashes, especially on 0x50 ntoskrnl crashes being related to overclocking, I decided to look at the memory-related voltages in the BIOS. At default “auto” settings with non-XMP 2133MHz, the VCCIO voltage was at 0.946v and the System Agent at 1.05v. I had read that the two voltages should be 0.05v apart, so I bumped the VCCIO voltage to 1.0v. Since then I have restarted at least 15 times, run Prime95 (custom test using 15 of 16GB RAM), and run Windows Memory Diagnostic, and all came back clean.
So at this point I am not sure if it’s the BIOS being reverted, me bumping the VCCIO voltage up to 1.0v, or just a lull between BSODs. I have sat at the computer for 15 minutes continually restarting without getting a crash. The only constant in my issues is that a BSOD will only happen while signing in to Windows. Windows always boots to the sign-in screen with no issues, and once past it works just fine. Does anyone have any advice on what to do next if I do get another BSOD? If I don’t get another one, I would like to get the memory back to XMP at 3000MHz, but if the instability was from voltage, what should I be looking for after setting XMP or manually setting up timings?
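For a rough feel of whether 15 clean restarts means anything, here is a minimal Python sketch. It assumes every boot crashes independently with a fixed probability, which is a simplification and not something measured on this system; the function name is made up for illustration, and the 1 in 20 and 3 in 20 rates are the rough figures quoted earlier in the post.

```python
def clean_streak_probability(crash_rate: float, boots: int) -> float:
    """Chance of seeing `boots` crash-free boots in a row.

    Assumes every boot fails independently with probability `crash_rate`.
    This is a simplifying assumption, not a measured property of the PC.
    """
    if not 0.0 <= crash_rate <= 1.0:
        raise ValueError("crash_rate must be between 0 and 1")
    if boots < 0:
        raise ValueError("boots must be non-negative")
    return (1.0 - crash_rate) ** boots


if __name__ == "__main__":
    # 1 in 20 and 3 in 20 are the rough crash rates quoted in the post.
    for crashes_per_20 in (1, 3):
        rate = crashes_per_20 / 20
        chance = clean_streak_probability(rate, 15)
        print(f"{crashes_per_20}/20 crash rate: {chance:.0%} chance of 15 clean boots")
```

At the low end of that range a streak of 15 clean boots is close to a coin flip even if nothing was fixed, so under these assumptions it would take a longer run to rule out a lull.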
Thanks for reading such a long post, and thank you to anyone who offers to help.
 

Colif

Win 11 Master
Moderator
New RAM
New PSU
New NVMe - can you test that this actually works, as it's the one new part that seems to have started the cascade
try running https://downloadcenter.intel.com/download/19792/Intel-Processor-Diagnostic-Tool on the CPU, unless it crashes too fast.

are all the HDDs disconnected while you do this work?
try removing the 2nd GPU while installing
same with the sound card


how long does it run in safe mode?
At the login screen, click the power button at the bottom right of the screen
while holding left Shift, click Restart
this loads you into advanced startup

choose Troubleshoot
choose Advanced options
choose Startup Settings
click the Restart button
choose a safe mode (it doesn't matter which) by pressing the number associated with it.
PC will restart and load safe mode

how long it lasts in safe mode will help show whether this is just a software problem.

while in safe mode
Can you follow option one on the following link - here - and then do the step below: Small memory dumps - Have Windows Create a Small Memory Dump (Minidump) on BSOD

that creates a file in C:\Windows\Minidump after the next BSOD
copy that file to Documents
upload the copy from Documents to a file sharing website and share the link here, and I will get someone to convert the file into a format I can read
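If copying by hand is awkward, the copy step can be sketched in Python. This is only an illustration: it assumes Python is installed on the machine, that dumps land in the default C:\Windows\Minidump folder, and that a "minidumps" subfolder under Documents is an acceptable destination. The function name copy_minidumps is made up for this example.

```python
import shutil
from pathlib import Path

# Default minidump location on a standard Windows install (assumed here).
MINIDUMP_DIR = Path(r"C:\Windows\Minidump")


def copy_minidumps(source: Path, destination: Path) -> list[Path]:
    """Copy every *.dmp file from `source` into `destination`.

    Returns the copied files, oldest first. Returns an empty list if the
    source folder does not exist (no dump has been written yet).
    """
    if not source.is_dir():
        return []
    destination.mkdir(parents=True, exist_ok=True)
    dumps = sorted(source.glob("*.dmp"), key=lambda p: p.stat().st_mtime)
    return [Path(shutil.copy2(dump, destination)) for dump in dumps]


if __name__ == "__main__":
    # Run from an elevated prompt; the Minidump folder is usually admin-only.
    target = Path.home() / "Documents" / "minidumps"
    for copied in copy_minidumps(MINIDUMP_DIR, target):
        print(copied)
```

The copies in Documents are the ones to upload, which leaves the originals in place in case they are needed again.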
 
Jul 17, 2019
Colif,
Thanks for getting back to me so quickly. I am on the road for work and return home tomorrow night. I will be happy to run the CPU diagnostic utility and will post the results.

The mechanical HDDs are always disconnected during Windows install. This way Windows only has one drive to install to, and I can't accidentally format a drive I didn't want to. The HDDs are connected once Windows and all drivers are in. The drives have not been giving me any errors, SMART issues (checked every drive in HWiNFO), or read/write performance degradation.

I have not tried booting the system in safe mode, as some of the crashes were during Windows setup and install (making me think it is a hardware issue). I will boot into safe mode after I run the CPU diagnostic. If the system makes it in, did you want me to leave it running idle and see if it stays up?

I will reconfigure the dump settings as you instructed and post the dump to Dropbox once the system crashes.

If you don't mind me asking (always willing to learn), what makes you think it's the CPU? I had ruled it out as it passes Prime95, games at heavy load, and completes the memory diagnostic.

Thanks again for your time.
 

Colif

Yes, errors during setup do tend to point at hardware, so testing the CPU was just a way to reduce the number of items it could be caused by. I got you to run the Intel tool as I didn't know you had already run Prime95 (might have missed it in your post)

I normally help people with BSODs, it's why I answered this thread, but yours is likely to be hardware and you seem to have done most of the things I would suggest. I will see if others have any smart ideas

almost everything is new, the only part you haven't replaced is the motherboard. No real way to test it though.
 
Jul 17, 2019
Sorry this has taken me over 8 months to post an update, but I have been very busy with family and work. Below is a recap of my issue and what was done to resolve it and repair my desktop computer.
I initially started getting a BSOD 0x50 Page Fault in Non Paged Area during boot at the Windows sign-in screen shortly after adding an NVMe SSD and removing my SATA SSD. The BSODs were never consistent and would happen 1 of every 10 to 20 boots.
A dump file was almost never created no matter how many times I checked the settings within Windows. On the rare occasions that one was, it gave no useful information other than the 0x50 code and Win32kbase.sys or Win32kfull.sys.
If Windows made it past the sign-in screen the system would run with no issues. It could be put under heavy gaming loads with >90% GPU usage and >50% CPU usage for hours with no problems. Windows Update could run as well and apply updates during reboots, but the system would crash as soon as I reached the sign-in screen.
These were my attempted fixes:
  1. Ran defrag/TRIM and error checking
  2. Ran SFC /SCANNOW
  3. Ran the DISM tool
  4. Reinstalled Windows from scratch (got a BSOD once during Windows setup while trying to format the NVMe)
  5. Installed the Samsung-provided NVMe driver, both after the Windows install and during Windows setup before installing Windows.
  6. Restored RAM from the 2666MHz XMP profile to the default 2133MHz.
  7. Replaced RAM with a 3000MHz kit and applied XMP. Even bumped the VCCIO and VCCSA voltages to ensure the RAM was stable. Got a crash with those settings, reverted to the default 2133MHz and still got a crash.
  8. Replaced the 11-year-old 850W power supply with a new 850W power supply.
  9. Replaced the GTX 1070 Founders Edition GPU.
  10. Removed the second GPU, a GTX 760 that I was using for PhysX.
  11. Changed the slot of the NVMe on the motherboard.
  12. Replaced the NVMe with a similar model.
  13. Tried installing Windows with only the bare minimum equipment connected, still got crashes.
  14. Ran the Intel Processor Diagnostic Tool. It ran with no errors, but would occasionally fail when using the RAM XMP profile, though never consistently.
  15. Ran Memtest86 several times with clean runs every time.
At this point I was at a complete loss and knew the only things left to replace were my motherboard and CPU. The one other item I didn't replace was my dedicated Sound Blaster sound card, as I felt it couldn't possibly cause a memory crash.
In the end I decided to replace both my motherboard and CPU at the same time, as I couldn’t stand the thought of seeing one more BSOD.
I can happily say that after 6 months the system is repaired and no longer BSODs. I stopped worrying once the computer made it through 40 consecutive cold boots (done over a few weeks of use), and it has never crashed since.
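As a sanity check on the 40 cold boot threshold, here is a small Python sketch of how many clean boots in a row it takes before an intermittent fault would very likely have shown itself. It assumes boots are independent with a constant crash rate, which is a simplification, and the function name is hypothetical.

```python
import math


def boots_needed(crash_rate: float, confidence: float = 0.95) -> int:
    """Smallest number of consecutive clean boots after which a fault with
    the given per-boot crash rate would have appeared at least once with
    probability >= `confidence`.

    Assumes independent boots with a constant crash rate (a simplification).
    """
    if not 0.0 < crash_rate < 1.0:
        raise ValueError("crash_rate must be strictly between 0 and 1")
    if not 0.0 < confidence < 1.0:
        raise ValueError("confidence must be strictly between 0 and 1")
    # Solve (1 - crash_rate) ** n <= 1 - confidence for the smallest integer n.
    return math.ceil(math.log(1.0 - confidence) / math.log(1.0 - crash_rate))


if __name__ == "__main__":
    # 1 in 10 and 1 in 20 are the crash rates described in the recap above.
    for one_in in (10, 20):
        print(f"1 in {one_in}: {boots_needed(1 / one_in)} clean boots for 95% confidence")
```

Under these assumptions 40 clean boots comfortably covers a 1 in 10 crash rate but falls a little short of 95% confidence for a 1 in 20 rate, so the months without a crash are the stronger evidence.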
My belief is that the fault may have been in the CPU's L1, L2 or L3 cache, as it did fail the Intel Processor Diagnostic Tool a few times when the RAM was overclocked. Unfortunately, I can't be sure, as I replaced the board and CPU at the same time. I hope this helps someone who encounters the same problem I did, so they won't have nearly the stress or aggravation in getting it sorted out.
 

Colif

We didn't expect you to come back and report your work :)
If you had come back we could have helped figure it out but I totally understand life getting in the way

Thanks for at least giving perhaps some hints to someone else who gets similar problems :)
 