Question Help with BSOD

Jul 17, 2019
Hey Guys. Been having a really difficult time narrowing down a problem with my rig as I can’t seem to find any consistency and hoping someone might have some ideas. This is a pretty long post, so please bear with me.
Here is my build as it currently stands:
Motherboard: Gigabyte Z370 Aorus Gaming 7, BIOS F7
CPU: Intel 8700K, not overclocked
RAM: 16GB Corsair Vengeance DDR4 3000MHz (2x8GB), currently at JEDEC default 2133MHz
GPU: 8GB Nvidia GTX 1070 Founders Edition, not overclocked, outputting to 3 of 4 displays
2GB Nvidia GTX 760, secondary, used for PhysX and outputting to 1 of 4 displays
Sound: Sound Blaster Recon3D PCIe
Storage: 1 Samsung 970 EVO NVMe SSD
5 mechanical HDDs used as storage and mechanical redundancy

I rebuilt and overhauled the system a year ago, upgrading to the Gaming 7 board, an Intel 8700K, 16GB (2x8GB) of Corsair Vengeance at 2666MHz set to XMP, and a 500GB Samsung SSD connected via SATA. The CPU was never overclocked, the RAM was only overclocked per the XMP profile, and the GTX 1070 was overclocked +125 core/+400 mem. The system ran fine for 10 months with no issues.
In February I decided to upgrade my storage to a Samsung 970 EVO NVMe. This would free a SATA port for another HDD to give me some mechanical backup of my data. I also applied the new BIOS F13 to make sure I had the best compatibility for the NVMe.
During the fresh install of Windows I got a BSOD, “Page Fault in Non Paged Area, reason Win32kbase.sys”, on a restart while removing some of the bloatware for my printer (HP Deskjet 1660) after installing the driver. I figured it was no big deal since the system had never BSOD’d before, so I moved on, ran SFC and CHKDSK just to be safe, and imaged the install when done.
Over the next 4 months the system would randomly (1-3 of every 20 starts) BSOD during Windows sign-in with the same Page Fault in Non Paged Area error pointing to Win32kbase.sys (80%) or Win32kfull.sys (20%). The dump file was never created; it would just sit at 0% even though my dump settings were correct. If the system made it past sign-in without crashing, it would operate normally with no issues. I could browse the web in Chrome, watch movies, and do hours of heavy gaming with 100% load on my 1070 and high CPU/memory usage and never get a BSOD. I tried removing the GPU overclock in MSI Afterburner and ran several SFC, CHKDSK, and DISM passes (all returned no errors or corruption), but kept getting the BSOD.
At this point I figured I had some sort of corruption in the Windows install even after passing SFC and CHKDSK, so I decided to do a fresh install, and this is where the problems mounted. Anything in parentheses below is what I was thinking at the time, and the BSOD is always Page Fault in Non Paged Area unless noted.
  • During the first reinstall I got the BSOD after a restart while doing a Windows update for .NET and my sound card. (Maybe my sound card is bad?)
  • Reinstalled Windows again to see if it would repeat and got a BSOD after a restart while installing the first driver, which was the chipset. (OK, maybe not my sound card?)
  • Reinstalled Windows again and got a BSOD while formatting the SSD partitions in Windows setup. (Is there something for the NVMe I am forgetting?)
  • Took the tower apart and cleaned all contacts. Made sure I had good connections and moved the memory to the other DIMM slots. Restarted and installed Windows. Made it through installing all the drivers. After reading that this BSOD usually means a problem with memory, I ran Windows Memory Diagnostic, and it came back with no errors. I didn’t have a lot of free time to test the RAM extensively, so I picked up a new kit at Best Buy. With the only change to the system being the NVMe drive, I figured I must be missing something there as well. Did some reading and found out there is a driver provided by Samsung for the drive.
  • Installed the new RAM, set the XMP, and reinstalled Windows again. After the chipset, installed the Samsung NVMe driver. Got a BSOD two restarts later after installing the Intel RST driver.
  • Did some more reading and extracted the driver so I could install it during Windows setup. Loaded setup, formatted the drive, did a clean using DISKPART, then installed the NVMe driver. Continued and installed Windows. Got a BSOD several restarts later. At this point I began to think there might be something wrong with the NVMe (even though SMART checked good), the MB slot, or possibly even my power supply (as it was 13 years old). At least this time it created the dump file, which pointed to ntoskrnl.exe, 0x50.
  • Purchased a new power supply (850 watt EVGA G3) and a new 970 EVO. Installed both, but this time used the 3rd M.2 slot. I had been using the 2nd, as the 1st one knocks out two of my SATA ports. Also restored all BIOS settings to default, left the RAM at non-XMP 2133MHz, and disconnected EVERYTHING except both graphics cards, sound card, NVMe, keyboard, and mouse.
  • Reinstalled Windows and got a “Memory Management” BSOD during setup finalization at “getting devices ready”. (OK, maybe it really is my sound card, as it’s the only attached device?)
  • Was forced to reinstall Windows since the installer crashed. Made it all the way through drivers and Windows updates, but got a BSOD this time after a restart while installing the MB apps, RGB Fusion (to control the LEDs) and SIV (to set up fan profiles).
  • Reinstalled Windows without the MB apps but got a BSOD on a restart, this time while installing the printer driver rather than while uninstalling the unnecessary printer bloatware. (Can’t be the printer software, as other crashes happened before the printer was installed or connected?)
  • At this point I was trying to reverse any other changes made since the system was last stable, so I reverted to the BIOS I ran before the NVMe upgrade, which is F7.
  • While doing more reading on the BSOD crashes, especially the 0x50 ntoskrnl.exe ones related to overclocking, I decided to look at the memory-related voltages in the BIOS. At the default “auto” setting with non-XMP 2133MHz, the VCCIO voltage was 0.946v and the System Agent voltage was 1.05v. I had read that the two voltages should be 0.05v apart, so I bumped the VCCIO voltage to 1.0v. Since then I have restarted at least 15 times, run Prime95 (custom test using 15 of 16GB of RAM), and run Windows Memory Diagnostic, and all came back clean.
So at this point I am not sure if it’s the BIOS being reverted, me bumping the VCCIO voltage up to 1.0v, or just a lull between BSODs. I have sat at the computer for 15 minutes continually restarting without getting a crash. The only constant to my issues is that a BSOD will only happen while signing into Windows. Windows always boots to sign-in with no issues, and once past the sign-in screen it works just fine. Does anyone have any advice as to what to do next if I do get another BSOD? If I don’t get another one I would like to get the memory back to XMP at 3000MHz, but if the instability was from voltage, what should I be looking for after setting the XMP or manually setting up the timings?
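One rough way to judge whether 15 clean restarts points to a fix or just a lull is to put numbers on it. The following is a minimal Python sketch, assuming every restart is an independent trial with the crash rate quoted earlier (1-3 BSODs per 20 starts). The function names are made up for illustration, and the independence assumption is a simplification that real hardware faults may not follow.

```python
import math


def prob_all_clean(crash_rate: float, restarts: int) -> float:
    """Chance of zero crashes in `restarts` boots at a fixed per-boot crash rate."""
    return (1.0 - crash_rate) ** restarts


def restarts_needed(crash_rate: float, confidence: float = 0.95) -> int:
    """Smallest run of clean restarts that would occur by chance with
    probability at most (1 - confidence) if the fault were still present."""
    return math.ceil(math.log(1.0 - confidence) / math.log(1.0 - crash_rate))


if __name__ == "__main__":
    # Assumed crash rates taken from the "1-3 of every 20 starts" estimate.
    for crashes_per_20 in (1, 2, 3):
        rate = crashes_per_20 / 20
        print(
            f"{crashes_per_20}/20: P(15 clean restarts) = "
            f"{prob_all_clean(rate, 15):.2f}, clean restarts needed for 95% = "
            f"{restarts_needed(rate)}"
        )
```

Under those assumptions, 15 clean restarts would still happen by chance roughly 46% of the time at 1 crash per 20 starts and about 9% of the time at 3 per 20, so 15 clean boots is suggestive rather than conclusive.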
Thanks for reading such a long post, and thank you to anyone who offers to help.
 

Colif

Titan
Moderator
New Ram
New PSU
New NVMe - can you test this actually works, as it's the 1 new part that seems to have started the cascade
try running https://downloadcenter.intel.com/download/19792/Intel-Processor-Diagnostic-Tool on CPU unless it crashes too fast.

are all the hdd disconnected while you do this work?
try removing 2nd GPU while installing
same with soundcard


how long does it run in safe mode?
At the login screen, click on the power button at bottom right of screen
while holding left shift, click restart PC
this loads you into advanced startup

choose troubleshoot
choose advanced options
choose startup settings
click the restart button
choose a safe mode (it doesn't matter which) by using number associated with it.
Pc will restart and load safe mode

how long you last in safe mode depends on if this is just a software problem.

while in safe mode
Can you follow option one on the following link - here - and then do this step below: Small memory dumps - Have Windows Create a Small Memory Dump (Minidump) on BSOD

that creates a file in C:\Windows\Minidump after the next BSOD
copy that file to documents
upload the copy from documents to a file sharing web site, and share the link here and I will get someone to convert file into a format I can read
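The copy step above could also be scripted. This is only a sketch in Python, assuming the default minidump folder C:\Windows\Minidump and a Documents folder directly under the user profile (it may be redirected on some systems). The function name copy_latest_minidump is hypothetical, and reading the minidump folder typically needs an administrator prompt.

```python
import shutil
from pathlib import Path
from typing import Optional


def copy_latest_minidump(minidump_dir: Path, dest_dir: Path) -> Optional[Path]:
    """Copy the most recently modified .dmp file into dest_dir.

    Returns the path of the copy, or None if the folder or a dump is missing.
    """
    if not minidump_dir.is_dir():
        return None
    dumps = sorted(minidump_dir.glob("*.dmp"), key=lambda p: p.stat().st_mtime)
    if not dumps:
        return None
    dest_dir.mkdir(parents=True, exist_ok=True)
    # copy2 keeps the original timestamps, which helps match the dump to a crash.
    return Path(shutil.copy2(dumps[-1], dest_dir))


if __name__ == "__main__":
    # Assumed default locations; adjust if Windows is configured differently.
    source = Path(r"C:\Windows\Minidump")
    destination = Path.home() / "Documents"
    copied = copy_latest_minidump(source, destination)
    print(copied if copied else f"No .dmp files found in {source}")
```

Working on a copy rather than the original keeps the file in C:\Windows\Minidump untouched in case it is needed again.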
 
Jul 17, 2019
Colif,
Thanks for getting back to me so quickly. I am on the road for work and return home tomorrow night. I will be happy to run the CPU diagnostic utility and will post the results.

The mechanical HDDs are always disconnected during Windows install. This way Windows only has one drive to install to, and I can't accidentally format a drive I didn't want to. The HDDs are connected once Windows and all drivers are in. The drives have not been giving me any errors, SMART issues (checked every drive in HWiNFO), or read/write performance degradation.

I have not tried booting the system in safe mode as some of the crashes were during windows setup and install (making me think it is a hardware issue). Will boot into safe mode after I run the CPU diag. If the system makes it in, did you want me to leave it running idle and see if it stays up?

I will reconfigure the dump settings as you instructed and post the dump to Dropbox once the system crashes.

If you don't mind me asking (always willing to learn), what makes you think it's the CPU? I had ruled it out as it passes Prime95, games at heavy load, and completes the memory diagnostic.

Thanks again for your time.
 

Colif

Titan
Moderator
Yes, errors during setup do tend to point at hardware, so testing the CPU was just a way to reduce the number of items it could be caused by. I got you to run the Intel tool as I didn't know you had already run Prime95 (might have missed it in your post)

I normally help people with BSOD, it's why I answered this thread, but yours is likely to be hardware and you seem to have done most of the things I would suggest. I will see if others have any smart ideas

almost everything is new, only part you haven't replaced is motherboard. No real way to test it though.
 
