Very unstable system with frequent freezes and BSODs

barroso

Oct 4, 2015
The system was stable for about 2 months after assembly, except that the latch on the 24-pin ATX connector can't hook onto the socket. At first the PC wouldn't boot until I pushed the connector in tighter, and some time later it once wouldn't turn on, again until I pushed it in.

After I started playing a certain game I got the first freezes and the first BSOD, after which I immediately updated the drivers and the BIOS. That didn't solve the issue. In fact, from that point on the instability increased and spread from strictly in-game to every kind of use.

Some of the BSOD messages: bad pool header, memory management, (driver) IRQL not less or equal, system service exception, page fault in nonpaged area. Some name different .sys files (win32k.sys, dxgmms1.sys), and countless others show just the technical information.
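For keeping track of which stop errors recur, a small tally can help. The sketch below maps the messages listed above to their Windows bug check codes; the `observed` list is a made-up placeholder, not the poster's actual crash history, and `tally_stop_codes` is a hypothetical helper name.

```python
from collections import Counter

# Bug check codes for the stop errors named in the post.
BUG_CHECKS = {
    0x0A: "IRQL_NOT_LESS_OR_EQUAL",
    0x19: "BAD_POOL_HEADER",
    0x1A: "MEMORY_MANAGEMENT",
    0x3B: "SYSTEM_SERVICE_EXCEPTION",
    0x50: "PAGE_FAULT_IN_NONPAGED_AREA",
    0xD1: "DRIVER_IRQL_NOT_LESS_OR_EQUAL",
}


def tally_stop_codes(codes):
    """Count how often each stop code occurred.

    Codes missing from BUG_CHECKS are kept under an UNKNOWN_0x.. label.
    """
    return Counter(BUG_CHECKS.get(code, f"UNKNOWN_0x{code:02X}") for code in codes)


# Placeholder data for illustration only.
observed = [0x1A, 0x50, 0x1A, 0xD1, 0x19]

for name, count in tally_stop_codes(observed).most_common():
    print(f"{count}x {name}")
```

Many different codes with no single repeating driver is often read as a hint toward hardware (memory in particular) rather than one bad driver, but that is a rule of thumb, not a diagnosis.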

I also started getting the "Overclocking failed" message, which sometimes appears after freezes even though there is no overclock at all going on.

In fact, while running tests (memtest86, the OCCT tests, Furmark, all with zero errors) I realized the DRAM is not running at its rated speed and has instead defaulted to 2133 MHz CL20. When I try turning on either of the 2 XMP profiles I instantly get the "Overclocking failed" message on reboot and can't boot Windows until I restore defaults.
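To put the two memory settings side by side, here is a minimal sketch of the usual first-word latency arithmetic. It assumes the "MHz" figures in the post are the DDR data rate in MT/s (two transfers per clock), which is how DDR4 kits are normally labelled; `first_word_latency_ns` is a name made up for this example.

```python
def first_word_latency_ns(data_rate_mts: float, cas_latency: int) -> float:
    """Approximate CAS latency in nanoseconds.

    DDR memory transfers twice per clock, so one clock period in ns
    is 2000 / data_rate (MT/s). Latency is CL clock periods.
    """
    return cas_latency * 2000.0 / data_rate_mts


# The defaulted setting from the post vs. the kit's rated setting.
for label, rate, cl in [("default 2133 CL20", 2133, 20), ("rated 3000 CL15", 3000, 15)]:
    print(f"{label}: {first_word_latency_ns(rate, cl):.2f} ns")
```

The defaulted setting is noticeably slower on paper, but as a later reply points out, the real-world gain from XMP is small compared with the cost of an unstable system.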

I tried turning XMP on with each stick separately. One crashed instantly; the other managed to boot the first time, but it actually increased the instability and I had to restore defaults.
 


Yes, I'm currently using only one, the one that managed to boot with XMP on, and nothing at all changed: same BSODs, same crash frequency.

I even tried setting the clock one step below the 2133 MHz it defaulted to, and that severely increased instability, so I had to restore defaults.
 
Okay.. And is that stick running with XMP right now?

Any XMP is just going to make your system even more unstable until you fix the problem 😛

My personal rig couldn't handle the XMP from my 2400MHz CL10 memory. Constant display driver crashes and BSODs. So I just turned the XMP off. I couldn't be bothered to find out whether it was the mobo, the memory controller or the memory itself, because there is almost no performance gain and that's a lot of gray hair for nothing.
 
I am running that one stick at its default settings with XMP off, and I did the full 8-hour memtest with zero errors.

I updated the BIOS 2 times; neither update affected the issues in any way.

The thing that freaks me out the most is that everything ran perfectly fine for 2 months, as I said, with no issues whatsoever.

Asus Z170 Pro Gaming
i7 6700K
R9 390X
Kingston HyperX 2x8GB 3000MHz CL15
Hyper 212 Evo cooler
Samsung 850 Evo SSD
EVGA G2 850W
 
It seems to be due to overheating ...

Clean your PC, replace the thermal paste and reseat the CPU

And the GPU as well if you can

Take whatever steps are applicable to reduce the temperature and try again

Note: If the system gives you a BSOD at the same point every time, then it may be some other issue. If the PC gives BSODs at different times or in different situations, then it is likely a temperature problem

Try this ...

Continuously monitor the GPU and CPU temperatures and check what they are just before a BSOD

and

After the system restarts, press Win+R, type eventvwr and check whether any logged event led up to the BSOD
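When going through Event Viewer, a few System log entries are the usual ones to look for around a crash. The sketch below filters a list of event records down to those entries. The record layout (`source`, `id`, `time` keys) is an assumption made for this example and is not an Event Viewer export format; the sample rows are placeholders.

```python
# (source, event ID) pairs commonly logged around an unexpected restart:
#   Kernel-Power 41 : system rebooted without shutting down cleanly
#   BugCheck 1001   : the computer rebooted from a bugcheck (BSOD)
#   EventLog 6008   : the previous shutdown was unexpected
CRASH_EVENTS = {
    ("Kernel-Power", 41),
    ("BugCheck", 1001),
    ("EventLog", 6008),
}


def crash_related(events):
    """Return only the records whose (source, id) pair is in CRASH_EVENTS."""
    return [e for e in events if (e["source"], e["id"]) in CRASH_EVENTS]


# Placeholder records for illustration only.
sample = [
    {"time": "2015-10-04 21:03", "source": "Kernel-Power", "id": 41},
    {"time": "2015-10-04 21:04", "source": "Service Control Manager", "id": 7036},
    {"time": "2015-10-04 21:04", "source": "BugCheck", "id": 1001},
]

for event in crash_related(sample):
    print(event["time"], event["source"], event["id"])
```

A Kernel-Power 41 entry with no matching BugCheck entry usually means the machine froze or lost power without writing a dump, which is worth noting alongside the freezes described above.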

 
Check that the Asus motherboard has the newest BIOS, as they did find a bug in most Skylake CPUs under Prime95. With really fast RAM, keep in mind that the memory controller is in the CPU and not on the motherboard. You may have a failed memory controller, or the RAM voltage is too low for the sticks at the speed you want. Start by setting the RAM to the non-XMP default DDR4 speed (2133). Use CPU-Z to read the RAM's EEPROM (the SPD data) and set the timings for the slower speeds. If the RAM is stable at the slower speeds, go into the DRAM voltage setting and raise it slowly, then raise the RAM speed to its XMP profile and see if the RAM is stable. If you can't get it stable at any speed, your CPU may have gone bad.
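The procedure in this reply (confirm stability at the default speed, raise the DRAM voltage in small steps, then raise the speed toward the XMP rating) can be written down as a test plan before touching the BIOS. This is only a sketch under stated assumptions: 1.20 V is the standard DDR4 voltage, but the 1.35 V ceiling, the 50 mV step and the intermediate speeds are assumptions that should be checked against the kit's XMP profile in CPU-Z. `build_plan` is a hypothetical helper.

```python
def build_plan(base_speed, speed_steps, start_mv, stop_mv, step_mv):
    """Return an ordered list of (speed, millivolts) settings to test.

    Phase 1: hold the base speed and step the voltage up.
    Phase 2: hold the final voltage and step the speed up.
    """
    plan = [(base_speed, mv) for mv in range(start_mv, stop_mv + 1, step_mv)]
    plan += [(speed, stop_mv) for speed in speed_steps]
    return plan


# Assumed values: verify the voltage ceiling against the kit's XMP profile.
plan = build_plan(
    base_speed=2133,
    speed_steps=[2400, 2666, 3000],
    start_mv=1200,
    stop_mv=1350,
    step_mv=50,
)

for speed, mv in plan:
    print(f"DDR4-{speed} at {mv / 1000:.2f} V: stress test, note result")
```

Each step should be stress tested before moving to the next one, and the first unstable step is where to stop and go back.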
 
Is this part (http://imgur.com/zXWor3E) of the Event Viewer what you were referring to?

As far as temperatures are concerned, I did stress test both the CPU and the GPU with OCCT and Furmark without issues. I also tracked both with HWiNFO while gaming. The overall CPU temperature and the GPU "thermal diode" temperature (the same value Furmark tracks) are in order, but the GPU's "VRM1 temp" can go as high as 102°C. As I said, though, the system ran perfectly fine for a full 2 months with titles more demanding than the game that provoked the first BSOD. In that 2-month period I also reapplied thermal paste on the CPU, as I had to remove the cooler on one occasion.

Crashes definitely don't always happen in the same situation; they can happen anywhere, whether gaming, in Windows or web browsing. Since temperatures are nowhere near the one mentioned while I'm just web browsing, I'd dare say it's not related to temps. Any random crash can prompt the "Overclocking failed" message (I don't know if that's relevant), and it then takes several reboots until a successful boot.

I did try running the RAM at a lower clock with the default voltage. I also tried raising the voltage at the default frequency. Both increased instability, and I had to reset to defaults in order to be able to boot. I'd even get crashes while in the BIOS on those occasions.
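One way to check the "temperature just before the crash" question is to log sensor readings and look only at the window leading up to each crash. The sketch below assumes readings have already been loaded into dictionaries with `time`, `cpu_c`, `gpu_c` and `vrm1_c` keys; those key names and all sample values are made up for illustration and do not reflect any logging tool's actual column names.

```python
from datetime import datetime, timedelta


def samples_before(samples, crash_time, window=timedelta(minutes=2)):
    """Return the readings taken within `window` before the crash."""
    start = crash_time - window
    return [s for s in samples if start <= s["time"] <= crash_time]


def peaks(samples, keys=("cpu_c", "gpu_c", "vrm1_c")):
    """Return the highest reading per sensor, or {} if there are no samples."""
    if not samples:
        return {}
    return {key: max(s[key] for s in samples) for key in keys}


# Placeholder readings for illustration only.
log = [
    {"time": datetime(2015, 10, 4, 20, 55), "cpu_c": 60, "gpu_c": 74, "vrm1_c": 95},
    {"time": datetime(2015, 10, 4, 20, 59), "cpu_c": 41, "gpu_c": 50, "vrm1_c": 58},
    {"time": datetime(2015, 10, 4, 21, 0), "cpu_c": 39, "gpu_c": 48, "vrm1_c": 55},
]
crash = datetime(2015, 10, 4, 21, 0, 30)

print(peaks(samples_before(log, crash)))
```

If crashes keep occurring with low readings in that window, as described for web browsing, that supports the view that heat is not the cause.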

One more thing, by "clean your PC" what do you mean exactly, and how could that affect the stability issue? Temperatures?
 
Strange indeed. Is there any other PCIe device in the system? You mention that you are not overclocking and still get the Overclocking Failed message on startup. Are you comfortable with breadboard testing your PC? If yes, then take the motherboard out of the PC case. Put it on any non-conductive surface such as a wooden board or a breadboard; even the motherboard's packing box can be used for this purpose. Remove everything from the board, including the CPU.

Remove the CMOS battery and put it back in after 2-3 minutes. Don't install the CPU. Check the socket carefully with a magnifying glass and an LED light for any bent or broken pins. If all is clear, connect just the 24-pin ATX connector and update the BIOS using USB BIOS Flashback.

Once done, clear the CMOS. Now install the CPU and connect the 8-pin or 4-pin CPU power connector. Power on the PC and see what QCode displays on the QLED. If it is not CPU related, install the RAM and repeat the process. By now, you should get display output from the integrated GPU.

If all is clear, boot into Windows, run the Intel Processor Diagnostic Tool and see if it passes the test. Use the system for a while and see if any BSOD or freezing occurs. Once confident, install the graphics card as well.

If you still get BSODs or freezing despite all the above steps, it is time to test the RAM first. Sorry for the lengthy post.
 
Aight, so the CPU failed the Intel Processor Diagnostic Tool's "CPU frequency" test, while the others (including temp) passed???

Unfortunately I don't have the possibility to test with other hardware; that will happen only when I send everything to the retailer, if it comes to that.

I will test with integrated graphics; it can sometimes take hours until the first crash.

@easylover

Yes, I will conduct the tests you suggested. However, I'd like to know first whether the fact that the mentioned test failed sheds some light on the issue?
 
If the test failed, you should RMA the CPU and then rerun all the tests with the new CPU. As you don't have spare parts, you're stuck with RMAing the CPU, and you do have an Intel test failure that you can print out and send along. You don't want to send parts that aren't DOA back to the store.
 
It turned out that while both RAM sticks passed memtest86 with 0 errors, it was one of them that was causing the instability, ironically the one that was able to boot with the XMP profile on. After I removed it I never had a single crash again. I can't understand why the issues started only after 2 months, during which everything ran without problems.

I cannot help but wonder: was it really bad luck to get both a defective CPU and defective RAM (again, the CPU failed the Intel diagnostic's "frequency test" and one RAM stick causes system crashes), or might this have something to do with the fact that the 24-pin ATX connector cannot fit all the way in (I read it is not advisable to run a PC like this, as it can cause damage), or maybe with a defective motherboard?

I'd like to know whether it's all crystal clear or whether I should conduct some more tests before the RMA?
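Since memtest86 passed on both sticks here, the isolation came down to running each stick alone and watching for crashes in normal use. A small sketch of recording that kind of one-stick-at-a-time test follows. The stick and slot labels are hypothetical, and "stable" means "no crash during a reasonable period of normal use", not a memtest pass.

```python
def suspects(results):
    """Find sticks that fail in every slot and slots that fail with every stick.

    `results` maps (stick, slot) -> True if that single-stick run was stable.
    Returns (bad_sticks, bad_slots) as sorted lists.
    """
    sticks = {stick for stick, _ in results}
    slots = {slot for _, slot in results}
    bad_sticks = sorted(
        stick for stick in sticks
        if not any(ok for (s, _), ok in results.items() if s == stick)
    )
    bad_slots = sorted(
        slot for slot in slots
        if not any(ok for (_, sl), ok in results.items() if sl == slot)
    )
    return bad_sticks, bad_slots


# Placeholder outcomes for illustration only.
results = {
    ("stick_A", "slot_1"): True,
    ("stick_A", "slot_2"): True,
    ("stick_B", "slot_1"): False,
    ("stick_B", "slot_2"): False,
}

print(suspects(results))
```

A stick that fails in every slot while the other stick is stable everywhere points at the stick; failures that follow a slot instead would point at the board or the memory controller.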