Computer BSoD's within one hour after boot.

BartB

Prominent
Jun 13, 2017
Hello there,

For the last couple of days I have been getting a lot of BSoDs. There are several different ones: WHEA_UNCORRECTABLE_ERROR, IRQL_NOT_LESS_OR_EQUAL, MEMORY_MANAGEMENT, SYSTEM_SERVICE_EXCEPTION, and [Has no name] 0x00000124. They happen within one hour after boot, and applications, especially games and Firefox, crash much more often than before. I have tried reinstalling Windows 10, but that did not help. The dumps that I have from the blue screens can be found here. I do not have dumps of every error since I was forced to reinstall Windows. I've tested the memory and SSD and found no errors. I haven't tested other hardware since I don't know how to test for defects in components like the PSU or motherboard. My PC runs Windows 10 Education x64.

Here is a list of my PC components:
CPU: Intel Core i7 6700K
Motherboard: ASRock Fatal1ty Gaming K6
RAM: G.Skill Ripjaws V 2 x 8 GB DDR4 @ 2666 MHz
GPU: Asus STRIX-GTX980TI-DC3OC-6GD5
PSU: Corsair RM750
CPU Cooler: NZXT Kraken x61
Storage: Samsung 850 EVO 500GB

And here is a list of my peripherals:
Keyboard: Corsair RGB rapidfire k65
Mouse: Razer Naga 2014
Case lighting: NZXT Hue+
Wireless adapter: Eminent EM4554 | wBUS150

My CPU and GPU are not overclocked; only the RAM is clocked to 2666 MHz instead of 2133 MHz. The hardware runs at comfortable temperatures under load: CPU ~50°C and GPU ~80°C.
What can I do to fix these BSoDs? I could really use some help because my PC is basically useless right now.

Greetings,

BartB
 


The RAM is made for 2666 MHz, and I ran memtest86 twice but it didn't find any errors in the memory.
The RAM is already running on an Intel XMP profile. Do you have any other suggestions?
 
1) Try updating the chipset driver: https://downloadcenter.intel.com/download/20775
2) Check the BIOS version; if the motherboard is not on the newest one, update it. If you don't feel comfortable doing it, ask someone for help.

Because the RAM is fine and 0x00000124 is related to hardware or drivers, you could take out the GTX 980 Ti and use the onboard iGPU to see what happens. Check the SSD's S.M.A.R.T. status with Samsung Magician. If you have another HDD, try disconnecting it too. Also try a regular keyboard and mouse, and unplug the wireless adapter. Keep in mind that when you troubleshoot, you have to run with the minimum hardware so that you can isolate the problem.
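
For reference, the "[Has no name] 0x00000124" stop code from the first post is the numeric form of WHEA_UNCORRECTABLE_ERROR, so it is the same hardware-error bug check rather than a fifth kind of crash. A minimal Python sketch of that lookup, with the codes taken from Microsoft's bug check code reference (the helper names `name_for_code` and `format_code` are made up for illustration):

```python
# Bug check names reported in this thread and their numeric stop codes.
BUGCHECK_CODES = {
    "WHEA_UNCORRECTABLE_ERROR": 0x124,
    "IRQL_NOT_LESS_OR_EQUAL": 0x0A,
    "MEMORY_MANAGEMENT": 0x1A,
    "SYSTEM_SERVICE_EXCEPTION": 0x3B,
}


def name_for_code(code: int) -> str:
    """Return the symbolic name for a stop code, or "UNKNOWN" if not listed."""
    for name, value in BUGCHECK_CODES.items():
        if value == code:
            return name
    return "UNKNOWN"


def format_code(code: int) -> str:
    """Format a stop code the way blue screens print it (8 hex digits)."""
    return f"0x{code:08X}"


if __name__ == "__main__":
    print(format_code(0x124), name_for_code(0x124))
```

Only the four codes seen in this thread are listed; anything else falls through to "UNKNOWN".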

 

I installed the driver update utility from the link, but it only found updates for the Intel graphics. I updated those and removed the GPU and the wireless USB dongle. I also installed Samsung Magician; the SSD's S.M.A.R.T. status is "Good". I updated the SSD drivers as well, but the system still crashes. Do you know any other possible solutions?
 


My motherboard's BIOS is already up to date, and the voltages are within range: 12.000, 5.112 and 3.344. I now use a CM Storm Quickfire TK and a Logitech wireless mouse; I don't have anything more basic than this.
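
A quick way to sanity-check readings like these is to compare them against the ±5% tolerance that the ATX specification allows on the +12 V, +5 V and +3.3 V rails. The sketch below assumes the three values were listed in that order; `rail_report` is a hypothetical helper, not part of any monitoring tool.

```python
# Nominal ATX rail voltages and the +/-5% tolerance the ATX spec allows on them.
NOMINAL_RAILS = (12.0, 5.0, 3.3)
TOLERANCE = 0.05


def rail_report(measured):
    """Map each nominal rail voltage to True if the reading is within tolerance."""
    report = {}
    for nominal, value in zip(NOMINAL_RAILS, measured):
        low = nominal * (1 - TOLERANCE)
        high = nominal * (1 + TOLERANCE)
        report[nominal] = low <= value <= high
    return report


if __name__ == "__main__":
    # Readings quoted in the post, assumed to be +12 V, +5 V, +3.3 V in that order.
    for nominal, ok in rail_report((12.000, 5.112, 3.344)).items():
        print(f"+{nominal} V rail:", "within tolerance" if ok else "OUT OF RANGE")
```

All three readings pass this check. Keep in mind that sensor readings taken at idle are only a snapshot and may not show brief dips under load.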
 
Can you run WhoCrashed and post the results back? I'd rather not open the minidump files, because you said the RAM is fine, the drivers and BIOS are up to date, and the CPU and GPU are not overclocked. WhoCrashed may tell you what caused the problem.

Also, do you have any other add-on hardware, like a sound card, and which antivirus software do you use?
 


Here is the WhoCrashed report: https://drive.google.com/file/d/0BwQjxd35gR9FNENiMEp6dXVUR3c/view?usp=sharing. I have an NZXT Hue+ and an NZXT Kraken X61, which both connect to the PC over internal USB, so they have drivers; otherwise I don't have anything else with drivers. I use the default Windows Defender as antivirus software.
 
The errors are related to overheating and drivers.
1) Check the OS files with SFC.exe. How to: https://support.microsoft.com/en-us/help/929833/use-the-system-file-checker-tool-to-repair-missing-or-corrupted-system-files
2) Take out the GTX 980 Ti, use the onboard iGPU, then download and install the Intel® Processor Diagnostic Tool to stress-test the CPU: https://downloadcenter.intel.com/download/19792
I suggest this because I think the NZXT cooler driver may be causing the problem.
3) Try CCleaner to clean your PC: https://www.piriform.com/CCLEANER
 
Did you update your SSD's firmware? I don't mean drivers, I mean the actual firmware. In cases like this you will get a lot of misdirection from the BSODs. You just need to update everything, from your BIOS to your drivers to your SSD firmware. I had a similar issue months back; I updated everything, and it was the SSD firmware update that fixed it.

Before you do update it though, be sure to back up your PC. A free tool like Rollback Rx or Macrium Reflect will work.
 

I already updated the SSD's firmware to the latest version with Samsung Magician.
 


I checked the OS files with SFC.exe. The first two attempts couldn't complete the scan for some reason, but after a reboot it ran and didn't find anything. I also ran Intel's Processor Diagnostic Tool, but it didn't find any problems with my CPU.
Finally, I ran CCleaner on the registry. But my PC still keeps crashing.
 
Because both hal.dll and ntoskrnl.exe are related to hardware and drivers, you could test with other parts: run with the minimum hardware, a regular mouse and keyboard, an air cooler, etc. I recommend going to a local PC shop for help, because you have already tried most of the basic tests. Sorry I can't help you further. Try asking our mod team to see what they say.
 

Ok, thank you for your help!
How can I contact the mod team?
 
Only thing I can even begin to suggest doing is to install windows on a different drive/partition. If it works, you know it's an issue with your software. If it too has problems you'll know it's a hardware issue.

Only time I've had a HAL issue was with a bad CPU or board. If it's hardware, that's what I'd replace.
 


I've found the problem: I enabled only 2 CPU cores for a week and I haven't had any crashes. Is my CPU broken, or could it also be the motherboard?
 
AMD has in the past disabled parts of the CPU that are bad and sold the chip as an x3 or x2 CPU. Sometimes people have tried to enable the disabled parts in case AMD disabled a fully working chip just to get a sale. (If the x3s are selling and the x4s aren't, it might make business sense to disable a fully working x4 and turn it into a saleable x3.) I've never known Intel to do this however.

If you disabled parts of the CPU, there are several reasons why this could have worked. It could be a bad CPU. Intel doesn't have any tools that let you select what to enable, so it's hard for you to pick which cores get disabled and which stay working. You might be able to enable more of them. Or not. (You also have an i7 CPU, so perhaps hyperthreading isn't working. Try disabling HT and see if it works better.) Another possibility is that your motherboard is having issues. Running fewer cores puts less of a load on the board. If a heatsink isn't on correctly, or something is damaged on the board, lowering the load will help with that. The PSU could also be at fault. If the PSU is damaged and can't deliver power correctly, then it can also cause issues.

As I said in my first post, the only time I've had a HAL issue was with a bad CPU. My gut says that. But I've had way more bad motherboards than CPUs. You might need a shop to swap parts out and find the broken one.
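
If you experiment with the core count or hyperthreading in the BIOS, it can help to confirm from inside the OS that the setting actually took effect before drawing conclusions from a crash-free week. A rough Python sketch, assuming the i7-6700K's 4 cores with 2 threads per core; the function names are illustrative only:

```python
import os

THREADS_PER_CORE_WITH_HT = 2  # assumption: 2 threads per core when HT is on


def expected_logical_cpus(enabled_cores: int, hyperthreading: bool) -> int:
    """Logical CPUs the OS should report for a given BIOS configuration."""
    return enabled_cores * (THREADS_PER_CORE_WITH_HT if hyperthreading else 1)


def bios_setting_applied(enabled_cores: int, hyperthreading: bool) -> bool:
    """Compare what the OS reports with what the BIOS setting should give."""
    return os.cpu_count() == expected_logical_cpus(enabled_cores, hyperthreading)


if __name__ == "__main__":
    print("OS reports", os.cpu_count(), "logical CPUs")
    print("2 cores + HT applied:", bios_setting_applied(2, True))
    print("4 cores, HT off applied:", bios_setting_applied(4, False))
```

Note that 2 cores with HT on and 4 cores with HT off both show up as 4 logical CPUs, so this check alone cannot tell those two configurations apart.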
 
Solution
The bugchecks would seem to indicate a power problem. I would look into the cause.
If a graphics card is not given proper power, the power to the CPU fluctuates and you get bugchecks just as if you had overclocked the system. (Failing CPU coolers can also cause this.)

If you provide the memory dumps, I can use the Windows debugger and dump the reason the CPU shut the system down. Generally, if you look at the system uptime timer and it is over 15 minutes, you have an overheating problem; if it is under 15 seconds, then the motherboard logic reset the CPU because of a power problem.
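
The rule of thumb above can be written down as a small Python sketch. The thresholds are the ones from the post; the "inconclusive" label for everything in between is an assumption added here, since the post does not say what that range means.

```python
POWER_THRESHOLD_S = 15          # under 15 seconds of uptime: suspect power
OVERHEAT_THRESHOLD_S = 15 * 60  # over 15 minutes of uptime: suspect overheating


def classify_uptime(uptime_seconds: float) -> str:
    """Guess the likely cause of a crash from the uptime recorded in the dump."""
    if uptime_seconds < POWER_THRESHOLD_S:
        return "power"
    if uptime_seconds > OVERHEAT_THRESHOLD_S:
        return "overheating"
    return "inconclusive"  # assumption: the post gives no rule for this range


if __name__ == "__main__":
    for uptime in (8, 300, 45 * 60):
        print(f"{uptime:>5} s uptime ->", classify_uptime(uptime))
```

This is only a first guess to narrow down where to look; the dump itself is still needed to see why the CPU shut the system down.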