[SOLVED] WHEA_UNCORRECTABLE_ERROR out of the blue

SniperSpree58

Reputable
Jun 22, 2015
Hey everyone,

I recently started experiencing an issue similar to one another user reported. This is a new PC, built about 2 weeks ago, and for the first week and a half it worked like a charm, no issues whatsoever. Then, about 4 days ago, I started getting WHEA_UNCORRECTABLE_ERROR BSODs at random times. Sometimes it crashes as soon as I boot up, sometimes when I launch a program, and it almost always crashes whenever I run a game. I first thought it might be a driver issue, so I uninstalled all the drivers I could think of and reinstalled them; that wasn't it. I then started focusing on hardware. I removed my GPU and made sure there were no drivers left for it, tried different RAM, swapped to a completely different SSD, and reinstalled Windows completely. All of that to no avail. The only time I can use this PC without crashing is in safe mode, or safe mode with networking, which leads me to rule out my PSU.

So far my suspects are either my CPU or my motherboard. I'm PRETTY sure it's not the CPU, as running the Intel diagnostic tool and Prime95 for a couple of hours didn't crash my computer (I was running in safe mode). As asked in other threads, temps are exceptional in my setup so that can't be the issue. I have not overclocked my PC whatsoever; I'm still running my RAM at 2133 MHz even though it's rated for 3200 MHz.

PC's specs are:
i7-9700k (not overclocked)
Zotac RTX 2070 (not overclocked)
ASRock Z390M-ITX/ac
Corsair Vengeance LPX 16GB 3200 MHz (running at the default 2133 MHz)
Sabrent 512GB M.2 (not currently installed, though it's on hand)
128GB Samsung 840EVO SSD (currently installed)
EVGA SuperNOVA 650 G+ 80+ Gold
Windows 10 Pro

I have BlueScreenView open right now looking at the minidumps, and it's supposedly caused by ntoskrnl.exe with a bug check code of 0x00000124. I would upload a copy of all my minidumps, but I'm not sure what website to upload them to if anyone wants to evaluate them.

Here is a screenshot blaming hal.dll: https://i.imgur.com/F0uSwsU.png

And another one blaming ntoskrnl.exe: https://i.imgur.com/96E6s4d.png


At this point what confuses me is that people generally say WHEA_UNCORRECTABLE_ERROR BSODs are hardware related, yet running in safe mode works just fine. If it's any help, games and random apps also crash during use or on launch without a BSOD, and some apps just will not run or launch at all (the Intel diagnostic tool, for example, would only work in safe mode). Any help would be appreciated. If I can't get a solution by the end of Thursday I'm just going to return my motherboard, as that's what I'd imagine the culprit is, though I'm still holding on just in case it is a software issue. Thanks in advance!


EDIT!!!!

The issue was Intel Turbo Boost, which somehow either enabled itself or just wasn't a problem for the first two weeks. I disabled it and my cores now max out at 3.6 GHz, and the problem is solved! Thanks for the support everyone.
 
Can you go to C:\Windows\Minidump,
copy the files from there to a file sharing website, and post the link here?
I will get someone to convert them into a form I can read.

Safe mode doesn't put as much stress on the system as normal mode does.

WHEA - Windows Hardware Error Architecture. It's an error reported by the CPU but not necessarily caused by it. It can be any hardware, it can be drivers, and it can be caused by heat.
Remove MSI Afterburner if installed. Remove Intel Extreme Tuning Utility if installed.
dumps might show us more
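
A minimal Python sketch of the collection step asked for above, assuming the dumps sit in the default C:\Windows\Minidump folder. The function name bundle_minidumps and the archive name are made up for illustration. It zips every .dmp file into one archive that can be uploaded to a file sharing site, and skips zero-byte files, since an empty dump cannot be analyzed.

```python
import zipfile
from pathlib import Path


def bundle_minidumps(dump_dir, archive_path):
    """Zip every non-empty .dmp file in dump_dir.

    Returns two lists of file names: (bundled, skipped_because_empty).
    """
    bundled, skipped = [], []
    with zipfile.ZipFile(archive_path, "w", zipfile.ZIP_DEFLATED) as archive:
        for dump in sorted(Path(dump_dir).glob("*.dmp")):
            if dump.stat().st_size == 0:
                # A zero-byte dump was never written properly; leave it out.
                skipped.append(dump.name)
                continue
            archive.write(dump, arcname=dump.name)
            bundled.append(dump.name)
    return bundled, skipped
```

Calling bundle_minidumps(r"C:\Windows\Minidump", "minidumps.zip") from an elevated prompt (the folder usually needs administrator rights to read) should leave a minidumps.zip in the current directory.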
 


Sorry for the late reply, just got back after two weeks away. Here are all the minidumps in the folder: https://drive.google.com/drive/folders/18j0OV2hvSQEyt75zPXyrnDa-j4OLTch5?usp=sharing

Just to keep everyone updated, I replaced my motherboard and the same issue still persists.
 
Just a little checklist

CPU

Intel - https://downloadcenter.intel.com/download/19792/Intel-Processor-Diagnostic-Tool

All - https://www.mersenne.org/download/

passed both tests

Ram

https://www.memtest86.com/
not run but checked with different ram and now has a new motherboard

GPU

https://geeks3d.com/furmark/

https://benchmark.unigine.com/heaven

errors still happen with no GPU attached

HDD

HD Sentinel - https://www.hdsentinel.com/hard_disk_sentinel_trial.php

Samsung - https://www.samsung.com/semiconductor/minisite/ssd/product/consumer/magician/

WD - https://support.wdc.com/downloads.aspx?p=3&lang=en

Seagate - https://www.seagate.com/au/en/support/downloads/seatools/seatools-win-master/

Tried different drive, still BSOD

PSU

the paper clip method - https://forums.tomshardware.com/threads/what-is-the-paperclip-method-of-testing-a-psu.1336402/

or multimeter,

or in the BIOS to check the +3.3V, +5V, and +12V. - https://www.lifewire.com/power-supply-voltage-tolerances-2624583

Motherboard

Has new motherboard

I assume you clean installed windows 10 when you got new motherboard? or tried to?


Unless there is something else attached not listed above, the only thing you haven't checked is PSU?
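
For the PSU check in the list above, here is a rough sketch of how BIOS or multimeter readings could be compared against tolerance. It assumes the commonly quoted +/-5% ATX tolerance for the +3.3V, +5V, and +12V rails (confirm against the linked tolerance table), and the function names and sample readings are hypothetical.

```python
# Assumed tolerance for the positive ATX rails (+3.3V, +5V, +12V).
# This is the commonly quoted figure; verify it against the tolerance table.
ATX_TOLERANCE = 0.05


def rail_in_spec(nominal, measured, tolerance=ATX_TOLERANCE):
    """Return True if measured is within +/- tolerance of the nominal voltage."""
    return abs(measured - nominal) <= nominal * tolerance


def check_rails(readings):
    """Map each nominal voltage in readings to True (in spec) or False."""
    return {
        nominal: rail_in_spec(nominal, measured)
        for nominal, measured in readings.items()
    }


# Hypothetical readings: nominal voltage -> measured voltage.
print(check_rails({3.3: 3.28, 5.0: 5.1, 12.0: 11.2}))
```

BIOS sensor readings are only approximate, so a borderline value is better confirmed with a multimeter under load.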
 


I have reinstalled Windows a total of 5 times over the course of my troubleshooting, and as your list says, I have tried multiple drives as well. In case it's relevant, my computer works in safe mode without crashing (that's what I'm using to type this up right now), but it will crash in normal Windows.

I have tested my PSU and it works fine.
 
I wasn't sure if you had reinstalled since getting the new motherboard.

Not crashing in safe mode could mean:

it's a driver (seems unlikely to me unless there is an extra device you attach to the PC that you haven't mentioned above; we get a lot of musical types here, so it's worth me asking :)

or it's still hardware, as safe mode doesn't put a lot of strain on hardware.

Have you thought about booting from an Ubuntu Live USB to see if you can detect any problems there? These hardware errors aren't just a Windows thing; they can show up in Linux as well - https://tutorials.ubuntu.com/tutorial/tutorial-create-a-usb-stick-on-windows

@PC Tailor can you look at the dumps? Curious what the raw output says, whether it's blaming a driver or Intel. If it says win8 driver, it might be software.
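
If you do try the Ubuntu Live USB, one rough way to look for the Linux counterpart of these errors is to search the kernel log for machine check reports. The sketch below is only an illustration. It assumes you have saved the output of `sudo dmesg` as text, and that the kernel tags such messages with "[Hardware Error]", "mce:", or "machine check". The function name is made up.

```python
def find_hardware_errors(log_text):
    """Return kernel log lines that look like hardware error reports."""
    # Markers are matched case-insensitively; they are an assumption about
    # how the kernel labels machine check events, not an exhaustive list.
    markers = ("[hardware error]", "mce:", "machine check")
    hits = []
    for line in log_text.splitlines():
        lowered = line.lower()
        if any(marker in lowered for marker in markers):
            hits.append(line.strip())
    return hits
```

An empty result after a long session under load would not prove the hardware is fine, but any hits would point away from Windows drivers.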
 
I have run the dump files and you can see the full report here:
Dump 1: https://pste.eu/p/GOih.html
Dump 2: https://pste.eu/p/9xiu.html

Summary of findings:
BugCheck 124
Probably caused by : GenuineIntel

Bugcheck Description:
WHEA_UNCORRECTABLE_ERROR
"This bug check indicates that a fatal hardware error has occurred. This bug check uses the error data that is provided by the Windows Hardware Error Architecture (WHEA).

Parameter 1 identifies the type of error source that reported the error. Parameter 2 holds the address of the WHEA_ERROR_RECORD structure that describes the error condition.

When a hardware error occurs, WHEA creates an error record to store the error information associated with the hardware error condition. "
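
To illustrate how the two parameters described above are read, here is a small sketch. The lookup table is written from memory of Microsoft's Bug Check 0x124 documentation and covers only the first few error-source values, so treat it as an assumption and check it against the official page. The function name and the sample address are hypothetical.

```python
# Parameter 1 values for bug check 0x124, recalled from Microsoft's docs.
# Treat this table as an assumption to verify, not an authoritative list.
WHEA_ERROR_SOURCES = {
    0x0: "Machine check exception",
    0x1: "Corrected machine check exception",
    0x2: "Corrected platform error",
    0x3: "Nonmaskable interrupt (NMI)",
    0x4: "Uncorrectable PCI Express error",
    0x5: "Generic hardware error",
}


def describe_bugcheck_124(param1, param2):
    """Summarize the error source (param1) and WHEA_ERROR_RECORD address (param2)."""
    source = WHEA_ERROR_SOURCES.get(param1, "Unknown or not listed here")
    return f"Error source: {source}; WHEA_ERROR_RECORD at {param2:#018x}"


# Hypothetical values for illustration only.
print(describe_bugcheck_124(0x0, 0xFFFFE00012345678))
```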

About your bugcheck:
"A WHEA ERROR is almost solely hardware based. It is possible in rare circumstances for this to be a driver, however it is very unlikely. It is often caused by:
  • Component overheating
  • Unstable overclocking or XMP profile
  • Faulty hardware"
Some things to consider:
I would highly advise you to view the full report above, as this will contain much more detail as to the bugcheck and modules running at the time.

I could not open the other 3 dump files as they attempted to map a file of size zero with the maximum size specified as zero, which typically means the file has become corrupt.

Also, both available dumps blame Intel.

Were all of your parts brand new?
 


Yes, all my parts are brand new, and I would also like to reiterate that the PC worked perfectly fine for approx. 2 weeks before all this started happening. It couldn't be XMP because I have it set to auto and my RAM is running at the default 2133 MHz, not its rated 3200 MHz. None of my components are overclocked, nor have they ever been. My temps are actually really good, so it doesn't make sense for it to be that either.

Say it is the CPU according to your analysis of the dump files: why would my PC work perfectly fine in safe mode but not in normal mode if it is a hardware issue? CPU stress tests in safe mode run perfectly fine and do not crash my system. I'd really rather not purchase another CPU, since my location makes that a real pain, so I'm trying to exhaust all other options first.
 


No, nothing additional is attached to my PC; the only things connected right now are my keyboard and mouse. The logs @PC Tailor analyzed said both crashes were caused by GenuineIntel.
 
Does your CPU currently run all the way up to 4.9 GHz under load? If so, manually set it in the BIOS to stop around 4.7 GHz and see if that makes any difference.

Also, how many sticks are you using for your 16 GB of RAM? If it's 4 sticks, it is sometimes necessary to bump up RAM voltage a tad.

As has been stated, stability in safe mode doesn't mean too much, as much of your hardware is running just a basic Windows driver that won't fully push the device.

And you said your temps are good. What are you using to measure those (got a screenshot)?
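
On the 4.7 GHz cap: BIOSes usually expose the turbo limit as a core ratio (multiplier) rather than a frequency. A quick sketch of the arithmetic, assuming the default 100 MHz base clock (BCLK); the function name is made up and your board may label the setting differently.

```python
def turbo_multiplier(target_mhz, bclk_mhz=100):
    """Return the core ratio that gives roughly target_mhz at the given base clock."""
    # Core frequency = base clock x multiplier, so divide and round.
    return round(target_mhz / bclk_mhz)


print(turbo_multiplier(4700))  # ratio for a 4.7 GHz cap
print(turbo_multiplier(4900))  # ratio for a 4.9 GHz cap
```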
 
I won't be able to elaborate now, but just to clarify: the dump blaming Intel doesn't mean the CPU is to blame. In this case it's likely just that the CPU reported the error, not necessarily that it caused it.

The details of the bugcheck and the comment I posted (except the "things to consider" bit) are generated automatically by my program, so they just highlight the most common causes.

Safe mode working doesn't guarantee drivers are the issue; it's just an indicator. Safe mode doesn't use a lot of the hardware in the same way normal operation would.

Do you have a network adapter? This can be a common cause too.
 

So the problem is solved by turning off Intel Turbo Boost in my BIOS, but now my CPU is stuck at 3.6 GHz. Do you know how to boost it back up to 4.7 GHz or so to regain my speed?
 

The problem was Intel Turbo Boost Technology in the BIOS, thanks for the help!
 
Seems a little odd without temperature problems. What's the highest CPU temp you reach under high load?
Turbo Boost itself shouldn't cause a WHEA error unless something else is wrong with the CPU, such as temps or an unstable OC.

I haven't OC'd my CPU at all so it shouldn't be that. I also max out at around 45 °C under heavy load (Prime95 for about 30 minutes), so temps aren't a problem at all. Maybe it's just turboing too high?