Question "WHEA Uncorrectable Error" with new and old systems?

Feb 6, 2022
10
0
10
Hello, I'm going crazy with this "WHEA uncorrectable error". My old system was somewhat OK, since I only got a BSOD sometimes, like 2-4 times a week.
But now this new system is giving it all the time. I moved only the GPU, SSD and HDD from the old system, and after that I did reinstall Windows 10 on the SSD.

System specs:
Thinkstation D30
2x Xeon E5-2660
4x 8GB HP 1333 MHz RAM Sticks
Radeon RX570 8GB Pulse
256GB SSD
1TB HDD
No overclocks

HWMonitor shows core voltages between 0.801V - 0.966V.

So far the problem shows up when I have multiple Firefox tabs open, and Sony Vegas Pro and Resolve will always give a BSOD sooner or later. In Vegas I can sometimes edit videos, and sometimes a BSOD hits. In Resolve it seems I can edit videos without a problem, but when I hit "Render" it will usually give me a BSOD.

Done so far:
  • Updated and checked all drivers.
  • Checked temps; all look reasonably good. CPU core temps go up during testing, but no BSOD.
  • Checked disks with CrystalDiskInfo.
  • Tested CPUs with Intel Processor Diagnostic Tool 64-bit, all passed.
  • Ran the 3DMark demo without a BSOD.
  • Tested RAM with MemTest: I simply opened it three times and gave it 3GB RAM. This was interesting: when I opened one more window and started testing with it, my CPU temps went up to 60-62C. No errors.
  • Tested system with Cinebench R23, no BSOD.
  • Checked the dump file with WinDbg and found the image name to blame is "GenuineIntel.sys".
Dump file: https://www.dropbox.com/s/fc0vwti3fg69oi1/MEMORY.DMP?dl=0
 

Lutfij

Titan
Moderator
i did reinstall windows 10 on SSD
Where did you source the installer for the OS? What version(not edition) of the OS are you on?

Speaking of your prebuilt, did you replace the PSU in the build? What PSU did the unit come with?
this new system
This isn't new, not with DDR3 ram, it isn't.

BIOS version on your prebuilt's motherboard at the time of writing?
 
Feb 6, 2022
10
0
10
- I downloaded the Windows Media Creation Tool from the Microsoft site. Right now I'm on Version 21H2, Build 19044.1503.
- The PSU is a Delta 1100W 80 Plus Gold, which came with the prebuilt system.
- The BIOS is "A1KT61A" from 31 Mar 2017, since it's the only one available for the type 4223 Thinkstation D30.

I know the system is not "new", but facing the same issue with the old PC and this "new" PC is driving me crazy.

I made a BIOS flash USB stick just in case and watched the Tech Yes City video about tuning Xeon CPUs.

So far, what I did in BIOS is:
  1. Advanced -> WHEA configuration -> WHEA Support -> Disabled.
  2. Advanced -> CPU configuration -> Intel Virtualization Technology -> Disabled.
  3. Advanced -> North Bridge configuration -> IOH configuration -> Intel VT-d and Intel I/OAT -> Disabled.
  4. Advanced -> Memory configuration -> Patrol Scrub and Demand Scrub -> Disabled.

After that I disabled Meltdown and Spectre protection.

Now it was time to test. After editing a 20-minute video in Resolve, I hit the "Render" button. It was going well until 22% was done, then Resolve Studio simply stopped responding. I was checking CPU temps with OCCT, and the CPU #2 package temp was hitting 78 Celsius. I closed Resolve Studio and opened my room window (since it's 1 Celsius outside). After opening Resolve Studio again and hitting "Render", it took time, but the render was successful. CPU package temps stayed around 60-63 Celsius.

It seems high CPU package temps are still a problem, but maybe the BSODs came from a bad BIOS config? I did get a BSOD with the WHEA Uncorrectable Error code after simply logging in to Windows and opening Firefox with 2 tabs.
 
Feb 6, 2022
10
0
10
Hmm, the problem is still here. I get a BSOD even when I have like 7 tabs open in Firefox and Twitch open on the second screen.

Also, I'm unable to render almost anything with Resolve, since 99% of the time it gives me a BSOD. I'm a little bit lost as to what the problem can be.
 

Colif

Win 11 Master
Moderator
Advanced -> WHEA configuration -> WHEA Support -> Disabled.
I wonder what this did. I would turn it back on; if anything, it's not going to be the cause, but it might at least help us find it.
Can you give us all the dumps you get? Most WHEA reports look the same, but one might show a clue.
WHEA - Windows Hardware Error Architecture
It's an error called by the CPU but not necessarily caused by it.
It can be any hardware.
Can sometimes be drivers.
Can be caused by overclocking.
Can be caused by overclocking software, so things like Intel Extreme Tuning Utility and even MSI Afterburner.
Can be caused by heat.

I moved only GPU, SSD and HDD from old system
If you had WHEA errors in the old system and still have them now, I would start with the parts you didn't replace.
What make/model are the drives?
A GPU might cause WHEA but it's not common. GPUs have other ways to show they are the problem.

Lutfij has 2 stock answers for BSOD: PSU or version of Windows. I find there are way more choices.

I will ask a friend to convert the dump file into a format I can read.
 

gardenman

Splendid
Moderator
Tested RAM with MemTest: I simply opened it three times and gave it 3GB RAM. This was interesting: when I opened one more window and started testing with it, my CPU temps went up to 60-62C. No errors.
What memtest? Memtest86 does not run within Windows and you do not give it a set amount of RAM; it tests all RAM installed (at the moment). It has its own OS and you boot to it from a flash drive or disc. Test with the real memtest86 if you haven't. Instructions can be found here. It will take hours.

I ran the dump files through the debugger and got the following information: https://jsfiddle.net/mt25vfco/show This link is for anyone wanting to help. You do not have to view it. It is safe to "run the fiddle" as the page asks.
File information: 022222-28359-01.dmp (Feb 21 2022 - 23:24:08)
Bugcheck: WHEA_UNCORRECTABLE_ERROR (124)
Probably caused by: memory_corruption (Process running at time of crash: firefox.exe)
Uptime: 2 Day(s), 10 Hour(s), 02 Min(s), and 14 Sec(s)

File information: 021922-35515-01.dmp (Feb 19 2022 - 12:11:58)
Bugcheck: WHEA_UNCORRECTABLE_ERROR (124)
Probably caused by: memory_corruption (Process running at time of crash: Resolve.exe)
Uptime: 0 Day(s), 22 Hour(s), 44 Min(s), and 47 Sec(s)
System: Lenovo Thinkstation D30 422346G
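
The two summaries above can be cross-checked with a little date arithmetic: crash time minus uptime gives the boot time of that session. This is only a sketch, assuming both timestamps are in the same time zone and that the uptime figures are accurate; the helper name is made up for illustration.

```python
from datetime import datetime, timedelta


def boot_time(crash: datetime, uptime: timedelta) -> datetime:
    """Boot time of the session that ended in the crash."""
    return crash - uptime


# Values copied from the two dump summaries.
crash_feb21 = datetime(2022, 2, 21, 23, 24, 8)
uptime_feb21 = timedelta(days=2, hours=10, minutes=2, seconds=14)
crash_feb19 = datetime(2022, 2, 19, 12, 11, 58)

booted = boot_time(crash_feb21, uptime_feb21)
gap = booted - crash_feb19
print("Session that crashed on Feb 21 booted at:", booted)
print("Time between the Feb 19 crash and that boot:", gap)
```

If those assumptions hold, the session that crashed on Feb 21 was started a little over an hour after the Feb 19 crash, so the machine then ran for more than two days before the next WHEA error.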

Possible Motherboard page: https://pcsupport.lenovo.com/us/en/...-workstations/thinkstation-d30/4223/downloads
You have the latest BIOS already installed, version 61A.

This information can be used by others to help you. Someone else will post with more information. Please wait for additional answers. Good luck.
 

Colif

Win 11 Master
Moderator
not every day I see a PC with 16 memory slots

although WHEA errors generally aren't drivers, the top one makes me wonder if it's not LAN drivers
Jun 11 2018 - e1i65x64.sys - Intel(R) Gigabit Adapter driver
(though these days I always seem to suspect LAN drivers)
I wonder if there are any newer ones
try running this and see if it offers anything - https://www.intel.com.au/content/www/au/en/support/intel-driver-support-assistant.html

2nd WHEA is more like normal, doesn't tell me a lot.
these might be a little old. Given you moved the drives over, maybe
Nov 20 2015 - iaStorS.sys - Intel SATA Storage Device Controller driver
can't say I see Intel SATA drivers often. Wonder if it's part of Intel Rapid Storage Technology
 

Satan-IR

Splendid
Ambassador
File information: 021922-35515-01.dmp (Feb 19 2022 - 12:11:58)
Bugcheck: WHEA_UNCORRECTABLE_ERROR (124)
Probably caused by: memory_corruption (Process running at time of crash: Resolve.exe)
Uptime: 0 Day(s), 22 Hour(s), 44 Min(s), and 47 Sec(s)
This is why I asked about the application. I ran the dump and came across Resolve.exe too. Don't know if it's the EXE for a game or another app?

That app conflicting with drivers and/or RAM errors might be the cause.
 
cpu reported a cache error
Error : ICACHEL0_IRD_ERR (Proc 29 Bank 0)
Severity : Fatal

29: kd> !sysinfo cpuinfo
[CPU Information]
~MHz = REG_DWORD 2195
Component Information = REG_BINARY 0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
Configuration Data = REG_FULL_RESOURCE_DESCRIPTOR ff,ff,ff,ff,ff,ff,ff,ff,0,0,0,0,0,0,0,0
Identifier = REG_SZ Intel64 Family 6 Model 45 Stepping 7
ProcessorNameString = REG_SZ Intel(R) Xeon(R) CPU E5-2660 0 @ 2.20GHz
Update Status = REG_DWORD 0
VendorIdentifier = REG_SZ GenuineIntel
MSR8B = REG_QWORD 71400000000


cpu released in 2012 and discontinued in 2015
motherboard
Vendor LENOVO
BIOS Version A1KT61AUS
BIOS Starting Address Segment f000
BIOS Release Date 03/27/2017
Manufacturer LENOVO
Product Name 422346G
Version Lenovo Product
Serial Number 739
Processor Version Intel(R) Xeon(R) CPU E5-2660 0 @ 2.20GHz
Processor Voltage 88h - 0.8V
External Clock 100MHz
Max Speed 4000MHz
Current Speed 2200MHz

(looks like 2 CPU packages)


I do not see any overclock drivers that would cause this.
I would be looking at the power supply voltage levels to see that they are ok
I would look at cooling to make sure all the fans are running. monitor temps of the CPU, blow out dust
System Uptime: 2 days 10:02:14.667
looks like most of the devices were waiting for a wake signal when the cpu got the fatal cache error.

looks like only 2 cpus were active at the time of the crash. one was doing network traffic, the other was windows cleaning up some virtual memory and telling the cpu to flush the TLB. That failed and a bugcheck was called.
this would be required since the CPU can rely on the info.

only thing I can think of would be power to the CPU. maybe a capacitor that went bad? or the power supply not providing proper power.

What's difference between CPU Cache and TLB? - GeeksforGeeks

here are the calls. (read from the bottom up)
18 ffffdb003a800180 ffffb28216896080 ( 5) ffffdb003a80b840 ................

# Child-SP RetAddr Call Site
nt!HalpMceBarrierWait+0xaf
nt!HalpMceHandlerWithRendezvous+0x130
nt!HalpHandleMachineCheck+0x5f
nt!HalHandleMcheck+0x35
nt!KiHandleMcheck+0x9
nt!KxMcheckAbort+0x7a
nt!KiMcheckAbort+0x277
nt!KiFlushRangeWorker+0x124
nt!KeFlushMultipleRangeTb+0x135
nt!MiFlushTbList+0x88
nt!MiTerminateWsleCluster+0x363
nt!MiDecommitPages+0x1200
nt!MiDecommitRegion+0x7d
nt!MmFreeVirtualMemory+0x6d3
nt!NtFreeVirtualMemory+0x95
nt!KiSystemServiceCopyEnd+0x25
0x00007ffd`78c2d134
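
Since the trace reads from the bottom up, a small sketch like the one below can flip it into call order and split each frame into module, symbol and offset. The regular expression and the function name are my own illustration, not part of any debugger tooling, and only a few of the frames above are reproduced.

```python
import re

# Matches frames such as "nt!KiFlushRangeWorker+0x124".
FRAME = re.compile(r"^(?P<module>\w+)!(?P<symbol>[\w:]+)\+0x(?P<offset>[0-9a-fA-F]+)$")


def call_order(stack_lines):
    """Parse debugger frames and return them oldest call first."""
    frames = []
    for line in stack_lines:
        match = FRAME.match(line.strip())
        if match:  # raw addresses without a symbol are skipped
            frames.append(
                (match.group("module"), match.group("symbol"), int(match.group("offset"), 16))
            )
    return frames[::-1]


stack = [
    "nt!KiMcheckAbort+0x277",
    "nt!KiFlushRangeWorker+0x124",
    "nt!KeFlushMultipleRangeTb+0x135",
    "nt!MiFlushTbList+0x88",
    "nt!NtFreeVirtualMemory+0x95",
    "0x00007ffd`78c2d134",
]

for module, symbol, offset in call_order(stack):
    print(f"{module}!{symbol} (+{offset:#x})")
```

Read in that order, the excerpt goes from freeing virtual memory, to the TLB flush, to the machine check abort.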

you might be able to flush the Windows working sets in an effort to reduce this error. you can do this with a microsoft utility
from here: Sysinternals Utilities - Windows Sysinternals | Microsoft Docs
I will have to look to see what tool does this.

the utility should be RAMMap; look under the menu item Empty
(you can just run them all)
 
you might also be able to disable just the failing cpu core (I have never done this), if it is the same one that fails each time
core 29 was running
win32kfull!PriorityBoost::UpdateProcessPriorityForSpinning+0x64
at the time of the bugcheck.

----------
guess disabling a core is called core parking
CPU Core Parking in Windows 10 | Best Methods to Enable/Disable (itechviral.com)
or you can set program affinity but it would be for each program (not practical in this case)
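
For the affinity idea, the mask is just a bit field with one bit per logical processor. A minimal sketch, assuming 32 logical processors (2x E5-2660 with Hyper-Threading enabled) and that processor 29 from the dump is the one to avoid; the function name is hypothetical.

```python
def affinity_mask(logical_cpus: int, excluded: set) -> int:
    """Bit mask with every logical CPU enabled except the excluded ones."""
    if not 0 < logical_cpus <= 64:
        raise ValueError("this sketch only covers a single group of up to 64 CPUs")
    mask = (1 << logical_cpus) - 1  # all CPUs enabled
    for cpu in excluded:
        if not 0 <= cpu < logical_cpus:
            raise ValueError(f"CPU {cpu} is out of range")
        mask &= ~(1 << cpu)  # clear the bit for the excluded CPU
    return mask


mask = affinity_mask(32, {29})
print(f"Hex affinity mask without CPU 29: {mask:X}")
```

As far as I know, a hex value like this is what `start /affinity` expects on the Windows command line, but as said above it would have to be applied per program, and it does nothing for kernel work that still lands on that core.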
 
Feb 6, 2022
10
0
10
That Max Speed 4000MHz looks weird; the boost of the E5-2660 should be max 3000MHz.

Resolve.exe is the Resolve studio, where I edit videos and try to render those, but 99% of the time I get a BSOD. With the last setup, that barely ever happened.
 
most of the time max speed is not being used and is only a default.
current speed is more interesting to look at for overclocking and underclocking (due to heat throttling of the cpu). strange current speeds can mean bad versions of overclocking software are installed (versions that do not know the proper voltages for newer cpus).
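
As a rough illustration of that check, the sketch below compares a measured clock against the nominal one. The 5% tolerance is an arbitrary assumption of mine, not a documented threshold, and the function names are made up.

```python
def clock_deviation(measured_mhz: float, nominal_mhz: float) -> float:
    """Relative deviation of the measured clock from the nominal clock."""
    return (measured_mhz - nominal_mhz) / nominal_mhz


def looks_off(measured_mhz: float, nominal_mhz: float, tolerance: float = 0.05) -> bool:
    """True if the clock differs from nominal by more than the (assumed) tolerance."""
    return abs(clock_deviation(measured_mhz, nominal_mhz)) > tolerance


# ~MHz from the dump (2195) against the 2200 MHz nominal speed of the E5-2660.
print(f"{clock_deviation(2195, 2200):+.2%}", looks_off(2195, 2200))
```

With the values from this dump the deviation is well under one percent, which fits a CPU that is neither overclocked nor heavily throttled at the moment of the crash.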
 

Satan-IR

Splendid
Ambassador
If that 99% is while you're using the application, well, there's your answer as to what causes them, or triggers them.

You said the old system also had BSODs. If they happened while using this application (assuming you do a lot of video work, from "the Resolve studio, where I edit videos and try to render those"), that's the common condition linking the BSODs in the previous and current setups.

Maybe others would have some input as to what (driver) is conflicting and not playing nicely with Resolve.exe so that the system goes all bananas.
 

Fect123

Reputable
Feb 17, 2021
39
2
4,545
have you tried having a look at Event viewer? if you look under Windows Logs > System, i'm fairly sure that it'll be helpful too.

Try to look for the exact time of the last whea error BSOD. it should be listed as 'Critical'. If you look at the logs before & after it, windows will tell you exactly what's up.
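
If you export the System log and want to narrow it down around the crash, something like this sketch does the filtering. The event rows here are invented placeholders (not real log entries from this machine), the tuple layout is an assumption about how you export the log, and the 5-minute window is arbitrary.

```python
from datetime import datetime, timedelta


def events_near(events, crash_time, window=timedelta(minutes=5)):
    """Return events within +/- window of the crash, oldest first."""
    nearby = [event for event in events if abs(event[0] - crash_time) <= window]
    return sorted(nearby, key=lambda event: event[0])


# Placeholder rows: (timestamp, level, source, message). All values are made up.
sample_events = [
    (datetime(2022, 2, 21, 22, 0, 0), "Information", "ExampleSourceA", "placeholder"),
    (datetime(2022, 2, 21, 23, 24, 8), "Critical", "ExampleSourceB", "placeholder"),
    (datetime(2022, 2, 21, 23, 20, 30), "Error", "ExampleSourceC", "placeholder"),
]

crash = datetime(2022, 2, 21, 23, 24, 8)  # crash time taken from the dump summary
for timestamp, level, source, message in events_near(sample_events, crash):
    print(timestamp, level, source, message)
```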
 
Feb 6, 2022
10
0
10
@Satan-IR Yes, I did have the problem with Resolve and just with rendering videos. But I also get BSODs with Vegas Pro 16; just editing gives me a BSOD. Firefox alone also gives me a BSOD when I have, for example, 7 tabs open and 1 tab with a YouTube video open on the other screen. Like last time, I opened a Facebook video in the 7th tab on the main monitor while a YouTube video was playing on the second monitor, and I got a BSOD.

From the last BSOD and Event Viewer, I can see these at the same time:
-Error setting properties in {8444a4fb-d8d3-4f38-84f8-89960a1ef12f}. Error: 0xC0000001 (Kernel-EventTracing)
-4X Metadata preparation failed: container {A9EC563A-C8A0-5752-93CC-9E25F4E665BF}, result = 0x80004005 (DeviceSetupManager)

https://mega.nz/file/vZYGCRbL#QiG-FWt51nnPkDv4CaK8basjHPxmHmUrpX8_n--4L4Q
 

Satan-IR

Splendid
Ambassador
As asked above, did you check the RAM modules with bootable memtest, one stick at a time?
What memtest? Memtest86 does not run within Windows and you do not give it a set amount of RAM; it tests all RAM installed (at the moment). It has its own OS and you boot to it from a flash drive or disc.
When it seems different software may cause or trigger the BSODs (Resolve, Vegas Pro, Firefox, Twitch etc.) and it happened across two machines (with some different components), it's likely to be a hardware issue too.

As said above, something that was and is present in both machines, then and now.
 
Feb 6, 2022
10
0
10
Still the same problem. Now I had time to run memtest. It took all night but no errors. Also, the new RAM sticks are here (I'm moving to 1600MHz), but I wanted to make sure the old sticks are good before selling them.

Memtest image: https://imgur.com/rlVuN9P


It's super weird: I can play games without issues, but if I have Firefox open on both monitors, YouTube playing on one and like 9 tabs open on the main monitor, I usually get a BSOD.

I think it can be the CPU, something to do with boost clocks, or a GPU problem. I have two new 2670 CPUs that I'm going to install, and I have a spare GTX 1060 6GB that I'm going to use for more testing.

Latest dumps:

https://mega.nz/file/uZg1xYxR#V3g7eOE6rZFQf4Rgym4zGfJgPFG67r_Lo5v1HaWym_k
https://mega.nz/file/fcYyRbII#wsW6VdNra7gth06eBpIMI7uH-dCA2wwtaKIIeUDAQ4A
 
Feb 6, 2022
10
0
10
it doesn't help one is upside down, top looked different until i realized that

it looks a little darker but could be the angle.

Yeah, the top one is bright in the middle and darker outside. I did notice that, and no matter what angle I look at it from, it's the same. So I think that is a faulty CPU. Also, the thermal paste had areas of some sort of clear oil or moisture on the chips; it was super weird. But overall I think both CPUs are going in the trash bin.
 
Feb 6, 2022
10
0
10
Yeah, it's not the result of camera angle and lighting adjustments. A couple of contacts on the CPU had some oily stuff on them. It seems there are a couple of possible reasons: 1) the thermal paste was bad, or 2) the thermal paste was a cheap one, and when the CPU hit 80 Celsius multiple times it started turning to oil.