[SOLVED] WHEA UNCORRECTABLE ERROR - Intermittent BSOD on high end new build


Colif

Win 10 Master
Moderator
tbh I was only using it to change the RGB lighting according to CPU temperature (e.g. green, yellow, red)
i can do that in Icue on my AIO. it lets you save it onto hardware profile of AIO so it doesn't need the drivers to run it. I actually have it saved on mine, but as soon as I load they all turn whatever color I am using that week.

I would love to find 1 program that lets you change colours of all of them. I found this earlier as someone asked that very question but I don't know if it does colours, and on looking it doesn't mention rgb fusion so maybe not- https://rgbsync.com/

link to a similar problem this month with cam << looks like its cam
 
Reactions: jcsammut

Colif

Win 10 Master
Moderator
does nzxt cam have game profiles? I read it can change the color of lighting if you play games, maybe it has one for Hitman and causes the bsod when it tries to change lighting controlled by something else? Just thinking out loud

since its latest feature is Profile sync - Automatically change the behavior of your lights, fans and more based on how you're using your computer. Your lights can dim at night, or your cooling devices can ramp up when you launch a demanding game.
https://www.nzxt.com/camapp
might have teething issues.
 
Reactions: jcsammut

madmatt30

Titan
Ambassador
Do you have game hooking enabled in cam?

If so disable it if you don't need the in game overlay monitor, prefer rivatuner myself anyway.

Cam has had numerous problems with odd games over the years, ranging from bsod to disabling of keyboard/mouse/controllers.
 
Reactions: jcsammut

jcsammut

Honorable
Mar 17, 2014
46
4
10,545
2
Guys, thanks for all your responses, much appreciated - apologies for the delay in responding back here.

I have reviewed the post regarding NZXT Cam and it is pretty damning! The issue is that the lightstrip controller in my NZXT case appears to be proprietary and only works with their software, but I am going to take a look at OpenRGB now.

I did not use the game profiling functionality, but perhaps an update to the latest version to support that feature is to blame?

I am pretty sure I DID have game hook enabled, I am currently not using Cam (to rule out any other cause) so cannot check, but I have done some extensive testing over the weekend without it:-

3dMark Benchmarks, Prime95, Furmark, Cinebench, Shadow of the Tomb Raider, MW, MW Cold War - all with ultra settings and not had a single issue. I decided to test other stuff in case it was just a coincidence that it was Hitman3, because that is all I have been playing up until now.

So tonight I am going back to Hitman 3, will put in a few more hours with it and if I have no issues, will probably put NZXT Cam down as the cause.

Will feedback and update tomorrow - thanks once again for all your help, really great community here!!!
 

Colif

Win 10 Master
Moderator
I sort of guessed that if PC isn't crashing (finally) that you were "testing" it playing all the games you couldn't play before :D

Silence is fine if it means it's working and you're not constantly crashing. That is basically how all BSOD questions end. Hopefully. Some are not so simple.
 
Reactions: jcsammut

jcsammut

Honorable
@Colif - thanks for all your help, I am ready to declare this issue closed. I have marked your thread where you link to other NZXT Cam users with similar issues as the best answer as this demonstrated it was not an isolated case, and prompted me to try to recreate without this software loaded.

I still find it unusual that software can cause that type of hardware message, but I have tested it for several hours more and cannot recreate the issue when NZXT Cam is not loaded.

Also since removing it (and Gigabyte App centre) my event log has cleared up massively:-

Event TypeLast hourLast 24 hoursLast 7 days
Critical003
Error012,538
Warning1151,867

I wanted to thank you for your support, patience and persistence!

Also want to thank @madmatt30 and @gardenman for their help and input too!

Will update the thread if anything else comes to light that may help other users!
 

jcsammut

Honorable
@Colif Sorry to be a pain but it just happened again 😢.

This time NZXT Cam was not running (though I had briefly loaded it to turn off Autostart at Windows before rebooting). Should I just uninstall it altogether?

Happened again during Hitman 3, probably less than an hour into playing it. Could Hitman 3 somehow be causing this, with the hardware error a red herring? After all, it puts less load on the system (and lower temps) than my stress tests, and those all run fine.

Could it be Thermal paste related (clutching at straws here I know 😄)?

There were a few Events logged around the time of the crash so I have listed them below, one odd one is disk related (disk 5 surprise removed).

Latest minidump can be found here

Crash Dump Analysis

Crash dumps are enabled on your computer.

Crash dump directories:
C:\Windows
C:\Windows\Minidump

On Tue 02/03/2021 12:21:27 your computer crashed or a problem was reported
crash dump file: C:\Windows\Minidump\030221-14125-01.dmp
This was probably caused by the following module: ntoskrnl.exe (nt+0x3F5A80)
Bugcheck code: 0x124 (0x0, 0xFFFFD48CEBB8E028, 0xFC800800, 0x60C0859)
Error: WHEA_UNCORRECTABLE_ERROR
file path: C:\Windows\system32\ntoskrnl.exe
product: Microsoft® Windows® Operating System
company: Microsoft Corporation
description: NT Kernel & System
Bug check description: This bug check indicates that a fatal hardware error has occurred. This bug check uses the error data that is provided by the Windows Hardware Error Architecture (WHEA).
This is likely to be caused by a hardware problem.
The crash took place in the Windows kernel. Possibly this problem is caused by another driver that cannot be identified at this time.

On Tue 02/03/2021 12:21:27 your computer crashed or a problem was reported
crash dump file: C:\Windows\MEMORY.DMP
This was probably caused by the following module: pshed.dll (PSHED!PshedBugCheckSystem+0x10)
Bugcheck code: 0x124 (0x0, 0xFFFFD48CEBB8E028, 0xFC800800, 0x60C0859)
Error: WHEA_UNCORRECTABLE_ERROR
file path: C:\Windows\system32\pshed.dll
product: Microsoft® Windows® Operating System
company: Microsoft Corporation
description: Platform Specific Hardware Error Driver
Bug check description: This bug check indicates that a fatal hardware error has occurred. This bug check uses the error data that is provided by the Windows Hardware Error Architecture (WHEA).
This is likely to be caused by a hardware problem.
The crash took place in a Microsoft module. Your system configuration may be incorrect. Possibly this problem is caused by another driver on your system that cannot be identified at this time.
Event Type | Event | Time | Info
Critical | Kernel-Power | 12:21:55 | The system has rebooted without cleanly shutting down first.
Error | WHEA-Logger | 12:22:02 | A fatal hardware error has occurred. Reported by Processor Core; Machine Check Exception; Bus/Interconnect Error; APIC ID 6
Error | Bugcheck | 12:22:04 | The computer has rebooted from a bugcheck
Error | EventLog | 12:22:00 | The previous system shutdown at 12:18:26 was unexpected
Warning | e1rexpress | 12:21:57 | Network link is disconnected
Warning | CertificateServicesClient-AutoEnrollment | 12:22:12 | Certificate for local system with Thumbprint is about to expire or already expired
Warning | disk | 12:16:46 | Disk 5 has been surprise removed
Warning | Kernel-PnP | 12:21:56 | The driver \Driver\WudfRd failed to load for the device
Warning | User Device Registration | 12:22:26 | Windows Hello for Business provisioning will not be launched
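
For anyone wanting to read the 0x124 parameters rather than just the summary text, here is a rough Python sketch. It assumes the layout Microsoft documents for this bug check when parameter 1 is 0x0 (machine check exception): parameter 3 is the high 32 bits and parameter 4 the low 32 bits of the MCi_STATUS register. The flag names are the standard x86 machine-check status bits; the function name `decode_mci_status` is made up for illustration. It only restates what the CPU reported and does not identify a faulty part.

```python
# Decode the MCi_STATUS value carried in bug check 0x124 parameters 3 and 4.
# Assumption: parameter 1 == 0x0 (machine check exception), so parameter 3 is
# the high half and parameter 4 the low half of the 64-bit status register.

FLAG_BITS = {
    63: "VAL (status register is valid)",
    62: "OVER (an earlier error was overwritten)",
    61: "UC (uncorrected error)",
    60: "EN (error reporting enabled)",
    59: "MISCV (MISC register valid)",
    58: "ADDRV (ADDR register valid)",
    57: "PCC (processor context corrupt)",
}


def decode_mci_status(param3: int, param4: int) -> dict:
    """Hypothetical helper: combine the two halves and pick out known fields."""
    status = (param3 << 32) | param4
    flags = [name for bit, name in FLAG_BITS.items() if (status >> bit) & 1]
    mca_error_code = status & 0xFFFF  # low 16 bits
    # Bus/interconnect errors use the bit pattern 0000 1PPT RRRR IILL,
    # i.e. bits 15..12 clear and bit 11 set.
    bus_interconnect = (mca_error_code & 0xF800) == 0x0800
    return {
        "status": status,
        "flags": flags,
        "mca_error_code": mca_error_code,
        "bus_interconnect": bus_interconnect,
    }


result = decode_mci_status(0xFC800800, 0x60C0859)
print(f"MCi_STATUS = {result['status']:#018x}")
for flag in result["flags"]:
    print("  set:", flag)
print(f"MCA error code = {result['mca_error_code']:#06x}")
print("bus/interconnect pattern:", result["bus_interconnect"])
```

With the values from the dump, the error code falls in the bus/interconnect class, which agrees with the WHEA-Logger entry in the table above.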

Is it worth trying Windows Driver Verifier?

Thanks again!
 

gardenman

Admirable
Moderator
I ran the dump file through the debugger and got the following information: https://jsfiddle.net/vgx8yu6w/show This link is for anyone wanting to help. You do not have to view it. It is safe to "run the fiddle" as the page asks.

File information: 030221-14125-01.dmp (Mar 2 2021 - 07:21:27)
Bugcheck: WHEA_UNCORRECTABLE_ERROR (124)
Probably caused by: memory_corruption (Process: HITMAN3.exe)
Uptime: 0 Day(s), 0 Hour(s), 43 Min(s), and 15 Sec(s)

This information can be used by others to help you. Someone else will post with more information. Please wait for additional answers. Good luck.
 
Reactions: jcsammut

Colif

Win 10 Master
Moderator
things its not - Its not the AMP
Its not NZXT (maybe - depends how good its uninstaller is)
things I should know better - thinking BSOD are ever fixed :)

disk 5 has been surprised removed?

time to check parts you haven't already (unless I didn't read it all( ok, I didn't))
I have noted where you ran tests/games already

Operating System
Windows 10 Pro 64-bit
CPU
AMD K19 5900X with Noctua NH-D15S (with additional fan)
Prime 95 run

RAM
32.0GB Dual-Channel @ 1796MHz (16-19-19-39) - XMP profile
run memtest? if not
Try running memtest86 on each of your ram sticks, one stick at a time, up to 4 passes. Only error count you want is 0, any higher could be cause of the BSOD. Remove/replace ram sticks with errors. Memtest is created as a bootable USB so that you don’t need windows to run it

Motherboard
Gigabyte Technology Co. Ltd. X570 AORUS ULTRA (AM4) 36 °C
NO tests for motherboard

Graphics
SAMSUNG (3840x2160@60Hz)
4095MB NVIDIA GeForce RTX 3090 (MSI) 49 °C
3dMark Benchmarks, Furmark, Cinebench, Shadow of the Tomb Raider, MW, MW Cold War

Storage
1863GB Seagate ST2000DM001-1CH164 (SATA ) 33 °C
931GB Sabrent Rocket 4.0 1TB (PCIE4 NVME (SSD))
1863GB Sabrent Rocket 4.0 2TB (PCIE4 NVME (SSD))
Seagate - https://www.seagate.com/au/en/support/downloads/seatools/seatools-win-master/
Sabrent- https://www.sabrent.com/rocket-control-panel/

Other
Intel Wireless Gigabit VR Adapter
Wireless keyboard with trackpad
Arctis 7x Headphones
Xbox One Wireless Controller Adapter
NZXT 710i Case

I don't see a Disk 5?

Gardenman is early

What temps do you get in game? (I may have asked this already)

i hate whea errors, the text never helps.

did I mention this? might not be related but I see VR etc - https://www.techspot.com/news/88695-amd-acknowledges-usb-connectivity-issues-x570-b550-motherboards.html - I haven't had it myself
 

jcsammut

Honorable
Thanks @gardenman; @Colif

So I turned on Driver Verifier and Windows kinda went kaboom! :oops:

Every time Windows rebooted I got a Stop Code: Driver Verifier Detected Violation (luckily the system restore I had made prior to this saved me), so I checked WhoCrashed:-

The associated minidump is here

Crash Dump Analysis

Crash dumps are enabled on your computer.

Crash dump directories:
C:\Windows
C:\Windows\Minidump

On Tue 02/03/2021 13:26:28 your computer crashed or a problem was reported
crash dump file: C:\Windows\Minidump\030221-14265-01.dmp
This was probably caused by the following module: win32kfull.sys (0xFFFFF310FC8354AC)
Bugcheck code: 0xC4 (0xF6, 0x3FC, 0xFFFFDB8E80F9E240, 0xFFFFF310FC8354AC)
Error: DRIVER_VERIFIER_DETECTED_VIOLATION
file path: C:\Windows\system32\win32kfull.sys
product: Microsoft® Windows® Operating System
company: Microsoft Corporation
description: Full/Desktop Win32k Kernel Driver
Bug check description: This is the general bug check code for fatal errors found by Driver Verifier. A driver references a user-mode handle as kernel mode.
This appears to be a typical software driver bug and is not likely to be caused by a hardware problem.
The crash took place in a Microsoft module. Your system configuration may be incorrect. Possibly this problem is caused by another driver on your system that cannot be identified at this time.
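
The four 0xC4 parameters can be labelled the same way. This is a small Python sketch that assumes the meanings I understand Microsoft's bug check reference to give for subtype 0xF6 (a driver referencing a user-mode handle as kernel mode); the label wording and the helper name `label_c4_params` are illustrative, not official.

```python
# Label the parameters of bug check 0xC4 for subtype 0xF6.
# Assumption: the parameter meanings below follow Microsoft's bug check
# reference for 0xC4 / 0xF6; the wording is paraphrased.

C4_F6_LABELS = (
    "verifier violation subtype",
    "handle value being referenced",
    "address of the current process",
    "address inside the driver making the reference",
)


def label_c4_params(params):
    """Hypothetical helper: pair each parameter with a readable label."""
    if len(params) != 4:
        raise ValueError("a bug check has exactly four parameters")
    if params[0] != 0xF6:
        raise ValueError("this sketch only covers subtype 0xF6")
    return {label: hex(value) for label, value in zip(C4_F6_LABELS, params)}


report = label_c4_params((0xF6, 0x3FC, 0xFFFFDB8E80F9E240, 0xFFFFF310FC8354AC))
for label, value in report.items():
    print(f"{label}: {value}")
```

The fourth parameter is the same address WhoCrashed printed next to win32kfull.sys, so the module attribution is at least consistent. It does not say which third-party driver or program passed the bad handle.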

Does any of this help? Previous reports of this issue appeared to be GPU related, but I have removed and reinstalled the drivers with DDU.

Sorry for all the hassle guys - and thanks again for your help! (y)
 

gardenman

Admirable
Moderator
I ran the dump file through the debugger and got the following information: https://jsfiddle.net/esxcwarv/show This link is for anyone wanting to help. You do not have to view it. It is safe to "run the fiddle" as the page asks.

File information: 030221-14265-01.dmp (Mar 2 2021 - 08:26:28)
Bugcheck: DRIVER_VERIFIER_DETECTED_VIOLATION (C4)
Probably caused by: memory_corruption (Process: OriginWebHelperService.exe)
Uptime: 0 Day(s), 0 Hour(s), 00 Min(s), and 46 Sec(s)

This information can be used by others to help you. Someone else will post with more information. Please wait for additional answers. Good luck.
 

Colif

Win 10 Master
Moderator
i can't tell you exactly what win32kfull.sys does but it often shows up with GPU errors. It is part of windows kernel

win32kbase!DxgkEngAcquireWin32kAndPDEVLocks+33 (+0x74b)
Dxgk - Direct X Kernel interface
Graphics card. or drivers

But it's unusual for Driver Verifier not to say a driver name. I only worked that out by what was happening at the time. I may have read it wrong, but didn't gpu show up before?
Feb 23rd WHEA mentioned DirectX
it's never actually said it's the Nvidia drivers, I would have expected Driver Verifier to say a driver name.

nice to not see Hitman for a change. I bet it uses Origin though.

What temps do you get in game? (I may have asked this already)

trying to think of suggestions, I did mention a few tests above you may not have run.
 

jcsammut

Honorable
Hi @Colif

Thanks again for all the suggestions, and apologies for the delay in replying; I have been testing everything as advised above. Quick summary:-
  • I decided to try Driver Verifier as at that point I had nothing to lose (will not be repeating it in a hurry) 😀
  • Memtest (via bootable USB) & Windows memory diags - no errors after 4 passes each
  • Prime95 - depending on the test at the time, the CPU ranges from 60c to 79c (reached 84c on one specific test for about 20 seconds, and 90c for about 30 seconds on another - the CPU is rated for 90c before throttling). PC remains stable and has been running like this for a good hour or so, never remains above 80c for longer than 30 seconds, and the fan curve appears to be working nicely
  • AMD Ryzen Master Test - Default mode (obviously no o/c) passed no issues
  • Furmark - at 1080p - temps max out at 80c (GPU is rated for 91c before throttling), again running for a good hour and PC remains stable, again Fan curve working nicely
  • Have also run Prime95 and Furmark simultaneously to approximate the load from a high performance game
  • Interestingly Hitman 3 did push the 3090 to 82c (@4k ultra settings) until I adjusted the Fan curve and now it tops out at 75c, but the CPU doesn't tend to go above 70c, so nowhere near the levels of Prime95
  • Case: NZXT 710i with good airflow. Cooling: 3 intake front fans, 2 back exhaust fans, with the 2 CPU cooler fans exhausting through the back too
  • I have uninstalled NZXT Cam - as you say, I guess this depends on how good the uninstaller is!
  • Weird that you mention Origin as Hitman 3 is Epic. Did a search on OriginWebHelper and that has caused crashes in the past, so I have disabled it, as it is not needed anyway!
  • The Surprise Disk Removal relates to an XVD drive that is a temporary drive used for Xbox PlayAnywhere gaming, so don't think this is an issue
  • I have played another several hours in a fairly demanding Hitman 3 level without a crash, so going to keep thrashing this to see if I can get it to break again!
  • If it goes again I may just replace the thermal paste for the sake of it!
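
To put the temperature figures from the list side by side, here is a quick Python sketch that works out the headroom below the throttle point for each case. The peaks and limits are the ones quoted in this post (not looked up independently), and the labels and the `headroom` helper are just for illustration.

```python
# Thermal headroom = throttle limit minus observed peak, in degrees C.
# All figures are taken from the summary list in this post; the limits are
# the throttle points as stated there, which is an assumption of this sketch.

readings = [
    # (label, observed peak, stated throttle limit)
    ("CPU, Prime95 (worst short spike)", 90, 90),
    ("CPU, Hitman 3", 70, 90),
    ("GPU, Furmark 1080p", 80, 91),
    ("GPU, Hitman 3 4K (old fan curve)", 82, 91),
    ("GPU, Hitman 3 4K (new fan curve)", 75, 91),
]


def headroom(peak_c: int, limit_c: int) -> int:
    """Degrees C remaining before the stated throttle point."""
    return limit_c - peak_c


for label, peak, limit in readings:
    print(f"{label:34s} peak {peak:3d}c  headroom {headroom(peak, limit):3d}c")
```

On these numbers Hitman 3 leaves more CPU headroom than Prime95 did, which is why I doubt temperature alone explains the crash.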
Did I miss anything out? 😄

Thanks again as I think I would be losing it if it wasn't for your/Gardenman's help!
 

jcsammut

Honorable
Thanks @Colif, Lets leave it there for now, I agree I think we have exhausted everything! (y)

People will be able to see the entirety of the thread if they want the info, but I am now marking your latest post as Best Answer - reason being we have pretty much discounted everything else, and the one common denominator in all of this is Hitman 3.

Thanks again, I really do appreciate all the time, patience and effort you have put in to helping me!

I will report back if I find out anything of note.

All the best!
 
Reactions: Colif
