Question: BSODs when gaming?

Apr 11, 2022
10
0
10
Hi,

I'm frequently running into the "WHEA_UNCORRECTABLE_ERROR" and "IRQL_NOT_LESS_OR_EQUAL" BSODs when gaming. I need help determining what I should test next and how to do it.

Here is my machine info:
---------------------------------------------------------------------------------------------------
CPU: Intel Core i7-8700K
Motherboard: ASRock Z370 Pro4
RAM: Corsair Vengeance 2x16GB
SSD/HDD: Western Digital M.2 (WDS250G1B0B-00AS40)
         Samsung 970 Evo Plus M.2
         Seagate Barracuda SATA (ST1000DM003-1CH162)
         Crucial BX300 SATA (CT480BX300SSD1)
GPU: Zotac GeForce RTX 2080 Ti
PSU: Seasonic Focus80 Gold 650W, ~4 years of use
Chassis: Fractal Meshify Mid-Tower
OS: Windows 10 (10.0.19043 Build 19043)
---------------------------------------------------------------------------------------------------


Up to this point I have attempted the following things:

  • Clean install of latest GPU drivers (no effect)
  • Updated BIOS to latest version (no effect)
  • Updated Intel Drivers (no effect)
  • Checked disk with CrystalDiskInfo (no problems found)
  • Ran memtest (no problems found)
  • 4/13/22: Ran Prime95 ("smallest FFT" test fails, "Large FFT" test BSODs within 1-2 minutes)
  • 4/14/22: Ran Intel Processor Diagnostic Tool (no problems found)
  • 4/15/22: Ran Prime95 with increased DRAM Voltage ("smallest FFT" test BSODs)
---------------------------------------------------------------------------------------------------

Any help is appreciated, thanks!
 

Colif

Win 11 Master
Moderator
Can you follow option one on the following link - here - and then do the step below: Small memory dumps - Have Windows Create a Small Memory Dump (Minidump) on BSOD. That creates a file in C:\Windows\Minidump after the next BSOD.

  1. Open Windows File Explorer
  2. Navigate to C:\Windows\Minidump
  3. Copy the mini-dump files out onto your Desktop
  4. Do not use WinZip; use the built-in facility in Windows
  5. Select those files on your Desktop, right click them and choose 'Send to' - Compressed (zipped) folder
  6. Upload the zip file to the Cloud (OneDrive, DropBox . . . etc.)
  7. Then post a link here to the zip file, so we can take a look for you . . .
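
For anyone who would rather script steps 2 to 5, here is a minimal Python sketch. The function name `zip_minidumps` and the output file name are made up for illustration, and it assumes the dumps end in `.dmp`. Reading C:\Windows\Minidump usually needs an elevated (administrator) prompt.

```python
import zipfile
from pathlib import Path


def zip_minidumps(minidump_dir: Path, out_zip: Path) -> list[str]:
    """Pack every .dmp file found in minidump_dir into out_zip.

    Returns the file names that were added, so the caller can confirm
    that at least one dump was picked up.
    """
    dumps = sorted(minidump_dir.glob("*.dmp"))
    with zipfile.ZipFile(out_zip, "w", compression=zipfile.ZIP_DEFLATED) as zf:
        for dump in dumps:
            # Store only the file name, not the full C:\Windows\... path.
            zf.write(dump, arcname=dump.name)
    return [dump.name for dump in dumps]
```

Calling `zip_minidumps(Path(r"C:\Windows\Minidump"), Path.home() / "Desktop" / "minidumps.zip")` should leave a zip on the Desktop that is ready to upload. If the Desktop has been redirected (for example by OneDrive), that path will differ.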
 

gardenman

Splendid
Moderator
Hi, I ran the dump file through the debugger and got the following information: https://jsfiddle.net/kcL1we5b/show This link is for anyone wanting to help. You do not have to view it. It is safe to "run the fiddle" as the page asks.

File information: 041222-7500-01.dmp (Apr 12 2022 - 09:08:47)
Bugcheck: WHEA_UNCORRECTABLE_ERROR (124)
Probably caused by: GenuineIntel (Process running at time of crash: System)
Uptime: 0 Day(s), 9 Hour(s), 24 Min(s), and 00 Sec(s)

BIOS info was not included which can sometimes mean an outdated BIOS is being used.

This information can be used by others to help you. Someone else will post with more information. Please wait for additional answers. Good luck.
 
Apr 11, 2022
10
0
10
Extra data point: I tried running Prime95 and got WHEA_UNCORRECTABLE_ERROR BSOD within 1-2 minutes. I ran it a second time and got the same result within the same timeframe.
 

Colif

Win 11 Master
Moderator
If it weren't for the fact that I approved the edit, I wouldn't have seen that you passed the Intel test as well. Perhaps in future just reply with the answer :)

I wonder why it fails Prime. Can you look in the folder Prime is installed in? There might be a text document called stress. Curious if it has any info.
 
Apr 11, 2022
10
0
10
Hey sorry! I was actually formulating this reply after making that edit, I wasn't expecting you to respond so quickly! :LOL:

So I started trying to run specific prime subtests rather than the blend.


------------------------------------------------------------------------
First, I ran the "Smallest FFT's" test (high power/heat/CPU Stress)

[Thu Apr 14 12:12:34 2022]
FATAL ERROR: Rounding was 0.5, expected less than 0.4
Hardware failure detected running 4K FFT size, consult stress.txt file.
FATAL ERROR: Rounding was 0.5, expected less than 0.4
Hardware failure detected running 4K FFT size, consult stress.txt file.
FATAL ERROR: Rounding was 5.361886735e+13, expected less than 0.4
Hardware failure detected running 4K FFT size, consult stress.txt file.
FATAL ERROR: Rounding was 0.5, expected less than 0.4
Hardware failure detected running 4K FFT size, consult stress.txt file.
FATAL ERROR: Rounding was 0.470703125, expected less than 0.4
Hardware failure detected running 4K FFT size, consult stress.txt file.
[Thu Apr 14 12:17:21 2022]
FATAL ERROR: Rounding was 0.5, expected less than 0.4
Hardware failure detected running 384K FFT size, consult stress.txt file.
[Thu Apr 14 12:21:16 2022]
FATAL ERROR: Resulting sum was 0, expected: 239059607843050.6
Hardware failure detected running 4K FFT size, consult stress.txt file.
FATAL ERROR: Resulting sum was 22919765477622.11, expected: 22919764598032.41
Hardware failure detected running 4K FFT size, consult stress.txt file.
FATAL ERROR: Rounding was 0.5, expected less than 0.4
Hardware failure detected running 4K FFT size, consult stress.txt file.
FATAL ERROR: Rounding was 0.5, expected less than 0.4
Hardware failure detected running 4K FFT size, consult stress.txt file.
FATAL ERROR: Rounding was 0.5, expected less than 0.4
Hardware failure detected running 4K FFT size, consult stress.txt file.
FATAL ERROR: Rounding was 0.5, expected less than 0.4
Hardware failure detected running 4K FFT size, consult stress.txt file.
------------------------------------------------------------------------
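
A log like the one above is easier to compare between runs when it is tallied. Below is a small Python sketch that counts failures per FFT size and per error type. It assumes the log text uses exactly the two message formats shown in this post, and the function name `summarize_prime95_errors` is made up for illustration.

```python
import re
from collections import Counter

# Patterns taken from the messages quoted in this thread.
FFT_RE = re.compile(r"Hardware failure detected running (\d+K) FFT size")
KIND_RE = re.compile(r"FATAL ERROR: (Rounding|Resulting sum) was")


def summarize_prime95_errors(log_text: str) -> tuple[Counter, Counter]:
    """Return (failures per FFT size, failures per error type)."""
    by_fft = Counter(FFT_RE.findall(log_text))
    by_kind = Counter(KIND_RE.findall(log_text))
    return by_fft, by_kind
```

Run over the log pasted above, this gives 11 failures at 4K and 1 at 384K, made up of 10 rounding errors and 2 sum mismatches.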

Then I ran the "Large FFT's" Test (stresses memory controller and RAM)

And within a couple minutes I got the BSOD again.

Now, this test also seems to put the CPU under high usage, so I don't know if it actually points to a RAM problem...plus memtest passed before
------------------------------------------------------------------------
 

Colif

Win 11 Master
Moderator
Rounding errors can be fixed by giving the memory more voltage in the BIOS.
That is the likely cause of all your BSODs.

Just bump it up a little and then run Prime again. Once it can actually finish without a BSOD, you might be better off.

Prime is a stress test. It stresses the CPU and RAM, and it can also find things like this.
 
Apr 11, 2022
10
0
10
Hmm, I'm having some trouble figuring out how to change the DRAM voltage; maybe someone knows the trick...

I went to the BIOS and tried to adjust it, but it's set to "auto" and nothing I type in that field sticks.

[Screenshot: 220414192307.png]


I thought maybe some XMP profile setting was preventing me from making changes, but I didn't see any clear way to disable it (the XMP options are either "Auto" or "XMP 2.0 Profile 1").

[Screenshot: 220414192331.png]
 
Apr 11, 2022
10
0
10
That was it, thank you!

I went ahead and tried running Prime95 after raising the DRAM voltage to 1.25 V, 1.3 V, and 1.35 V (the highest it would allow me to set).

In each case, I ran the "smallest FFT" test. I got the WHEA_UNCORRECTABLE_ERROR BSOD very quickly each time.

I went back to the original 1.2 V and the "smallest FFT" test doesn't BSOD, though of course I still have the rounding errors.
 

Colif

Win 11 Master
Moderator
hmm.

Now I don't know if the cause is the CPU or the RAM. They both pass individual tests but not Prime. I have seen RAM pass tests and still be the cause, in that replacing the sticks stopped the problem from happening. I've also seen it be the CPU, so it's not easy to say which.

Do you have XMP on?

PSU: Seasonic Focus80 Gold 650W, ~4 years of use
10-year warranty? I know the Focus Plus units have 10 years. I don't really suspect it, but power is another element of Prime tests.
 
Apr 11, 2022
10
0
10
Based on what I saw in the BIOS (see the screenshot from earlier), I think XMP is on. It's set to "auto", but above that it shows the profile setting (16-18-18-35 1.2v).

Are there any other measures I can take to try to narrow the problem down to the CPU, RAM, or power supply? Or am I now in replace-and-see mode?
 
Apr 11, 2022
10
0
10
Were you overclocked? Our tests can't seem to see that, so if I had known you overclocked, it would have been the first suggestion... to remove it.

Yes, to 4.9 GHz. TBH I had completely forgotten to what extent I had overclocked it, and it had been working fine until recently. I downclocked to 4.7 GHz and now it's stable, with a very occasional framerate stutter while playing games (maybe those would have been BSODs before?)

I moved to a desert environment that is ~20-30°F warmer than where I lived before, and I suspect the higher ambient temperature contributed to the failures. I thought my cooling would handle it OK, but evidently not.
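
For a sense of scale, here is a quick Python sketch that converts that ambient difference to Celsius, the unit most CPU monitoring tools report. The helper name is made up for illustration. Note that a temperature *difference* scales by 5/9 only, without the 32-degree offset used for absolute readings.

```python
def fahrenheit_delta_to_celsius(delta_f: float) -> float:
    """Convert a temperature difference from °F to °C (scale only, no offset)."""
    return delta_f * 5.0 / 9.0


for delta_f in (20, 30):
    delta_c = fahrenheit_delta_to_celsius(delta_f)
    print(f"{delta_f} °F warmer is about {delta_c:.1f} °C warmer")
```

That works out to roughly 11 to 17 °C. As a rule of thumb (not something measured in this thread), core temperatures under the same load and cooling rise by about the same amount as the room does, which may be enough to push a borderline overclock over the edge.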