[SOLVED] Ryzen 5 5600x - Random Reboot on Idle, WHEA-Logger Event 18

Jul 22, 2021
1
0
10
I've had occasional unexpected reboots over the past week with my 5600X, and after each one Event Viewer logs a new WHEA-Logger Event 18. All settings are default (no overclocking, no undervolt) and the system runs fine under load; the restarts have only happened at idle. CPU temperatures seem fine, idling at around 45-50C and maxing out at around 85C under load. The restarts aren't so common that it's a pain, maybe 4 in the past week, but I'm worried this might be a sign of worsening conditions. I've only had this build for a little over a month, and until now everything has been running fine. The only recent change that comes to mind is an Nvidia driver update that, for whatever reason, could be corrupting memory. Even more confusing, the system fails in such a way that Windows can't generate crash dumps and ignores my setting not to restart immediately after a crash.

Prior Troubleshooting:
  1. I've tried reseating my CPU and cooler, this hasn't changed anything at all (temps or reboot-wise).
  2. I've reverted my Nvidia drivers since I've read that some people have had other glaring issues (although they've been under load) with the most recent driver update.
  3. I've updated my BIOS to the most recent version, still had a reboot after that.

Specs:
CPU: AMD Ryzen 5 5600x
GPU: EVGA FTW3 3080Ti
Mobo: Asus ROG STRIX B550-F w/ WiFi
BIOS Version: 2403
Memory: Corsair Vengeance 16GB (2 x 8GB) DDR4
PSU: EVGA SuperNova 850W G2

Event Log:
A fatal hardware error has occurred.

Reported by component: Processor Core
Error Source: Machine Check Exception
Error Type: Cache Hierarchy Error
Processor APIC ID: 3

Source: WHEA-Logger
Event ID: 18
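
For anyone comparing several of these events, the "Key: value" lines above are easy to pull apart in a script. This is only a minimal sketch: the function name parse_whea_event is made up for illustration, and it assumes the event text has been copied out of Event Viewer in the same layout as shown in this post.

```python
# Sample text copied from the event shown in this post.
SAMPLE_EVENT = """\
A fatal hardware error has occurred.

Reported by component: Processor Core
Error Source: Machine Check Exception
Error Type: Cache Hierarchy Error
Processor APIC ID: 3
"""


def parse_whea_event(text):
    """Return the 'Key: value' lines of a pasted WHEA-Logger event as a dict.

    Hypothetical helper: lines without a colon, or with nothing after
    the colon, are skipped.
    """
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep and value.strip():
            fields[key.strip()] = value.strip()
    return fields


event = parse_whea_event(SAMPLE_EVENT)
print(event["Error Type"])
print(int(event["Processor APIC ID"]))
```

Collecting the "Processor APIC ID" value from each event this way makes it easier to see whether the errors keep landing on the same ID.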

Let me know if there's any more information I can provide to figure this out. Thanks in advance for the help.
 
Solution
You could try booting from a USB flash drive set up with MemTest86 and letting it loop for several hours (any errors at all are cause for concern).

Or lower your RAM speed a notch, or test with simple desktop use with only one RAM stick installed at a time. If the reboots persist with each module on its own, RAM would seem to be unrelated.

Do you have a restore point back to a few days before the recent Nvidia driver update?
 
...

Prior Troubleshooting:
  1. I've tried reseating my CPU and cooler, this hasn't changed anything at all (temps or reboot-wise).
  2. I've reverted my Nvidia drivers since I've read that some people have had other glaring issues (although they've been under load) with the most recent driver update.
  3. I've updated my BIOS to the most recent version, still had a reboot after that.
...
Reset CMOS again, even if you already did so after the BIOS update, and do it both by pulling the battery and by shorting the pins. Run with everything at full defaults for a while to see if the problems recur.

Also, from a Command Prompt with admin rights, run sfc /scannow.
 

A Gamer

Distinguished
Mar 29, 2016
68
2
18,545
Just info: pinpointing the crashing core is easy and fast. Open Event Viewer, search for the WHEA error, and look at the APIC number. For example, an APIC 5 error means core 2 is failing on a 5900X (remember the BIOS numbers the cores from 0 to 11), APIC 10 means core 5, etc.
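
The rule of thumb in the examples above (APIC 5 = core 2, APIC 10 = core 5) works out to an integer division by two. Here is a rough sketch of that arithmetic. The function name apic_id_to_core is hypothetical, and it assumes SMT is enabled (two threads per core) and that APIC IDs are numbered contiguously from 0, which is what the examples imply; the numbering may differ on some CPUs, so treat the result as a hint rather than a diagnosis.

```python
def apic_id_to_core(apic_id, threads_per_core=2):
    """Map an APIC ID from a WHEA event to a zero-based core index.

    Assumption (not verified for every CPU): APIC IDs are contiguous
    and each core owns `threads_per_core` consecutive IDs.
    """
    if apic_id < 0:
        raise ValueError("APIC ID must be non-negative")
    if threads_per_core < 1:
        raise ValueError("threads_per_core must be at least 1")
    return apic_id // threads_per_core


# The two examples from the post, plus the APIC ID 3 from the first post.
for apic in (5, 10, 3):
    print(f"APIC {apic} -> core {apic_id_to_core(apic)}")
```

Under the same assumption, APIC IDs 0 and 1 are the two threads of core 0.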
 

kneel23

Commendable
Aug 10, 2021
3
0
1,510
Me too. Brand new build, brand new Windows. No tweaks. No PBO, CPO, XMP or curve optimizers. Everything default.

Gigabyte X570 Aorus Master
AMD 5950X
RTX 3080 TI FTW3

Memory is fine.

The internet has been full of people talking about this problem since late 2020. No [actual] solutions. All assumed solutions are debunked when the reboots happen again 1-2 days later.

Common thread is the 5950X or 5900X series.

It's starting to be really frustrating. Brand new amazing build, but random rebooting makes it worthless.

Event viewer:
"
A fatal hardware error has occurred.
Reported by component: Processor Core
Error Source: Machine Check Exception
Error Type: Cache Hierarchy Error
Processor APIC ID: 1"

Mine seem to always be APIC ID 0 or 1, never any others
 
Aug 25, 2021
2
0
10
Wanted to chime in and say that sadly I'm in the same boat. Brand new system assembled last week and I've been experiencing these dreaded WHEA ID 18 errors as well, 430 errors since 8/19.

Same as OP, all BIOS settings are at default and the system restarts at random when idling, especially after I've locked it and left it waiting for login.
I tried all sorts of troubleshooting steps, like the ones mentioned above, and even added a little extra voltage to the CPU. All yielded squat: workload = no problem, idling = hard reset.

System specs are:
AMD 5600X CPU (batch BG 2123PGS)
Corsair H60 120mm AIO
Asrock B550M Steel Legend with latest BIOS 2.20
32GB G.Skill Trident Z Neo 3200 (ran with and without XMP enabled)
XFX Quick 308 RX 6600XT
Team Group MP33 Pro M.2 1TB
Corsair HX1050 PSU
2 x 2TB WD Black HDD in RAID 0 using AMD's RAIDXpert utility

All that shoved into a Phantom 630 with plenty of airflow: 140mm rear exhaust, 200mm rear top exhaust, 120mm H60 front top exhaust, 200mm front intake, 200mm side intake (So no overheating issues)
Idle temps stay around 31-33 C, Temps running Prime95 for 2 hours were 71 C, Aida64 temps for 2 hours stayed at 67-68 C
After searching online and encountering a plethora of possible solutions I decided to just RMA the CPU for a replacement.
I'm hoping that will fix the issue as it has for a few others.
 
Aug 25, 2021
2
0
10
Thought I'd give an update if anyone is still looking at this thread:
Still have not gotten my replacement 5600X in yet. But after much troubleshooting, it seems to be a video card driver issue for me. I borrowed a Ryzen 3600 to test the system and got another motherboard just in case, and the same issues persist at idle when using the "only" released video driver from AMD. Using the default Microsoft basic driver keeps the system stable (though you can't do much with it except browse the web a bit). As soon as the AMD driver is installed and the monitor turns off (default Windows lock screen timeout), it resets with the WHEA 18 error.
I reached out to XFX just in case I have a damaged/dying video card.