Random reboots with event "A fatal hardware error has occurred"

techprince

My Configuration :
CPU : AMD FX 6300
Cooler : CM Hyper TX3 Evo
MB : Gigabyte 78LMTUSB3 Rev 6 - F2 (Latest BIOS)
RAM : Kingston HyperX 4GB x2 1333MHz DDR3
GPU: MSI RX 480 Gaming X 4G
PSU: Corsair VS550
HDD: Samsung 840 SSD 128GB + WD Blue 1TB
UPS: APC Back-UPS Pro 1000
OS : Windows 10 Pro x64 1709 (up to date)

All the components are 2 years old except the HDD, RAM and GPU. The HDD was added 3 months ago and the GPU a year ago, whereas the memory is 3+ years old.

Error Details :
The computer has rebooted from a bugcheck. The bugcheck was: 0x00000124 (0x0000000000000000, 0xffffcb8dcf7238f8, 0x0000000000000000, 0x0000000000000000).

A fatal hardware error has occurred.
Component: AMD Northbridge
Error Source: Machine Check Exception
Error Type: HyperTransport Watchdog Timeout Error
Processor APIC ID: 0
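For anyone who wants to pick the bugcheck line apart, here is a minimal Python sketch that splits it into the stop code and its four parameters. The regular expression and the `parse_bugcheck` helper are my own illustration, not part of any Windows tool. As far as I know, 0x124 is WHEA_UNCORRECTABLE_ERROR and a first parameter of 0 means a machine check exception, which matches the "Error Source" above; the second parameter should be the address of the WHEA error record.

```python
import re

# Event log text copied from the error details above.
EVENT_TEXT = (
    "The computer has rebooted from a bugcheck. The bugcheck was: 0x00000124 "
    "(0x0000000000000000, 0xffffcb8dcf7238f8, 0x0000000000000000, "
    "0x0000000000000000)."
)


def parse_bugcheck(text):
    """Return (stop_code, [param1..param4]) as integers.

    Hypothetical helper: it only understands the event log wording shown above.
    """
    match = re.search(r"bugcheck was:\s*(0x[0-9a-fA-F]+)\s*\(([^)]*)\)", text)
    if match is None:
        raise ValueError("no bugcheck found in text")
    stop_code = int(match.group(1), 16)
    params = [int(p.strip(), 16) for p in match.group(2).split(",")]
    return stop_code, params


code, params = parse_bugcheck(EVENT_TEXT)

# Assumption: for stop code 0x124, parameter 1 == 0 indicates a machine
# check exception and parameter 2 is the address of the WHEA error record.
is_whea = code == 0x124
is_machine_check = is_whea and params[0] == 0

print(f"stop code: {code:#x}, WHEA uncorrectable: {is_whea}")
print(f"machine check exception: {is_machine_check}")
print(f"error record address: {params[1]:#x}")
```

The error record address is only meaningful inside a memory dump, so this just confirms that the stop code agrees with the "Machine Check Exception" source reported in the event.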

I have not overclocked anything, now or in the past. I also have "Performance Boost" disabled.

Temperatures for both CPU and GPU are well within their maximum limits. The CPU package temperature never goes above 47 °C, and the GPU never goes above 74 °C while stress testing.

The first reboot occurred suddenly while I was playing a game. The second occurred 2 days later while I was browsing in Steam. The third occurred as a game was starting, and the fourth occurred just after the Windows logo while booting, 5 days after the first one.

So far I have tested:

CPU with prime95 = no issues.
Memory with Memtest86 over 9 passes = no issues
GPU with Furmark = no issues
HDD with SeaTools = no issues
Cleaned GPU drivers with DDU multiple times and installed the new optional drivers.
Updated Audio drivers.
Ran sfc /scannow, chkdsk /f
Reseated RAM and GPU.
Checked all the connections inside.

If the problem were with the GPU, I should see artifacts or freezes. I have never had any. I can play games for hours or days at a stretch without any such issue.

If the problem were with the memory, I should have seen errors while stress testing.

If the problem were with the CPU, the computer should have restarted while stress testing.

If the problem were with the HDD, SMART should have detected it or SeaTools should have shown something.

PSU load never goes above 350W, even while gaming. Two of the random reboots happened in Steam and while booting, when the PSU load wasn't even 150W. So the PSU most probably is not causing the reboots in these cases.
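To put those wattages in perspective, here is a rough Python sketch of the load as a fraction of the PSU's label rating. It assumes the Corsair VS550 can be treated as a flat 550W unit, which ignores per-rail limits and capacitor ageing, so it is only a sanity check and not a verdict on the PSU.

```python
# Assumption: the VS550 is treated as a flat 550 W unit (label rating only).
PSU_RATED_W = 550

# Load figures taken from the paragraph above.
loads_w = {
    "gaming (peak)": 350,
    "Steam / booting (upper bound)": 150,
}


def load_fraction(load_w, rated_w=PSU_RATED_W):
    """Return the load as a fraction of the rated wattage."""
    return load_w / rated_w


for label, watts in loads_w.items():
    print(f"{label}: {watts} W = {load_fraction(watts):.0%} of rated output")
```

Even the gaming peak works out to under two thirds of the rating, and the reboots in Steam and during boot happened at under a third of it.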

That leaves the motherboard. I have one year of warranty left on it. Should I RMA the board? What do you think?

 


I have it attached to APC Back-UPS PRO 1000