Question: Is there a way to check or scan a motherboard for errors?

Status
Not open for further replies.
Dec 16, 2023
Hello everyone.
I have a problem with my PC that has driven me nuts trying to solve, with no success so far. It restarts randomly, with no symptoms and at no particular time: sometimes it restarts many times within one hour, sometimes it will not restart for many hours. I will try to explain what I did ...
I use my PC at my shop, mainly for stock, sending email, preparing offers, printing papers, etc., so I think there is no stress or heavy usage ...

hardware is as follows:
motherboard: ASUS Z370-P
Cpu : Core i7-8700
RAM : 16 GB, 1 slot, Kingston PC-2666 DDR4 RAM
hard drive : Crucial Mx500 1tb M.2 SSD
Vga : motherboard built in
Power supply : Thermaltake Toughpower Grand 850 W, fully modular

Things I have tried so far:
First of all I removed all connected hardware (USB printer, speakers, CD drive); only the wireless keyboard and mouse are left.
1- Updated the BIOS, changed the BIOS battery
2- Installed a new SATA SSD with a new Windows installation and removed the M.2 SSD
3- Replaced the RAM with a new module in the same slot and tried the new RAM in all the other slots
4- Tried a discrete graphics card (RTX 2070)
5- Replaced the CPU and tried a new Core i5-8400
6- Tried another new power supply
7- Removed all the connections of the case front panel (reset switch, power switch, power LED)
8- Removed all case fans (except the CPU fan)

After doing all of the above, and also trying my removed hardware on another PC where it worked fine, I'm now 100% sure that I have narrowed the problem down to the motherboard ...

My question is: how can I troubleshoot the problem on the motherboard? Is there any way to fix a damaged motherboard? How can I pinpoint the damaged component on the motherboard and try to repair it without buying a new board?

thanks for any help
 
That's something I can't answer. You should know that when the system just resets like that, with no error screen, it is typically hitting what is called a triple fault (a hardware exception raised while the CPU is already trying to handle a double fault). Normally, when a fault occurs, the system unwinds to some minimal function from which recovery might be possible. Once a certain stage of failure is reached this is no longer possible, and any further operation is likely to corrupt a storage device. For the safety of the storage device, there is an immediate reset.

Hardware debugging of a motherboard can require equipment which is very proprietary and costs a good fraction of a million dollars. One of the best tools you can use is to see what Linux thinks during such a failure. You could add a second hard drive, e.g., standard SATA, disable the original, and install Linux on that. Or you could use a "test drive" sort of distribution, e.g., Kubuntu has a bootable DVD (or thumb drive) version that takes a very long time to boot, but then runs entirely in RAM without being installed (you could click a button at that time and install, but it is more or less an entire Linux system in RAM for that boot).

Within Linux you can run "dmesg --follow", and maybe use a web browser or something like it, and see what errors show up (the "dmesg --follow" is a live view of logging). If an exception occurs, then probably something will show up. Without a special console though the message will also go away immediately, so you might need to set up a video camera to monitor that log. Short of adding a serial console via a serial UART that is probably the best you can do (serial consoles log to a second computer, and so logs are not lost when the original computer dies).
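If a camera or serial console is not available, a rough alternative is to copy the live log to a file and force every line onto a disk as it arrives, so that whatever was written just before the reset is still there afterwards. The following is only a minimal sketch under some assumptions: the function names `persist_lines` and `follow_dmesg` are made up for illustration, the log file has to live on a real disk or USB stick (a file inside a RAM-only live session is lost on reboot), and reading the kernel log may need root on some systems. A hard reset can still cut off the very last message before it reaches the disk.

```python
import os
import subprocess


def persist_lines(stream, path):
    """Append each line from `stream` to `path`, forcing it to disk immediately.

    Returns the number of lines written.
    """
    count = 0
    with open(path, "a", encoding="utf-8") as log:
        for line in stream:
            log.write(line)
            log.flush()             # push Python's buffer to the OS
            os.fsync(log.fileno())  # ask the OS to commit to the storage device
            count += 1
    return count


def follow_dmesg(path):
    """Run `dmesg --follow` and persist its output to `path` until it ends."""
    # Reading the kernel log may require root on some systems.
    proc = subprocess.Popen(
        ["dmesg", "--follow"], stdout=subprocess.PIPE, text=True
    )
    try:
        return persist_lines(proc.stdout, path)
    finally:
        proc.terminate()
```

Calling something like `follow_dmesg("/mnt/usb/dmesg.log")` (the path is hypothetical) and then using the machine as usual would leave the last lines that reached the disk available for reading after the next restart.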

You've tried just about everything else, so it might actually be the motherboard. Sometimes a ground loop can cause issues if one of the ground connections is slightly off. If a mounting point is loose, or worse, too tight, or not lined up correctly, this could be an issue. Heat is one of the biggest unknowns, and if you have a thermal camera, and can watch the heat at different places on the motherboard up until the reboot, then you might see a hot spot (and there might or might not be anything you can do about that; a thermal camera is also expensive).
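Without a thermal camera, the temperature sensors that Linux already exposes can at least be logged up to the moment of the reboot. This is a sketch only, and it is not a substitute for a camera: it sees just the sensors the kernel reports under `/sys/class/thermal` (values are in thousandths of a degree Celsius), and which zones exist depends on the board and drivers, so a hot component with no sensor near it will not show up. The function names and the log file name are assumptions made for illustration.

```python
import glob
import os
import time


def read_thermal_zones(base="/sys/class/thermal"):
    """Return {zone_name: degrees_C} for every readable thermal zone under `base`."""
    readings = {}
    for zone in sorted(glob.glob(os.path.join(base, "thermal_zone*"))):
        try:
            with open(os.path.join(zone, "temp")) as f:
                millideg = int(f.read().strip())
        except (OSError, ValueError):
            continue  # zone missing or unreadable; skip it
        readings[os.path.basename(zone)] = millideg / 1000.0
    return readings


def log_once(path, base="/sys/class/thermal"):
    """Append one timestamped line of readings to `path` and force it to disk."""
    readings = read_thermal_zones(base)
    stamp = time.strftime("%Y-%m-%d %H:%M:%S")
    fields = " ".join(f"{name}={temp:.1f}C" for name, temp in readings.items())
    with open(path, "a", encoding="utf-8") as log:
        log.write(f"{stamp} {fields}\n")
        log.flush()
        os.fsync(log.fileno())  # so the last reading survives a sudden reset
    return readings


def monitor(path, interval=2.0, samples=None, base="/sys/class/thermal"):
    """Log readings every `interval` seconds; run forever if `samples` is None."""
    taken = 0
    while samples is None or taken < samples:
        log_once(path, base)
        taken += 1
        time.sleep(interval)
```

Running `monitor("thermal.log")` from a directory on a real disk and checking the final lines after a restart would show whether any reported temperature was climbing just before the reset.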
 