Question Advice on potential bad motherboard

May 25, 2024
6
1
10
I've hand-built dozens of PCs over the past 20+ years and never had a problem that wasn't just me being stupid.

I wanted to upgrade a low-end-ish utility computer in my house. I bought an ASUS Prime B450M-A II motherboard and some G.Skill memory to go with it. I re-used the existing GPU & NVME, and swapped the CPU from another box. All of this is/was known-good hardware.

After it was rebuilt, it would boot fine, and then after a random amount of time it would lock up hard, reporting a ton of EXT4 FS errors on the console. Booting from a rescue disk and scanning the NVME always reported zero filesystem errors. I did this several times. I reset the BIOS to defaults, updated the BIOS, tried various parameters in the BIOS, and moved the NVME from the mobo to a PCIE slot. Nothing changed. Convinced something was wrong with the NVME, I bought a new one and did a fresh OS install. Same problems. Since the GPU was on the same bus, I took a long shot and swapped it with one from another box. Same problems. I ran memtest for hours; it never found anything.

The only thing left that I can think of is the motherboard. Is it reasonable to attempt to return it? The symptoms are very repeatable and very specific. At this point, could it be anything else?
 
May 25, 2024
Thanks for the idea; I did a fresh OS install on a hard drive and it's been running perfectly for hours. The motherboard hates NVMEs. I'll cycle a couple more through it for more data; but with one known good and one brand new NVME showing the same symptoms, I'm convinced there's something wrong with the motherboard and I'll see the same result with the others.
 
May 25, 2024
After more testing, it turned out the original NVME was bad. In my initial testing it was always installed, even when it wasn't in use. Once I physically removed it, the issue went away. I put the new NVME in and it's been 24+ hours with no sign of problems. I don't know how an unused NVME could cause problems. I put it in an external USB-C enclosure to copy data from it to the new NVME and have had no problems. 🤷‍♂️
 
May 25, 2024
... annnnd it worked flawlessly for a few days and then it tanked again. I was only able to capture dmesg by booting from USB and mounting one of the NVME partitions:

[ 579.497218] nvme0n1: I/O Cmd(0x2) @ LBA 2081, 1 blocks, I/O Error (sct 0x3 / sc 0x71)
[ 579.497242] I/O error, dev nvme0n1, sector 2081 op 0x0:(READ) flags 0x80700 phys_seg 1 prio class 0
[ 579.497273] nvme0n1: I/O Cmd(0x2) @ LBA 2082, 1 blocks, I/O Error (sct 0x3 / sc 0x71)
[ 579.497276] I/O error, dev nvme0n1, sector 2082 op 0x0:(READ) flags 0x80700 phys_seg 1 prio class 0
[ 579.497284] nvme0n1: I/O Cmd(0x2) @ LBA 2083, 1 blocks, I/O Error (sct 0x3 / sc 0x71)
... repeats for a while
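
For anyone wading through a long run of these lines, here is a minimal Python sketch that tallies the NVMe I/O error lines by device and status code and reports the LBA range they cover. The regex is an assumption based only on the line format shown above, and the function name `summarize_nvme_errors` is made up for illustration; it is not part of any existing tool.

```python
import re
from collections import Counter

# Pattern assumed from the dmesg excerpt above, e.g.:
# [ 579.497218] nvme0n1: I/O Cmd(0x2) @ LBA 2081, 1 blocks, I/O Error (sct 0x3 / sc 0x71)
NVME_ERR = re.compile(
    r"\[\s*(?P<ts>\d+\.\d+)\]\s+(?P<dev>nvme\w+): I/O Cmd\((?P<cmd>0x[0-9a-fA-F]+)\) "
    r"@ LBA (?P<lba>\d+), (?P<blocks>\d+) blocks, "
    r"I/O Error \(sct (?P<sct>0x[0-9a-fA-F]+) / sc (?P<sc>0x[0-9a-fA-F]+)\)"
)

# Sample copied from the log above (the generic "I/O error, dev ..." lines are ignored).
SAMPLE = """\
[ 579.497218] nvme0n1: I/O Cmd(0x2) @ LBA 2081, 1 blocks, I/O Error (sct 0x3 / sc 0x71)
[ 579.497242] I/O error, dev nvme0n1, sector 2081 op 0x0:(READ) flags 0x80700 phys_seg 1 prio class 0
[ 579.497273] nvme0n1: I/O Cmd(0x2) @ LBA 2082, 1 blocks, I/O Error (sct 0x3 / sc 0x71)
[ 579.497276] I/O error, dev nvme0n1, sector 2082 op 0x0:(READ) flags 0x80700 phys_seg 1 prio class 0
[ 579.497284] nvme0n1: I/O Cmd(0x2) @ LBA 2083, 1 blocks, I/O Error (sct 0x3 / sc 0x71)
"""


def summarize_nvme_errors(dmesg_text):
    """Group NVMe I/O error lines by (device, sct, sc) and report count and LBA range."""
    counts = Counter()
    lbas = {}
    for line in dmesg_text.splitlines():
        m = NVME_ERR.search(line)
        if not m:
            continue  # skip anything that is not an NVMe command error line
        key = (m["dev"], m["sct"], m["sc"])
        counts[key] += 1
        lbas.setdefault(key, []).append(int(m["lba"]))
    return {
        key: {"count": n, "first_lba": min(lbas[key]), "last_lba": max(lbas[key])}
        for key, n in counts.items()
    }


if __name__ == "__main__":
    for key, info in summarize_nvme_errors(SAMPLE).items():
        print(key, info)
```

If every error shares the same status code and the LBAs are consecutive, as in this excerpt, that suggests one failing read run rather than scattered bad blocks, though the sketch itself makes no diagnosis.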
 
May 25, 2024
Case closed: it was the power supply.

If you had told me this, I wouldn't have believed it.

There was an old SATA HD in the box, and I decided to pull it out on the 29th. This is what tipped me off to the problem: shortly after I removed it, the system locked up. I did this multiple times and it was very repeatable. I replaced the power supply and removed the HD. No problems for a week. Previously it wouldn't go more than a couple of days.
 