Question: Bad hardware diagnosis?

Oct 6, 2024
Hello all! My friend has been slowly building a PC and has now completed it. Here are the specs:

MSI Tomahawk X570S
Ryzen 5950X
MSI 4070 Super
1TB 980 Pro
2TB 980 Pro
128GB DDR4-3600 Corsair Vengeance
Windows 11 Home


Original parts before issues:
CPU
MB
RAM
1TB 980 Pro

Around May, he purchased the 2TB drive, the GPU, and 64 extra GB of RAM. Ever since then, he’s been getting “DPC_WATCHDOG_VIOLATION” blue screens.

It started out occasional (weekly), then became more frequent (multiple times a week), and eventually reached the point where the PC would freeze within 10 minutes, show this blue screen, and stay on it until manually shut down (long press). It also happens with no load.

I’ve been attempting to diagnose the problem. Here are the steps I’ve taken:
  • Driver reinstallation for every component (DDU for the GPU)
  • Temperature check on all components (under load and at idle)
  • Crash dump analysis
  • Event Viewer analysis
  • GPU stress test (in a working machine)
  • Testing the RAM (in a working machine)
  • Running MemTest86 (no issue)
  • Hard Disk Sentinel (no issue)
  • Reseating the GPU and RAM
  • Alternating sticks and slots for the RAM
  • Using known working RAM
  • Using a known working GPU
  • Reseating the M.2 drives
  • Switching M.2 ports and using one drive at a time
  • Using the M.2 drive on a PCIe-to-NVMe adapter
  • Using a known working SATA SSD and HDD
  • Wiping the drives to reinstall Windows

When I attempted to reinstall Windows after formatting the drives, it would install, but once it reached the page with the keyboard layout and selectable options, it would freeze again with the watchdog violation. Eventually the PC would freeze while I was still selecting a drive to install to, regardless of which of the troubleshooting steps above I had applied.

I did not swap or test the CPU in another machine, because this PC ran with zero crashes until this May, right when these upgrades started.

HOWEVER, upon inspecting the 2TB M.2, I found that the heat sink had been removed, and the drive may have been physically damaged in the process. Here is a photo:
View: https://imgur.com/a/l4dofpi
The conclusion I’ve come to is that this drive slowly started to damage the motherboard (power shorts, instability, etc.), because the PC freezes no matter which hardware combination I use during the troubleshooting steps above.

I would love some insight into this issue. What do you guys think? The next step we are going to take is purchasing a new motherboard and tossing the 2TB M.2.
 
That SSD shouldn't be used with its PCB and onboard components so obviously damaged.
Another thing I noticed: "and 64 extra GB of ram". Did you just add it to the existing sticks? That can cause that exact problem too.
 
He purchased the exact same kit he was already running, hence the extra 64GB. He was originally running 2x32GB and bought the second kit to make it 4x32GB. The problems already existed before he purchased the second 64GB kit. Hope this clears things up. And yes, I agree the SSD should not have been used. I wasn’t present during the installation and only became aware of the SSD's condition tonight.
 
Did you try running the RAM at base or a lower speed? Full speed can be a bit too much for the memory controller when 4 sticks are used (two DIMMs per channel). It also looks like Ryzen is not fond of Vengeance RAM for some reason; many people complain about it. I always trust Kingston for all AMD/Ryzen builds.
 
We ran the RAM at 2133, 2666, and max speed, and tested each of those with 1, 2, and 4 sticks.
 
That leaves the Vengeance "problem".
I believe the issue was the CPU. After switching to a new motherboard and power supply and testing with both the Vengeance and Kingston RAM, the problem still persisted. As a last resort, after trying every other hardware swap without success, I pulled his CPU (5950X) and installed mine (5600X). This finally let Windows install, reach the desktop, and run without issue. It sucks since he purchased a new board, but if my theory is correct (SSD damage -> power issues -> fried memory controller on the CPU), he is better off with a new board anyway, since the old one could have underlying damage too. Thank you for trying to help solve my problem.