Question Random BSODs involving ntoskrnl.exe and PSHED.dll?

Jan 6, 2024
If I'm in the wrong section let me know. This is my first ever post.

Hello! I do hope there's someone out there who's able to help me out with this. I built a (partly) new system and I'm getting BSODs with a WHEA Uncorrectable Error message. I've already skimmed through tons of forum questions (here and elsewhere) and tried guides I found online on "how to fix" this error, without any luck.

My suspicion is it's either the motherboard (the only part in the system that isn't new) or a faulty CPU (I really hope it's not that), but again, I'm not an expert and I'm hoping someone might be able to help.

Other than installing Windows and a few apps, I haven't used the system much. The BSODs don't follow a specific pattern. I tried to trigger them, but without luck. They seem to appear sooner when I do things like browsing YT with a bunch of apps open, but I can't nail it down. Sometimes they appear even when I'm not using the system at all and it's just idling on the desktop.

The system looks as follows:
- OS: Windows 11
- Motherboard: ASUS ROG STRIX B550-F GAMING
- CPU: AMD Ryzen 9 5900X
- RAM: Corsair Vengeance LPX DDR4 32GB (2x 16GB) 3200MHz - 16-20-20-38
- PSU: be quiet! STRAIGHT POWER 11 CM 850W
- Graphics: Gainward GeForce RTX 4060 Ti 16GB Panther
- SSD: Seagate FireCuda 530 1 TB (PCIe 4.0 x4, NVMe 1.4, M.2 2280)

Here's what I did for troubleshooting:
- Installed the latest BIOS update and restored settings to default (there's no OC and never was)
- Made sure Windows is up to date, went through all the system devices and checked for outdated drivers - none
- Installed AMD Chipset drivers
- I ran a 3-minute BurnInTest from PassMark, which stress tests the whole system - no errors while running
- I ran Windows memory diagnostic on advanced - no errors
- Stress tested the CPU via CPU-Z to see whether it triggers the BSOD - it didn't
- I ran `dism /online /cleanup-image /restorehealth` on the command line - twice I got a BSOD while it was running, but the third run went fine
- I ran `sfc /scannow` - it said it found damaged files and repaired them, but I still got a BSOD after that
- At first I had four RAM sticks (totaling 64GB), all of the same model, and swapped pairs in and out, but still got BSODs

The system and the CPU don't overheat. The CPU idles at around 50C. When I stress test it goes up to 70C - 75C.

You can find minidump data here: https://pastebin.com/rDbTTSJ5 (part of it is in German because the system is set to German language - let me know if this is an issue)

What else can I do to troubleshoot?
 
Yesterday I ran memtest86 and it showed two errors which looked like:

Code:
Test: 6 Addr: C49CE988 Expected: FEFFFFFF Actual: FEFFFF6E CPU: 9
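
To see exactly how the two values differ, they can be XORed. This is only a quick Python sketch; the helper name `flipped_bits` is my own and the two hex values are copied from the memtest86 line above. It only shows which bits differ and says nothing about the cause.

```python
def flipped_bits(expected_hex: str, actual_hex: str) -> list[int]:
    """Return the bit positions (0 = least significant) where the two values differ."""
    diff = int(expected_hex, 16) ^ int(actual_hex, 16)
    return [bit for bit in range(diff.bit_length()) if (diff >> bit) & 1]


# Values taken from the memtest86 error line.
diff_mask = int("FEFFFFFF", 16) ^ int("FEFFFF6E", 16)
print(f"XOR mask: {diff_mask:08X}")                                   # 00000091
print("Flipped bits:", flipped_bits("FEFFFFFF", "FEFFFF6E"))          # [0, 4, 7]
```

So three bits differ, all in the lowest byte of the 32-bit value.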

I noticed that my RAM modules were installed in the wrong slots (A1, B1), whereas ASUS recommends A2 and B2. So I moved them, ran memtest86 again (3 times) and got no errors, but I still got a BSOD after running the computer for a while.

Now I've swapped the modules for ones from a different manufacturer and of a different size, and am still debugging.
 
Another day, another BSOD. The system ran fine for about an hour and I started a bunch of applications (OBS, Brave browser on YT, and HWMonitor).

Please note, I replaced the RAM modules with two working modules from my other system which runs totally fine without any issues. So I'm assuming it's actually either the CPU or the mainboard. People on the internet say that CPU failures are rare but, again, I'm no expert and can't really guess what's going on.

I'm still hoping there's someone out there who can interpret the dump files.

WinDbg analysis of the dump file: https://privatebin.net/?7e0d306880c43069#3nSWyZgzETbKJYuBBXfWaL6yAdTj975kGpT9GEBqwHP7

Here's the minidump file: https://drive.google.com/file/d/1KqOPzOlpSuw8v0VzyXBZMfKCZ5i3FX1f/view?usp=drive_link

I do have a full memory dump, but it's ~2GB so I can't really share it here.