Question Help determine faulty component (PSU/GPU/Mobo)

ZeeSteen

Distinguished
Jan 16, 2014
28
0
18,540
Hi all,

I left on vacation for 10 days and came back yesterday evening.
My system worked absolutely fine before.

After I booted the system and started working, the screen blacked out and the PC rebooted almost immediately, even though I was only doing office work.
During a 20-minute meeting this happened 3 times.

Hardware list:
  • GPU: AMD Radeon RX 7900 XTX
  • SPARE GPU: ZOTAC GeForce GTX 1070 AMP Extreme
  • CPU: AMD Ryzen 9 7950X3D
  • Mobo: ROG STRIX X670E-E GAMING Wifi
  • RAM: Kingston Fury Beast KF560C36BBEK2-64
    • 2x 32GB
  • PSU: be quiet! Straight Power 12 850W

So I did the following to start pinpointing what was wrong:
  1. Windows Memory Diagnostics -> SUCCESS, no errors
  2. OCCT CPU test (Normal, Variable) for 1 hour -> SUCCESS, no errors
  3. OCCT Power test, both GPUs, for 1 hour
    1. Attempt 1: FAIL after 36s
    2. Attempt 2: FAIL after 1m 17s
  4. Connected DP to the motherboard instead of the RX 7900 XTX
  5. OCCT Power test, integrated GPU, for 1 hour -> SUCCESS, no errors
  6. OCCT Power test, RX 7900 XTX, for 1 hour -> FAIL after 1m 32s
  7. OCCT GPU 3D Adaptive test, both GPUs. Variable, All GPUs, 10-60%, 4% inc 30s. For 1 hour -> FAIL after 5m 6s
  8. OCCT GPU 3D Adaptive test, integrated GPU. Variable, All GPUs, 10-60%, 4% inc 30s. For 1 hour (stopped after 40m)
    1. SUCCESS, no errors
    2. Because it failed with both GPUs running simultaneously, I didn't test the RX 7900 XTX separately again and assumed it would fail
  9. Swapped out AMD Radeon RX 7900 XTX (370W) for old GPU ZOTAC GeForce GTX 1070 AMP Extreme (250W)
    1. Installed latest drivers
    2. All of a sudden the graphical response became VERY sluggish/laggy
    3. Weirdest BSOD of my life (see BSOD screenshot)
    4. After this the system would not boot.
    5. Mobo Q-LED showing RED for either DRAM or CPU. (I thought it was DRAM, but apparently only CPU can be red)
    6. Removed 1070 again
    7. System boots again.
  10. Did OCCT Memory stress test (80%) for 15m --> SUCCESS
  11. HELP... Not sure where to go from here
Problem exclusions:
  • Temperature is most likely not the culprit. See the temperatures at crash time in the OCCT screenshot below
  • Memory is most likely not the issue because:
    • Windows Memory Diagnostics test -> successful
    • OCCT Memory stability test -> successful
    • The issue can specifically be triggered by stability testing PCIe GPU but not Integrated GPU
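
The elimination reasoning above can be written out as a small Python sketch. The mapping of each test to the components it exercises is a rough assumption made for illustration (it is not taken from OCCT documentation), and the labels "dGPU", "PCIe slot" and "PSU at high load" are made-up names. The pass/fail results are the ones listed in the steps above.

```python
# Each entry: (test name, components it exercises, passed?)
# The component sets are ASSUMPTIONS for illustration only.
TESTS = [
    ("Windows Memory Diagnostics", {"RAM"}, True),
    ("OCCT CPU, 1 h", {"CPU", "RAM"}, True),
    ("OCCT Power, iGPU + RX 7900 XTX",
     {"CPU", "RAM", "iGPU", "dGPU", "PCIe slot", "PSU at high load"}, False),
    ("OCCT Power, iGPU only", {"CPU", "RAM", "iGPU"}, True),
    ("OCCT Power, RX 7900 XTX",
     {"CPU", "RAM", "dGPU", "PCIe slot", "PSU at high load"}, False),
    ("OCCT 3D Adaptive, iGPU + RX 7900 XTX",
     {"iGPU", "dGPU", "PCIe slot", "PSU at high load"}, False),
    ("OCCT 3D Adaptive, iGPU only", {"iGPU"}, True),
    ("OCCT Memory 80%, 15 min", {"RAM"}, True),
]


def suspects(tests):
    """Components present in every failed test and in no passed test."""
    failed = [parts for _, parts, ok in tests if not ok]
    passed = [parts for _, parts, ok in tests if ok]
    if not failed:
        return set()
    common_to_failures = set.intersection(*failed)
    cleared = set().union(*passed)
    return common_to_failures - cleared


print(sorted(suspects(TESTS)))
```

With these assumed component sets, what remains is the dedicated GPU, the PCIe slot and the PSU under high load. A passed test only lowers suspicion; it does not prove a part is healthy.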

Thoughts I have
  • Could be GPU - but then why did the 1070 cause this weird error?
    • Maybe because I didn't DDU?
  • Assumed it could be something with PSU, failing to deliver full power
  • Could be mobo PCIe connectivity?
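
As a sanity check on the PSU theory, here is a minimal power-budget sketch in Python. The GPU wattages and the 850 W rating are the figures quoted in this post. The CPU and "rest of system" numbers are placeholder assumptions and should be replaced with measured values. The sum covers sustained draw only and does not model short transient spikes.

```python
PSU_WATTS = 850  # be quiet! Straight Power 12 850W, from the hardware list

# GPU board power as quoted in step 9 of this post
GPU_WATTS = {
    "RX 7900 XTX": 370,
    "GTX 1070 AMP Extreme": 250,
}

CPU_WATTS = 162   # ASSUMPTION: rough CPU package power under load
REST_WATTS = 75   # ASSUMPTION: motherboard, RAM, drives, fans


def sustained_draw(gpu):
    """Estimated sustained system draw in watts with the given GPU."""
    return GPU_WATTS[gpu] + CPU_WATTS + REST_WATTS


def headroom(gpu):
    """Watts left on the PSU rating after the sustained estimate."""
    return PSU_WATTS - sustained_draw(gpu)


for gpu in GPU_WATTS:
    print(f"{gpu}: ~{sustained_draw(gpu)} W sustained, {headroom(gpu)} W headroom")
```

Under these assumptions the 850 W unit has headroom on paper with either card. If the PSU is involved, that would point towards a defect or transient behaviour rather than simple undersizing.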

BSOD screenshot:
FNs8RiG.jpeg


Temperatures during crash OCCT
tFaRBZt.png
 

ZeeSteen

What PSU?

Change from AMD GPU to NVIDIA usually requires use of DDU.
Check CPU and GPU temperatures.
I accidentally hit Ctrl+Enter and posted before finishing.

I'll run DDU and install the 1070 again.

CPU/GPU temps were not the issue - see update in post with GPU / CPU temps during crash
PSU is be quiet! Straight Power 12 850W (platinum)
 

ZeeSteen

I had a similar issue on my PC. It came down to faulty RAM, so my PC wouldn't even boot. What is your RAM configuration: 2 or 4 sticks, and how many GB each?
RAM is 64 GB in a 2x 32GB config.
Pretty confident it's not a RAM issue. I ran both the OCCT memory stability test and Windows Memory Diagnostics, and both passed.