Question GPU Crashing system?

Jan 5, 2023
Hello friends, I've got a 6950 XT (PowerColor Red Devil) and I THINK it has been crashing my system. My system plays literally anything fine for 2-3 hours, more or less, and then it randomly freezes the game first and then the whole PC. After that, one of my displays no longer receives a signal, and the other monitor is also frozen but stays on. This state lasts 40 minutes until I get impatient, stop caring about waiting for a BSOD, kill the power by holding the power button for 10 s, and reboot. Once it has started again, it runs perfectly smoothly.

This happens after 2-3 hours, and I have experienced it 4 times in the last week, which is rather concerning. I recently got a new PSU (be quiet! Pure Power 11 1000W), but it ran fine a few weeks back, so I doubt that's the issue. My motherboard has BIOS updates available, but I'm hesitant to update while my system is unstable. I'm currently running a version that is 2 months old; the latest BIOS update was released 12 days ago. Should I try to update?

The things I have tried/tested:

  • Temperatures are good (hotspot is max 95 °C in MSFS 2020)
  • DDR5 running at JEDEC speed

FYI:

- Not running an overclock on any parts except XMP (normally)

Does anyone know what might be going on?
 
Welcome to the forums, newcomer!

When posting a thread of troubleshooting nature, it's customary to include your full system's specs. Please list the specs to your build like so:
CPU:
Motherboard:
Ram:
SSD/HDD:
GPU:
PSU:
Chassis:
OS:
Monitor:

What BIOS version are you currently on? Side task, see if you can remain in BIOS indefinitely. Please include all the parts that were replaced/upgraded. What discrete GPU were you working with prior to the Powercolor 6950XT Red Devil?
 

Hello thank you

CPU: 7950X
Motherboard: MSI X670-P Pro
Ram: DDR5 Trident Z5 Neo 6000MHz CL32 (16x2)
SSD: KC 3000 1TB & 2TB (total 3TB, 1TB OS drive)
GPU: 6950XT PowerColor Red Devil
PSU: be quiet! Pure Power 11 1000W
Chassis: DF700 Flux
OS: Windows 11 Pro
Monitor: LG UltraGear 27GP850-B & iiyama G-Master Black Hawk GB2530HSU-B1

I used the iGPU before, and my BIOS version is 7D67v13. I will try to stay in the BIOS indefinitely. I have upgraded my PSU from an OEM one from my dad, which was about to explode. I'm not sure what it was, but it's gone now.

It seems the BIOS that was listed before, 7D67v15, is now gone from MSI's page, which is rather weird.
 
I used iGPU before
Try using DDU to remove all GPU drivers, then manually reinstall the latest version sourced from AMD's support site with elevated rights, i.e., right-click the installer > Run as Administrator.

As for your BIOS, even if the version you mention is missing, you still have one more BIOS version pending update. If you're able to stay in the BIOS indefinitely, then you have nothing to worry about. Still, IMHO, to rule out the GPU incurring a transient spike, assuming a worst-case scenario: power down, disconnect from the wall, and remove your discrete GPU. Then update the BIOS, clear CMOS, and drop the discrete GPU back in. You can then look into doing what I've stated about DDU (in the first line of this post).
 
I have done DDU now, thank you. My system stayed alive in the BIOS the whole night, so I have also updated the BIOS, and I will now see whether the issues have been solved. Thanks a lot.
 
I just built a similar system. 7950X, 6900 XT, 2x16GB CL32 6200MT/s RAM.
Check Windows Event Viewer (System and Application logs) to see if anything specific started running when the crashing occurs.

You should test CPU, GPU, and RAM separately to check stability.

CPU - Prime95 blend stress test with all AVX off. Needs to run for at least 15 mins.
GPU - FurMark stress test. Again, at least 15 mins.
RAM - MemTest86. At least one full pass.

Let us know the results of all the tests.
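
For the Event Viewer check above, one way to narrow things down is to look only at the entries logged in the minutes before a freeze. The snippet below is a minimal sketch, not an official tool: the record layout (`time`, `level`, `source` keys), the function name `events_before`, and the sample entries are all made up for illustration, and it assumes you have already exported or copied the log entries into that shape yourself.

```python
from datetime import datetime, timedelta


def events_before(events, freeze_time, window_minutes=10):
    """Return events logged within window_minutes before freeze_time, oldest first."""
    start = freeze_time - timedelta(minutes=window_minutes)
    hits = [e for e in events if start <= e["time"] <= freeze_time]
    return sorted(hits, key=lambda e: e["time"])


# Made-up sample records; real entries would come from your own log export.
sample_events = [
    {"time": datetime(2023, 1, 10, 18, 2), "level": "Information", "source": "example-service"},
    {"time": datetime(2023, 1, 10, 20, 44), "level": "Error", "source": "example-driver"},
    {"time": datetime(2023, 1, 10, 20, 41), "level": "Warning", "source": "example-driver"},
]

# Approximate moment the screen froze (hypothetical).
freeze = datetime(2023, 1, 10, 20, 45)

suspects = events_before(sample_events, freeze)
for event in suspects:
    print(event["time"].isoformat(), event["level"], event["source"])
```

If nothing at all shows up in that window across several freezes, that in itself is useful to report back, since it suggests the system locked up before Windows could log anything.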
 
I have run all of these and I think they completed. FurMark and Prime95 didn't "complete" as such; I had to close them when my 15-minute timer ran out. MemTest86 told me there were 0 errors found.

Issues are still occurring.
 
Anything showing in Windows event log when the freeze occurs?

I would definitely update the BIOS. I had issues with the BIOS that came on my board. It didn't start behaving normally until I updated it to the latest. Is that RAM on the QVL for that board?

Oh, make sure Memory Power Down mode is ENABLED in BIOS. With Zen 3 a lot of sites say that this needs to be DISABLED to help with stability, but with Zen 4 the opposite appears to be true. I was randomly crashing with XMP enabled until I switched Memory Power Down to ENABLED.

Is this only occurring in one particular game?

Run Unigine Valley demo (free) looping for a couple hours to see if you can reproduce the issue.
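
When trying to reproduce a hard freeze over a couple of hours, it helps to know afterwards exactly when the machine stopped responding. The following is a rough sketch of a "heartbeat" logger that appends a timestamp to a file at a fixed interval; after a forced reboot, the last line gives the approximate freeze time. The file name `heartbeat.log`, the 30-second interval, and the function names are arbitrary choices for illustration, not part of any tool mentioned in this thread.

```python
import os
import time
from datetime import datetime


def write_heartbeat(path, now=None):
    """Append one timestamp line and force it to disk so it survives a hard freeze."""
    now = now or datetime.now()
    with open(path, "a", encoding="utf-8") as f:
        f.write(now.isoformat(timespec="seconds") + "\n")
        f.flush()
        os.fsync(f.fileno())


def last_heartbeat(path):
    """Return the most recent logged timestamp, or None if the file is empty."""
    with open(path, encoding="utf-8") as f:
        lines = [line.strip() for line in f if line.strip()]
    return datetime.fromisoformat(lines[-1]) if lines else None


def run(path="heartbeat.log", interval_seconds=30):
    """Loop forever, writing a heartbeat every interval_seconds."""
    while True:
        write_heartbeat(path)
        time.sleep(interval_seconds)
```

Calling `run()` in a terminal before starting the benchmark and `last_heartbeat("heartbeat.log")` after the reboot gives a time to compare against the Event Viewer entries.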
 

The Windows event logs show nothing.

The BIOS is already updated, and I believe the RAM is on the QVL; it's also marketed as being for AMD. I have Memory Power Down mode on Auto; I will set it to Enabled now.

This happens in GTA 5 and Microsoft Flight Simulator, but also when I'm just watching YouTube or idling.
 
@alceryes Sir, today it happened twice. I was patient and got an error out of it: "Driver unexpected time out". I have run DDU so many times now, and all my power connectors are properly seated. I'm using the recommended driver, not the latest.

I have 3 days left to RMA it with a high chance of getting my money back. The 28 days will expire soon, and after that there will only be replacements or repairs, which I'd rather not have.

Do you have any suggestions on what's up, or should I just call it a day, send it back, and pick up a 4080?

Also, I don't run any overclock, not even on the RAM at this point.
 
Have you reloaded the OS yet?

Also, the 7950X has an iGPU on it. There's nothing wrong with having the drivers for it installed and just NOT using it. In fact, not having them installed may be causing issues. I have all hardware enabled and drivers installed, even though I'm not using the iGPU, Wi-Fi, or Bluetooth.
Download the latest drivers from AMD.
Use DDU (one last time 😉) to uninstall the drivers for both the iGPU and the 6950 XT.
Reboot and perform a clean install using the latest AMD drivers (it will need to reboot again) and test.

Other than a full OS reinstall, I would have to get my hands on it to diagnose further - you've done a lot. You may have bad components (not necessarily GPU, but possibly).

I wouldn't miss the return window. Send it back if it's still not working properly.
 
I will do so right now and do what you have said, and I will see if it happens one more time. Because the system worked really well earlier without the GPU, I would suspect it's the GPU. But since I didn't have a GPU back then, it might just as well be something else that can't handle the GPU. If I had an old GPU I would've plugged it in and tested already, but unfortunately that's not an option.

I do, however, suspect that it is the GPU, because a friend of mine had this exact issue with his (also recently bought) GPU. He went through a lot of troubleshooting as well and was unable to fix it, so he RMA'd it, bought a 4080, and has had no issues whatsoever. He had an Asus card and I have a PowerColor, so I am still holding out hope that it's firmware related and not hardware.

I am also worried that the shop won't be able to reproduce it, as this happens only after 4-5 hours, and often while watching YouTube or Twitter videos rather than while gaming in particular (it did happen while gaming, but just as often with YouTube or Twitter, which is not what I would expect).

My friend will also bring his PSU with him tonight, so I will plug that in instead (using all his cables, of course) and see if that helps. Can always try! I'll be back tomorrow with hopefully good news; otherwise I'll just RMA it. Thank you for your help.
 
Also, just to be clear, the latest AMD driver will install the proper driver for BOTH the 6950XT and the iGPU in one shot. Don't install two separate drivers.