[SOLVED] GPU crashes under load?

Status
Not open for further replies.
Apr 21, 2022
My graphics card completely crashes whenever I put it under heavy load: the monitor goes black, the GPU fans ramp up to 100%, and I can hear the drivers reset in the background through my headphones. I have to turn off my PSU, wait about 10 seconds and then turn it back on so the PC can reboot. After restarting, my drivers are completely gone, the resolution drops to something like 360p, and I have to DDU and reinstall them every time this happens. Here's the weird part: this only happens under a specific circumstance. For example, let's say my PC was off for half a day. I can boot it up and run FurMark for 4 hours straight with no issues, but the moment I close FurMark (thus ending the load) and wait a couple of minutes, the next time I try putting it under load it immediately crashes. Sometimes it just crashes by itself after a couple of minutes, without me even putting it under load.

Here are my PC specs:

GPU: Sapphire AMD Radeon RX 6700 Pulse
MOBO: Gigabyte B450M DS3H
CPU: Ryzen 5 2600
RAM: Corsair Vengeance LPX 3000MHz 16GB (4x4GB)
M.2: Kingston A2000 250GB
HDD: WD Blue 1TB 7200rpm
PSU: Seasonic S12II-520Bronze 520W

Things I've tried:

  • Reinstalling drivers with DDU.
  • Rolling back to older drivers.
  • Uninstalling programs like MSI Afterburner, RivaTuner etc.
  • Clearing BIOS settings.
  • Updating BIOS.
  • Updating chipset drivers.
  • Dropping from Windows 11 to Windows 10 (fresh install).
  • Re-slotting the card into the PCIe slot.
  • Reverting my OC + undervolt tuning in the AMD Radeon software to completely stock settings.
  • Turning off FreeSync in the drivers and on my monitor.

I'd like to mention that temperatures on the card seemed fine: after 4 continuous hours of FurMark the core temperature held steady at 64C and the junction was sitting around 88C.

Nothing seems to have worked. I'm not sure if this has anything to do with it, but I bought this card just a couple of weeks ago. Up until then, I'd been using a GT 730 for about 9 months.

I'm suspecting the power supply, but from what I could find online about similar issues it could also be the motherboard or the GPU itself. I have no clue.
 
I'm not sure if this has anything to do with it, but I bought this card just a couple of weeks ago. Up until then, I'd been using a GT 730 for about 9 months.

The GT 730 is a 38W GPU, while the RX 6700 is a 175W GPU. That's a big difference.

PSU: Seasonic S12II-520Bronze 520W

Time to retire this trusty workhorse and go with a far more modern PSU.

Here, I suggest the Seasonic Focus or PRIME series, in the 650W range.
pcpp: https://pcpartpicker.com/products/compare/WrNypg,yc38TW,7fndnQ,fnjJ7P/

All 3 of my PCs are also powered by Seasonic PSUs, full specs with pics in my sig. I too had an S12II-520 powering my main build (Skylake) for a short while, until I bought a PRIME TX-650 for it.
The S12II-520 is the best group-regulated PSU ever made and very reliable as well. Still, it first came out back in 2009 and by now it is way too old for any modern hardware. I retired my S12II-520 back in 2016.
 

When I bought my 6700 I knew it'd be drawing a lot more power and would basically almost top out my power supply. Even so, I thought maybe I could make it work somehow: an undervolt here, a power limit there. It worked fine for the first 2 weeks but, as you said, it's a 175W graphics card. It's probably making my PSU work at its limit, and transient spikes may well be causing it to crash.
I'll ask a friend to let me test his PSU and come back to you with results.
 
UPDATE: I've gotten my hands on a friend's power supply, namely a Corsair TX650M, in order to test it out on my system. After ensuring all necessary cables and connectors were in place, and that it was powering everything crucial in the PC, I went on to stress tests. (I had to do a fresh install of Windows since my friend didn't give me the SATA cables, and the PC refused to boot off of only my M.2.)

Nevertheless, a couple of minutes into FurMark it crashed once again. I made sure to DDU and fresh-install the drivers, but to no avail. Sometimes it crashed right in the middle of installing the drivers. Sometimes it crashed during boot-up (while I was trying to restart in safe mode). Every single time it was the exact same crash: black screen, GPU fans ramping up to 100%, and the only way to restore the system was to turn off the PSU and turn it back on after a couple of seconds.

My friend's PSU is perfectly fine, and after this I'm pretty sure that mine is as well. Out of curiosity, I tried running memtest just so I could rule out the tiny possibility of it being the RAM's fault, but midway through the test the GPU crashed again. So I plugged in my old GT 730 and ran memtest fully. After almost 3 hours it passed with flying colors, no errors whatsoever.

So at this point, my PSU is fine, my M.2 NVMe is fine (checked it with CrystalDiskInfo), I'm 99% positive that the CPU has nothing to do with it, and my RAM is fine. The only 2 options left are the motherboard (either the entire thing or just the PCIe slot) or the GPU itself.

I'm planning on swapping GPUs with my friend (he has a 5700XT), and the final test would be to see where it crashes. If my card crashes his PC, then there's no way it isn't the GPU. If my PC crashes the same way with his card, then it's most likely the MOBO.
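
The swap test boils down to a small truth table. Here is a minimal Python sketch of that reasoning; the function name and its two boolean inputs are made up for illustration, and it assumes each test runs long enough to trigger the crash and that nothing else differs between the two PCs.

```python
def likely_culprit(my_card_crashes_friends_pc: bool,
                   friends_card_crashes_my_pc: bool) -> str:
    """Map the two swap-test outcomes to the most likely faulty part.

    Hypothetical helper, not a diagnostic tool: it only encodes the
    elimination logic of the GPU swap described in the post.
    """
    if my_card_crashes_friends_pc and not friends_card_crashes_my_pc:
        # The fault followed the card into a different PC.
        return "GPU"
    if friends_card_crashes_my_pc and not my_card_crashes_friends_pc:
        # The fault stayed with the PC even with a different card.
        return "motherboard / PCIe slot"
    if my_card_crashes_friends_pc and friends_card_crashes_my_pc:
        return "inconclusive: possibly more than one fault"
    return "inconclusive: no crash reproduced, test for longer"


# Example: my card crashes in his PC, his card is stable in mine.
print(likely_culprit(True, False))
```

The two "inconclusive" branches are there because an intermittent fault may simply not show up during a short test.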

The entire 9 months I had my GT 730 I continuously stress tested my R5 2600, to the point where I got an AIO and managed to overclock it to a stable 4.1 GHz at around 67C tops in Cinebench for hours. I've seen it draw over 100W and be completely fine.

I will return after at most one day with my findings.
 
namely a Corsair TX650M

Which TX650M? There are several. Is it the 80+ Bronze or the 80+ Gold version? Or what is the part number of the PSU?

I'm planning on swapping GPUs with my friend (he has a 5700XT), and the final test would be to see where it crashes. If my card crashes his PC, then there's no way it isn't the GPU.

It could be the GPU, yes, since Sapphire isn't known for making the most reliable GPUs. Sapphire is one of the cheaper options, and at a cheap price you can't expect much reliability.
 

The part number for the PSU is CP-9020132, it's an 80+ Gold.
 
The part number for the PSU is CP-9020132, it's an 80+ Gold.

It's a mediocre PSU and not the latest either.

CP-9020132 came out back in 2017, while the latest, CP-9020229, came out in 2021 and got mixed reviews.
TH review of the latest one: https://www.tomshardware.com/reviews/corsair-tx650m-power-supply-review

And a non-English review of the one you currently have (you might want to use Google Translate to read it),
link: http://www.realhardtechx.com/index_archivos/Corsair_TXMseries_TX650M_650W_1.html

So, it "should" be okay, unless GPU transient power spikes surpass what any 650W unit is capable of delivering, 🤔 resulting in the issues you're getting.

What are GPU transient power spikes? A video to watch:

View: https://www.youtube.com/watch?v=wnRyyCsuHFQ


Going by that, it's possible that your GPU spikes to ~450W for milliseconds and, together with the rest of the load on the PSU, surpasses the 650W rating.
Still, Steve didn't test Radeon GPUs, so I'm only guessing at the range the RX 6700 may spike into. It could be purely a GPU fault as well. You'll have a better idea once you've tested with the 5700XT, which is a 225W GPU.
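
To put rough numbers on that, here is a quick Python sketch that adds a worst-case GPU spike to the rest of the system and compares the total with a PSU's label rating. The 450 W spike is only the guess from above, the 100 W CPU figure comes from the "over 100W" mentioned earlier in the thread, and the 50 W for everything else is purely an assumption. Many PSUs tolerate brief excursions above their label rating, so this shows whether a spike could exceed the rating, not whether the unit would actually shut down.

```python
def peak_draw_w(gpu_spike_w: float, cpu_w: float, other_w: float) -> float:
    """Worst-case instantaneous draw if a GPU spike coincides with full CPU load."""
    return gpu_spike_w + cpu_w + other_w


def headroom_w(psu_rating_w: float, peak_w: float) -> float:
    """Watts left under the label rating; negative means the peak exceeds it."""
    return psu_rating_w - peak_w


GPU_SPIKE_W = 450.0  # guessed transient for the RX 6700, not a measurement
CPU_W = 100.0        # overclocked R5 2600, per the figure reported in the thread
OTHER_W = 50.0       # assumption: motherboard, RAM, drives, fans

peak = peak_draw_w(GPU_SPIKE_W, CPU_W, OTHER_W)
for rating in (520.0, 650.0):  # the S12II-520 and the borrowed TX650M
    print(f"{rating:.0f} W PSU: peak {peak:.0f} W, headroom {headroom_w(rating, peak):+.0f} W")
```

Swap in measured figures if you have them; the defaults here are placeholders.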
 
Solution

UPDATE #2:

I've had the 5700XT for the past day or so, and I can say with full confidence that nothing went even slightly wrong during this time. FurMark, RDR2, Spider-Man Remastered and Warzone 2.0 all ran flawlessly. I even went into the Radeon software and turned the power limit all the way up to +50%, and it managed to pull 300+ watts in FurMark (it was completely fine, although I closed it after ~30 minutes due to insane VRM and memory temps; it wasn't my GPU so I didn't want to do any damage). After raising the power limit to +50%, I went into the BIOS and overclocked my CPU again to a stable 4.1 GHz all-core, and once again everything was completely stable. I've tested it thoroughly throughout the day, with all sorts of different tests.

My friend, however, had no such luck. The day I swapped cards with him he didn't have much time to test it, so he ran FurMark and a bit of Cyberpunk for 10-20ish minutes and told me it was fine. He was tired, however, and went to sleep. Once he finished his shift the next day he returned and started a FurMark stress test. Around an hour or so into it, the card crashed.

He told me the symptoms and they were exactly the same ones I'd experienced. The monitor went black, the GPU fans ramped up to 100%, and when he restarted the PC and tried booting into Windows he found that the drivers were gone. I was on Discord with him while he was explaining this to me, and while we were talking (the PC had just booted into Windows without the drivers) he told me it crashed by itself once again in front of him.

I've finally narrowed it down to being the card's fault and have already requested an RMA. I'm 95% sure this is caused either by a faulty chip or by a part that's prone to excessive overheating. That would explain why it crashed some time into the load, and why it crashed by itself again after a quick restart: the overheating part did not have time to cool down. This is further reinforced by some tests I've done, where I found that if, after one of these crashes, I shut off the PC and let it cool down for at least 30-60ish minutes, the card would boot up fine. I first discovered this after literally ragequitting one night because I had been DDU-ing and reinstalling drivers for an unbearable amount of time. After I slept 8+ hours, I was able to boot up the PC, cleanly install the drivers, and it worked fine for a while.

Either way, it seems my luck with graphics cards is terrible, seeing as I've had 3 die on me in the past 1-2 years. I appreciate the support on this issue, although it was without result.
 