Question GPU Problem

Oct 28, 2021
Hey guys,

I've asked about this issue in another forum already, but I got split answers, so I'd like to ask here.

Relevant hardware:
-RX580 Nitro+
-600W PSU

Now, here's my issue:
My GPU is broken... kind of. Basically, it fails to "boot up" (or, about 1 time in 20, it artifacts instead of failing to boot). What I mean is that as soon as Windows loads the driver, it breaks. However, that's not always the case.
There are times when it works for tens of reboots without any problems (i.e. it boots up with no artifacting), times when I have to reboot my PC 2-3 times to get it to work, and times when I have to reboot 30+ (!) times.
I'm unsure what's going on. The card was never overclocked beyond factory settings, and I bought it brand new, so I doubt there's a mining BIOS on it or anything. I have no issues under load; once it boots up, it works as if nothing ever happened.

TL;DR: as soon as the driver for my GPU gets loaded, there's a random (but high) chance that the card won't "boot up". Since the PC hangs, the only way to fix it is to restart.

The GPU works just fine without a driver, no artifacts or anything.

Some more information & what I've tried:
-Sometimes I get a BSOD with "VIDEO_TDR_FAILURE"
-It started happening around 1.5-2 years ago; I thought it was the motherboard, so I didn't RMA the card (yes, stupid)
-I tried older drivers and also Windows 7, as the issue first appeared under Win10
-I have been told my PSU might be too weak/old (it's 8 years old and due for a replacement, but I can't test a more powerful one as I don't have one)
-It's not the motherboard/BIOS, as it happens on all of them
-I tried reducing the clock speeds (through Radeon Software, not the BIOS) in case they were the problem (could flashing a BIOS with lower clock speeds help?)
-Other GPUs (R9 270) work flawlessly

Also, I'm not sure about this, but around the time it started (I can't recall whether it was exactly then, as it was too long ago) I had one or two power outages. I can't imagine that would've damaged anything, but it's worth mentioning.

I've thought about reflowing it (even though that's just a temporary fix), but considering it works "just fine" once I get into Windows, I don't want to risk breaking it and having to spend insane amounts of money on a new GPU.
 
Not sure if this applies to your situation, but my RX 580 failed about 6 months ago: it would run for about 20 minutes and then freeze. I pulled it out of the closet yesterday and started researching what I could do to repair it. I guessed the failure was intermittent and heat-related, and read about the issues with surface-mounted components. I felt this could be my problem and decided to heat up the card to do a reflow. The idea is that surface-mounted chips have balls of solder connecting the chip's contacts to the solder pads on the board. Over time, as the board expands and shrinks from heating and cooling, these contacts can break or become intermittent.

So I removed the fan and heatsink from the main board, cleaned the heat transfer grease off the GPU, and got out my trusty paint-stripping heat gun and digital infrared thermometer. I slowly heated the main board to about 180 °C (note: Celsius). Once the board was heated, I focused on the GPU chip and the VRAM chips and kept them at about 190-200 °C for about 5 minutes. This should soften the solder enough for the chips to settle into it. Then I let the board cool completely, applied new heat transfer grease, and reassembled.

I didn't have much hope for this, but it actually worked! Make sure you don't overheat it: if you liquefy the solder, parts will start blowing off your board. There are tons of videos of this technique on YouTube.