Question RX580 Rapid Heat Climb

Feb 22, 2019
7
0
10
So, I have already RMA'd my XFX RX580 once, as the first video card literally caught fire and violently threw sparks out of my system. I put the old 280 back in, and everything passed every possible stress test, no problem.
Got my replacement RX580, and it has an intermittent issue with no apparent pattern: the monitor shuts off (goes into sleep mode), the keyboard no longer responds (numlock, etc.), and the system is effectively shut down, yet the tower still has power, the LEDs are on, and the fans are spinning.
Reset and power buttons are completely unresponsive when this happens. I have to flip the switch to my power supply in the back, to get it to boot back up.
This only happens when I run the power test on OCCT (and not every time), and when I am playing a video game (e.g. Dragon's Dogma, Dragon Ball Xenoverse 2).
While monitoring OCCT readings, I noticed that the video card's temp, in windowed mode, would steadily climb to a peak of about 72 Celsius, and hold that.
In full screen mode, the GPU sensor reading rapidly jumps from idle to 78, and the system shuts down if I allow it to reach what it reads as 80.
If I exit out of full screen, it immediately reads the GPU temperature about 30 degrees cooler (Celsius).
Running a separate GPU burn-out test (Furmark and OCCT), this issue does not happen, full screen or not.
CPU passes the tests.
Memory passes.
Voltage holds fairly steady.
O/S: Win10 64bit
Motherboard: ASUS TUF SABERTOOTH 990FX
PSU: CORSAIR HX Series HX1050
RAM: CORSAIR Vengeance 16GB (2 x 8GB) 240-Pin DDR3
CPU: AMD FX-8350 Black Edition
GPU: XFX Radeon RX 580 GTS Black Edition
HDD: Mushkin Enhanced Reactor 2.5" 2TB (Solid State)
The video card and SSD I bought brand new a few months ago.
Everything else, I bought new around August of 2014.
 
Feb 22, 2019
That is what confuses me - why does it only happen on the OCCT power test, running GPU & CPU, and only in full screen? As well as during video gaming?
I can run separate benchmarks and burnouts, and the video card runs like a champ.
With those thermal sensors rapidly bouncing around, that does make me lean towards the GPU being faulty - trying to be fussy, though, since I don't care for waiting on Newegg's return service, again (it is not fast).
 
Feb 22, 2019
So, get this - after repeated testing, taking the entire PC apart, cleaning it, changing which PCIe slot the card is plugged into on the motherboard, making sure all fans work, improving my jumbled cable management (significantly), and plugging a modular PCIe 6+2 pin cable into my GPU, I have narrowed down some of the reboot sequences.
Other than in-game, I can reproduce it in OCCT - and ONLY OCCT; no other benchmark or stress test will crash it.
With a GPU Shader Complexity between 1 and 5, it runs indefinitely without error.
Shader complexity of 6, it instantly reboots the system.
Shader complexity of 7, it may last a while, or it may shut down in a few minutes; when it does so, the reset/power buttons on the case are unresponsive, and I have to flick the power supply's switch.

So, I have definitely narrowed it down to a bad GPU - just really weird the way it is behaving.
 
Feb 22, 2019
Could it be a driver fault? I have already used DDU and safe mode to do a fresh install, so I doubt it. Just hoping to avoid having to send the video card out, again (even though I full well know the product is faulty, again).
Not used to getting bad parts with the XFX brand on them :(
 
Feb 22, 2019
Had one of the resets (still testing) - BIOS was reading my 12V rail as only giving out 9.8V.
So, is the video card truly bad, or is my Corsair HX1050 going out?
Hrmmmm.....
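For reference, the ATX spec allows roughly +/-5% on the +12V rail, so anything between 11.4V and 12.6V counts as in range. A quick sketch of that check is below; the function names are just made up for illustration, and keep in mind that BIOS/software voltage sensors can misreport, so a multimeter reading would be the more trustworthy input.

```python
# Minimal sketch: check a +12V reading against the ATX +/-5% tolerance.
# Function names are hypothetical; the 9.8 value is the BIOS reading from this post.

NOMINAL_12V = 12.0   # nominal rail voltage
TOLERANCE = 0.05     # ATX allows about +/-5% on the +12V rail


def rail_in_spec(measured_volts, nominal=NOMINAL_12V, tolerance=TOLERANCE):
    """Return True if the measured voltage is within nominal +/- tolerance."""
    low = nominal * (1 - tolerance)
    high = nominal * (1 + tolerance)
    return low <= measured_volts <= high


def deviation_percent(measured_volts, nominal=NOMINAL_12V):
    """Return how far the reading is from nominal, as a signed percentage."""
    return (measured_volts - nominal) / nominal * 100


reading = 9.8
print("in spec:", rail_in_spec(reading))
print("deviation: %.1f%%" % deviation_percent(reading))
```

If the 9.8V reading is real, that is roughly 18% under nominal, well outside the allowed range, which would point at the PSU (or the board's sensor) rather than the card.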
 
Feb 22, 2019
No dice, since any form of voltage manipulation counts towards voiding the warranty.
And the HX1050 gives me breathing room - the first PC I ever built was across 2009-2010 - bought two of the Radeon HD4870X2s, the first dual GPU unit I believe, right when they came out. Then tinkered and fucked with that build for a while.

One of my brothers may have a spare PSU with sufficient power, so if he does I'll borrow it to rule out PSU failure.

Otherwise, think I'll just play around on either my Xbox One or PS4 (I work thirteen-hour shifts as a diemaker, so I'm not flush with free time, anyway) while I send the HX1050 in (still covered under warranty, as well).
 
"Undervolting voids warranty"
That's absolute BS. While it's easier to tell if a consumer fries their CPU/GPU with too much voltage on an OC, undervolting doesn't harm anything. Heck, AMD gives you all the tools you need built right into their driver software (WattMan).
 
Feb 22, 2019
It isn't about them being able to tell if I messed with it.
It's an honor system. The same reason I have gone back to a gas station several hours after I stopped there, got home, and then found out that one of my daughters had snagged a twenty-five-cent piece of gum.
Gotta do the right thing, even at my own inconvenience and loss :)
 
Going to doubt the PSU but it never hurts to test.