Question Graphics card crashes under load.

Aug 14, 2020
Okay, so this might be a long one, but I'm honestly at my wits' end.


So, my birthday was recently and a group of online friends came down for it. One of them gifted me his older PC to upgrade me from my, well, modest laptop to an actual tower PC.


Here's the specs:

OS: Windows 10 Home 64bit

Motherboard: MSI X99S SLI PLUS


CPU: Intel Core i7-5820K @ 3.30 GHz (I've not yet dabbled in overclocking the CPU)


GPU: Zotac GTX 980 Ti AMP! Extreme


RAM: 16GB Corsair Vengeance @ 2133 MHz (2x8GB)


Storage: 250GB Samsung SSD 860 EVO

931GB Western Digital WD My Passport (this is an external drive)


Monitors: Iiyama ProLite E2283HS

AOC G2770PF



Okay, now I'll preface this by saying that this issue does not seem to happen in every single game I've played. I've played a host of games, from the original DOOM through Elite: Dangerous. The games it crashes in most frequently are Overwatch, Elite: Dangerous, Grand Theft Auto V (this only seemed to happen on custom FiveM servers) and, the worst offender by far, Warframe.

I have actually been using Warframe as my testing realm for this issue as it will consistently crash in one particular area, Fortuna.


Everything will be running smoothly in whatever I'm doing when suddenly, with absolutely no prior warning, the screen will turn a seemingly random block colour and the graphics drivers will reset. My second monitor goes black too, regardless of whether anything other than my Windows background was being displayed on it.


I've attempted the likes of driver rollbacks, DDU, updating to the latest drivers, reseating everything, checking the leads from my PSU to my components to ensure they're fully in their slots, underclocking (both the core clock and memory clock), running with only one stick of RAM (worth a try I guess? hahah), resetting the CMOS, running on only one monitor, disabling the NVIDIA ShadowPlay overlay, disabling the Steam overlay, even using regedit to change the value of the Timeout Detection and Recovery registry key and, the most obvious of them all, lowering in-game settings in every game I've attempted to play. There's quite possibly a thing or two I've forgotten in that list.
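For anyone retracing the registry step: the Timeout Detection and Recovery settings live under HKLM\SYSTEM\CurrentControlSet\Control\GraphicsDrivers, and Microsoft documents values named TdrLevel, TdrDelay and TdrDdiDelay there. Which of them was changed above isn't stated, so the sketch below (Python, standard library only) just reads back whichever ones are explicitly set. `read_tdr_settings` is a made-up helper name, and on anything other than Windows it simply returns an empty dict.

```python
import sys

# Registry location of the TDR settings, as documented by Microsoft.
TDR_KEY = r"SYSTEM\CurrentControlSet\Control\GraphicsDrivers"
TDR_VALUES = ("TdrLevel", "TdrDelay", "TdrDdiDelay")


def read_tdr_settings():
    """Return the TDR values that are explicitly set in the registry.

    Hypothetical helper for illustration. Values that are absent are not
    returned; Windows uses its built-in defaults for those.
    """
    if sys.platform != "win32":
        return {}

    import winreg  # Windows-only module, so imported lazily

    found = {}
    try:
        with winreg.OpenKey(winreg.HKEY_LOCAL_MACHINE, TDR_KEY) as key:
            for name in TDR_VALUES:
                try:
                    value, _value_type = winreg.QueryValueEx(key, name)
                    found[name] = value
                except FileNotFoundError:
                    # Value not set: Windows falls back to its default.
                    pass
    except FileNotFoundError:
        # Key itself missing (not expected on a normal install).
        pass
    return found


if __name__ == "__main__":
    settings = read_tdr_settings()
    for name in TDR_VALUES:
        print(name, "=", settings.get(name, "not set (Windows default applies)"))
```

A value that doesn't show up is simply unset, in which case Windows uses its default (2 seconds for TdrDelay, per Microsoft's documentation). Raising the delay gives a slow driver more time before the watchdog steps in, but it is unlikely to help if the card itself is hanging.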


This setup seems to have worked perfectly fine for my friend. I recall that when we would play together he was using all of these parts, and never once did he have issues like the ones I'm having.


As for trying to figure it out myself? Well, I'm fairly certain it's not the temps, as I've been keeping a damn close eye on them in MSI Afterburner. The highest temperature I've seen was 79 °C. I also keep Speccy at hand, and no other parts report any worrying temperatures.
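Watching Afterburner by eye makes it easy to miss the last second before a driver reset, so one option is to log readings to a file and look at the final lines afterwards. Below is a rough Python sketch that wraps `nvidia-smi` (installed with the NVIDIA driver) and its `--query-gpu` CSV output. The helper names and the `gpu_log.csv` path are made up for illustration, and some cards report `[N/A]` for fields such as power draw, which the parser turns into `None`.

```python
import subprocess

# Fields requested from nvidia-smi; the order matters for parsing.
FIELDS = ["timestamp", "temperature.gpu", "clocks.gr", "clocks.mem", "power.draw"]


def build_command(interval_s=1):
    """Build an nvidia-smi command that prints one CSV line per interval."""
    return [
        "nvidia-smi",
        "--query-gpu=" + ",".join(FIELDS),
        "--format=csv,noheader,nounits",
        "-l", str(interval_s),
    ]


def parse_sample(line):
    """Turn one CSV line into a dict, or None if the line is malformed."""
    parts = [p.strip() for p in line.split(",")]
    if len(parts) != len(FIELDS):
        return None
    sample = {"timestamp": parts[0]}
    for name, raw in zip(FIELDS[1:], parts[1:]):
        try:
            sample[name] = float(raw)
        except ValueError:
            # e.g. "[N/A]" when the card does not expose this reading
            sample[name] = None
    return sample


def log_until_interrupted(path="gpu_log.csv"):
    """Append samples to a file until nvidia-smi exits or Ctrl+C is pressed."""
    # Line buffering so each sample is handed to the OS as soon as it is written.
    with open(path, "a", buffering=1) as out:
        proc = subprocess.Popen(build_command(), stdout=subprocess.PIPE, text=True)
        try:
            for line in proc.stdout:
                if parse_sample(line) is not None:
                    out.write(line)
        except KeyboardInterrupt:
            pass
        finally:
            proc.terminate()
```

Calling `log_until_interrupted()` in a terminal before launching Warframe and then reading the tail of the file after a crash would show what temperature, clocks and power draw looked like right before the reset.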

I'm wary of (but at this point willing to go as far as) flashing a BIOS onto it. I'm aware of the risks, and if that should go wrong I can plug the GPU into my partner's PC in order to reflash a backup onto it. Ideally, this is a last-resort option for me.


Oh yes, the PSU. It's a Corsair RM850x. I've checked the connections (well, externally; I haven't opened the PSU up, nor do I plan to) on both ends of it and all is fine as far as I can see.

Just to reiterate, this only seems to happen in-game. General desktop usage does not cause this.
It also performs fine (as far as I can tell) under benchmarks. I'm willing to attempt any benchmarks that are suggested, however. So far I've tried Heaven and GTA V's in-game benchmark, and it performs well in both.


So does anybody have any ideas? I'm pretty damn stumped here as you can likely tell. RMAing is off the table as it's out of warranty.

Thanks in advance for reading and any replies!

EDIT: Forgot to add OS
 

Grobe

Distinguished
"...when suddenly, with absolutely no prior warning, the screen will turn a seemingly random block colour and the graphics drivers will reset."
This suggests that the GPU is probably the source of the failure.

In addition to what you've already tried, you can:
  • Use Memtest to rule out problems with the RAM.
  • Use OCCT to check whether the voltage levels stay within limits. This may rule out problems with the PSU / onboard voltage regulators. It will also give you a second set of temperature readings (probably not necessary at this point).
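As a rough guide to what "within limits" means: the ATX specification allows ±5% on the +12V, +5V and +3.3V rails. Here is a small Python sketch that checks a set of readings against that tolerance. The function name and the dict layout are made up for illustration, and you would type in the minimum values OCCT showed under load yourself. Keep in mind that software voltage readings come from motherboard sensors and can be off, so treat a borderline result as a hint rather than proof.

```python
# Nominal voltage and allowed tolerance for the main positive ATX rails.
ATX_RAILS = {
    "+12V": (12.0, 0.05),
    "+5V": (5.0, 0.05),
    "+3.3V": (3.3, 0.05),
}


def out_of_tolerance(readings):
    """Return (rail, volts, low, high) for every reading outside ATX limits.

    `readings` maps a rail name from ATX_RAILS to a measured voltage.
    Hypothetical helper; rails not listed in ATX_RAILS are ignored.
    """
    problems = []
    for rail, volts in readings.items():
        if rail not in ATX_RAILS:
            continue
        nominal, tolerance = ATX_RAILS[rail]
        low = nominal * (1 - tolerance)
        high = nominal * (1 + tolerance)
        if not (low <= volts <= high):
            problems.append((rail, volts, low, high))
    return problems


if __name__ == "__main__":
    # Placeholder numbers only; replace them with your own readings.
    my_readings = {"+12V": 12.0, "+5V": 5.0, "+3.3V": 3.3}
    for rail, volts, low, high in out_of_tolerance(my_readings):
        print(f"{rail}: {volts:.2f} V is outside {low:.2f}-{high:.2f} V")
```

For the +12V rail, which is the one feeding the graphics card, that works out to roughly 11.4 V to 12.6 V.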
I don't think updating the BIOS will help you out. But if you do it anyway, please read which problems the update (if any) is supposed to solve and whether they correspond to the issue you describe here.
 
