[SOLVED] GPU dies under load, no signal to monitor but PC remains on.

Aug 21, 2019
The graphics card in question is an MSI GTX 970 Gaming, bought used and listed as for parts / not working. The seller said the card had occasional problems POSTing and would restart the PC after a certain time (1-30 minutes). Now that the card is in my system, I'm having an entirely different problem, and I could use some help coming up with ideas as to what the cause may be. As the title suggests, the GPU dies under load. The fans and MSI logo LED remain on, as does my PC. A restart is required to get a signal to the monitor again. On one occasion I was able to get a signal back by unplugging the HDMI cable from the card. In non-demanding operation I don't have any issues with the graphics card.

Doom has been my testing ground for the card, as it is such a well-optimised game. On the BIOS that came with the card, 84.04.1F.00.F1 (MSINV316MH.125), and at stock clocks, Doom was fine in the menus but would stop working once a mission had loaded. With the largest possible underclock on both core and memory, I was able to play for about 1 minute before the GPU died. With an updated BIOS (MSINV316MH.186), Doom would stop working immediately once in the menus at stock clocks. When underclocked, I managed 3 minutes before the GPU failed. FPS was perfectly fine and there were no artefacts on screen during testing. During my testing I noticed that the voltage wouldn't go beyond 1.006 V.
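
Since the card drops out without warning, a log written to disk every second would at least show the last clocks, power draw and temperature before the signal is lost. Below is a minimal sketch of that idea, assuming nvidia-smi is on the PATH and the card is GPU 0. The function names and the log path are made up for illustration. As far as I know, core voltage is not one of the nvidia-smi query fields, so it would still have to be read from a separate monitoring tool.

```python
import subprocess
import time

# Standard nvidia-smi --query-gpu field names.
FIELDS = ["timestamp", "clocks.sm", "clocks.mem", "power.draw", "temperature.gpu"]


def parse_sample(line):
    """Split one CSV line from nvidia-smi into a dict keyed by field name."""
    values = [v.strip() for v in line.strip().split(",")]
    if len(values) != len(FIELDS):
        raise ValueError("unexpected number of columns: %r" % line)
    return dict(zip(FIELDS, values))


def log_samples(path, interval_s=1.0):
    """Append one sample per interval to `path`, flushing after every line.

    Runs until interrupted; the last lines in the file show the state of the
    card just before it stopped responding.
    """
    cmd = [
        "nvidia-smi",
        "--query-gpu=" + ",".join(FIELDS),
        "--format=csv,noheader,nounits",
        "--id=0",
    ]
    with open(path, "a") as log:
        while True:
            result = subprocess.run(cmd, capture_output=True, text=True)
            line = result.stdout.strip() or "nvidia-smi failed: " + result.stderr.strip()
            log.write(line + "\n")
            log.flush()
            time.sleep(interval_s)
```

Calling `log_samples("gpu_log.csv")` in a terminal before starting Doom and reading the tail of the file after the crash would be the intended use; `parse_sample` turns a logged line back into named values.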

I think that this is where the issue lies. To remedy the low voltage, I created a custom BIOS with Maxwell BIOS Tweaker and forced a constant 1.24 V. Setting all sliders to 1.225 V resulted in 1.24 V. Doing that has now caused both core and memory clocks to stay at their lowest possible speeds at all times. This is with Prefer Maximum Performance selected in Nvidia Control Panel.
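
One way to narrow down why the clocks are pinned at their lowest state is to ask the driver which throttle reasons are active. The sketch below assumes nvidia-smi is on the PATH, the card is GPU 0, and the driver actually reports `clocks_throttle_reasons.active` for this card (it may return N/A, in which case the decode raises ValueError). The bit values follow NVML's nvmlClocksThrottleReason constants; the short reason names and function names are my own.

```python
import subprocess

# Bit values from NVML's nvmlClocksThrottleReason* constants.
THROTTLE_REASONS = {
    0x001: "gpu_idle",
    0x002: "applications_clocks_setting",
    0x004: "sw_power_cap",
    0x008: "hw_slowdown",
    0x010: "sync_boost",
    0x020: "sw_thermal_slowdown",
    0x040: "hw_thermal_slowdown",
    0x080: "hw_power_brake_slowdown",
    0x100: "display_clock_setting",
}


def decode_throttle_mask(mask_text):
    """Turn a hex bitmask such as '0x0000000000000004' into a list of reason names."""
    mask = int(mask_text.strip(), 16)
    return [name for bit, name in sorted(THROTTLE_REASONS.items()) if mask & bit]


def query_throttle_mask():
    """Ask nvidia-smi for the active throttle bitmask of GPU 0 (returns the raw text)."""
    result = subprocess.run(
        [
            "nvidia-smi",
            "--query-gpu=clocks_throttle_reasons.active",
            "--format=csv,noheader",
            "--id=0",
        ],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip()
```

Running `decode_throttle_mask(query_throttle_mask())` while a game is loaded would show whether the driver thinks it is holding the card back for power, thermal or hardware slowdown reasons, or whether it reports nothing at all.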

I have asked a graphics card repair channel on YouTube what they think the problem is, and they believe a reflow is in order. Before doing that, I would like to test components on the PCB with a multimeter. If anyone with experience in graphics card repair can chime in on likely problem areas, please do.

The PSU I'm using is pretty bad: a CX750M V1. Despite it being trash, I am close to certain it is not causing this. The same goes for any software issues on Windows. I do think this is a hardware issue on the graphics card.
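
Ruling the PSU out could be backed up with a multimeter reading on the 12 V PCIe power connector while the card is under load. The sketch below only does the arithmetic: it checks a measured value against a ±5% band, which is the tolerance the ATX specification gives for the 12 V rail (11.4 V to 12.6 V). The function name and the sample readings are made up for illustration and are not measurements from this system.

```python
def within_tolerance(measured_v, nominal_v=12.0, tolerance=0.05):
    """Return True if measured_v is within +/- tolerance (as a fraction) of nominal_v."""
    return abs(measured_v - nominal_v) <= nominal_v * tolerance


# Hypothetical readings, not taken from the card in this post.
for reading in (12.1, 11.8, 11.2):
    status = "ok" if within_tolerance(reading) else "out of spec"
    print("%.2f V: %s" % (reading, status))
```

A reading that sags out of the band only when the game loads would point at the PSU or cabling rather than the card.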

Images of the PCB can be found here: https://www.techpowerup.com/gpu-specs/msi-gtx-970-gaming.b3077
Labelled PCB: http://www.modders-inc.com/wp-content/uploads/image//2014/09/labeled.jpg

Card is: V316 VER 1.1, S/N 602-V316-03SB