PC Restarting When GPU Is Under Load/Reaches ~70 degrees

Feb 7, 2019
8
0
10
0
Hello,

Recently, I've been having a rather annoying problem with my PC: it has been restarting randomly. I decided to dig deeper, ran a number of benchmarks (Prime95 and Heaven Benchmark 4.0), and pulled a couple of GPU logs from GPU-Z. My conclusion is that the cause is either heat or power related. The PC would restart once the GPU had been at full load for a couple of minutes and reached 67-70ish degrees. I also tried turning the GPU fans off and running Heaven Benchmark at the lowest settings, to see whether it crashes at high temps without much load on the graphics card (it was at <55% load the whole time), but once it reached 75 C it restarted once again. I'm trying to figure out what could be the cause of the issue: the GPU? The mobo? The PSU? Sadly, I have no means of swapping out parts at the moment to find the culprit. The best I can do is take the GPU to a local shop, which should have a test bench available.

Here are my specs:
(Note: The CPU is overclocked to 3.8 GHz instead of the stock 3.5 GHz; everything else is stock.)

Motherboard: Gigabyte GA-Z170X-Gaming 3

CPU: Intel i5-6600K

GPU: GTX 980TI SC+

RAM: 2x8GB DDR4 @ 2133 MHz

PSU: EVGA SuperNOVA 850 G2 (850W), 80+ gold



P.S.: The whole system is pretty much 3 years old, if that helps in any way.


Any help will be appreciated since I'm in quite a pickle.
Thank you in advance!


UPDATE:

Just did a FurMark bench and it ran at 100% load, 80 C, for 10 minutes without ANY issue...

Proof:
https://imgur.com/xj6nwLA

Valley screencap:
https://imgur.com/a/yVOix2Y

Having taken the picture, I noticed something odd: GPU-Z's logs all show a core clock of 1316.3 MHz, whereas FurMark shows 1303 MHz while running, and Valley and Heaven (both Unigine benches) show a 1493 MHz core clock (ridiculously over stock, and I haven't OC'd the card either). Is GPU-Z wrong, or are Unigine/FurMark showing wrong readings of the core clock?
I suppose it's worth mentioning that the PC also crashes while playing Quake Champions (just the lobby screen is enough), For Honor and Trackmania Stadium. That's three games on completely different engines, yet the crash persists in all of them. CS:GO is the only exception. Any ideas?

 
I would download HWInfo and run it.
I would look at the PSU voltages and temps.
The voltages should be +12 V, +5 V and +3.3 V, each within +/- 5%.
The temps should be roughly CPU below 70C and GPU below 80-85C.
If they are....I might open up another app and start loading the system.....all while keeping an eye on the voltages and temps. I think this may point you in the right direction.
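
For reference, here is a minimal sketch of what that +/- 5% window works out to on each rail. The function names are made up for illustration and the 5% figure is simply the tolerance quoted above; it is not read from any tool.

```python
# Nominal rail voltages and tolerance as quoted in this post.
NOMINAL_RAILS = (12.0, 5.0, 3.3)
TOLERANCE = 0.05  # +/- 5%


def rail_limits(nominal, tolerance=TOLERANCE):
    """Return the (low, high) voltage limits for a rail, rounded to 3 decimals."""
    low = round(nominal * (1 - tolerance), 3)
    high = round(nominal * (1 + tolerance), 3)
    return low, high


def in_spec(nominal, measured, tolerance=TOLERANCE):
    """True if a measured voltage falls inside the tolerance window."""
    low, high = rail_limits(nominal, tolerance)
    return low <= measured <= high


for nominal in NOMINAL_RAILS:
    low, high = rail_limits(nominal)
    print(f"+{nominal:g} V rail: {low:.3f} V to {high:.3f} V")
```

So roughly 11.4 to 12.6 V on the +12, 4.75 to 5.25 V on the +5, and 3.135 to 3.465 V on the +3.3. A reading that dips outside those windows right before a restart would be worth a closer look.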

Of course....you have to do all this before it crashes.

However....if this is difficult.....you can set up HWInfo to log the data to a file.

That way....you would have all the data right up to the crash....which can be very valuable.
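
Once you have a log file, something like the sketch below can pull out the last few samples before the crash. It assumes the log is a plain CSV with one header row and one row per sample; the column names you pass in (for example "+12V" or "GPU Temperature") are guesses and should be matched against whatever headers your own log actually contains.

```python
import csv
from collections import deque


def last_samples(csv_file, wanted, count=10):
    """Return the last `count` complete rows of a sensor log.

    Only columns whose header contains one of the `wanted` substrings
    (case-insensitive) are kept. `csv_file` is any open text file object.
    """
    reader = csv.reader(csv_file)
    header = next(reader)
    keep = [
        i for i, name in enumerate(header)
        if any(w.lower() in name.lower() for w in wanted)
    ]
    if not keep:
        return []

    tail = deque(maxlen=count)  # holds only the newest `count` rows
    for row in reader:
        # A crash can leave the final line half-written; skip rows that
        # are too short to contain every requested column.
        if len(row) <= max(keep):
            continue
        tail.append({header[i]: row[i] for i in keep})
    return list(tail)
```

You would open the log with something like `open("hwinfo_log.csv", newline="", encoding="latin-1")` (the file name and encoding are assumptions) and pass the file object in. Looking at only the final rows keeps the focus on what the voltages and temps were doing in the seconds before the restart.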
 


Alright, I did that but I see no abnormalities. I have included the results:



The first screencap, the one with the red rectangle, shows the voltages at idle, and the graphs show the progression while running Heaven Benchmark on max. The GPU temp was as stated above: it reached 75 C and the PC crashed. The +5 V rail is stable throughout.

P.S.: Sorry for the weird graphs but that was the only way I could get the info out of the report from HWiNFO.
 
That's excellent.
At least it's indicating it's not your PSU.
So do you really think it's the 75C that's causing it?
Because I don't think a GPU should crash at 75C.
....and if it is....I think there's something wrong with the GPU.
Can you run on another GPU or integrated graphics just to see what happens?
 


Yeah, the 980 Ti should run up to 91 C if memory serves. I'm suspecting a faulty PCIe slot on the motherboard, which might lead to the motherboard prematurely overheating/shutting off the PC. And to answer your question, I can't get my hands on another GPU to test as of now.
I really hope it's not the GPU but by the looks of things it might be the fucky part.
 
I have run Prime95 for ~20 min on the heat- and power-intensive tests; the CPU got to 48-50 C at 100% load with absolutely no problems.
 

Rataan

Maybe the thermal paste on the GPU has gone bad? That might explain why it looks like an overheating issue but the temperatures look normal. Maybe try underclocking the GPU and see if it lasts longer before the next crash.
 
The temperature sensor is internal to the GPU.
Nothing with the paste would make the sensor misread.
It could be a bad sensor, but I tend to doubt it.
I would bet on a bad card over a bad temp sensor any day.
 


Pretty much the same story. I set the core clock to -90 (the max Afterburner would allow) but it still crashed at the usual 74 degree mark, at pretty much the same spot in the benchmark as well (~2 min in).
 


I set the GPU fans to start from ~80% power and climb until 85 C. It only got to 65 C this time, for around 2 minutes, and then it crashed. This leads me to believe it's either a load or a power issue, as the temp didn't even reach 70 C.
 


Yep, pretty much around the 30-40 range; it didn't get past 50 while torture testing in Prime95, so I have totally stopped suspecting it.
Currently, I'm mostly expecting it to be a faulty GPU or a bad PCIe slot. I'll try to get to a test bench some time next week where I'll test the GPU and get a definitive answer and update the thread.
 
