Question Safe GPU temps for a 3080

Status
Not open for further replies.
Aug 27, 2023
9
0
20
Hi!
My 3080 crashes after staying for a few minutes near its stock thermal limit (83°C).
Here are the maximum temps it reached right before crashing:
  • GPU temp: 80.9°C
  • GPU junction temp: 92.0°C
  • GPU hotspot: 93.9°C

On a normal card, would these be considered safe? Is my card defective?
 

turtletarget111

Honorable
Dec 24, 2018
300
149
10,890
Typically, most Nvidia graphics cards can operate at or below 90°C perfectly fine. Most cards will begin to thermal throttle once they get above 90°C. The RTX 3080, specifically, will thermally throttle when it goes above 93°C. Given that you are right at the temperature ceiling, I would consider increasing the fan speeds on your case or video card. You can use MSI Afterburner to adjust the fan curve of your graphics card, and adjust the curves of your other fans in your BIOS. Your card doesn't appear to be defective, but you should work to drop the temps as much as you can. Lower temps are always better. Hope this helped, take care.
 

Phaaze88

Titan
Ambassador
Those are safe, yes.
Defective? I don't know - can't tell with what's in front of me right now. Please post more of the PC's specs:
-CPU
-motherboard
-case
-PSU (important: include make, model, and how long it's been in use)
 
Aug 27, 2023
9
0
20
I've been using the card for one year.
It crashes around these temps in my computer:
  • Ryzen 5 7600X
  • Asus B650E motherboard
  • 32 GB of DDR5 at 6000 MHz
  • An LHR Gainward RTX 3080 “Phoenix”
  • An 850 W Corsair RMx PSU
My PSU is 7 years old now. All my case fans are 140 mm Noctuas set above 1000 rpm.
I had no issues with the card a few months ago with way lower case fan speeds and the same ambient temp.
The case, fans and card are clean.

I'm able to reproduce the crash in another computer, with a different PSU, with every component almost brand new, in a Fractal Torrent case with the big fans set to stock speeds.

YouTube video of the crash at 2x speed


What fan settings do you have on this GPU, when was it last dusted, and how are the case fans positioned for airflow?
The fan settings are stock and the airflow is good. I could probably increase the card's fan curve and make it more stable. What bothers me is that I used to use it for hours in worse conditions, and it was not black screening like this.
 
Aug 27, 2023
9
0
20
Video is set to private.


What if you set the card's power limit to 90% in Afterburner?
My bad, it should be unlisted now.
Yeah, that would probably help, just like making a custom fan curve.
Considering I bought it at 2x MSRP and it's still under warranty, I'd like to get a card that works as expected instead of fixing it myself :(

What bothers me is that it used to work perfectly in worse conditions before.
It's not reaching super high temps in my opinion, and it crashes instead of thermal throttling.

The question is: can the card be considered defective? I really want to say yes and get a replacement, but I'm biased.
 

Phaaze88

Titan
Ambassador
I'm not having any luck seeing the video. Keeps going to private.


The question is: can the card be considered defective? I really want to say yes and get a replacement, but I'm biased.
If your mind's already made up, it's gonna be difficult for anyone to convince you the gpu isn't defective; you already tried the gpu in another PC, and it was still crashing.
 
Aug 27, 2023
9
0
20
My bad for the video. It works in a private window for me now.

Thanks for taking the time to look into this btw

Agreed, this is probably not the right way to frame the question.
A better one would be:

Would a card in working order be able to function in the conditions that crash my current card?

And/or:

Is there proof of 3080s working at temps higher than the ones that crash my current card? (81°C GPU, 92°C junction, 94°C hotspot)
 
First of all, I've never seen a review of a 3080 breaking 80°C, though I wasn't able to find any reviews of that specific model, only of Palit cards (Palit is the OEM for Gainward). The cooler on the one you have seems decent enough, so is the room temp really high?

What are the memory temps looking like?

If your room temp isn't particularly high and you're certain that airflow is good, then you may have had a defective card the entire time.
 
Aug 27, 2023
9
0
20
Room temp should be 26 ± 2°C.
Memory temps in the crashes I logged are between 89 and 98°C.

What I still can't figure out is why it was fine for a year (without air conditioning last summer) and can't run a game for more than twenty minutes now...

Could it be just thermal paste degradation? Did I fry something in the card by keeping it too hot while playing Diablo (unlikely?)?
 

WallysWorld

Honorable
Aug 28, 2019
13
3
10,515
I have a Gigabyte RTX 3080 Ti and I haven't seen it go over 77°C under heavy gaming, but I've also undervolted it to reduce power draw and temperature. I would try that and see if it helps. There are plenty of online videos showing how to do so, and I think it would be worth the effort.

How to undervolt a RTX 3080
 

Phaaze88

Titan
Ambassador
Video works for me now. I had about given up on it.
What happens in the latter half of your video is something discussed in this GN video:
View: https://www.youtube.com/watch?v=wnRyyCsuHFQ


The suspects are:
-psu
-motherboard
-gpu, to be more specific, the voltage regulator
-if cable extensions are involved, they too are suspect
With what you've tried so far, I'm leaning towards #3, the GPU's voltage regulator. Beyond lowering the power limit, the only alternative for this one is a new card.


The undervolt guides are a roundabout method when adjusting the power limit and core clock sliders does the same thing, without having to go into Curve Editor.
 
What you need to understand is that the temps you're seeing now are very bad for a temperature-controlled room. If they were worse before, the card being damaged is a possibility. I'd like to think it may just be thermal pad degradation, but there's no way to be certain without pulling the card apart to find out, which would require repasting and likely replacing the pads.
 
Aug 27, 2023
9
0
20
We are talking about maximum mem junction temps at equilibrium under maximum GPU load, right?
If it was out of warranty, I would have gladly pulled it apart to swap the pads or even zip-tied a fan on top of the integrated ones to increase airflow.

What I don't understand is, if the temps are to blame for the issue, which seems like a likely answer, why is "Performance Limit - Thermal [Yes/No]" always "No" in the HWiNFO logs I have right before and during the crashes? Is the reporting unreliable, or is the card just not thermal throttling? None of the performance limiters seem to trip, except the "Power" one.
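One way to check which limiters actually tripped is to count them across the whole log instead of eyeballing it. Below is a minimal Python sketch, assuming the log is a HWiNFO CSV export whose limiter columns all start with "Performance Limit - " and hold Yes/No values. Only the Thermal column name is confirmed above; the Power column name, the function name `tripped_limiters`, and the sample rows are placeholders for illustration, not real log data.

```python
import csv
import io

# Prefix taken from the column name quoted above; other limiter
# columns are assumed to follow the same naming pattern.
LIMITER_PREFIX = "Performance Limit - "


def tripped_limiters(log_file):
    """Return {limiter column: number of rows where it was active}."""
    counts = {}
    for row in csv.DictReader(log_file):
        for column, value in row.items():
            # Skip overflow fields (key None) and non-limiter columns.
            if column is None or not column.startswith(LIMITER_PREFIX):
                continue
            counts.setdefault(column, 0)
            # Assumption: active limiters are logged as "Yes" (or "1").
            if (value or "").strip().lower() in ("yes", "1"):
                counts[column] += 1
    return counts


# Placeholder rows for illustration only, not taken from a real log.
sample = io.StringIO(
    "Time,Performance Limit - Power [Yes/No],"
    "Performance Limit - Thermal [Yes/No]\n"
    "20:01:00,Yes,No\n"
    "20:01:02,Yes,No\n"
)
for column, hits in tripped_limiters(sample).items():
    print(f"{column}: active in {hits} row(s)")
```

With a real log you would pass an opened file instead of the `io.StringIO` sample; depending on how the log was saved, `open` may need an explicit `encoding` argument.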


Ahah, good to see a GN video here. Steve is a national treasure for all PC enthusiasts :)
I agree with your observation.

I was able to use the card yesterday by pinning the fans at 75% and power limiting it to 80%.
With this config, the main temp sensor maxes out in the low 70s, but the mem junction still ended up around 90°C.
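For a rough idea of what those slider percentages mean in watts, here is a small Python sketch. It assumes a default power limit of 320 W, which is Nvidia's reference figure for the RTX 3080 10 GB; this particular card's default may differ, and `power_limit_watts` is just an illustrative helper name.

```python
# Assumption: reference RTX 3080 board power, not read from this card.
ASSUMED_DEFAULT_LIMIT_W = 320.0


def power_limit_watts(percent, default_limit_w=ASSUMED_DEFAULT_LIMIT_W):
    """Convert a power-limit slider percentage into an absolute wattage."""
    if percent <= 0:
        raise ValueError("power limit percentage must be positive")
    return default_limit_w * percent / 100.0


for percent in (100, 90, 80):
    print(f"{percent}% -> {power_limit_watts(percent):.0f} W")
```

Under that assumption, the 80% setting caps the card at roughly 256 W and the 90% setting suggested earlier at roughly 288 W.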

By the way, the link in your description "Geforce 10 and Up: Just Power Limit the Things" is broken. I'd be interested to read about the tradeoff between power limiting and undervolting if I end up being forced to do one or the other
 

Phaaze88

Titan
Ambassador
mem junction still ended up around 90°C
That is still OK, as long as those are spikes and it's not sitting up there. The temperature limit for GDDR6X is 105°C.
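To put the logged memory temps in perspective against that limit, a quick Python sketch of the remaining headroom. The 105°C figure and the readings are the ones quoted in this thread; the `headroom` helper is only an illustrative name.

```python
GDDR6X_LIMIT_C = 105.0  # limit quoted in this thread


def headroom(reading_c, limit_c=GDDR6X_LIMIT_C):
    """Degrees Celsius left before the limit (negative means over it)."""
    return limit_c - reading_c


# Memory junction readings reported earlier in the thread.
readings = [
    ("lowest logged at a crash", 89.0),
    ("highest logged at a crash", 98.0),
    ("power-limited run", 90.0),
]
for label, reading in readings:
    print(f"{label}: {reading:.0f} C, {headroom(reading):.0f} C below the limit")
```

Even the worst logged crash reading leaves a few degrees of margin, which fits the point that memory temperature alone doesn't explain the black screens.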

By the way, the link in your description "Geforce 10 and Up: Just Power Limit the Things" is broken. I'd be interested to read about the tradeoff between power limiting and undervolting if I end up being forced to do one or the other
I'll check with a moderator, thanks.
 
I'm talking about all of your temps being higher than they should be right now; if they were likely worse previously, that's where the issue could have arisen. High temps can degrade the pads that cover the voltage regulation, and there are no temperature sensors for those (at least none that can be polled).

If dropping the peak power consumption resolves things, then I'd point the finger at a cooling issue, most likely the pads. The only two causes I've seen when someone has reached that point are damaged voltage regulation or pads that need replacing. That doesn't mean those are the only potential causes by any stretch, just the most likely ones.
 
Aug 27, 2023
9
0
20
I have an update to close this thread if anyone ends up with a similar issue :)

This problem ended up as a 2-month-long warranty dispute between Amazon, the third-party GPU seller, and me.
I contacted the manufacturer's (Gainward) support for information. Here is a quote from their take on temperature:
Regarding the temperature, we can tell you that this increased temperature is caused by the GDDR6X graphics memory modules, as the fast memory operations of the modules generate a lot of heat.

These graphics memory modules are specified for up to 105 degrees Celsius, but up to 120 degrees Celsius is also possible in some cases.

However, this leads to a load on the graphics chip.

You can counteract this by protecting the graphics card so that dust cannot get in, since dust can limit heat dissipation.

Further steps would require opening the graphics card, but this is not possible as it would void the warranty.
Which is exactly what Phaaze88 said!

At stock settings, the 3080 I had was turboing to an unstable state, causing it to crash.
The seller tried to claim that the GPU had been manually overclocked in order to deny the warranty claim...
Here is a quote from Gainward on cards turboing with GPU Boost:

We can say that the tool GPU Boost can overclock a not modified graphics card, which is used with an official Nvidia driver, if it has power, voltage and thermal headroom.

Due to the fact that the card uses the entire power budget allocated to it, the power draw numbers of the card will be higher than advertised TBP or TGP numbers.

Amazon accepted the warranty claim, and I was able to return the card for a refund.
I bought a 4080 with the money. For a short while, I had both a 3080 and a 4080.
I haven't run into any issues with the 4080, even after running Furmark for a while.

At stock settings, in a room at 25°C with the same case fan speeds:

3080 Gainward Phoenix:
  • Idle temp: ~52°C
  • FurMark maximum main temp: ~84°C, and it was black screening
  • Temp delta between main temp and hotspot under load: ~15°C

4080 PNY Verto RGB:
  • Idle temp: ~38°C
  • FurMark maximum main temp: ~73°C
  • Temp delta between main temp and hotspot under load: ~7°C

Please note that this is not a fair comparison: the 4080 is massive compared to the 3080 and takes 3 slots instead of 2. Still, the differences lend credit to the hypothesis that the 3080 had a cooling issue.
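The numbers above can be laid side by side with a short Python sketch. It uses only the approximate readings listed in this post; the implied hotspot is simply main temp plus delta, and the `CardReading` class is an illustrative name.

```python
from dataclasses import dataclass


@dataclass
class CardReading:
    name: str
    idle_c: float           # idle main temp
    load_c: float           # FurMark maximum main temp
    hotspot_delta_c: float  # hotspot minus main temp under load

    @property
    def hotspot_c(self):
        """Implied hotspot temp under load (main temp + delta)."""
        return self.load_c + self.hotspot_delta_c


# Approximate values from the lists above.
cards = [
    CardReading("3080 Gainward Phoenix", 52, 84, 15),
    CardReading("4080 PNY Verto RGB", 38, 73, 7),
]
for card in cards:
    print(
        f"{card.name}: implied hotspot ~{card.hotspot_c:.0f} C, "
        f"idle-to-load rise ~{card.load_c - card.idle_c:.0f} C"
    )
```

The idle-to-load rise is similar on both cards, while the 3080's hotspot delta is about twice the 4080's, which is the kind of gap you'd expect from uneven cooler contact.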

My takeaways:
  • Don't ever buy stuff from scalpers; they'll try to screw you over and leave you with a defective product if they can
  • Power limiting RTX 30- and 40-series GPUs is a great way to make them run cooler without sacrificing much performance
 