Question RTX 3090 extreme thermal throttling after tear down for cleaning and reapplying thermal paste

Apr 28, 2023
4
0
10
Hey! Today I bought some thermal pads and a new thermal paste with the intention of opening up my EVGA RTX 3090 FTW3, cleaning it, replacing the “thermal foam” they use in two places with thermal pads, and also replacing the thermal paste.

I’ve done that successfully, but now when I’m gaming my GPU is thermal throttling hard, dropping to the range of 1400-1600 MHz with the fans blowing as hard as they can. My GPU core temperature, though, hardly goes higher than 68°C and the memory temperatures are similar. But what I’ve noticed is that the Hotspot temperature is shooting up towards 105°C. I’m guessing this isn’t normal. When I boot the game, it holds my undervolted clock target of 1800 MHz, but the hotspot temperature creeps up, dragging the fan speed towards 100% with it. Then, when it hits around 104°C, the GPU starts slowly but surely decreasing clock speed and HWM indicates a thermal PerfCap.

Does anyone have any idea what I might’ve done wrong? I left most of the original thermal padding, and I guess it can’t be the new thermal paste because the GPU core temperature is stable at 68°C?
I’m completely lost, and until I realized thermal throttling was happening, I was stressing because I thought I had damaged something. Maybe I did…
 

Phaaze88

Titan
Ambassador
Does anyone have any idea what I might’ve done wrong?
When this happens:
"GPU core temperature, though, hardly goes higher than 68°C... But what I’ve noticed is that the Hotspot temperature is shooting up towards 105°C."

It usually means you:
A) Either went too thick (mm) or too hard (hinted at by the W/mK rating) on the new pads.
Try thinner versions of the same pads. How thin? Start 0.5 mm thinner than the ones you used, and if that doesn't work, go another 0.5 mm thinner.

B) Used paste that was too runny. Some pastes that work well on the big ol' CPU IHS kinda just roll off the smoother bare die, leaving uncovered spots.
You're gonna have to take it apart and take another look underneath. If it looks good, then back to A.
Thermal Grizzly Hydronaut, Noctua NT-H2, and Arctic MX-4 are a few of the recommended pastes for bare-die applications. Of course there are others, but those are the ones I remember off the top of my head.


It's not recommended to do a repaste and a pad change at the same time, because if something does go wrong, like right now, the cause becomes harder to pinpoint.
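The symptom quoted above (a modest core temperature with a much hotter hot spot) comes down to a simple difference between two readings. Here is a minimal Python sketch of that arithmetic using the 68°C and 105°C figures from the original post. The function names and the `max_delta_c` threshold of 20°C are hypothetical, chosen only for illustration; they are not a vendor specification.

```python
def hotspot_delta(core_c: float, hotspot_c: float) -> float:
    """Difference between the hot spot reading and the core reading, in °C."""
    return hotspot_c - core_c


def looks_like_contact_problem(core_c: float, hotspot_c: float,
                               max_delta_c: float = 20.0) -> bool:
    """Flag a suspiciously large core-to-hotspot gap.

    max_delta_c is an assumed, illustrative threshold, not an official limit.
    """
    return hotspot_delta(core_c, hotspot_c) > max_delta_c


# Readings reported in the original post.
delta = hotspot_delta(68, 105)
print(delta)                                  # 37
print(looks_like_contact_problem(68, 105))    # True
```

A gap that large with a normal core temperature is consistent with the explanation given in the reply: part of the die is not being cooled properly, whether because of the pads or the thermal interface material.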
 
Apr 28, 2023
Right, thank you! I appreciate the reply. I see I'm gonna have to buy new thermal pads. It's alright, their cost is less painful than the stress I felt yesterday.

In the original post, I mentioned thermal paste because I was rushing to type everything out and get an answer as soon as possible, and I forgot to mention that it's actually liquid metal that I've used: Thermal Grizzly Conductonaut. I've applied a protective coating on the little capacitors (I believe they're capacitors, I might be recalling the wrong name) that are "inside" the square of the chip. The first time I got the temperature issue, I opened the GPU to check whether there had been a spill, but I found no such thing. Maybe I applied too little? I've done this before, years ago on a 1080, but I got "scared" recently by reports of cards damaged by liquid metal spills, so I tried to put on as little as possible. I think I left a bit "off" at the edges to avoid spilling as much as I could. Could these small "places" be the hot spot? This is a diagram of the parts I believe I covered with liquid metal:

The red area is supposed to be where I spread the liquid metal

I feel like this is a "juvenile" question, but it feels like it could be a possibility.

From my sensors, I really have no clue how to pinpoint where the hot spot is. Under sustained load, at 99% utilization, the temperatures are as shown in the picture:
I took this screenshot as soon as I saw any thermal throttling. As you can see, the fans were running at almost 100%.

Then I applied my custom voltage curve, targeting around the same core clock but with almost 100 mV less.
These were the results; once again, I took the screenshot as soon as I saw thermal throttling:

These are all the temperature sensors I have available through GPU-Z. I can't understand what the issue could be. From what I've read online, both memory and core temperatures look fine; I just don't understand where the hotspot is coming from.
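Instead of grabbing a screenshot the moment throttling shows up, the same moment can be found in a logged sensor file. Below is a rough Python sketch, assuming the readings have been exported to CSV. The column names (`clock_mhz`, `core_c`, `hotspot_c`) are hypothetical and would need to be swapped for whatever headers the real log uses, and the sample rows are invented for illustration; they are not the readings from the screenshots.

```python
import csv
import io


def first_throttle_sample(rows, clock_key, hotspot_key,
                          target_mhz, hotspot_limit_c):
    """Return the index of the first row where the clock is below the
    target while the hot spot is at or above the limit, else None."""
    for i, row in enumerate(rows):
        clock = float(row[clock_key])
        hotspot = float(row[hotspot_key])
        if clock < target_mhz and hotspot >= hotspot_limit_c:
            return i
    return None


# Invented sample data with hypothetical column names (illustration only).
SAMPLE_LOG = """clock_mhz,core_c,hotspot_c
1800,60,88
1800,65,97
1800,67,103
1755,68,104
1600,68,105
"""

rows = list(csv.DictReader(io.StringIO(SAMPLE_LOG)))
idx = first_throttle_sample(rows, "clock_mhz", "hotspot_c",
                            target_mhz=1800, hotspot_limit_c=104)
print(idx)  # 3
```

The 1800 MHz target and the 104°C onset are taken from the description in the first post; everything else in the sample is made up to show the shape of the check.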
 

Phaaze88

Titan
Ambassador
The hot spot reading can come from any of the many sensors on the GPU die, so 'which hot spot' is anyone's guess.
From my experience using liquid metal, which has only been on direct-die CPUs, it should be painted onto BOTH surfaces: the die and the cooler cold plate. Just the die wasn't enough. Of course, I didn't paint the entire cold plate, just the general area where the die should contact it.
Then there's the matter of the new pads you used. Part of the die may not be contacting the cold plate at all, because the pads don't compress enough.

Like I said, you made it harder by doing both at the same time. Now you have to pick one step, make adjustments, and if that wasn't the solution, move on to the other step and make changes there.
 
Solution
Apr 28, 2023
Thank you immensely, I appreciate you taking the time to help me out. On Monday I’ll have everything I need to do a proper search for the cause. I'll post an update so this thread is closed, whether I fix it or not.
 
Apr 28, 2023
Well, here I am with the update. Turns out my mistake was not applying enough LM. I was so afraid of causing a spill that I didn't even put on enough for a normal use case. Temps are back to normal. Mem temps may be a bit high, which I guess is from the new pads, but since they're not crazy, I'm living with them. I appreciate the help immensely. Stay safe, guys!