Question Thermal limit reached afterburner

Dec 23, 2020
Hello,
I got an MSI RTX 3080 Suprim X graphics card, installed the beta version of MSI Afterburner, and maxed out my power limit and temp limit (116% and 91°C).

In games and benchmarks I reach 58°C max, but Afterburner still shows sudden temp limit events (0 to 1) on the temp limit graph.
I double-checked with log files from HWiNFO64 and GPU-Z. The temps never spike, but both tools report that I reached the temp limit at the same moment Afterburner shows temp limit 1.
I thought I had fixed it by going into an Afterburner folder on my C: drive and manually entering 91°C next to the thermal limit of each profile, but that didn't do it.
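A rough sketch of how this kind of log cross-check could be scripted instead of done by eye. The column names (`time`, `gpu_temp_c`, `thermal_limit`) and the sample rows are made up for illustration; real GPU-Z and HWiNFO64 logs use their own headers, so they would need to be adjusted.

```python
import csv
import io

# Hypothetical column names; real GPU-Z / HWiNFO64 logs use different headers.
TIME_COL = "time"
TEMP_COL = "gpu_temp_c"
FLAG_COL = "thermal_limit"

def unexplained_thermal_flags(log_file, temp_limit_c):
    """Return (time, temp) for rows where the thermal-limit flag is set
    even though the logged temperature is below the configured limit."""
    suspicious = []
    for row in csv.DictReader(log_file):
        temp = float(row[TEMP_COL])
        flagged = row[FLAG_COL].strip() == "1"
        if flagged and temp < temp_limit_c:
            suspicious.append((row[TIME_COL], temp))
    return suspicious

# Made-up sample data, for illustration only.
SAMPLE_LOG = """time,gpu_temp_c,thermal_limit
12:00:01,57.0,0
12:00:02,58.0,1
12:00:03,56.5,0
"""

flags = unexplained_thermal_flags(io.StringIO(SAMPLE_LOG), temp_limit_c=91.0)
print(flags)
```

With a real log, `io.StringIO(SAMPLE_LOG)` would be replaced by an opened log file.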

I never see any throttling and everything plays smoothly, so I am not sure why three programs report a temp limit when not a single temperature value goes above my temperature limit...

Is this a faulty temp sensor, or is there something else I could check?
Thanks in advance!
 


I tried recreating the temp limit events.
I had 3 thermal events in GPU-Z, 1 in HWiNFO and 2 in Afterburner.

I used the Valley benchmark tool and walked around manually to get some load changes, looking at the sky or at the trees.

The 2 temp limit events in Afterburner happened when I closed the Valley benchmark. During the test I never hit the temp limit, but I did see a lot of power/load/voltage limits going up and down, which was my intent.

Maybe the events happen when a capacitor cannot discharge fast enough, giving a temp limit spike? I am shooting blind here, since I am not an electrician and have no knowledge of how these things work electrically...

I was hoping to get to a point where I could watch the temp limit spike happen and build further on that, but I only got 2 in Afterburner, both when I closed the Valley benchmark...

Maybe you have an idea on where to go from here? I have a Corsair RM1000x PSU and cannot test your theory about single/double rail supply to the GPU.
I have 3 separate PCIe cables running from the PSU to the GPU to get the needed wattage. I learned from the internet that one cable's maximum is 150 W, so to reach my 425 W I needed 3 separate cables. (I think I am correct here?)
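The cable arithmetic above can be sketched as follows. This assumes the usual ratings of 150 W per 8-pin PCIe power connector and up to 75 W through the x16 slot. The function name is made up for illustration, and actual draw per connector depends on the card.

```python
import math

PCIE_8PIN_WATTS = 150  # rated limit per 8-pin PCIe power connector
PCIE_SLOT_WATTS = 75   # rated limit for the x16 slot itself

def connectors_needed(board_power_watts, count_slot=True):
    """Number of 8-pin connectors needed to stay within rated limits."""
    slot_share = PCIE_SLOT_WATTS if count_slot else 0
    remaining = board_power_watts - slot_share
    return max(0, math.ceil(remaining / PCIE_8PIN_WATTS))

with_slot = connectors_needed(425)             # slot supplies up to 75 W
without_slot = connectors_needed(425, False)   # connectors carry everything
print(with_slot, without_slot)
```

Either way the result for 425 W is three connectors, which matches the three-cable setup described.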

Edit: I also did very little research on my PSU and just connected the PCIe cables at random locations. Do I need to look into this further, or does it not matter which available PCIe socket on my PSU I use?
 
There's only 1 temp strip on video cards, right in the graphics processor. So that's the only temp you get to read: the chip itself.

You maxed out the power limits, and that includes bumping voltage and amperage usage. That power, whether used or not, goes through the VRMs, VRAM and all the other components on the card.

So you only see the GPU temp, not the VRM temp, which is the limit the card is reaching when the speeds are boosted by GPU Boost 3.
 
I can see the Tjunction memory temperature in HWiNFO. Is this not the same as the VRM temp you speak of?
 
No. Tjunction is the temp at the surface of the die, which is measurable only by engineers using extraordinary means. It's pretty much a useless temp in most cases, since the core temp is generally slightly higher than the temp of the surface mated to the heatsink.

It's basically skin temperature vs. basal temperature, with skin temperature varying with ambient temperature.

It's only really useful when measured against itself: if it keeps rising under a constant load after a decent amount of time, you have an airflow issue or similar.
 
So if it is the VRM temperature that is causing this temp limit spike, there should be more temp limit spikes the longer the card runs under load?
But with all my testing and log files I can confirm that this is not the case. I have a custom-made wooden case with 3x 140mm fans at the front and 3x 120mm fans at the back, which gives me a max GPU temperature of 58°C when running +125 core and +1500 memory (hours of benchmarking to find my max potential).
(+1500 memory gave me the same fps as +800 memory, but I believe this is due to the self-correcting memory of the 3000 series.)
The only times I see a temp limit spike are when the card makes a sudden load change, say from 100% to 0%: when I close my game and return to the Windows desktop, or when a game goes into a cutscene that requires less GPU load.
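A minimal sketch of how the "spikes only on sudden load drops" observation could be checked against logged samples. The sample data and the 50-point drop threshold are assumptions for illustration, not measurements from this card.

```python
def flags_after_load_drop(samples, drop_threshold=50.0):
    """samples: list of (gpu_load_percent, thermal_flag) in time order.
    Returns the indices where the flag is set right after the load fell
    by at least drop_threshold percentage points."""
    hits = []
    for i in range(1, len(samples)):
        prev_load, _ = samples[i - 1]
        load, flag = samples[i]
        if flag and (prev_load - load) >= drop_threshold:
            hits.append(i)
    return hits

# Made-up samples: one flag right after a 100% -> 3% drop, one at steady load.
samples = [(99.0, 0), (100.0, 0), (3.0, 1), (2.0, 0), (98.0, 0), (97.0, 1)]
hits = flags_after_load_drop(samples)
print(hits)
```

If nearly every flagged sample in a real log shows up in this list, that supports the load-change explanation over a genuine overheating event.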

It is strange to see these temp limit spikes. Perhaps you are right and it is because the NVIDIA GPU Boost kicks in too fast or too strongly, but that still does not explain why some people are getting these spikes and some are not.

Every PC is different because of the components inside, build quality, and so on. But seeing a temp limit spike without the core clock dropping, massive stutter, fps drops or anything else that happens when a real temp limit is reached just doesn't feel normal.

But we already concluded that a previous NVIDIA driver makes these temp limit spikes in Afterburner go away...

I have overclocked my 7700K processor to 5 GHz and used the XMP profile to take my RAM from 2400 MHz to 3200 MHz on my ASUS Prime Z270-K motherboard. Maybe my system is becoming a little too old to provide every component with its desired overclock, so whenever a huge load change hits the whole system it can trigger a temp limit in Afterburner... Like I said, I am running low on options for removing the temp limit with the newest GPU drivers.

Hopefully NVIDIA comes out with a new driver, and maybe that will indeed fix the temp limit spikes.

But thank you for your explanation of VRM temp. I believed it was the same as Tjunction, but now I know 😀 Just FYI, my Tjunction temp reached 69°C max during those benchmarks.
 
Try dropping power limits. I OC'd my 660 Ti to 124%. The Samsung VRAM had a max of 8k, but mine only reached 7930 stable. I used an old version of Asus gpu and it would hit power limits of 125%; I maxed out at 114%, with no additional voltage added. What was weird was the numbers. I'd be stable up to 109% PL, but 110%-113% was unstable and 115%-125% was unstable. Just that 1 spot at 114% was good. If I added voltage: unstable.

Power is V x A. So when voltage stays pretty much the same, amperage can go through the roof with a power spike. This might not be a physical thing but an algorithm: the card reads the power limits and decides it's too much for certain components.
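To put rough numbers on the V x A relationship: at a fixed voltage, current scales directly with power. The sketch below uses the 12 V supply side and the 425 W figure mentioned earlier in the thread. The 320 W comparison value is only an illustrative assumption.

```python
def current_amps(power_watts, voltage_volts):
    """Rearranged P = V * I: current drawn at a given power and voltage."""
    return power_watts / voltage_volts

SUPPLY_VOLTS = 12.0  # PCIe power cables deliver 12 V

for watts in (320.0, 425.0):  # 320 W is an illustrative comparison value
    amps = current_amps(watts, SUPPLY_VOLTS)
    print(f"{watts:.0f} W at {SUPPLY_VOLTS:.0f} V -> {amps:.1f} A")
```

On the GPU core side the voltage is far lower (roughly 1 V), so the same power means a far higher current through the VRMs, which is why a power spike stresses those components.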
 
Thank you for the reply and the thorough test on the video card.
What we might suggest is to roll back the video card driver to version 456.38, the one you do not have a problem with. We will continue to test driver 457.xx and higher to see if we can duplicate the issue and report it to NVIDIA. We will let you know if there is any update.
Because every system setup and configuration is different, we cannot really recommend a video card driver that is best for you. If you don't have a problem using driver 456.38, then you should stick with it. What we can be sure of is that the video card driver you download from the official NVIDIA website will support the GeForce RTX 3080 SUPRIM X 10G.


This is the response I got after opening a new support ticket with MSI. Finally a normal answer and not some "have a good life" thing...


Hopefully they are telling the truth and will test this themselves and report it to NVIDIA.

Yesterday I flashed my BIOS with the newest version I could find. My previous one dated from October 2020 and the newer one is from December 2020.
But I still get temp limit spikes, so the only possible cause I can think of is that the problem lies with NVIDIA and their driver.

I tried this with 100% load, 116% load, undervolting, overclocking, and every other thing in my power... I have tried everything I can think of, so I guess we just have to wait for a new NVIDIA driver :)
 
I also just finished talking to NVIDIA support. They are going to check this with their team, so hopefully they will release a driver without these temp limit spikes.

If these temp limit spikes keep happening, I am going to try to RMA my card.
If there is indeed something wrong with the capacitors or anything else electrical that reduces the lifespan of some components, I would rather have a new card than see it break after 1 year, 2 years, ...

If anybody here has another solution, I would be happy to try it out. I think I have already tried everything to remove the temp limit spikes...
 
Just wanted to chip in and say that I have been seeing the same temp limit spikes in MSI Afterburner with my ASUS TUF Gaming OC 3090. View: https://www.reddit.com/r/sffpc/comments/loywmn/temps_for_my_3090_in_the_ncase_m1_timespy_scores/


I even tried re-pasting, partly to see if I could improve temperatures on my new card and partly to see if these temp spikes would go away. I'd incorrectly assumed that they were triggered by GPU hotspot temps (which HWiNFO now lets us check). But since the spikes happen at load changes, it appears to be driver related.

@coene. Why not wait for new drivers from NVIDIA instead of RMA-ing? That seems extreme for a driver-related problem.
 
Thank you for your input!
About the RMA: I am definitely going to wait for NVIDIA to hopefully clear up this problem in a new driver. I really like my card, and I know it's a good one thanks to my benchmarks and gaming performance. Plus, we now know it is driver related :)

I also sent my info to Alternate, since I had asked them about this temp limit, and they will also contact NVIDIA about it.

So we have MSI and Alternate, who are going to test this and talk with NVIDIA, and I spoke to NVIDIA myself and they will check it.
This should speed things up a bit, I hope :)

But thank you again for your information!