[SOLVED] Help needed with undervolting 1 of 2 2080ti's

WaffleToasted

Distinguished
Sep 20, 2016
26
3
18,535
Hello everyone,

I need some help undervolting a GPU in my current work setup. I have read many things about it and watched quite a few videos on how to do it, but what I notice is that a lot of people just say: drag this, see if it works, and voila, you're running cooler and done! However, nobody really explains some of the things I wonder about when undervolting, which in turn causes me to fail miserably.

So first off, my current rig:
I run 2x 2080 Ti (not in SLI) and use this rig mainly under load for rendering with Redshift for Maya (which mainly uses the GPU).
One of the two cards runs very hot while doing this (86C, when 89C is the max according to Nvidia).
I looked around for options because the airflow in my case is pretty good, but the two cards are just too close together (no blower types were available anywhere, and I had very few options for GPUs given my work situation). I found that undervolting might be my best option here, so that I do not void any warranty on the case itself by opening it.

I installed MSI Afterburner and noted down (as many videos suggest) the max clock speed and voltages while rendering (I found that rendering is more demanding on the cards than a benchmark).
But I am unsure where to go from there.

I have tried simply tweaking the temp limit down to 79C and with that lowering the power limit but that has no effect whatsoever.

I see people evening out the curve in MSI Afterburner, but they never really mention what makes them decide on a certain point in the graph, and I hardly see anyone use the max values that some ask you to note down at the start.
I tried imitating a curve someone else had set up for his 2x 2080 Ti's, but my render program crashes (possibly because the voltage is too low?) with a message that it is possibly due to a GPU crash. If I set everything back to default, it works perfectly again.
Someone told me to perhaps increase the voltage to see if that fixes these crashes, but do I do this in the graph or in the main window of MSI Afterburner (i.e. the Core Voltage % slider)?

I have so many questions and I would really like to try and get it to work.
How and why do I decide which point in the graph to even out from, and how do I decide the frequency at which I even the line out?
What do I do when it is unstable, i.e. my render crashes and will not run? Do I increase the voltage like someone mentioned, and where can I do that?
Do I also apply the undervolt to the bottom GPU?

As for the stats;
From what I saw on Nvidia's website regarding the 2080 Ti: base clock is 1350 MHz, boost is 1545 MHz and the memory clock is 1750 MHz (14k effective).

What I measured as maximums while rendering on the top GPU is the following:
Max GPU core voltage: 1.063 V
Max GPU power: 18.3664 W
(perhaps not relevant but just to throw it in)
Max GPU clock: 1950 MHz
Max GPU memory clock: 6.870 MHz

A screenshot of current (standard) curve as well as MSI afterburner settings can be found here (yes temp limit is still turned on);
View: https://imgur.com/a/HA7JgoE


Thank you very much for reading this wall of text and possibly helping me out with this.
 
It is mostly trial and error... I won't be able to tell you an exact number as every GPU is manufactured in different conditions and it'll OC or undervolt differently.

My method would be this... monitor your average GPU core clock and try getting the lowest possible voltage with the average frequency you were getting on stock.

You can start at, let's say, 1V... click on the corresponding 1V point on the graph and increase the core frequency until the point reaches the desired frequency (1900 MHz for example)... bring all the points after it (higher voltage values) to the same level (frequency).

After doing that... stress test the GPU to see if it's stable. If it is, just lower the voltage one step further (to 0.993V or whatever the next lower voltage step is) and keep the same frequency at the lowered voltage. You lower it until it becomes unstable... when it does you just revert to the last stable voltage.

If you undervolt too much at first and it's unstable... just increase the voltage in steps at the same frequency (don't forget to level the points after it) until it becomes stable.
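
To make the "level the points" step concrete, here is a rough Python sketch of what flattening the curve does to the voltage/frequency points. The point values are made up for illustration and are not read from any real card, and `flatten_curve` is a hypothetical helper, not something Afterburner exposes. It also assumes that points below the chosen voltage should never sit above the target frequency.

```python
from typing import List, Tuple

Point = Tuple[float, int]  # (voltage in V, frequency in MHz)


def flatten_curve(curve: List[Point], cap_voltage: float, target_mhz: int) -> List[Point]:
    """Return a new curve where the point at cap_voltage and every point at a
    higher voltage sits at target_mhz. Points at lower voltages are kept,
    but clamped so they never exceed target_mhz (an assumption of this sketch)."""
    flattened = []
    for volts, mhz in sorted(curve):
        if volts >= cap_voltage:
            flattened.append((volts, target_mhz))
        else:
            flattened.append((volts, min(mhz, target_mhz)))
    return flattened


# Made-up example points, not measured on any real GPU.
stock = [(0.900, 1750), (0.950, 1815), (1.000, 1860), (1.050, 1935)]
undervolted = flatten_curve(stock, cap_voltage=1.000, target_mhz=1900)
print(undervolted)
```

With a flat line from the chosen point onward, the card has no reason to request a higher voltage, since no higher frequency is available there.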

On my 1080 Ti my max frequency in benchmarks was around 1960-1970 MHz with a similar 1.06V... it would go as low as 1850 MHz at the end of the benchmark. I decided to set the target to the average frequency I was getting in the test... that's about 1910 MHz. I started with the voltage as low as 0.881V and, sure enough, it failed rather quickly. I increased the voltage in steps (0.893, 0.900, 0.911V) until 0.911V@1910 MHz, where the GPU was stable... got a temperature reduction of 9C and 0% performance loss from default.
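
The trial-and-error loop from that 1080 Ti example could be written down roughly like this. It is only a sketch of the bookkeeping: `fake_stress_test` is a stand-in for actually running a benchmark or render at each step, and the assumption that the card needs at least 0.911 V is taken from the example above, not from any general rule.

```python
from typing import Callable, Optional, Sequence


def lowest_stable_voltage(steps: Sequence[float],
                          is_stable: Callable[[float], bool]) -> Optional[float]:
    """Walk the voltage steps from lowest to highest and return the first one
    that passes the stability check, or None if none of them do."""
    for volts in sorted(steps):
        if is_stable(volts):
            return volts
    return None


# Voltage steps taken from the 1080 Ti example above.
steps = [0.881, 0.893, 0.900, 0.911]


# Stand-in for a real stress test: pretend the card needs at least 0.911 V.
def fake_stress_test(volts: float) -> bool:
    return volts >= 0.911


print(lowest_stable_voltage(steps, fake_stress_test))
```

In practice each call to the stability check is a full stress run at the target frequency, so keeping the list of steps short saves a lot of time.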

So... stress test the GPU to see what the average GPU core frequency is (not the MAX)... that's your target (you don't want to lose performance compared to stock). As a reference point, you should start the undervolt at 0.9-1V... you can start really low (0.9V) and work your way up until the GPU is stable, or start at a higher voltage (1V) and work your way down until it gets unstable.
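
If you would rather compute the average than eyeball it, one option (an assumption here, nobody in this thread used it) is to log the core clock with the driver's `nvidia-smi` tool, for example `nvidia-smi --query-gpu=index,clocks.gr --format=csv,noheader,nounits -l 1`, and average the samples afterwards. The small Python sketch below only does the averaging part; the log text is made up and `average_core_clock` is a hypothetical helper.

```python
import csv
import io
from statistics import mean


def average_core_clock(csv_text: str, gpu_index: int) -> float:
    """Average the graphics clock (MHz) over all samples for one GPU.
    Each line is expected to look like: "<gpu index>, <clock in MHz>"."""
    reader = csv.reader(io.StringIO(csv_text), skipinitialspace=True)
    clocks = [float(row[1]) for row in reader if row and int(row[0]) == gpu_index]
    return mean(clocks)


# Made-up samples for two GPUs, one line per GPU per second.
sample_log = """0, 1950
1, 1905
0, 1800
1, 1890
0, 1650
"""

print(average_core_clock(sample_log, gpu_index=0))
```

The average for GPU 0 in this made-up log comes out well below its 1950 MHz peak, which is why the average and not the max is the target to aim for.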
 
Instead of messing with the power draw, try lowering the max GPU clock to below 1700 MHz. 1950 MHz is pretty high for a stock boost clock, and simply lowering that clock would produce less heat. It's safer than messing with your voltages.

Also install GPU-Z to get a finer detailed look at your GPU's sensors and power draw.
 

WaffleToasted


Would that be adjusting the slider for Memory Clock?
I will give that a render test tomorrow morning.

So... stress test the GPU to see what's the average GPU core frequency(not MAX)... that's your target(you don't want to lose performance compared to stock). As a reference point, you should start with the undervolt at 0.9-1V... you can start really low(0.9V) and work your way up until the GPU is stable or start at a higher voltage(1V) and work your way down until it gets unstable.

I was using HardwareInfo for that
If I look at the average stats, they read as follows:
GPU core voltage, average: 0.956 V
GPU power, average: 130.121 W

Again not sure if relevant:
GPU clock, average: 1.512,7 MHz
GPU memory clock, average: 5.387 MHz


Thank you both for replying.
 

No, that's adjusting the slider for the 'core clock'. The memory clock is a different physical component on the board and you don't need to mess with that to lower temps. If anything that'll lower performance significantly. Simply dropping the core clock will suffice in lowering temps.

BUT BEFORE DOING ANY OF THIS.
What model cards are they? You never specified, but you did say they weren't blower cards. I also see that you have the fan curve on AUTO. You should ramp the fans to 85% or higher and see if your cards run cooler, at the expense of noise if you don't mind that. I run my 2080 Ti Gaming X Trio card at 1980 MHz on 85% fan speed and it stays around 64-70C. Albeit I don't have two next to each other.
 

WaffleToasted


Sorry for not mentioning it earlier; Gigabyte Aorus GeForce RTX 2080 Ti Xtreme 11G.
I can give the fan speed a go, too bad you can't make a curve for that function.

Again many thanks.
It will take about 12 hours before I can test this out (eu timezones) but I will get back to you with the results.
 

No problem. Also, there is a fan curve you can set for those cards, in MSI Afterburner especially. See that little cog wheel by the fan speed slider? Click that and it will take you to the fan curve graph, where you can adjust it. Test out what temps you get at what speed during your test runs. And if the fans don't keep it cool enough, then you can resort to doing what either I or @ChumP said.
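
For anyone wondering what a fan curve actually does with the points you place, here is a minimal Python sketch. It assumes the fan speed is linearly interpolated between neighbouring points and held flat outside them, and that every point has a distinct temperature. The curve values are made up and `fan_speed` is a hypothetical helper, not an Afterburner function.

```python
from typing import List, Tuple


def fan_speed(curve: List[Tuple[float, float]], temp_c: float) -> float:
    """Linearly interpolate the fan speed (%) for a temperature (C).
    Below the first point and above the last point the speed is held flat."""
    points = sorted(curve)
    if temp_c <= points[0][0]:
        return points[0][1]
    if temp_c >= points[-1][0]:
        return points[-1][1]
    for (t0, s0), (t1, s1) in zip(points, points[1:]):
        if t0 <= temp_c <= t1:
            return s0 + (s1 - s0) * (temp_c - t0) / (t1 - t0)
    raise ValueError("unreachable for a non-empty curve")


# Made-up aggressive curve: (temperature in C, fan speed in %).
aggressive = [(30, 30), (50, 50), (65, 85), (75, 100)]
print(fan_speed(aggressive, 60))
```

A curve like this already reaches 85% fan speed at 65C, so the fans ramp up well before the card gets near its temperature limit.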
 

WaffleToasted


Thank you very much, will do
 

I was referring to the average GPU core clock that you're getting in the benchmark... you can use the OSD from Afterburner to monitor that (an approximation... because you are monitoring the frequency yourself).
 

WaffleToasted


Hi, I did some testing with both the fan speed curve adjustments and with lowering the core clock maximum to 1500 as well as 1250.
Despite all that, the card's temperature is not going down. I ran a more complex render scene for about 15 minutes and it sat at 88C on average, and this stayed the same whether or not I adjusted the GPU core clock maximum. The fans were hitting 100% because of the temps and were set to run faster at lower temps during the build-up, but it made no difference.

I was referring to the average GPU core clock that you're getting in the benchmark... you can use the OSD from Afterburner to monitor that (an approximation... because you are monitoring the frequency yourself).

If I run the Heaven benchmark, I average 1698 MHz for the GPU core clock. Voltages average around 0.930V.

I will try and see if I can get the first method to work, I still have some job related things to do before I can try this out.

You can start at, let's say, 1V... click on the corresponding 1V point on the graph and increase the core frequency until the point reaches the desired frequency (1900 MHz for example)... bring all the points after it (higher voltage values) to the same level (frequency).

After doing that... stress test the GPU to see if it's stable. If it is, just lower the voltage one step further (to 0.993V or whatever the next lower voltage step is) and keep the same frequency at the lowered voltage. You lower it until it becomes unstable... when it does you just revert to the last stable voltage.

I have a question regarding this, because I am unsure what specifically you mean in terms of what I drag up or down.
So at first you drag the 1V point up to 1900 MHz and even out everything to the right of it.

If that is stable, you drag the next point to the left of 1V (0.993V) up to match the previous 1900 MHz?

Just being sure, since English is not my main language.

Edit:
I did find that if I lower the temp limit all the way down to 71C, my temps average around 81C on the exact same render scene I was running yesterday.

Yesterday, without the temp limit lowered to that number, that scene was hitting 90C max and 88C on average.

I am currently testing how this affects render times on that same scene.
 

Yeah... that's it. English is not my first language either... that's why you were confused by my explanation.

I'd aim for something around 1750 MHz with the cards... 1698 is quite low. At 1750 I'm pretty sure you can start from 0.9V stable... you might be able to get it as low as 0.85V@1750. Follow the same methodology, but start at 0.9V and go down from there.
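
As a rough feel for why a lower voltage helps so much: the usual first-order rule for the switching (dynamic) power of a chip is that it scales with voltage squared times frequency. The Python sketch below applies that rule to the numbers in this thread (1.063V and 1950 MHz were the measured maximums, 0.9V@1750 is the suggested starting point). It ignores leakage, memory and fan power, and the two maximums were not necessarily hit at the same moment, so treat the result as a ballpark only.

```python
def relative_dynamic_power(v_new: float, f_new: float,
                           v_old: float, f_old: float) -> float:
    """First-order estimate: dynamic power scales with V^2 * f.
    Returns the new power as a fraction of the old power."""
    return (v_new / v_old) ** 2 * (f_new / f_old)


# 0.9 V @ 1750 MHz compared with the measured 1.063 V @ 1950 MHz maximums.
ratio = relative_dynamic_power(0.900, 1750, 1.063, 1950)
print(f"{ratio:.2f}")
```

Under these assumptions the core's dynamic power lands at roughly 0.64 of the stock peak, with most of that saving coming from the voltage term rather than the frequency term.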
 
Solution

WaffleToasted

Thank you for explaining.
It would also seem that not too much performance is lost by adjusting the temp limit (power limit). So far renders have been stable and the difference is 30-50 seconds (and these are 8K renders). I work with stills and thus don't usually render animations, which makes it acceptable.

I still want to test with yesterday's scene, a 3,54 hour long 8K render that made the GPU hit 90C at times and 88C on average, and see how it responds with the current setup.

At the moment I am running some other renders that are hovering around 80C without much increase in render time and no crashes from the render engine itself. A large improvement, seeing as they previously averaged around 86-87C.

Thank you all for responding and explaining it, that was certainly a huge help.