News Preliminary GeForce RTX 4060 Ti Specs Leak: 220W and a Short PCB

Jul 7, 2022
And here we have a shining example of "smaller number bad" in practice. Without direct testing of texture compression differences between GA103 and AD106 (reducing bandwidth demand), cache behaviour differences between GA103 and AD106 (relocating read/write bottlenecks to a different interface), and whether GA103 was even memory-bandwidth-bottlenecked in the first place, concluding that a smaller number = worse real-world performance is at best premature.

But of course, internet commentators know better than Nvidia's engineers, and making number bigger means making GPU more betterer (and die area for the extra PHY lanes is just free and has no impact on die cost or power budget).
Edzieba, first, I apologize for the first sentence in my retort to your post. You are right that one product's specs alone are not a good indicator of performance; however, the comparative analysis I describe below can predict performance as long as all relevant variables are accounted for.

Let me break out my math skills to see if I can indeed predict performance based on specs. The 3080 has 760 GB/s of bandwidth and 8704 shader cores running at a 1710 MHz boost clock; the 4080 has 720 GB/s of bandwidth and 9728 shader cores running at a 2505 MHz boost clock. That is -5% bandwidth, +12% shader count, and +46.5% boost clock for the 4080, which gives it a 42% improvement in game frame rate (based on the TechPowerUp average performance relative to other GPUs, i.e. if the 3080 is 100% performance, the 4080 is 142% performance). Using the 4080's clock speed, bandwidth, and shader count values gives us an IPC detriment of -9% for the 4000 series architecture (this makes sense because, just like with the Pascal generation, Nvidia sacrificed some IPC in order to clock their shader cores higher).
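To make that arithmetic explicit, here is a rough sketch of the model: performance is assumed to scale linearly with bandwidth, shader count, and clock, and whatever is left over is attributed to per-clock, per-shader efficiency ("IPC"). The spec numbers are the ones quoted above; the linear-scaling assumption (especially for bandwidth) is exactly the part that is up for debate, so treat this as illustration rather than gospel.

```python
# Naive scaling model: performance ~ bandwidth x shader count x clock.
# The leftover factor between "expected" and "observed" is attributed to IPC.
# Spec numbers are the ones quoted in the post; linearity is an assumption.

rtx3080 = {"bw": 760, "shaders": 8704, "clock": 1710}   # GB/s, cores, MHz
rtx4080 = {"bw": 720, "shaders": 9728, "clock": 2505}

observed_gain = 1.42  # TechPowerUp relative performance, 3080 = 1.00

expected_gain = (
    (rtx4080["bw"] / rtx3080["bw"])
    * (rtx4080["shaders"] / rtx3080["shaders"])
    * (rtx4080["clock"] / rtx3080["clock"])
)

implied_ipc = observed_gain / expected_gain
print(f"expected from specs: {expected_gain:.2f}x, observed: {observed_gain:.2f}x")
print(f"implied IPC factor:  {implied_ipc:.2f} ({(implied_ipc - 1) * 100:+.0f}%)")
# -> about 0.92, i.e. roughly -8%, in the same ballpark as the -9% figure above
```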

Now that we know the IPC of the 4000 series architecture, we can calculate the expected performance of the 4060 Ti vs the 3060 Ti. Starting with the 3060 Ti at 100% performance, we subtract 42% for bandwidth, subtract 11.8% for shader cores, subtract 9% to account for Lovelace's lower IPC compared to Ampere, and then add 50.5% for the higher clock speed (using the 4080 boost clock of 2505 MHz, since we don't have data on the 4060 Ti's clock speed, and 1665 MHz for the 3060 Ti's boost clock). That leaves a performance detriment of -13% for the 4060 Ti compared to the last-gen 3060 Ti. However, if Nvidia does push the 4060 Ti's clock speed higher than the currently released 4000 series cards, it would take a boost clock of 3200 MHz (at the absolute limit of Ada Lovelace's capabilities, i.e. a 92% increase over the 3060 Ti's boost clock) to reach my +10% performance improvement estimate over the 3060 Ti.
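And the same arithmetic for the 4060 Ti estimate, combining the percentage deltas by simple addition as the paragraph above does. The rumoured-spec deltas are taken at face value; combining them multiplicatively instead would give a somewhat lower number.

```python
# 4060 Ti vs 3060 Ti estimate, adding the percentage deltas as described above.
# Deltas come from the rumoured specs and the IPC figure derived earlier.

deltas = {
    "bandwidth": -42.0,   # rumoured narrower bus / lower bandwidth
    "shaders":   -11.8,   # rumoured lower shader count
    "ipc":        -9.0,   # Lovelace vs Ampere, derived above
    "clock":     +50.5,   # 2505 MHz (4080 boost) vs 1665 MHz (3060 Ti boost)
}

relative_perf = 100.0 + sum(deltas.values())
print(f"estimated 4060 Ti vs 3060 Ti: {relative_perf:.1f}%")
# -> ~87.7%, i.e. roughly the -13% detriment quoted above
```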

I have also checked whether smaller Nvidia dies are more performance-efficient than larger dies by running the same comparison on 3060 Ti vs 3080 specs; it shows die size has 0% effect on performance, so I doubt the 4000 series would be any different.

So like everything in life, the proper use of math in scientific applications does allow us to predict performance.

PS: I stupidly put a lot of time into computing all this, so please only respond if you have constructive criticism of my work, a counterpoint, etc.
 

oofdragon

Honorable
Oct 14, 2017
The 3060 really fell short of expectations at just 20% faster than a 2060; it should really have been 50%. The 2070 to 3070.. 50%.. 2080 to 3080.. 50%... 2080 Ti to 3080 Ti.. 50%... 6900 XT to 7900 XTX... 50%... 3080 to 4080... 50%.... Really, that's the whole point of "next gen". The 3060 got slaughtered, and now the 4060 gets slaughtered again? They are really aiming for "let's make people spend MORE and see a reason to buy HIGHER tiers". Shut up and f* up. The 4060's "rightful" place is 3070 performance; less than that and you know they are crippling it on purpose. They could even have made it 3080 tier to justify asking $500 for it, $700 for the 4070, $900 for the 4080 and so on. Nvidia's pricing and performance values are absolute **** this gen. Don't buy it, go AMD.
 

oofdragon

Honorable
Oct 14, 2017
You know there was always this 60s = 1080p, 70s = 1440p, 80s = 4K... now it looks like they're trying to play it like 70s, 80s, 90s... making the 4060 the new 50s, really. The 4050 probably won't even launch at this rate; like, what's the point if it doesn't look like there are any specs left to nerf?
 

Ar558

Proper
Dec 13, 2022
You know there was always this 60s = 1080p, 70s = 1440p, 80s = 4K... now it looks like they're trying to play it like 70s, 80s, 90s... making the 4060 the new 50s, really. The 4050 probably won't even launch at this rate; like, what's the point if it doesn't look like there are any specs left to nerf?

I'm sure there is still some wiggle room. What about a 16-bit bus, 128 KB of VRAM, and 2 CUDA cores for a bargain $250?
 

trance77

Commendable
Aug 28, 2020
Well, the 3060 is a 1080p GPU. Just fine... And the new low end may be $500+

The 3060 Ti is a good 1440p card; I've had one since launch and have been very happy with it. Performance equal to a 2080 Super for £370 was a good deal. Shame Nvidia didn't stick to that formula.
 
Don't hold your breath. The 3050 was a bit of a letdown, I think. Sure it worked, but their issue there was AMD had the RX 6600 for 75-100 less and it performed at least as well or better. I'd say a 4050 today would maybe be about equal to a 3060.
 

edzieba

Distinguished
Jul 13, 2016
Always a good laugh when someone starts their "be reasonable guys" post with a dash of condescension and expects to be taken seriously after.

Specs are a good indicator of performance no matter what, especially when we are comparing a successor product to its predecessor; so pretending people are being idiots for predicting a weak immediate replacement to the 3060Ti based on the rumoured specs is intellectually disingenuous.

Now, I could be wrong and maybe Nvidia engineers have made a leap in, let's say, delta colour compression, and it blows the 3060 Ti out of the water. Yet at this stage, being concerned and upset at these numbers is far from laughable. Let's wait and see; if the 4060 Ti with these specs proves me wrong, I will be here and own up to it.
Specs may aid in comparisons within the same architecture, but once you start comparing across architectures they are of little value.
Edzieba, first, I apologize for the first sentence in my retort to your post. You are right that one product's specs alone are not a good indicator of performance; however, the comparative analysis I describe below can predict performance as long as all relevant variables are accounted for.

Let me break out my math skills to see if I can indeed predict performance based on specs. The 3080 has 760 GB/s of bandwidth and 8704 shader cores running at a 1710 MHz boost clock; the 4080 has 720 GB/s of bandwidth and 9728 shader cores running at a 2505 MHz boost clock. That is -5% bandwidth, +12% shader count, and +46.5% boost clock for the 4080, which gives it a 42% improvement in game frame rate (based on the TechPowerUp average performance relative to other GPUs, i.e. if the 3080 is 100% performance, the 4080 is 142% performance). Using the 4080's clock speed, bandwidth, and shader count values gives us an IPC detriment of -9% for the 4000 series architecture (this makes sense because, just like with the Pascal generation, Nvidia sacrificed some IPC in order to clock their shader cores higher).

Now that we know the IPC of the 4000 series architecture, we can calculate the expected performance of the 4060 Ti vs the 3060 Ti. Starting with the 3060 Ti at 100% performance, we subtract 42% for bandwidth, subtract 11.8% for shader cores, subtract 9% to account for Lovelace's lower IPC compared to Ampere, and then add 50.5% for the higher clock speed (using the 4080 boost clock of 2505 MHz, since we don't have data on the 4060 Ti's clock speed, and 1665 MHz for the 3060 Ti's boost clock). That leaves a performance detriment of -13% for the 4060 Ti compared to the last-gen 3060 Ti. However, if Nvidia does push the 4060 Ti's clock speed higher than the currently released 4000 series cards, it would take a boost clock of 3200 MHz (at the absolute limit of Ada Lovelace's capabilities, i.e. a 92% increase over the 3060 Ti's boost clock) to reach my +10% performance improvement estimate over the 3060 Ti.

I have also checked whether smaller Nvidia dies are more performance-efficient than larger dies by running the same comparison on 3060 Ti vs 3080 specs; it shows die size has 0% effect on performance, so I doubt the 4000 series would be any different.

So like everything in life, the proper use of math in scientific applications does allow us to predict performance.

PS: I stupidly put a lot of time into computing all this, so please only respond if you have constructive criticism of my work, a counterpoint, etc.
If your numbers predict a gen-on-gen real world performance decrease, then one should immediately be questioning those numbers.
Let's compare the 3090 Ti and 4080 to see if that IPC assumption holds up in practice:
                   3090 Ti     4080        delta
core count         10752       9728        90%
boost clock        1.86 GHz    2.52 GHz    135%
memory capacity    24 GB       16 GB       67%
memory bus width   384 bit     256 bit     67%
memory bandwidth   1008 GB/s   717 GB/s    71%
die area           628 mm^2    379 mm^2    60%
We see the 4080 performing roughly 35% above the 3090 Ti (anywhere from 25% to 50% in some outliers), discounting the effects of DLSS 3, so even if we ignore the memory bandwidth disparity, that delta could be accounted for by clock speed alone, with no IPC deficit indicated.

At the very least, this should indicate that a dramatic drop in memory bus width and bandwidth is not an issue at all for the Ada Lovelace architecture when it comes to actual performance.
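As a rough sanity check on that, here is the same naive core-count-times-clock scaling run over the table's numbers (deliberately ignoring bandwidth and memory differences; the ~35% observed uplift is the figure quoted above, so this is only a back-of-the-envelope model):

```python
# Back-of-the-envelope check using the table above: scale by core count and
# boost clock only, then see what per-core, per-clock factor the observed
# ~35% uplift would imply.

cores_3090ti, clock_3090ti = 10752, 1.86   # GHz
cores_4080,   clock_4080   = 9728,  2.52

expected = (cores_4080 / cores_3090ti) * (clock_4080 / clock_3090ti)
observed = 1.35  # roughly 35% above the 3090 Ti, excluding DLSS 3

print(f"core count x clock alone: {expected:.2f}x")                      # ~1.23x
print(f"implied per-core, per-clock factor: {observed / expected:.2f}")  # ~1.10, no deficit
```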
 
Jul 7, 2022
Specs may aid in comparisons within the same architecture, but once you start comparing across architectures they are of little value.
If your numbers predict a gen-on-gen real world performance decrease, then one should immediately be questioning those numbers.
Let's compare the 3090 Ti and 4080 to see if that IPC assumption holds up in practice:
                   3090 Ti     4080        delta
core count         10752       9728        90%
boost clock        1.86 GHz    2.52 GHz    135%
memory capacity    24 GB       16 GB       67%
memory bus width   384 bit     256 bit     67%
memory bandwidth   1008 GB/s   717 GB/s    71%
die area           628 mm^2    379 mm^2    60%
We see the 4080 performing roughly 35% above the 3090 Ti (anywhere from 25% to 50% in some outliers), discounting the effects of DLSS 3, so even if we ignore the memory bandwidth disparity, that delta could be accounted for by clock speed alone, with no IPC deficit indicated.

At the very least, this should indicate that a dramatic drop in memory bus width and bandwidth is not an issue at all for the Ada Lovelace architecture when it comes to actual performance.
TechPowerUp shows the 4080 performs 16% better than the 3090 Ti. I will admit this case does jumble my numbers a little bit, but I believe this has more to do with the die size of the 3090 Ti, so perhaps die sizes at the extreme end of the reticle limit do have a performance impact.
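For reference, here is the same naive core-count-times-clock scaling as above, just fed with the 16% TechPowerUp figure instead of 35% (same caveats: this ignores bandwidth, memory, and die-size effects entirely):

```python
# Same naive core-count x clock scaling as before, using the 16% TechPowerUp
# figure for the 4080 over the 3090 Ti instead of 35%.

expected = (9728 / 10752) * (2.52 / 1.86)   # ~1.23x from cores and clock alone
observed = 1.16

print(f"implied per-core, per-clock factor: {observed / expected:.2f}")  # ~0.95
```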

PS: I think I'm still right about the IPC loss for this generation. My evidence is that every attempt in history to push graphics cards to significantly higher clock speeds required a reduction in IPC to stabilize the shaders, and every attempt to significantly increase clock speed on CPUs required lengthening the architecture's pipeline, which decreases IPC.
 

zakdon

Distinguished
Aug 18, 2011
The specs and pricing this gen are beyond ridiculous. I wonder what the 4060 and 4050 would look like, and at what price point...?
 

edzieba

Distinguished
Jul 13, 2016
PS: I think I'm still right about the IPC loss for this generation. My evidence is that every attempt in history to push graphics cards to significantly higher clock speeds required a reduction in IPC to stabilize the shaders, and every attempt to significantly increase clock speed on CPUs required lengthening the architecture's pipeline, which decreases IPC.
ISO-frequency benchmarking shows IPC continues to increase gen-on-gen: https://www.tweaktown.com/reviews/10222/intel-core-i9-13900k-raptor-lake-cpu/index.html#Cinebench-Crossmark-and-AIDA64:~:text=Cinebench R20 IPC & R23

This is as expected: almost all CPU cores will boost to the ~5GHz mark and have done for a good decade, so performance increases come more from architectural improvements than from frequency scaling. Pipeline scaling may increase latency, but does not necessarily impact IPC, as almost any modern CPU will have multiple dispatches within a pipeline active at any one time (i.e. any given stage will be able to pick up the next parcel of work after completing the previous, rather than waiting for a job to complete the entire pipeline before the next is dispatched). Similar to how FPS and frame times are not simple inverses of each other, due to multiple frames being in flight at once.
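A toy model of that last point, for anyone who wants to see it spelled out: with one instruction issued per cycle and no stalls, a deeper pipeline raises per-instruction latency but leaves steady-state throughput (IPC) untouched. This deliberately ignores hazards, mispredicts, and the clock-speed gains a deeper pipeline is meant to buy, so it is purely illustrative.

```python
# Toy pipeline model: one instruction issued per cycle, no stalls or hazards.
# Deeper pipelines increase the latency of any single instruction, but the
# steady-state throughput (instructions per cycle) converges to the same value.

def effective_ipc(num_instructions: int, pipeline_depth: int) -> float:
    # The first instruction needs `pipeline_depth` cycles to traverse the pipe;
    # after that, one instruction completes every cycle.
    total_cycles = pipeline_depth + (num_instructions - 1)
    return num_instructions / total_cycles

for depth in (5, 10, 20):
    print(f"depth {depth:2d}: latency {depth} cycles, "
          f"IPC over 1M instructions = {effective_ipc(1_000_000, depth):.4f}")
# All depths converge to ~1.0 IPC; only the per-instruction latency differs.
```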
 
Jul 7, 2022
ISO-frequency benchmarking shows IPC continues to increase gen-on-gen: https://www.tweaktown.com/reviews/10222/intel-core-i9-13900k-raptor-lake-cpu/index.html#Cinebench-Crossmark-and-AIDA64:~:text=Cinebench R20 IPC & R23

This is as expected: almost all CPU cores will boost to the ~5GHz mark and have done for a good decade, so performance increases come more from architectural improvements than from frequency scaling. Pipeline scaling may increase latency, but does not necessarily impact IPC, as almost any modern CPU will have multiple dispatches within a pipeline active at any one time (i.e. any given stage will be able to pick up the next parcel of work after completing the previous, rather than waiting for a job to complete the entire pipeline before the next is dispatched). Similar to how FPS and frame times are not simple inverses of each other, due to multiple frames being in flight at once.
Wrong, Raptor Lake's cores do have an IPC loss, but it is hidden by the double-sized L2 cache keeping the pipeline fed more efficiently.
 

chalabam

Distinguished
Sep 14, 2015
Where are the great games to justify a new GPU?

A Plague Tale is a long cutscene
Callisto Protocol is repetitive bullet-sponge combat
STALKER 2 is delayed due to war
Most new games are ruined by politics, same as movies.
 

edzieba

Distinguished
Jul 13, 2016
Wrong, Raptor Lake's cores do have an IPC loss, but it is hidden by the double-sized L2 cache keeping the pipeline fed more efficiently.
Not borne out by actual benchmarking. And since the L2 cache is part of the core, deciding to arbitrarily exclude it makes about as much sense as saying "Raptor Lake has half the floating-point performance if you remove half the FPUs!".
 
Jul 7, 2022
Not borne out by actual benchmarking. And since the L2 cache is part of the core, deciding to arbitrarily exclude it makes about as much sense as saying "Raptor Lake has half the floating-point performance if you remove half the FPUs!".
Totally agree, but we were talking about the pipeline, so it made sense to bring up the cache.