Ah, now we get to the real meat of the problem.
First, let's look at trends in GPU price & performance. Here's some data I compiled on the top (mainstream) tier of Nvidia cards. You could argue that I should've used Titan cards or the RTX 3090 Ti instead, but I chose which GPUs to include based on how representative each was of its generation (i.e. neither an outlier in value nor in launch timing).
Model | Launch | Node | Area (mm^2) | M Transistors | GFLOPS | MSRP | mm^2/$ | MTr/$ | GFLOPS/$ |
---|---|---|---|---|---|---|---|---|---|
GTX 980 Ti | 2015-06-01 | 28 nm | 601 | 8000 | 5632 | $649 | 0.926 | 12.33 | 8.68 |
GTX 1080 Ti | 2017-03-05 | 16 nm | 471 | 12000 | 10609 | $699 | 0.674 | 17.17 | 15.18 |
RTX 2080 Ti | 2018-09-27 | 12 nm | 754 | 18600 | 11750 | $999 | 0.755 | 18.62 | 11.76 |
RTX 3090 | 2020-09-24 | 8 nm | 628 | 28300 | 29280 | $1499 | 0.419 | 18.88 | 19.53 |
RTX 4090 | 2022-10-12 | 4 nm | 609 | 76300 | 73100 | $1599 | 0.381 | 47.72 | 45.72 |
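For anyone who wants to check or extend the derived columns, here's a minimal sketch of the arithmetic (the numbers are just copied from the raw columns above; the structure and names are mine):

```python
# Minimal sketch: recompute the derived per-dollar columns of the GPU table.
# Raw numbers are copied from the table above; the structure/names are mine.
gpus = [
    # (model, area_mm2, m_transistors, gflops, msrp_usd)
    ("GTX 980 Ti",  601,  8000,  5632,  649),
    ("GTX 1080 Ti", 471, 12000, 10609,  699),
    ("RTX 2080 Ti", 754, 18600, 11750,  999),
    ("RTX 3090",    628, 28300, 29280, 1499),
    ("RTX 4090",    609, 76300, 73100, 1599),
]

for model, area, mtr, gflops, msrp in gpus:
    print(f"{model:12s} mm^2/$: {area / msrp:.3f}  "
          f"MTr/$: {mtr / msrp:.2f}  GFLOPS/$: {gflops / msrp:.2f}")
```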
One discontinuity worth noting is the GFLOPS jump for the RTX 3000 series, which I believe reflects an SM redesign whereby theoretical FP32 throughput per SM doubled but practical throughput didn't. Another detail the keen observer will notice is the relative lack of improvement between the 1000-series and the 2000-series, which is largely due to Nvidia's decision to spend most of their additional transistor budget on Tensor cores and RT cores.
One thing that's truly impressive is just how much faster the RTX 4090 is than its predecessor: its ~2.5x jump in theoretical GFLOPS is the product of a 1.6x clock speed increase and a 1.56x increase in CUDA cores. I think the big transistor increase is mostly from a huge increase in L2 cache. Overall, I think the improvements in this generation owe a lot to the fact that RTX 2000 and RTX 3000 were being held back by inferior process nodes.
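As a rough sanity check on those ratios (the core counts and base clocks here are from memory, so treat them as approximate):

```python
# Rough check that the ~2.5x jump in theoretical GFLOPS is the product of the
# clock and core-count ratios. Core counts / base clocks here are from memory.
cores_3090, cores_4090 = 10496, 16384
base_ghz_3090, base_ghz_4090 = 1.395, 2.235

print(base_ghz_4090 / base_ghz_3090)   # ~1.60
print(cores_4090 / cores_3090)         # ~1.56
print(73100 / 29280)                   # ~2.50, roughly 1.60 * 1.56
```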
A theme we can clearly see is that these flagship GPU dies tend to sit at or above roughly 600 mm^2 (the GTX 1080 Ti being the exception). That's a large die to make on a cutting-edge node.
By contrast, let's look at flagship Ryzen CPUs.
Model | Launch | Node | Area* (mm^2) | M Transistors | Cores | Base Freq (GHz) | TFLOPS | MSRP | mm^2/$ | MTr/$ | GFLOPS/$ |
---|---|---|---|---|---|---|---|---|---|---|---|
1800X | 2017-03-02 | 14 nm | 213 | 4800 | 8 | 3.6 | 0.46 | $499 | 0.427 | 9.62 | 0.92 |
2700X | 2018-04-19 | 12 nm | 192 | 4800 | 8 | 3.7 | 0.47 | $329 | 0.584 | 14.59 | 1.44 |
3950X | 2019-11-25 | 7 nm | 148 | 7600 | 16 | 3.5 | 1.79 | $749 | 0.198 | 10.15 | 2.39 |
5950X | 2020-11-05 | 7 nm | 166 | 8300 | 16 | 3.4 | 1.74 | $799 | 0.208 | 10.39 | 2.18 |
7950X | 2022-09-26 | 5 nm | 140 | 13140 | 16 | 4.5 | 2.30 | $699 | 0.200 | 18.80 | 3.30 |
First, I want to address a set of discontinuities between the 2700X and 3950X: the 2700X (like the 1800X) is a monolithic die, while for the 3950X and later the Area* and transistor figures count only the compute (CCD) dies. I didn't want to try to factor the I/O die into the calculations, and I believe the substantial majority of the cost is in the compute dies anyhow, since the I/O die is made on an older node; this is even more true of the dual-CCD CPUs. Doing it this way throws the mm^2/$ figures off slightly, but it probably leaves them more comparable to the GPU data above.
The big takeaway is that mainstream CPU dies are much smaller. And whether you look at MTr/$ or mm^2/$, you're still getting better value from GPUs, in spite of the fact that a CPU is basically just the dies in a package, while a GPU includes VRMs, a PCB, a thermal solution, fans, and GBs of expensive GDDR memory!
With that said, Ryzen has stayed a little flatter in both area and transistor pricing. It's interesting to note that they've also stayed pretty flat in GFLOPS/$, although I'm not terribly confident about those theoretical GFLOPS numbers and will try to firm them up.
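For what it's worth, the TFLOPS column above is consistent with the standard peak-FP32 formula, cores × base clock × FP32 FLOPs per cycle. Here's a sketch; the per-cycle FLOP counts (16 for Zen/Zen+, 32 for Zen 2 and later) are my assumption:

```python
# Sketch of the theoretical peak-FP32 numbers in the CPU table:
#   peak GFLOPS = cores * base_clock_GHz * FP32 FLOPs/cycle/core
# The FLOPs/cycle figures are my assumption: 16 for Zen/Zen+ (2x128-bit FMA),
# 32 for Zen 2 and later (2x256-bit FMA).
cpus = [
    # (model, cores, base_ghz, fp32_flops_per_cycle)
    ("1800X", 8, 3.6, 16),
    ("2700X", 8, 3.7, 16),
    ("3950X", 16, 3.5, 32),
    ("5950X", 16, 3.4, 32),
    ("7950X", 16, 4.5, 32),
]

for model, cores, ghz, fpc in cpus:
    tflops = cores * ghz * fpc / 1000
    print(f"{model}: {tflops:.2f} TFLOPS")   # matches the table's TFLOPS column
```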
To gain some insight into why area and transistor pricing are breaking down, let's start by looking at wafer price trends:
So, should we blame TSMC or ASML for being greedy, instead of Nvidia?
Maybe not. Newer wafers are objectively more resource-intensive to produce.
Furthermore, design costs are also increasing at a similar pace:
GPUs rely primarily on node improvements and sheer scale to deliver better performance. As long as newer nodes keep getting more expensive to design & manufacture, the GPU perf/$ curve will inevitably flatten: new generations will offer smaller performance gains, cost even more, or some combination of the two.
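To make that concrete, here's a toy model (entirely my own framing, with placeholder numbers rather than real prices) of why $/transistor stalls when wafer cost rises about as fast as density:

```python
# Toy model (my own framing, placeholder numbers rather than real prices):
# cost per transistor ~= wafer_price / (density * usable_area * yield).
# If a node shrink doubles density but the wafer also costs ~2x, $/transistor
# barely moves, and perf/$ only improves as much as perf-per-transistor does.

def dollars_per_mtr(wafer_price, mtr_per_mm2, usable_mm2, yield_frac):
    """Dollars per million transistors for a given node and wafer price."""
    return wafer_price / (mtr_per_mm2 * usable_mm2 * yield_frac)

# Only the ratios matter in this illustration; the values are placeholders.
old_node = dollars_per_mtr(wafer_price=1.0, mtr_per_mm2=1.0, usable_mm2=1.0, yield_frac=0.8)
new_node = dollars_per_mtr(wafer_price=2.0, mtr_per_mm2=2.0, usable_mm2=1.0, yield_frac=0.8)
print(new_node / old_node)   # 1.0 -- no $/transistor gain despite the density doubling
```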
Try looking at it this way: you're not losing anything. GPU perf/$ is better than it's ever been; it just won't improve at the same rate as before. That sense of loss comes from the fact that GPUs got a "free ride" on the Moore's Law bandwagon, and the ride is finally slowing down. It's sad to see a good thing end, but I think that's where we are.