The article said:
When someone pointed out that Nvidia’s AI GPUs are still expensive, Huang said that it’d be a million times more expensive if Nvidia didn’t exist. “I gave you a million times discount in the last 10 years. It’s practically free!”
This point is bogus. Furthermore, there seems to be some serious inflation of the claim, perhaps even greater than that of Nvidia's stock price!
: D
As far as I can tell, this is referring to "Huang's Law", where just over a year ago he was claiming only a 1000x speedup over 10 years. The way he arrived at that figure was:
16x from better number handling,
12.5x from using reduced-precision fp and int arithmetic,
2x from exploiting sparsity, and
2.5x from process node improvements.
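(Those factors do at least multiply out to the claimed figure: 16 x 12.5 x 2 x 2.5 = 1000.)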
Most of those areas are wells you can't keep going back to. Sure, there has been further work on reducing precision and on improving sparsity handling, as well as on weight compression and arithmetic approximations, so I'm not saying the well of innovation has run dry. But when you take a somewhat generic architecture and optimize it for a specific problem, a lot of the big gains come early. So, I'm not expecting to see a repeat performance here.
As for the claim of 1M, the remaining factor of 1000 can only be referring to the scalability achieved through their HGX systems' use of NVLink and NVSwitch to link an entire rack's worth of GPUs into a coherent memory space. This last 1000x is perhaps the most troubling, since it scales only about linearly with cost, at best. In other words, 1000 GPUs are going to cost you more than 1000x the price of one (assuming both are obtained through the same channel, and you're not just comparing the retail cost of one vs. the direct-sales cost of 1000). So, he provided the capability to scale, but not a real cost savings.
Briefly returning to the 1000x scaling from "Huang's Law": a lot of these same things were being done by others. Plenty of people were looking at reducing arithmetic precision, building dedicated matrix-multiply hardware, etc. Nvidia was among the first to deploy them at scale, but it's not as if the industry wouldn't have moved in a mostly similar direction, and at a similar pace, had Nvidia not been in the race.
Huang's biggest contribution was probably his aggressive push of CUDA and GPU compute, and his embrace of the industries and applications where it was found to have the greatest potential. That's what I think gave them an early lead down the path of AI. It's a lead they can keep only if they stay nimble, and here's where CUDA could turn out to be a liability: it forces them either to make their GPUs more general than AI really needs, or to break CUDA compatibility and do a whole lot of work adapting the existing CUDA-based software to newer architectures. This is what I think Jim Keller meant by his statement that "CUDA is a swamp, not a moat". It has now become an impediment to Nvidia further optimizing its architectures in ways similar to its biggest AI competitors (who, other than AMD and Intel, generally seem to have embraced data-flow architectures over classical GPUs).