It sounds like he forgot that he's not just talking to a bunch of knuckle-dragging gamers this time.
"Remember that the performance of the architecture is going to be improving at the same time so you cannot assume just that you will buy more computers," said Huang. "You have to also assume that the computers are going to become faster, and therefore, the total amount that you need is not going to be as much."
People know this, but they also know that future AI networks will keep growing in size and sophistication. So far, parameter counts have grown faster than GPU speeds. Whether that continues, I can't say, but it seems a safe assumption that model sizes will keep climbing, and therefore you won't be training something like a GPT-n model on a single GPU at any point in the foreseeable future.
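To put rough numbers on that (these are ballpark public figures I'm plugging in myself, not anything from the keynote):

```python
# Back-of-envelope: model size growth vs. per-GPU throughput growth.
# Figures are rough public numbers, used only for illustration.

gpt2_params = 1.5e9      # GPT-2 (2019), ~1.5B parameters
gpt3_params = 175e9      # GPT-3 (2020), ~175B parameters

v100_tflops = 125        # V100 (2017), dense FP16 tensor throughput
a100_tflops = 312        # A100 (2020), dense FP16 tensor throughput

param_growth = gpt3_params / gpt2_params   # ~117x in roughly a year
gpu_growth = a100_tflops / v100_tflops     # ~2.5x across one GPU generation

print(f"Parameter growth (GPT-2 -> GPT-3): ~{param_growth:.0f}x")
print(f"Per-GPU throughput growth (V100 -> A100): ~{gpu_growth:.1f}x")
```

Even if you quibble with the exact figures, the gap between the two growth rates is what forces training onto ever-larger clusters.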
"One of the greatest contributions we made was advancing computing and advancing AI by one million times in the last ten years, and so whatever demand that you think is going to power the world, you have to consider the fact that [computers] are also going to do it one million times faster [in the next ten years]."
This is even more unhinged than the last claim! A lot of those multiples came from one-time optimizations: switching from fp32 to fp8 arithmetic, introducing Tensor Cores, and supporting sparsity. That low-hanging fruit has largely been picked! There aren't many other easy wins like that left.
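To make the "one-time" point concrete, here's a crude back-of-envelope; the individual factors are my own guesses for the sake of argument, not NVIDIA's accounting:

```python
# Illustrative only: rough one-time multipliers behind the last decade's per-chip gains.
# The individual numbers are guesses for the sake of argument, not official figures.

one_time_wins = {
    "fp32 -> fp16 -> fp8 precision": 8,   # narrower datatypes, higher math throughput
    "Tensor Cores (matrix units)":   4,   # dedicated matrix-multiply hardware
    "structured sparsity":           2,   # 2:4 sparsity doubles peak throughput
}

process_and_arch = 15   # assumed process-node + microarchitecture gains over the decade

total = process_and_arch
for name, factor in one_time_wins.items():
    total *= factor
    print(f"{name}: {factor}x")

print(f"Combined one-time wins plus process gains: ~{total}x per chip")
# The rest of the "million" comes from scale-out and software, and none of the
# one-time wins in the table can be applied a second time.
```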
Furthermore, the density and efficiency gains from new process nodes are slowing down, as is the improvement in transistors per dollar. What that means is that manufacturing-driven performance scaling is also mostly exhausted.
So, I don't see any reason to believe the insane pace of AI performance improvements will continue. There will still be gains, but more on the order of 10x or maybe 100x over the next 10 years. Not a million-fold.
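For a sense of what those figures mean as compound annual rates:

```python
# Implied year-over-year improvement for each ten-year claim.
for total_gain in (10, 100, 1_000_000):
    per_year = total_gain ** (1 / 10)
    print(f"{total_gain:>9,}x over 10 years -> ~{per_year:.2f}x per year")
```

That works out to roughly 1.26x and 1.58x per year for the 10x and 100x cases, versus about 4x per year, every year, for the million-fold claim, and sustaining 4x annually with the precision and sparsity tricks already spent is the part I don't buy.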