The reality-distortion field is strong in this guy.
'Over the last decade, as Nvidia GPUs shifted from 28nm to 5nm processes, the semiconductor process improvements have "only accounted for 2.5x of the total gains," asserted Dally at Hot Chips.'
That's only if you look at the frequency gains from process advancement! If you also account for density, it's waaaay higher!
Basically, they're just claiming to have rediscovered something we knew all along, since well before the dawn of hardware-accelerated graphics chips: fixed-function, purpose-built logic is a lot faster than general-purpose processors!
All of the stuff about custom data formats was true of graphics back in the old days (and even to some extent, today). Things like 16-bpp color by packing (5, 6, 5)-bit tuples into a 16-bit integer. What happened since the early days of 3D graphics cards is that the silicon improvements came so fast that GPU makers were able to offer a lot of programmability and generality. Then AI came along, and it was sufficiently compute-bound and bottlenecked on specific types of computation that it made sense to have specialized, fixed-function logic for it, just as we have for texturing, ROPs, ray-tracing, etc.
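For anyone who never touched that era: a minimal sketch of the 16-bpp packing, assuming the common RGB565 layout (5 bits red, 6 bits green, 5 bits blue); the function names are mine, not from any particular graphics API.

```c
#include <stdint.h>
#include <stdio.h>

/* Pack 8-bit-per-channel RGB into the common 16-bpp RGB565 layout:
 * 5 bits red, 6 bits green, 5 bits blue. Low-order bits of each
 * channel are simply dropped. */
static uint16_t pack_rgb565(uint8_t r, uint8_t g, uint8_t b) {
    return (uint16_t)(((r >> 3) << 11) | ((g >> 2) << 5) | (b >> 3));
}

/* Unpack back to 8-bit channels (lossy, since the dropped bits are gone). */
static void unpack_rgb565(uint16_t c, uint8_t *r, uint8_t *g, uint8_t *b) {
    *r = (uint8_t)(((c >> 11) & 0x1F) << 3);
    *g = (uint8_t)(((c >> 5)  & 0x3F) << 2);
    *b = (uint8_t)(( c        & 0x1F) << 3);
}

int main(void) {
    uint16_t c = pack_rgb565(200, 100, 50);
    uint8_t r, g, b;
    unpack_rgb565(c, &r, &g, &b);
    printf("packed=0x%04X -> r=%u g=%u b=%u\n", c, r, g, b);
    return 0;
}
```

Squeezing a pixel into fewer, oddly-sized bits to save bandwidth is exactly the same trick as today's FP8/INT8 tensor formats, just applied to color instead of weights.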
So, just multiply together the 2.5x frequency boost, the additional cores enabled by process-density improvements, and the gains from fixed-function AI logic and custom data formats, and that gets you most of the way to their 1000x number. The rest comes from things like larger caches and various other tweaks.
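To see how quickly multiplicative factors compound, here's a back-of-the-envelope sketch. Only the 2.5x process figure and the rough 1000x target come from the article; every other number below is a placeholder guess, not Dally's published breakdown.

```c
#include <stdio.h>

int main(void) {
    /* Illustrative factors only: the 2.5x is quoted above, the rest are
     * placeholder guesses to show how the gains multiply toward ~1000x. */
    double process_freq   = 2.5;   /* 28nm -> 5nm frequency gains (quoted)            */
    double more_cores     = 8.0;   /* extra parallelism from density (guess)          */
    double fixed_function = 40.0;  /* specialized AI logic + narrower formats (guess) */
    double misc           = 1.25;  /* larger caches, assorted tweaks (guess)          */

    printf("combined speedup: ~%.0fx\n",
           process_freq * more_cores * fixed_function * misc);
    return 0;
}
```

The exact split doesn't matter much; the point is that once you multiply a handful of independent improvements, three orders of magnitude isn't mysterious.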
'Huang's Law' - what self-congratulatory nonsense. All he really pointed out is something the supercomputer industry has known since the '60s: you can scale faster by adding and increasing parallelism than by relying on scalar performance improvements alone.