@PaulAlcorn , thanks for the coverage!
Huang also remarked that Nvidia has only been working on the chips for two years, which is a relatively short time given the typical multi-year design cycle for a modern chip.
Nvidia has made no secret of the fact that they're using ARM's Neoverse V2 cores. That means most of their work on this was probably just a matter of integration, rather than ground-up design. ARM tries to make it as fast & easy as possible for its customers to get chips to market that use its IP.
For reference, Amazon's Graviton 3, which launched about 15 months ago, uses ARM's Neoverse V1 cores.
Nvidia's use of the Arm instruction set also means there's a heavier lift for software optimizations and porting, and the company has an entirely new platform to build. Jensen alluded to some of that ...
This feels more like an excuse than their real issue, whatever that was. They've officially supported CUDA on ARM for probably 5 years now. They've shipped ARM-based SoCs for at least 15 years. All their self-driving stuff is on ARM. All the big hyperscalers have ARM instances. And Fujitsu even launched an ARM-based supercomputer, using their A64FX, which I'm sure prompted some HPC apps to receive ARM ports & optimizations.
It's weird that their Genoa-comparison slide compares against NPS4, as if it's the only option. You don't have to use that - you could also just use NPS1 or NPS2, if you have VMs big enough.