Rumors have been swirling that Nvidia has already seen big customers hold back on B200 orders as the major cloud operators pursue (or continue) building their own AI chips. Without the CUDA penalty, Nvidia might have been in a more competitive position.
Although Nvidia has so far been able to sell HPC/AI GPUs at basically the rate HBM can be manufactured, it's unclear whether, or for how long, that will continue to be true. There's also the question of pricing: one way to look at it is that Nvidia needs to deliver better perf/$ than whatever else is out there. If moving away from CUDA unlocked more performance, the value for Nvidia would be in potentially enabling even higher prices on the same production volume. That additional selling price would be pure profit.
Finally, consider energy. With the B200 using 1 kW per GPU and the B300 said to use 1.4 kW per GPU, Nvidia cannot afford to treat heat and energy as anything less than major concerns. The B200 had major rollout issues partly related to operating temperature, which were very costly for the company (see my link in post 4, above). Data movement consumes a significant share of that energy, and that is another big downside of the CUDA programming model.
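To make the data-movement point concrete, here's a minimal, illustrative CUDA sketch (my own toy example, not anything from Nvidia): a trivial elementwise kernel does one multiply per element, but every byte still has to cross the host-device link and go through global memory twice, so the cost is dominated by the copies and memory traffic rather than the arithmetic.

```cuda
// Toy example: compute-light, movement-heavy workload in the CUDA model.
#include <cuda_runtime.h>
#include <cstdio>
#include <cstdlib>

__global__ void scale(const float* in, float* out, float a, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) {
        // One multiply per element, but one global-memory read and one
        // write per element: the energy goes into moving the data.
        out[i] = a * in[i];
    }
}

int main() {
    const int n = 1 << 24;                 // ~16M floats, ~64 MB per direction
    const size_t bytes = n * sizeof(float);

    float* h_in  = (float*)malloc(bytes);
    float* h_out = (float*)malloc(bytes);
    for (int i = 0; i < n; ++i) h_in[i] = 1.0f;

    float *d_in, *d_out;
    cudaMalloc(&d_in, bytes);
    cudaMalloc(&d_out, bytes);

    // Explicit host-to-device copy: in the CUDA model the programmer
    // manages this transfer directly, and it costs far more energy per
    // byte than the single multiply the kernel performs on it.
    cudaMemcpy(d_in, h_in, bytes, cudaMemcpyHostToDevice);

    scale<<<(n + 255) / 256, 256>>>(d_in, d_out, 2.0f, n);

    // Explicit device-to-host copy back.
    cudaMemcpy(h_out, d_out, bytes, cudaMemcpyDeviceToHost);

    printf("out[0] = %f\n", h_out[0]);

    cudaFree(d_in); cudaFree(d_out);
    free(h_in); free(h_out);
    return 0;
}
```

Compile with `nvcc scale.cu -o scale` and run on any CUDA-capable GPU; the point isn't the output, it's that both transfers and the per-element memory traffic are explicit, programmer-managed costs baked into the programming model.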
Jim Keller said it best: CUDA isn't a moat, it's a swamp.