AMD has steadily eroded Intel's dominant market share in servers and DIY home builds.
But they got a massive assist from Intel's 10 nm process node getting stuck. Combine some good engineering and outside-the-box thinking with AMD's good fortune of TSMC firing on all cylinders while Intel stalled, and AMD was able to climb out of its hole and mount real competition for Intel.
Nvidia, by contrast, had only slight disadvantages from using a 12 nm node for Turing and Samsung 8 nm for Ampere (if we're talking about client GPUs). On the design front, they were slightly disadvantaged by the amount of silicon they spent on Tensor cores and RT cores. Other than that, Nvidia was almost as competitive as ever. Now that they're on effectively the same node as AMD and have wised up to the amount of cache they need, they've eliminated any self-imposed disadvantages.
AMD made smart moves in reducing their wavefront size with RDNA (wave32, illustrated in the snippet below) and with RDNA2's Infinity Cache. However, instead of doubling down on Infinity Cache, they tripped over their own feet with the chiplet thing. If it had let them increase the amount of cache, then heck yeah! But it instead resulted in less cache (Navi 21's 128 MB on-die shrank to 96 MB across Navi 31's six MCDs), possibly at higher latency, and a more expensive assembly process that probably offset the silicon savings from using N6 chiplets for the MCDs. So, right now, it's really AMD who's on the back foot.
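For anyone who wants to see the wave32 change firsthand, here's a minimal sketch (mine, not from anyone in this thread) that queries the native wavefront size through HIP's device-properties API, assuming a working ROCm install. RDNA consumer parts report 32 here, while GCN and CDNA report 64.

```cpp
// wavefront_size.cpp - print the native wavefront size of each HIP device.
// Build (assuming ROCm is installed): hipcc wavefront_size.cpp -o wavefront_size
#include <hip/hip_runtime.h>
#include <cstdio>

int main() {
    int count = 0;
    if (hipGetDeviceCount(&count) != hipSuccess || count == 0) {
        std::fprintf(stderr, "No HIP devices found.\n");
        return 1;
    }
    for (int i = 0; i < count; ++i) {
        hipDeviceProp_t prop;
        if (hipGetDeviceProperties(&prop, i) != hipSuccess) continue;
        // RDNA GPUs report 32 (wave32); GCN and CDNA report 64 (wave64).
        std::printf("Device %d (%s): wavefront size = %d\n",
                    i, prop.name, prop.warpSize);
    }
    return 0;
}
```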
All it takes is a few consecutive winning years.
I don't expect AMD to get a more favorable opportunity than they had with Turing and Ampere. Nvidia probably isn't going to take the risk of using a cheaper node again, after seeing how it briefly cost them the uncontested performance crown against the RX 6950 XT.
Right now the MI300 is a huge winner.
Yeah, but it's only going to get about six months of shipping in volume before Blackwell ramps, if that. That's better than nothing, but it's not enough to turn around AMD's fortunes.
And the ROCm toolkit is a wedge to break the CUDA monopoly.
Hardly. The best ROCm can hope for is matching Intel's driver & userspace stack.
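To be fair to the "wedge" framing: ROCm's HIP layer mirrors the CUDA runtime API almost call-for-call, which is what makes mechanical porting (hipify) possible in the first place. A hedged sketch of a vector-add in HIP, with the CUDA counterparts noted in comments; the file and variable names are mine, purely for illustration:

```cpp
// vec_add.hip.cpp - HIP mirrors the CUDA runtime API nearly 1:1, so each call
// below has a direct cudaXxx counterpart (hipMalloc <-> cudaMalloc,
// hipMemcpy <-> cudaMemcpy, and the <<<>>> launch syntax is identical).
// Build: hipcc vec_add.hip.cpp -o vec_add
#include <hip/hip_runtime.h>
#include <cstdio>
#include <vector>

__global__ void vec_add(const float* a, const float* b, float* c, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;  // same built-ins as CUDA
    if (i < n) c[i] = a[i] + b[i];
}

int main() {
    const int n = 1 << 20;
    std::vector<float> a(n, 1.0f), b(n, 2.0f), c(n, 0.0f);

    float *da, *db, *dc;
    hipMalloc(&da, n * sizeof(float));   // cudaMalloc
    hipMalloc(&db, n * sizeof(float));
    hipMalloc(&dc, n * sizeof(float));
    hipMemcpy(da, a.data(), n * sizeof(float), hipMemcpyHostToDevice);  // cudaMemcpy
    hipMemcpy(db, b.data(), n * sizeof(float), hipMemcpyHostToDevice);

    vec_add<<<(n + 255) / 256, 256>>>(da, db, dc, n);  // identical launch syntax

    hipMemcpy(c.data(), dc, n * sizeof(float), hipMemcpyDeviceToHost);
    std::printf("c[0] = %f (expect 3.0)\n", c[0]);

    hipFree(da); hipFree(db); hipFree(dc);  // cudaFree
    return 0;
}
```

The catch is that matching the API surface is the easy part; matching a decade of library tuning underneath it is the hard part.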
Give it a few years. NVIDIA, AMD, and Intel are all limited by what technology is available for fabbing chips.
Back in 2019, Nvidia bought Mellanox, which has been crucial in enabling them to scale their training solutions to rack scale and beyond.
So no one can build an X, Y, or Z that's more than two generations ahead at a similar die size.
What Nvidia has is much more than just chips. They have the entire stack, and it's been tuned and optimized over an entire decade!
What Intel saw in Habana Labs was not just the silicon and software, but also a scaling solution. If I were Nvidia, I'd be more worried about Intel than AMD.