News Nvidia confirms Blackwell Ultra and Vera Rubin GPUs are on track for 2025 and 2026 — post-Rubin GPUs in the works

I wonder when they're going to begin moving away from CUDA. I think it really is holding them back, when most others have already moved on to dataflow architectures. I liken Nvidia's current lead to how Intel's lead from around 10-15 years ago was primarily based on better manufacturing tech.

CUDA was once Nvidia's biggest advantage, but now it's turning into their greatest liability. They know it, too. Just look at their edge SoCs, which pack most of their inferencing horsepower in the NVDLA engines, not the iGPUs.
 
HBM4E, hmm, why does Nvidia have no plans to use 16-high stacks?
Just a few guesses:
  • Perhaps the 12-high-capable dies will simply be available sooner?
  • Heat, given that it's going to be stacked atop logic dies. Heat was also an issue for them in B200: https://www.tomshardware.com/tech-i...interested-parties-liquid-cooled-h200-instead
  • Maybe higher stacks require more vias, making it less cost- & area-efficient?
  • Perhaps the optimal compute vs. data ratio favors slightly smaller stacks?
  • Or perhaps it's something to do with yield concerns, since the more dies you stack, the lower the overall yield (rough numbers below). Yield seems to have been an ongoing struggle for the B200.
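On that last bullet, the compounding is easy to underestimate. As a crude first-order model (purely illustrative, ignoring known-good-die testing, repair, and redundancy): if every die added to the stack has a hypothetical 99% chance of bonding cleanly, a 12-high stack comes out around 0.99^12 ≈ 89%, while a 16-high stack drops to about 0.99^16 ≈ 85%. At 98% per step, the gap widens to roughly 78% vs. 72%. Each extra die also puts more already-good dies at risk, which is why stack height is such a sensitive cost knob.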
 
I wonder when they're going to begin moving away from CUDA. I think it really is holding them back, when most others have already moved on to dataflow architectures. I liken Nvidia's current lead to how Intel's lead from around 10-15 years ago was primarily based on better manufacturing tech.

CUDA was once Nvidia's biggest advantage, but now it's turning into their greatest liability. They know it, too. Just look at their edge SoCs, which pack most of their inferencing horsepower in the NVDLA engines, not the iGPUs.
What do you mean? The industry currently seems entrenched in CUDA, and even if there are cases where different architectures might have advantages, I can't see CUDA going away or even hindering Nvidia's position much in the future. The only competitors I'm aware of, like ROCm and UXL, aren't really very competitive.

Nvidia can certainly fumble like Intel has done recently, but I wouldn't say CUDA is a liability now, just like 15 years ago I never thought Intel would be in their current position.
 
What do you mean?
CUDA is too general and depends on SMT to achieve good utilization. SMT isn't the most efficient, in terms of energy or silicon area. Dedicated NPUs don't work this way - they use local memory, DSP cores, and rely on DMAs to stream data in and out of local memory. Nvidia has been successful in spite of CUDA's inefficiency, but they can't keep their lead forever if they don't switch over to a better programming model for the problem.
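To make the contrast concrete, here's a toy CUDA sketch of my own (purely illustrative, not from Nvidia's docs, and the NPU side is only approximated). Kernel (a) leans on the classic GPU trick of keeping lots of warps in flight so global-memory latency gets hidden; kernel (b) explicitly stages a tile into shared memory first, which is the closest thing CUDA has to the "DMA into local SRAM, then compute" model that NPUs and DSPs use, except that a real NPU does the staging with a hardware DMA engine instead of burning threads and a warp scheduler on it. (Neither kernel is how you'd write a fast reduction in practice; they're just meant to show where the data lives.)

```cuda
// Toy example: two ways to sum a vector into a single result.
#include <cstdio>
#include <cuda_runtime.h>

__global__ void sum_oversubscribed(const float* in, float* out, int n) {
    // (a) Each thread touches global memory directly; the GPU keeps many
    // warps resident so that *some* warp always has data ready to compute on.
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) atomicAdd(out, in[i]);
}

__global__ void sum_staged(const float* in, float* out, int n) {
    // (b) Software-managed staging: copy a tile into shared memory (the GPU's
    // analogue of an NPU's local SRAM), synchronize, then compute from it.
    __shared__ float tile[256];
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    tile[threadIdx.x] = (i < n) ? in[i] : 0.0f;
    __syncthreads();

    // One thread per block reduces the tile and pushes a single partial sum out.
    if (threadIdx.x == 0) {
        float s = 0.0f;
        for (int j = 0; j < blockDim.x; ++j) s += tile[j];
        atomicAdd(out, s);
    }
}

int main() {
    const int n = 1 << 20;
    float *in, *out;
    cudaMallocManaged(&in, n * sizeof(float));
    cudaMallocManaged(&out, sizeof(float));
    for (int i = 0; i < n; ++i) in[i] = 1.0f;

    *out = 0.0f;
    sum_oversubscribed<<<(n + 255) / 256, 256>>>(in, out, n);
    cudaDeviceSynchronize();
    printf("oversubscribed: %.0f\n", *out);

    *out = 0.0f;
    sum_staged<<<(n + 255) / 256, 256>>>(in, out, n);
    cudaDeviceSynchronize();
    printf("staged:         %.0f\n", *out);

    cudaFree(in); cudaFree(out);
    return 0;
}
```

Even in this toy case, (b) does the bulk of its work out of on-chip storage and issues one global atomic per block instead of one per element; a real NPU takes that idea further by making the staged, streamed pattern the only way to operate.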

They did succeed in tricking AMD into following them with HIP, rather than potentially leap-frogging them. With oneAPI, I think Intel is also basically following the approach Nvidia took with CUDA. As long as they keep thinking they're going to beat Nvidia at its own game, they deserve to keep losing. AMD should've bought Tenstorrent or Cerebras, but now they're probably too expensive.

At least AMD and Intel both have sensible inferencing hardware. Lisa Su should pay attention to the team at Xilinx that designed what they now call XDNA - I hope that's what UDNA is going to be.

The industry currently seems entrenched in CUDA, and even if there are cases where different architectures might have advantages, I can't see CUDA going away or even hindering Nvidia's position much in the future.
I already gave you an example where even Nvidia clearly sees the light. Just look at their NVDLA engines, which are now already on their second generation. Those aren't programmable using CUDA.

Just sit back and wait. I wonder if Rubin is going to be their first post-CUDA training architecture.
 
AI/datacenters is where the real money is made for Nvidia. I wonder when they're gonna stop making consumer GPUs.

They'll just start making gaming GPUs on older nodes that aren't needed for AI chips!
$2,000 to $4,000 for a gaming GPU is still profit, even if it's tiny compared to the AI stuff. So don't worry! There will still be overpriced gaming GPUs in the future!

😉
 
They'll just start making gaming GPUs on older nodes that aren't needed for AI chips!
The RTX 5090 is made on a TSMC N4-class node, which is the same family they're using for their AI training GPUs. The RTX 5090's die also can't get much bigger (the GB202 die is reportedly around 750 mm², against a reticle limit of roughly 850 mm²). So, the only way they could do a 3rd generation on this family of nodes is by going multi-die, which Intel and AMD have dabbled with (and Apple successfully executed), but Nvidia has steadfastly avoided. What I've heard about the multi-die approach is that the amount of global data movement makes it inefficient for rendering. So, I actually doubt they'll go in that direction, at least not yet.

My prediction is that they'll use a 3 nm-class node for their next client GPU. They're already set to use an N3-class node for Rubin later this year.

$2,000 to $4,000 for a gaming GPU is still profit, even if it's tiny compared to the AI stuff. So don't worry! There will still be overpriced gaming GPUs in the future!
Yeah, I think the main risk that AI poses to their gaming products is simply that it tends to divert resources and focus. That's probably behind some of the many problems that have so far affected the RTX 5000 launch.

Nvidia does seem to keep doing research on things like neural rendering, so that clearly shows they're not leaving graphics any time soon. It's more that they're focusing on the intersections between AI and graphics, which is certainly better than nothing.
 
I already gave you an example where even Nvidia clearly sees the light. Just look at their NVDLA engines, which are now already on their second generation. Those aren't programmable using CUDA.

Just sit back and wait. I wonder if Rubin is going to be their first post-CUDA training architecture.
I can see Nvidia introducing new architectures in the future and shifting away from CUDA eventually, but I can't see Nvidia's dominant position being challenged. My point is:

"Nvidia can certainly fumble like Intel has done recently but I wouldn't say CUDA is a liabilty now, just like 15 years ago I never though Intel would be in their current position."

Despite all the drawbacks you've pointed out, they're clearly still on top of their game. CUDA has technical downsides but it certainly doesn't have significant financial downsides for Nvidia, at least not yet. I'm not holding my breath for Rubin, maybe afterwards.
 
Despite all the drawbacks you've pointed out, they're clearly still on top of their game. CUDA has technical downsides but it certainly doesn't have significant financial downsides for Nvidia, at least not yet. I'm not holding my breath for Rubin, maybe afterwards.
Rumors have been swirling that Nvidia already started seeing big customers hold back on B200 orders, as big cloud operators pursue (or continue) building their own AI chips. Without the CUDA penalty, they might have been in a more competitive position.

Although Nvidia has so far been able to sell HPC/AI GPUs at basically the rate that HBM can be manufactured, it's unclear whether or how long that will continue to be true. There's also a question of pricing, where one way to look at it is that Nvidia needs to offer better perf/$ than whatever else is out there. If moving away from CUDA unlocked more performance, the value for them would be in potentially enabling even higher prices on the same production volume. That additional selling price would be pure profit.
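To put hypothetical numbers on that: if a leaner programming model bought, say, 20% more delivered performance out of the same die and the same HBM (a made-up figure, just for illustration), and customers ultimately price accelerators on perf/$, then Nvidia could charge roughly 20% more for the same silicon. The bill of materials doesn't change, so nearly all of that increment falls straight through to margin.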

Finally, consider energy. With B200 using 1 kW per GPU and B300 said to use 1.4 kW per GPU, they cannot afford to disregard heat & energy as major concerns. With the B200, there were major rollout issues partly affected by operating temperature, which were very costly for the company (see my link in post 4, above). Data movement consumes very significant amounts of energy, and this is another big downside of the CUDA programming model.
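For a sense of scale, the often-cited estimates from Horowitz's ISSCC 2014 keynote (measured at 45 nm, so the absolute numbers are dated, but the ratios still roughly hold) put an off-chip DRAM access at somewhere around a nanojoule per 64-bit word, versus single-digit picojoules for a 32-bit floating-point add or multiply. That's two to three orders of magnitude between moving an operand and actually computing with it, which is exactly the cost a local-memory, streaming-DMA style of programming model is designed to attack.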

Jim Keller said it best: CUDA isn't a moat, it's a swamp.
 
Rumors have been swirling that Nvidia already started seeing big customers hold back on B200 orders, as big cloud operators pursue (or continue) building their own AI chips. Without the CUDA penalty, they might have been in a more competitive position.

Although Nvidia has so far been able to sell HPC/AI GPUs at basically the rate that HBM can be manufactured, it's unclear whether or how long that will continue to be true. There's also a question of pricing, where one way to look at it is that Nvidia needs to offer better perf/$ than whatever else is out there. If moving away from CUDA unlocked more performance, the value for them would be in potentially enabling even higher prices on the same production volume. That additional selling price would be pure profit.

Finally, consider energy. With B200 using 1 kW per GPU and B300 said to use 1.4 kW per GPU, they cannot afford to disregard heat & energy as major concerns. With the B200, there were major rollout issues partly affected by operating temperature, which were very costly for the company (see my link in post 4, above). Data movement consumes very significant amounts of energy, and this is another big downside of the CUDA programming model.

Jim Keller said it best: CUDA isn't a moat, it's a swamp.
The B200 issues are definitely a problem for Nvidia. I can see large cloud and tech companies continuing to build and improve their own chips, both as a hedge against Nvidia (+CUDA) and because having their own in-house solutions could be more efficient. But is that a problem with CUDA, or just the underlying packaging technology having a defect?

I don't disagree that more efficient technology could enable even higher profits, but the momentum is still there. CUDA might be a swamp, but it's a swamp that tech companies still want despite having their own compute designs, and it's entrenched in software in spite of its flaws.

I can see a timeline in which Nvidia replaces CUDA; I just don't think they'll do so by Rubin.
 
I don't disagree that more efficient technology could enable even higher profits, but the momentum is still there. CUDA might be a swamp, but it's a swamp that tech companies still want despite having their own compute designs, and it's entrenched in software in spite of its flaws.
The fact that they have internal, non-CUDA solutions says it's not CUDA they want, but simply the performance and scaling capability of Nvidia's GPUs. In fact, their building and use of non-CUDA AI chips can be seen as a forceful rebuke of CUDA.

BTW, I'm referring to chips like Google's TPUs and Amazon's Trainium, in case it wasn't clear.