FWIW, I think Intel should keep Falcon Shores focused on HPC and keep Gaudi as their AI platform. Just because Nvidia has been successful using GPUs for AI doesn't mean it's the best architecture for that problem. I predict Nvidia will fork off a line of AI accelerators that look less like GPUs than their current products do - and they might not even support CUDA!
I also think you don't need HBM to do AI right, especially if you're not constrained to a GPU-like architecture. What you need is lots of SRAM, so the thing to do is just stack your compute dies on SRAM. AI is a dataflow problem, whereas graphics isn't. That means GPUs have to be much more flexible with data movement & reuse, and that's why they need such crazy amounts of bandwidth to main memory. By contrast, AI has much better data locality, so you can do quite well with adequate amounts of SRAM and a mesh-like interconnect.
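To put a rough number on the locality argument: here's a back-of-envelope sketch (my own illustrative figures, not anything from a real chip) comparing the off-chip traffic a GPU-style design pays for one matmul against a dataflow design that keeps the weight tile resident in stacked SRAM. The function names and parameters are hypothetical.

```python
# Hedged sketch: arithmetic intensity (FLOPs per byte of DRAM traffic) for a
# (batch x k) @ (k x n) matmul, with and without weights resident in on-die SRAM.

def matmul_traffic_bytes(batch, k, n, bytes_per_elem=2, weights_resident=False):
    """Rough off-chip traffic for one matmul, assuming fp16 elements.

    weights_resident=True models an SRAM-stacked dataflow chip where the
    (k x n) weight tile never leaves on-die memory; only activations move.
    """
    acts_in = batch * k * bytes_per_elem
    acts_out = batch * n * bytes_per_elem
    weights = 0 if weights_resident else k * n * bytes_per_elem
    return acts_in + acts_out + weights

def arithmetic_intensity(batch, k, n, **kw):
    flops = 2 * batch * k * n  # each multiply-accumulate counts as 2 FLOPs
    return flops / matmul_traffic_bytes(batch, k, n, **kw)

# A 4096x4096 weight matrix at small batch: weight traffic dominates the
# GPU-style path, so SRAM residency raises arithmetic intensity enormously.
gpu_ai = arithmetic_intensity(8, 4096, 4096)                         # ~8 FLOPs/byte
sram_ai = arithmetic_intensity(8, 4096, 4096, weights_resident=True)  # 2048 FLOPs/byte
print(f"GPU-style:     {gpu_ai:7.1f} FLOPs/byte")
print(f"SRAM-resident: {sram_ai:7.1f} FLOPs/byte")
```

With the weights pinned on-die, the only off-chip traffic is activations in and out, so the bandwidth you need from main memory drops by orders of magnitude - which is exactly why adequate SRAM plus a mesh can substitute for HBM in this workload.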