News Baidu's AI breakthrough can meld GPUs from different brands into one training cluster — company says new tech fuses thousands of GPUs together to...


Is that what the Chinese President said via that link? This is funny! 😆

How to speed up a slow Mac?

https://www.macworld.com/article/668632/how-to-speed-up-a-mac.html

The White House's attempts to stop technology transfers from the US and its allies may hinder China's scientific advancements in the short term.

Sorry! Page not found.
The page you're looking for has either been moved or removed from the site.
Please try searching our site or start again on our homepage.



Maybe try proof-reading articles BEFORE publishing?
 
The article said:
The company can combat AI chip scarcity by combining GPUs from different vendors.
...
If Baidu's claims are true, this is a massive development.
How much is that going to help, really? I'm just not seeing a huge upside here. If you pair a handful of Chinese GPUs with an Nvidia GPU that's 10x as fast, the total benefit to training time won't add up to much. Also, anyone building AI systems is dealing in aggregates, and I'll bet they build systems with either all Nvidia or all AMD GPUs. It's probably much more the exception that you're down to just a couple of boards of either kind, and if you are, you just build another all-AMD system (for instance).

Furthermore, any time you don't have a high-speed fabric and have to rely on PCIe for interconnectivity, you're going to be at a significant disadvantage. The software overhead of abstracting each GPU's API adds a little more on top. Overall, I see it as neither a huge win nor a major impediment.
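To put rough numbers on the speed mismatch, here's a back-of-the-envelope sketch. It assumes plain synchronous data-parallel training, invented throughput figures, and zero communication cost, so it's the best case for the mixed cluster:

```python
# Toy model: per-step time in synchronous data-parallel training is set by
# whichever GPU finishes its share of the batch last. Throughput numbers are
# invented purely for illustration.

def step_time(global_batch, shares, throughputs):
    """Time for one step when GPU i processes shares[i] of the batch."""
    return max(global_batch * s / t for s, t in zip(shares, throughputs))

global_batch = 4096          # samples per step (arbitrary)
fast = 1000.0                # hypothetical Nvidia card, samples/sec
slow = 100.0                 # hypothetical domestic card, 10x slower
cluster = [fast] + [slow] * 4

baseline = global_batch / fast                                   # fast card alone

# Naive even split: the slow cards become stragglers and make things worse.
even = step_time(global_batch, [1 / len(cluster)] * len(cluster), cluster)

# Perfect throughput-proportional split: the best the mixed setup can do.
total = sum(cluster)
balanced = step_time(global_batch, [g / total for g in cluster], cluster)

print(f"fast GPU alone:      {baseline:.2f} s/step")
print(f"even split (5 GPUs): {even:.2f} s/step -> {baseline / even:.2f}x")
print(f"balanced split:      {balanced:.2f} s/step -> {baseline / balanced:.2f}x")
# Four extra 10x-slower cards buy roughly 1.4x at best, before any interconnect
# or API-abstraction overhead is charged.
```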

The article said:
If Li's claim is true, Baidu has achieved a brilliant technical breakthrough. This technology will allow the company to mix and match different GPUs
On a purely technical level, I think it's less impressive than prior techniques for mixing & matching different GPUs in the same machine.

While searching for that, I found a project enabling disparate multi-GPU configurations for CFD:

Sorta shows it's not quite the genius breakthrough the article claims.
 
I think you've nailed it. To me this sounds like SLI on steroids, but SLI suffered from overhead with just two cards that made it a losing proposition. I can't imagine what the overhead would be with hundreds or thousands of GPUs in a cluster. While the current tech is not impressive on a technical level, I still think it would probably be useful for LLM training during a supply shortage caused by the sanctions.
 
I’ll happily bet a hundred dollars that it’s literally just the recompilers made to run CUDA code on AMD or Intel hardware, combined into one package.
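For context, that route already exists and is mostly mechanical. Here's a toy Python illustration of the kind of renaming a CUDA-to-HIP translator does; it's not the real hipify tool, and the rename table and snippet are only there to show the flavor:

```python
# Toy illustration of what a CUDA -> HIP source "recompiler" largely does:
# mechanical renames of runtime calls and headers. The real hipify tooling is
# far more thorough; this only shows the flavor of the translation.
CUDA_TO_HIP = {
    "cuda_runtime.h": "hip/hip_runtime.h",
    "cudaMalloc": "hipMalloc",
    "cudaMemcpyHostToDevice": "hipMemcpyHostToDevice",
    "cudaMemcpy": "hipMemcpy",
    "cudaDeviceSynchronize": "hipDeviceSynchronize",
    "cudaFree": "hipFree",
}

def toy_hipify(source: str) -> str:
    """Crudely rewrite CUDA runtime identifiers to their HIP equivalents."""
    for cuda_name, hip_name in CUDA_TO_HIP.items():
        source = source.replace(cuda_name, hip_name)
    return source

cuda_snippet = """\
#include <cuda_runtime.h>
float *d_buf;
cudaMalloc(&d_buf, 1024 * sizeof(float));
cudaMemcpy(d_buf, h_buf, 1024 * sizeof(float), cudaMemcpyHostToDevice);
cudaDeviceSynchronize();
cudaFree(d_buf);
"""
print(toy_hipify(cuda_snippet))
```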
 
SLI suffered from overhead with just two cards that made it a losing proposition. I can't imagine what the overhead would be with hundreds or thousands of GPUs in a cluster. While the current tech is not impressive on a technical level, I still think it would probably be useful for LLM training during a supply shortage caused by the sanctions.
The communication patterns are very different for training vs. rendering. One of the first things the US sanctions targeted was communication bandwidth, because they knew how much that would impede training large-scale models. There are good reasons for Nvidia's focus on NVLink. AMD and Intel each have their own version, as well.
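To give a rough sense of scale, here's a sketch of what just the gradient exchange costs per step. The bandwidth figures (PCIe 4.0 x16, an A800-style capped link, A100-era NVLink) are ballpark, it assumes fp16 gradients and a textbook ring all-reduce, and none of it is a benchmark:

```python
# Rough estimate of the time one gradient all-reduce takes per training step,
# to show why interconnect bandwidth, not raw FLOPS, is what large-scale
# training lives or dies on. All figures are ballpark, not benchmarks.

def ring_allreduce_seconds(n_params, n_gpus, link_gb_per_s, bytes_per_grad=2):
    """Ring all-reduce: each GPU sends/receives ~2*(N-1)/N times the gradient
    buffer over its link (fp16 gradients -> 2 bytes per parameter)."""
    traffic_bytes = 2 * (n_gpus - 1) / n_gpus * n_params * bytes_per_grad
    return traffic_bytes / (link_gb_per_s * 1e9)

n_params = 7e9      # a 7B-parameter model
n_gpus = 1024

links = [
    ("PCIe 4.0 x16, ~32 GB/s", 32),
    ("capped NVLink (A800-style), ~400 GB/s", 400),
    ("full NVLink (A100-era), ~600 GB/s", 600),
]
for name, bw in links:
    t = ring_allreduce_seconds(n_params, n_gpus, bw)
    print(f"{name:40s} ~{t:.3f} s of gradient traffic per step")
# Over plain PCIe the gradient exchange alone approaches a second per step;
# over NVLink-class links it is closer to noise.
```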
 
I think you've nailed it. To me this sounds like SLI on steroids, but SLI suffered from overhead with just two cards that made it a losing proposition. I can't imagine what the overhead would be with hundreds or thousands of GPUs in a cluster. While the current tech is not impressive on a technical level, I still think it would probably be useful for LLM training during a supply shortage caused by the sanctions.

SLI overhead affected gaming because of the need for precise synchronization and low latency. (And what finally killed it was the impending move to low-level APIs such as DX12 and Vulkan, where it would have fallen on game developers to implement multi-GPU support in each game.)

Depending on the kind of compute workload you have, those constraints might not be a factor at all. Multi-GPU still has applications for compute and rendering tasks (just not real-time rendering).
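For anything loosely coupled, a plain work queue already absorbs the speed mismatch. Here's a toy sketch of the idea; the devices are simulated with sleeps, the speeds are invented, and the queue/threads just stand in for a real scheduler:

```python
# Toy sketch: heterogeneous GPUs handling independent work items from a shared
# queue. Each "device" (simulated with a sleep scaled by its speed) pulls the
# next task whenever it is free, so faster cards naturally take more of the
# work and nothing ever has to synchronize. Speeds are invented.
import queue
import threading
import time

tasks = queue.Queue()
for i in range(40):          # 40 independent items (frames, CFD blocks, ...)
    tasks.put(i)

devices = [("fast_gpu", 10.0), ("slow_gpu_a", 1.0), ("slow_gpu_b", 1.0)]
completed = {name: 0 for name, _ in devices}

def worker(name, speed):
    while True:
        try:
            tasks.get_nowait()
        except queue.Empty:
            return
        time.sleep(0.01 / speed)   # stand-in for the actual kernel work
        completed[name] += 1

threads = [threading.Thread(target=worker, args=dev) for dev in devices]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(completed)   # roughly {'fast_gpu': 33, 'slow_gpu_a': 3, 'slow_gpu_b': 4}
```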
 