How much is that going to help, really? I'm just not seeing a huge upside here. If you pair a handful of Chinese GPUs with an Nvidia GPU that's 10x as fast, the total benefit to training time won't add up to much (rough numbers below). Also, anyone building AI systems is dealing in aggregates, and I'll bet they build systems with either all-Nvidia or all-AMD GPUs. It's probably much more the exception that they're down to just a couple boards of either kind, and if they are, they just build another all-AMD system (for instance).
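As a quick sanity check, here's a minimal sketch of the best-case speedup; the GPU count and the 10x speed ratio are my illustrative assumptions, not figures from the article:

    # Back-of-envelope: best-case gain from adding slow GPUs to one fast one,
    # assuming perfect data-parallel scaling with work split by relative speed.
    # (Hypothetical numbers; real heterogeneous setups will do worse.)
    fast_gpu_throughput = 1.0   # normalize the Nvidia card to 1.0
    slow_gpu_throughput = 0.1   # a card ~10x slower
    num_slow_gpus = 4           # "a handful"

    total = fast_gpu_throughput + num_slow_gpus * slow_gpu_throughput
    speedup = total / fast_gpu_throughput
    print(f"Ideal speedup: {speedup:.2f}x")  # 1.40x -- before any interconnect
                                             # or synchronization overhead

Even under those generous assumptions, four extra boards buy you well under 2x.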
Furthermore, any time you don't have a high-speed fabric and have to rely on PCIe for interconnectivity, you're going to be at a significant disadvantage. The software overhead of abstracting over each GPU's API adds a little more on top. So overall I see it as neither a huge win nor a major impediment.
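For a sense of scale, a hedged estimate of gradient-sync cost; the bandwidth figures, GPU count, and model size here are my assumptions, not measurements:

    # Rough ring all-reduce cost per training step, PCIe vs. a fast fabric.
    # Nominal per-direction bandwidths (assumed): PCIe 4.0 x16 ~32 GB/s,
    # NVLink-class fabric ~300 GB/s.
    def allreduce_seconds(grads_gb: float, n_gpus: int, link_gb_per_s: float) -> float:
        # A ring all-reduce moves 2*(n-1)/n of the gradient buffer across each link.
        traffic_gb = 2 * (n_gpus - 1) / n_gpus * grads_gb
        return traffic_gb / link_gb_per_s

    grads_gb = 14.0  # assumption: fp16 gradients for a ~7B-parameter model
    for name, bw in [("PCIe 4.0 x16", 32.0), ("NVLink-class fabric", 300.0)]:
        t = allreduce_seconds(grads_gb, n_gpus=5, link_gb_per_s=bw)
        print(f"{name}: ~{t * 1000:.0f} ms per gradient sync")

On those assumed numbers, PCIe is the better part of a second per sync versus tens of milliseconds on a proper fabric, which swamps whatever the slow boards contribute.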
On a purely technical level, I think it's less impressive than prior techniques for mixing & matching different GPUs in the same machine.
While searching for those prior techniques, I found a project enabling disparate multi-GPU configurations for CFD:
Sorta shows it's not quite the genius breakthrough the article claims.