News: China makes AI breakthrough, reportedly trains generative AI model across multiple data centers and GPU architectures

Is it a breakthrough though?
AFAIK using multiple data centers is not a new thing; it's just that no one likes dealing with the latency of waiting for data to arrive over the internet.

Which is why the preferred method is to put all eggs in one basket.
 
Is it a breakthrough though?
AFAIK using multiple data centers is not a new thing; it's just that no one likes dealing with the latency of waiting for data to arrive over the internet.

Which is why the preferred method is to put all eggs in one basket.
If their technique allows the training to be broken up into latency-insensitive chunks, like Folding@home, then it should probably be considered a breakthrough. I don't know if that's possible, just throwing it out there.
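
Just to make the idea concrete (and this is only a toy sketch, not whatever technique they actually used): the latency-insensitive version would be some flavor of "train locally, sync rarely," where each site runs many ordinary optimizer steps on its own replica and the replicas are only averaged occasionally, so the slow cross-datacenter link is touched infrequently. Everything below (the tiny model, the two simulated sites, the sync interval) is made up for illustration.

```python
# Toy sketch of "train locally, sync rarely" (local SGD / periodic averaging).
# Two simulated "data centers" each hold a replica of the same tiny model and
# only exchange weights every SYNC_EVERY steps, so the slow WAN link is used
# rarely instead of on every gradient step.
import copy
import torch
import torch.nn as nn

torch.manual_seed(0)
SYNC_EVERY = 50                       # local steps between cross-site syncs (made up)
model = nn.Linear(16, 1)              # stand-in for a real network
replicas = [copy.deepcopy(model) for _ in range(2)]
opts = [torch.optim.SGD(r.parameters(), lr=0.01) for r in replicas]

for step in range(200):
    for rep, opt in zip(replicas, opts):
        x = torch.randn(32, 16)                 # each site sees its own data shard
        y = x.sum(dim=1, keepdim=True)          # synthetic target
        loss = nn.functional.mse_loss(rep(x), y)
        opt.zero_grad()
        loss.backward()
        opt.step()                              # purely local update, no network traffic

    if (step + 1) % SYNC_EVERY == 0:
        # The only cross-datacenter communication: average the replicas' weights.
        with torch.no_grad():
            for params in zip(*(r.parameters() for r in replicas)):
                mean = torch.stack([p.detach() for p in params]).mean(dim=0)
                for p in params:
                    p.copy_(mean)
```

The open question is whether that kind of infrequent syncing still converges well at frontier scale; if it does, that would be the actual breakthrough.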
 
Is it a breakthrough though?
AFAIK using multiple data centers is not a new thing; it's just that no one likes dealing with the latency of waiting for data to arrive over the internet.

Which is why the preferred method is to put all eggs in one basket.

Well, I remember reading news from MS and FB, both mentioning that they hit a bottleneck in training AI within a single data center.
That bottleneck is electricity.
The power requirement is so huge that a single data center can consume gigawatts, and that one workload alone can destabilize the power grid.
I would say that US developers will eventually have no choice but to train AI across multiple data centers as well.
 
Well, I remember reading news from MS and FB, both mentioning that they hit a bottleneck in training AI within a single data center.
That bottleneck is electricity.
The power requirement is so huge that a single data center can consume gigawatts, and that one workload alone can destabilize the power grid.
I would say that US developers will eventually have no choice but to train AI across multiple data centers as well.
Or build dedicated data center power plants, like Oracle's newly planned nuclear-powered data center or Microsoft's repurposing of Three Mile Island. The real breakthrough will be cold fusion … Interesting thought experiment: what would be the impact of a limitless power supply on our economic and technical endeavors as a society?
 
Or build dedicated data center power plants, like Oracle's newly planned nuclear-powered data center or Microsoft's repurposing of Three Mile Island. The real breakthrough will be cold fusion … Interesting thought experiment: what would be the impact of a limitless power supply on our economic and technical endeavors as a society?
Limitless? We would turn Earth into magma.

There are still costs associated with it. But if we end up with $0.01/kWh, it will allow us to do some truly wasteful stuff.
 
Not sure that overcoming a state-specific deficiency should be considered an industry-wide breakthrough unless it becomes a must-adopt practice across said industry. Time will tell, I suppose.
 
Love these types of articles based on X posts from a guy who overheard someone say it happened, but it was in a meeting he can’t talk about due to an NDA, which means he overheard it being talked about during unrelated banter in the NDA meeting, which means there is zero way to confirm the authenticity of this breakthrough other than hearsay…
Unfortunately, this feels like 50% of all articles nowadays, and not just for tech-related stuff.
 
Is it a breakthrough though?
AFAIK using multiple data centers is not a new thing; it's just that no one likes dealing with the latency of waiting for data to arrive over the internet.

Which is why the preferred method is to put all eggs in one basket.
Surely this is trivial; all you need is an interchange format.
It's done in big chunks anyway, so just read in the chunks and decode for your GPU.
Should be very little need to interchange between servers/racks in real time, any more than there is now.
Why wasn't this done long ago?
Oh, maybe it adds 0.001% to the processing load.
And maybe a given vendor would rather keep it proprietary.
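
To be fair to the "trivial" part, the interchange itself really is the easy bit. Here's a hedged sketch using PyTorch's ordinary state_dict machinery as a stand-in (the model and the file name are arbitrary): weights saved as plain CPU tensors don't care which vendor's accelerator produced them, and the receiving site just loads them onto whatever device it has.

```python
# Sketch: vendor-neutral checkpoint hand-off between heterogeneous accelerators.
# Weights are moved to CPU before saving, so the file is just plain tensors with
# no device- or vendor-specific state baked in.
import torch
import torch.nn as nn

model = nn.Sequential(nn.Linear(16, 64), nn.ReLU(), nn.Linear(64, 1))

# Site A: serialize the weights as CPU tensors (an interchange-friendly snapshot).
cpu_state = {k: v.detach().cpu() for k, v in model.state_dict().items()}
torch.save(cpu_state, "checkpoint.pt")

# Site B: possibly a completely different GPU architecture or backend.
device = "cuda" if torch.cuda.is_available() else "cpu"   # whatever this site has
restored = nn.Sequential(nn.Linear(16, 64), nn.ReLU(), nn.Linear(64, 1))
restored.load_state_dict(torch.load("checkpoint.pt", map_location="cpu"))
restored.to(device)                                        # "decode for your GPU"
```

Formats like safetensors and ONNX already exist for exactly this kind of hand-off; the hard part is keeping the training run synchronized fast enough, not the chunk format.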
 
While I believe that China's AI and semiconductor technology is still behind the West, it is catching up fast. I don't think we can be complacent and assume that China will stay behind the West.
 
Why do some sites continue to post these Chinese claims of breakthroughs verbatim when 90% of the time they turn out to be completely false or, at the very least, exaggerated? In this case, it's 100% exaggerated. Western tech has been using distributed data centers for years now.
 
Why do some sites continue to post these Chinese claims of breakthroughs verbatim when 90% of the time they turn out to be completely false or, at the very least, exaggerated? In this case, it's 100% exaggerated. Western tech has been using distributed data centers for years now.
Yes, distributed data centers, but apparently not for training AI models... it would be too expensive.
 
Not clear it's less "efficient"... IF you are willing to wait 10x as long, it's often possible to do things more efficiently with older hardware.
 
Not clear it's less "efficient"... IF you are willing to wait 10x as long, it's often possible to do things more efficiently with older hardware... and if you have a crazy large amount of older hardware and good integration, you could do model training only when there's nothing better to do and there's spare electricity, running at the most power-saving low voltages and frequencies.
 
Or build dedicated data center power plants, like Oracle's newly planned nuclear-powered data center or Microsoft's repurposing of Three Mile Island. The real breakthrough will be cold fusion … Interesting thought experiment: what would be the impact of a limitless power supply on our economic and technical endeavors as a society?
Cold fusion is quite literally pseudoscience
 
Is it a breakthrough though?
AFAIK using multiple data centers is not a new thing; it's just that no one likes dealing with the latency of waiting for data to arrive over the internet.

Which is why the preferred method is to put all eggs in one basket.

The only way to tell if there was a breakthrough would be if they published comparison benchmarks. And the only way to verify (trust) their claims is if they published enough details to run an experiment and reproduce their results (don't hold your breath).

These days PyTorch/TensorFlow make it trivial to train a network on a multi-GPU computer. Coordinating this kind of multi-GPU, single-computer training requires serializing, segmenting, and synchronizing in a way that extends naturally to multi-server training: you just swap in TCP/IP read()/write() in place of gpu_to_host_memcpy()/host_to_gpu_memcpy() [or some optimization thereof, like NVLink]. The trouble is that training slows down by a painful amount due to the extra network latency (mostly introduced by network bandwidth constraints, but also network overhead). So any breakthrough here would be in the realm of training-speed increases, which the article fails to discuss.
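
For anyone curious what that swap looks like, here's a minimal sketch (generic data parallelism, not whatever the lab actually did) using torch.distributed with the gloo backend, which runs over plain TCP/IP, standing in for the NCCL/NVLink path you'd use inside a single box. The launch command, model, and data below are placeholders.

```python
# Minimal data-parallel sketch over plain TCP/IP: gradients are synchronized
# with all_reduce on the "gloo" backend, i.e. the network takes the role that
# gpu<->host memcpy / NVLink plays inside a single machine.
# Example launch (2 workers on one host; the file name is arbitrary):
#   torchrun --nproc_per_node=2 train_dp.py
import torch
import torch.distributed as dist
import torch.nn as nn

def main():
    dist.init_process_group(backend="gloo")          # TCP-based collectives
    rank, world = dist.get_rank(), dist.get_world_size()

    model = nn.Linear(16, 1)
    for p in model.parameters():                     # start every worker from identical weights
        dist.broadcast(p.data, src=0)
    opt = torch.optim.SGD(model.parameters(), lr=0.01)

    torch.manual_seed(rank)                          # each rank sees its own data shard
    for step in range(100):
        x = torch.randn(32, 16)
        y = x.sum(dim=1, keepdim=True)               # synthetic task
        loss = nn.functional.mse_loss(model(x), y)
        opt.zero_grad()
        loss.backward()

        # The painful part across data centers: every step, every gradient has to
        # cross the network before the optimizer is allowed to move.
        for p in model.parameters():
            dist.all_reduce(p.grad, op=dist.ReduceOp.SUM)
            p.grad /= world
        opt.step()

    dist.destroy_process_group()

if __name__ == "__main__":
    main()
```

Any real breakthrough would show up as that per-step synchronization getting cheaper or rarer, which is exactly the comparison benchmark we don't get.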