To put it more clearly, the 'Megatron-DeepSpeed' DL framework was ported to Fugaku, and its dense matrix multiplication library was accelerated for the 'Transformer' architecture, so as to maximize distributed training performance.
Let's make that more clear...
Megatron (an NVIDIA-developed framework that excels at multi-GPU AI acceleration) was used together with DeepSpeed, a Microsoft-written optimization library.
The framework ensures many GPUs (or, in Fugaku's case, many CPU nodes) can be used at once to train the model.
The library itself helps by accelerating the model during training and inference, introducing many cool things like mixed-precision loss scaling, memory partitioning (ZeRO), and several forms of parallelism, while still being fairly memory efficient.
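To make that concrete, here's a minimal sketch of what a DeepSpeed-style configuration might look like. The key names (`train_batch_size`, `fp16`, `zero_optimization`, `gradient_clipping`) are real DeepSpeed config keys, but the values here are illustrative, not taken from the Fugaku work.

```python
# Illustrative DeepSpeed-style config (values are made up for this example).
# "zero_optimization" partitions optimizer state across workers (ZeRO),
# and "fp16" enables mixed-precision training with loss scaling.
ds_config = {
    "train_batch_size": 32,             # global batch size across all workers
    "fp16": {"enabled": True},          # mixed precision with loss scaling
    "zero_optimization": {"stage": 1},  # stage 1: partition optimizer states
    "gradient_clipping": 1.0,           # keep gradient norms in a sane range
}

print(sorted(ds_config))
```

In a real run, a dict like this would be handed to `deepspeed.initialize` along with the model; the point here is just that the memory and precision tricks are switched on through configuration rather than code changes.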
When you add on a self-attention architecture like the Transformer, you have a model that can pick out, or give more weight to, the more important sections of the input.
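That "highlighting" is literally a weighted average: each token scores every other token, the scores go through a softmax, and the weights decide how much of each token flows into the output. Here's a tiny NumPy sketch (no learned Q/K/V projections, just the core mechanism, not the actual Fugaku code):

```python
import numpy as np

def self_attention(x):
    """Scaled dot-product self-attention over x of shape (tokens, dim).
    For simplicity, queries, keys, and values are all x itself."""
    d = x.shape[-1]
    scores = x @ x.T / np.sqrt(d)                    # pairwise token similarity
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)   # softmax: rows sum to 1
    return weights @ x, weights                      # weighted mix of tokens

# 4 tokens with 8-dimensional embeddings
rng = np.random.default_rng(0)
x = rng.normal(size=(4, 8))
out, w = self_attention(x)
print(out.shape, w.shape)  # (4, 8) (4, 4)
```

Each row of `w` shows how strongly one token "attends" to the others; the dense matrix multiplications in there (`x @ x.T` and `weights @ x`) are exactly the kind of operation the Fugaku port accelerated.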
This type of AI acceleration, all of it put together, is very good at Natural Language Processing.
I think what I've said is true but I'm still learning.