Google's A3 GPU supercomputer is the company's latest platform for powering LLMs.
Google Launches AI Supercomputer Powered by Nvidia H100 GPUs : Read more
Here we go. The most advanced text message answering service on the planet! I can't wait.
Damn, you aren't easily impressed, you must be pretty cool.
I have my reservations about all the AIs growing in the wild.
Huh? I'm confused by this reply: "Damn you aren't easily impressed, you must be pretty cool."
I call BS. Frontier is now the fastest computer in the world and just hit 1.7 exaflops. Nvidia says that this computer is 15 times faster?
They say that A3 is "delivering 26 exaFlops of AI performance", so most likely these are not the usual (normal) floating point operations. Normal floating point operations usually take 80-bit data (or more) per operation, but in AI such precision is not needed, hence they (usually) use TF32 or even FP8 data formats. So, assuming they used FP8 to produce this result, each operation needs only about 1/10 of the work (hence 10 times faster) compared with normal floating point operations.
The standard format to use for HPC is fp64. I think 80-bit is kind of a weird x87 thing.
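On the precision point, here is a rough sketch of how much accuracy you give up as the format gets narrower. It only uses Python's standard `struct` module, which handles 16-bit, 32-bit and 64-bit IEEE 754 floats. It has no FP8 or TF32, so half precision stands in as the "low precision" case; that substitution, and the helper names `storage_bits` and `round_trip`, are mine and not anything from the article.

```python
import struct

# IEEE 754 formats that Python's struct module can pack directly.
# FP8 and TF32 are not available here, so half precision (16-bit)
# stands in as the low-precision example. That is a substitution.
FORMATS = {
    "fp16": "<e",
    "fp32": "<f",
    "fp64": "<d",
}

def storage_bits(name):
    """Number of bits one value occupies in the given format."""
    return struct.calcsize(FORMATS[name]) * 8

def round_trip(value, name):
    """Round a Python float to the given format and convert it back."""
    fmt = FORMATS[name]
    return struct.unpack(fmt, struct.pack(fmt, value))[0]

if __name__ == "__main__":
    x = 0.1
    for name in FORMATS:
        stored = round_trip(x, name)
        print(f"{name}: {storage_bits(name)} bits, 0.1 stored as {stored!r}, "
              f"error {abs(stored - x):.3e}")
```

The narrower the format, the bigger the rounding error on the same value, which is the trade AI hardware makes to get more operations per second.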
"They also do not disclose how many A3 are used to produce this result."
Actually, they do. They said it incorporates an undisclosed number of 4th Gen (Sapphire Rapids) Xeons (I'd guess 2x) and 8x H100 accelerators. That actually puts it rather below the theoretical peak throughput of 31.7 EFLOPS, although I'd guess the peak presumes boost clocks and what Google is actually reporting is the sustained performance.
They say the number of H100 used per A3, but they do not say how many A3 they use 😉
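For what it's worth, the missing count can be roughed out with some back-of-the-envelope arithmetic. This is only a sketch: the per-GPU peak numbers below are my assumption (the FP8 figures I recall from Nvidia's H100 SXM datasheet, dense and with sparsity), not something stated in the article, and the function names are made up for illustration. The 8 GPUs per A3 and the 26 exaFLOPS claim are the figures quoted in this thread.

```python
# Assumed per-GPU FP8 peak throughput in TFLOPS. These are NOT from the
# article; treat them as placeholders and swap in your own numbers.
ASSUMED_H100_TFLOPS = {
    "fp8_dense": 1979.0,
    "fp8_sparse": 3958.0,
}

GPUS_PER_A3 = 8          # H100s per A3, as stated in the thread
CLAIMED_EXAFLOPS = 26.0  # the "AI performance" figure quoted in the thread

def gpus_needed(claimed_exaflops, per_gpu_tflops):
    """GPUs required to reach the claimed total at a given per-GPU peak."""
    return claimed_exaflops * 1e6 / per_gpu_tflops  # 1 EFLOPS = 1e6 TFLOPS

def per_node_pflops(per_gpu_tflops, gpus_per_node=GPUS_PER_A3):
    """Peak throughput of one multi-GPU node, in PFLOPS."""
    return per_gpu_tflops * gpus_per_node / 1000.0

if __name__ == "__main__":
    for label, tflops in ASSUMED_H100_TFLOPS.items():
        gpus = gpus_needed(CLAIMED_EXAFLOPS, tflops)
        print(f"{label}: about {gpus:,.0f} GPUs, "
              f"or about {gpus / GPUS_PER_A3:,.0f} A3 nodes; "
              f"one node peaks near {per_node_pflops(tflops):.1f} PFLOPS")
```

Under the sparse FP8 assumption a single 8-GPU node lands near 31.7 PFLOPS, so the number of nodes behind the 26 exaFLOPS claim would be in the hundreds to low thousands depending on which peak figure Google used. Take that as an estimate, not a disclosed number.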