News: AMD's Instinct MI355X accelerator will consume 1,400 watts

CDNA4 appears to be a decent improvement over CDNA3 as both the MI325X and "default" MI355X are rated at 1000W, yet perf is up considerably on the latter. I'm sure the AI zealots also appreciate the new data formats.
 
Curious about this bit of computer slang: how can chiplets be "reticle-sized"?

Wiki: A reticle, also known as a crosshair, is a pattern of fine lines or markings built into the eyepiece of an optical device, such as a telescopic sight, spotting scope, theodolite, or microscope, to provide a measurement reference during visual inspection.
 
Also curious: when making specialized GPUs by removing the HPC-oriented FP64 hardware from a chip, how much silicon is actually saved? If the saving is relatively small and negligible, just let it stay there. When all this AI hardware becomes obsolete, like a fire burning out, and heads to the city dumps en masse, then with FP64 on board it could at least still be reused for HPC.
 
Curious about this bit of computer slang: how can chiplets be "reticle-sized"?

Wiki: A reticle, also known as a crosshair, is a pattern of fine lines or markings built into the eyepiece of an optical device, such as a telescopic sight, spotting scope, theodolite, or microscope, to provide a measurement reference during visual inspection.
A reticle in this case is the photomask in a lithography machine: light passes through it and the projection optics image the chip pattern onto the wafer. The largest area a scanner can expose in one shot is about 850 mm², so that is the maximum size per chiplet.
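For a rough sense of what "reticle-sized" means numerically, here's a minimal sketch. It assumes the common 26 mm x 33 mm scanner exposure field (which is where the ~850 mm² figure comes from); the chiplet size used for the comparison is hypothetical.

```python
# Back-of-envelope reticle math (assumes the common 26 mm x 33 mm exposure
# field of current lithography scanners; exact fields vary by tool).
FIELD_WIDTH_MM = 26.0
FIELD_HEIGHT_MM = 33.0

reticle_limit_mm2 = FIELD_WIDTH_MM * FIELD_HEIGHT_MM  # ~858 mm^2
print(f"Reticle limit: {reticle_limit_mm2:.0f} mm^2")

# A single die or chiplet has to fit inside one exposure field, so this is
# the ceiling on per-chiplet size. 850 mm^2 here is a hypothetical value.
candidate_chiplet_mm2 = 850.0
print("Fits in one exposure:", candidate_chiplet_mm2 <= reticle_limit_mm2)
```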
Also curious: when making specialized GPUs by removing the HPC-oriented FP64 hardware from a chip, how much silicon is actually saved? If the saving is relatively small and negligible, just let it stay there. When all this AI hardware becomes obsolete, like a fire burning out, and heads to the city dumps en masse, then with FP64 on board it could at least still be reused for HPC.
It's significant. FP64 is fairly power-intensive (only while FP64 is actually being used) and die-area intensive. Looking at the whole chip, it would be in the range of 10-20% of the area. At the individual SM level it's even more significant, because chips also contain components that don't compute, such as memory controllers, IO connections, caches, and other accelerators; there it may be in the range of 20-30%.
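As a toy illustration of why the chip-level share comes out lower than the SM-level share, here's a quick back-of-envelope calculation; the percentages are just the rough estimates above, not measured figures.

```python
# Toy area accounting using the rough estimates above (not measured data).
# Assume FP64 units take ~25% of the compute (SM) area, and compute is ~60%
# of the whole die, the rest being memory controllers, IO, caches, etc.
fp64_share_of_sm = 0.25      # assumed, within the 20-30% range quoted above
compute_share_of_die = 0.60  # assumed compute vs. non-compute split

fp64_share_of_die = fp64_share_of_sm * compute_share_of_die
print(f"FP64 share of SM area:  {fp64_share_of_sm:.0%}")
print(f"FP64 share of die area: {fp64_share_of_die:.0%}")  # ~15%
```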

Full FP64 compliance is defined by IEEE, and no AMD/Nvidia GPUs implemented it until fairly recently, because compliance takes extra effort and carries power and transistor costs. Then "GPUs" started doing more than just running games and moved into supercomputers.

The difference between FP64 and FP32 is precision. In High Performance Computing (HPC), where you are simulating real-world phenomena, accuracy is important. In games, where you have millions of pixels moving and changing rapidly all the time... not so much.

AI sacrifices precision even more: FP16 and INT8, for example. I would not be surprised if this contributes significantly to AI hallucinations. The trade-off is that FP32 takes twice as many bits as FP16, and FP64 twice as many as FP32, so lower precision means more math per unit of hardware; that's why they use it.
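To make that bit-width/precision trade-off concrete, here's a quick NumPy check of the storage width and machine epsilon for each format. It's a sketch; the actual per-format throughput ratios vary by GPU.

```python
import numpy as np

# Storage width, decimal digits, and machine epsilon for common float formats.
# Each step down halves the bits, which is roughly why lower precision buys
# more math throughput per unit of silicon (real GPU ratios vary).
for dtype in (np.float64, np.float32, np.float16):
    info = np.finfo(dtype)
    print(f"{np.dtype(dtype).name}: {info.bits} bits, "
          f"~{info.precision} significant decimal digits, eps = {info.eps:.3g}")
```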

The precision losses are more significant than they might sound, though. It's like comparing 16-bit color to 8-bit color: one gives 2^16 = 65,536 colors, the other only 2^8 = 256. So by going from FP64 to FP32 you might go from practically zero error to suddenly a few percent error. And if the AI then relearns from that finished data, you are multiplying the errors. This isn't even taking into account problems with algorithms or the quality of the original data. It's like a printer that takes a 1200 dpi picture and prints it at 300 dpi.
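A minimal sketch of how those rounding errors show up in practice: sum the same synthetic data at each precision and compare against a float64 reference. Any result fed back into further computation carries its error with it.

```python
import numpy as np

# Sum 50,000 random values in [0, 1) at different precisions and compare to
# a float64 reference. The data is synthetic; the point is how the relative
# error grows as precision drops.
rng = np.random.default_rng(0)
values = rng.random(50_000)

reference = values.sum(dtype=np.float64)
for dtype in (np.float64, np.float32, np.float16):
    total = values.astype(dtype).sum(dtype=dtype)
    rel_err = abs(float(total) - reference) / reference
    print(f"{np.dtype(dtype).name}: sum = {float(total):.1f}, "
          f"relative error = {rel_err:.2e}")
```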
 
Still, using nuclear reactors to power supercomputers in the 2030s seems to be a more and more realistic possibility.

Great article. The typo in the last sentence is disconcerting, though. Not the best way to end it.