News Nvidia Steals AMD's Supercomputer Efficiency World Record

bit_user

Polypheme
Ambassador
Truth be told, Nvidia-based supercomputers (in many cases comprising standard servers) have been performance-per-watt champions in the Green500 list for some time, so it is logical to expect H100 to continue Nvidia’s winning spree here.
The reason it's surprising is that MI200 doubled CDNA's fp64 rate by expanding its registers and datapaths to 64-bit. This enabled it to dispatch fp64 operations at the full fp32 rate, instead of half that rate. Relative to the A100, this gave AMD a significant single-device performance lead, which typically translates into better perf/W (so long as the lead wasn't achieved by juicing power to an extreme degree).

From what I'm reading, H100 PCIe cards have fp64 performance of 24 TFLOPS at 350 W (not sure if that's base or boost). By comparison, AMD's MI210 PCIe card provides 13.3 or 22.6 fp64 TFLOPS (base vs. boost) at 300 W. If Nvidia's figure is base, then I'd say the result checks out. If it's boost, then it sure looks close.
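To put rough numbers on that comparison, here's a minimal sketch that just divides the quoted spec-sheet fp64 throughput by the quoted board power. The helper name `gflops_per_watt` and the dictionary layout are my own for illustration. These are peak paper figures, not measured Green500 (Linpack) efficiency, so treat the result as a ballpark only.

```python
# Figures as quoted in the post: (peak fp64 TFLOPS, board power in watts).
# These are spec-sheet numbers, not measured Linpack results.
cards = {
    "H100 PCIe": (24.0, 350),
    "MI210 (base)": (13.3, 300),
    "MI210 (boost)": (22.6, 300),
}


def gflops_per_watt(tflops, watts):
    """Convert peak TFLOPS and board power into GFLOPS per watt."""
    return tflops * 1000.0 / watts


efficiency = {name: gflops_per_watt(t, w) for name, (t, w) in cards.items()}

for name, eff in efficiency.items():
    print(f"{name}: {eff:.1f} GFLOPS/W")
```

On these numbers, the H100 PCIe lands at roughly 68.6 GFLOPS/W, against roughly 44.3 (base) and 75.3 (boost) for the MI210. So if Nvidia's 24 TFLOPS is a base figure, it's clearly ahead; if it's a boost figure, it's slightly behind on paper, which is why it looks close.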

Anyway, that's a good comeback from Nvidia, especially when you consider they still have a 2:1 fp32:fp64 ratio. That said, their H100 is made on TSMC N4, whereas AMD's MI200-series uses N6. So, the tables will likely turn once again, when AMD counters with the MI300-series.
 