I'm not one to complain much about this, but the article starts off with several basic editing errors that are hard to miss.
From the very first sentence:
... the industry's first general-purpose CPUs with up to 132 that can be used for AI inference.
Should read:
"... the industry's first general-purpose CPUs with up to 192 cores that can be used for AI inference."
Third paragraph:
interconnected using a mech network
mesh network, obviously.
Fourth paragraph:
The company claims that its new cores are further optimized for cloud and AI workloads and feature 'power and are efficient' instructions per clock (IPC) gains
Should be 'power- and area-efficient'.
Moving on to the benchmarks, the VMs/rack calculation is weird. Not only are they using single-CPU machines, they're also not accounting for SMT on Intel or AMD. Even when I compute it their way, I get 2496 cores per rack, not 2688. And if you count vCPUs with SMT enabled, then Genoa gets 4992.
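To spell out my arithmetic (the 26-servers-per-rack figure is an assumption I'm inferring from the 2496 total; the article's exact server count isn't reproduced here):

```python
# Back-of-the-envelope rack math for the VMs/rack comparison.
servers_per_rack = 26   # assumed: single-socket machines, inferred from 2496 / 96
genoa_cores = 96        # EPYC "Genoa" physical cores per CPU
ampere_cores = 192      # AmpereOne cores per CPU (no SMT)

physical_cores = servers_per_rack * genoa_cores
vcpus = physical_cores * 2                # SMT: 2 threads per core

print(physical_cores)                     # 2496 physical Genoa cores per rack
print(vcpus)                              # 4992 vCPUs with SMT enabled
print(servers_per_rack * ampere_cores)    # 4992 AmpereOne cores per rack
```

Note that with SMT counted, the per-rack vCPU totals come out identical; the comparison in the article only looks lopsided because it ignores the second thread per core.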
And if those cores have to tackle workloads like AI, they're going to perform a lot better than cores with just a pair of 128-bit SIMD units.
My last point about that VMs/rack analysis: just how common are single-vCPU VMs, anyway?
Considering AI performance, the end-notes are quite telling:
- AMD was tested with 256 GB of memory; AmpereOne had 512 GB.
- AMD was tested with kernel 5.18.11; AmpereOne had 6.1.10.
- AMD was tested with fp32; AmpereOne used fp16.
I'm also left to wonder whether AVX-512 was even used on AMD. It wouldn't have been enabled by -march=native in the GCC version included with Ubuntu 22.04, because GCC didn't yet have patches for Genoa. And if they had used AVX-512, then they should've tested with BF16, which Zen 4 does support.
If they'd tested AMD with the same kernel, with 384 GB or 768 GB of RAM (i.e., all 12 memory channels populated, which the article pointed out wasn't done), and with BF16 via AVX-512, then I think it'd have easily equaled, if not outright trounced, Ampere. Where Ampere might still pull out a win is AI perf/W. That'd at least be something.
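For reference, the RAM sizes I suggested are just what you get from fully populating Genoa's 12 DDR5 channels (the per-DIMM capacities here are my assumption, one DIMM per channel):

```python
channels = 12                   # EPYC "Genoa" has 12 DDR5 memory channels
for dimm_gb in (32, 64):        # assumed DIMM sizes, one per channel
    print(channels * dimm_gb)   # 384, then 768 (GB per socket)
```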