Phoronix just posted benchmarks on the Xeon Max with 64 GiB of HBM 2e in caching and exclusive mode.
Some of us have been talking a lot about the prospect of HBM in CPUs (@InvalidError , @Kamen Rider Blade ). Exciting prospects of things to come (hopefully)!
Here's the GeoMean (sorry, can't embed image):
It shows HBM 2e-exclusive mode ahead of 8-channel DDR5-only by 18.5% to 20.4%, depending on CPU model. An important caveat is that this is only an average over select HPC and AI benchmarks; it doesn't attempt to characterize performance across a broader range of server workloads.
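For anyone unsure what a GeoMean figure hides, here's a rough Python sketch of how one is computed from per-benchmark speedup ratios. The `geomean` helper and the ratios in `hypothetical_ratios` are made up by me for illustration; they are not Phoronix's numbers or their tooling.

```python
import math

def geomean(ratios):
    """Geometric mean of positive per-benchmark speedup ratios."""
    if not ratios:
        raise ValueError("need at least one ratio")
    if any(r <= 0 for r in ratios):
        raise ValueError("ratios must be positive")
    # Average the logs, then exponentiate: the n-th root of the product.
    return math.exp(sum(math.log(r) for r in ratios) / len(ratios))

# Hypothetical speedups (HBM-only time vs. DDR5-only time), NOT real data:
# two benchmarks that barely care about bandwidth, one that loves it.
hypothetical_ratios = [1.02, 1.05, 1.60]

gain_pct = (geomean(hypothetical_ratios) - 1) * 100
print(f"GeoMean advantage: {gain_pct:.1f}%")
```

The point is that a single headline percentage can come from a mix of workloads that barely move and a few that gain a lot, which is why the choice of benchmarks matters so much here.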
More interesting is probably system power consumption, where increased HBM 2e usage correlated with slightly reduced power consumption.
I'm not surprised by that, but I wouldn't have bet on it since increasing bandwidth should decrease idle cycles of the cores (hence, the faster performance). So, the fact that it can improve performance while slightly decreasing power consumption is interesting.
Then again, I'm betting the CPUs spent a lot of time being power-limited. So, all it had to do was reduce power in a small number of cases that weren't bumping the upper limit already. The story could still be different for a desktop CPU with HBM-class memory.