Review Intel 'Emerald Rapids' 5th-Gen Xeon Platinum 8592+ Review: 64 Cores, Tripled L3 Cache and Faster Memory Deliver Impressive AI Performance

Tech0000

Reputable
Thank you for the article!
1. Correction: in the 2nd table on the first page, the Intel Xeon 8462Y+ (SPR) price should be the same as the 8562Y+ (EMR) price: $5,945. Right now (11:22 PST) you have the Intel Xeon 8462Y+ (SPR) priced at $3,583, which is wrong.

2. I would have liked to see a comparison between EMR and SPR for the same model, e.g. 8462Y+ vs 8562Y+, to better understand and isolate the generational core-for-core and model improvement (with mostly everything else being equal). It's hard to draw conclusions when you are comparing different models and core configs with test results all over the place, one winning over the other depending on the test performed.
I suspect that an 8462Y+ vs 8562Y+ comparison would show a very modest net gain (due to a marginally higher all-core turbo) and that the real performance gains are in the top-tier SKUs with triple the L3 cache, support for faster DDR5, etc.

3. As a workstation chip, the single-socket 8558U actually seems to be a pretty good "value" (relative to the other Intel SKUs). $3,720 for a 48-core chip with 260MB of L3 cache is not too bad for a corporate WS. It is not as capable (in terms of accelerators) as the other loaded, pricey high-end SKUs, but for 48 cores at about $77/core it is pretty good. Maybe this chip could be packaged and used as a candidate Xeon W9-3585X or similar chip...
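
For anyone who wants to check the per-core figure, here is a minimal sketch of the arithmetic using only the $3720 price and 48-core count quoted above. The function name `price_per_core` is just an illustrative helper, not anything from the review.

```python
def price_per_core(list_price_usd: float, cores: int) -> float:
    """Return the list price divided evenly across all cores."""
    return list_price_usd / cores


# Figures from the post above: $3720 list price, 48 cores.
per_core = price_per_core(3720, 48)
print(f"${per_core:.2f} per core")  # $77.50 per core
```

The same helper could be pointed at any other SKU's list price and core count to compare value on the same basis.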
 
I'm assuming a chunk of the losses are due to Zen 4 being much more efficient but it would be nice to see some clock graphs (not for every test, but maybe one per category) if possible. If that isn't possible maybe running this same suite on a 13900K/14900K and 7950X to give some context since these are extremely close in threaded performance despite Intel using more power.

Appreciate the immediate look and hope to see some more, or maybe some Xeon W review action when those EMR refreshes come out!
 

tamalero

Distinguished
We put Intel’s fifth-gen Emerald Rapids Xeon Platinum 8592+ through the benchmark paces against AMD's EPYC Genoa to see which server chips come out the winner.

What's with these "you can trust our review made by pros" claims?
The first one I ever saw do that was Gamers Nexus.
Now it seems everyone wants to add those kinds of "claims" to their own reviews.
 

bit_user

Polypheme
Ambassador
Thanks for the review, as always!

Some more potential cons:
  • Still significantly lagging Genoa on energy-efficiency.
  • PCIe deficit in 1P configurations (80 lanes for Emerald Rapids vs. 128 for Genoa). In 2P configurations, Genoa can either run at 128 lanes or match Emerald Rapids' 160 lanes, if you reduce the inter-processor links to just 3.
  • Fewer memory channels (8 vs. 12 for Genoa), though the number of channels per-core is the same.
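
A quick sketch of the arithmetic behind the last two bullets, for anyone who wants to verify it. The core counts are those of the top SKUs in the review (64-core 8592+, 96-core 9654). The 2P lane formula is my own assumption: it supposes each inter-processor link uses one x16 group on each socket and that the default is 4 links, which is consistent with the 128/160 figures above but is not stated in the review.

```python
# Figures from the list above; core counts are for the 8592+ and the EPYC 9654.
emr = {"cores": 64, "mem_channels": 8, "pcie_lanes_1p": 80}
genoa = {"cores": 96, "mem_channels": 12, "pcie_lanes_1p": 128}


def channels_per_core(cpu: dict) -> float:
    """Memory channels available per core."""
    return cpu["mem_channels"] / cpu["cores"]


def lanes_2p(lanes_per_socket: int, links: int, lanes_per_link: int = 16) -> int:
    """Usable PCIe lanes in a 2P system.

    Assumption: each inter-processor link takes one x16 group from each socket.
    """
    return 2 * (lanes_per_socket - links * lanes_per_link)


print(channels_per_core(emr), channels_per_core(genoa))  # 0.125 0.125
print(lanes_2p(128, links=4), lanes_2p(128, links=3))    # 128 160
```

So the channel count per core really is identical, and dropping one link is what frees up the extra 32 lanes under that assumption.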

The 96-core EPYC Genoa 9654 surprisingly falls to the bottom of the chart in all three of the TensorFlow workloads, implying that its incredible array of chiplets might not offer the best latency and scalability for this type of model.

I did see a few such inversions in Phoronix' review, but fewer and way less severe. This should be investigated. I recommend asking AMD about it, @PaulAlcorn. It almost looks to me like you might've had a CPU heatsink poorly mounted, forgotten to replace the fan shroud, or something like that. It's way worse than anything you saw in your original Genoa review, where we basically only saw inversions in stuff that didn't parallelize too well.
In this review, it almost seems like the EPYC 9554 is outperforming the 9654 more often than not!
 

bit_user

Polypheme
Ambassador
BTW, I find it a little weird that they still don't have a monolithic version that's just one of the XCC tiles, even as just a stepping stone, before you get down to the range of the regular MCC version.

 

bit_user

Polypheme
Ambassador
2. I would have liked to see a comparison between EMR and SPR for the same model, e.g. 8462Y+ vs 8562Y+, to better understand and isolate the generational core-for-core and model improvement (with mostly everything else being equal). It's hard to draw conclusions when you are comparing different models and core configs with test results all over the place, one winning over the other depending on the test performed.
I suspect that an 8462Y+ vs 8562Y+ comparison would show a very modest net gain (due to a marginally higher all-core turbo) and that the real performance gains are in the top-tier SKUs with triple the L3 cache, support for faster DDR5, etc.
I'd imagine the issue is that they can only test the review samples they're sent by Intel.

Phoronix tested a limited number of benchmarks with different DDR5 speeds. Seems like the faster DDR5 wasn't a huge win, but sadly none of the AI benchmarks were included. Those should've skewed the geomean a bit higher.

If that isn't possible maybe running this same suite on a 13900K/14900K and 7950X to give some context since these are extremely close in threaded performance despite Intel using more power.
To make the results more applicable, I'd suggest the E-cores should be disabled.
 
BTW, I find it a little weird that they still don't have a monolithic version that's just one of the XCC tiles, even as just a stepping stone, before you get down to the range of the regular MCC version.
What does the XCC offer in Xeon Scalable that MCC doesn't? I was trying to think of something but the specs of all the SKUs seem so random for EMR I couldn't figure out what you'd be referring to.
To make the results more applicable, I'd suggest the E-cores should be disabled.
That would remove the entire point I was getting at of using the desktop parts as a comparison. The 13900K/14900K consistently go back and forth with the 7950X in MT performance at stock settings in standard CPU benchmarks despite the extra power consumption on the Intel side. Though with the IPC between RPL/Zen 4 so close maybe disabled E-cores + 1 CCD disabled would make for a good comparison as then it would be just 8 P-cores vs 8 Zen 4 cores. I haven't seen any such comparison though so this is just a wild guess.
 

bit_user

Polypheme
Ambassador
What does the XCC offer in Xeon Scalable that MCC doesn't?
I just meant that perhaps they could get more mileage out of their chiplet usage. Like, maybe there are some XCC tiles with a defect in the EMIB section, so just put those on a substrate by themselves and sell it as 32C or less.

That would remove the entire point I was getting at of using the desktop parts as a comparison.
Okay, well if you don't exclude the E-cores, then I don't see how those tests would be relevant to these server CPUs.

with the IPC between RPL/Zen 4 so close maybe disabled E-cores + 1 CCD disabled would make for a good comparison as then it would be just 8 P-cores vs 8 Zen 4 cores. I haven't seen any such comparison though so this is just a wild guess.
Heh, you might just get your chance! The new Xeon E-series 2400 have their E-cores disabled (sounds ironic, eh?). So, if anyone benchmarks a Xeon E-2488 against a Ryzen 7700X, then it'd be exactly what you're talking about.

Annoyingly (for me), the new Xeon E 2400 also have their GPUs disabled. Otherwise, I might've been interested. I guess they could still announce G-versions, later.
 
I just meant that perhaps they could get more mileage out of their chiplet usage. Like, maybe there are some XCC tiles with a defect in the EMIB section, so just put those on a substrate by themselves and sell it as 32C or less.
Ah yeah I get what you mean, but they'd be limited to 4 memory channels and half the PCIe lanes as well. I would love to know what happens in that circumstance though... like do they have to toss the whole thing?
Okay, well if you don't exclude the E-cores, then I don't see how those tests would be relevant to these server CPUs.
Well, like I said originally, it's more to give a known-quantity comparison than to get a direct reflection. What I mean is that if there were a test where Genoa beat EMR, but the desktop CPUs were closer/equal, you could extrapolate that the server CPU differences were more likely due to efficiency than architecture. It would definitely be much better for this comparison, though, if you had a P-core-only setup that matched a Zen 4 setup in performance.
Heh, you might just get your chance! The new Xeon E-series 2400 have their E-cores disabled (sounds ironic, eh?). So, if anyone benchmarks a Xeon E-2488 against a Ryzen 7700X, then it'd be exactly what you're talking about.
Yeah that would be the ideal comparison. I'd love to see a die shot to see if they're using ones without E-cores.
Annoyingly (for me), the new Xeon E 2400 also have their GPUs disabled. Otherwise, I might've been interested. I guess they could still announce G-versions, later.
Yeah I was surprised there were so many SKUs listed but none with an IGP. In the past they've always launched at least a few with graphics. Another reason why I'd love to see a die shot.
 

bit_user

Polypheme
Ambassador
Yeah that would be the ideal comparison. I'd love to see a die shot to see if they're using ones without E-cores.
It's 1000% just the desktop S die, with E-cores and iGPU disabled. All of the other specs line up exactly. There would be no reason to make a custom die, especially since a lot of the people who would've bought one (mainly for ECC) probably already have a mainstream Gen 12 or 13 CPU, by now. I'm really surprised they even bothered to release these.

The only thing I'm not sure about is if it's based on Alder Lake or Raptor Lake, but a troubling sign is that it lists support for only DDR5-4800. However, the max turbo is 5.6 GHz, which better aligns with Raptor Lake. I guess L2 cache will be the giveaway, but sadly ark.intel.com doesn't list L2 specs for the Xeon E-2488.
 
It's 1000% just the desktop S die, with E-cores and iGPU disabled.
I forgot to check AVX512 which would have told me that.
I'm really surprised they even bothered to release these.
There are only two things I can think of:
  1. There's something weird with software licensing and even with E-cores disabled it's a problem (I'm not familiar enough with per-core software licensing to know how likely this is).
  2. They have enough good P-core dies with faulty E-cores/IGP that this made sense as a way to maximize revenue.
 

bit_user

Polypheme
Ambassador
I forgot to check AVX512 which would have told me that.
Yeah, I noticed that too. Pity they disabled the E-cores and still couldn't do the right thing and give us AVX-512. That makes these the only Xeon-branded CPUs, in 2 generations, to have no AVX-512. Way to spread even more market confusion, Intel.

I had thought if Intel would release Xeon E's with no E-cores, they would relent and enable AVX-512. I was focusing mainly on the H0 die, which doesn't even have E-cores, in the first place.

The only reasons I can see for not doing so are that maybe the desktop dies have bugs in their AVX-512, or else maybe Intel decided they wanted to preserve AVX-512 as a differentiator for their higher-end Xeons (i.e. classical market segmentation ; )

There are only two things I can think of:
  1. There's something weird with software licensing and even with E-cores disabled it's a problem (I'm not familiar enough with per-core software licensing to know how likely this is).
  2. They have enough good P-core dies with faulty E-cores/IGP that this made sense as a way to maximize revenue.
3. Some OEMs complained about the lack of Xeon E, because they have customers who insist on Xeon and won't accept either the Xeon E 2300 series (Rocket Lake) or the expensive new Xeon W models.