News Intel Raptor Lake Specs Allegedly Exposed: Up To 24 Cores, 32 Threads

rluker5

Distinguished
Jun 23, 2014
651
388
19,260
I just don't quite understand why you would need 16 E-cores for more performance, especially when the E is supposed to mean efficient...
I'm totally excited to get them. I will use parts of them sometimes when I want to bolster my p-cores' serial performance with extra cache, and other parts of them when I want more parallel compute capacity. I'll probably use the cache the most, but hey, maybe I'll just enable a set of 4 and Process Lasso them to the background sometimes.

At least they will help more in games than an extra chiplet on the AMD side, and if you want the parallel compute that the extra chiplet is good for, then 16 e-cores could help with that.

Really, I find e-cores more useful than secondary-chiplet cores, but most of the heavy CPU use on my PC is gaming. I know it is a bit of a pipe dream, but it would be great if the p-cores could use BIOS-idled e-cores' L2 as additional cache. That would really be getting the most out of the silicon.
 
You don't, but that's how the benchmark wars are won. Have to release something bigger and better every year to give people a reason to upgrade.
Well, Ryzen has shown that there are enough idiots out there who will pay really good money for big multithread numbers even if they don't need them, so why should Intel lose out on that?!
...Competition?...yay?! ...
 

bit_user

Polypheme
Ambassador
I just don't quite understand why you would need 16 E-cores for more performance, especially when the E is supposed to mean efficient...
In Alder Lake, they offer about 60% of the performance of a single thread running on a P-core, while using less than 20% of the power and about 25% of the area. So, they really are a more power-efficient and area-efficient way to scale performance. It's not a scam.
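
As a quick back-of-the-envelope check of those ratios (taking the 60% / 25% / 20% figures at face value; the exact numbers vary by workload), four E-cores fit in roughly the area of one P-core:

```latex
% throughput and power of 4 E-cores vs. 1 P-core, using the quoted ratios
\text{area: } 4 \times 0.25 \approx 1 \text{ P-core}, \qquad
\text{throughput: } 4 \times 0.6 = 2.4\times, \qquad
\text{power: } 4 \times 0.2 = 0.8\times
```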

The only real down-side the E-cores have is that the OS needs to schedule them effectively. Other than that, they're a pure win for highly-threaded workloads. And for lightly-threaded workloads, you can use all P-cores.

I know E-cores aren't popular among enthusiasts, but I think they get a bad & mostly unjustified rap. I think it was a bold move by Intel and they deserve some credit for taking the risk.
 
  • Like
Reactions: TJ Hooker

bit_user

Polypheme
Ambassador
You don't, but that's how the benchmark wars are won.
I'm rather sick of people being told they don't need more cores. Anyone who compiles software can almost certainly use more cores/threads. Phoronix regularly tests LLVM and Linux kernel compilation, both of which scale well even up to 256 threads.

At my job, a full build of our codebase routinely takes me the better part of an hour. I can't wait until I can upgrade to an i9-12900 (we're an Intel shop, so 5950X isn't an option for me).
 
  • Like
Reactions: TJ Hooker

Kamen Rider Blade

Distinguished
Dec 2, 2013
1,280
812
20,060
In Alder Lake, they offer about 60% of the performance of a single thread running on a P-core, while using less than 20% of the power and about 25% of the area. So, they really are a more power-efficient and area-efficient way to scale performance. It's not a scam.

The only real down-side the E-cores have is that the OS needs to schedule them effectively. Other than that, they're a pure win for highly-threaded workloads. And for lightly-threaded workloads, you can use all P-cores.

I know E-cores aren't popular among enthusiasts, but I think they get a bad & mostly unjustified rap. I think it was a bold move by Intel and they deserve some credit for taking the risk.
I still think that for desktop CPUs, the P-Cores & E-Cores should be split into separate SKUs.
No more hybrid nonsense on desktop.

I can easily imagine a world where one CCD/Tile of either 10x P-Cores or 32x E-Cores would rule for the workload you care about.
The upcoming Meteor Lake will have 12x P-Cores or 40x E-Cores per CCD/Tile.

That could be GAME changing for those of us who run highly multi-threaded workloads.

And with FishHawk Falls rumored to be the HEDT replacement, I can see a monolithic 24x P-Cores or 96x E-Cores on one die being something very special.
 

Kamen Rider Blade

Distinguished
Dec 2, 2013
1,280
812
20,060
I'm rather sick of people being told they don't need more cores. Anyone who compiles software can almost certainly use more cores/threads. Phoronix regularly tests LLVM and Linux kernel compilation, both of which scale well even up to 256 threads.

At my job, a full build of our codebase routinely takes me the better part of an hour. I can't wait until I can upgrade to an i9-12900 (we're an Intel shop, so 5950X isn't an option for me).
Maybe you guys should start becoming an AMD shop ^_-
 

bit_user

Polypheme
Ambassador
No more hybrid nonsense on desktop.
Why? I don't understand this irrational fear of E-cores. What evidence do you have that they're hurting anything?

Did you know there's this thing called Hyper-Threading, where you get an even worse performance penalty than running a thread on an E-core? And yet somehow operating systems have managed to schedule threads to cores in a way that minimized its impact for like 13 years since Intel re-introduced it?
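
For what it's worth, the SMT sibling map the scheduler works from isn't hidden from anyone; on Linux you can read it straight out of sysfs. A quick sketch (assuming the standard /sys/devices/system/cpu layout):

```cpp
// List which logical CPUs are hyper-thread siblings of each other, using the
// topology files the kernel exports. The scheduler builds its SMT-aware
// domains from this same hardware enumeration.
#include <fstream>
#include <iostream>
#include <string>

int main() {
    for (int cpu = 0; ; ++cpu) {
        std::ifstream f("/sys/devices/system/cpu/cpu" + std::to_string(cpu) +
                        "/topology/thread_siblings_list");
        if (!f) break;                      // no more CPUs
        std::string siblings;
        std::getline(f, siblings);          // e.g. "0-1" for an HT pair
        std::cout << "cpu" << cpu << " siblings: " << siblings << '\n';
    }
    return 0;
}
```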

I can easily imagine a world where one CCD/Tile of either 10x P-Cores or 32x E-Cores would rule for the workload you care about.
Intel already said they're not pursuing hybrid CPUs for server, so I don't see this happening. Maybe when their CPUs get broken up into a bunch of tiles, they might segregate the core types just so they don't need quite so many different dies for each market segment.

But they're still going to offer E-cores on desktop, because it's the most power-efficient and area-efficient way to scale performance, period.
 

Kamen Rider Blade

Distinguished
Dec 2, 2013
1,280
812
20,060
Why? I don't understand this irrational fear of E-cores. What evidence do you have that they're hurting anything?
I like the E-cores, I just don't want them to be "Hybridized" with my P-cores.

I want ALL "E-cores" or ALL "P-cores" on separate CPUs for desktop.

Did you know there's this thing called Hyper-Threading, where you get an even worse performance penalty than running a thread on an E-core? And yet somehow operating systems have managed to schedule threads to cores in a way that minimized its impact for like 13 years since Intel re-introduced it?
Intel needs to improve their Hyper-Threading game; IBM is able to run up to 8 threads per core with SMT8 on their latest POWER10.
What has Intel been doing this entire time?

We need dynamic Hyper-Threading, where one core can split the work across anywhere from 1 to 8 threads, depending on the workload size and how much L1 I/D $ it takes.

At least that would justify going to an ultra-wide cache setup with 192 KiB of L1 I/D $ like Apple's M1 SoC did.

You could dynamically partition the L1 I/D $ in real time based on the workload and split up the registers as needed.

Especially given how wide AVX-512 has become, along with all the extra registers needed to support it.

Intel already said they're not pursuing hybrid CPUs for server, so I don't see this happening. Maybe when their CPUs get broken up into a bunch of tiles, they might segregate the core types just so they don't need quite so many different dies for each market segment.

But they're still going to offer E-cores on desktop, because it's the most power-efficient and area-efficient way to scale performance, period.
I have no issues with E-cores on desktop as long as they're ALL "E-cores".

I just don't see the value in a Hybrid solution that was clearly designed for Mobile to be forced onto DeskTop users.

Leave the Hybrid P+E cores for Mobile users ONLY.

DeskTop Users should get Monolithic P-cores or E-cores.

Never shall the two meet and suffer a Ringbus performance penalty because code touched the E-cores.

Never shall we waste die area for a Thread Director because we need a "Hybrid big.LITTLE solution".
 
Ugh! E-cores actually HURT gaming performance, not help it (at least with 12th gen chips).
Intel's performance cores are top notch though. Maybe they did something new with the E-cores...? I guess we'll see.
 

bit_user

Polypheme
Ambassador
I like the E-cores, I just don't want them to be "Hybridized" with my P-cores.

I want ALL "E-cores" or ALL "P-cores" on separate CPUs for desktop.
But why?? Do you not trust the OS to schedule them properly? That's the only explanation I can imagine, but I haven't seen evidence to support it.

Intel needs to improve their Hyper-Threading game; IBM is able to run up to 8 threads per core with SMT8 on their latest POWER10.
Why do you assume more is better? Extra threads come at a nonzero cost, so Intel and AMD are striking a balance between that cost and the benefits they provide in terms of latency hiding and core utilization.

I trust they've done the math and concluded 2-per-core is best. Maybe 4 really would be cost-effective, but they're just gun-shy from all these side-channel attacks that seem to argue for disabling it altogether. In that case, maybe they don't want to burn silicon doubling-down on a feature some big customers might simply opt not to use.

BTW, there's a very good solution to the SMT side-channel attack problem, which Google has already implemented in Linux: only allowing threads from the same process to share a core. If you do that, there's no cause for concern. It could also be implemented at the hypervisor level, which might already be the case for all I know.
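
For reference, here's roughly what that looks like from userspace with the Linux core-scheduling interface (a minimal sketch, assuming a 5.14+ kernel built with CONFIG_SCHED_CORE; the constants follow the prctl(2) man page in case older headers don't define them):

```cpp
// Opt the whole process into a core-scheduling "cookie": only threads carrying
// the same cookie may run together on the two hyper-threads of one core, so a
// stranger's thread can't snoop this process via an SMT side channel.
#include <sys/prctl.h>
#include <cstdio>

#ifndef PR_SCHED_CORE
#define PR_SCHED_CORE                    62
#define PR_SCHED_CORE_CREATE             1   // create a new cookie
#define PR_SCHED_CORE_SCOPE_THREAD_GROUP 1   // apply it to the whole process
#endif

int main() {
    // pid 0 == the calling process; threads spawned afterwards inherit the cookie.
    if (prctl(PR_SCHED_CORE, PR_SCHED_CORE_CREATE, 0,
              PR_SCHED_CORE_SCOPE_THREAD_GROUP, 0) != 0) {
        std::perror("prctl(PR_SCHED_CORE)");
        return 1;
    }
    std::puts("core-scheduling cookie set; only our own threads share SMT siblings");
    // ... spawn worker threads as usual; the kernel enforces the pairing rule ...
    return 0;
}
```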

Especially given how wide AVX-512 has become, along with all the extra registers needed to support it.
You can't skimp on ISA registers. The only registers you can partition are the shadow registers, and I think we don't know how many they have for vector.

FWIW, Zen3 has 64 FP scheduler slots, which I think means you're not going to have more than 64 FP shadow registers. Since AVX-512 already increased the FP ISA registers to 32, we're not talking about a lot to partition between many threads.

I just don't see the value in a Hybrid solution that was clearly designed for Mobile to be forced onto DeskTop users.
Huh? Do you think CPUs have an infinite silicon and power budget? Because that's the only universe in which E-cores don't also make sense for desktop.

As I said before, they give you 60% of P-core performance at 25% of the area and about 20% of the power. How is that not relevant for desktops, when we live in a world where all-core clocks are lower than peak single-thread clocks? Not to mention die area -> cost.

Never shall the two meet and suffer a Ringbus performance penalty because code touched the E-cores.
The ring bus doesn't really have anything to do with E-cores. It's outlived its usefulness anyhow; IMO, they shouldn't use it in any CPU with more than 8 bus stops. Mesh scales way better.

Never shall we waste die area for a Thread Director because we need a "Hybrid big.LITTLE solution".
I'm sure the Thread Director is like 0.1% of die area. The kinds of stats it keeps are mostly those already tracked by core performance counters.

Furthermore, you incorrectly assume the Thread Director is only useful for P- vs. E- scheduling, but I assure you it's quite valuable for Hyper Threading, because vector workloads scale so poorly with > 1 thread per core. It'll be telling if Intel leaves it in Sapphire Rapids.
 
  • Like
Reactions: TJ Hooker

bit_user

Polypheme
Ambassador
Ugh! E-cores actually HURT gaming performance, not help it (at least with 12th gen chips).
How often? In just a handful of games or the vast majority?

From my perspective, these are all software problems and fixable (in time). Legacy software makes assumptions that all cores are symmetrical and tends to do brain-dead things like spawning a thread per core. That's stone-age stuff. What we ought to have is thread-less async dispatch, and just let the OS manage how many threads to allocate to the app and which cores to run them on.
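
As a rough illustration of that style (a sketch only; the per-item lambda is a stand-in for real work): instead of counting cores and spawning that many threads, the app describes the parallelism and lets the runtime and OS decide how many threads to use and where they land.

```cpp
// Task-style parallelism: no explicit "one std::thread per core" loop.
// The execution policy hands scheduling to the library and OS, which can map
// the work onto whatever mix of P-cores, E-cores, and SMT siblings exists.
#include <algorithm>
#include <execution>
#include <numeric>
#include <vector>
#include <cstdio>

int main() {
    std::vector<double> items(1'000'000);
    std::iota(items.begin(), items.end(), 0.0);

    std::for_each(std::execution::par_unseq, items.begin(), items.end(),
                  [](double& x) { x = x * 0.5 + 1.0; });  // stand-in per-item work

    std::printf("first=%.1f last=%.1f\n", items.front(), items.back());
    return 0;
}
```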
 
  • Like
Reactions: KyaraM
Why? I don't understand this irrational fear of E-cores. What evidence do you have that they're hurting anything?
For starters -

View: https://youtu.be/wqB96Bdsb4M

There are dozens of other videos/reviews that back up these findings.
The problem is that the E-cores DON'T currently help enough even with background tasks. People get better performance just completely disabling the E-cores in BIOS and running with just P-cores.
 

bit_user

Polypheme
Ambassador
There's dozens of other videos/reviews that back up these findings.
Got anything else? I don't Youtube.

The problem is that the E-cores DON'T currently help enough even with background tasks.
Meh. I think the "background task" thing is mostly a red herring. Not that it doesn't make sense, but people tend to overestimate how much compute time is actually spent in background tasks.

People get better performance just completely disabling the E-cores in BIOS and running with just P-cores.
Do they disable hyperthreading, too? Because if not, then that tells me it's not a problem with the E-cores, but rather with app software that's making assumptions it really shouldn't.
 
  • Like
Reactions: KyaraM
Got anything else? I don't Youtube.
Sorry, I'm not going to Google for you. ;)

Meh. I think the "background task" thing is mostly a red herring. Not that it doesn't make sense, but people tend to overestimate how much compute time is actually spent in background tasks.
Maybe so.
Hopefully, if games start becoming P and E-core aware, it will get better.

Do they disable hyperthreading, too? Because if not, then that tells me it's not a problem with the E-cores, but rather with software that's making assumptions it really shouldn't.
Not true.
E-cores have an effect on ring bus speeds. Some very technical people (Buildzoid) mention this as part of the reason for the performance loss.
 
  • Like
Reactions: bit_user

rluker5

Distinguished
Jun 23, 2014
651
388
19,260
I like the E-cores, I just don't want them to be "Hybridized" with my P-cores.

I want ALL "E-cores" or ALL "P-cores" on separate CPUs for desktop.


Intel needs to improve their Hyper-Threading game; IBM is able to run up to 8 threads per core with SMT8 on their latest POWER10.
What has Intel been doing this entire time?

We need dynamic Hyper-Threading, where one core can split the work across anywhere from 1 to 8 threads, depending on the workload size and how much L1 I/D $ it takes.

At least that would justify going to an ultra-wide cache setup with 192 KiB of L1 I/D $ like Apple's M1 SoC did.

You could dynamically partition the L1 I/D $ in real time based on the workload and split up the registers as needed.

Especially given how wide AVX-512 has become, along with all the extra registers needed to support it.


I have no issues with E-cores on desktop as long as they're ALL "E-cores".

I just don't see the value in a Hybrid solution that was clearly designed for Mobile to be forced onto DeskTop users.

Leave the Hybrid P+E cores for Mobile users ONLY.

DeskTop Users should get Monolithic P-cores or E-cores.

Never shall the two meet and suffer a Ringbus performance penalty because code touched the E-cores.

Never shall we waste die area for a Thread Director because we need a "Hybrid big.LITTLE solution".
In what use case would a 12 p-core system be faster than an 8p/16e setup, if you were to choose the faster p/e configuration for your workload?
Techspot article 2308 compared core-count and cache increases in gaming (serial workloads) and found that, past 6 cores, the extra cache generally helps more than additional cores. The core scaling is not currently there.
You could start turning off extra p-cores to save power if you could still get the shared-cache benefit, but I imagine you would be like most people: use all of them, set your max stable clocks to the frequency they can hold under stress testing, and just end up with slower cores than the e-core analogue. Does anybody shut down p-cores (even their slow ones) to get higher frequencies and higher performance now? I bet a 12900K using 6 cores at 5.4 GHz beats one using 8 cores at 5.1 GHz in a lot of games, but nobody is going to do that.

For the same reason the mainstream chips were faster than HEDT in gaming until games started using more than 4 cores, around the time Coffee Lake was released. Now games are made for up to 8 cores instead of up to 4.

And I don't think a p-core is 4 times faster than an e-core in parallel workloads.

If having more p-cores will be slower than spending that die area on e-cores, at least until the time comes that Golden Cove cores are obsolete, then why have them?
 
  • Like
Reactions: KyaraM and bit_user

rluker5

Distinguished
Jun 23, 2014
651
388
19,260
Ugh! E-cores actually HURT gaming performance, not help it (at least with 12th gen chips).
Intel's performance cores are top notch though. Maybe they did something new with the E-cores...? I guess we'll see.
You could always disable the e-cores and use their cache. Also, I've found that when I use Process Lasso to make p-core sets for games and leave the e-cores active, I can get a steady 60 fps at a lower frequency and power draw than with just the p-cores active. AC Odyssey and SOTTR had notable improvements. I think the DRM goes to the e-cores in this situation, which isn't a problem in most games, but I still didn't lose performance having e-cores enabled when the game I ran was set to p-cores. Max fps at unlimited power may still suffer from the slower ring, so there might still be some penalty there. And buying Process Lasso and clicking boxes does take time and money, but where improvements can be made by custom core scheduling, they can also be made by automated scheduling, so hopefully more scheduler improvements are coming.
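
For anyone curious what pinning a game to the p-cores boils down to, here's a bare-bones Win32 sketch of the hard-affinity version of that idea (Process Lasso's CPU sets are a softer variant). The helper is hypothetical, and the 0xFFFF mask assumes the usual 12900K enumeration of logical CPUs 0-15 as the eight hyper-threaded p-cores and 16-23 as the e-cores, so check your own topology before reusing it:

```cpp
// Pin an already-running process to the P-core logical processors only,
// leaving the E-cores free for background work.
#include <windows.h>
#include <cstdio>
#include <cstdlib>

int main(int argc, char** argv) {
    if (argc < 2) { std::puts("usage: pin_pcores <pid>"); return 1; }
    DWORD pid = static_cast<DWORD>(std::strtoul(argv[1], nullptr, 10));

    HANDLE proc = OpenProcess(PROCESS_SET_INFORMATION | PROCESS_QUERY_INFORMATION,
                              FALSE, pid);
    if (!proc) { std::printf("OpenProcess failed: %lu\n", GetLastError()); return 1; }

    // Bits 0-15 -> P-core logical processors only (assumed layout, see above).
    const DWORD_PTR pCoreMask = 0xFFFF;
    if (!SetProcessAffinityMask(proc, pCoreMask))
        std::printf("SetProcessAffinityMask failed: %lu\n", GetLastError());
    else
        std::puts("Process pinned to P-cores; E-cores left for background tasks.");

    CloseHandle(proc);
    return 0;
}
```
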
BTW, there's a very good solution to the SMT side-channel attack problem, which Google has already implemented in Linux: only allowing threads from the same process to share a core. If you do that, there's no cause for concern. It could also be implemented at the hypervisor level, which might already be the case for all I know.
That might hurt performance in a virtualized system with lots of users, but I can't imagine it hurting real world performance much otherwise. Google coming up with the side channel fix again.
 
  • Like
Reactions: KyaraM and bit_user
You could always disable the e-cores and use their cache.
True. One thing I am curious to see is whether Intel has made any improvement with the E-cores and their usage. I suspect not, but we'll see.
I mean, if they just made the E-cores neutral (with regards to gaming performance) instead of a detriment, that would at least be something.

Maybe we'll see the return of an Extreme Edition Intel chip with only P-cores (active) and super high clocks. :LOL:
 
True. One thing I am curious to see is whether Intel has made any improvement with the E-cores and their usage. I suspect not, but we'll see.
I mean, if they just made the E-cores neutral (with regards to gaming performance) instead of a detriment, that would at least be something.

Maybe we'll see the return of an Extreme Edition Intel chip with only P-cores (active) and super high clocks. :LOL:
They already are pretty much neutral: almost all games see a difference of less than 1% either way, a handful are up to 5%, and the two games that see a "big" difference of about 10% are running at 200 and 500+ FPS respectively.
Overall it's -1%,
and that's at 1080p with a 6900 XT, so 1080p is already a stretch; anybody using their system in a more realistic way will see even less of a difference than that -1%.
The bars might look huge, but the biggest one is 11.4%.
 

bit_user

Polypheme
Ambassador
Not true.
E-cores have an effect on ring bus speeds. Some very technical people (Buildzoid) mention this as part of the reason for the performance loss.
Okay, even if disabling them "fixes" the problem by clocking their ring bus stop higher, that's just a design quirk with this iteration. It's neither fundamental to the mere fact of having E-cores, nor even fundamental to having them on a ring bus. Intel just needs to move the async clock domain so the ring bus doesn't get dragged down with them.

Hopefully, it's something they've fixed in Raptor Lake.
 
  • Like
Reactions: KyaraM and alceryes