News Intel's Nova Lake CPU reportedly has up to 52 cores — Coyote Cove P-cores and Arctic Wolf E-cores onboard

If they're going to spread CPU cores across multiple chiplets, it'd be interesting to have chiplets with all P-cores and others with all E-cores. Then, Intel could easily mix & match to serve more markets.
It sounds like the plan is to connect up to two 8P + 16 E chiplets, and they'll disable some cores if needed. I think the mixing of these two types on one chiplet could help with switching tasks between the two core types without incurring a latency penalty. And that remains true for any SKU with only one of these 8+16 chiplets, since all the cores sit on that single die. With the exception of the LP E-cores...
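(For reference, that's also where the headline's 52 would come from, if the rumored configuration holds: 2 × (8P + 16E) = 48 cores across the compute tiles, plus 4 LP E-cores on the SoC tile, for 52 in total.)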

Bringing LP E-cores to desktop and increasing them from 2 (Meteor Lake) to 4 could help idle power efficiency immensely if they pull it off: great for office PCs that idle 90% of the time or only do very light work, video playback, etc. If we're looking at cores with something like Golden Cove IPC at 2 GHz, four of them could do surprisingly well on the SoC tile.
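To make the "light work on the E-cores" idea concrete, here's a minimal Linux sketch, purely illustrative and not anything Intel has announced, of how an app can already steer a background thread onto the E-cores using the hybrid-topology sysfs node the kernel exposes on Alder Lake and later. Whether (and how) LP E-cores would be enumerated separately is an open question.

```cpp
// Minimal sketch: steer a light background thread onto the E-cores of a
// hybrid Intel CPU under Linux. Assumes the kernel exposes the hybrid
// topology via /sys/devices/cpu_atom/cpus (present on Alder Lake and later).
// Build: g++ -O2 -pthread ecore_affinity.cpp
#include <cstdio>
#include <fstream>
#include <iostream>
#include <sstream>
#include <string>
#include <thread>
#include <vector>
#include <pthread.h>
#include <sched.h>

// Parse a kernel cpulist string like "16-23" or "16-23,28" into CPU indices.
static std::vector<int> parse_cpulist(const std::string& list) {
    std::vector<int> cpus;
    std::stringstream ss(list);
    std::string range;
    while (std::getline(ss, range, ',')) {
        int lo = 0, hi = 0;
        if (std::sscanf(range.c_str(), "%d-%d", &lo, &hi) == 2) {
            for (int c = lo; c <= hi; ++c) cpus.push_back(c);
        } else if (std::sscanf(range.c_str(), "%d", &lo) == 1) {
            cpus.push_back(lo);
        }
    }
    return cpus;
}

int main() {
    std::ifstream f("/sys/devices/cpu_atom/cpus");  // E-core list on hybrid parts
    std::string list;
    if (!f || !std::getline(f, list)) {
        std::cerr << "No hybrid topology exposed; nothing to do.\n";
        return 0;
    }
    std::vector<int> ecores = parse_cpulist(list);

    std::thread background([] {
        // ... light periodic work: indexing, logging, media playback helpers, etc.
    });

    // Restrict the background thread to the E-cores only, so the P-cores can
    // stay parked (or busy with the foreground task).
    cpu_set_t set;
    CPU_ZERO(&set);
    for (int c : ecores) CPU_SET(c, &set);
    pthread_setaffinity_np(background.native_handle(), sizeof(set), &set);

    background.join();
    return 0;
}
```

In the idle/office scenario above, the scheduler and Thread Director should be doing this placement on their own; the sketch just shows the mechanics.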
 
If they're going to spread CPU cores across multiple chiplets, it'd be interesting to have chiplets with all P-cores and others with all E-cores. Then, Intel could easily mix & match to serve more markets.
Yeah, this is a question Ian Cutress raised after the ADL launch. At this point I'm not really sure why they don't separate out the P and E-cores. They somewhat doubled down on the existing monolithic nature with ARL, given the E-core clusters are mixed in with the P-cores. I'm sure there are advantages in power-constrained parts, so maybe that's dictating client strategy.
 
Another piece of fake news. Arrow Lake was also supposedly going to scale up to 32 cores. This whole strategy of P- and E-cores on one die is a failure, as are all their pseudo-chiplets. They still have double-digit numbers of dies, thoughtlessly sinking money into the design and production of electronic waste. They're losing money on this, and that won't change until they understand that with chiplets, the simpler the strategy, the better. Take AMD as an example and do the same. Stubborn pride and lack of competence won't let them change. The era of their miserable ring bus has passed, and clumsy interposers with terrible latency aren't what client machines need. But here we are.
 
It sounds like the plan is to connect up to two 8P + 16 E chiplets, and they'll disable some cores if needed.
Right, that's how it sounds.

I think the mixing of these two types on one chiplet could help with switching tasks between the two core types without incurring a latency penalty.
But they still do. Arrow Lake's P-core to E-core latency is basically as bad as the latency between two different Zen 5 chiplets.

https://substack-post-media.s3.amazonaws.com/public/images/55af2adb-22d6-49b5-b5ce-c93df3f0518b_996x515.png


https://substack-post-media.s3.amazonaws.com/public/images/efaf8904-f4b4-4335-8d5d-184569d27e25_1001x517.png


Source: https://chipsandcheese.com/p/examining-intels-arrow-lake-at-the
Also, core-to-core latency isn't about context switching. It's about communication between active threads running on two different cores, and that's something schedulers already take into account when they place the threads of a single process.

In practice, core-to-core latency has been shown to be of minimal real-world importance. The main reason people look at it is to try and glean details about a CPU's interconnect. What's much more important is each core's cache & memory latency.
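For context on what those heatmaps actually measure: a core-to-core latency test is just two threads pinned to two specific cores, bouncing ownership of a single cache line back and forth. A rough sketch of that ping-pong (Linux; core IDs 0 and 16 are placeholders, pick one P-core and one E-core on your own machine):

```cpp
// Rough core-to-core latency ping-pong, in the spirit of the Chips and Cheese
// measurements. Core IDs 0 and 16 are placeholders for a P-core and an E-core.
// Build: g++ -O2 -pthread ping_pong.cpp
#include <atomic>
#include <chrono>
#include <iostream>
#include <thread>
#include <pthread.h>
#include <sched.h>

static void pin_to_cpu(int cpu) {
    cpu_set_t set;
    CPU_ZERO(&set);
    CPU_SET(cpu, &set);
    pthread_setaffinity_np(pthread_self(), sizeof(set), &set);
}

int main() {
    constexpr int kIters = 1'000'000;
    std::atomic<int> flag{0};          // the cache line being bounced

    std::thread responder([&] {
        pin_to_cpu(16);                // placeholder: an E-core
        for (int i = 0; i < kIters; ++i) {
            while (flag.load(std::memory_order_acquire) != 1) { }
            flag.store(0, std::memory_order_release);
        }
    });

    pin_to_cpu(0);                     // placeholder: a P-core
    auto t0 = std::chrono::steady_clock::now();
    for (int i = 0; i < kIters; ++i) {
        flag.store(1, std::memory_order_release);
        while (flag.load(std::memory_order_acquire) != 0) { }
    }
    auto t1 = std::chrono::steady_clock::now();
    responder.join();

    double ns = std::chrono::duration<double, std::nano>(t1 - t0).count();
    // Each iteration is a full round trip, so one-way latency is roughly half.
    std::cout << "round trip: " << ns / kIters << " ns\n";
    return 0;
}
```

Point the same harness at two P-cores, or two E-cores sharing a cluster, and you get the lower numbers in the diagonal blocks of those charts.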

Bringing LP E-cores to desktop and increasing them from 2 (Meteor Lake) to 4 could help idle power efficiency immensely if they pull it off: great for office PCs that idle 90% of the time or only do very light work, video playback, etc. If we're looking at cores with something like Golden Cove IPC at 2 GHz, four of them could do surprisingly well on the SoC tile.
If desktops want to decrease idle power, the first thing they should do is implement dynamic scaling of memory frequency. I think we'll never see P-cores being used for LP duty, since the SoC tile tends to be on an older node, which would make them both less efficient and bigger area hogs. They also can't use an older core without holding back the ISA support of the newer ones.
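That last point about ISA support is worth unpacking, because it's the reason a mixed package can't just drop in an older or cut-down core: most software probes CPUID once and assumes the answer holds on every core it might later be migrated to. A small x86, GCC/Clang-flavored sketch of that pattern (the sum_avx2/sum_scalar names are just made up for illustration):

```cpp
// Why every core in a package has to advertise the same ISA level: software
// typically probes CPUID once and assumes the result holds on whatever core
// the scheduler later migrates it to. x86 + GCC/Clang sketch.
#include <iostream>

// Hypothetical AVX2 code path, compiled with AVX2 enabled for this function only.
__attribute__((target("avx2")))
static void sum_avx2(const float* a, const float* b, float* out, int n) {
    for (int i = 0; i < n; ++i) out[i] = a[i] + b[i];  // free to auto-vectorize
}

static void sum_scalar(const float* a, const float* b, float* out, int n) {
    for (int i = 0; i < n; ++i) out[i] = a[i] + b[i];
}

int main() {
    float a[8] = {1, 2, 3, 4, 5, 6, 7, 8};
    float b[8] = {8, 7, 6, 5, 4, 3, 2, 1};
    float out[8];

    __builtin_cpu_init();  // harmless here; only required before constructors run

    // This probe runs once, on whichever core the thread happens to start on.
    // If some LP core elsewhere in the package lacked AVX2, a later migration
    // onto it would fault inside sum_avx2, which is why the whole package has
    // to present a homogeneous ISA.
    if (__builtin_cpu_supports("avx2"))
        sum_avx2(a, b, out, 8);
    else
        sum_scalar(a, b, out, 8);

    std::cout << out[0] + out[7] << "\n";  // prints 18
    return 0;
}
```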

Not to mention how Skymont supposedly has IPC similar to Raptor Cove, showing that the newer E-cores are just fine for LP duty.
 
They somewhat doubled down on the existing monolithic nature with ARL, given the E-core clusters are mixed in with the P-cores. I'm sure there are advantages in power-constrained parts, so maybe that's dictating client strategy.
Interleaving them was an interesting move. I wonder which actually has the higher power density. Probably the P-cores, but I think the E-cores are actually quite power-dense too. Even if power density is similar, interleaving could make sense in cases where you have a lightly-threaded job hitting only the P-cores. Less likely, but an analogous thing could happen with a background job of some sort that's hitting only the E-cores.
 
I wonder how feasible it would be to go beyond what Intel did with ARL. I very much like the fact that it can drop down to the base JEDEC profile (assuming the feature is enabled) and then clock back up to the chosen XMP profile as needed.
GPUs and phones scale memory frequency way more than that.
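As a point of comparison, discrete Radeon cards on Linux expose their memory-clock DVFS states through the amdgpu driver's pp_dpm_mclk sysfs node (card0 is just an assumed path here), and the spread between the lowest and highest states is far wider than JEDEC-vs-XMP. A trivial sketch to dump them:

```cpp
// Dump the memory-clock DVFS states of an AMD GPU on Linux. Assumes the
// amdgpu driver and that the card is card0; the currently selected state is
// marked with '*'. Purely illustrative of how far GPU memory clocks scale.
#include <fstream>
#include <iostream>
#include <string>

int main() {
    std::ifstream f("/sys/class/drm/card0/device/pp_dpm_mclk");
    if (!f) {
        std::cerr << "No amdgpu pp_dpm_mclk node found.\n";
        return 1;
    }
    std::string line;
    while (std::getline(f, line))
        std::cout << line << "\n";   // e.g. "0: 96Mhz", "1: 456Mhz", "3: 1000Mhz *"
    return 0;
}
```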

I'd guess the limit is probably something intrinsic to how DDR5 memory works. Perhaps we'll see even greater memory frequency scaling in DDR6.

I wonder if laptops already drop frequencies like this, with LPDDR5. Given phones' use of the LP memory standards, I think it should be supported on that end.
 
If they're going to spread CPU cores across multiple chiplets, it'd be interesting to have chiplets with all P-cores and others with all E-cores. Then, Intel could easily mix & match to serve more markets.
Taking this further... P and E don't necessarily need the same process node, so separating those tiles gives them considerable options to deal with whatever the future holds.
 
Taking this further... P and E don't necessarily need the same process node,
They don't, but this doesn't necessarily work out like you might expect. For instance, AMD used a TSMC N4 node for the Zen 5 CCD and an N3 node for the Zen 5C CCD (source: https://www.anandtech.com/show/2146...bile-strix-point-with-rdna-35-igpu-xdna-2-npu ).

I'm not even going to speculate on the factors leading to that decision (I can imagine a lot). I just wanted to point out that being the E-core doesn't necessarily mean it should get the worse node.
 
They don't, but this doesn't necessarily work out like you might expect. For instance, AMD used a TSMC N4 node for the Zen 5 CCD and an N3 node for the Zen 5C CCD (source: https://www.anandtech.com/show/2146...bile-strix-point-with-rdna-35-igpu-xdna-2-npu ).

I'm not even going to speculate on the factors leading to that decision (I can imagine a lot). I just wanted to point out that being the E-core doesn't necessarily mean it should get the worse node.
Funny... I was thinking E would get the more advanced node. It seems LP tends to be the pipe cleaner, with HP coming later. And even though SRAM scaling has improved recently, better doesn't mean the trend of SRAM being hard to shrink has reversed, so carrying less SRAM gives the E-cores an advantage there as well. Historically, Intel nodes hit higher frequencies whereas TSMC nodes have better efficiency, but really it's impossible to predict from where I'm sitting... just that more tiles equals more flexibility, at the cost of latency.