News Intel Panther Lake and Wildcat Lake CPU specs break cover — leak suggests up to 16 CPU cores and 180 total AI TOPS

They are really not messing around with that iGPU.
LPDDR5-8533 equates to theoretical bandwidth of 136.5 GB/s. Do you think it'll be enough?
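For anyone who wants to check that figure, here is a minimal sketch of the arithmetic. The 128-bit bus width is an assumption on my part (it is the width that reproduces 136.5 GB/s), and the function name is only illustrative.

```python
def peak_bandwidth_gbs(transfer_rate_mts: float, bus_width_bits: int) -> float:
    """Theoretical peak bandwidth in decimal GB/s.

    transfer_rate_mts: memory speed in mega-transfers per second (e.g. 8533)
    bus_width_bits:    total bus width across all channels (assumed, see below)
    """
    bytes_per_transfer = bus_width_bits / 8
    return transfer_rate_mts * 1e6 * bytes_per_transfer / 1e9


# Assumption: 128-bit total bus width, not stated in the post itself.
lpddr5_8533 = peak_bandwidth_gbs(8533, 128)
print(f"LPDDR5-8533 on a 128-bit bus: {lpddr5_8533:.1f} GB/s")
```

This is a theoretical ceiling only; sustained bandwidth in real workloads will be lower.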

For graphics, I guess maybe, if they lean into that "side cache", introduced in Lunar Lake. But, it would need to scale like AMD did with Infinity Cache.

Considering its 180 TOPS, that works out to 1318 ops per byte of memory bandwidth. I think it won't be enough for that.

For comparison, consider that an RTX 4060 Ti has 177 TOPS and 288 GB/s of memory bandwidth. That's about twice the bandwidth per op.
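The ops-per-byte comparison can be reproduced with a few lines. This is just a sketch using the TOPS and bandwidth numbers quoted in this post; the function and variable names are made up for illustration.

```python
def ops_per_byte(tops: float, bandwidth_gbs: float) -> float:
    """Peak operations available per byte of memory bandwidth."""
    return (tops * 1e12) / (bandwidth_gbs * 1e9)


panther_lake = ops_per_byte(180, 136.5)   # leaked total TOPS, LPDDR5-8533
rtx_4060_ti = ops_per_byte(177, 288)      # figures quoted in the post

print(f"Panther Lake: {panther_lake:.1f} ops/byte")   # about 1318.7
print(f"RTX 4060 Ti:  {rtx_4060_ti:.1f} ops/byte")    # about 614.6
print(f"Bandwidth per op advantage for the 4060 Ti: {panther_lake / rtx_4060_ti:.2f}x")
```

The ratio comes out a little above 2, which matches "about twice the bandwidth per op".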
 
Wildcat Lake could be an amazing successor to Alder Lake-N, but a lot of things have to go right.

Can it really be a cheap, high-volume, low-margin product on Intel 18A? This node could be significantly more expensive than Intel 7 (10ESF). On the plus side, it should bring great power efficiency improvements.

Does it even use chiplets? The article seems to assume it will, but it could be small enough to be monolithic instead. L3 cache is another big question mark since there is only 6 MiB on the predecessor and LP E-cores in Meteor/Lunar Lake have not been able to access L3 cache. If the 4x LP E-cores in Wildcat Lake can't access L3, they will fall flat in some workloads.

What will the dominant SKU have? Alder Lake-N is an 8-core, 32-EU die, but most of the good deals were for 4-core parts with fewer EUs, like the N100. If a P-core gets disabled on the majority of chips, that's a big downgrade for some applications. On the other hand, if 2 E-cores get disabled, the 2+2 config will be fine.

I'm not sure if it would really use an Xe3 iGPU, but with the full 2 Xe cores, I guess it could be twice as fast as the full Alder Lake-N 32 EUs. But some SKUs may disable that to half, bringing no major improvement. If it is double or better, some games that ran at 720p on N100/N200 could go to 1080p instead.

DDR4 memory support is dropped, which will raise prices since the cheapest Alder Lake-N systems use DDR4 to save a few bucks (literally). It will be disappointing if it's limited to single-channel memory support again, but raising the maximum speed from LP/DDR5-4800 is helpful.

At the end of the day, the big deal with Wildcat Lake is the inclusion of Cougar Cove P-cores. Single-threaded performance could jump forward by nearly a decade compared to Alder Lake-N. This signals that Intel's hybrid/heterogeneous CPUs are ready to do what was long promised (tried and failed with "Lakefield"): deliver single-threaded performance about as good as any other current chip, within a low power envelope, and at a low(?) price. The P-core clocks will be lower than Panther Lake-H, Nova Lake, etc., but it's still a big step up for the Atom line.

It's fun to see a "1-tile" 18 TOPS NPU. It may not meet the "Copilot+" performance threshold, but it won't take up much die area and can be used by any application that supports it.
 
My only real complaint about Alder Lake N is that I had the single channel DDR4 version, which I think was bottlenecking me from doing everything I wanted in a 24/7 OBS stream. Otherwise it's surprisingly very usable for your mom's Facebook and email.
I'm still waiting for the day N100/N95 finally replaces the awful, obsolete Celeron processors still used at the lowest end. Maybe whateverlake will push it down the stack.
 
Considering its 180 TOPS, that works out to 1318 ops per byte of memory bandwidth. I think it won't be enough for that.
GPU is supposedly 120 of the 180, CPU and NPU make up the rest.
My only real complaint about Alder Lake N is that I had the single channel DDR4 version, which I think was bottlenecking me from doing everything I wanted in a 24/7 OBS stream.
It seemed like ADL-N really needs the extra bandwidth from DDR5 since it was limited to single channel at base JEDEC speeds.
 
I'm not sure I really buy into WCL being a successor to ADL-N as that seems like a really odd choice which shouldn't be better than an 8 E-core part. It would certainly be a good replacement for the 1P/4E ADL SKUs though. I'm very curious about the lack of a 6 P-core PTL design, because the last time a couple of cores disappeared it was MTL and likely due to Intel 4.
 
My only real complaint about Alder Lake N is that I had the single channel DDR4 version, which I think was bottlenecking me from doing everything I wanted in a 24/7 OBS stream. Otherwise it's surprisingly very usable for your mom's Facebook and email.
I'm still waiting for the day N100/N95 finally replaces the awful, obsolete Celeron processors still used at the lowest end. Maybe whateverlake will push it down the stack.
All Alder Lake-N products are limited to single channel, there is no dual-channel support. Which is why memory speeds became more important than usual since it could actually be less bandwidth than previous generation dual-channel Jasper Lake.
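A rough sketch of that bandwidth comparison follows. The specific speed grades are assumptions for illustration (DDR4-3200 and DDR5-4800 for Alder Lake-N, dual-channel DDR4-2933 for Jasper Lake), as is the 64-bit width per channel; check the spec sheets before relying on them.

```python
def channel_bandwidth_gbs(transfer_rate_mts: float, channels: int,
                          channel_width_bits: int = 64) -> float:
    """Theoretical peak bandwidth in decimal GB/s for N memory channels."""
    return transfer_rate_mts * 1e6 * channels * channel_width_bits / 8 / 1e9


# Assumed configurations, not taken from the thread:
adl_n_ddr4 = channel_bandwidth_gbs(3200, channels=1)    # single-channel DDR4-3200
adl_n_ddr5 = channel_bandwidth_gbs(4800, channels=1)    # single-channel DDR5-4800
jasper_lake = channel_bandwidth_gbs(2933, channels=2)   # dual-channel DDR4-2933

for name, gbs in [("ADL-N DDR4-3200 x1", adl_n_ddr4),
                  ("ADL-N DDR5-4800 x1", adl_n_ddr5),
                  ("Jasper Lake DDR4-2933 x2", jasper_lake)]:
    print(f"{name:26s} {gbs:5.1f} GB/s")
```

Under these assumptions, even the DDR5 configuration lands below the older dual-channel setup, and the DDR4 one is well below it.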

From where I'm standing, Alder Lake-N has cleared away a lot of that. You can probably find sub-$100 products somewhere (particularly used market) with N4000, N6000, or something. But I even see N100, N200, and similar in the cheap laptops at Walmart.

I'm not sure I really buy into WCL being a successor to ADL-N as that seems like a really odd choice which shouldn't be better than an 8 E-core part. It would certainly be a good replacement for the 1P/4E ADL SKUs though. I'm very curious about the lack of a 6 P-core PTL design, because the last time a couple of cores disappeared it was MTL and likely due to Intel 4.
ADL-N is 8x Gracemont cores, but most commonly sold with half disabled as quad-core N95/N97/N100/N200 and now the Twin Lake refresh N150.

Skymont delivered massive integer/floating-point IPC uplifts over Gracemont/Crestmont. Darkmont probably adds a small (<5%) improvement over that. So already, four of those cores could match six of Gracemont.

Then you come to the Cougar Cove P-cores, which will absolutely smoke Gracemont. They could easily be more than twice as fast as a Gracemont core, delivering great single-threaded performance which is what users tend to notice.

There are some potential red flags:
- Will the true low-end systems get the full 2+4 die? Or cut down to 2+2, or even 1+4 or 1+3?
- The Darkmont cores are described as LP E-cores. Will they have access to L3 cache, or become much slower than expected from having no L3 and all L2 cache misses going straight to DRAM?

If it's just the core count regression you're worried about, it's unlikely the top 6-core Wildcat Lake would be slower in multi-threading than the i3-N305. Even if the LP E-cores are unexpectedly terrible, it should outperform the N305 in most scenarios, and definitely the N200.

Graphics should also be a lot faster, and it supports 33-42% faster DDR5/LPDDR5X memory with the DDR4 options taken off the table, which will help if they stick with single channel again. (This is all assuming the leak is correct)
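To show where a 33-42% range can come from, here is a small sketch relative to the 4800 MT/s ceiling of Alder Lake-N. The 6400 and 6800 MT/s speed grades are my assumption for illustration (they are the standard grades closest to those percentages), not numbers stated in this thread.

```python
BASELINE_MTS = 4800  # LP/DDR5-4800, the Alder Lake-N maximum mentioned above


def uplift_percent(baseline_mts: float, new_mts: float) -> float:
    """Percentage increase in transfer rate (and peak bandwidth per channel)."""
    return (new_mts / baseline_mts - 1) * 100


def speed_for_uplift(baseline_mts: float, percent: float) -> float:
    """Transfer rate implied by a given percentage uplift."""
    return baseline_mts * (1 + percent / 100)


# Working backwards from the percentages in the post:
print(speed_for_uplift(BASELINE_MTS, 33))   # roughly 6384 MT/s
print(speed_for_uplift(BASELINE_MTS, 42))   # roughly 6816 MT/s

# Assumed speed grades that would produce that range:
for mts in (6400, 6800):
    print(f"{mts} MT/s -> +{uplift_percent(BASELINE_MTS, mts):.1f}%")
```

With the channel count unchanged, peak bandwidth scales by the same percentage as the transfer rate.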
 
If it's just the core count regression you're worried about, it's unlikely the top 6-core Wildcat Lake would be slower in multi-threading than the i3-N305. Even if the LP E-cores are unexpectedly terrible, it should outperform the N305 in most scenarios, and definitely the N200.

Graphics should also be a lot faster, and it supports 33-42% faster DDR5/LPDDR5X memory with the DDR4 options taken off the table, which will help if they stick with single channel again. (This is all assuming the leak is correct)
Oh I'm not talking about versus ADL-N using Gracemont but rather an 8 core Skymont/Darkmont part. I have a hard time believing a 2P/4LPE could really compete with such a configuration and it would also likely have a larger die size. The 4LPE could be entirely due to reusing SoC tiles (I'm guessing they're not following the two tile design of LNL due to the wildly different GPU/PCIe configurations in this leak) across the stack.

I'm also not sure how much faster 2 Xe cores (who knows which Xe arch these will be) is going to be than the existing 32 EU in the top ADL-N SKUs though faster memory will probably mean more than anything else here.
- The Darkmont cores are described as LP E-cores. Will they have access to L3 cache, or become much slower than expected from having no L3 and all L2 cache misses going straight to DRAM?
Given that all of Intel's designs with LPE haven't had L3 access even when on the same tile it seems likely this won't either. LNL had a separate cache which somewhat helped to alleviate this, but that doesn't seem like a great design choice for what inherently should be a low cost part.
 
Oh I'm not talking about versus ADL-N using Gracemont but rather an 8 core Skymont/Darkmont part. I have a hard time believing a 2P/4LPE could really compete with such a configuration and it would also likely have a larger die size. The 4LPE could be entirely due to reusing SoC tiles (I'm guessing they're not following the two tile design of LNL due to the wildly different GPU/PCIe configurations in this leak) across the stack.
I have no idea what to believe about the tiles. It seems like the complete package would be small enough that they could decide to make it monolithic, but maybe not.

My position is that the P-cores are more beneficial to most users. For the low end, once you've gotten to quad-core (I'm using an i3-10105), what you really want is single threads to be fast since most code from here to eternity is single-threaded. So 2+4 is fine, great even. I see this as the realization of what Intel's Lakefield experiment failed to deliver.

I didn't see much of the 1+4 Alder Lake-U chips on the market. Which is too bad, because I think 1+4 would deliver a generally superior experience to the i3-N305, not to mention the dual-channel memory. A quick Passmark comparison shows the 5-core Pentium Gold 8505 only losing a little in MT to the N305, but it's overall a better chip from around +50% ST. I think that's a similar scenario to your hypothetical 8-core Darkmont config vs. 2+4 Wildcat Lake (ignoring the LP E-core caveat).

I'm also not sure how much faster 2 Xe cores (who knows which Xe arch these will be) is going to be than the existing 32 EU in the top ADL-N SKUs though faster memory will probably mean more than anything else here.
ADL-N is using similar Xe-LP introduced way back in 11th gen Tiger Lake. Xe2-LPG ("Battlemage") is said to be around +50%. If they did go with Xe3-LPG ("Celestial"), that's potentially another uplift. 2x Xe2/3 cores should be 16 Vector Engines which is comparable to the old terminology of 32 EUs, so I would call it the same size. Hopefully it has what it needs for fancy AI upscaling, and they don't slash the iGPU by half in the best-selling SKUs.
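The "same size" claim can be sanity-checked by counting SIMD lanes. This is only a sketch: the 8 Vector Engines per Xe core follows from the 2 cores = 16 engines figure above, while the lane widths (16 per Vector Engine, 8 per older EU) are assumptions stated here for illustration.

```python
XVE_PER_XE_CORE = 8   # from the post: 2 Xe cores -> 16 Vector Engines
LANES_PER_XVE = 16    # assumption: 16-wide Vector Engines in Xe2/Xe3
LANES_PER_EU = 8      # assumption: 8-wide EUs in Xe-LP


def xe_core_lanes(xe_cores: int) -> int:
    """Total SIMD lanes for a GPU described in Xe cores."""
    return xe_cores * XVE_PER_XE_CORE * LANES_PER_XVE


def eu_lanes(eus: int) -> int:
    """Total SIMD lanes for a GPU described in the older EU terminology."""
    return eus * LANES_PER_EU


print(xe_core_lanes(2), eu_lanes(32))  # both 256 under these assumptions
```

Equal lane counts say nothing about clocks or per-lane efficiency, which is where any architectural uplift would come from.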

Given that all of Intel's designs with LPE haven't had L3 access even when on the same tile it seems likely this won't either. LNL had a separate cache which somewhat helped to alleviate this, but that doesn't seem like a great design choice for what inherently should be a low cost part.
Yeah, this could be the Achilles' heel of Darkmont "LP E-cores".
 
I still don't know what an average workstation is supposed to do with 16 cores, or even eight, even with half e-cores.
I've still never seen an official policy on what's up with e-cores and p-cores. I can guess, but would like to confirm they even have a theory and aren't simply winging it.
 
I have no idea what to believe about the tiles. It seems like the complete package would be small enough that they could decide to make it monolithic, but maybe not.
I don't really see monolithic making a comeback for anything using a leading edge node. If WCL ends up being an Intel 3 part then perhaps it will just be a one off die as it should be rather small. LPE leads me to believe this isn't the case though because outside of artificial segmentation I can't think of a good reason to not just toss everything on the ring for a monolithic die.
I didn't see much of the 1+4 Alder Lake-U chips on the market. Which is too bad, because I think 1+4 would deliver a generally superior experience to the i3-N305, not to mention the dual-channel memory. A quick Passmark comparison shows the 5-core Pentium Gold 8505 only losing a little in MT to the N305, but it's overall a better chip from around +50% ST. I think that's a similar scenario to your hypothetical 8-core Darkmont config vs. 2+4 Wildcat Lake (ignoring the LP E-core caveat).
Yeah they didn't get used a whole lot from what I saw which somewhat surprises me because it's a decent part. I kicked around getting something based on the 8505 when doing my updated router, but I didn't really need the extra compute performance. If I was going to put together a multipurpose transcoding/HTPC type system it's absolutely what I would have gone with.

Skymont is much closer in performance to Lion Cove than Gracemont is to Golden Cove. Boost clock speeds are much closer as well which would also help close the gap. ST would still undoubtedly be faster but I'm not sure 2P/4E would be able to keep up MT when limited to the same power (keep in mind 8505 base power matches N305 maximum) and 2P/4LPE certainly wouldn't be.
 
It's fun how Intel is throwing lots of weird CPUs on the market (to say the least; they are also too dependent on software adjustments) in a futile attempt to compete with AMD's 'classic' CPUs nowadays.
 
GPU is supposedly 120 of the 180, CPU and NPU make up the rest.
That's irrelevant to my point. AI models are big, which means they must be streamed in from memory. If you don't add memory bandwidth corresponding to the compute performance, then these TOPS numbers are only good for putting on slides. Because some of the weights are used in operations like convolutions, it doesn't have to be a 1:1 ratio of compute:bandwidth, but there's definitely a point where adding any more compute is just a waste of die area, without more bandwidth.
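One way to picture this is a simple roofline-style bound: usable throughput is the smaller of peak compute and bandwidth times the number of ops performed per byte fetched. The sketch below uses the 180 TOPS and 136.5 GB/s figures from earlier in the thread; the example ops-per-byte values are hypothetical and only there to show the shape of the curve.

```python
def attainable_tops(peak_tops: float, bandwidth_gbs: float,
                    ops_per_byte: float) -> float:
    """Roofline-style upper bound on throughput, in TOPS."""
    memory_bound_tops = bandwidth_gbs * 1e9 * ops_per_byte / 1e12
    return min(peak_tops, memory_bound_tops)


def balance_point(peak_tops: float, bandwidth_gbs: float) -> float:
    """Ops per byte needed before peak compute becomes the limit."""
    return peak_tops * 1e12 / (bandwidth_gbs * 1e9)


PEAK_TOPS = 180.0
BANDWIDTH_GBS = 136.5

# Hypothetical reuse levels: weights used once (~2 ops/byte) up to heavy reuse.
for intensity in (2, 50, 500, 2000):
    bound = attainable_tops(PEAK_TOPS, BANDWIDTH_GBS, intensity)
    print(f"{intensity:5d} ops/byte -> at most {bound:7.2f} TOPS")

print(f"Balance point: {balance_point(PEAK_TOPS, BANDWIDTH_GBS):.0f} ops/byte")
```

Anything with less data reuse than the balance point is limited by memory, so extra TOPS would go unused in that regime.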

It seemed like ADL-N really needs the extra bandwidth from DDR5 since it was limited to single channel at base JEDEC speeds.
I think that's mainly for its GPU. CPU-only workloads should scale pretty well to 8 threads on that amount of bandwidth. It's well less than half the compute power of Alder Lake-S, yet it still has half the memory bandwidth.
 
My position is that the P-cores are more beneficial to most users. For the low end, once you've gotten to quad-core (I'm using an i3-10105), what you really want is single threads to be fast since most code from here to eternity is single-threaded.
Why do you think the E-cores got so much faster in Skymont, though? I think a lot of that is about making them less painful for lightly-threaded use (as well as reducing the hazard they pose for thread-scheduling).

P-cores can't really stretch their legs without burning a lot of power, which is a non-starter for cheap laptops that have small batteries and even smaller cooling systems. P-cores become really inefficient, not far outside the performance envelope of the E-cores.

I didn't see much of the 1+4 Alder Lake-U chips on the market. Which is too bad, because I think 1+4 would deliver a generally superior experience to the i3-N305,
The 1P + 4E configuration should be cut down from a 2P + 8E die, which is probably about double the size of Alder Lake-N. The core area and iGPU are both doubled (64 EU), as is the memory controller (dual channel) and PCIe (20 lanes). Given that it's made on a mature node, there aren't likely to be many they can't sell with both P-cores enabled.

A quick Passmark comparison shows the 5-core Pentium Gold 8505 only losing a little in MT to the N305, but it's overall a better chip from around +50% ST.
Its PL2 is 55W! It's not a fair fight, since the N305 has a PL2 of only 35W.

ADL-N is using similar Xe-LP introduced way back in 11th gen Tiger Lake. Xe2-LPG ("Battlemage") is said to be around +50%. If they did go with Xe3-LPG ("Celestial"), that's potentially another uplift. 2x Xe2/3 cores should be 16 Vector Engines which is comparable to the old terminology of 32 EUs, so I would call it the same size. Hopefully it has what it needs for fancy AI upscaling, and they don't slash the iGPU by half in the best-selling SKUs.
Don't expect any miracles from the iGPU, so long as they don't go back to dual-channel memory. I don't imagine them putting a big "side cache" in their entry-level dies, so it'll probably scale only about as well as there's memory bandwidth to feed it.

BTW, one benefit of an NPU might be to offload image scaling from the iGPU, so that it can just concentrate on rendering. We'll see.
 
I still don't know what an average workstation is supposed to do with 16 cores, or even eight, even with half e-cores.
Compiling and CPU-based rendering are two examples that come to mind. The Rust language is gaining in popularity, and compiling Rust eats CPU cycles for breakfast. Even legacy languages, like C++, are gaining new language features that make them more expensive to compile.

Now, you said "workstation". If you'd instead said something like "family PC", then I'd agree there's currently not much point in going beyond 8 cores, if that.

I've still never seen an official policy on what's up with e-cores and p-cores. I can guess, but would like to confirm they even have a theory and aren't simply winging it.
Hybrid CPUs do several things better than homogeneous ones. They have a natural advantage in MT perf/area and perf/W. This also means better perf/$. Furthermore, by relying on E-cores for multithreaded grunt, Intel could then afford to make the P-cores even bigger and offer even better single-threaded performance. So, the theory sounds brilliant, on paper.

I've done some performance modeling to project the performance across the entire range of power levels, had Intel instead made Alder Lake with 10P cores (which would've cost roughly the same as their 8P + 8E architecture). The gap was so large that I even computed the 12P scenario, just for fun.

[Attached image: bRJ9olV.png]

The main caveat is that it assumes near perfect scheduling, and therein lies the problem. If thread scheduling were easier and better executed, this approach would be "pure win" and they'd have probably done it even sooner. Sadly, I still don't see the right things happening, to make scheduling on hybrid CPUs work better. Instead, Intel (and AMD) seem to be trying to make their CPUs less hybrid-like.
 
It's fun how Intel is throwing lots of weird CPUs on the market (to say the least; they are also too dependent on software adjustments) in a futile attempt to compete with AMD's 'classic' CPUs nowadays.
AMD has been doing hybrid as well, for about the past 2 years. It's just in their "APU" models with the large iGPU and monolithic die, not their chiplet-based models.
 
Hoping to see an NPU-free WCL
Why?

P-cores can't really stretch their legs without burning a lot of power, which is a non-starter for cheap laptops that have small batteries and even smaller cooling systems. P-cores become really inefficient, not far outside the performance envelope of the E-cores.
As you pointed out, the "15W" E-core only chips like the N305 can use over double that.

We're only talking about 2 P-cores here, and it's allegedly on the 18A node, which is several new nodes better than "Intel 7" and even has backside power delivery. I predict they'll be able to clock them high (even if that's less than other SKUs, e.g. 4.5 GHz) for at least short periods of time for the benefit of ST loads, and it will be great.

Heterogeneity can bring great ST performance to Atom, which is what will make these low-end systems feel more responsive. Few users/segments actually need more MT, even if heterogeneity is good for that too.

Don't expect any miracles from the iGPU, so long as they don't go back to dual-channel memory. I don't imagine them putting a big "side cache" in their entry-level dies, so it'll probably scale only about as well as there's memory bandwidth to feed it.

BTW, one benefit of an NPU might be to offload image scaling from the iGPU, so that it can just concentrate on rendering. We'll see.
We'd all love to see dual-channel come back, but we should expect the worst. So all we can count on is a 33-42% potential bandwidth increase over DDR5-4800, and no more slower DDR4 dragging the chips down.

I think Microsoft was among the first to try upscaling with the NPU, but it was designed for the Snapdragon X Elite's 45 TOPS, not the 18 TOPS 1-tile Intel NPU 5.0:
https://videocardz.com/newz/microso...er-resolution-system-level-upscaling-for-npus

We'll have to see how these weaker NPUs get leveraged, if at all. But Intel putting them in even its cheapest products is how you eventually solve the chicken and egg problem.
 
As you pointed out, the "15W" E-core only chips like the N305 can use over double that.

We're only talking about 2 P-cores here, and it's allegedly on the 18A node, which is several new nodes better than "Intel 7" and even has backside power delivery. I predict they'll be able to clock them high (even if that's less than other SKUs, e.g. 4.5 GHz) for at least short periods of time for the benefit of ST loads, and it will be great.
You mention node-shrinks, but Intel uses those additional transistors and that power budget to extract additional performance. That's why Arrow Lake uses almost as much power as Raptor Lake, in spite of the same core count and fewer threads.

Also, in the sort of devices which use these, the thermal mass of the heatsink is quite low. You don't have heatsinks on the order of 500g to 1kg, like in performance desktops. So, you can't get away with very much/long turbo boosting. Your argument hinges around that, because once it stops boosting, it's not going to be much faster than the E-cores.

Heterogeneity can bring great ST performance to Atom, which is what will make these low-end systems feel more responsive.
I understand the benefits. We're just talking about whether it's pragmatic in area, power, and thermals.

Few users/segments actually need more MT, even if heterogeneity is good for that too.
Web browsers are more multithreaded than you might expect. We've had about 1.5 decades of phones using multi-core CPUs, so that's given devs plenty of time & incentive to make them heavily-threaded.

I think Microsoft was among the first to try upscaling with the NPU, but it was designed for the Snapdragon X Elite's 45 TOPS, not the 18 TOPS 1-tile Intel NPU 5.0:
https://videocardz.com/newz/microso...er-resolution-system-level-upscaling-for-npus
Thanks! Hopefully, they'll offer lower-quality upscaling for weaker NPUs. 18 TOPS is still 36.2k ops per pixel, at 4k @ 60 Hz.
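Here is the per-pixel budget worked out, as a quick sketch. It assumes "4k" means 3840x2160 and that the full 18 TOPS is available to the upscaler, which is a best case.

```python
def ops_per_pixel(tops: float, width: int, height: int, fps: float) -> float:
    """Peak operations available per output pixel per frame."""
    pixels_per_second = width * height * fps
    return tops * 1e12 / pixels_per_second


# Assumption: 4K UHD output (3840x2160) at 60 Hz, using all 18 TOPS.
budget = ops_per_pixel(18, 3840, 2160, 60)
print(f"{budget / 1000:.1f}k ops per pixel")  # about 36.2k
```

The same function makes it easy to see how the budget grows at lower output resolutions or frame rates.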

We'll have to see how these weaker NPUs get leveraged, if at all. But Intel putting them in even its cheapest products is how you eventually solve the chicken and egg problem.
Yeah, it'll be nice if they're ubiquitous. The first GPU I ever used for realtime inferencing was only good for about 8 TOPS.
 
Wildcat Lake sounds very interesting to me. I currently use a home-made NAS based on a Raspi 4; it's been running for 4 years now. I would like to use some Docker containers in the future, which currently isn't going very well. So I was looking at the N100, since boards with that chip are pretty cheap where I live, at 96€. Couple that with 16 - 32 GB RAM and you've got a nice, inexpensive base for a NAS.

My guess is that Wildcat Lake would be a lot more expensive at the start, as the N100 certainly was, but it would make for a good future upgrade, and might even lower N100 prices further. Though I would get the N100 sooner rather than later...
 