News Intel could be prepping 24-core Arrow Lake-H processors for notebooks


Deleted member 2731765

Guest
Intel could be prepping 24-core Arrow Lake-H processors for notebooks.

Intel's Arrow Lake-H or perhaps Arrow Lake-HX processor will likely use fully-featured Arrow Lake-S multi-chiplet silicon, equipped with eight Lion Cove performance cores without Hyper-Threading and 16 Skymont efficiency cores, a large cache, and an integrated GPU.

Nope, wrong. Arrow Lake-H only sports 16 cores, as per the patch.




These entries have been spotted; this is how they stack up. All of the processors have higher base clocks than the previous-gen entries.

Arrow Lake-S: 24 cores, 3.6 GHz base clock.

Arrow Lake-HX: 24 cores, 3.0 GHz base clock.

Arrow Lake-H: 16 cores, 3.5 GHz base clock.


At this point, we do not have any details about the specifications of Arrow Lake-S in BGA packaging, though InstLatX64 notes that the processor's detected frequency is 3.0 GHz without AVX512.

Wrong.

He was referring to the Arrow Lake-HX part/prototype here, with its 3.0 GHz base clock, not the Arrow Lake-S variant.

 
Workstations need more CPU cores, but if they are low-power Skymont or Crestmont cores, is it really worth it?

Intel needs more Lion Cove cores to compete favourably with 16-core Zen 5 SKUs.
 

Deleted member 2731765

Guest
This is the Arrow Lake-S entry, with a 3.6 GHz base clock.

https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_14901/bat-arls-5/boot0.txt

Intel-Arrow-Lake-S-Desktop-CPU.png


https://x.com/InstLatX64/status/1799124303985963355


While most notebooks will be fine to use Arrow Lake-H processors with up to 16 cores, ultra-high-end laptops for gaming and professional applications may require more cores, and Arrow Lake-S BGA with 24 cores will be just what the doctor ordered for them.

That's why the mobile HX part exists, and the patch entries prove it as well. An ARL-HX part sporting 24 cores was spotted, so basically we are talking about the HX mobile variant here.
 
Workstations need more CPU cores, but if they are low-power Skymont or Crestmont cores, is it really worth it?

Intel needs more Lion Cove cores to compete favourably with 16-core Zen 5 SKUs.
16 full cores are expensive; why do you think AMD is starting to use the cheaper c variants?
Once you go above a certain number of cores, the individual power of the cores is irrelevant, since you can't make them all run at single-thread speed anyway (unless you connect them to a kW A/C unit). Having smaller and cheaper cores for high core counts is the only thing that makes sense.
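To put rough numbers on that power-budget point (everything below is made up purely for illustration, not measured): split a fixed package limit across more and more cores and the per-core budget, and with it the sustainable all-core clock, shrinks no matter how fast each core is on its own.

```python
# Illustrative only: how a fixed package power budget squeezes per-core
# power (and therefore all-core clocks) as core count grows.
# The numbers below are invented for the sake of the example.

PACKAGE_POWER_W = 125          # assumed sustained package limit
UNCORE_POWER_W = 20            # assumed fabric/IO/cache overhead

def per_core_budget(cores: int) -> float:
    """Watts left for each core after the uncore share."""
    return (PACKAGE_POWER_W - UNCORE_POWER_W) / cores

for cores in (8, 16, 24, 32):
    print(f"{cores:2d} cores -> ~{per_core_budget(cores):4.1f} W per core")

# Output (with these made-up numbers):
#  8 cores -> ~13.1 W per core
# 16 cores -> ~ 6.6 W per core
# 24 cores -> ~ 4.4 W per core
# 32 cores -> ~ 3.3 W per core
```

At a few watts per core, a small core running near its own ceiling gives away very little to a big core throttled far below its peak, which is the point above.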
 
16 full cores are expensive; why do you think AMD is starting to use the cheaper c variants?
Once you go above a certain number of cores, the individual power of the cores is irrelevant, since you can't make them all run at single-thread speed anyway (unless you connect them to a kW A/C unit). Having smaller and cheaper cores for high core counts is the only thing that makes sense.

Agree to disagree. The Zen 5c cores still perform at least 75% as well as a complete Zen 5 core. The performance of the E-cores is not in the same ballpark.

The multi-core scores will speak for themselves.
 

rluker5

Distinguished
Agree to disagree. The Zen 5c cores still perform at least 75% as well as a complete Zen 5 core. The performance of the E-cores is not in the same ballpark.

The multi-core scores will speak for themselves.
You are using information more than a month old.
Skymont should be pretty close to Zen 5c, not including SMT. They are supposedly close to RPL IPC, and quite good in low-use/low-power scenarios.
 
Agree to disagree. The Zen 5c cores still perform at least 75% as well as a complete Zen 5 core. The performance of the E-cores is not in the same ballpark.

The multi-core scores will speak for themselves.
On desktop they're a liability, but when power-constrained they're close enough for it not to matter. It's not the core design or performance potential holding dense cores back so much as the clock scaling. The 8500G is an example of this: its dense cores can only clock up to 3.7 GHz compared to 5 GHz for the regular cores, a roughly 35% clock-speed gap (5.0 / 3.7 ≈ 1.35), and that's before you touch the architecture side.

As for the Intel side, Skymont should have very good performance, but Intel has only talked about the LNL implementation, where it's not attached to the ring bus.

Also keep in mind AMD is doing 4 Zen 5 / 8 Zen 5c versus Intel's 6P/8E, which makes it a bit less cut and dried performance-wise.
 

mac_angel

Distinguished
I mentioned this before: I think it would be a great idea for someone to do a proper article with full testing on the validity of hyper-threading across the past couple of generations of CPUs, with gaming as well as production workloads, and affected overclocks as a bonus.
 

abufrejoval

Reputable
I mentioned this before: I think it would be a great idea for someone to do a proper article with full testing on the validity of hyper-threading across the past couple of generations of CPUs, with gaming as well as production workloads, and affected overclocks as a bonus.
I'm sure such an article would be big. But much less sure it would help digest the massive amounts of data being generated by an ever increasing number of choices and variations.

In a way, one could argue that Intel has finally decided to cut down on the complexity it actively helped create with chiplet designs by removing hyper-threading from lower-TDP P-cores, because the loss can be fully offset by higher numbers of its now much better E-cores; those could have been given hyper-threading as an alternative, but that would have been far too easily mistaken for just emulating AMD.

There simply are a lot more choices now and unless you're a lucky hyperscaler who can largely control his workloads, you'll be faced with very tough choices as to which variant might be the best compromise for your personal mix of use cases: there simply is no single best any more.

And for gaming there is also a growing gap between a theoretically possible optimum and the need to fit your game to the hardware available. Game engines might get better at trying to do some automatic adjustments to the available resources, but with GPUs and potentially other xPUs added on top, the sheer range and variety of options means only the most popular sizes can ever be designed for.

My take on the current TDP-constrained designs is that everybody is throwing an abundance of CPU cores of different varieties into those SoCs, hoping customers will pay for higher numbers, while the proper selection of cores within TDP or time constraints is left to the user or their applications.

Using all of them at max speeds will only be possible for Geekbench milliseconds or perhaps somewhat longer on some mostly connected devices with ample cooling.
 

abufrejoval

Reputable
On desktop they're a liability, but when power constrained they're close enough for it to not matter.
Any such clear statement unfortunately is no longer always true.

It just depends too much on what you do with your desktop at any given time. And ultimately every desktop is also becoming power constrained.

With 8 non-c and 4/8c potentially at similar die sizes and large overlaps on the wattage curves, having dual mixed CCDs even for Ryzen 9 desktops makes as much sense as AMD offering distinct EPYCs for compact and high-clock variants.

And quite honestly, looking at my 16-core Zen 3 and Zen 4 CPUs, I can't help thinking that some of them might actually do better had they been 4+8c CCDs, because at fully parallel loads the per-core wattage allocations are single digits and clocks are well below 5 GHz.

With next to no core-to-core dependencies or communication (real multi-tasking), 4+8c could already win; with heavy threading and lots of shared data (V-cache territory), perhaps less so.

Too bad transistors can't just reassemble themselves in different configurations; the allocations are fixed.
 
Any such clear statement unfortunately is no longer always true.
Unless you think that AMD is going to give more cores for the same price, it certainly is still true, as dense-core CCDs only come in 16-core configurations. Sixteen full cores are always going to be faster than 16 dense cores unless you're starving them for power, which client configurations do not. I don't see AMD changing things up to lower the core count on dense CCDs, as they have definitely minimized the number of CCD configurations they make.
 

TechyIT223

Prominent
24 E-cores wouldn't be a bad thing on a laptop; they do server chips with all E-cores.

I would love to see that core configuration implemented on laptop chips, but unfortunately they won't do this on consumer parts.

I'm just wondering whether cramming 24 efficiency cores into the package might help lower the power consumption and TDP of the laptop model to a much greater extent than using a 16 P-core + 8 E-core configuration on the same laptop.
 

TechyIT223

Prominent
Based on base clock speed, this is how they stack up:

Alder Lake-S (12900K) - 3.20 GHz
Raptor Lake-S (13900K) - 3.00 GHz
Raptor Lake-S R (14900K)- 3.20 GHz
Arrow Lake-S (285K?) - 3.60 GHz
 

abufrejoval

Reputable
Unless you think that AMD is going to give more cores for the same price it certainly is still true as dense core CCDs only come in 16 core configurations. 16 full cores are always going to be faster than 16 dense cores are unless you're starving them for power which client configurations do not. I don't see AMD changing up to lower the core count on dense CCDs as they have definitely minimized the amount of CCD configurations they make.
All fully enabled configurations need to limit power, client parts and EPYCs alike, the EPYCs just typically at a somewhat higher total wattage.

My 16-core Ryzens (5000 and 7000) do double-digit wattage per core at lower core counts, but once nearly all cores light up, they have no choice but to drop to lower frequencies, typically below 10 watts per core. And that opens a window of opportunity for compact cores, because at the remaining wattage per core, frequencies on the large cores are already down near the compact cores' max clocks.
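As a ballpark illustration of that squeeze (the PPT values are the well-known AM4/AM5 socket limits, but the uncore share is just my assumption, so treat the per-core numbers as rough):

```python
# Rough per-core power estimate for 16-core Ryzen parts at full load.
# PPT values are the published AM4/AM5 socket limits; the uncore/SoC
# share is an assumed round number, so treat the results as ballpark only.

parts = {
    "Ryzen 9 5950X (PPT 142 W)": 142,
    "Ryzen 9 7950X (PPT 230 W)": 230,
}
ASSUMED_UNCORE_W = 35   # assumed IOD/fabric/memory-controller share
CORES = 16

for name, ppt in parts.items():
    per_core = (ppt - ASSUMED_UNCORE_W) / CORES
    print(f"{name}: ~{per_core:.1f} W per core with all {CORES} cores loaded")
```

Single-digit to low-double-digit watts per core is exactly the territory being described: well below what the big cores need to hold their rated boost clocks.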

It's a somewhat similar situation with the mix of V-cache and non-V-cache CCDs on my 7950X3D, which doesn't really suffer from its V-cache-imposed clock/watt limits once cores 9-16 light up and every core already has to make do with less than 10 watts.

A hypothetical 4L/8C CCD might be able to make do with the same surface area as the 8L or 16C variants, and thus two could fit within the surface constraints of the AM4/AM5 sockets. No idea if the client IOD could manage such a mix, but both the 4L/8C and the 16C CCD variants could have their niche in µ-servers/workstations or "client" devices, even if AMD may not be that hot on diluting a "server market" with such economy hardware.
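Just as arithmetic on the die-area side (the per-core sizes below are the approximate Zen 4 vs. Zen 4c figures that have been reported from AMD's Bergamo material, used purely as stand-ins; real CCDs would also differ in L3 and fabric):

```python
# Back-of-the-envelope CCD core-area comparison using approximate,
# publicly reported Zen 4 / Zen 4c core+L2 sizes (rough stand-ins only).

BIG_CORE_MM2 = 3.84      # approx. Zen 4 core + L2
COMPACT_CORE_MM2 = 2.48  # approx. Zen 4c core + L2

layouts = {
    "8 big (8L)":            8 * BIG_CORE_MM2,
    "16 compact (16C)":     16 * COMPACT_CORE_MM2,
    "4 big + 8 compact":     4 * BIG_CORE_MM2 + 8 * COMPACT_CORE_MM2,
}

for name, area in layouts.items():
    print(f"{name:>20s}: ~{area:5.1f} mm^2 of core area (excl. L3 and fabric)")
```

On core area alone the hybrid lands between the two existing layouts, so whether two of them would actually fit under the AM4/AM5 constraints hinges on the L3 and fabric arrangement, which this arithmetic ignores.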

In terms of production cost, I guess 8L, 4L/8C and 16C CCDs would be nearly equal, since that is mostly die-size driven, apart from the fact that the hybrid CCD would evidently be an extra SKU (since it's not an APU). But once they have the production scale, that might not even be an issue.

Another interesting academic debate would be whether dual 4L/8C CCDs or a mix of 8L and 16C CCDs would be better for performance, while the latter saves AMD the extra CCD variant.

I'm not saying it's very likely AMD would do that, unless Intel somehow makes 24 cores appear extremely sexy.
I'm mostly saying they are in a pretty good position to be able to do that, should the need or opportunity arise.
 
My 16-core Ryzens (5000 and 7000) do double-digit wattage per core at lower core counts, but once nearly all cores light up, they have no choice but to drop to lower frequencies, typically below 10 watts per core. And that opens a window of opportunity for compact cores, because at the remaining wattage per core, frequencies on the large cores are already down near the compact cores' max clocks.
Base clock on every desktop Zen 4 CPU at worst matches the highest boost clock of Zen 4c. The X-series CPUs' base clocks are 500-800 MHz higher. There is no opening here unless you're increasing core counts.
A hypothetical 4L/8C CCD might be able to make do with the same surface area as the 8L or 16C variants, and thus two could fit within the surface constraints of the AM4/AM5 sockets. No idea if the client IOD could manage such a mix, but both the 4L/8C and the 16C CCD variants could have their niche in µ-servers/workstations or "client" devices, even if AMD may not be that hot on diluting a "server market" with such economy hardware.
AMD has two different cache configurations so a hybrid like that isn't possible without sacrificing cache on the full cores. This is why you only see hybrid offerings in the APU/mobile space where they already run half the cache.
 

mac_angel

Distinguished
I'm sure such an article would be big. But much less sure it would help digest the massive amounts of data being generated by an ever increasing number of choices and variations.

In a way, one could argue that Intel has finally decided to cut down on the complexity it actively helped create with chiplet designs by removing hyper-threading from lower-TDP P-cores, because the loss can be fully offset by higher numbers of its now much better E-cores; those could have been given hyper-threading as an alternative, but that would have been far too easily mistaken for just emulating AMD.

There simply are a lot more choices now and unless you're a lucky hyperscaler who can largely control his workloads, you'll be faced with very tough choices as to which variant might be the best compromise for your personal mix of use cases: there simply is no single best any more.

And for gaming there is also a growing gap between a theoretically possible optimum and the need to fit your game to the hardware available. Game engines might get better at trying to do some automatic adjustments to the available resources, but with GPUs and potentially other xPUs added on top, the sheer range and variety of options means only the most popular sizes can ever be designed for.

My take on the current TDP-constrained designs is that everybody is throwing an abundance of CPU cores of different varieties into those SoCs, hoping customers will pay for higher numbers, while the proper selection of cores within TDP or time constraints is left to the user or their applications.

Using all of them at max speeds will only be possible for Geekbench milliseconds or perhaps somewhat longer on some mostly connected devices with ample cooling.
I have a 14900K and I'm able to run it at boost speeds fully through multiple Cinebench R23 runs, with hyper-threading on. Yes, I have a custom water-cooling loop, but a decent 360 AIO can do it too, if you know what you're doing.
While it would be a lot of work, it wouldn't be any more work than the other times they run test suites on new CPUs and GPUs. And we're not talking about going back a lot of generations, or covering all the SKUs. I'm pretty sure some of the lower SKUs don't even have hyper-threading, so they wouldn't need to be tested.
Also, with the thermal problems of the last couple of Core i9 CPUs, turning off hyper-threading and lowering the TDP could be a very good solution, depending, of course, on what the person is using it for. But Tom's Hardware has always included gaming as well as production software in its test suites. And people who would be interested in this (I'm willing to bet a huge number are gamers) can easily go into their BIOS and turn hyper-threading on or off. Even for those who want both options, every BIOS I've ever worked with has had a place to save multiple settings profiles. If you're going to be doing work that benefits from hyper-threading, take 60 seconds or so to load that profile, or do the same for gaming. Or it might not even matter, if there's no difference; we don't know, because as far as I know it's never been fully tested.
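For what it's worth, on Linux you don't even need the BIOS for this: the kernel exposes a runtime SMT switch. A minimal sketch (assuming a reasonably recent kernel with /sys/devices/system/cpu/smt, and root for the write) could look like this:

```python
# Minimal sketch: read, and optionally flip, the SMT state on Linux via
# /sys/devices/system/cpu/smt (kernel 4.19+). Writing requires root and
# takes effect immediately, no reboot or BIOS visit needed.
import sys
from pathlib import Path

SMT_DIR = Path("/sys/devices/system/cpu/smt")

def smt_state() -> str:
    """Return the current control state: on, off, forceoff or notsupported."""
    return (SMT_DIR / "control").read_text().strip()

def set_smt(state: str) -> None:
    """Write 'on' or 'off' to the control file (needs root)."""
    if state not in ("on", "off"):
        raise ValueError("state must be 'on' or 'off'")
    (SMT_DIR / "control").write_text(state)

if __name__ == "__main__":
    print("SMT control is currently:", smt_state())
    if len(sys.argv) > 1:
        set_smt(sys.argv[1])
        print("SMT control now:", smt_state())
```

Run a benchmark, flip SMT off with the script (the file name is just a placeholder), run it again, and you have the hyper-threading comparison without a reboot or a BIOS visit.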