News Cyberpunk 2077 adds core prioritization for hybrid CPUs, which would be great if it didn't cause other problems

Status
Not open for further replies.
Where did I ever say thermals?

I thought about explaining it, but having read through your posts I'm not going to waste my time.
You are right, you didn't say thermals, you said TDP. Still, the point stands: 16 P-cores would be faster than 8 P-cores at the same TDP, so that cannot possibly be the reason.

But reading through your posts, I'm not going to waste my time.
 
You are right, you didn't say thermals, you said TDP. Still, the point stands: 16 P-cores would be faster than 8 P-cores at the same TDP, so that cannot possibly be the reason.
That's not always correct, not even close. 16 P-cores would use much more power, and so to stay within the same TDP range, they would clock slower. For multi-threaded apps, they'd likely be faster than the 8 P-core CPU, but for single-threaded and even lightly-threaded loads, Intel doesn't usually allow boosting as high with lots of cores. Xeons have max clocks that are way lower than Core CPUs for precisely this reason. So by having eight higher clocked P-cores, certain consumer workloads run better than if you had 16 P-cores with the same TDP.
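The clock-versus-core-count tradeoff described here can be sketched with a toy model, assuming per-core power grows roughly with the cube of frequency; the 25W-at-5GHz calibration is an invented illustration, not a measured figure:

```python
# Toy model: per-core power ~ k * f^3 (dynamic power scales with C*V^2*f,
# and voltage rises with clocks). The 25 W @ 5 GHz anchor point is an
# illustrative assumption, not measured data.
def clock_at_tdp(n_cores, tdp_w, watts_per_core_at_5ghz=25.0):
    k = watts_per_core_at_5ghz / 5.0 ** 3  # fit p(f) = k * f^3 to the anchor
    budget = tdp_w / n_cores               # even power split across cores
    return (budget / k) ** (1 / 3)         # invert p(f) to get the clock

tdp = 200.0
f8, f16 = clock_at_tdp(8, tdp), clock_at_tdp(16, tdp)
print(f"8 cores:  {f8:.2f} GHz each, ~{8 * f8:.0f} core-GHz total")
print(f"16 cores: {f16:.2f} GHz each, ~{16 * f16:.0f} core-GHz total")
```

Under these assumptions the 16-core chip clocks about 20% lower at the same 200W budget yet still wins on aggregate throughput, which matches both halves of the argument: slower lightly-threaded, faster multi-threaded.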
 
That's not always correct, not even close. 16 P-cores would use much more power, and so to stay within the same TDP range, they would clock slower. For multi-threaded apps, they'd likely be faster than the 8 P-core CPU, but for single-threaded and even lightly-threaded loads, Intel doesn't usually allow boosting as high with lots of cores. Xeons have max clocks that are way lower than Core CPUs for precisely this reason. So by having eight higher clocked P-cores, certain consumer workloads run better than if you had 16 P-cores with the same TDP.
Well, AMD does it; there is nothing preventing Intel from having different boost clocks based on load. In fact, even Intel does it: its higher core-count CPUs also have higher clock speeds. The 14900K has higher clocks than the 14700K, which has higher clocks than the 13700K, which has higher clocks than the 13600K, etc. That applies to both multi- and single-threaded workloads.


The reason we don't see more than 8 P-cores is not thermals or TDP; it's just die space and the ring bus.
 
Well, AMD does it; there is nothing preventing Intel from having different boost clocks based on load. In fact, even Intel does it: its higher core-count CPUs also have higher clock speeds. The 14900K has higher clocks than the 14700K, which has higher clocks than the 13700K, which has higher clocks than the 13600K, etc. That applies to both multi- and single-threaded workloads.


The reason we don't see more than 8 P-cores is not thermals or TDP; it's just die space and the ring bus.
No, the reason we don't see more than eight P-cores is because Intel is choosing to not go that route. Anything that scales with multi-core means it can do 16 E-cores (or more in the future) while sticking with eight P-cores, and get better efficiency. Very few applications need to have eight high power, high performance cores, and fewer still need 16 or more such cores.

AMD isn't doing the same approach for numerous reasons, including chiplets, the complexity of hybrid solutions, and because it has a more efficient core to begin with. But AMD is actually doing hybrid, as Zen 4 and Zen 4c aren't the same. 4c clocks significantly lower, but it's also more efficient. And yes, more dense, but that's just all part and parcel of the design.

If Intel wants to do 16 P-cores, it absolutely can. It doesn't want to, for the reasons mentioned. Yes, die size is part of it as well, because that increases the cost, but power use is a massive issue for the P-cores at the clocks Intel is hitting. Cap all the P-cores at 3.5 GHz and they'll be far more efficient, but then they won't compete with AMD Zen 4 running at 4.5+ GHz.

And hey, Intel even has "special" P-cores on Meteor Lake (two of them, on certain models) that can hit higher clocks than the rest of the P-cores. Just like it also has additional E-cores on the SoC tile that are designed to be even more efficient.

TLDR: Arm has done the hybrid architecture approach for a decade or more now, and I see no reason why Intel would change its approach now. It's proven to work, and it helps avoid 500W CPUs. It's not only about power and efficiency, but those elements are 100% part of the reason to do hybrid.
 
No, the reason we don't see more than eight P-cores is because Intel is choosing to not go that route. Anything that scales with multi-core means it can do 16 E-cores (or more in the future) while sticking with eight P-cores, and get better efficiency. Very few applications need to have eight high power, high performance cores, and fewer still need 16 or more such cores.

AMD isn't doing the same approach for numerous reasons, including chiplets, the complexity of hybrid solutions, and because it has a more efficient core to begin with. But AMD is actually doing hybrid, as Zen 4 and Zen 4c aren't the same. 4c clocks significantly lower, but it's also more efficient. And yes, more dense, but that's just all part and parcel of the design.

If Intel wants to do 16 P-cores, it absolutely can. It doesn't want to, for the reasons mentioned. Yes, die size is part of it as well, because that increases the cost, but power use is a massive issue for the P-cores at the clocks Intel is hitting. Cap all the P-cores at 3.5 GHz and they'll be far more efficient, but then they won't compete with AMD Zen 4 running at 4.5+ GHz.

And hey, Intel even has "special" P-cores on Meteor Lake (two of them, on certain models) that can hit higher clocks than the rest of the P-cores. Just like it also has additional E-cores on the SoC tile that are designed to be even more efficient.

TLDR: Arm has done the hybrid architecture approach for a decade or more now, and I see no reason why Intel would change its approach now. It's proven to work, and it helps avoid 500W CPUs. It's not only about power and efficiency, but those elements are 100% part of the reason to do hybrid.
I'm not disagreeing with most of that, but I insist that power is irrelevant.

I have no idea if 14th-gen P-cores are more or less efficient than Zen 4 cores, but I know for a fact that ADL P-cores were a lot more efficient than Zen 3, which in turn means that 16 P-cores would be both a lot faster and a lot more efficient than 16 Zen 3 cores. Still, Intel didn't go that route.

As you said yourself, it makes more sense to use E-cores simply because they offer more performance per die space and more performance per watt, simply because you can fit more of them in the same space.
 
That's not always correct, not even close. 16 P-cores would use much more power, and so to stay within the same TDP range, they would clock slower. For multi-threaded apps, they'd likely be faster than the 8 P-core CPU, but for single-threaded and even lightly-threaded loads, Intel doesn't usually allow boosting as high with lots of cores. Xeons have max clocks that are way lower than Core CPUs for precisely this reason. So by having eight higher clocked P-cores, certain consumer workloads run better than if you had 16 P-cores with the same TDP.
You have heard about P-states and C-states, right?!
Have you heard about game mode on AMD CPUs?
Having 16 cores doesn't force you to run all of them at the same clocks.
Xeons are professional tools; there is no telling whether they do that for safety, reliability, or some other reason.
 
No, the reason we don't see more than eight P-cores is because Intel is choosing to not go that route. Anything that scales with multi-core means it can do 16 E-cores (or more in the future) while sticking with eight P-cores, and get better efficiency. Very few applications need to have eight high power, high performance cores, and fewer still need 16 or more such cores.

AMD isn't doing the same approach for numerous reasons, including chiplets, the complexity of hybrid solutions, and because it has a more efficient core to begin with. But AMD is actually doing hybrid, as Zen 4 and Zen 4c aren't the same. 4c clocks significantly lower, but it's also more efficient. And yes, more dense, but that's just all part and parcel of the design.

If Intel wants to do 16 P-cores, it absolutely can. It doesn't want to, for the reasons mentioned. Yes, die size is part of it as well, because that increases the cost, but power use is a massive issue for the P-cores at the clocks Intel is hitting. Cap all the P-cores at 3.5 GHz and they'll be far more efficient, but then they won't compete with AMD Zen 4 running at 4.5+ GHz.

And hey, Intel even has "special" P-cores on Meteor Lake (two of them, on certain models) that can hit higher clocks than the rest of the P-cores. Just like it also has additional E-cores on the SoC tile that are designed to be even more efficient.

TLDR: Arm has done the hybrid architecture approach for a decade or more now, and I see no reason why Intel would change its approach now. It's proven to work, and it helps avoid 500W CPUs. It's not only about power and efficiency, but those elements are 100% part of the reason to do hybrid.
In both cases, E-cores and Zen 4c, you confuse efficiency with lower power draw.
They can function at lower power than the main cores, which gives benefits at extremely low power limits only (which is why it's so common on Arm devices).
The Zen 4c cores are only more efficient below ~2 GHz / ~3W; above that, they hurt overall energy efficiency, just like the E-cores do for Intel.
https://zhuanlan.zhihu.com/p/653961282
[Attached chart from the linked article]
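The crossover claim can be illustrated with a toy efficiency model: a dense core with a lower static floor but worse frequency scaling wins on efficiency only at low clocks. All curve constants here are invented for illustration:

```python
# Crude model: power(f) = k * f^3 + floor. The dense core has a smaller
# static floor but a steeper dynamic term, so its perf/W advantage
# disappears once clocks rise. Constants are illustrative guesses.
def power_w(freq_ghz, k, floor_w):
    return k * freq_ghz ** 3 + floor_w

def perf_per_watt(freq_ghz, k, floor_w):
    return freq_ghz / power_w(freq_ghz, k, floor_w)

big = dict(k=0.15, floor_w=1.0)    # big core: higher idle/static floor
dense = dict(k=0.30, floor_w=0.3)  # dense core: tiny floor, worse scaling

for f in (1.0, 2.0, 3.0, 4.0):
    winner = "dense" if perf_per_watt(f, **dense) > perf_per_watt(f, **big) else "big"
    print(f"{f:.1f} GHz: {winner} core is more efficient")
```

With these made-up constants the dense core only wins at the lowest clock tested, mirroring the shape of the claim that Zen 4c (and E-cores) help only at very low power points.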
 
As you said yourself, it makes more sense to use E-cores simply because they offer more performance per die space and more performance per watt, simply because you can fit more of them in the same space.
That last bit in bold is where you go wrong each time. You can fit more of them in the same die area, yes. But you can also fit more E-cores within the same power envelope! Both factors are important. What you said amounts to: "You can get more performance per die area, more performance per watt, and more performance per die area (again)." Saying it twice doesn't make it more true. LOL

An Intel CPU with 16 Raptor Cove (or Golden Cove) cores all running at ~5.2 GHz would need roughly twice as much power as a CPU with 8 Raptor Cove cores running at ~5.2 GHz. Based on what we know of Raptor Lake, that would mean about 200W for the 8-core chip and 400W for the 16-core chip. And if the maximum boost clock for both is 6.0 GHz, peak power use would be up to 600W for the 16 P-core chip. Or maybe not quite that high, as I don't have exact details of how much power the P-cores alone on an i9-14900K draw at 6.0 GHz, but it would be quite a bit.

Because there are actual real-world limitations to how much power you can use from a socket, Intel wants to maximize throughput both in terms of die area and TDP. But it also needs pure single-threaded performance for certain workloads. That's why P-cores exist: for work where a few cores at higher clocks are important. Anything that scales beyond eight threads (give or take) can continue to improve performance with E-cores, while not busting the power budget.

If it were only about performance per watt, or performance per die area, Intel wouldn't even have P-cores and E-cores. It would just do a larger number of E-cores. Roughly four E-cores use the same die area as a single P-core, and they're not one fourth the speed. Intel has a 144 E-core server chip (Sierra Forest), and two of those in a package for a 288-core processor. But that's specifically for hyperscalers, where per-core performance isn't quite as critical as raw core counts. Even so, it should deliver roughly 2.4X the performance per watt of Granite Rapids. (Something like that.) And for more diverse workloads, Intel will have 128-core solutions that have both P-cores and E-cores (I think?) with higher total performance and more than double the power use. At least, that's what I'm expecting based on rumors and leaks.
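The two metrics in this argument can be put side by side with assumed ratios. Only the roughly-four-E-cores-per-P-core area figure comes from the post above; the performance and power ratios are illustrative guesses:

```python
# Per-core ratios, normalized to a P-core = 1.0. Only the 1:4 area
# ratio is stated above; perf and power are assumptions for illustration.
P = {"area": 1.0, "perf": 1.0, "power": 1.0}
E = {"area": 0.25, "perf": 0.5, "power": 0.35}

for name, core in (("P-core", P), ("E-core", E)):
    print(f"{name}: perf/area = {core['perf'] / core['area']:.2f}, "
          f"perf/watt = {core['perf'] / core['power']:.2f}")
```

On these assumptions the E-core wins on both axes, which is the post's point: density and the power budget both favor E-cores for throughput, while P-cores exist for latency-sensitive, lightly-threaded work.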
 
That last bit in bold is where you go wrong each time. You can fit more of them in the same die area, yes. But you can also fit more E-cores within the same power envelope! Both factors are important. What you said amounts to: "You can get more performance per die area, more performance per watt, and more performance per die area (again)." Saying it twice doesn't make it more true. LOL
I was simply explaining that E-cores offer more performance per watt not because they are more efficient, but because you can fit more of them.

An Intel CPU with 16 Raptor Cove (or Golden Cove) cores all running at ~5.2 GHz would need roughly twice as much power as a CPU with 8 Raptor Cove cores running at ~5.2 GHz. Based on what we know of Raptor Lake, that would mean about 200W for the 8-core chip and 400W for the 16-core chip. And if the maximum boost clock for both is 6.0 GHz, peak power use would be up to 600W for the 16 P-core chip. Or maybe not quite that high, as I don't have exact details of how much power the P-cores alone on an i9-14900K draw at 6.0 GHz, but it would be quite a bit.
But you wouldn't need to run them at 5.2 GHz in all-core workloads. AMD does the same: its higher-end models have higher boost, but clock speeds are still lower in all-core workloads. I'd argue 16 P-cores at 125W would probably be faster than 8 P-cores at 250W, so efficiency would skyrocket.

But again, when you have a type of core that is better in terms of performance per die area and per watt, like the E-cores, it's silly not to use those instead for extremely parallel workloads.
 
An Intel CPU with 16 Raptor Cove (or Golden Cove) cores all running at ~5.2 GHz would need roughly twice as much power as a CPU with 8 Raptor Cove cores running at ~5.2 GHz. Based on what we know of Raptor Lake, that would mean about 200W for the 8-core chip and 400W for the 16-core chip. And if the maximum boost clock for both is 6.0 GHz, peak power use would be up to 600W for the 16 P-core chip. Or maybe not quite that high, as I don't have exact details of how much power the P-cores alone on an i9-14900K draw at 6.0 GHz, but it would be quite a bit.
16 Zen 4 cores running at 5.1 GHz would use 16 × 22W, for a total of 352W.
Running at stock, they would use 16 × 43W, which would be 688W.

The number of cores used does not dictate the final power draw...
The final power draw (be it stock or overclocked) dictates the total clock speed of the cores.
[Attached chart: single-thread power draw (power-singlethread.png)]
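The arithmetic behind those totals, spelled out. Note the caveat raised in the replies: the per-core figures are really package-power readings, so the naive multiplication overshoots:

```python
# Naive extrapolation: scale the single-thread package-power reading by
# the core count. This overshoots, since the reading includes uncore
# (ring, memory controller, I/O), not just the one active core.
per_core_oc = 22     # W, reading at a fixed 5.1 GHz (from the chart)
per_core_stock = 43  # W, reading at stock boost clocks
cores = 16

print(f"All {cores} cores at 5.1 GHz: ~{cores * per_core_oc} W")
print(f"All {cores} cores at stock boost: ~{cores * per_core_stock} W")
```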
 
16 Zen 4 cores running at 5.1 GHz would use 16 × 22W, for a total of 352W.
Running at stock, they would use 16 × 43W, which would be 688W.

The number of cores used does not dictate the final power draw...
The final power draw (be it stock or overclocked) dictates the total clock speed of the cores.
Talk about missing the point...

Obviously the CPU doesn't always run all cores at the same clock. My point was that if it did, then the power draw of 16-core Raptor Cove would be enormous. And if it's not going to run all 16 cores at max clock, because that would be inefficient, then — and this will really blow your mind! — why not add more efficient cores that run at lower clocks and use a lot less power? We could even call these things something like "E-cores"!

Hypothetically, 16 Zen 4 cores all running at 5.1 GHz would be around 352W, way more than the TDP limit. And if they were running 'stock' at up to 5.7 GHz, then it would be the 688W you speak of, which is about four times the actual power limit on Ryzen 9 7950X. Of course, those numbers aren't actually accurate, because that chart isn't showing you what you think it's showing.

It's not "per core power," it's "package power running a single-threaded workload." BIG difference! We can rightly estimate that a single CPU core is using less than what the chart shows, but all the other stuff in the package will still be using power. How much power? That's difficult to say.

But what's funny is you don't give a chart that shows where Raptor Lake would land. Let me do that for you, and for TheHerald, because he keeps missing the point.

[Attached chart: Raptor Lake single-thread package power]
Running 'normally' a single Raptor Cove core ends up at around 32W — which is obviously clocking down quite a bit. How do we know? Because when all the P-cores are at 5.6 GHz (and E-cores at 4.4 GHz), package power jumps to 83W.

Again, how much of that is for a single core in practice? Well, less than 32W presumably, and I suspect Intel is better than AMD about making all the idle cores go into a deep sleep state where power use drops close to zero. Because when you force all the cores to clock at 5.6 GHz / 4.4 GHz, the package power skyrockets. Even a fully idle core sitting at 5.6 GHz uses more power than that same core idling at 1.4 GHz or whatever the 'sleeping' value is for Intel these days.

Which is why looking at a fully loaded multi-threaded workload is so important, and that's where TPU gives this:

[Attached chart: multi-threaded power draw (TechPowerUp)]

Or alternatively, our own testing:

[Attached chart: multi-threaded power draw (Tom's Hardware testing)]
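One way to read the single-thread number above: subtract an assumed uncore floor from the package reading to bound the active core's draw. The uncore figure here is a guess for illustration, not a measurement:

```python
# Package power during a 1T load includes ring, memory controller and
# I/O, so the active core draws less than the headline figure.
package_1t = 32.0  # W, package power in a single-thread load (from the chart)
uncore_est = 12.0  # W, assumed uncore floor (illustrative guess)

core_est = package_1t - uncore_est
print(f"Active core: roughly {core_est:.0f} W of the {package_1t:.0f} W package reading")
```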
 
I have both and tested it myself. Zen 3 needs 2x the power for the same performance: a 12900K with E-cores off at 65W matches a 5800X at 150W in CBR23 and Corona.
Lowering the power limit almost always improves efficiency. If you're not going to run at stock limits (which you're obviously not if the 12900K is only drawing 65W), you should set them to the same limit to get a fair comparison.

Also, did you actually record those power values while running CB, or are you just going off the limits as per what was set in the BIOS?

I couldn't find any results online for ADL that have P-core-only performance with power measured under the same benchmark. Anandtech does have P-core-only SPEC2017 results, and P-core-only power draw while running POV-Ray. The 12900K, P-cores only, is anywhere from 6 to 57% faster on average, depending on whether you look at FP or INT results and whether you include the DDR5 results. But it uses 71% more power (240W vs 140W).

https://www.anandtech.com/show/1621...-review-5950x-5900x-5800x-and-5700x-tested/10

The best showing for ADL P-core efficiency I could find was in a review of the i5 12400 (which doesn't have E-cores), where it showed ~20% better efficiency than a 5600X, based on CBR23 performance and power measured using an unknown workload.

https://www.pcgamer.com/intel-core-...king/#section-intel-core-i5-12400-performance

Edit: By P-core-only power I meant total CPU power with E-cores disabled.
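The 71% figure cited above follows directly from the two package-power numbers:

```python
# Anandtech's POV-Ray package-power figures: 12900K with only P-cores
# enabled vs the Zen 3 part it was compared against.
p_adl_w, p_zen3_w = 240, 140
extra_power = p_adl_w / p_zen3_w - 1
print(f"The P-core-only 12900K run draws {extra_power:.0%} more power")
```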
 
Lowering the power limit almost always improves efficiency. If you're not going to run at stock limits (which you're obviously not if the 12900K is only drawing 65W), you should set them to the same limit to get a fair comparison.

Also, did you actually record those power values while running CB, or are you just going off the limits as per what was set in the BIOS?

I couldn't find any results online for ADL that have P-core-only performance with power measured under the same benchmark. Anandtech does have P-core-only SPEC2017 results, and P-core-only power draw while running POV-Ray. The 12900K, P-cores only, is anywhere from 6 to 57% faster on average, depending on whether you look at FP or INT results and whether you include the DDR5 results. But it uses 71% more power (240W vs 140W).

https://www.anandtech.com/show/1621...-review-5950x-5900x-5800x-and-5700x-tested/10

The best showing for ADL P-core efficiency I could find was in a review of the i5 12400 (which doesn't have E-cores), where it showed ~20% better efficiency than a 5600X, based on CBR23 performance and power measured using an unknown workload.

https://www.pcgamer.com/intel-core-...king/#section-intel-core-i5-12400-performance
I recorded the power values, yes. I agree that matching power would make the difference smaller, but I measured at iso-performance instead of iso-power.

Also, I wasn't measuring core-only power but the whole CPU.

The i5 12400 is a different die and probably lower binned. I think at iso-power the difference between Zen 3 and ADL is around 25-30%.
 
Talk about missing the point...

Obviously the CPU doesn't always run all cores at the same clock. My point was that if it did, then the power draw of 16-core Raptor Cove would be enormous. And if it's not going to run all 16 cores at max clock, because that would be inefficient, then — and this will really blow your mind! — why not add more efficient cores that run at lower clocks and use a lot less power? We could even call these things something like "E-cores"!

Hypothetically, 16 Zen 4 cores all running at 5.1 GHz would be around 352W, way more than the TDP limit. And if they were running 'stock' at up to 5.7 GHz, then it would be the 688W you speak of, which is about four times the actual power limit on Ryzen 9 7950X. Of course, those numbers aren't actually accurate, because that chart isn't showing you what you think it's showing.

It's not "per core power," it's "package power running a single-threaded workload." BIG difference! We can rightly estimate that a single CPU core is using less than what the chart shows, but all the other stuff in the package will still be using power. How much power? That's difficult to say.

But what's funny is you don't give a chart that shows where Raptor Lake would land. Let me do that for you, and for TheHerald, because he keeps missing the point.

[Attached chart: Raptor Lake single-thread package power]
Running 'normally' a single Raptor Cove core ends up at around 32W — which is obviously clocking down quite a bit. How do we know? Because when all the P-cores are at 5.6 GHz (and E-cores at 4.4 GHz), package power jumps to 83W.

Again, how much of that is for a single core in practice? Well, less than 32W presumably, and I suspect Intel is better than AMD about making all the idle cores go into a deep sleep state where power use drops close to zero. Because when you force all the cores to clock at 5.6 GHz / 4.4 GHz, the package power skyrockets. Even a fully idle core sitting at 5.6 GHz uses more power than that same core idling at 1.4 GHz or whatever the 'sleeping' value is for Intel these days.

Which is why looking at a fully loaded multi-threaded workload is so important, and that's where TPU gives this:

[Attached chart: multi-threaded power draw (TechPowerUp)]

Or alternatively, our own testing:

[Attached chart: multi-threaded power draw (Tom's Hardware testing)]
Why would the 13900K be clocking down? His overclocked numbers are just that, overclocked. There is no reason to assume the 32W figure was taken while clocking down.
 
I recorded the power values, yes. I agree that matching power would make the difference smaller, but I measured at iso-performance instead of iso-power.

Also, I wasn't measuring core-only power but the whole CPU.

The i5 12400 is a different die and probably lower binned. I think at iso-power the difference between Zen 3 and ADL is around 25-30%.
Your 12900K must be incredibly well binned. TPU found their 12900K needed >100W to exceed a 5800X in CBR23 (with the 5800X scoring ~15.7K, drawing ~120W). And that was with E-cores enabled, which substantially improves performance and efficiency for a given power envelope. And even with a 100W power limit on the 12900K, total system power draw was within a few watts of the stock 5800X system.

https://www.techpowerup.com/review/...er-lake-tested-at-various-power-limits/2.html
 
Your 12900K must be exceptionally, almost unbelievably, well binned then. TPU found their 12900K needed >100W to exceed a 5800X in CBR23 (score of ~15.7K, drawing ~120W). And that was with E-cores enabled, which substantially improves performance and efficiency for a given power envelope. And even with a 100W power limit on the 12900K, total system power draw was within a few watts of the stock 5800X system.

https://www.techpowerup.com/review/...er-lake-tested-at-various-power-limits/2.html
Nope, my 12900K is a bottom-of-the-barrel bin.

TPU's numbers are horribly off; he probably used a fixed voltage. A 12900K @ 125W scores ~23,500 points, and at 75W, 18K+. I've tested three different CPUs on four different mobos. There is something horribly off with his testing, obvious from the fact that the 12900K is even losing to the 12600K at the same power.
 
Nope, my 12900K is a bottom-of-the-barrel bin.

TPU's numbers are horribly off; he probably used a fixed voltage. A 12900K @ 125W scores ~23,500 points, and at 75W, 18K+. I've tested three different CPUs on four different mobos. There is something horribly off with his testing, obvious from the fact that the 12900K is even losing to the 12600K at the same power.
I don't see the 12900K losing to the 12600K at the same power; even at 125W, the 12900K comes out ahead of the 150W 12600K.

But yeah, if they did something like set a fixed voltage I imagine that would have a pretty significant impact. Unfortunately they don't list any BIOS settings used other than the power limit.
 
Talk about missing the point...

Obviously the CPU doesn't always run all cores at the same clock. My point was that if it did, then the power draw of 16-core Raptor Cove would be enormous.
Yes, why was your point something that is physically impossible?
And if it's not going to run all 16 cores at max clock, because that would be inefficient, then — and this will really blow your mind! — why not add more efficient cores that run at lower clocks and use a lot less power? We could even call these things something like "E-cores"!
They are only more efficient at 6-15W per four E-cores, which means you either have to run the whole CPU at a laughably low wattage for a desktop, or you would have to fit laughably many E-cores by desktop standards to make that happen.
But what's funny is you don't give a chart that shows where Raptor Lake would land. Let me do that for you, and for TheHerald, because he keeps missing the point.


Running 'normally' a single Raptor Cove core ends up at around 32W — which is obviously clocking down quite a bit. How do we know? Because when all the P-cores are at 5.6 GHz (and E-cores at 4.4 GHz), package power jumps to 83W.
Yes, with normal turbo, Intel CPUs (and as far as I know, AMD CPUs as well) park all non-active cores and only power the one doing the work; if you overclock all cores to always run at full speed, then all cores will always be powered.
C-states are technically "clocking down", actually even powering down.
 
I don't see the 12900K losing to the 12600K at the same power; even at 125W, the 12900K comes out ahead of the 150W 12600K.

But yeah, if they did something like set a fixed voltage I imagine that would have a pretty significant impact. Unfortunately they don't list any BIOS settings used other than the power limit.
The 12600K was pulling less power than the 12900K at 125W.

 
I don't see the 12900K losing to the 12600K at the same power; even at 125W, the 12900K comes out ahead of the 150W 12600K.

But yeah, if they did something like set a fixed voltage I imagine that would have a pretty significant impact. Unfortunately they don't list any BIOS settings used other than the power limit.
Coming back to this, I found Hardware Unboxed's test of a 12700 at a 65W power limit. As you can see from their review compared with TPU's findings, TPU looks completely off.

https://www.techspot.com/articles-info/2391/bench/CB23-1.png

The 12700 at 65W matches the 5800X at 140W. Even turning off those four E-cores would still leave a big gap between the two in efficiency.

That's why I'm insisting on Intel's incredibly efficient CPUs. At iso-power, Intel is heavily leading in most segments in both single- and multi-threaded efficiency. AMD only wins in the 7950X/7950X3D category; either of them is a bit more efficient than the 14900K (about 5 to 10%, depending on the workload).
 
That's why I'm insisting on Intel's incredibly efficient CPUs. At iso-power, Intel is heavily leading in most segments in both single- and multi-threaded efficiency. AMD only wins in the 7950X/7950X3D category; either of them is a bit more efficient than the 14900K (about 5 to 10%, depending on the workload).
You have to look at benchmarks that show both power and scores in the same test, otherwise you get confused. You can see here with the 13900K that the 7950X wins by 10-13% at every power level down to 65W, and the 13900K only wins at 45W, by about 10%, because that's when the E-cores finally increase power efficiency instead of just increasing performance.
Their P-cores are more efficient than Zen, but as a whole CPU, Intel is less efficient overall in multithreading.
https://www.computerbase.de/2023-10...i5-14600k-test/2/#abschnitt_multicoreleistung

Or here as well: the 14900K limited to 200W draws 141W on average, compared to the 128W average of the 7950X, while delivering the same performance; the default 14900K draws 170W on average versus that same 128W.
(47-application test average.)
https://www.techpowerup.com/review/...ke-tested-at-power-limits-down-to-35-w/8.html
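Turning the cited averages into relative power draw at matched performance:

```python
# TechPowerUp's 47-application averages, as quoted above.
w_14900k_capped = 141  # W average, 14900K limited to 200 W
w_14900k_stock = 170   # W average, 14900K at default limits
w_7950x = 128          # W average, 7950X

print(f"14900K @ 200 W cap: {w_14900k_capped / w_7950x - 1:+.1%} vs 7950X")
print(f"14900K default:     {w_14900k_stock / w_7950x - 1:+.1%} vs 7950X")
```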
 