I agree, and the performance in question is the balance between single-core and well-multithreaded workloads.
If the scheduling works well enough to put the high-performance, low-core-count tasks on the P cores, I bet those 4 cores could eat up the entire power budget just fine on their own. Give it 8 P cores and they will probably just split the load and run slower, giving worse single-core performance. For what?
Sorry, but what does the number of P-cores have to do with ST performance? We've seen plenty of examples of CPUs with low base clocks, which set the lower bound on all-core clocks, and much higher single-core maxes (plus turbo tables with high boost frequencies for smaller numbers of cores).
The 12 E-cores should already give all of the multi-core performance that one could reasonably expect out of a mobile chip.
For whom, though? Look at Meteor Lake, where Intel decided to use a 6P + 8E + 2LPE configuration, yielding 22 threads. This Panther Lake will have 4P + 8E + 4LPE, yielding 16 threads (not 20, due to the lack of HT in the P-cores). Intel has a recent track record of delivering more than 16 threads for laptop users, even in their mid-tier models.
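Spelled out, the thread math is just this (a trivial sketch; core counts as above, with SMT assumed only on Meteor Lake's P-cores):

```python
# Quick thread-count check for the two configurations above.
# Assumes SMT (Hyper-Threading) only on Meteor Lake's P-cores,
# and none on E/LPE cores or on Panther Lake's P-cores.
meteor_lake  = 6 * 2 + 8 + 2   # 6P w/ HT + 8E + 2LPE = 22 threads
panther_lake = 4 * 1 + 8 + 4   # 4P (no HT) + 8E + 4LPE = 16 threads
print(meteor_lake, panther_lake)  # -> 22 16
```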
A lot of software developers would tell you they can use more than 16 threads for building and testing their code. My dev box at work has 24 threads and I could certainly use more.
It seems like adding more wheels to a motorcycle. If you want a pickup truck to do some heavy towing, it is going to take a lot of gas. If you want a zippy, efficient motorcycle, it isn't going to tow as large a boat as a pickup. Those two types of vehicles perform better at their tasks when they are designed for them. Some 3-wheeled pickup-truck/motorcycle hybrid isn't good for most users.
Sure, not every laptop CPU needs a ton of threads, but just because you have a bunch of cores doesn't mean it's inefficient. Look at this perf/W curve for Redwood Cove vs. Lion Cove:
The data is not presented with units, but perf/W curves like this are the norm. Performance always tapers off as you near the high end of the power envelope. In this case, their hypothetical curve has the new P-cores delivering 84.3% as much performance at half of the peak power. So, if you start with a number of P-cores that can all reach peak GHz within the power envelope and then double that number without increasing the power budget, you end up with 68.6% more performance (2 × 84.3% = 168.6% of the original). That's an artificial example, I know, but the phenomenon is real, and it explains how server CPUs are able to scale performance to hundreds of cores while only consuming a few times more power than the TDP of performance desktop CPUs.
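Just to make that arithmetic explicit, here's a minimal Python sketch using the hypothetical 84.3%-at-half-power figure from the curve (illustrative numbers, not measurements):

```python
# Toy model of multi-core scaling in a fixed power envelope.
# Assumption from the hypothetical curve: a core running at half its
# peak power still delivers 84.3% of its peak performance.
PERF_AT_HALF_POWER = 0.843  # illustrative, not a measured value

def doubling_gain(base_cores: int) -> float:
    """Relative throughput gain from doubling cores without raising total power."""
    baseline = base_cores * 1.0                      # every core at peak power/perf
    doubled = (2 * base_cores) * PERF_AT_HALF_POWER  # each core now gets half the power
    return doubled / baseline - 1.0

print(f"{doubling_gain(4):.1%}")  # -> 68.6% more throughput going from 4 to 8 cores
```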
I want to clarify that I'm not taking a firm stand that "6 P-cores good; 4 P-cores bad" (with apologies to George Orwell), but I think it's interesting, and I'll be looking closely at their scaling data to see whether they really did it because the marginal benefit of more P-cores would be negligible, or maybe more for area/cost reasons. For one thing, I'm dying to know just how the area of Skymont cores compares to Lion Cove.
P.S. A thought that's been kicking around my head for a while is that maybe one of the reasons they decided to grow Skymont cores was thermal density. It'll be really interesting to see how much power they can each burn, once we know their area, and compare that to other CPUs on a similar node.