AMD's Phoenix 2 APU may pack two Zen 4 cores, four Zen 4c cores.
AMD Allegedly Testing Hybrid Processor with Zen 4 and 4c Cores : Read more
"I like the big.LITTLE concept the more I think about it. I used to dislike it very strongly.
It makes sense for the majority of work to be done on the lowest-power cores they can make, and to brute-force only what needs it, when it needs it, such as when gaming or using Blender. This is, to some extent, how ARM achieves its goals.
It's also a win for the manufacturers because of the demand for die space in a fab."
In particular it's great if the ISA and cores are truly heterogenous. In the case of ARM that failed to be the case with the X2, A710 and A510, with only the A710 supporting 32-bit, IIRC.
"In particular it's great if the ISA and cores are truly heterogenous. In the case of ARM that failed to be the case with the X2, A710 and A510, with only the A710 supporting 32-bit, IIRC.
With AMD, all cores support AVX-512, and since the caches take up the majority of the space, Zen 4c shrinks to about half the size of Zen 4 while delivering almost identical IPC, unless the workload relies heavily on the L3 cache. It's almost half the size!
(The L2, the main thing that increased with Zen 4, stays the same per core for Zen 4c.)
Where I'm guessing AMD is going to take this is an 8-core Zen 4 CCD plus a 16-core Zen 4c CCD for the big AM5 options. Considering the massive IPC advantage Zen 4 has over Gracemont cores (especially with vector workloads), this could be a serious fight, helped by the fact that it has an L3, which Gracemont kind of has too, but has to go all the way around the CPU to reach. This is a situation where Gracemont cores truly are low-performance cores... (way below peak IPC, and they suck at power efficiency to begin with)."
I would love to see a chip like that. All the cores support 2 threads, so that's 48 threads of Zen 4 power in a desktop chip. Wild.
"Where I'm guessing AMD is going to take is 8 zen 4 CCD and 16 core zen 4c CCD for the big am5 options."
I think this is primarily aimed at laptops, where they'll probably go with a monolithic die for energy-efficiency reasons.
"Considering the massive ipc advantage zen 4 has over gracemont cores"
IIRC, with Bergamo (Zen 4c in server CPUs), each core is supposed to have performance around Zen 3. Gracemont has about Skylake's performance. Zen 2's IPC was 7-10% higher than Skylake's, and Zen 3's is another 15% over Zen 2. Based on that, we could expect Zen 4c's IPC to easily be 25-30% higher than Gracemont's. I assume that Zen 4c would be easier to clock higher as well. This could be an interesting setup, as you would still have full-fat cores, just lower-performing ones. It would be more like the newer Arm big.LITTLE designs with 4 high-performance cores, where 2 are clocked higher than the other 2.
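For what it's worth, the IPC chain in that estimate can be compounded rather than eyeballed. A rough sketch, assuming (as the post does) that Gracemont is about Skylake-level and Zen 4c is about Zen 3-level; `compound` is just an illustrative helper name, not anything from a real library:

```python
def compound(*gains):
    """Multiply relative gains given as fractions (0.15 means +15%)."""
    total = 1.0
    for gain in gains:
        total *= 1.0 + gain
    return total - 1.0

# Assumption from the post: Zen 2 is 7-10% over Skylake, Zen 3 is 15% over Zen 2.
low = compound(0.07, 0.15)
high = compound(0.10, 0.15)

print(f"Estimated Zen 4c IPC lead over Gracemont: {low:.1%} to {high:.1%}")
```

That lands at roughly 23% to 27%, a little under the 25-30% quoted, and before any clock-speed difference is taken into account.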
"In particular it's great if the ISA and cores are truly heterogenous, in the case of ARM that failed to be the case with X2, A710 and A510 with only A710 supporting 32 bit iirc
With AMD all cores support AVX512 and since the caches take up majority of the space, it cuts down by half compared to Zen 4 while delivering almost identical IPC unless it heavily requires the L3 cache. It's almost half the size!"
It seems like you mean "homogeneous", not "heterogeneous".
Most things done on a computer need only modern Atom-level performance, but some things need some serious grunt. 4-8 high-powered Zen 4 cores plus plenty (16-32) of efficient cores with Zen 1-2 levels of performance are what I'd like in a desktop.
"That could be inevitable if Intel doubles down and eventually goes to 8 P-cores, 32-64 E-cores. 8 small cores can be ignored, 16 small cores is harder to ignore, 32+ is impossible to ignore. Everyone will have to optimize for that."
There is nothing to optimize,
"I think AMD should retain 16 big cores in the flagship, since it has offered that for 3 generations in a row. Granite Ridge could launch with up to 16x Zen 5 cores (2 chiplets), 16x Zen 4C cores (1 chiplet). Closer to the bottom of the stack you could get 6x Zen 5 + 12-16x Zen 4C.
If Granite Ridge isn't heterogenous, offer up to 24x Zen 5 cores, lower the price per core, and call it a day."
Both of these options are pretty bad for AMD, because they would increase its costs by a lot and decrease the money it gets from each CPU. AMD is also tied to TSMC, so it can't just easily produce a larger number of CPUs to compensate for the lower per-unit revenue.
Intel was mostly motivated to go big.LITTLE by core count envy.
"There is nothing to optimize. Some workloads can use 'infinite' cores, so they would be scheduled on all available cores, and some workloads can only use a very limited number of cores, so they would be scheduled on the big cores. There is nothing more they can do.
Any attempt to make workloads that use fewer cores use more cores is already being done anyway."
With better APIs, programming languages, and OS support, there's room for a fair bit more concurrency in most software. Intel knows this, hence their investment in things like TBB (Threading Building Blocks) and Sapphire Rapids' new UIRET instruction.
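The scheduling argument above (lightly threaded work stays on the big cores, embarrassingly parallel work spreads across everything) can be sketched as a toy placement policy. This is only an illustration: `place_threads` is a made-up function, it ignores SMT and how a real OS scheduler behaves, and the 2 big + 4 small layout simply borrows the rumored Phoenix 2 core counts.

```python
def place_threads(runnable_threads, big_cores, small_cores):
    """Big-cores-first policy: return (threads_on_big, threads_on_small)."""
    on_big = min(runnable_threads, big_cores)
    on_small = min(runnable_threads - on_big, small_cores)
    return on_big, on_small

# Hypothetical 2 big + 4 small layout (one thread per core, no SMT).
print(place_threads(1, 2, 4))   # lightly threaded work: (1, 0)
print(place_threads(32, 2, 4))  # embarrassingly parallel work: (2, 4)
```

Whether there is anything left to optimize beyond a policy this simple is exactly what the two posters disagree on.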
"4+4+GPU and DDR5 RAM sounds like a nice balance for a frugal but powerful APU/mobile chip."
There are two reasons for E-cores:
"Intel was mostly BigLittle motivated by core count envy."
I would say yes and no to this. Intel knows that its P-cores are VERY power hungry, and therefore it cannot make a 16c/32t CPU using all P-cores and have it clock well enough AND not consume 1000W in operation. Intel's biggest problem is that the Core uArch has not been that efficient beyond 4 cores. With Zen, AMD designed it to be efficient with 8 cores. We see this in the performance scaling of the CPUs. Zen 4 doesn't get much more performance beyond a 105W TDP, whereas Core keeps gaining with an increase in power.
"Granite Ridge is as good of a time as any to take away Intel's multi-threading lead.
With better APIs, programming languages, and OS support, there's room for a fair bit more concurrency in most software. Intel knows this, hence their investment in things like TBB (Threading Building Blocks) and Sapphire Rapids' new UIRET instruction."
Right now the 13900K and 7950X are, more often than not, going to trade places when it comes to MT performance. In Tom's review of the 13900K, the geometric mean of their MT application tests had the 13900K at 97% as fast as the 7950X; however, the 13900K was 10% faster on average in ST work.
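For context on the geometric mean cited for the 13900K vs. 7950X comparison: it averages per-benchmark ratios multiplicatively, so no single test dominates the result. A minimal sketch with made-up ratios (these are NOT the review's numbers, just placeholders to show the calculation):

```python
from statistics import geometric_mean

# Hypothetical per-benchmark ratios: 13900K score divided by 7950X score.
# Values are invented for illustration only.
ratios = [0.92, 1.01, 0.95, 1.00]

relative = geometric_mean(ratios)
print(f"13900K is {relative:.1%} as fast as the 7950X across these tests")
```

A result just under 1.0 means the two chips swap places from test to test while one ends up marginally ahead overall, which is the situation the post describes.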
"If there was nothing to optimize in terms of operating systems, games, and applications, everybody would already be very enthusiastic about Intel's E-cores."
Yeah, these would fall under the "they are already doing this anyway" category.
"I would say yes and no to this. Intel knows that their P cores are VERY power hungry and therefore they cannot make a 16c/32t CPU using all P cores and have it clock well enough AND not consume 1000W while under operation."
Thermal runaway aside, you'd expect doubling the core count to (no more than) double the power. 1000 W is overshooting the mark by at least 2x.
"Intel's biggest problem is that the Core uArch has not been that efficient beyond 4 cores. With Zen AMD designed it to be efficient with 8 cores."
Golden Cove has a much larger reorder buffer (512 entries vs. 320 in Zen 4). Seems like it should scale better, leaving aside the matter of clocks and power.
"1000 W is overshooting the mark by at least 2x."
That was meant as hyperbole; sadly, there isn't a good way to express that in text.
"But, if you really wanted to see what a 16 P-core Alder Lake would be like, you need look no further than Xeon W5-2465X."
I wouldn't say that. It gives us the base frequency and TDP, but not the max-power TDP. We both know Intel would likely try to have it boost as far as possible and throw power draw out the window (see Intel's demo system using the 5000W chiller), and the actual boost TDP would be 500-ish W.
"Golden Cove has a much larger reorder buffer (512 entries vs. 320 in Zen 4). Seems like it should scale better, leaving aside the matter of clocks and power."
I'm not following what you are getting at here, as my original comment was dealing with power. Core does a good job of scaling performance with additional power draw, whereas Zen 4 levels off VERY quickly at 105W. That said, the Core CPU needs 2x the TDP for the same performance, which goes hand in hand with my statement that Core isn't efficient.
"IIRC Bergamo, Zen4c in server CPUs, each core is supposed to have performance around Zen 3. Gracemont has performance of about Skylake. Zen 2's IPC was 7-10% higher than Skylake and Zen 3 is another 15% over Zen 2. Based on that we could expect Zen4c's IPC to easily be 25-30% higher than Gracemont. I assume that Zen4c would be easier to clock higher as well. This could be an interesting setup as you still would have full fat cores just lower performing. It would be more like the newer Arm big.little with 4 high performance cores but 2 are clocked higher than another 2."
More than that, possibly. Zen 3 IPC might be understating it. If the L3 cache is very heavily used, as in HPC, gaming or similar, then yeah, it is Zen 3 IPC, but there are many everyday situations where more cache does not improve
"It seems like you mean 'homogeneous', not 'heterogeneous'."
Refer to ARM: heterogeneous means two different architectures that share the ISA. On a technical level Zen 4c is "different", but it is more optimized for lower clocks.
"That was meant as a hyperbole, sadly there isn't a good way to express that in text.
I wouldn't say that gives us the base frequency and TDP but not the max power TDP. We both know Intel would likely try and have it boost as far as possible and throw power draw out the window, see Intel using the 5000W chiller demo system, and the actual boost TDP will be 500ishW.
I'm not following what you are getting at here as my original comment was dealing with power. Core does a good job of scaling performance with additional power draw where as Zen4 levels off VERY quickly at 105W. That said the Core CPU needs 2x the TDP for the same performance. Which goes hand in hand with my statement that Core isn't efficient."
Golden Cove is a big, fat core for HPC and doesn't scale down at low power. Zen 4 is not a big, fat core, but somehow manages to hang in there with Golden Cove.
"There is nothing to optimize,
some workloads can use 'infinite' cores so they would be scheduled on all available cores and some workloads can only use a very limited amount of cores so they would be scheduled on the big cores, there is nothing more they can do.
Any attempt to make workloads that use fewer cores use more cores is already being done anyway.
Both of these options are pretty bad for AMD because it will increase their cost by a lot and decrease the money they would get from each CPU and they are tied to TSMC so it's not like they can just easily produce a larger amount of CPUs to compensate for the lower per unit cost."
Actually, heavy customization by Intel is also going to be bad for their IDF model. Intel is currently still having trouble because they used to depend on the fab to fix their issues rather than verifying that the design works (the IDF model means the design team is unable to do that anymore, at least on the same scale).
"Core does a good job of scaling performance with additional power draw where as Zen4 levels off VERY quickly at 105W."
No it doesn't.
"Zen 4 is more power/performance efficient than either Golden Cove or Gracemont (Gracemont very much isn't: if you have to keep the cores on longer because the IPC is poor and the power draw is still mid, it ends up using quite a lot more energy, in joules, than expected)."
Source?
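The reasoning behind the claim above is that energy is power multiplied by time, so a slower core that draws less power can still burn more joules on a fixed job. A toy calculation with invented numbers (not measurements of Zen 4, Golden Cove or Gracemont; `energy_joules` is a hypothetical helper):

```python
def energy_joules(power_watts, work_units, units_per_second):
    """Energy to finish a fixed job: power multiplied by run time."""
    seconds = work_units / units_per_second
    return power_watts * seconds

# Invented numbers: the "fast" core draws more power but finishes sooner.
fast_core = energy_joules(power_watts=10, work_units=100, units_per_second=2.0)
slow_core = energy_joules(power_watts=6, work_units=100, units_per_second=1.0)

print(fast_core, slow_core)  # 500.0 600.0
```

This only shows that the argument is arithmetically possible; whether real chips behave this way still needs the measurements being asked for.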