News Intel Debuts Meteor Lake Die, 'Intel 4' Node: 20% Higher Clocks at Same Power, 2X Area Scaling

isofilm · Jun 13, 2022

rtx4090 said:
Intel 4 process is far superior to TSMC N5 & N5P. And meteor lake has already taped-out, engineering sample powered on & tested with windows 11 & ubuntu 22.04. Intel is putting the finishing touches on meteor lake right now. Volume ramp up in Q2 2023.

In other words, Intel takes process leadership over AMD starting next june. After that, there's no turning back!

That doesn't mean JACK, Intel is notorious for Binning (sorting chips based on performance), you think they just picked a PROTOTYPE out of a hat, NO, they picked the very best one they could find.

Intel has NEVER "stayed ahead of AMD", they have traded pole position many times in the last 30 years.

Until they have enough EUV tools to start Risk Production (20% yields), and presuming they can get yields to 60-70% to start Volume Production, then we'll see how they are doing.

SiliconFly · Jun 13, 2022

JamesJones44 said:
Based on the article that doesn't look to be correct. Based on transistor density Intel 4 should be closer to TSMC 5nm than 7.

                    Intel 4               Intel 7       TSMC N5               TSMC N3
HP Library Density  160 MTr/mm^2 (est.)   80 MTr/mm^2   130 MTr/mm^2 (est.)   208 MTr/mm^2 (est.)

Ur almost right. It's not just closer to TSMC N5. It's far better!

jeremyj_83 · Jun 13, 2022

rtx4090 said:
Ur almost right. It's not just closer to TSMC N5. It's far better!

You are basing this on a press release. We will not actually know which is better until there are chips in hand. Intel 4 also isn't being used for anything at the moment while TSMC N5 is being used.

Looking at your graph you also failed to show the HD Library Density in which N5 is listed but Intel doesn't plan anything. That to me means that Intel's HP Library is actually an HD and they will just allow power to go through the roof like they have for the last few years.

SiliconFly · Jun 13, 2022

Pat intends to launch meteor lake early so that it coexists with raptor lake. Mobile first. A concept vehicle for tiles & Foveros. So that he can focus all his engineering might on the real toughies, 20A & Arrow Lake. Designed specifically to kick Zen 5 where it hurts. If he pulls it off, he's gonna become a legend in the silicon universe. And it appears he's on track as of now.

AMD shud be worried!!!

jeremyj_83 · Jun 13, 2022

rtx4090 said:
Pat intends to launch meteor lake early so that it coexists with raptor lake. Mobile first. A concept vehicle for tiles & Foveros. So that he can focus all his engineering might on the real toughies, 20A & Arrow Lake. Designed specifically to kick Zen 5 where it hurts. If he pulls it off, he's gonna become a legend in the silicon universe. And it appears he's on track as of now.

AMD shud be worried!!!

We will see what happens. I think you are over hyping and it will under deliver, just like Ice Lake.

SiliconFly · Jun 13, 2022

jeremyj_83 said:
We will see what happens. I think you are over hyping and it will under deliver, just like Ice Lake.

I think you missed the point. Zen 4 is using the slower TSMC N5 high density library. Whereas, meteor lake is using the faster Intel 4 high performance library (fewer design/manufacturing issues). AMD is doing that to cut costs. Intel already has a half node performance advantage there!!!

But, unexpected manufacturing issues can always arise. That's why they want to launch meteor lake close to raptor lake and mobile first (like tiger lake). Playing it safe until the process matures & the yield goes up. Intel 4 is not a very important node for them and meteor lake isn't a very important product for them. 20A & Arrow Lake are!!! I'm guessing by 2024. That's where they hit the sweet spot.

But here's a FUN FACT:
==================

Raptor lake is on the inferior intel 7 while the upcoming zen 4 is on the far superior TSMC N5P. Simply put, N5P is light years ahead of intel 7. But even then, the zen 4 cpus (w/o 3d vc) will still be "only on par" with raptor lake even though they have MORE THAN A FULL NODE process advantage. This is serious stuff!!!

Meaning, raptor cove will kick the life out of zen 4 if you put it on TSMC N5!!! Now, imagine that they're going to put something far better on a process node that is far superior to N5P next year. That's meteor lake guys. Don't take it lightly.

shady28 · Jun 13, 2022

It's been known for a while that Intel 4 was going to be better than TSMC N5 and its derivatives (including TSMC 'N4' which is another revision of their N5).

This doesn't mean that Intel is ahead of TSMC, as TSMC is moving towards its N3 node. However, TSMC fell behind on N3 and didn't deliver mass production in time this year - the Apple A16 coming this fall should have been on N3, but instead it is on an N5 derivative.

Where that comes into play is that Apple will likely be the only company getting access to TSMC N3 in 2023. This is the gap that Intel has been looking for, as they will likely have a superior node in 2023 to anything AMD can get their hands on.

All of this revolves around Intel getting both Intel 4 up and running for mass production in 2023, and the future Intel 3 in late 2023/early 2024.

Consider, at the point when Intel should have Intel 3 launching - they'll be a full two nodes ahead of what AMD has access to and have equivalency to the most recent node from TSMC in Apple products.

If they can actually pull those node advancements off - and I'm not saying they can - but if they do, they'll be dominant in the x86 space.

So, as observers, we just have to wait and see if there are any signs of Intel's node schedules slipping.

PCWarrior · Jun 14, 2022

-Fran- said:
That's pretty much all there is to compare against Intel since they're just not producing proper new server CPUs. They'll be releasing something now, aren't they? And that will be competing with Milan-X and whatever is based off ~~Zen4~~ Zen 3 now. I can't remember all the names TBH, so sorry for not being super specific.

Well, I am going to start my answer with your own words…

-Fran- said:
the thread is not just about AL and ML, but the process node.

What I explained in my previous post is that a 64-core running at 2GHz will always win in efficiency against a 40-core running at 3.2GHz even if both cpus had the same architecture, the same overall performance and were manufactured on the same process node. So EPYC possibly winning on some efficiency metrics against Xeon (a Xeon on a two generations old microarchitecture mind you) is not a feat of the process node. It is a feat of the design choice of going with more cores and running them at lower clock speeds.
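The core-count versus clock-speed argument above can be sketched numerically. This is only a minimal sketch: it assumes the textbook dynamic-power relation (per-core power roughly proportional to V^2 * f) and, purely for illustration, that voltage scales linearly with frequency, so per-core power grows with the cube of the clock. Those scaling assumptions and the function names are not from the thread, and static and uncore power are ignored.

```python
def relative_package_power(cores: int, freq_ghz: float, ref_freq_ghz: float = 2.0) -> float:
    """Relative dynamic power of `cores` identical cores at `freq_ghz`.

    Assumption (not from the thread): per-core power ~ V^2 * f with V
    proportional to f, so per-core power scales with (f / ref)^3.
    """
    return cores * (freq_ghz / ref_freq_ghz) ** 3


def relative_throughput(cores: int, freq_ghz: float) -> float:
    """Idealised throughput: perfectly parallel work, same IPC on every core."""
    return cores * freq_ghz


wide = (64, 2.0)    # many cores, low clock
narrow = (40, 3.2)  # fewer cores, high clock

for name, (cores, freq) in (("64c @ 2.0GHz", wide), ("40c @ 3.2GHz", narrow)):
    perf = relative_throughput(cores, freq)
    power = relative_package_power(cores, freq)
    print(f"{name}: throughput={perf:.0f}, power={power:.1f}, perf/power={perf / power:.2f}")
```

Under these assumptions both configurations deliver the same idealised throughput (128 core-GHz), but the 40-core part draws about 2.56 times the dynamic power. Real chips will not follow a clean cubic law, so treat the size of the gap as illustrative only.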

And no you are not comparing their best. By comparing Icelake to Milan we are not comparing the best Intel and AMD managed to extract from the latest nodes they have had access to. Although Zen 3 derivatives are really the best AMD managed to extract from TSMC’s 7nm, Icelake is certainly NOT the best that Intel managed to do on their 10nm node. For starters since the 10nm used for Icelake, Intel has had two new iterations of that manufacturing node. The first was the 10nm superfin that was used with Tigerlake and the second is the 10nm enhanced superfin (later renamed to Intel 7) that is currently used with Alderlake. Both of these improvements increased performance per watt compared to the 10nm used for Icelake in a way equivalent to moving to a new full node. Furthermore, and equally important is that since Icelake we had two new microarchitectures (Willow Cove and Golden Cove). Besides the increase in IPC a microarchitecture can improve efficiency as well. AMD managed just that with Zen 3 against Zen 2. Remember both Zen 2 and Zen 3 were made on TSMC’s 7nm yet Zen 3 is more efficient due to the new architecture. What is more is that Intel has yet another architecture on their Intel 7 node that is expected to have even better efficiency. It is foolish really to think that anyone can optimise and tune a given node better than vertically integrated Intel especially given the impressive recent record of Intel at optimising and extracting the most out of their nodes (both in 14nm and now at 10nm).

-Fran- said:
I already provided evidence that AMD is ahead via AnandTech's server tests. Do you need me to Google more for you? I could also get more Laptop data so I can double down on the point, but nah.

No you did not provide such evidence. Just because you linked an article from a reputable website doesn’t make the content of what you linked relevant to the topic in discussion.

-Fran- said:
The fact of the matter is: Intel even using a more efficient design: monolithic and bigLITTLE is barely on par with AMD's chiplet approach which is inherently less efficient on the desktop and server front.

You are misguided here on multiple fronts.

1. The hybrid design really only comes to play in overall power consumption comparisons across e.g. a daily workload including heavy, mixed and light workloads as well as idling. And in such comparisons Intel is doing wonders. Comparisons when pummelling the cpu with a single heavy workload such as Blender or Prime 95 are irrelevant.

2. Intel’s main reason for using a hybrid design is not really efficiency (it is one of the reasons, but not the main one, at least not in desktops). Instead the main reason is area efficiency for multithreaded (MT) performance. You see, the only reason to have more than 8 cores is scalable MT performance, and it turns out that a cluster of 4 E-cores (which occupies the same area as 1 P-core) offers twice the MT performance of 1 P-core for the same power envelope. In other words, with 8P+8E cores Intel is achieving the same MT performance as a 12P+0E cpu while only using the same die area as a 10P+0E cpu.

3. Although in workloads like Cinebench and Blender the efficiency cores add enough performance for Intel to match or beat the performance of AMD’s 12 and 16 cores without requiring much extra die space or needing much extra power, it is still important to note that Alderlake cpus only have 8 big/performance cores and most performance comes from these 8 big cores. So we really again have the situation of comparing a cpu with fewer cores but higher per core performance (due to running at higher clockspeed and having higher IPC that comes from more per-core computational resources) going up against twice the cores at lower per core performance (due to lower clockspeed and lower IPC). The former is never going to win in power consumption metrics.

4. Power consumption alone is not a proper metric for efficiency. Energy is what is. If a cpu pulls twice the power but completes the job in half the time versus another cpu, then both cpus have used the exact same energy and have the same energy efficiency (and the former also saves you half the time). Power consumption is only relevant in cooling.
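Points 2 and 4 in the list above reduce to simple arithmetic, which can be sketched as follows. The function names are hypothetical, the cluster ratios are the post's own claims rather than measured values, and the wattages and run times in the energy example are made-up illustrative numbers.

```python
from __future__ import annotations


def p_core_equivalents(p_cores: int, e_cores: int,
                       mt_per_cluster: float = 2.0,
                       area_per_cluster: float = 1.0,
                       cluster_size: int = 4) -> tuple[float, float]:
    """Return (MT performance, die area), both in units of one P-core.

    Defaults encode the post's claim: a 4 E-core cluster has the area of
    one P-core and twice its multithreaded performance.
    """
    clusters = e_cores / cluster_size
    mt_perf = p_cores + clusters * mt_per_cluster
    area = p_cores + clusters * area_per_cluster
    return mt_perf, area


def energy_joules(power_watts: float, seconds: float) -> float:
    """Energy used by a job: average power multiplied by run time."""
    return power_watts * seconds


# 8P+8E expressed in P-core units of MT performance and area.
print(p_core_equivalents(8, 8))

# Hypothetical job: twice the power for half the time uses the same energy.
print(energy_joules(200.0, 50.0), energy_joules(100.0, 100.0))
```

With the post's ratios, 8P+8E comes out at 12 P-core equivalents of MT performance in 10 P-core equivalents of area, and the two hypothetical runs consume the same 10,000 J.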

-Fran- · Jun 14, 2022

PCWarrior said:
<many words>

Most of what you said is irrelevant or moot: whatever reason Intel had for not using the "best" 10nm could offer in the server market is no reason to not compare. This is essentially saying the comparisons between servers are not valid because Intel didn't use their very best 10nm node for them. That is, quite honestly, a strange, non-starter argument. Same with whatever you nitpick about the reasons why Intel decided to do bigLITTLE. Fact of the matter is, a monolithic die is always going to be more efficient than a chiplet approach. That is why Intel hasn't done it just yet. If they did, their designs would consume even more, so they just can't. They've had EMIB for a good 10 years now (maybe?) and have only started using it for real stuff with Sapphire Rapids? So, again, the fact AMD is still edging out Intel with a chiplet-based design doesn't bode well for them until they get their "Intel 4" node up and running. On this same topic, your simplified approach to power consumption on multiple cores is flawed: process and SoC packaging also matter a lot. It's no longer as simple as saying "moar coars bad for efficiency unless lower clocked"; Intel's bigLITTLE is proof of that.

What else... Nah, I'll stop there; everything else is a wash up.

Regards.

shady28 · Jun 14, 2022

jeremyj_83 said:
You are basing this on a press release. We will not actually know which is better until there are chips in hand. Intel 4 also isn't being used for anything at the moment while TSMC N5 is being used.

Looking at your graph you also failed to show the HD Library Density in which N5 is listed but Intel doesn't plan anything. That to me means that Intel's HP Library is actually an HD and they will just allow power to go through the roof like they have for the last few years.

There is / has been a lot more than press releases on the nodes. Places like semiwiki and others have looked at the technical specifications (the various pitches and technologies involved).

The reasons Intel moved from "nm" to 'Intel 7', 'Intel 4' and so on are well known.

Intel long ago stated how it came up with its density measurements, and tried to fight the good fight of honesty and transparency, but thanks to TSMC's marketing and a bucketload of seeming fanboy type misinformation, that backfired on them. TSMC and GloFo both counted the move to FinFET as if it were a full node die shrink, and so kept publishing nanometer measurements that were not comparable to Intel's methods.

Intel 7 (previously known as 10nm) is fully twice the density of TSMC's 10nm node and on par with TSMC's 7nm node. Intel 4 is halfway between TSMC's N5 and N3. It is interestingly also 65% more dense than Samsung's "8nm" node.

These are simply facts; attempting to deflect from them is at best disingenuous.

I for one am really glad to see an article from Tom's that points some of the meaningful technical aspects out, rather than just falling back on the misleading marketing terms.

Again, this doesn't mean Intel is ahead of TSMC. TSMC N5 and its derivative N4 are the highest density nodes currently in mass production. However, that could change by the end of 2022 if Intel stays on its roadmap.
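As a rough sanity check on the density claims in this thread, the ratios can be computed from the HP library figures in the table quoted earlier. This is a sketch under stated assumptions: the numbers are the table's values, most of them marked "(est.)", and they may not be the exact figures this post had in mind. The helper name is hypothetical.

```python
# HP library densities in MTr/mm^2 as quoted in the thread's table;
# every figure except Intel 7 is marked "(est.)" there.
hp_density = {"Intel 4": 160, "Intel 7": 80, "TSMC N5": 130, "TSMC N3": 208}


def density_ratio(node_a: str, node_b: str) -> float:
    """How many times denser node_a is than node_b (HP library only)."""
    return hp_density[node_a] / hp_density[node_b]


# Arithmetic midpoint between the N5 and N3 estimates.
n5_n3_midpoint = (hp_density["TSMC N5"] + hp_density["TSMC N3"]) / 2

print(f"Intel 4 vs Intel 7: {density_ratio('Intel 4', 'Intel 7'):.2f}x")
print(f"Intel 4 vs TSMC N5: {density_ratio('Intel 4', 'TSMC N5'):.2f}x")
print(f"TSMC N3 vs Intel 4: {density_ratio('TSMC N3', 'Intel 4'):.2f}x")
print(f"N5/N3 midpoint: {n5_n3_midpoint:.0f} MTr/mm^2")
```

On these figures Intel 4 is twice as dense as Intel 7 and roughly 1.23 times as dense as N5, and its 160 MTr/mm^2 estimate sits slightly below the 169 MTr/mm^2 midpoint of N5 and N3, so "halfway between" is approximately right for the HP library numbers.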

-Fran- · Jun 14, 2022

Amazing interview and some incredible insights from Wendel:

View: https://www.youtube.com/watch?v=MYJ_9zfqWUg

Regards.

jkflipflop98 · Jun 14, 2022

isofilm said:
That doesn't mean JACK, Intel is notorious for Binning (sorting chips based on performance), you think they just picked a PROTOTYPE out of a hat, NO, they picked the very best one they could find.

Intel has NEVER "stayed ahead of AMD", they have traded pole position many times in the last 30 years.

Until they have enough EUV tools to start Risk Production (20% yields), and presuming they can get yields to 60-70% to start Volume Production, then we'll see how they are doing.

Everyone bins their chips. It's one of the basics of the business.

Intel has only lost the performance leadership position one other time in all of history. Then Conroe came out and that was the end of that.

You kids have such short memories.

-Fran- · Jun 14, 2022

jkflipflop98 said:
Everyone bins their chips. It's one of the basics of the business.

Intel has only lost the performance leadership position one other time in all of history. Then Conroe came out and that was the end of that.

You kids have such short memories.

Strange you say that... Forgetting AMD's K6-2 300 or good ol' Thunderbird Athlons or Barcelona/Toledo Athlon64s? He is not wrong overall: AMD and Intel have been trading places ever since AMD had to start making their own designs. Intel, historically, managed to get away with it at times mostly because of their grotesque manufacturing advantage (leaving "incentive" shenanigans aside), but they're still playing catch up now and you can see how they depend as much on good design as on a good manufacturing process.

Regards.

PCWarrior · Jun 14, 2022

-Fran- said:
Nah, I'll stop there;

Sigh. There is not much point in continuing this discussion as you don’t seem to understand the nuances of what is explained to you (or rather you do understand but you simply choose to ignore it and just repeat your misguided half truths). For the sake of completeness though, I will bother to answer the following point.

-Fran- said:
On this same topic, your simplified approach to power consumption on multiple cores is flawed: process and SoC packaging also matter a lot. It's no longer as simple as saying "moar coars bad for efficiency unless lower clocked"; Intel's bigLITTLE is proof of that.

Never said that this is the only reason. In fact I said exactly the opposite. But because this is one of the big reasons for the power consumption differences we see between Intel and AMD cpus (especially in reviews comparing flagships like the 12900K and the 5950X) and because here we are talking about process nodes it follows that if you really want to compare apples to apples and see which node is superior you have to at least eliminate as many variables as you can and therefore (like I already said in my original post) compare equal core-count cpus running at the same frequency.

In the desktop space the best comparison would be between the octacore 5800X, the octacore 5800X3D, the octacore 12700K (with the 4 e-cores disabled) and the octacore 12900K (with the 8 e-cores disabled) all running at a fixed 4.4GHz (both single and all-core) and tuning each cpu for the lowest voltage for stability at this frequency. Then you will see who truly wins in terms of efficiency. Tom’s Hardware please make this comparison happen, you can. As for the server space the best is to compare a 32-core Intel versus both Zen 2 and Zen 3 32-core AMD cpus (remember that both Zen 2 and Zen 3 are on 7nm). For example, compare the (Icelake) Xeon Platinum 8362 against (Milan) EPYC 7543 and (Rome) EPYC 7542. All running at equal clock speeds (say 2.8GHz). You will see that their efficiency difference is much smaller than you were led to believe (and the win does not always go one way).

-Fran- · Jun 14, 2022

PCWarrior said:
Sigh. There is not much point in continuing this discussion as you don’t seem to understand the nuances of what is explained to you (or rather you do understand but you simply choose to ignore it and just repeat your misguided half truths). For the sake of completeness though, I will bother to answer the following point.

Never said that this is the only reason. In fact I said exactly the opposite. But because this is one of the big reasons for the power consumption differences we see between Intel and AMD cpus (especially in reviews comparing flagships like the 12900K and the 5950X) and because here we are talking about process nodes it follows that if you really want to compare apples to apples and see which node is superior you have to at least eliminate as many variables as you can and therefore (like I already said in my original post) compare equal core-count cpus running at the same frequency.

In the desktop space the best comparison would be between the octacore 5800X, the octacore 5800X3D, the octacore 12700K (with the 4 e-cores disabled) and the octacore 12900K (with the 8 e-cores disabled) all running at a fixed 4.4GHz (both single and all-core) and tuning each cpu for the lowest voltage for stability at this frequency. Then you will see who truly wins in terms of efficiency. Tom’s Hardware please make this comparison happen, you can. As for the server space the best is to compare a 32-core Intel versus both Zen 2 and Zen 3 32-core AMD cpus (remember that both Zen 2 and Zen 3 are on 7nm). For example, compare the (Icelake) Xeon Platinum 8362 against (Milan) EPYC 7543 and (Rome) EPYC 7542. All running at equal clock speeds (say 2.8GHz). You will see that their efficiency difference is much smaller than you were led to believe (and the win does not always go one way).

That is false though (italic bold). The "package power" is higher on Intel than AMD all the time: https://www.anandtech.com/show/1621...e-review-5950x-5900x-5800x-and-5700x-tested/8 vs https://www.anandtech.com/show/1704...hybrid-performance-brings-hybrid-complexity/4

If you look strictly at P cores, AMD is still better even with the package power overhead disadvantage. Well, I guess we can't see what the overhead of the uncore in Intel is, but the triple ring BUS still sucks energy, unless they're using Tiger Lake's mesh? I can't remember. It still uses way more and even when you allow AMD to go over the 142W power limit, it is still a tad more efficient, but then both just look stupid; difference is Intel was forced to do it, kind of. And this is the consumer design of Zen. I'm sure in the server side this comparison, at this level, is even more telling. Too bad there aren't many in-depth reviews we can read about for Server.

As for your second point, fixing the clock speeds is not a good comparison. They're using different process nodes and their optimal speeds are set at a different power envelope / voltage points for sure. I'd be willing to say, somewhat confidently, AMD's Zen design is optimized for the 3GHz range (as you correctly point out later), as the server Zen has been running around that mark since introduced. Same-ish with Intel. Well, point is, I still believe locking them at the same speed will necessarily put one at an advantage and vice versa. I would say you could lock the performance (where you get the same amount of "processing" done) and then check how much power was used instead; I'm sure AMD will show a big enough gap. That may be a more interesting test, but it'll be application dependent (even more). Either way, I'll agree that at their most efficient point, both are going to be so close it is indeed a moot point, but for people who run farms of these things, single watt differences per unit of work (bench) may be a few thousand dollars? I don't know, but there's a reason AMD has an advantage in server. Look at the link I put above where Tom and Wendel talk about these things; it's quite interesting. Maybe you'll understand where I'm coming from a tad better when you listen to it. Just "cores" and "hertz" is not enough to explain the whole picture.

Regards.

Danra · Jun 14, 2022

Admin said:
Intel debuted the details of its 'Intel 4' process node and shared the first diagram of its Meteor Lake die.

Intel Debuts Meteor Lake Die, 'Intel 4' Node: 20% Higher Clocks at Same Power, 2X Area Scaling : Read more

This is about "Meteor Lake", right? What is with all the fan boy comments about AMD being better than something Intel has not released? I own a PC, it is using an Intel i7 12700K and it does fine for my use. If someone wants AMD that's okay with me, just please stop the fan boy flaming comments. I am here to enjoy computer topics, not read flames.

jkflipflop98 · Jun 15, 2022

Danra said:
This is about "Meteor Lake", right? What is with all the fan boy comments about AMD being better than something Intel has not released? I own a PC, it is using an Intel i7 12700K and it does fine for my use. If someone wants AMD that's okay with me, just please stop the fan boy flaming comments. I am here to enjoy computer topics, not read flames.

JayNor · Jul 1, 2022

Is it not contradictory to report that Alder Lake's process is somehow inferior while it objectively outperforms the competition in numerous significant benchmarks? I would guess their Intel-7 process tweaks and the reported MIM capacitor additions enabled Intel to move to PCIE5 and DDR5 a year ahead of the competition.

JayNor · Jul 1, 2022

I haven't seen a discussion of the SIMD performance of the single threaded Gracemont cores on Alder Lake. Intel turns off hyperthreading for some SIMD benchmarks, likely due to the inefficiency of thrashing the SIMD registers. So, for these types of benchmarks, Alder Lake's small cores should be closer in performance to full cores.

JayNor · Jul 1, 2022

Intel's fab process now includes its Ponte Vecchio 3D fabrication. One thing that Intel has been secretive about is that their 408 MB of L2 cache is largely embedded in the base tile. This was reported, maybe inadvertently, in W. Gomes's paper, "Ponte Vecchio: A Multi-Tile 3D Stacked Processor for Exascale Computing". If this proved successful, I'm guessing it will not be long until it appears in the tiled server chips, which are already getting the HBM tiles from their GPU cousins.

