News: Intel Details 3D Chip Packaging Tech for Meteor Lake, Arrow Lake, and Lunar Lake

So is the magic here that 3D Foveros acts as a substrate serving as the fabric interconnect bus, simply making it easier to arrange chiplets (tiles) at high package density, or are the chiplets themselves stacked? With heat being the biggest limiter, won't stacking just amplify thermal issues?
 
I think the most telling detail in all of this is the slide targeting "2024+" for Performance per Watt Leadership. That is the only metric that matters for server and mobile, which is where the money is.
Intel is basically saying they don't expect to have a competitive product on the market for at least 3 years.

Maybe they'll be able to crank out some 250W+ monstrosity that gets the best gaming performance... but high-performance gaming PCs aren't really that big a market, especially since everybody just upgraded their home PC within the last year or two.
 
So is the magic here that 3D Foveros acts as a substrate serving as the fabric interconnect bus, simply making it easier to arrange chiplets (tiles) at high package density, or are the chiplets themselves stacked? With heat being the biggest limiter, won't stacking just amplify thermal issues?
Looks like the former. Not all that 3D, just grouped on an interposer like Vega and HBM. Probably won't be any harder to cool.
Hopefully they won't have too much in the way of current-delivery limitations. Not that I want to dump 350 W into my CPU; I just like to have the option available.
Come to think of my Vega comparison, 350 W should be fine.
 
That is the only metric that matters for server and mobile, which is where the money is.
That's not even close to being the only metric for servers. With finite heat budgets and physical space, how many cores you can fit into a case, or how many CPUs you can fit onto a single mobo, is just as important.
Then a lot of software charges per core used, so whatever can do the same amount of work with fewer cores is going to be king there.
Then there is hardware support.
Also software support.
And also just availability in the large quantities a server fleet would need.

If AMD only wins in perf/watt, that's a very small percentage of server customers that want only that and nothing else.
 
So is the magic here that 3D Foveros acts as a substrate serving as the fabric interconnect bus, simply making it easier to arrange chiplets (tiles) at high package density, or are the chiplets themselves stacked? With heat being the biggest limiter, won't stacking just amplify thermal issues?

No, there won't be any heat issue. There is no stacking here at all; it's still all in 2D.

AMD did almost the same thing back in 2015 with Fiji and HBM. The only difference is that Fiji itself is a single die (the HBM is not part of the GPU die), while Intel has four separate dies.

Using a silicon interposer is obviously a lot more expensive than an organic one (using the substrate itself), but there are benefits, like the ability to place the tiles much closer together and reduce latency. Having multiple dies also means you need to flatten the surface, since there will be minute height variations between dies. More work needs to be done.

Having said that, I like this method of putting different dies together. It lets you use a different process for each die, and best of all, you can go beyond the reticle limit.
 
That's not even close to being the only metric for servers. With finite heat budgets and physical space, how many cores you can fit into a case, or how many CPUs you can fit onto a single mobo, is just as important.
Then a lot of software charges per core used, so whatever can do the same amount of work with fewer cores is going to be king there.
Then there is hardware support.
Also software support.
And also just availability in the large quantities a server fleet would need.

If AMD only wins in perf/watt, that's a very small percentage of server customers that want only that and nothing else.

That's 100% correct.

To put this in perspective, I'm looking at a bill I pulled up from 2020 for a purchase of 48 core licenses for SQL Server. They charge per core, and for an outright purchase (not a subscription) it's about $160,000 with a corporate discount. It looks like it costs about 30% more right now.

In that kind of model, per-core performance is paramount. If I can get 30% more transactions with core X than with core Y, then I need roughly a quarter fewer licenses (1/1.3 of the cores), and that translates directly into a lower licensing cost, on the order of $50,000 in the 48-core example above at today's prices.

And this is just SQL Server; the OS also charges based on cores. Microsoft's retail pricing for the Datacenter edition of Windows Server is $6,155 per core.

And Oracle... jeez. $47,000 per core times a core factor. For Intel/AMD that factor is 0.5, so $47,000 × 0.5 = $23,500 per core.

Whoever has the fastest core here wins.
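
If you want to sanity-check that arithmetic, here's a minimal sketch in Python. The per-core prices are just the figures quoted above (and may well be out of date); the 30% speedup and 48-core baseline are the example from this post.

```python
import math

# Per-core license prices, taken from the figures quoted above;
# treat them as ballpark, not official pricing.
SQL_PER_CORE = 160_000 / 48 * 1.30   # 2020 bill, bumped ~30% to today
WINDOWS_PER_CORE = 6_155             # Windows Server Datacenter, per core
ORACLE_PER_CORE = 47_000 * 0.5       # list price x 0.5 core factor (x86)

def cores_needed(baseline_cores: int, speedup: float) -> int:
    """Cores of the faster CPU needed to match the baseline throughput."""
    return math.ceil(baseline_cores / speedup)

baseline = 48
faster = cores_needed(baseline, 1.30)   # 30% more transactions per core
saved = baseline - faster               # 48 -> 37, i.e. 11 fewer licenses

print(f"{faster} cores instead of {baseline} ({saved} fewer licenses)")
print(f"SQL Server:         ${saved * SQL_PER_CORE:,.0f} saved")
print(f"Windows Datacenter: ${saved * WINDOWS_PER_CORE:,.0f} saved")
print(f"Oracle:             ${saved * ORACLE_PER_CORE:,.0f} saved")
```

Run it and the SQL Server line alone comes out near $48,000, which is where the ballpark above comes from; stack the OS and database licenses and the faster core pays for itself many times over.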
 
That's 100% correct.

To put this in perspective, I'm looking at a bill I pulled up from 2020 for a purchase of 48 core licenses for SQL Server. They charge per core, and for an outright purchase (not a subscription) it's about $160,000 with a corporate discount. It looks like it costs about 30% more right now.

In that kind of model, per-core performance is paramount. If I can get 30% more transactions with core X than with core Y, then I need roughly a quarter fewer licenses (1/1.3 of the cores), and that translates directly into a lower licensing cost, on the order of $50,000 in the 48-core example above at today's prices.

And this is just SQL Server; the OS also charges based on cores. Microsoft's retail pricing for the Datacenter edition of Windows Server is $6,155 per core.

And Oracle... jeez. $47,000 per core times a core factor. For Intel/AMD that factor is 0.5, so $47,000 × 0.5 = $23,500 per core.

Whoever has the fastest core here wins.

Well, except for the many companies that are stepping away from Microsoft, Oracle, and the like, and transitioning to FOSS solutions. Sure, some need MS SQL or Oracle-specific features, but many are fine with PostgreSQL or MariaDB, which translates to less Windows Server and more CentOS (Rocky?) and Ubuntu, which in turn pushes less Hyper-V and related tech and more Proxmox, KVM, QEMU, and so on.

I come from a very Microsoft-centric company, but in the last 5 years or so we've heavily transitioned to FOSS, 100% because of licensing costs. Starting with LibreOffice for 90% of client PCs, then Debian, Ubuntu, CentOS, MySQL, MariaDB, more recently PostgreSQL, but also related stuff like PHP, Laravel, Apache, nginx, etc.

We used to have Windows desktop applications connecting to Microsoft SQL, managing thousands of custom machines that ran Windows Embedded variants, and that all led to Exchange, Dynamics NAV, Windows Server galore, MS Office everywhere, and over time more .NET desktop applications, ASP web apps, etc. That all originated around 2007/2008. Now we still have Windows client PCs and Windows AD, but most of the embedded stuff is open-source based, applications have transitioned to web apps on open-source tech (as mentioned above), and so on. Some stuff is hard to change, e.g. Exchange and Dynamics NAV, but those licensing costs pale in comparison to everything else that was changed.

Now back to the topic 😃 Just look at big names like Facebook, Google, and Amazon. They all run systems based on open source, customized to their own needs. Google would be broke if they used Oracle ;D More and more of the software in the server environment is license-free.

Thus, the number of cores or per-core performance is irrelevant for those enterprises. They want performance per watt and performance per rack. Performance per watt saves electricity, also saves on cooling, which saves more electricity and saves space, which allows more racks and better density. And then perf per rack seals the deal. That's why everyone is experimenting with ARM and 144-core low-power CPUs; they aren't aiming to save $100k on per-core licensing, they already saved 100% on licensing, and they will save hundreds of thousands on electricity each month and millions on datacenter buildout.
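
To make the electricity side concrete, here's a rough back-of-the-envelope sketch. Every input (fleet size, watts saved per server, PUE, electricity rate) is a made-up round number for illustration, not a measured figure:

```python
# Hypothetical fleet-level electricity math; all inputs are illustrative.
SERVERS = 10_000
WATTS_SAVED_PER_SERVER = 100   # e.g. picking the more efficient CPU
PUE = 1.5                      # datacenter overhead: cooling, power delivery
DOLLARS_PER_KWH = 0.10
HOURS_PER_YEAR = 24 * 365

kwh_saved = SERVERS * WATTS_SAVED_PER_SERVER * PUE * HOURS_PER_YEAR / 1000
print(f"~${kwh_saved * DOLLARS_PER_KWH:,.0f} saved per year")
# -> roughly $1.3 million per year, before counting the denser racks
```

Even with those modest numbers it's seven figures a year, which is why hyperscalers chase perf/watt rather than per-core licensing.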

To top all that, AMD already has perf per watt and perf per unit of space; they just need to keep executing. I am sure that eventually Foveros will help Intel, but it remains to be seen whether it's enough to surpass AMD, or whether it's just too little, too late.

Just take a look at the supercomputer TOP500. Once, Intel was untouchable on that list, with something like 99% of systems using Intel CPUs. Now contrast that with the June 2022 update: "All 3 new systems in the top 10 are based on the latest HPE Cray EX235a architecture, which combines 3rd Gen AMD EPYC™ CPUs optimized for HPC and AI with AMD Instinct™ 250X accelerators, and Slingshot interconnects."

Cheers!
 
Performance per watt saves electricity, also saves on cooling, which saves more electricity and saves space, which allows more racks and better density. And then perf per rack seals the deal.
If only things were that simple...
Going with desktop data since we don't have any server data.
Using the same wattage, 120 W for the 5950X and 125 W for the 12900K, the 5950X runs 20% hotter, which would mean 20% higher cooling cost, or 20% fewer CPUs you could fit into the same space due to cooling.
And the 5950X provides 9% better performance at that 120 W, so it does not level out the increased heat; the better performance per watt could not compensate for the loss due to cooling.
The 5950X does win on perf/watt, but if you also look at heat and cooling, the 12900K would win.
And then if you hit a choke point in your server, a huge spike in work, with the 5950X you would be screwed; you would have to tell your customers "better luck next time," because no matter how much you juice it, it won't run any faster. With the 12900K, if you have additional cooling and power to spare, you can boost it and at least do better than that.
https://www.hardwareluxx.de/index.p...-desktop-cpus-alder-lake-im-test.html?start=8
 
Intel's Ponte Vecchio uses this Foveros stacking. Its matrix peak performance is reported as 839 TFLOPS bfloat16 vs. 383 TFLOPS BF16 on the MI250X. See the ServeTheHome coverage for the details.
 
I read a patent on Intel's EMIB which indicated the processing reduced the resistance and enabled switching by gates at the endpoints.

By creating the Foveros connections on silicon, they can use the same straining or doping tricks they use on their transistors.

I suppose by creating these as separate tiles, they can also take advantage of whatever they've developed for their wafer testing of tiles.
 
Having multiple dies also means you need to flatten the surface, since there will be minute height variations between dies.

I wonder if different production lines, i.e. N3, N5, N7, etc., will result in different die heights, or do they only affect surface area (width and length)? 🤔
 
That's not how it works. If the CPU is consuming X watts, then your cooling system needs to dissipate X watts (assuming more or less steady-state).
But one CPU is still going to run much cooler than the other if the cooling dissipates 125 W for both.

And then the 12900K has a max temp of 100°C up to which it can keep all of its performance, while the 5950X is at 90°C, so you have to cool it more; you will have to ramp the cooling above 125 W more often to keep the CPU at 90°C than you would to keep it at 100°C.

The 5950X hits 80°C at 125 W with a ceiling of 90°C, so there isn't much wiggle room there; you have 10 degrees to work with on the 5950X, while you have 30 on the 12900K.
 
But one CPU is still going to run much cooler than the other if the cooling dissipates 125 W for both.

And then the 12900K has a max temp of 100°C up to which it can keep all of its performance, while the 5950X is at 90°C, so you have to cool it more; you will have to ramp the cooling above 125 W more often to keep the CPU at 90°C than you would to keep it at 100°C.

The 5950X hits 80°C at 125 W with a ceiling of 90°C, so there isn't much wiggle room there; you have 10 degrees to work with on the 5950X, while you have 30 on the 12900K.

Sorry, late reply. But you're talking desktop because "we don't have data", yet we do have data, and even on desktop you're looking at it wrong. 120 W and 125 W, and you ignore the 240 W at the end of the chart for the 12900K (?). It's literally about 240 W Intel vs ~170 W AMD at roughly the same performance (!!)

As for servers, you just need to search a little. Even the news about the new gen from both camps: AMD at 400 W TDP, and Intel at 350 W PL1 (base clock) (!!). Plus AMD has way more cores and performance.

So sorry, I can't agree with the picture you're trying to paint here. We will see in a few months when Intel finally releases the new gen, but so far they're getting crushed.

And again, I point to the top supercomputer entries last quarter, and, well, all year; check how many are Intel and how many AMD. Trust me, people building petaflop-sized supercomputers very carefully consider all aspects, and they DO have data, even many months ahead of any hardware leaks and reveals, and they're picking AMD.
 
Sorry, late reply. But you're talking desktop because "we don't have data", yet we do have data, and even on desktop you're looking at it wrong. 120 W and 125 W, and you ignore the 240 W at the end of the chart for the 12900K (?). It's literally about 240 W Intel vs ~170 W AMD at roughly the same performance (!!)
It has both measurements, at 125 W for the 12900K and at 241 W as well; that's the whole point of the benchmark.
You can't take just one of them and pretend it's the only one there is; both are realistic settings.

At 125 W the 5950X is roughly 10% above the performance of the 12900K at the same power draw; at 240 W the 12900K is roughly 10% above the performance of the 5950X at 70 W more power.

As a user you can pick and choose which of the two power levels (or any point in between) you want to use.
Here you can see different power levels and how much the performance changes, so you can choose the best power level for you.
And basically a KS at 1.2 V vcore and 160 W is the best you can do for efficiency.
For home users you can argue that they are all stupid and will never ever look into this, but for server customers this is the first thing they will do; even before ordering their systems, they will get a single unit and test the crap out of it to find the best efficiency point.
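
That "test the crap out of it" step is essentially a sweep over power limits. Here's a minimal sketch of what the selection looks like; the (watts, score) pairs and the throughput target are hypothetical placeholders, not real benchmark data:

```python
# Pick the most efficient power limit that still meets a throughput target.
# The sweep data and TARGET below are made up for illustration.
sweep = [(88, 20000), (125, 24500), (160, 27000), (200, 28500), (241, 29500)]
TARGET = 26000   # minimum acceptable score for the workload (hypothetical)

eligible = [(w, s) for w, s in sweep if s >= TARGET]
watts, score = max(eligible, key=lambda p: p[1] / p[0])
print(f"Run at {watts} W: {score} points, {score / watts:.1f} points/W")
# Without the target, the lowest power limit always wins on raw perf/W;
# the throughput floor is what pushes the answer to a mid-range limit.
```

With these placeholder numbers the sweep lands on 160 W, which is the same kind of sweet spot described above.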
 
It has both measurements, at 125 W for the 12900K and at 241 W as well; that's the whole point of the benchmark.
You can't take just one of them and pretend it's the only one there is; both are realistic settings.

At 125 W the 5950X is roughly 10% above the performance of the 12900K at the same power draw; at 240 W the 12900K is roughly 10% above the performance of the 5950X at 70 W more power.

As a user you can pick and choose which of the two power levels (or any point in between) you want to use.
Here you can see different power levels and how much the performance changes, so you can choose the best power level for you.
And basically a KS at 1.2 V vcore and 160 W is the best you can do for efficiency.
For home users you can argue that they are all stupid and will never ever look into this, but for server customers this is the first thing they will do; even before ordering their systems, they will get a single unit and test the crap out of it to find the best efficiency point.

So your words basically confirm what I say, but your tone is still trying to say I am wrong.

At 125 W vs 125 W, AMD is faster, so server customers would choose AMD.

At 240 W vs 170 W, Intel is 10% faster for 40% extra power and is outside its efficiency curve, so server customers would choose AMD.

You also confirm the option is there for consumers, but most won't ever use it. And not just for lack of knowing about the options. How many people buy a 12900K to cripple it with 40% less power? If you wanted lower power, you'd buy a lower-powered CPU.
 
If only things were that simple...
Going with desktop data since we don't have any server data.
Using the same wattage, 120 W for the 5950X and 125 W for the 12900K, the 5950X runs 20% hotter, which would mean 20% higher cooling cost, or 20% fewer CPUs you could fit into the same space due to cooling.
If only things were that simple.

You mistakenly equate temperature with heat output. You can have two chips running at the same temperature with very different heat outputs, if the thermal conductivity of their cooling system (including the heat spreader and how it's coupled to the dies) differs.
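
A minimal sketch of that distinction, using the standard steady-state model T_junction = T_ambient + P × θ, where θ is the total thermal resistance of the path from die to air; the θ values here are hypothetical, just to show the point:

```python
# Two chips dumping identical heat into the room can still sit at very
# different temperatures if their thermal paths differ. The theta values
# below are made up for illustration.
T_AMBIENT = 25.0   # deg C
POWER = 125.0      # W of heat, identical for both chips

for name, theta in [("chip A, good thermal path", 0.30),
                    ("chip B, poor thermal path", 0.45)]:
    t_junction = T_AMBIENT + POWER * theta   # deg C per the model above
    print(f"{name}: {t_junction:.1f} C at {POWER:.0f} W")
# Both still require the cooling system to remove the full 125 W.
```

Same 125 W in both cases, yet one die reads ~63°C and the other ~81°C; the temperature difference tells you about the thermal path, not about the heat the cooling system must handle.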

What really matters is perf/W. And not on gaming benchmarks or even non-gaming desktop apps, but with server processors running server workloads. Since Watts = $, perf/W equates to operational perf/$.

The proof of the pudding is in the eating. Datacenter operators aren't dumb, and AMD's datacenter marketshare will ultimately tell us whether they offer compelling value.
 
Intel needs to quit releasing details and start releasing products on time.
That's not how it works. The people giving these presentations aren't the same ones doing the detailed design, testing, and production. Furthermore, the stuff they're presenting is boiled down from slides and diagrams they internally produce, anyway.

There are several reasons for presenting this stuff. One is to show customers they're on a comparable or better technology curve to the competition, so that customers will feel they have a future with Intel and not just look at how Intel is struggling to crisis-manage its recent stumbles. At a conference like Hot Chips, you also have folks from academia and the research community that Intel wants to recruit from.

Finally, there's a more shrewd business angle to a lot of their recent announcements, which is to drum up customers for their foundry business that you might not care about, but that Intel absolutely needs to succeed if they're to have a financially viable path forward. Fabs are a huge cost for Intel that their competitors don't carry, and they need to grow their foundry services into something that will compete with TSMC, so that it's not a giant boat anchor hung around their necks.