Where are you getting the 2% figure? Just from eyeballing this slide?
It looks bigger than that to me (maybe 3-4% for ST?), but we really shouldn't assume unitless graphs are that precise. More importantly, it's a far bigger chunk of the MT performance gains!
Even if Anandtech had bothered to compute the averages of their rate-1 SPEC2017 scores for us, it would still be difficult to divide out the frequency difference when we don't even know what frequency it boosted to, or for how long. For instance, if you simply use the "P-core Max Turbo" frequency, it amounts to a 5.9% improvement. However, there's also the Thermal Velocity Boost, which increased by more than that.
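To make the arithmetic concrete, here's a minimal sketch of the normalization I mean. The scores and frequencies below are purely hypothetical (none of them come from the review), and it assumes performance scales linearly with clock, which memory-bound workloads don't, so treat the result as an upper bound:

```python
def clock_normalized_gain(score_new, score_old, freq_new, freq_old):
    """Performance gain with the frequency difference divided out.

    Assumes perf scales linearly with clock -- an idealization that
    overstates the frequency contribution for memory-bound workloads.
    """
    perf_ratio = score_new / score_old
    freq_ratio = freq_new / freq_old
    return perf_ratio / freq_ratio - 1.0

# Hypothetical illustration: a 12% raw score gain alongside a
# 5.8 GHz vs. 5.5 GHz max-turbo difference.
print(f"{clock_normalized_gain(1.12, 1.00, 5.8, 5.5):.1%}")  # -> 6.2%
```

The catch is exactly the one above: which frequency do you plug in for `freq_new` when the chip spends an unknown amount of time at an unknown boost clock?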
That probably had a lot to do with the higher memory latency of Meteor Lake's interconnect and LPDDR5X memory. However, perhaps it was also impacted by whatever methodology he used to measure clock-normalized performance.
Quite frankly, I don't believe there was a real regression in Redwood Cove, and certainly not one that large. Intel said IPC was about the same, and I think that's pretty consistent with what other reviewers have observed.
Right, cause there's no internet on your side? I'll do it just this once.
https://www.anandtech.com/show/16084/intel-tiger-lake-review-deep-dive-core-11th-gen/8
That's the effect of increasing the L2 cache 2.5x, from 512 KB to 1.25 MB, on Tiger Lake.
Too bad they didn't bother to do the same clock-normalized comparison for the Rate-N (multithreaded) benchmarks, because I think you're falling for the fallacy that MT performance is simply ST * N.
The other thing you have to account for is how the latency changed. They discussed it, earlier in that review:
"The private L2 cache gets the biggest update, with a +150% increase in size. Traditionally increasing the cache size by double will decrease the miss rate by √2, so the 2.5x increase should reduce L2 cache misses by ~58%. The flip side of this is that larger caches often have longer access latencies, so we would expect the new L2 to be slightly slower. After many requests, Intel said that its L2 cache was a 14-cycle latency, which we can confirm, making it only +1 cycle over the previous generation. It’s quite impressive to more than double a cache size and only add one cycle of latency. The cache is also now a non-inclusive cache.
The L3 also gets an update, in two ways. The size has increased for the highest core count processors, from 2 MB per core to 3 MB per core, which increases the L3 cache line hit rate for memory accesses. However, Intel has reduced the associativity from 16-way at 8 MB per 4C chip to 12-way at 12 MB per 4C chip, which reduces the cache line hit rate, but improves the power consumption and the L3 cache latency. There is some L3 latency cycle loss overall, however due to the size increase Intel believes that there is a net performance gain for those workloads that are L3-capacity bottlenecked."
So, L2 latency increased 7.7% (one cycle on top of 13), and the L3 suffered both a loss of associativity and a latency increase of ?? %. As for the L2 becoming non-inclusive, I expect that to have virtually no impact. It just keeps the L2 from duplicating the contents of the tiny L1 caches, and is probably something they did to simplify cache coherence rather than to improve L2 utilization.
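For reference, the rule of thumb Anandtech quotes is a power law: miss rate ∝ size^(-1/2). A quick sketch of my own (not from the review) shows a 2.5x size increase leaves about 63% of misses, i.e. roughly a 37% reduction; the article's "~58%" looks like it took √2.5 ≈ 1.58 (1.58x fewer misses) and read it as a percentage:

```python
def miss_rate_scaling(size_ratio, exponent=0.5):
    """Rule-of-thumb scaling: miss_rate ~ cache_size ** -exponent.

    exponent=0.5 is the classic 'doubling the cache cuts misses by
    sqrt(2)' heuristic; real workloads vary around it.
    """
    return size_ratio ** -exponent

remaining = miss_rate_scaling(2.5)  # fraction of misses remaining
print(f"misses remaining: {remaining:.0%}, "
      f"reduction: {1 - remaining:.0%}")
# -> misses remaining: 63%, reduction: 37%
```

Either way, the point stands: fewer misses traded against slightly higher hit latency, and the net depends on the workload.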
Sure it is. "History doesn't repeat, but it rhymes" is the phrase.
Again, you've done nothing to show any more than superficial similarities. An actual historian would look for evidence of what was happening inside those organizations and why they made the choices they did, before drawing such analogies.
Why do you think AMD's cores perform nearly the same as Intel's, when Intel's cores are nearly 50% larger than AMD's? One is clearly better than the other.
Intel P-cores have larger out-of-order structures to extend their performance at high frequencies (where Zen 3 & 4 tend to hit a brick wall). That becomes very area-intensive. They could afford to make big P-cores, due to their hybrid strategy. If not for the E-cores, Intel probably wouldn't have made the P-cores so big.
Here's a summary of how the different OoO structures in their cores compare:
| Structure | Zen 4 | Zen 3 | Golden Cove | Comments |
|---|---|---|---|---|
| Reorder Buffer | 320 | 256 | 512 | Each entry on Zen 4 can hold 4 NOPs; actual capacity confirmed using a mix of instructions |
| Integer Register File | 224 | 192 | 280 | |
| Flags Register File | 238 | 122 | Tied to integer registers | AMD started renaming the flags register separately in Zen 3 |
| FP/Vector Register File | 192 | 160 | 332 | Zen 4 extends vector registers to 512-bit |
| AVX-512 Mask Register File | 52 measured + 16 non-speculative | N/A | (152 measured via MMX) | Since Skylake, Intel uses one RF for MMX/x87 and AVX-512 mask registers; however, Golden Cove does not officially support AVX-512 |
| Load Queue | 88 (136 measured) | 72 (116 measured) | 192 | All Zen generations can have more loads in flight than AMD's documentation and slides suggest; Intel and AMD have different load queue implementations |
| Store Queue | 64 | 64 | 114 | A bit small on Zen 4; would have been nice to see an increase here |
| Branch Order Buffer | 62 taken / 118 not taken | 48 taken / 117 not taken | 128 | |
Source: https://chipsandcheese.com/2022/11/05/amds-zen-4-part-1-frontend-and-execution-engine/
It was apparent after just 2 years that Netburst was a dead end, yet the entire company milked it for 5 more years. You could also see from extreme-overclocking records that 5 GHz wasn't achievable without exotic cooling, yet a company supposedly full of engineers thought 10 GHz was easily reachable. To this day, with the most exotic cooling and the most heavily binned parts, even the most easily clocked CPUs can't hit it.
Again, you're just looking at it from the outside. What actually happened was that Netburst was architected to scale up to 10 GHz, because they assumed Dennard Scaling would continue. It obviously didn't, but it wasn't immediately clear to them just how insurmountable the challenge of controlling leakage would be. So, for a while, they kept pinning their hopes on new process nodes, but that didn't pan out.
Also, the development times of CPUs are long. It takes between 3 and 5 years to bring a new CPU to market, depending a lot on the scale of the changes involved. To change course and deliver Core 2, they first had to come to terms with the fact that whatever they had planned to follow Netburst needed to be scrapped, and then switch gears to working on Core 2, which involved a lot more work than simply continuing to extend Netburst.
You don't need every explicit detail to understand what's going on, which is what you're saying is needed. Unlike "AI", which does, humans have the ability to read between the lines.
If you don't understand why certain decisions were made, you can't presume the same underlying logic will apply to future decisions.