Discussion: Why have we been stuck at 5 GHz for almost 18 years?


5 GHz on a processor was a big thing in 2003, and it still is today.

Tom's Hardware overclocked a CPU to 5 GHz in 2003 with liquid nitrogen. They set the record (which has since been broken).

The latest Core i9-11900K has a turbo boost frequency of 5.3 GHz.

So that is a big number, considering that this Core i9 is the most powerful Intel chip right now. (Maybe I'm wrong; corrections would be appreciated.)

Why aren't we increasing this number to 6 GHz?

I understand that some games benefit from higher clocks, while others benefit from more cores.
 
The 11900K can hit 7 GHz under LN2. The record is an FX-8370 at over 8 GHz.

IPC and efficiency are better now. Sure, hitting 6 GHz would be cool, but you'd need at least a 360 mm radiator to run it, a 1000 W PSU, and a motherboard that could handle it, and it would draw well over 300 W. On top of that, you'd have issues with the VRMs lasting an acceptable amount of time and the motherboard PCB warping from the heat.
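To put rough (illustrative) numbers on that: dynamic power scales as P ≈ C·V²·f, and since voltage has to climb roughly in step with frequency near the limit, power grows roughly with the cube of frequency. A quick sketch in C; the 250 W baseline and the linear V-f assumption are mine, not measured figures:

```c
#include <stdio.h>

/* Dynamic power: P = C * V^2 * f. If voltage must scale roughly
 * linearly with frequency, relative power goes as (f2/f1)^3.
 * The 250 W baseline is an assumed figure, not a measurement. */
int main(void) {
    double f1 = 5.3, f2 = 6.0;  /* GHz */
    double p1 = 250.0;          /* assumed package power at 5.3 GHz (W) */
    double ratio = f2 / f1;
    double p2 = p1 * ratio * ratio * ratio;
    printf("%.1f -> %.1f GHz: %.2fx the power, ~%.0f W\n",
           f1, f2, ratio * ratio * ratio, p2);
    return 0;
}
```

Under those assumptions, a ~13% frequency bump costs ~45% more power (~363 W), which lines up with the "well over 300 W" estimate.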
 
Honestly, it's a great question, OP. I don't know exactly why it's so difficult, but hitting higher frequencies gets exponentially harder.
Not always: prior to 2004, there was a steady frequency ramp from 4.77 MHz to 3600 MHz, and IPC improvements were a much smaller proportion of overall performance gains. IPC became the primary performance driver once practical CPU clock frequencies hit a ceiling in the 4-5 GHz range.

And the reason for this ceiling is simply that pipeline stages can get a lot more done, more efficiently, when they have a little more time per cycle than simpler pipeline stages would at higher frequencies.
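You can see the ceiling with a toy timing model: each stage's cycle time is its share of the logic delay plus a fixed latch/flip-flop overhead, so slicing the pipeline finer gives diminishing frequency returns. The delay figures below are made-up illustrative values:

```c
#include <stdio.h>

/* Toy pipeline-timing model: total logic delay is divided across N
 * stages, but every stage pays a fixed latch overhead, so frequency
 * saturates at 1/latch_ns no matter how deep you pipeline.
 * Both delay values are made-up illustrative numbers. */
int main(void) {
    double logic_ns = 3.00;  /* assumed total logic delay per instruction */
    double latch_ns = 0.15;  /* assumed per-stage latch/flop overhead */
    for (int stages = 5; stages <= 40; stages *= 2) {
        double cycle_ns = logic_ns / stages + latch_ns;
        printf("%2d stages -> %.2f GHz\n", stages, 1.0 / cycle_ns);
    }
    return 0;
}
```

With these numbers, doubling from 20 to 40 stages only raises frequency from ~3.3 to ~4.4 GHz, and the hard asymptote is 1/latch overhead, deeper pipelining can never get past it.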
 
As I recall, deep pipelines are a way to achieve extremely high clocks, which is what the Pentium 4 did. I remember (in the early P4 days) someone claiming that CPU clocks over 6 GHz would be common "soon".

But the problem with deep pipelines is pipeline stalls, where everything in the pipeline has to be discarded because of a "miss". So they have to add more/better prediction stages and larger L1 caches, and that just adds (exponentially) more heat-producing transistors and increases die size. The larger die is (or was) easily dealt with by process shrinks, the higher heat output with process improvements like FinFET. But there's only so much they can do, all while holding clock speeds right around the same levels.
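The classic way to see the cost of those misses for yourself is the sorted-vs-unsorted branch demo. A rough sketch (compile with low optimization, e.g. -O1, since at higher levels the compiler may turn the branch into a branchless conditional move and hide the effect; exact timings vary by CPU):

```c
#include <stdio.h>
#include <stdlib.h>
#include <time.h>

#define N 100000
#define REPS 1000

static int cmp_int(const void *a, const void *b) {
    return *(const int *)a - *(const int *)b;
}

/* Sum only elements >= 128: on random data the branch mispredicts
 * about half the time; on sorted data it becomes almost free. */
static long long branchy_sum(const int *v) {
    long long sum = 0;
    for (int r = 0; r < REPS; r++)
        for (int i = 0; i < N; i++)
            if (v[i] >= 128)  /* data-dependent branch */
                sum += v[i];
    return sum;
}

int main(void) {
    static int v[N];
    for (int i = 0; i < N; i++) v[i] = rand() % 256;

    clock_t t0 = clock();
    long long s1 = branchy_sum(v);
    double t_unsorted = (double)(clock() - t0) / CLOCKS_PER_SEC;

    qsort(v, N, sizeof v[0], cmp_int);
    t0 = clock();
    long long s2 = branchy_sum(v);
    double t_sorted = (double)(clock() - t0) / CLOCKS_PER_SEC;

    printf("same sum twice: %lld == %lld\n", s1, s2);
    printf("unsorted: %.2f s, sorted: %.2f s\n", t_unsorted, t_sorted);
    return 0;
}
```

Same instructions, same data, but the sorted pass typically runs several times faster because the branch predictor stops guessing wrong.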
 
When Intel created NetBurst, the target was 8-10 GHz.

The "everything must be discarded' issues still apply to modern CPUs and Intel's newest CPUs can be 300+ instructions deep into a bad speculative execution branch before finding out that it did all of that work for nothing. Making the pipeline shorter does not reduce the amount of work that may get thrown out from a mispredict, it only shortens the overall instruction execution latency at least in terms of clock cycles, which simplifies scheduling a little since that means not having to keep tabs on in-flight instructiions for that many extra clock cycles.
 
Point is that dealing with the issues of deep pipelining, which I understand to be the key part of enabling very high clock speeds, requires a lot of heat-producing transistors. Not just for the pipeline stages but for the circuitry, such as branch prediction and target prediction buffers, needed to mitigate its problems.

I fully imagine it's the heat production, or more accurately getting it out of the die, that keeps today's CPUs from running at 10 GHz right now. After all, we've seen current CPUs at much higher clocks with exceptional cooling. But there's a limit to how fast the temperature gradients imposed even by LN2 allow removal of heat across the tiny surface areas at 7 nm, or Intel's 10 nm.
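For a sense of scale, the problem is power density rather than total power. With assumed figures of 250 W through a ~276 mm² die:

```c
#include <stdio.h>

/* Power density sketch: package power spread over die area.
 * Both figures are rough illustrative assumptions. */
int main(void) {
    double power_w = 250.0;  /* assumed sustained package power (W) */
    double die_mm2 = 276.0;  /* assumed die area (mm^2) */
    double w_per_cm2 = power_w / (die_mm2 / 100.0);
    printf("%.0f W over %.0f mm^2 = %.0f W/cm^2\n",
           power_w, die_mm2, w_per_cm2);
    /* For scale: an electric stove burner is on the order of 10 W/cm^2. */
    printf("roughly %.0fx the power density of a stove burner\n",
           w_per_cm2 / 10.0);
    return 0;
}
```

Around 90 W/cm² through a slab of silicon is already extreme; pushing clocks (and therefore voltage) higher makes that figure blow up fast.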
 
Not just for the pipeline stages but for the circuitry, such as branch prediction and target prediction buffers, needed to mitigate its problems.
The transistors needed for branch prediction, target buffers, etc. in a deep-pipeline CPU are still required in CPUs that can go 300+ instructions deep into speculative execution. One misprediction early in that re-order window and it can be hundreds of clock cycles' worth of work getting purged from the retirement buffer.

The penalties from branch mispredicts haven't magically gone away with shorter pipelines; they are still very much there, and more critical than ever, as CPUs have to dig deeper and deeper into speculative execution to fill their execution units with out-of-order instructions.
 
..... as CPUs have to dig deeper and deeper into speculative execution to fill their execution units with out-of-order instructions.
Since you're talking about a CPU, that almost necessarily means ever more complex circuits, with even more transistors being added to the die. Heat-producing transistors. Add as many as you want; inevitably the die gets too large, and they have to shrink it to make it economical (wafer yields) and to fit the package.
 
This seems to be a big thread.

BTW, why is Intel going back in time? Why are they reducing cores? The i9-11900K has two fewer cores than the i9-10900K.

AMD got really far with their 7 nm lithography, while Intel is stuck at 12 nm as far as I know.
 
BTW, why is Intel going back in time? Why are they reducing cores? The i9-11900K has two fewer cores than the i9-10900K.
Because it's not hurting them financially. AMD has to sell more cores than Intel to appeal to people; Intel doesn't have to do that, so they don't, to maximize profits.
AMD got really far with their 7 nm lithography, while Intel is stuck at 12 nm as far as I know.
It's still Intel 14 nm (with several "+" refinements) with a backported architecture that was supposed to release on 10 nm.
But that is completely irrelevant, since everybody can call their process anything they want.
 
BTW, why is Intel going back in time? Why are they reducing cores? The i9-11900K has two fewer cores than the i9-10900K.
Because Cypress Cove cores are a fair bit bigger than Skylake cores from all the stuff that got added or scaled up, so Intel had to drop two cores to keep the total die size manageable. Despite shedding two cores, the 11th-gen CPUs are still larger than 10th-gen.

With Intel still being stuck on 14 nm for its mainstream CPUs, having 200+ mm² mainstream chips hurts the number of working chips they can get out of each wafer pretty badly. Making the die even bigger would make matters much worse in the middle of an ongoing chip shortage. At this point, Intel has been struggling to meet CPU demand for over two years.
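A rough feel for the numbers, using the standard dies-per-wafer approximation (the die areas below are ballpark public estimates, treated here as assumptions):

```c
#include <stdio.h>
#include <math.h>

/* Standard dies-per-wafer approximation:
 *   N = pi*(d/2)^2 / A  -  pi*d / sqrt(2*A)
 * (second term accounts for partial dies at the wafer edge).
 * Die areas are ballpark estimates, treated as assumptions. */
static double dies_per_wafer(double wafer_mm, double die_mm2) {
    const double PI = 3.14159265358979;
    double r = wafer_mm / 2.0;
    return PI * r * r / die_mm2 - PI * wafer_mm / sqrt(2.0 * die_mm2);
}

int main(void) {
    double wafer = 300.0;  /* mm */
    printf("~206 mm^2 die (10-core Comet Lake-ish): %.0f per wafer\n",
           dies_per_wafer(wafer, 206.0));
    printf("~276 mm^2 die (8-core Rocket Lake-ish): %.0f per wafer\n",
           dies_per_wafer(wafer, 276.0));
    return 0;
}
```

Under those assumed areas, the bigger die costs roughly 80 candidate dies per 300 mm wafer before you even account for defects.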
 
If Intel were forced to, they could scrap the iGPU (the big chunk that covers the whole width of the left side of the die) and add another 4 cores without changing the die size.
Do they get the GPU and CPU parts from the same wafer, or do they put them together from two different wafers?
https://www.comptoir-hardware.com/actus/processeurs/43260-rocket-lake-deja-un-die-shot-.html
 
It is all one single chip.

And I suspect the core reason why Intel puts an IGP in all of its mainstream CPUs (albeit deactivated in F-SKUs) is because the vast majority of its customers (schools, businesses and institutions) do not need any more than that for the tens of millions of seats they have doing mostly 2D work.

I used an i5-11400 in my new PC so I'd be able to use the IGP should anything happen to my GTX 1050 before some semblance of sanity returns to the GPU market. (And also because the -F version wasn't available anywhere, even if I wanted to save the $20. Cheap insurance, and I won't need to toss a GPU in it when I upgrade again 5-10 years from now and repurpose it for something else.)
 
And I suspect the core reason why Intel puts an IGP in all of its mainstream CPUs (albeit deactivated in F-SKUs) is because the vast majority of its customers (schools, businesses and institutions) do not need any more than that for the tens of millions of seats they have doing mostly 2D work.
Not to mention that only a very small number of people would benefit from more cores on a daily basis, while many more people benefit from AI/QuickSync and the like.
 
They've added Intel Iris Xe graphics, which is pretty powerful and can be compared to AMD's Vega 8, but adding that might mean the iGPU taking up much more space.

Why don't we have onboard graphics on the motherboard anymore?
 
Because transferring data between the northbridge and the CPU became the bottleneck, which is part of the reason those functions moved onto Intel's CPUs with Sandy Bridge.
 
Why don't we have onboard graphics on the motherboard anymore?
Because graphics requires high-bandwidth, low-latency access to system memory. Chipset graphics died when the memory controller got integrated into CPUs, for the same reason.

If you can eliminate extra hops, you save power and make efficiency gains. That is how the 5600G and 5700G perform nearly as well as their CPU-only counterparts despite having half as much L3 cache and sharing resources with an IGP.
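If you want to see what extra hops and latency cost, a dependent-load pointer chase measures it directly: each load has to wait for the previous one, so the loop runs at cache or memory latency. A sketch (array sizes are illustrative and results vary by machine):

```c
#include <stdio.h>
#include <stdlib.h>
#include <time.h>

/* Dependent-load pointer chase: every load needs the previous
 * result, so the loop runs at roughly one cache/memory round trip
 * per step. Array sizes are illustrative; results vary by machine. */
static double ns_per_load(size_t n) {
    size_t *next = malloc(n * sizeof *next);
    if (!next) { perror("malloc"); exit(1); }
    for (size_t i = 0; i < n; i++) next[i] = i;
    /* Sattolo shuffle: one big cycle, so prefetchers can't follow it. */
    for (size_t i = n - 1; i > 0; i--) {
        size_t j = (size_t)rand() % i;
        size_t t = next[i]; next[i] = next[j]; next[j] = t;
    }
    const long steps = 20 * 1000 * 1000;
    size_t p = 0;
    clock_t t0 = clock();
    for (long s = 0; s < steps; s++) p = next[p];
    double secs = (double)(clock() - t0) / CLOCKS_PER_SEC;
    volatile size_t sink = p;  /* keep the loop from being optimized away */
    (void)sink;
    free(next);
    return secs * 1e9 / steps;
}

int main(void) {
    printf("256 KB (cache-sized): %.1f ns/load\n", ns_per_load(32 * 1024));
    printf("512 MB (DRAM-sized):  %.1f ns/load\n",
           ns_per_load(64 * 1024 * 1024));
    return 0;
}
```

The DRAM-sized case is typically an order of magnitude slower per load than the cache-resident one, which is exactly the gap that on-die memory controllers (and big L3 caches) exist to hide.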
 
Increasing the GHz might increase the performance, I guess? But by how much?
Nowhere near as much as increasing L3 cache size, based on HWUB's new cache-size scaling video, where they disabled cores on an i7-10700K and an i9-10900K to simulate an i5-10600K with 16 MB and 20 MB of L3 cache and found that in some games, the hypothetical extra-cache i5 gains 10% extra performance per 4 MB of extra L3 cache.

So Intel's CPUs would benefit far more from having 24+ MB of L3 cache as a baseline than from any clock frequency increases. The only problem is that large caches consume large amounts of die area, which isn't feasible until Intel gets its new 5-7 nm class processes fully sorted out.
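Taking that observed ~10%-per-4MB figure at face value and naively extrapolating it (real scaling certainly tapers off, and the baseline frame rate is an assumption, purely for illustration):

```c
#include <stdio.h>

/* Naive linear extrapolation of the HWUB observation (~10% per extra
 * 4 MB of L3 in cache-sensitive games). Real scaling tapers off;
 * the 100 fps baseline is an assumption, purely for illustration. */
int main(void) {
    double fps = 100.0;  /* assumed baseline for a 12 MB L3 part */
    for (int mb = 12; mb <= 24; mb += 4) {
        printf("%2d MB L3 -> %3.0f fps\n", mb, fps);
        fps *= 1.10;
    }
    return 0;
}
```

That's a hypothetical ~33% gain from 12 MB to 24 MB, far more than any plausible clock bump would deliver.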
 
Besides all the amazing tech explanations you have up there...

This is similar to driving a fast sports car to work. You could get inside your ..... (fill in your favorite brand/model; for me, a McLaren P1), press the gas pedal all the way down, and try to drive to your workplace that way. The result: you are most likely going to end up in a really horrible crash.
There are situations in which speed is not always the best solution (knowing, of course, that frequency is not the same as speed).

Another example: there's a kit of DDR4 memory from a well-known brand that many reviewers use, which runs at DDR4-3200 CL14 and beats every other memory kit of higher, "faster" frequency, like 3600, 3800, or even higher. Because operating frequency is not the only variable in the performance equation.
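The arithmetic behind that: first-word latency in nanoseconds is CL × 2000 / (MT/s), since one DDR4 clock cycle lasts 2000/(MT/s) ns (two transfers per clock). The kit timings below are just typical examples:

```c
#include <stdio.h>

/* First-word CAS latency: one DDR4 clock is 2000/MT_s ns (two
 * transfers per clock), so latency_ns = CL * 2000 / MT_s.
 * The kit timings below are just typical examples. */
static double cas_ns(int mt_s, int cl) {
    return cl * 2000.0 / mt_s;
}

int main(void) {
    printf("DDR4-3200 CL14: %.2f ns\n", cas_ns(3200, 14));
    printf("DDR4-3600 CL18: %.2f ns\n", cas_ns(3600, 18));
    printf("DDR4-3800 CL19: %.2f ns\n", cas_ns(3800, 19));
    return 0;
}
```

The 3200 CL14 kit answers in 8.75 ns while the nominally "faster" kits take 10 ns, so the slower-clocked kit actually has the lower absolute latency.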
 
What if processors are made bigger instead?
 
If you make bigger chips, you can fit fewer chips per wafer, and you also stand a higher chance of a fatal defect ruining the entire CPU. This translates into a much higher effective manufacturing cost per chip, since you get far fewer working chips per processed wafer.

Improving overall yields by going with chiplets/tiles for future architectures is the main motivation behind the various upcoming 2.5D/3D die-stacking techniques, another being the ability to mix and match different fab processes for different parts of a design.
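That trade-off is easy to model with the usual Poisson yield approximation Y = e^(−D·A); the defect density and die sizes below are hypothetical illustration values:

```c
#include <stdio.h>
#include <math.h>

/* Poisson yield model: Y = exp(-D * A), with D in defects/cm^2 and
 * A in cm^2. Defect density and die sizes are hypothetical values. */
static double yield(double defects_per_cm2, double die_mm2) {
    return exp(-defects_per_cm2 * die_mm2 / 100.0);
}

int main(void) {
    double d = 0.1;  /* assumed defects per cm^2 */
    double mono = yield(d, 400.0);  /* one monolithic 400 mm^2 die */
    double chip = yield(d, 200.0);  /* one 200 mm^2 chiplet        */
    printf("400 mm^2 monolithic die: %.0f%% good\n", mono * 100);
    printf("200 mm^2 chiplet:        %.0f%% good\n", chip * 100);
    /* Chiplets are tested before packaging, so a defect only kills
     * 200 mm^2 of silicon instead of 400 mm^2. */
    printf("good-silicon ratio (chiplet/monolithic): %.2f\n", chip / mono);
    return 0;
}
```

Under these assumed numbers, splitting one big die into two chiplets turns a 67% yield into an 82% per-die yield, and since each chiplet is tested before packaging, far less silicon is thrown away per defect.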