News TSMC's 3nm Node: No SRAM Scaling Implies More Expensive CPUs and GPUs

A lot of places only report the instruction cache size for whatever reason, but Intel and AMD lump those two together when they report L1 on their spec sheets.
Okay, I see now. Thanks for explaining.

A historical note: having separate memories for code & data is formally known as a Harvard architecture. I guess having separate L1 caches confers most of the benefits, without actually needing to extend the split all the way out to DRAM. Over the years, I think instruction caches have even been used to compute & hold some information used by the instruction decoder.
 
I guess we were always gonna run into these barriers eventually. Unfortunately the big guns will use it as a good reason to hike prices (not saying it won't cost them more, but they will add a big extra % on top and blame it on this).
 
What I find a bit vexing is that I can't find anyone explaining why SRAM should scale worse than logic...
All the stories are about the impact of that trend, not its reason!

It seems to be an issue that's been well known and around for many generations, but I find it hard to believe explanations would need to be found in the pre-Internet ages: so am I just googling it all wrong?

From what little I know, SRAM is logic, only that each cell takes six or eight transistors, so it's relatively expensive, much more so than DRAM with its single transistor, which of course has to lug a capacious capacitor around, and that seems to require surface area (which can be achieved via deep holes).
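To put rough numbers on that per-bit overhead, here is a minimal Python sketch. It counts only the bit-cell transistors and ignores decoders, sense amplifiers, tags and ECC. The 32 MiB cache size is a made-up figure for illustration, not taken from any particular chip.

```python
def cell_transistors(capacity_bytes: int, transistors_per_cell: int) -> int:
    """Transistors in the storage cells alone (no periphery, tags, or ECC)."""
    bits = capacity_bytes * 8
    return bits * transistors_per_cell

MIB = 1024 * 1024
cache_bytes = 32 * MIB  # hypothetical cache size, for illustration only

six_t = cell_transistors(cache_bytes, 6)      # classic 6T SRAM cell
eight_t = cell_transistors(cache_bytes, 8)    # 8T SRAM cell
one_t = cell_transistors(cache_bytes, 1)      # DRAM-style 1T cell (capacitor not counted)

print(f"6T SRAM: {six_t:,} transistors")
print(f"8T SRAM: {eight_t:,} transistors")
print(f"1T cell: {one_t:,} transistors")
```

Transistor count is not the same thing as area, since the DRAM capacitor and the SRAM wiring both take space, but it shows why SRAM costs so much more per bit.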

And I've eagerly followed the stories about alternate RAM designs, which allow single-transistor 'DRAM' density while even eliminating the volatility of DRAM, e.g. magnetic, phase-change, memristor or even more exotic designs, which unfortunately are much worse in terms of scaling, latency, wear etc. Those are all very different beasts, so it's easy to understand that mixing them with logic across process sizes creates challenges.

For SRAM the 6-8 transistor overhead is certainly a heavy burden, but I just plain don't understand why it should scale less than logic. In the past I remember reading how the very regular structure of SRAM allowed many tweaks that gave better-than-logic density for the SRAM cells.

Do these tricks now simply unravel? Are we simply going from a better-than-logic density to equal-to-logic density?

That might feel like a relative slow-down and disappoint those who want linear improvement from shrinks everywhere, but it would deserve being called out for what it really is.

Now, I have seen mentions of wiring limitations, which could offer an alternate explanation.

Sure, I can see that SRAM needs plenty of wires and that these wires have to go a long way along long rows and columns. And, yes, I can also see how logic might fare better there, because lots of inputs might coalesce and then only be passed along to immediate neighbors, which means fewer and shorter wires on average, if not in all cases. In this scenario I see SRAM as a worst case of logic in terms of signal lines: so is that the reason?

Or is it a combination of both?

In the chip world, we usually talk about Memory and Logic. SRAM is memory, typically laid out in a very compact and regular array structure. There's not a lot of "design" optimization to do here, since it is already very dense, all of the individual transistors are more or less the same, and the design is relatively simple. Logic is not memory; it is where all of your calculations and data manipulations are performed. The devices in these Logic macros tend to be more varied and the structure is much less regular. There's a lot of room for design optimization within the Logic macros.

I haven't worked on any advanced nodes beyond 7nm, but I would suspect two possibilities for the lack of SRAM scaling at 3nm. First, it's possible that due to the regular layout, the SRAM at 5nm was already quite compact, so the additional scaling achieved with the Fin shrinkage at 3nm (Gate sizes have been stuck somewhere around 18nm due to tunneling effects and cannot really scale much more) was relatively modest. This could be especially true due to some of the constraints on making contacts and laying out the metal layers in the Back End of Line. In this case, the shrinkage may provide additional options for optimizing the logic, leading to the better scaling in the logic regions relative to the SRAM.

The second possibility is that there are some fundamental transistor density limits due to cross-talk, heating, or leakage. It's possible that the denser SRAM structures are bumping against these limits, while the more sparse logic regions are not, at least not yet.
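Whatever the cause, the effect on die area is easy to sketch. The Python below is a back-of-the-envelope model, not a description of any real chip. The 100 mm² die, the 30% SRAM share and both scale factors are made-up inputs chosen only to show the trend, and analog/IO area is ignored.

```python
def shrink(area_mm2: float, sram_fraction: float,
           logic_scale: float, sram_scale: float) -> tuple[float, float]:
    """Return (new die area, new SRAM share) after a node shrink.

    A scale of 0.6 means that block shrinks to 60% of its old area.
    """
    sram = area_mm2 * sram_fraction
    logic = area_mm2 - sram
    new_sram = sram * sram_scale
    new_area = logic * logic_scale + new_sram
    return new_area, new_sram / new_area

# All inputs are hypothetical, for illustration only.
new_area, new_sram_share = shrink(
    area_mm2=100.0,      # assumed die size on the old node
    sram_fraction=0.30,  # assumed share of the die that is SRAM
    logic_scale=0.60,    # assumed logic shrink
    sram_scale=0.95,     # assumed near-flat SRAM shrink
)

print(f"new die area: {new_area:.1f} mm^2")
print(f"SRAM share of new die: {new_sram_share:.0%}")
```

With these inputs the die shrinks far less than the logic alone would suggest, and SRAM takes up a larger share of the new die. Since the newer node also costs more per wafer, that is the mechanism behind the "more expensive CPUs and GPUs" in the headline.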