[SOLVED] Making CPUs bigger is a problem because of rejects, so why doesn't the same logic apply to SoCs?

Solution
Eximo

Titan
Ambassador
The same rules apply to any integrated circuit design. The larger the die area, the fewer chips per wafer. Fewer chips per wafer at a given yield percentage means each failed chip costs you a bigger slice of the wafer. Everything is cost per wafer.

So if you plan for 128 chips on a wafer at a 95% yield, you write off about 7 chips.

121 good chips against a $6000 wafer cost puts each chip at roughly a $50 base cost. The 7 written-off chips are about $350 of scrapped silicon, leaving roughly $5650 of sellable value per wafer.

At 512 chips per wafer you write off 26 chips, but each chip only costs about $12.30, so the scrap is only around $320, leaving roughly $5680 per wafer, an extra $30. Now scale that up to the millions of units in a product run and a small percentage is a huge amount of money.

The smaller the chip, the higher the yield, and the more saleable product. And that is just the fab side; at each step you add value until you reach the true Bill of Materials for a product, and only then can you set the sale price for profit and keep the flexibility to respond to competition.
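For anyone who wants to check the math, here's a minimal sketch of that arithmetic in Python. It assumes a $6000 wafer and 95% yield in both cases, and values the scrapped dies at the per-good-die cost, which is roughly how the $5650 and $5680 figures above fall out (the post rounds $49.59 to $50 and $12.35 to $12.30).

```python
# A minimal sketch of the arithmetic above, assuming a $6000 wafer and
# 95% yield in both cases. Scrapped dies are valued at the per-good-die
# cost, which is how the ~$5650 and ~$5680 figures fall out.
import math

WAFER_COST = 6000.0

def wafer_value(dies_per_wafer, yield_fraction):
    scrapped = math.ceil(dies_per_wafer * (1 - yield_fraction))
    good = dies_per_wafer - scrapped
    cost_per_die = WAFER_COST / good      # base cost of each good die
    lost_value = scrapped * cost_per_die  # silicon written off
    return good, cost_per_die, WAFER_COST - lost_value

for dpw in (128, 512):
    good, cost, recovered = wafer_value(dpw, 0.95)
    print(f"{dpw:3d} dies/wafer: {good} good at ${cost:.2f} each, "
          f"~${recovered:.0f} of sellable value")
```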

That aside, SoCs are designed to be power efficient; you don't want a large SoC, as that means more internal resistance and more power draw from all the extra transistors.
 
Adding a bit...

Chips are rectangles cut out of a circle; larger chips leave a lot of waste around the edges that smaller chips would be able to use. Less waste means a more efficient manufacturing process.

But also, process defects tend to occur much more frequently around the edges of the wafer. Losing small chips around the edge still leaves a lot more good ones in the center than losing large chips does.
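A common back-of-the-envelope formula captures both effects at once: wafer area divided by die area, minus a correction term for the partial dies lost around the circular edge. Real counts vary with die aspect ratio, scribe lines, and reticle limits, so treat the numbers below as illustrative.

```python
# Back-of-the-envelope gross dies per wafer: wafer area divided by die
# area, minus a correction for the partial dies lost around the edge.
import math

def gross_dies_per_wafer(wafer_diameter_mm, die_area_mm2):
    wafer_area = math.pi * (wafer_diameter_mm / 2) ** 2
    edge_loss = math.pi * wafer_diameter_mm / math.sqrt(2 * die_area_mm2)
    return math.floor(wafer_area / die_area_mm2 - edge_loss)

# Illustrative 300 mm wafer: a die 6x smaller yields ~7x more dies,
# because proportionally less silicon is wasted at the circular edge.
print(gross_dies_per_wafer(300, 600))  # -> 90
print(gross_dies_per_wafer(300, 100))  # -> 640
```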

Speaking of 2nd/3rd-gen Ryzen CPUs, the I/O dies are made on a 14nm process, which is less critical than the 7nm process the CPU chiplets are manufactured on.
 

Eximo

Titan
Ambassador
Not sure if they are still doing multi-pattern masks; I vaguely recall that being a thing, with larger chips laid out in the center of the wafer and smaller chips around the outside edges.

Pretty sure they stopped that around the 40 or 32nm range.
 

jasonf2

Distinguished
Not to throw another kink into the SoC puzzle, but generally speaking, unlike a dedicated CPU, an SoC integrates multiple special-purpose components. These pieces, like a cellular modem, do not necessarily scale well to the latest and greatest process node. So monolithic die designs have moved towards fabric interconnects (on-die buses) and mix-and-match on-die components that allow for greater flexibility. This also allows a chip with a partial manufacturing defect to still be sold, greatly increasing the effective yield.

AMD calls it chiplets and Intel now calls it tiles, but it is all really the same philosophy: increase yield by binning chips. The perfect chips are sold as premium with everything running; the not-so-perfect are sold as another, cheaper SKU with the bad cores disabled. Binning has also been part of Intel's clocking magic as of late, but it is also why overclocking headroom has been reduced over the last couple of years.
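To put rough numbers on the salvage-SKU idea, here is a toy binomial model. The per-core defect-free probability of 0.97 is made up for illustration and is not a real foundry or AMD/Intel figure.

```python
# Toy binning model: each core is independently defect-free with some
# probability, and partially defective dies can be sold as a lower SKU.
from math import comb

def p_at_least(cores, good_needed, p_core_good):
    # Binomial tail: chance that at least `good_needed` cores are good.
    return sum(
        comb(cores, k) * p_core_good**k * (1 - p_core_good)**(cores - k)
        for k in range(good_needed, cores + 1)
    )

P_CORE = 0.97  # made-up per-core probability, not a real foundry number
print(f"all 8 cores good (premium SKU):   {p_at_least(8, 8, P_CORE):.1%}")
print(f"6+ of 8 cores good (salvage SKU): {p_at_least(8, 6, P_CORE):.1%}")
```

Under these made-up assumptions, only about 78% of dies qualify as the premium 8-core part, but allowing a 6-core salvage SKU lifts the sellable fraction to about 99.9%.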
 

Eximo

Titan
Ambassador
Yep. Unless something changes, I might not be buying Intel K SKUs anymore, if ever. I only got the 10900F because I wanted to re-use my LGA115x CPU block one last time; that made the performance per dollar about as optimal as possible versus switching to AMD, the exception being a 10850K had I lived a little closer to a Microcenter. Looks like Alder Lake is another hot box. Zen 4 will likely wipe the floor with it as soon as AMD invests a little in Windows 11 scheduling (though I have no issues with Windows 10 at the moment).
 
Something of note: the choice between a monolithic design and a multi-chip module (MCM) approach also depends on what you're doing. For instance, while AMD is famous for pioneering the MCM approach over the last few generations of CPUs, they're still making APUs monolithic. I haven't seen any official reason why, but my guess is it's a combination of being able to make a smaller overall package, needing less circuitry to feed a monolithic design, and possibly signaling issues if the GPU isn't integrated.

The biggest bottleneck with the MCM approach, however, is communication. AMD may be approaching a limit with their current designs, which is why AMD doesn't appear to be going above 8 cores per chiplet/CCX: https://www.anandtech.com/show/16930/does-an-amd-chiplet-have-a-core-count-limit
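On the yield side of the monolithic-versus-MCM trade-off, a toy Poisson defect model (yield = e^(-area * defect density)) shows why splitting silicon helps; the area and defect-density numbers below are illustrative, not real foundry data.

```python
# Toy Poisson yield model: a die is good only if it catches zero
# defects, so yield drops exponentially with die area.
from math import exp

D0 = 0.002  # assumed defects per mm^2, illustrative only

def poisson_yield(area_mm2):
    return exp(-area_mm2 * D0)

print(f"one 160 mm^2 monolithic die: {poisson_yield(160):.1%}")  # ~72.6%
print(f"each 80 mm^2 chiplet:        {poisson_yield(80):.1%}")   # ~85.2%
```

A package still needs two good chiplets, but since chiplets are tested before packaging, a bad one scraps only 80 mm^2 of silicon rather than the whole 160 mm^2 die.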
 
Wouldn't latency be much better with the GPU local to the CPU? Certainly even a multi-chip module would be better than a discrete GPU for that, but being on the same die would let them eke all the possible performance out of minimal hardware.
I think that's part of it, but I also think it's mostly to keep the communication backbone simple, since the target for APUs is portable/low-power devices. Backbones take up a non-trivial amount of power, and whatever milliwatts you can pinch here and there add up.