News AMD Dishes on Zen 3 Architecture, Milan and Genoa Roadmap

Gillerer

Distinguished
Sep 23, 2013
360
81
18,940
I disagree about the sweeping statement of "eliminating one layer of latency within the compute die".

The only situation where that happens is if you only have one compute die (CCD), since that will then no longer have multiple CCXes, so the latency layer of inter-CCX communication is removed.

If you have multiple CCDs, communication between cores on different CCDs will still make a hop via the I/O die, just as communication between two CCXes does on Zen 2 (even if they're physically located on the same CCD). Even though the higher latency is incurred less frequently (since each core now has 7 neighboring cores instead of 3), it is by no means "eliminated".
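
To put a number on "less frequently", here is a rough sketch. It assumes a hypothetical 16-core package and that a core is equally likely to talk to any other core, which real workloads won't be; the function name is made up for illustration.

```python
def local_peer_fraction(total_cores: int, cores_per_ccx: int) -> float:
    """Fraction of a core's peers that sit in the same CCX (no I/O die hop)."""
    if total_cores < 2 or not 1 <= cores_per_ccx <= total_cores:
        raise ValueError("need at least 2 cores and a CCX size within the package")
    local_peers = cores_per_ccx - 1   # e.g. 3 on a 4-core CCX, 7 on an 8-core CCX
    all_peers = total_cores - 1       # every other core in the package
    return local_peers / all_peers


# Assumption: a 16-core package, uniform choice of communication partner.
four_core_ccx = local_peer_fraction(16, 4)   # 3 of 15 peers are local
eight_core_ccx = local_peer_fraction(16, 8)  # 7 of 15 peers are local
print(f"4-core CCX: {four_core_ccx:.0%} local, 8-core CCX: {eight_core_ccx:.0%} local")
```

Under those assumptions, more than half of the core pairs still pay the higher latency with the bigger CCX, so the hop gets rarer but doesn't go away.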

Also, the fact that you double the L3 cache accessible to each core - and are therefore less likely to have to go to system memory - doesn't remove any latency layers; it just alters the chances of hitting them.
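
A minimal sketch of that point, using the usual weighted-average view of access time. The hit rates and the two latency figures are placeholders I picked for illustration, not measurements of any Zen part.

```python
def expected_latency_ns(l3_hit_rate: float, l3_ns: float, dram_ns: float) -> float:
    """Average access latency when a miss in L3 goes to system memory."""
    if not 0.0 <= l3_hit_rate <= 1.0:
        raise ValueError("hit rate must be between 0 and 1")
    return l3_hit_rate * l3_ns + (1.0 - l3_hit_rate) * dram_ns


# Placeholder values (assumptions, not measured numbers).
L3_NS = 10.0
DRAM_NS = 80.0

smaller_l3 = expected_latency_ns(0.50, L3_NS, DRAM_NS)  # hypothetical hit rate
larger_l3 = expected_latency_ns(0.70, L3_NS, DRAM_NS)   # hypothetical higher hit rate
print(f"smaller L3: {smaller_l3:.1f} ns average, larger L3: {larger_l3:.1f} ns average")
print(f"a miss still costs {DRAM_NS:.1f} ns either way")
```

The average drops as the hit rate rises, but the miss penalty itself is unchanged, which is all I mean by the layer still being there.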


The place where I would anticipate the most advancement is the combining of the Infinity Fabric links of two CCXes into a single link. This could give a single core double the memory and I/O throughput (to the I/O die), as long as there is little contention for the resources with other cores.

EDIT: I stand corrected. It seems I had misunderstood this - getting a couple of terms mixed - when I first read about Ryzen 3000 during launch. The text I remembered was referring to intra-package, not intra-die communication (lack of direct links between CCDs). Shouldn't read technical material quickly...
 

PaulAlcorn

Managing Editor: News and Emerging Technology
Editor
Feb 24, 2015
858
315
19,360
There is an Infinity Fabric connection between the two CCXes within a single die/chiplet, much like you see in the graphic in the link below. (The infinity symbol between the two CCXes denotes that connection.) Two CCXes on a single die communicate across that fabric via this intra-die connection; essentially, you have two quad-core CPUs talking to each other. The problem comes in when you have to hop to another die. https://www.tomshardware.com/reviews/amd-ccx-definition-cpu-core-explained,6338.html
 

hannibal

Distinguished
Big thanks! That seems much more energy- & latency- efficient.

One reason I want an APU is to have only one CCX. I'm trying to get a Ryzen 5 Pro 3400G.

You mean Ryzen 4400G... because the 3400G is old Zen+ technology, aka a monolithic architecture. Or do you mean that you want to get both the CPU and GPU on one die? Then the 3400G is the way to go! Zen 2 APUs come (maybe) next year.
 

bit_user

Polypheme
Ambassador
You mean Ryzen 4400G... because the 3400G is old Zen+ technology, aka a monolithic architecture.
Right now, I want a Ryzen 5 Pro 3400G. They were just announced. Pro version, because it's for a small file server and I want ECC memory, which the non-Pro APUs don't support.

The reasons I want an APU are:
  • cost
  • I only need 4 cores
  • 1 CCX = better & more-efficient inter-core communication
  • avoid the need for a separate GPU - the machine mostly runs headless.

Zen 2 APUs come (maybe) next year.
As it's replacing a Phenom II, Zen+ will be a fine upgrade.
 

djayjp

Distinguished
Feb 24, 2008
23
4
18,515
"The largely unchanged specifications, at least in key areas, implies Milan is merely a "Tock"-equivalent, or just a move to the second-gen of the 7nm node (7nm+)."

You mean "tick": Intel's tock involved a significant microarchitectural change.
 

bit_user

Polypheme
Ambassador
"The largely unchanged specifications, at least in key areas, implies Milan is merely a "Tock"-equivalent, or just a move to the second-gen of the 7nm node (7nm+)."
It'll be interesting to see what EUV does, for TSMC's 7 nm.

You mean "tick": Intel's tock involved a significant microarchitectural change.
Really? I kinda thought the whole tick-tock thing started with Sandy Bridge. Ivy Bridge was mostly just a node shrink. So, then ticks should be the bigger architectural changes.

Also, people like to joke that Intel's scheme, since Skylake, has been tick-tock-tock-tock-tock...
 
I really just want a 6 core Ryzen APU.

I think if AMD took a Ryzen 5 2600 and strapped on the iGPU from the 3400G, they would have a great seller, even if they charged around $175 for it.

I get it, it's not this simple, but I still want it.
 
Oct 6, 2019
1
0
10
There is an Infinity Fabric connection between the two CCXes within a single die/chiplet, much like you see in the graphic in the link below.

There is a connection between the CCXes, but not on-die. The connection is through the IO die. The reason is that the hardware block that used to connect the two CCXes, called the "SDF plane", was moved to the IO die. The SDF plane is part of the Infinity Fabric, hence the infinity symbol in the graphic you posted.

AMD improved the connection between dies. The previous implementation had a latency of around 150ns. The new one is around 65-70ns (and that accounts for the whole round trip to the IO die and back). The connection between cores in the same CCX was also improved: it was originally around 40ns and is now around 25-30ns. It's thanks to these improvements that you don't notice any performance regression from having to go through the IO die for inter-CCX communication.

You can see it all in this analysis of the 3900X:

aetZIlx.jpg

As you can see, there are only two latencies. The green one shows the communication between cores in the same CCX, and the yellow one shows the communication between cores in different CCXes, which always requires a round trip through the IO die.
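
For a feel of what those two numbers mean on average, here is a back-of-the-envelope sketch. It takes the midpoints of the ranges quoted above, treats the 3900X as 4 CCXes with 3 active cores each, and assumes every other core is an equally likely partner; the function name is made up for illustration.

```python
def mean_core_to_core_ns(ccx_count: int, cores_per_ccx: int,
                         intra_ns: float, inter_ns: float) -> float:
    """Average latency from one core to every other core, weighting peers equally."""
    total_cores = ccx_count * cores_per_ccx
    if total_cores < 2:
        raise ValueError("need at least two cores")
    local_peers = cores_per_ccx - 1             # same CCX
    remote_peers = total_cores - cores_per_ccx  # any other CCX, via the IO die
    return (local_peers * intra_ns + remote_peers * inter_ns) / (total_cores - 1)


# Midpoints of the ranges quoted in this post (25-30ns and 65-70ns).
INTRA_NS = (25 + 30) / 2
INTER_NS = (65 + 70) / 2

# Assumption: 12 cores arranged as 4 CCXes x 3 cores, uniform peer choice.
avg = mean_core_to_core_ns(4, 3, INTRA_NS, INTER_NS)
print(f"average core-to-core latency: {avg:.1f} ns")
```

With only 2 of 11 peers in the same CCX, the average sits much closer to the inter-CCX figure, so the improvement to the IO die round trip is the one that matters most here.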
 
I wonder if the good ole X370 boards will end up supporting these CPUs? Would be lovely to plop in one of these next year to replace my 1800X.

I'm hoping TSMC can get higher, more consistent clocks with 7nm+, as right now usually only one core can hit the rated boost frequency in any given CPU.
 