Discussion Clarifying Ryzen 3000 memory overclocking

MasterMadBones

AMD's marketing on Ryzen 3000 DRAM overclocking has been confusing at best. In some instances it says the DRAM clock is decoupled from the Infinity Fabric, and in others that the IF clock (seemingly) runs at half the speed beyond 3733 MT/s. While both are partially true, neither fully explains what is going on. To understand what is happening, we need to look at how the Infinity Fabric has evolved since Zen.

Because DDR4 is a double data rate technology, where two transfers occur every cycle, we often use its effective clock rate, which is twice the actual clock rate. In order to avoid confusion, I will always make a clear distinction between MHz and MT/s (MegaTransfers per second) for DRAM. MT/s is the same number as the memory's effective clock rate.
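To make the MHz versus MT/s distinction concrete, here is a minimal Python sketch of the conversion. The function names are my own for illustration and do not come from any tool or spec.

```python
# DDR performs two transfers per clock cycle, so the data rate in MT/s
# is twice the real clock in MHz. Function names are illustrative only.

def mts_from_mhz(clock_mhz: float) -> float:
    """Real DRAM clock (MHz) -> data rate (MT/s)."""
    return clock_mhz * 2


def mhz_from_mts(rate_mts: float) -> float:
    """Data rate (MT/s) -> real DRAM clock (MHz)."""
    return rate_mts / 2


print(mts_from_mhz(1800))  # a kit running at 1800 MHz is sold as DDR4-3600
print(mhz_from_mts(3200))  # DDR4-3200 runs at a real clock of 1600 MHz
```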

To start off, Infinity Fabric is AMD's marketing term for the interconnect protocol that succeeds HyperTransport, which AMD has used in various iterations since the Athlon 64 days. In fact, IF is a superset of HT (not to be confused with Intel's Hyper-Threading, which is abbreviated as HTT) and covers a number of different controllers and connections. Its clock domain is tied to that of the memory, and the bus width varies depending on the implementation.

In Zen, the Infinity Fabric uses a 256-bit bus internally, and its clock is the same as the memory clock. DDR4's effective bandwidth is calculated as bus width * clock speed * # of channels * 2, where the bus width of a channel is always 64 bits. In a dual-channel implementation that works out to 64 * 2 * 2 = 256 bits * clock speed, which, perhaps surprisingly, is the same as the IF's bandwidth.

How does AMD enable quad-channel or even 8-channel configurations in Threadripper and EPYC? Since each die has its own memory controller, each die can drive two channels of DDR4. However, when one die has to fetch data from RAM for use on another die, it has to squeeze all of that data through a 128-bit IFOP (Infinity Fabric On-Package) link. This is part of the reason why accessing "distant" memory has a rather dramatic effect on performance. The clearest demonstration is the 2970WX and 2990WX (which are technically Zen+), where two of the four memory controllers are disabled. The two "compute-only" dies only have access to half the memory bandwidth that the other two dies have.
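The bandwidth comparison above can be checked with a few lines of Python. This is only a sketch using the bus widths quoted in this post (256-bit internal IF, 128-bit IFOP, 64-bit DDR4 channels). The function names and the 1600 MHz example clock are my own choices, and the results are theoretical peaks, not measured throughput.

```python
# Peak-bandwidth arithmetic for first-gen Zen, using the widths quoted in the post.
# All names are illustrative; results are theoretical peaks in GB/s.

DDR4_CHANNEL_BITS = 64   # width of one DDR4 channel
ZEN_IF_BITS = 256        # internal Infinity Fabric width on Zen (per the post)
ZEN_IFOP_BITS = 128      # die-to-die IFOP link width on Zen (per the post)


def dram_gb_per_s(clock_mhz: float, channels: int) -> float:
    """bus width * clock * channels * 2 (double data rate), in GB/s."""
    bits_per_us = DDR4_CHANNEL_BITS * clock_mhz * channels * 2
    return bits_per_us / 8 / 1000


def link_gb_per_s(clock_mhz: float, bus_bits: int) -> float:
    """One full-width transfer per clock, in GB/s."""
    return bus_bits * clock_mhz / 8 / 1000


clock = 1600  # MHz, i.e. DDR4-3200 with the IF running at the same clock
print("dual-channel DRAM:", dram_gb_per_s(clock, channels=2))
print("internal IF      :", link_gb_per_s(clock, ZEN_IF_BITS))
print("IFOP link        :", link_gb_per_s(clock, ZEN_IFOP_BITS))
```

With these numbers the internal IF matches dual-channel DRAM exactly, while the IFOP link offers half of that, which is the squeeze described for remote memory access.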

AMD made some small tweaks to the Infinity Fabric for Zen+, increasing its maximum clock rates and adjusting its timings for better latency. The bus width is unchanged across the package. Some of the timing tweaks were already implemented in Raven Ridge (1st gen Ryzen APUs) and Whitehaven (1st gen Threadripper).

Zen 2 brings the first major revision to the Infinity Fabric. The internal bus width is increased to 512 bits, as is the width of the IFOP link, which in this iteration connects the cIOD (client IO Die) to the CCDs (Core Complex Dies). One of the primary targets of widening the bus, especially for the IFOP link, is to compensate for the increased absolute latency associated with having multiple dies communicating with each other. However, it also opens some interesting avenues for higher memory clocks, as now the effective bandwidth of the IF is double that of a dual-channel DDR4 configuration. It is this widening of the bus that allowed AMD to introduce their "2:1 mode" for memory clocks beyond 3733 MT/s, although it does not work exactly how you may expect.

Although the IF and DRAM clocks are still tied together at speeds up to 1866 MHz, they are automatically decoupled when the DRAM clock reaches 1933 MHz (3866 MT/s). In this situation, the IF clock reverts to 1800 MHz. A dual-channel configuration delivers four 64-bit transfers per DRAM clock cycle (two channels, two transfers each), and these have to be accepted even when a DRAM cycle doesn't line up with an IF cycle. To allow this, the data is effectively placed in a queue which assigns a 64-bit section of the IF link to each transfer. When the next IF cycle arrives, the data may be moved as a single 512-bit transfer. In AMD marketing terms, the memory controller is able to move the data from 2 DRAM cycles with 1 IF cycle. The operation of queueing and reshuffling the data takes a significant number of cycles and also eats up some inter-die bandwidth. Otherwise, DRAM overclocking on Ryzen from 3866 MT/s onward is very similar to overclocking on Intel systems.
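As a rough illustration of the behaviour described above, the following Python sketch models the clock switch and the packing of 64-bit transfers into 512-bit IF words. It is a toy model built only from the numbers in this post. The names, the threshold handling and the queue representation are my own assumptions, not AMD's actual implementation.

```python
from typing import List, Tuple

# Toy model of the coupled/decoupled behaviour described in the post.
# Thresholds and widths come from the post; everything else is hypothetical.

DECOUPLE_AT_MHZ = 1933          # DRAM clock at which the clocks decouple
DECOUPLED_IF_CLOCK_MHZ = 1800   # IF clock after decoupling
TRANSFER_BITS = 64              # one DDR4 channel transfer
IF_BUS_BITS = 512               # Zen 2 IF width
TRANSFERS_PER_DRAM_CYCLE = 4    # 2 channels * 2 transfers per cycle


def if_clock_mhz(dram_clock_mhz: float) -> Tuple[float, str]:
    """Return the IF clock and mode for a given real DRAM clock."""
    if dram_clock_mhz < DECOUPLE_AT_MHZ:
        return dram_clock_mhz, "coupled"
    return DECOUPLED_IF_CLOCK_MHZ, "decoupled"


def pack_into_if_words(transfers: List[int]) -> List[List[int]]:
    """Group queued 64-bit transfers into 512-bit IF words (8 slots each)."""
    slots = IF_BUS_BITS // TRANSFER_BITS
    return [transfers[i:i + slots] for i in range(0, len(transfers), slots)]


dram_cycles = 4
queued = list(range(dram_cycles * TRANSFERS_PER_DRAM_CYCLE))  # 16 transfer IDs
words = pack_into_if_words(queued)

print(if_clock_mhz(1800))
print(if_clock_mhz(1933))
print(len(words), "IF words carry the data of", dram_cycles, "DRAM cycles")
```

Each IF word holds eight 64-bit slots, which is exactly the output of two DRAM cycles, so the "2 DRAM cycles with 1 IF cycle" description falls straight out of the widths.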

All of this theoretically allows for DRAM data rates up to 7200 MT/s without altering the base clock or significantly overclocking the memory controller. For now, though, even X570 motherboards do not support these data rates (please prove me wrong). Additionally, if you wish to clock the IF higher than 1866 MHz for better inter-die latency, this should also be possible.
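One way to arrive at the 7200 MT/s figure is to ask at what data rate dual-channel DDR4 would saturate a 512-bit IF running at 1800 MHz. The Python sketch below does that arithmetic. The function name and defaults are my own, and it ignores the queueing overhead mentioned earlier, so treat it as a ceiling rather than a prediction.

```python
# Data rate at which DRAM bandwidth equals IF bandwidth.
# Names and defaults are illustrative; queueing overhead is ignored.

def max_dram_rate_mts(if_clock_mhz: float,
                      if_bus_bits: int = 512,
                      channels: int = 2,
                      channel_bits: int = 64) -> float:
    """IF bits per microsecond divided by DRAM bits moved per transfer."""
    dram_bits_per_transfer = channels * channel_bits
    return if_bus_bits * if_clock_mhz / dram_bits_per_transfer


print(max_dram_rate_mts(1800))                   # Zen 2: 512-bit IF at 1800 MHz
print(max_dram_rate_mts(1800, if_bus_bits=256))  # Zen-style 256-bit IF, same clock
```

With the 256-bit bus the ceiling is simply twice the IF clock, which is the old 1:1 relationship. Doubling the width doubles the ceiling.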

This post may be slightly redundant, but I felt that it was necessary to bring all of the information available across the many different communities we have together in one place for the sake of completeness. I hope it helps you understand how 3rd gen Ryzen handles the Infinity Fabric and DRAM clocks better. If you have any further questions, feel free to ask.

The information in this post is deduced. I consulted several sources on the workings of the Infinity Fabric and DDR4 DRAM, looked at findings from the community regarding memory overclocking on AMD 3rd gen Ryzen CPUs, and came to the above conclusion(s). My knowledge of CPU design is the result of my personal curiosity and experience with developing hypothetical designs. I do not have a degree in computer science or similar studies. I have no relationship with AMD or its partners and as such, I do not possess any internal information on the Zen 2 design. If you wish to know my sources, you can request them by subject as I do not currently have the links.
 