leopolder:
I am coming to the conclusion that 5 GT/s for both the processors I mentioned in my question may be wrong.
In view of what has been said in this thread, and looking at a possible formula for GT/s (http://en.wikipedia.org/wiki/Front-side_bus - see Transfer rates), that value should be different in the two cases.
In fact, the data path width and the number of data transfers per cycle should be the same, the clock frequency (cycles per second) being the only difference.
That also agrees with the fact that the given GT/s is an upper limit: everything runs off the base clock, GT/s is just a sampling rate based on that clock, and the clock doesn't always run at maximum speed, to save power when it's not needed.
FSB and QPI are two different interfaces.
The term "transfer" is used by us engineers to clearly differentiate the rate of data transfer across a bus from the periodic clock signal used to synchronize the various components attached to the bus and the amount of data transferred across the bus on each transfer. Those are three separate ideas that deserve three separate definitions. Unfortunately, marketing departments like to make our jobs a living hell.
In the case of the FSB, the bus transfers data four times per clock cycle (with each transfer 90 degrees out of phase). This is called Quad Data Rate, or quad-pumped. So an FSB with a 200 MHz reference clock synchronizing the transfers between the northbridge and the attached CPUs (there can be more than one; a common example is the Core 2 Quad, which is really just two Core 2 Duo CPUs glued together) transfers data 800,000,000 times per second, for a transfer rate of 800 MT/s. Most FSB implementations are 64 bits wide, for a total transfer size of 8 bytes per transfer.
8 bytes (64 bits) per transfer * 4 transfers per cycle * 200 million cycles per second = 6.4 billion bytes per second in each direction
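As a minimal sketch, here is the same arithmetic in Python, using only the example figures above (200 MHz quad-pumped, 64-bit wide); other FSB implementations use different clocks:

    # FSB bandwidth from the example figures above (not universal constants).
    bytes_per_transfer = 8           # 64-bit wide data bus
    transfers_per_cycle = 4          # quad-pumped
    clock_hz = 200_000_000           # 200 MHz reference clock

    transfer_rate = transfers_per_cycle * clock_hz   # 800,000,000 transfers/s = 800 MT/s
    bandwidth = bytes_per_transfer * transfer_rate   # 6,400,000,000 B/s = 6.4 GB/s
    print(f"{transfer_rate / 1e6:.0f} MT/s, {bandwidth / 1e9:.1f} GB/s")

Running it prints "800 MT/s, 6.4 GB/s", matching the line above.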
QPI has a very similar formula:
20 bits per transfer * 64 bits of payload per 80 bits of transmission * 2 transfers per cycle * 3.2 billion cycles per second = 102.4 billion payload bits per second = 12.8 billion bytes of payload data per second in each direction
The takeaway here is that the number of transfers per time interval is independent of the amount of data transferred across the link per transfer. QPI has built-in fault protection: it nominally operates at a width of 20 bits in each direction, but if part of the link fails it can fall back to 10 bits, or even 5 bits. Were this to happen, the bandwidth in each direction would drop from 12.8 billion bytes per second (in the example above) to 6.4 billion bytes per second, then to 3.2 billion bytes per second, but the transfer rate would remain 6.4 GT/s, because the transfer rate is tied to the reference clock and the reference clock doesn't change in this example.
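A rough Python sketch of that relationship, again using only the example numbers above (3.2 GHz clock, double-pumped, 64-of-80-bit framing, and the 20/10/5-bit fallback widths just mentioned):

    # QPI payload bandwidth at each link width; the transfer rate depends only
    # on the reference clock, not on the link width.
    clock_hz = 3_200_000_000         # 3.2 GHz reference clock
    transfers_per_cycle = 2          # two transfers per cycle
    payload_fraction = 64 / 80       # 64 bits of payload per 80 bits transmitted

    transfer_rate = clock_hz * transfers_per_cycle   # 6.4 GT/s regardless of width
    for width_bits in (20, 10, 5):                   # nominal and fallback widths
        payload_bytes = width_bits * payload_fraction * transfer_rate / 8
        print(f"{width_bits:2d}-bit link: {transfer_rate / 1e9} GT/s, "
              f"{payload_bytes / 1e9:.1f} GB/s payload per direction")

The output shows 12.8, 6.4, and 3.2 GB/s while the GT/s column stays at 6.4 throughout.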
PCIe is very different: it uses both a fixed-frequency reference clock (100 MHz) and an embedded data clock that changes based on the link speed. In this case, the transfer rate changes while the reference clock stays fixed. PCIe is also capable of renegotiating the link width; devices can operate at 1x, 4x, 8x, or 16x link width. In fact, it's possible to cut down the 16x connector on a GPU and fit it into a 4x slot if desired (please don't do this).
Thanks to the embedded data clock and serial link design, PCIe 2.1 transfers data up to 50 times per reference cycle.
100 million cycles per second * 50 transfers per cycle = 5 GT/s when operating at the PCIe 2.1 link speed
5 GT/s * 1 bit per transfer per lane * 8 bits of payload per 10 bits of transmission (8b/10b encoding) = 4 gigabits of payload per second per lane (500 MB/s) in each direction
500 MB per second per lane * 16 lanes = 8 GB per second in each direction
If need be, PCIe 2.1 can reduce its link speed to PCIe 1.1 speeds and drop from 50 transfers per reference cycle to 25, still with a 100 MHz reference clock.
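Putting the PCIe numbers together in the same style (the 25 and 50 transfers per reference cycle correspond to the PCIe 1.1 and 2.1 link speeds above, and 8b/10b encoding applies to both):

    # PCIe payload bandwidth from the fixed 100 MHz reference clock, per the
    # figures above; only the transfers per cycle change between generations.
    ref_clock_hz = 100_000_000       # fixed 100 MHz reference clock
    encoding_efficiency = 8 / 10     # 8b/10b: 8 payload bits per 10 transmitted bits
    lanes = 16

    for gen, transfers_per_cycle in (("PCIe 1.1", 25), ("PCIe 2.1", 50)):
        transfer_rate = ref_clock_hz * transfers_per_cycle                # 2.5 or 5 GT/s
        payload_bytes_per_lane = transfer_rate * encoding_efficiency / 8  # bits -> bytes
        total = payload_bytes_per_lane * lanes
        print(f"{gen}: {transfer_rate / 1e9} GT/s, "
              f"{payload_bytes_per_lane / 1e6:.0f} MB/s per lane, "
              f"{total / 1e9:.0f} GB/s per direction with {lanes} lanes")

This prints 250 MB/s per lane (4 GB/s at 16x) for PCIe 1.1 and 500 MB/s per lane (8 GB/s at 16x) for PCIe 2.1, with the reference clock never changing.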
I hope that this helped a little bit and didn't confuse you too much.