Ray,
As I had to get some sleep last night, I couldn't join in the "discussion." Sorry for the late post.
What is a multi-rate per clock fad?
Using multiple data transfers per clock cycle, e.g. DDR, QDR, etc.
Thus far, the only processor with an FSB not equal to the external clock (that is set on the motherboard) is the Pentium 4, which has a quad-pumped FSB.
As you already conceded, and as I stated from the beginning, this is incorrect; the original K7 Athlon was the first x86 processor to multi-pump the FSB, with a DDR 100MHz bus (200MHz effective).
While many people use the term "DDR" to indicate DDR SDRAM, DDR really has nothing to do with memory. It simply means 'double-data-rate' and was first used to refer to Athlon processor FSBs.
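To put numbers on the "pumping" idea, here is a quick sketch of the arithmetic. The function name is mine, and the 100MHz base clock for the quad-pumped P4 is my own figure for the original P4 bus, not something stated above.

```python
def effective_rate_mhz(base_clock_mhz: float, transfers_per_clock: int) -> float:
    """Effective transfer rate of a bus that moves data several times per clock."""
    return base_clock_mhz * transfers_per_clock


# K7 Athlon: 100MHz external clock, double-pumped (DDR).
athlon_fsb = effective_rate_mhz(100, 2)

# Pentium 4: assumed 100MHz external clock, quad-pumped (QDR).
p4_fsb = effective_rate_mhz(100, 4)

print(f"Athlon FSB: {athlon_fsb:.0f}MHz effective")
print(f"P4 FSB:     {p4_fsb:.0f}MHz effective")
```

Same motherboard clock in both cases; only the number of transfers per cycle differs.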
Usually this access is simply to the L1 cache. Occasionally it is to the L2 cache. Rarely, main memory must be accessed. We should note that the L1 cache on the Pentium 4 includes new technology. The instruction opcodes remain in a type of compiled state. This makes retrieving new instructions after a branch misprediction much quicker. You can in effect skip a few stages of the pipeline this way.
While this was Intel's intention, practice proved different. The P4 is extremely memory-bandwidth hungry, due primarily to branch mispredictions and the corresponding recomputes.
I expect all processors to use such technology in the future.
As I've stated before, AMD will probably include similar technology in upcoming processors. This kind of copying is common in the industry. It's just that AMD's will probably work better, as they've had the benefit of Intel's mistakes....
While it does have a shorter pipeline, it also has less cache now. The Northwood Pentium 4 includes 512KB of L2 cache, surpassing that of the Athlon.
Actually, Athlon's L1 cache is much deeper than P4's new-style L1 cache. In practice it has also proved more functional. While NW does have double Athlon's L2 cache, Intel had to add this to correct the performance deficit of the original P4. It still doesn't completely overcome the memory bandwidth intensity of the chip. It would need 2-10MB or more to scale that back to near Athlon levels, making the chip cost-prohibitive and dumping the yield. This is why the high-bandwidth FSB is so important to the P4 and why dual-channel RDRAM is required for highest performance.
This is true of any processor. Increase the clockspeed on a processor by X% and you will increase the amount of memory bandwidth it desires by X%. Note that the percentages do not change.
This is also true of any processor. It is why the Pentium 4 platform does not perform as well with DDR-SDRAM as it does with RDRAM. An RDRAM memory subsystem will deliver up to 100% more memory bandwidth to the processor.
Actually, because of its internal efficiencies, Athlon provides equivalent performance increases to P4 with smaller increases in clock speed. Increase your Athlon clock by 66MHz and you will have to increase your P4 by 100-125MHz to get the same performance increase.
Since the P4 is less efficient with memory bandwidth than the Athlon, larger clock and FSB increases will require larger memory bandwidth increases. Following this trend, memory bandwidth strangulation will come much sooner with the P4 and RDRAM than with the Athlon or with the P4 mated to dual-channel DDR. As Intel scales the P4 to a 533MHz and then 600MHz FSB to achieve performance from the 3-5GHz chips, even dual-channel PC1066 (4266MBps) will fail to feed the hunger. Dual-channel DDR333 (5333MBps) or DDR400 (6400MBps) will be required. These are current shipping products. I have my doubts that RDRAM will scale this high with profitable yields; it would have to be running at 1600MHz! The memory manufacturers can't even get the 1066MHz stuff off the line yet. Maybe Intel will start to manufacture RDRAM?
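For anyone who wants to check the bandwidth figures, here is a rough sketch of the math. The channel widths are my assumptions, not stated above: 16 bits per RDRAM channel and 64 bits per DDR SDRAM channel. The function names are made up for illustration, and the results are theoretical peaks, not measured throughput.

```python
def peak_bandwidth_mb_s(transfers_mhz: float, bus_width_bits: int, channels: int = 1) -> float:
    """Theoretical peak bandwidth in MB/s: rate x bytes per transfer x channels."""
    return transfers_mhz * (bus_width_bits / 8) * channels


def required_rate_mhz(target_mb_s: float, bus_width_bits: int, channels: int = 1) -> float:
    """Transfer rate needed to hit a target peak bandwidth."""
    return target_mb_s / ((bus_width_bits / 8) * channels)


RDRAM_BITS = 16  # assumed width of one RDRAM channel
DDR_BITS = 64    # assumed width of one DDR SDRAM channel

# "PC1066" and "DDR333" are rounded names for 1066.67 and 333.33 MT/s.
pc1066_dual = peak_bandwidth_mb_s(3200 / 3, RDRAM_BITS, channels=2)
ddr333_dual = peak_bandwidth_mb_s(1000 / 3, DDR_BITS, channels=2)
ddr400_dual = peak_bandwidth_mb_s(400, DDR_BITS, channels=2)

# RDRAM rate needed for dual-channel RDRAM to match dual-channel DDR400.
rdram_needed = required_rate_mhz(ddr400_dual, RDRAM_BITS, channels=2)

print(f"Dual PC1066: {int(pc1066_dual)}MBps")
print(f"Dual DDR333: {int(ddr333_dual)}MBps")
print(f"Dual DDR400: {int(ddr400_dual)}MBps")
print(f"RDRAM rate to match dual DDR400: {rdram_needed:.0f}MHz")
```

The narrow RDRAM channel is why it has to clock so much higher than DDR to deliver the same peak number.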
I thought a thought, but the thought I thought wasn't the thought I thought I had thought.