Archived from groups: alt.comp.hardware.overclocking
You seem to have a misconception about how DDR/QDR busses operate, or
possibly busses in general. A "bus" in the computer world (ie: the world in
which DDR and QDR have a meaning) includes both the data lines, and the
control/address lines (different from a bus in the electrical world, which
is usually just a collection of similarly functioning lines). For example,
the EV6 bus scheme of the 21264/K7 has 72 bidirectional data lines (64 data
bits + 8 ECC bits), and 26 unidirectional control/address lines, plus
several other miscellaneous lines. The data lines "operate" at <2*x> or
<4*x> MHz, but the control and address lines only "operate" at <x> MHz. The "bus"
includes both data AND control/address lines. EVERY request across the bus
requires the use of the control lines, so they are no less important than
the data lines.
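To put rough numbers on that (a back-of-the-envelope Python sketch of my
own; I'm assuming x = 100 MHz, the original K7 clock, and ignoring command
pipelining):

# EV6-style numbers: 64 data bits, double-pumped data, new requests gated
# by the <x> MHz control clock. x = 100 MHz is my assumption here.
x_mhz = 100
data_bw_gbs = (64 / 8) * 2 * x_mhz * 1e6 / 1e9  # bytes/beat * beats/clock * clocks/s
max_requests = x_mhz * 1e6                      # requests start only on control clocks
print(data_bw_gbs, "GB/s peak data;", max_requests / 1e6, "M requests/s max")
# -> 1.6 GB/s of data, but never more than 100 million new requests per second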
A request across the bus can only be started through the use of the control
lines. You can't start sending the data, then send the address later. So if
a request comes in at time 0.25 on a QDR bus, you have to wait until time
1.0 before you can start the transmission, even if nothing was sent at time
0.
For example, sending 32 bytes across a <x> MHz 64-bit QDR bus goes as
follows (simplified):
time 0.00: Bus sits idle
time 0.25: Bus sits idle, request is sendable but cannot be sent
time 0.50: Bus sits idle, request is sendable but cannot be sent
time 0.75: Bus sits idle, request is sendable but cannot be sent
time 1.00: Request sent
time 1.25: Request data continues
time 1.50: Request data continues
time 1.75: Request data continues
time 2.00: Bus sits idle, ready for next request
etc etc
On a <4*x> MHz 64-bit SDR bus, it would look like:
time 0.00: Bus sits idle
time 0.25: Request sent
time 0.50: Request data continues
time 0.75: Request data continues
time 1.00: Request data continues
time 1.25: Bus sits idle, ready for next request
etc etc
On a <x> MHz 256-bit SDR bus:
time 0.00: Bus sits idle
time 1.00: Request sent (all 32 bytes in a single cycle)
time 2.00: Bus sits idle, ready for next request
Which brings me back to the original point that an <x> MHz <y> bit QDR bus
is equivalent in performance to an <x> MHz <4*y> bit SDR bus, and slower than
a <4*x> MHz <y> bit SDR bus.
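If you'd rather see those timelines as arithmetic, here's a quick Python
sketch (my own illustration; finish_time_ns and its parameters are made-up
names, x is arbitrarily set to 100 MHz, and the request is assumed to
arrive a quarter of the way into a control clock):

import math

def finish_time_ns(arrival_ns, clock_mhz, width_bits, beats_per_clock, payload_bytes):
    # Requests can only start on an integer control-clock edge.
    cycle_ns = 1000.0 / clock_mhz
    start_ns = math.ceil(arrival_ns / cycle_ns) * cycle_ns
    beats = math.ceil(payload_bytes / (width_bits / 8.0))
    return start_ns + beats * (cycle_ns / beats_per_clock)

# 32 bytes, request arriving at time 0.25 (in <x> MHz clocks), x = 100 MHz:
x = 100
arrival_ns = 0.25 * (1000.0 / x)
print(finish_time_ns(arrival_ns, x, 64, 4, 32))      # <x> MHz 64-bit QDR: 20.0 ns
print(finish_time_ns(arrival_ns, 4 * x, 64, 1, 32))  # <4*x> MHz 64-bit SDR: 12.5 ns
print(finish_time_ns(arrival_ns, x, 256, 1, 32))     # <x> MHz 256-bit SDR: 20.0 ns

The QDR bus and the four-times-wider SDR bus finish at the same time; the
four-times-faster SDR bus finishes earlier, exactly as in the timelines.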
I know you dislike bringing memory into it, but the exact same thing applies
to DDR memory. Requests can only be sent on integer cycles, but data can be
sent on both integer and half-integer cycles. This is why DDR400 chips have
a 5ns access time, not 2.5ns.
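The arithmetic, if you want it spelled out (a trivial sketch, variable
names mine):

# DDR400: 200 MHz control clock, data transferred on both edges (400 MT/s).
clock_period_ns = 1000.0 / 200        # 5.0 ns between request opportunities
beat_period_ns = clock_period_ns / 2  # 2.5 ns between data beats
print(clock_period_ns, beat_period_ns)
# -> 5.0 2.5: a new random access can only start every 5 ns, even though
# data beats arrive every 2.5 ns once a transfer is under way.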
David Maynard wrote:
> Michael Brown wrote:
>> David Maynard wrote:
>>> Michael Brown wrote:
>>>> David Maynard wrote:
>>>>> BigBadger wrote:
>>>>>> No it's not 333MHz, it's actually a 166 'MHz' FSB
>>>>>> processor....333 is just AMD hype to sell the virtues of the DDR
>>>>>> bus. Intel do the same trick but they multiply the real bus
>>>>>> speed by 4x.
>>>>>
>>>>> Double and quad pumping the bus is not "hype." It's an engineering
>>>>> technique for transferring data twice, or 4 times for quad, per
>>>>> clock cycle.
>>>>>
>>>>> 333 is the bus cycle rate, e.g. "Bus Speed," and is the relevant
>>>>> number from a performance standpoint.
>>>>
>>>> Oh dear, oh dear, here we go again ... It depends on whether you
>>>> measure the control lines or the data lines for quoting the "bus
>>>> speed" number.
>>>
>>> No it doesn't. It has to do with how many data transfer cycles there
>>> are.
>>
>> This is exactly what I mean. There are some lines on the bus that
>> "operate" at <x> MHz, and some that "operate" at <4*x> MHz.
>
> It is irrelevant what "some lines" do. What's relevant is the data
> rate.
Why are the data lines more important than the control lines when determining
how many MHz the bus operates at? Without both, you don't have a bus, and
there are arguments for adopting either of the two speeds.
>> What, then, is the bus speed?
>
> The bus 'speed' is the data rate.
<nitpick>
Data rate is measured in bps, not MHz. Which is why I'm semi-comfortable
with the DDR333/DDR400 terminology, but dislike people saying it's a 333MHz
bus.
</nitpick>
What I *think* you mean is that the data signal characteristics are more
important than the control signal characteristics. Again, there are
arguments for both sides: the data signal characteristics are more important
under streaming conditions (the norm for GPUs), while control signal
characteristics are more important under random access conditions (the norm
for CPUs).
[...]
>>>> I actually think a more accurate way of representing it
>>>> (performance-wise) is a 128-bit bus (DDR) or 256-bit bus (QDR),
>>>> both running at 166MHz.
>>>
>>> Except it isn't '128 bits' or '256 bits' wide. It does, however,
>>> transfer data at either 2, for dual pumped, or 4 times, for quad
>>> pumped, the system clock rate.
>>
>> Hence why I explicitly said "performance-wise". The question I was
>> answering was:
>> Which more closely represents the performance of a 64-bit <x> MHz QDR
>> bus?
>> (a) A 64-bit <4*x> MHz SDR bus
>> (b) A 256-bit <x> MHz SDR bus
>> The correct answer is (b).
>
> The correct answer is (a) because that is REALITY. (b) is a figment
> of your imagination.
Again, you miss the "performance-wise" part. See the bit at the top of the
post for the reasoning behind this.
[...]
>>>> Say you have a 166MHz DDR system
>>>> (aka DDR333), and a 100MHz QDR system (aka QDR400), and the CPU
>>>> runs at 1GHz (6.0x for DDR, 10.0x for QDR). Excluding memory
>>>> latencies, to fill a randomly-accessed 64-byte cache line would
>>>> take:
>>>> Waiting for bus strobe: 3.0 cycles (DDR), 5.0 cycles (QDR)
>>>> Transferring data: 24 cycles (DDR), 20 cycles (QDR)
>>>> Total: 27 cycles (DDR), 25 cycles (QDR)
>>>>
>>>> So DDR333 is, under random access conditions, only marginally
>>>> slower than QDR400. The actual break-even point is 180MHz (actually
>>>> slightly above due to memory latencies), but hopefully you get the
>>>> idea. Of course, the QDR system will perform better under
>>>> "streaming" type conditions, where the higher latency won't matter
>>>> so much.
>>>
>>> No, you're analyzing the memory, not the processor bus.
>>
>> Errm, not at all. I specifically EXCLUDE any memory performance
>> considerations from the analysis: see the third line of your quoted
>> section.
>
> You claim to be excluding it but you embed it in your analysis
> nonetheless.
Please, tell me where in my analysis I bring memory performance into it
(excluding the "slightly above" remark, of course). All of the numbers above
are only influenced by the speed and signalling scheme of the bus in
question. A few quotes below, I say that you can think of it as two
processors exchanging a cache line. Another option is a write to an I/O
port. This is even more dramatic, as these writes are not cached, and a DDR333
bus will annihilate a QDR400 bus: the DDR333 bus will do 166 million I/Os
per second, whereas the QDR400 bus will only manage 100 million per second.
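If you want to check those cache-line numbers yourself, here's a minimal
Python sketch under the same assumptions as the analysis above (strobes
only on integer bus clocks, random arrival so the average wait is half a
bus clock, memory latency excluded; fill_cycles is my own naming):

def fill_cycles(cpu_per_bus, beats_per_clock, line_bytes, width_bytes=8):
    # cpu_per_bus = CPU cycles per bus control clock (the 6.0x / 10.0x above).
    wait = cpu_per_bus / 2.0                      # average wait for a strobe
    beats = line_bytes / width_bytes              # 64 bytes / 8 per beat = 8 beats
    xfer = (beats / beats_per_clock) * cpu_per_bus
    return wait + xfer                            # total, in CPU cycles

print(fill_cycles(6.0, 2, 64))   # DDR333, 1GHz CPU: 3 + 24 = 27 cycles
print(fill_cycles(10.0, 4, 64))  # QDR400, 1GHz CPU: 5 + 20 = 25 cycles
# Uncached I/O writes each take one request slot, so the rate is simply the
# control clock: 166 million/s on DDR333 versus 100 million/s on QDR400.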
> You also create artificial conditions in direct contrast to reality
> by, for example, trying to limit the analysis to 'random access' on a
> bus that is specifically designed for, optimized for, and RATED AT
> its synchronous stream transfer rate whether it be SDR, DDR, or QDR.
Could you please provide a reference that states that the EV6 bus was
"specifically designed for" streaming data transfers. I'm not holding my
breath.
And limiting the analysis to non-streaming applications is creating
artificial conditions? Excluding applications such as media encoding/decoding
and large matrix computations, much of the traffic that flows across the bus
is random-access for the purpose of analysis. To get into the streaming
case, you need to be running a bus-bandwidth limited task. Believe it or
not, just about everything except media encoding/decoding or large matrix
operations is CPU limited, not bus limited (which is why you don't get a
50% increase in performance moving from 133x15 to 200x10, even though both
run the CPU at about 2GHz and the second has a 50% faster bus).
Certainly, the P4 is designed and tweaked for streaming, but that doesn't
mean that ALL DDR/QDR busses are designed/tweaked for that purpose. The main
reason why DDR/QDR was implemented is that it's easier to use a <x> bit bus
at <2*y> MHz, as opposed to running a <2*x> bit bus at <y> MHz due to signal
skew problems. Sorta similar reasons as to why it's cheaper to use a USB2
interface than an EPP interface.
> Heck, the names you (properly) use explain it even as you're denying
> it: SDR - Single DATA RATE, DDR - Double DATA RATE, QDR - Quad DATA
> RATE.
Please provide a quote where I say that the data lines are running at the
same speeds as the control lines. Any such statements are most certainly
typos and I'd like to correct them. What I HAVE said is that a DDR bus
running at 166MHz (control signal clock) is identical in performance to an
SDR bus that is twice as wide running at 166MHz. It is NOT equivalent in
performance to an SDR bus (running at the same bus width as the DDR bus)
running at 333MHz.
I use the names DDR333, QDR400, etc (note the lack of MHz) simply because
these have become the "standard" names for the particular busses. I would
NOT call a DDR333 a 333MHz DDR bus though. To me, this says that the bus
runs at 333MHz, with the data lines running at double this (ie: 667MHz).
Also, I wouldn't call it a 333MHz bus or a 166MHz bus, as this fails to
specify the scheme used. I *would* call it a 166MHz DDR bus.
>> I'm solely analysing how long it would take to fill (or write out) a
>> processor cache line over a DDR/QDR bus, which is pretty much all the
>> processor bus is used for.
>> The exact same argument applies to the
>> point-to-point DDR busses in a K7 SMP system, if having memory in the
>> picture makes things confusing for you.
>
> You can wag imaginative theories and pick artificial 'conditions' all
> you want.
So, analysing the typical use for a bus is an artificial condition?
Riiiiggghhhttt ...
> I'm telling you how it works.
I know EXACTLY how the EV6 bus works, and know fairly well how the DDR
memory bus and the AGTL+ busses work. I wrote, pretty much from scratch, a
VHDL program to log transfers across an EV6 bus (and also played with, but
never got very far with, a DDR memory bus logger). Granted, it probably
wouldn't actually work correctly because of signal purity issues if it was
actually hooked into a real bus, and it was designed from the 21264 specs,
but I do know the theory behind it quite well. The EV6 isn't a great example
of a DDR bus for several reasons, but it still operates in much the same
manner as I described above.
>>>> Incidentally, this issue is exacerbated by the P4's 128-byte cache
>>>> line, as opposed to the 64-byte cache line of the K7.
>>>
>>> Processor (L2) cache has nothing to do with bus speed.
>>
>> Hence my "incidentally" (spot the recurring theme here: I don't
>> usually put in words for no reason). The processor cache line size
>> (note: cache LINE size, not cache size or anything else) and bus
>> performance characteristics are quite interlinked for the
>> performance of a processor. The larger cache line size improves
>> streaming performance and decreases random-access performance, which
>> is exactly the same characteristics as a QDR bus. My point was that
>> the P4 has been heavily tweaked towards streaming computations, as
>> opposed to having fast random-access times.
>
> We aren't talking about the "performance of a processor." We're
> talking about the bus data rate.
<sigh>
Will you PLEASE go and read back over what you quoted. It was simply a
comment about how the cache line size and bus type are interlinked with
respect to the performance of a processor. If you think it's off topic for
the thread, then just snip it instead of trying to make a great big issue
over it.
>>> Btw, why don't you call a 3.4 Gig P4 a 200MHz P4 because the 'real
>>> clock' (sic) is 200MHz? That 3.4 Gig number is just 'hype'.
>>
>>
>> Why don't you call it a 6.8GHz P4? :P
>
> Because the reality of it is that it's operating at 3.4 GHz.
Not all of it ... the ALUs and some parts of the scheduler are operating at
6.8GHz, and numerous other bits are operating at all sorts of different
speeds.
[snip further P4 stuff, as this is really getting off-topic for a discussion
on busses]
--
Michael Brown
www.emboss.co.nz : OOS/RSI software and more

Add michael@ to emboss.co.nz - My inbox is always open