Archived from groups: comp.sys.ibm.pc.hardware.chips
On Sun, 10 Apr 2005 01:15:53 GMT, epaton <epaton@null.com> wrote:
>I've recently had a thought and figured I'd post here to see if it was
>mad or not.
>
>My understanding is that the AMD has 3 hypertransport links, which can
>connect RAM, other slots, or even other processors to it.
Hypertransport was designed to connect to other ICs, either processors
or I/O chips. It wasn't really designed to ever connect to RAM, since
Opterons have memory controllers built in.
> Now I was wondering if it
>would be possible for a motherboard manufacturer to build a board which
>used one of these links to connect to some external but VERY high
>frequency RAM, like back in the Pentium days.
Theoretically yes.
The Opteron does have a method of snooping over its hypertransport
links. This is used to access data stored in the caches of other
processors. I think it's not entirely out of the realm of possibility
that this could be adapted to connect to an IC which is little more
than a cache controller with a bunch of SRAM.
> I would guess this would need an Opteron as they have extra HT links, but
> the idea of, say, a 32-64MB L3 cache has got to be useful to someone and
> would mean Opterons could go up against the Xeons with massive cache.
It would definitely only be for an Opteron, as the Athlon64 has only a
single hypertransport connector (err, I suppose you could daisy chain
this in between your processor and your I/O chips, but doing so would
probably be just a dumb idea). However the real question you would
have to ask is whether or not this would be remotely useful.
The reason why Xeons have such massive caches is because they don't
have integrated memory controllers and therefore they NEED to do
everything in their power to minimize main memory access if they want
to keep up with the Opteron. With its built-in memory controller the
Opteron is always going to have lower memory latency and therefore is
much less dependent on large caches.
Given that hypertransport offers less memory bandwidth (only 3.2GB/s
in each direction, vs. 6.4GB/s for the memory interface) and the very
low latency of accessing main memory, plus the overhead latency of
snooping over an HT link, it would be REALLY tough to get much extra
performance out of such a setup. With caches you quickly run into
diminishing returns as cache size increases. Only 64KB of
cache gets you a hit-rate well in excess of 90% on most applications.
Going up to 1MB of cache will often push your hit rate up to around
96-98%. If a 32MB cache only moves that cache hit rate up by 1 or 2%,
then you're spending a LOT of transistors to serve only a small percentage
of your memory accesses. If those cached accesses end up being only twice
as fast as going to main memory (my very rough guesstimate of this
setup), then you're only going to see a 0.5-1% improvement in
performance. For comparison, with the Xeon's built-in L3 cache you
are often looking at reducing latency by 75% vs. main memory, so only
a small increase in cache hit-rates can help out a fair bit.
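The back-of-the-envelope arithmetic above can be sketched out like so. All
the numbers (the +2% hit-rate delta, the 2x and 4x latency figures) are the
rough guesstimates from this post, not measured values:

```python
# Rough model of the argument above: how much memory-access time is saved
# when some fraction of accesses move from DRAM to a faster cache level.
# All inputs are guesstimates from the discussion, not measurements.

def time_saved(extra_hit_rate, speedup_vs_dram):
    """Fraction of total memory-access time saved when extra_hit_rate of
    accesses are serviced by a cache speedup_vs_dram times faster than DRAM."""
    latency_saved_per_hit = 1.0 - 1.0 / speedup_vs_dram
    return extra_hit_rate * latency_saved_per_hit

# Hypothetical 32MB HT-attached cache: +2% hit rate, ~2x faster than DRAM.
ht_cache = time_saved(0.02, 2.0)

# Xeon's on-die L3: same +2% hit rate, but ~75% lower latency (4x faster).
xeon_l3 = time_saved(0.02, 4.0)

print(f"HT-attached cache: {ht_cache:.1%} of memory time saved")  # 1.0%
print(f"On-die L3:         {xeon_l3:.1%} of memory time saved")   # 1.5%
```

Same hit-rate gain, but the on-die cache's much lower latency is what makes
the extra hits actually pay off.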
In short, my guess is that this just wouldn't be worthwhile in all but
very rare situations.
-------------
Tony Hill
hilla <underscore> 20 <at> yahoo <dot> ca