Archived from groups: comp.sys.ibm.pc.hardware.chips
On 9 Mar 2005 11:30:30 -0800, "YKhan" <yjkhan@gmail.com> wrote:
>Robert Myers wrote:
>> MD-Grape is fast because it is highly specialized, see, for example,
>> figure 1 of
>>
>> http://www.jsbi.org/journal/GIW02/GIW02P121.pdf
>
>I figured it would be highly specialized, without even being told. I
>wonder if dual-core chips doing single-precision SSE math wouldn't be
>able to compete with this totally? I can't see people sliding one of
>these PPU units in separately, taking up an entire slot by itself.
>Perhaps if it was part of the graphics card, then it might be accepted.
>
There's specialized and there's specialized, and MD-Grape is even more
specialized than, say, a GPU. MD-Grape is just a bunch of
multiply-accumulate pipes. From a power-consumption point of view, if
nothing else, it's a specialization that's well worth doing if you
have enough particles to keep track of.

The naive particle-particle problem has N*(N-1)/2 pair interactions,
each of which boils down to a few multiply-accumulates. That's exactly the kind of
completely-predictable streaming calculation for which stream
processors are well-suited, but the natural computational geometry of
a GPU or the Cell processor (multiple stages operate on a single
stream of data) would be sub-optimal for the calculation, which
requires only a single stage.
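
To make the N*(N-1)/2 count concrete, here is a minimal sketch of the naive
particle-particle loop in Python. It is only an illustration, not MD-Grape's
actual pipeline: the function name pairwise_forces, the softening term eps,
the inverse-square force law, and units with G = 1 are all assumptions made
for the example.

```python
import math


def pairwise_forces(pos, mass, eps=1e-3):
    """Naive particle-particle force sum over every unique pair.

    pos  : list of (x, y, z) tuples
    mass : list of floats, same length as pos
    eps  : softening length (an assumption, avoids division by zero)
    Returns (forces, pair_count).
    """
    n = len(pos)
    forces = [[0.0, 0.0, 0.0] for _ in range(n)]
    pair_count = 0
    for i in range(n):
        for j in range(i + 1, n):
            # Separation vector from particle i to particle j.
            d = [pos[j][k] - pos[i][k] for k in range(3)]
            # Squared distance: three multiply-accumulates plus softening.
            r2 = d[0] * d[0] + d[1] * d[1] + d[2] * d[2] + eps * eps
            # Inverse-square attraction with G = 1 (assumed for illustration).
            s = mass[i] * mass[j] / (r2 * math.sqrt(r2))
            for k in range(3):
                # Equal and opposite contributions to the two particles.
                forces[i][k] += s * d[k]
                forces[j][k] -= s * d[k]
            pair_count += 1
    return forces, pair_count
```

Every pass through the inner loop does the same fixed arithmetic on a fresh
pair of operands, with no data-dependent branching, which is why one wide
stage of multiply-accumulate pipes is enough to keep busy. The pair count
returned for N particles is N*(N-1)/2.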
RM