Pentium M desktops ???

Archived from groups: comp.sys.ibm.pc.hardware.chips

"Anonymous Joe" <anonymousjoe@net.net> wrote in message
news:TlWkc.7010$Ia6.809665@attbi_s03...
> "Robert Redelmeier" <redelm@ev1.net.invalid> wrote in message
> news:6RUjc.3959$0j3.1020@newssvr23.news.prodigy.com...
> > Paul Tiseo <tiseo123.paul456@mayo.edu> wrote:
> > > If you spend umpteen million in R&D to design the P4, and the P-M
> > > comes along with less R&D and runs slightly faster in some scenarios,
> > > what do you do? Chuck all your P4 R&D out the window?
> >
> > Actually, I believe the Pentium4 was done as a rush job,
> > on the cheap after it became apparent that ia64 (aka
> > Itanium) would not take over. IMHO it's an original
> > Pentium plus SSE2, deeply pipelined for inflated clocks.
> >
> > The Pentium-M is little more than the venerable P6 core,
> > tweaked and clocked higher on smaller processes.
> >
> > -- Robert
>
> It does seem as though the P4 is just that: a P3 with SSE2 and very long
> pipelines (extended further by Prescott) for a hyper-inflated clock speed.
> Performance eventually gets pretty decent, but only at a ridiculous clock
> speed, and some of that is due to the quad-pumped bus combined with the
> memory speeds (dual channel, anyway). As the P4's clock speed increases,
> so does the speed of its L1/L2 cache, which can only improve performance
> further.
>
> What strikes me is that a P4 @ 3GHz is generally on par with an Athlon
> 3000+, which depending on the bus speed chosen is either 2.16GHz (333MHz
> bus) or 2.1GHz (400MHz bus). I forget offhand how deep the P4 pipeline is,
> but it is something like 24, isn't it? The Athlon is something like 12, or
> 15. Either way, my numbers may be off, but it is still rather close. The P4
> has L1 & L2 cache running at 3GHz, while the Athlon's runs at about 66% of
> that but is more plentiful. The bus bandwidth of the P4 is going to be
> either 4.27GB/sec or 6.4GB/sec (533 or 800MHz bus [yet it is really 133 or
> 200MHz, quad-pumped]). Yet the Athlon is using a 3.2GB/sec bus. As for
> memory, the most you can get out of the Athlon is the 3.2GB/sec (whether
> you use PC3200 RAM or dual-channel RAM of any speed), but with the P4 you
> have a shot at getting a theoretical 6.4GB/sec (dual channel PC3200).
>
> All this combined, things sure look favorable for the P4. It has so much
> more bandwidth in every area: cache, bus, and RAM. Yet how come, with a
> 900MHz core clock lead, it is only able to tie the Athlon? It seems to be
> taking all that bandwidth and wasting it. If AMD could get the sort of
> bandwidth that Intel has, I would imagine that the P4 would need about a
> 1200MHz or greater head start to be comparable.
>
> For anybody who cares, I do use AMD, so if you want to say I'm promoting
> AMD unfairly or whatever, that's wrong; I'm simply showing that Intel
> isn't efficient.
>
>

In my opinion (2 cents' worth), those numbers are exactly the opposite of
bragging rights. Consider the problems constantly raised about Moore's law,
heat dissipation, and miniaturization reaching its final barrier: every
FLOP/Hz, B/s and transistor saved should mean more than ever. Yet the P4
design looks the problem right in the eye and says smugly: yes, but my pipe
is longer...
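
For reference, the peak bandwidth figures quoted above follow from simple arithmetic: effective transfer rate times bus width. Here is a minimal sketch, assuming a 64-bit (8-byte) data path for both front-side buses and for each memory channel. The function name and its parameters are illustrative and do not come from the thread.

```python
def peak_bandwidth_gb_s(mega_transfers_per_s, bus_width_bytes=8, channels=1):
    """Theoretical peak bandwidth in GB/s (decimal gigabytes).

    Assumption: a 64-bit (8-byte) data path unless told otherwise.
    """
    return mega_transfers_per_s * 1e6 * bus_width_bytes * channels / 1e9


# Effective transfer rates as quoted in the thread (million transfers/s).
p4_fsb_533 = peak_bandwidth_gb_s(533.33)            # P4 "533MHz" bus
p4_fsb_800 = peak_bandwidth_gb_s(800)               # P4 "800MHz" bus
athlon_fsb_400 = peak_bandwidth_gb_s(400)           # Athlon "400MHz" bus
dual_pc3200 = peak_bandwidth_gb_s(400, channels=2)  # two PC3200 channels

for name, value in [("P4 533 bus", p4_fsb_533),
                    ("P4 800 bus", p4_fsb_800),
                    ("Athlon 400 bus", athlon_fsb_400),
                    ("dual PC3200", dual_pc3200)]:
    print(f"{name}: {value:.2f} GB/s")
```

These are theoretical peaks only. They say nothing about how much of that bandwidth either core actually turns into finished work, which is the point being argued.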
 

In article <c7glnp$mkm$1@ls219.htnet.hr>, taurus@email.hinet.hr
says...
> "Robert Redelmeier" <redelm@ev1.net.invalid> wrote in message
> news:6RUjc.3959$0j3.1020@newssvr23.news.prodigy.com...
> > Paul Tiseo <tiseo123.paul456@mayo.edu> wrote:
> > > If you spend umpteen million in R&D to design the P4, and the P-M
> > > comes along with less R&D and runs slightly faster in some scenarios,
> > > what do you do? Chuck all your P4 R&D out the window?
> >
> > Actually, I believe the Pentium4 was done as a rush job,
> > on the cheap after it became apparent that ia64 (aka
> > Itanium) would not take over. IMHO it's an original
> > Pentium plus SSE2, deeply pipelined for inflated clocks.
>
> Everybody's using this theory of longer pipelines = more Hz, but can
> someone please shed some physical light on this topic?
> What I know about pipelines is from the AoA book and the web (instruction
> stages = fetch, decode, ip++ ...), but I really cannot relate that to any
> kind of speed increase. Plus, where do they come up with 20+ stages in
> Prescott? I mean, there's only so much stuff one instruction needs or can do.

It's really rather simple; the less work done in each clock
cycle, the faster that clock can be run. The Intel developer
site has the descriptions and clock counts (length) of each pipe.
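
A rough way to put numbers on "less work per cycle" is the toy model below. It assumes a fixed amount of logic delay per instruction, split evenly across the stages, plus a fixed latch overhead that every stage pays. The 5 ns and 0.1 ns figures are made up for illustration and are not Intel data.

```python
def max_clock_ghz(total_logic_ns, stages, latch_overhead_ns):
    """Highest clock (GHz) a pipeline can run at in this toy model.

    Each stage must fit its share of the logic plus the latch overhead
    into one clock period.
    """
    cycle_ns = total_logic_ns / stages + latch_overhead_ns
    return 1.0 / cycle_ns


# Hypothetical numbers: 5 ns of total logic, 0.1 ns of overhead per stage.
for stages in (10, 20, 30):
    print(stages, "stages ->", round(max_clock_ghz(5.0, stages, 0.1), 2), "GHz")
```

In this model, doubling the stage count raises the clock by less than a factor of two, because the per-stage overhead does not shrink. That is one reason deeper pipelines give diminishing returns.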

--
Keith
 

Bitstring <MPG.1b06c196a1014f85989821@news1.news.adelphia.net>, from the
wonderful person KR Williams <krw@att.biz> said
<snip>
>It's really rather simple; the less work done in each clock
>cycle, the faster that clock can be run. The Intel developer
>site has the descriptions and clock counts (length) of each pipe.

A real simple (too simple, but WTF) analogy .. it takes what, 30 seconds
maybe, to run round a baseball diamond. However just getting from base
to base you can do in ~10 seconds. If you had 16 or 32 bases instead of
the 4 (including home plate) you could get from base to base in maybe a
second or two.

You wouldn't get runs completed any faster (if you decided to =stop= at
each base, it'd actually take way longer), but you sure could brag about
an amazing clock speed, getting from one base to the next. 8>.

Which just demonstrates the stupidity of trying to measure MPH or BHP
using the rev counter.
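
The analogy can be turned into a small calculation. This sketch assumes a 30-second lap split evenly between the bases, no cost for stopping at a base, and a runner standing on every base at all times. The function and its field names are hypothetical.

```python
def diamond_stats(lap_seconds, bases):
    """Latency and best-case throughput for a diamond with `bases` bases."""
    hop_seconds = lap_seconds / bases          # the "clock period"
    return {
        "hop_seconds": hop_seconds,
        "latency_seconds": hop_seconds * bases,   # one runner, home to home
        "runs_per_minute": 60.0 / hop_seconds,    # only if every base is occupied
    }


for bases in (4, 32):
    print(bases, "bases:", diamond_stats(30.0, bases))
```

With 4 bases the hop takes 7.5 seconds and at best 8 runs score per minute. With 32 bases the hop takes under a second and at best 64 runs score per minute, yet any single runner still needs the same 30 seconds to get home.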

--
GSV Three Minds in a Can
Outgoing Msgs are Turing Tested,and indistinguishable from human typing.
 

In article <V4GicxGS0SnAFATA@from.is.invalid>,
GSV@quik.clara.co.uk says...
> Bitstring <MPG.1b06c196a1014f85989821@news1.news.adelphia.net>, from the
> wonderful person KR Williams <krw@att.biz> said
> <snip>
> >It's really rather simple; the less work done in each clock
> >cycle, the faster that clock can be run. The Intel developer
> >site has the descriptions and clock counts (length) of each pipe.
>
> A real simple (too simple, but WTF) analogy .. it takes what, 30 seconds
> maybe, to run round a baseball diamond. However just getting from base
> to base you can do in ~10 seconds. If you had 16 or 32 bases instead of
> the 4 (including home plate) you could get from base to base in maybe a
> second or two.
>
> You wouldn't get runs completed any faster (if you decided to =stop= at
> each base, it'd actually take way longer), but you sure could brag about
> an amazing clock speed, getting from one base to the next. 8>.
>
> Which just demonstrates the stupidity of trying to measure MPH or BHP
> using the rev counter.

Or nine women can have nine babies in nine months, but each one
still takes nine months. ;-)

--
Keith
 

"KR Williams" <krw@att.biz> wrote in message
news:MPG.1b06ff339fdcb0fe989833@news1.news.adelphia.net...
>
> Or nine women can have nine babies in nine months, but each one
> still takes nine months. ;-)

That's the difference between throughput (bandwidth) and latency,
Keith. ;-)
 

In article <fWanc.12535$V97.2894
@newsread1.news.pas.earthlink.net>, fmsfnf@jfoops.net says...
> "KR Williams" <krw@att.biz> wrote in message
> news:MPG.1b06ff339fdcb0fe989833@news1.news.adelphia.net...
> >
> > Or nine women can have nine babies in nine months, but each one
> > still takes nine months. ;-)
>
> That's the difference between throughput (bandwidth) and latency,
> Keith. ;-)

Sorta like base-runners, eh? ...though I know you don't care a
whit about stick-ball (nor do I, actually). When's New England
going to kick SF's sorry butt again, eh? ...gotta be soon now.
;-)

--
Keith
 

On Sat, 08 May 2004 19:54:51 GMT, "Felger Carbon" <fmsfnf@jfoops.net> wrote:

>"KR Williams" <krw@att.biz> wrote in message
>news:MPG.1b06ff339fdcb0fe989833@news1.news.adelphia.net...
>>
>> Or nine women can have nine babies in nine months, but each one
>> still takes nine months. ;-)
>
>That's the difference between throughput (bandwidth) and latency,
>Keith. ;-)

lol!
 

On Sat, 8 May 2004 23:32:33 -0400, KR Williams <krw@att.biz> wrote:

>In article <fWanc.12535$V97.2894
>@newsread1.news.pas.earthlink.net>, fmsfnf@jfoops.net says...
>> "KR Williams" <krw@att.biz> wrote in message
>> news:MPG.1b06ff339fdcb0fe989833@news1.news.adelphia.net...
>> >
>> > Or nine women can have nine babies in nine months, but each one
>> > still takes nine months. ;-)
>>
>> That's the difference between throughput (bandwidth) and latency,
>> Keith. ;-)
>
>Sorta like base-runners, eh? ...though I know you don't care a
>whit about stick-ball (nor do I, actually). When's New England
>going to kick SF's sorry butt again, eh? ...gotta be soon now.
>;-)

The Pats STILL haven't lost a game since last October ;-)
 

In article <op9r909h6bkjj2oqutndg5n49nm86jt63j@4ax.com>,
day_trippr@REMOVEyahoo.com says...
> On Sat, 8 May 2004 23:32:33 -0400, KR Williams <krw@att.biz> wrote:
>
> >In article <fWanc.12535$V97.2894
> >@newsread1.news.pas.earthlink.net>, fmsfnf@jfoops.net says...
> >> "KR Williams" <krw@att.biz> wrote in message
> >> news:MPG.1b06ff339fdcb0fe989833@news1.news.adelphia.net...
> >> >
> >> > Or nine women can have nine babies in nine months, but each one
> >> > still takes nine months. ;-)
> >>
> >> That's the difference between throughput (bandwidth) and latency,
> >> Keith. ;-)
> >
> >Sorta like base-runners, eh? ...though I know you don't care a
> >whit about stick-ball (nor do I, actually). When's New England
> >going to kick SF's sorry butt again, eh? ...gotta be soon now.
> >;-)
>
> The Pats STILL haven't lost a game since last October ;-)
>
....and won't again for three more months, maybe they can make it
a complete year! ...or two. ;-)

--
Keith
 

GSV Three Minds in a Can <GSV@quik.clara.co.uk> wrote in message news:<V4GicxGS0SnAFATA@from.is.invalid>...
> Bitstring <MPG.1b06c196a1014f85989821@news1.news.adelphia.net>, from the
> wonderful person KR Williams <krw@att.biz> said
> <snip>
> >It's really rather simple; the less work done in each clock
> >cycle, the faster that clock can be run. The Intel developer
> >site has the descriptions and clock counts (length) of each pipe.
>
> A real simple (too simple, but WTF) analogy .. it takes what, 30 seconds
> maybe, to run round a baseball diamond. However just getting from base
> to base you can do in ~10 seconds. If you had 16 or 32 bases instead of
> the 4 (including home plate) you could get from base to base in maybe a
> second or two.
>
> You wouldn't get runs completed any faster (if you decided to =stop= at
> each base, it'd actually take way longer), but you sure could brag about
> an amazing clock speed, getting from one base to the next. 8>.
>
> Which just demonstrates the stupidity of trying to measure MPH or BHP
> using the rev counter.


I think that's a very good analogy. It also points out the theoretical
*BENEFIT* to long pipelines. If you can keep a runner on every base
all the time, you've got a person crossing the plate more frequently,
which represents more work getting done.

I don't think the higher frequency, or even the higher heat generated,
is the real problem. As long as you can cool it reasonably well, why
not design right up to the thermal threshold? As long as you're not
talking blades, notebooks, or making the room shake with excess fan
noise, I'd rather have the PC doing more work when it's running. I
don't even consider high IPC a strict definition of "efficiency". IMO,
the real problems are:

1) The P4 *DOESN'T* keep a runner on every base continually. Maybe
this is just Intel's implementation. Maybe they just didn't do as good
of a job as they could have. Or, maybe it's truly an intractable
problem with long pipelines. My guess is that it's probably both. But
I don't think it was WRONG to go that route when they decided to. I
also think Hyperthreading didn't help as much as they had initially
hoped it would.

2) If it takes more transistors to implement, you have to question
whether those transistors could be spent other ways that would
increase performance without the heat penalty. But I also firmly
believe that you don't get something for nothing. Short of a temporary
performance advantage from a good idea until it gets copied by
everyone (like the on-board memory controller), the chip designers are
all working with the same transistor budget.
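
Point 1 can be sketched with the usual back-of-the-envelope stall model: every mispredicted branch empties the pipeline, and the cost of that grows with pipeline depth. The sketch assumes the flush penalty roughly equals the number of stages. All rates below are hypothetical, not measurements of any real CPU.

```python
def effective_ipc(base_ipc, branch_fraction, mispredict_rate, flush_penalty_cycles):
    """Instructions per cycle once branch-misprediction stalls are counted.

    base_ipc             -- IPC with a perfectly full pipeline
    branch_fraction      -- share of instructions that are branches
    mispredict_rate      -- share of branches that are mispredicted
    flush_penalty_cycles -- cycles lost per misprediction (assumed ~ depth)
    """
    cycles_per_instr = (1.0 / base_ipc
                        + branch_fraction * mispredict_rate * flush_penalty_cycles)
    return 1.0 / cycles_per_instr


# Hypothetical workload: 20% branches, 5% of them mispredicted.
for depth in (10, 20, 30):
    print(depth, "stages ->", round(effective_ipc(2.0, 0.20, 0.05, depth), 2), "IPC")
```

The deeper the pipeline, the more each empty base costs, so the clock gain from extra stages has to outrun the IPC lost to flushes.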

Put another way, I think if a Prescott successor had:

- An on-board memory controller
- Used the Pentium M's micro-op fusion
- Was optimized to run just a little bit cooler
- Had the 64-bit extension enabled

It would be a great desktop CPU. This isn't to say that the Athlon64
isn't a BETTER cpu right now (it is). But I'd rather have a Prescott
like THAT available in late 2004 than have to wait for the dual-core
64-bit desktop 2GHz Pentium-M in late 2005 (or even later) to get the
same performance.
 

G <gaf1234567890@hotmail.com> wrote:
> It would be a great desktop CPU. This isn't to say that the Athlon64
> isn't a BETTER cpu right now (it is). But I'd rather have a Prescott
> like THAT available in late 2004 than have to wait for the dual-core
> 64-bit desktop 2GHz Pentium-M in late 2005 (or even later) to get the
> same performance.

Heck, how about just a single Pentium M at 2.2GHz or so? Given the numbers
I've seen for the 1.7, a 2.2GHz Banias would be very competitive with
anything Intel or AMD is selling now.

--
Nate Edel http://www.nkedel.com/

"Elder Party 2004: Cthulhu for President -- this time WE'RE the lesser
evil."
 

G wrote:
> GSV Three Minds in a Can <GSV@quik.clara.co.uk> wrote in message news:<V4GicxGS0SnAFATA@from.is.invalid>...
>
>>Bitstring <MPG.1b06c196a1014f85989821@news1.news.adelphia.net>, from the
>>wonderful person KR Williams <krw@att.biz> said
>><snip>
>>
>>>It's really rather simple; the less work done in each clock
>>>cycle, the faster that clock can be run. The Intel developer
>>>site has the descriptions and clock counts (length) of each pipe.
>>
>>A real simple (too simple, but WTF) analogy .. it takes what, 30 seconds
>>maybe, to run round a baseball diamond. However just getting from base
>>to base you can do in ~10 seconds. If you had 16 or 32 bases instead of
>>the 4 (including home plate) you could get from base to base in maybe a
>>second or two.
>>
>>You wouldn't get runs completed any faster (if you decided to =stop= at
>>each base, it'd actually take way longer), but you sure could brag about
>>an amazing clock speed, getting from one base to the next. 8>.
>>
>>Which just demonstrates the stupidity of trying to measure MPH or BHP
>>using the rev counter.
>
>
>
> I think that's a very good analogy. It also points out the theoretical
> *BENEFIT* to long pipelines. If you can keep a runner on every base
> all the time, you've got a person crossing the plate more frequently,
> which represents more work getting done.

It is a good analogy in the sense that the P4 puts lots of runners
on base - and actually provides extra bases for those base runners
to stand on. However, the analogy falls apart in other ways - you
could say that the AthlonXP and AMD64 processors put fewer men on
base but do a *much* better job when it comes to actually driving in
some of those baserunners. In the Intel vs AMD baseball game, Intel
loses because their batting average with men on base is pretty shitty.
It is the number of runs scored that counts at the end of the game,
not the number of men you had on base. A stranded base runner
counts for nothing.
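
"Runs scored" is just clock rate times instructions completed per clock. Taking the rough parity claimed earlier in the thread (a 3GHz P4 about equal to a 2.1GHz Athlon 3000+) at face value, the implied per-clock gap can be worked out as below. The function names are illustrative, and the parity itself is the earlier poster's claim, not a benchmark result.

```python
def work_per_second(clock_ghz, ipc):
    """Billions of instructions completed per second: clock times IPC."""
    return clock_ghz * ipc


def implied_ipc_ratio(clock_fast_ghz, clock_slow_ghz):
    """How much more work per clock the slower-clocked chip must do to tie."""
    return clock_fast_ghz / clock_slow_ghz


ratio = implied_ipc_ratio(3.0, 2.1)
print(f"Athlon needs about {ratio:.2f}x the P4's work per clock to tie")
```

Under that assumption the Athlon would be completing roughly 1.4 times as much work per clock as the P4.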

>
> I don't think the higher frequency, or even the higher heat generated,
> is the real problem. As long as you can cool it reasonably well, why
> not design right up to the thermal threshold? As long as you're not
> talking blades, notebooks, or making the room shake with excess fan
> noise, I'd rather have the PC doing more work when it's running. I
> don't even consider high IPC a strict definition of "efficiency". IMO,
> the real problems are:
>
> 1) The P4 *DOESN'T* keep a runner on every base continually. Maybe
> this is just Intel's implementation. Maybe they just didn't do as good
> of a job as they could have. Or, maybe it's truly an intractable
> problem with long pipelines. My guess is that it's probably both. But
> I don't think it was WRONG to go that route when they decided to. I
> also think Hyperthreading didn't help as much as they had initially
> hoped it would.
>
> 2) If it takes more transistors to implement, you have to question
> whether those transistors could be spent other ways that would
> increase performance without the heat penalty. But I also firmly
> believe that you don't get something for nothing. Short of a temporary
> performance advantage from a good idea until it gets copied by
> everyone (like the on-board memory controller), the chip designers are
> all working with the same transistor budget.
>
> Put another way, I think if a Prescott successor had:
>
> - An on-board memory controller
> - Used the Pentium M's micro-op fusion
> - Was optimized to run just a little bit cooler
> - Had the 64-bit extension enabled
>
> It would be a great desktop CPU. This isn't to say that the Athlon64
> isn't a BETTER cpu right now (it is). But I'd rather have a Prescott
> like THAT available in late 2004 than have to wait for the dual-core
> 64-bit desktop 2GHz Pentium-M in late 2005 (or even later) to get the
> same performance.