Barcelona model numbers and clockspeeds


sandmanwn

Distinguished
Dec 1, 2006
915
0
18,990
Those numbers would depend on how you narrowed the market in your survey.

Across overall CPU sales, including servers, specialized processors, home PCs, and business PCs, enthusiasts are equal to about jack squat of 1%.

Maybe 10-15% if you are solely talking about the home market.
 

tamalero

Distinguished
Oct 25, 2006
1,142
148
19,470
Actually... he's right. Enthusiasts only make up 10 - 15% of the market.
Yeah, but do all of these 15% of enthusiasts go for quad-core or super expensive systems?
I'd say there are degrees of enthusiasts, just like levels of gamers.
 

eregular

Distinguished
Dec 8, 2006
266
0
18,780
Intel doesn't release 3.2+ ghz c2d's because of a little thing called a "power envelope" :p everyone except the gamer gfx card companies are trying to use LESS power, it would be retarded for something that is as much of a power hog as quadfx to be released as anything near what the average consumer could possibly use it for!!

Rabble! K8L will not be so easily written off!! Agena will be its maturation...who cares if every1 uses c2d's for a few more months??!? when people are ready for world record setting they call AMDTI :twisted:
 

LordPope

Distinguished
Jun 23, 2006
553
0
18,980
Intel doesn't release 3.2+ ghz c2d's because of a little thing called a "power envelope" :p everyone except the gamer gfx card companies are trying to use LESS power, it would be retarded for something that is as much of a power hog as quadfx to be released as anything near what the average consumer could possibly use it for!!

Rabble! K8L will not be so easily written off!! Agena will be its maturation...who cares if every1 uses c2d's for a few more months??!? when people are ready for world record setting they call AMDTI :twisted:

After an excellent post like this... you need to be made a MOD....

I like this new trend... BARON said the K8L is a beast so then we must say it sucks....
 
Reaally?
World record setting? AMD/ATI?

Let's take a look at the top 20 from HWBot.org:

CPUZ:
[image: CPU-Z hall of fame screenshot]

Curiously, no AMD.

SuperPi:
[image: SuperPi hall of fame screenshot]

Curiously, no AMD.

3DMark 01 & 06:
[images: 3DMark hall of fame screenshots]
Curiously, no AMD/ATI

SiSoft Sandra:

Curiously, no AMD.

Not saying AMD or ATI are bad in any way, but they've not really been much when it comes to world records.

Edited for the pedantic bastard =P /joke
 
Apparently you failed at reading comprehension. I never said they didn't do good, but the Netburst, Core and Nvidia chips have consistently taken the WORLD RECORDS. AS IN THE TOP 20. YES, EVEN THE NETBURST CHIPS.
For fcuks sake, I said records, not performance.
 

the_vorlon

Distinguished
May 3, 2006
365
0
18,780
A few quick facts, then speculation....

AMD has stated, on the record, that Barcelona is a rework of the existing K8 pipeline. Some heavy modifications to be sure, the 128-bit single clock cycle SSE execution units being the biggest one. In theory this will give Barcelona a big floating point edge on Conroe, which is also 128 bits wide but can only dispatch a single 128-bit SSE instruction per clock versus two for Barcelona.

But as a rework, there simply are not going to be large clock speed jumps. Indeed, the switch to the aforementioned 128-bit SSE engines will make it tough to hold existing clock speeds. They may get a bit of a speed bump from the new (at least to AMD) 65 nm process, but well into the 3 GHz range just ain't gonna happen.

Intel will continue to lead in integer performance, there is nothing published anywhere I can find that's going to give Barcelona a bump on integer applications.

Barcelona will, generally speaking, issue 4 instructions per clock cycle (kinda 5 under some limited conditions) versus 4 for Conroe (also kinda 5 under some limited situations). The two designs have the same size of instruction buffers and issue restrictions, so clock for clock these chips will be darn close: Barcelona a tad better on FP, Conroe a tad better on integer code.

Clock speed is pure speculation.

We know Conroes typically get to 3.3 GHz or so with ease, and close to 4 GHz on really good cooling.

Assuming Penryn has leakage issues solved to the degree Intel claims, 4.0+ GHz is not an irrational assumption, but it is still an assumption.

The other issue is yields.

If Conroe/Clovertown can hold the fort with a 3.4-3.6 GHz part on a 1600 MHz FSB, I think Intel holds off on Penryn till yields are mature. IF Intel needs it for PR value, I am sure at least a few 45 nm parts can be dragged into the 3rd quarter...

The reality is VOLUME shipments of 45 nm start in Christmas 2007 - you can pull in a design, but due to the changeover of masks, tools, process, wafer composition, etc., logistically it's damn hard to pull in the timelines on a fab changeover - these things take time.
 

LordPope

Distinguished
Jun 23, 2006
553
0
18,980
Last time I saw this many cat fights was in high school.
I'ma just wait and see, then let the dust settle.

No, some people here want to take any INTEL speculation as fact
and any AMD speculation as impossible

thus the fights break out
 

BaronMatrix

Splendid
Dec 14, 2005
6,655
0
25,790
I thought Barcelona was still a 3-issue core. Perhaps I am wrong.
And I thought that it will be capable of dispatching only one 128bit SSE instruction per clock.

No, according to the Xbit dissemination last year(and AMD docs), it can do 2 x 128 SSE4A and 2 x 128 X87 FP partly because the L1 does two loads/retires per cycle(again, ideally).
 

Ranman68k

Distinguished
Dec 19, 2006
255
0
18,780
i wanna wait and see.... so that makes me a AMD paid shill

pinocchio.gif
IM NOT AN AMD FANBOI!!!
\
8O <= Pope
wombat2 -- Another embarrassment for the forum. Grow up! :lol:


On another note: Consider the type of software you will be running. 3D games make heavy use of floating point processing. So, consider this fact when choosing a CPU. :wink:
 

the_vorlon

Distinguished
May 3, 2006
365
0
18,780
I thought Barcelona was still a 3-issue core. Perhaps I am wrong.

Yes, it is, but with the bi-directional nature of the L1 pipe and OoO, they can do 3 reads and 2 writes (ideally) per cycle. XbitLabs

Yes, I guess it depends what you consider to be an "instruction". Both Conroe and Barcelona can, ideally, do 5 instructions per clock. The K8L carries over the 3 integer execution units of the K8 and now has two 128-bit SSE/FP units.

In theory this will sometimes equal 5 per clock, but usually much less, and most of the time the length of the instruction buffer and the ability to speculatively reorder loads before stores will limit this to well under 4.

Where the comparison gets complicated is that both Intel and AMD break x86 instructions down into smaller building blocks, or micro-ops, for execution, and at this level the two cores are very different.

so even here it's hard to compare.
 

DavidC1

Distinguished
May 18, 2006
499
73
18,860
AMD has stated, on the record, that Barcelona is a rework of the existing K8 pipeline. Some heavy modifications to be sure, the 128-bit single clock cycle SSE execution units being the biggest one. In theory this will give Barcelona a big floating point edge on Conroe, which is also 128 bits wide but can only dispatch a single 128-bit SSE instruction per clock versus two for Barcelona.

OK, you got something mixed up here. First, the word "dispatch" here is not right. Second, the reason Conroe can execute single-cycle SSE is that it has a 128-bit SSE unit. Previous CPUs like K8 and Pentium 4/Core Duo had a 64-bit SSE unit, so they had to break up 128-bit SSE instructions into two cycles. Conroe does it in a single cycle, which is one reason it gets a massive advantage in SSE performance. The theoretical capability in that regard is essentially equal to Barcelona. Barcelona will have twice the SSE load performance of Conroe, but that won't translate into twice the performance; the gain will probably be minor.

SSE unit capabilities for Barcelona and Conroe:
-Conroe can execute 4 128-bit SSE instructions plus 1 if the instruction types are all different.
-Conroe has three 128-bit SSE units but they are not fully symmetric(only two are symmetric)
-Conroe has the ability to execute 2x 128-bit SSE instructions/cycle
-Barcelona can also do 2x 128-bit SSE instructions/cycle
-Barcelona has the ability to execute 2x 128-bit FP LOAD instructions/cycle, while Conroe can do 1x 128-bit FP LOAD.
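The width argument above can be put into numbers with a rough sketch. It only models the two points made in this post: a 64-bit unit needs two passes for a 128-bit SSE instruction, and load bandwidth scales with the number of 128-bit loads per cycle. The function names and the unit count used for the 64-bit case are assumptions for illustration, not figures from AMD or Intel documentation.

```python
import math

SSE_INSTRUCTION_BITS = 128  # width of one packed SSE instruction's operand


def cycles_per_sse_instruction(unit_width_bits):
    """Passes a single execution unit needs for one 128-bit SSE instruction."""
    return math.ceil(SSE_INSTRUCTION_BITS / unit_width_bits)


def peak_sse_per_cycle(units, unit_width_bits):
    """Theoretical peak 128-bit SSE instructions per cycle (ignores all stalls)."""
    return units / cycles_per_sse_instruction(unit_width_bits)


def load_bytes_per_cycle(loads_per_cycle, load_width_bits=128):
    """Peak bytes loaded per cycle for a given number of loads of that width."""
    return loads_per_cycle * load_width_bits // 8


# Assumed: two units in each case, so only the unit width differs.
older_64bit = peak_sse_per_cycle(units=2, unit_width_bits=64)
wide_128bit = peak_sse_per_cycle(units=2, unit_width_bits=128)

# Load figures from the list above: 2x 128-bit loads vs 1x 128-bit load.
barcelona_load = load_bytes_per_cycle(2)
conroe_load = load_bytes_per_cycle(1)
```

Under these assumptions the execute rate doubles when the units go from 64 to 128 bits, while the only remaining difference between the two 128-bit designs is the load side (32 bytes vs 16 bytes per cycle), which is the point about load performance not translating directly into twice the throughput.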

And btw, Barcelona has a 3-issue core. It just has the ability to fetch 32 bytes, which is double that of Athlon 64.

gOJDO, Penryn cores allow fractional ("floating point") multipliers, meaning they can step up in 0.5x increments. They don't need to have a wide disparity like 3.0/3.33/3.66, but rather 3.33/3.50/3.66. This ability will also come with Barcelona cores.
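A quick sketch of the multiplier arithmetic, assuming a 333.33 MHz base clock (a 1333 MT/s quad-pumped FSB); the base clock and the multiplier values are assumptions chosen to reproduce the speed grades quoted above, not published specs. Note that the code rounds, so the top grade prints as 3.67 where the post truncates to 3.66.

```python
from typing import List


def clock_steps(bus_mhz: float, multipliers: List[float]) -> List[float]:
    """Core clock in GHz (rounded to 2 decimals) for each multiplier."""
    return [round(bus_mhz * m / 1000, 2) for m in multipliers]


BUS_MHZ = 1000 / 3  # assumed base clock of about 333.33 MHz

# Whole multipliers only: each step is a full bus clock apart.
whole_steps = clock_steps(BUS_MHZ, [9, 10, 11])

# With 0.5x increments a grade fits between 10x and 11x.
half_steps = clock_steps(BUS_MHZ, [10, 10.5, 11])

print(whole_steps)  # [3.0, 3.33, 3.67]
print(half_steps)   # [3.33, 3.5, 3.67]
```

The half step is what lets a vendor slot in a 3.5 GHz part instead of jumping a full 333 MHz between speed grades.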
 

BaronMatrix

Splendid
Dec 14, 2005
6,655
0
25,790
I thought Barcelona was still a 3-issue core. Perhaps I am wrong.

Yes, it is, but with the bi-directional nature of the L1 pipe and OoO, they can do 3 reads and 2 writes (ideally) per cycle. XbitLabs

Baron, this is a new concept to me. Could you point to the Xbit piece that talks about this specifically? The fetch buffer can actually retrieve up to 32 bytes in one clock, depending on the instruction length. Though it will fetch this amount, the width refers to decode, dispatch, execute and retire, which is independent of writes to or reads from the L1 cache. Instructions are loaded into the reorder buffer, from which they are dispatched to the execution units in the most efficient non-dependent order.

For example, Core 2 Duo can fetch up to 6 instructions per clock (if the average instruction length is 4 bytes, C2D would fetch 24 bytes), but this does not mean it will dispatch 6 instructions.

I think you guys are confusing instruction length with core width:

The 16 bytes per cycle fetch rate allows sending 3 instructions with an average length of up to 5 bytes for decoding on each cycle. In certain algorithms, however, the average instruction length in a chain may be bigger than 5 bytes.

Later in their conclusions, Xbit writes:

The inability to decode and retire 4 commands per clock cycle in some cases may result into tangible performance gaps in integer applications.
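The fetch-versus-width distinction in the quoted text can be sketched as a simple bound: the front end can hand over at most as many whole instructions as fit in the fetch window, and the core then caps that at its own width. The function name and the simplified model (whole instructions only, no alignment or split-instruction effects) are assumptions for illustration.

```python
def fetch_limited_instructions(fetch_bytes, avg_instr_bytes, core_width):
    """Instructions per cycle under a simple fetch-window and core-width bound."""
    fetched = int(fetch_bytes // avg_instr_bytes)  # whole instructions in the window
    return min(fetched, core_width)


# 16-byte fetch, 3-wide core, 5-byte average: fetch just keeps the core fed.
k8_short = fetch_limited_instructions(16, 5, 3)

# Same core with 6-byte average instructions: fetch becomes the bottleneck.
k8_long = fetch_limited_instructions(16, 6, 3)

# Doubling the fetch to 32 bytes removes that bottleneck, but the core is still 3-wide.
wide_fetch = fetch_limited_instructions(32, 6, 3)

# The Core 2 Duo example from the post: 24 bytes of 4-byte instructions is
# 6 fetched, but an assumed 4-wide core still caps it at 4.
c2d_example = fetch_limited_instructions(24, 4, 4)
```

In this model a wider fetch only helps when instructions are long enough to starve the decoders; it never raises throughput above the core width.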

One should be careful when reading Xbit articles; their English is not very good.




Jack


I can usually use context clues, but their English was pretty good. The key thing is that most times you will only get instruction streams of 2-2.5 per cycle.

By bi-directionality I mean it works the same as HT, where it can do L1-L2 and L2-L1 at the same time. Barcelona also creates much smaller macro-ops, which makes for fewer cycles to decode.
Under ideal (instruction size) situations it should average close to the theoretical max.

Integer should get a hefty increase depending on the efficiency of the OoO mechanism and branch predictor. Just doubling the fetch means hardly any split instruction cases, especially with the ability to do 2 reads per clock.


I am more interested in seeing Kuma than Barcelona because Kuma will give a direct correlation to Brisbane, though I guess you could disable two Barcelona cores.