AMD reports Q3 results

I think they're all suckers because Intel knows that Barcelona will be MONSTROUS. If the "IPC-enhanced" cores give them 1.5 loads\retires per decoder, that will theoretically translate into 4.5 IPC.
What if not?

Poor Baron, again he does not understand. K8 and K8L are 3 issue cores. AMD does some form of micro fusion, but given that no processor hits its theoretical maximum it seems unlikely that K8L will exceed 3 IPC.

Boy, he loves to stretch things in the direction he sees fit (or shall we say fits his view of things), doesn't he...

As I stated above it's all based on a dream rather than reality. Now if AMD were to increase the number of decoders then yes, they would be increasing their IPC, but AMD is using 3 complex decoders with limited Micro Ops abilities.

Intel's Core 2 Duo, on the other hand, uses 4 decoders (3 simple, 1 complex) with Macro/Micro Fusion enabling a +1 around 20% of the time (approximately 2 in 10 instructions for most programs).

So Core 2 Duo is 4+1 while AMD is 3.
 
No it does not. It is the total number of dispatch ports that will determine how many instructions go to the execution units. A 3 issue core is max-theoretical 3, and simple random dependencies dictate that a processor will never achieve this over time. The time average IPC will always be less than the total dispatch capabilities of the processor unless special 'fusion' tricks are used.

You should read some of those links I post.

Here is another one.... this one looks at how the OoO execution efficiency changes IPC as a function of window size. What you will notice is that the actual IPC never exceeds the total issue of the processor.
http://courses.ece.uiuc.edu/ece512/Papers/Michaud.2001.HPCA.pdf
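
The claim that time-averaged IPC stays at or below the issue width can be illustrated with a toy simulation. This is only a sketch under stated assumptions: unit latency, at most one randomly chosen producer per instruction, oldest-first issue from a fixed-size window. The function name `simulate_ipc` and all its parameters are made up for illustration; it does not model K8, K8L or Core 2, and it is not the method used in the linked paper.

```python
import random

def simulate_ipc(n_instr, width, window, dep_prob=0.5, max_dist=8, seed=0):
    """Toy out-of-order core: returns instructions issued per cycle."""
    rng = random.Random(seed)
    # producer[i] is the index of the earlier instruction that i depends on,
    # or None if instruction i has no dependency.
    producer = []
    for i in range(n_instr):
        if i > 0 and rng.random() < dep_prob:
            producer.append(i - rng.randint(1, min(max_dist, i)))
        else:
            producer.append(None)

    issue_cycle = [None] * n_instr  # cycle in which each instruction issued
    head = 0                        # oldest instruction not yet issued
    cycle = 0
    while head < n_instr:
        issued = 0
        # Scan the window oldest-first and issue at most `width` ready instructions.
        for i in range(head, min(head + window, n_instr)):
            if issued == width:
                break
            if issue_cycle[i] is not None:
                continue
            p = producer[i]
            # Ready only if the producer issued in an earlier cycle (unit latency).
            if p is None or (issue_cycle[p] is not None and issue_cycle[p] < cycle):
                issue_cycle[i] = cycle
                issued += 1
        while head < n_instr and issue_cycle[head] is not None:
            head += 1
        cycle += 1
    return n_instr / cycle

if __name__ == "__main__":
    for window in (1, 4, 16, 64):
        print(window, round(simulate_ipc(3000, width=3, window=window), 2))
```

In this model the bound is structural: no cycle issues more than `width` instructions, so the average cannot exceed `width`, and dependencies only push it lower. With `dep_prob=0.0` and a window at least as large as the width, the toy core reaches exactly 3.0.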


Since I'm not working nights anymore, I will read and expound upon your purported proof that AMD can't get more "IPC", meaning successfully executed and retired instructions PER SECOND (per cycle is usually a relative term dependent upon instruction length, complexity and code efficiency).

As noted in Part 2 the Alpha 21264 had to in some cases re-issue instructions and it still would DESTROY X86 at half the MHz.

Also, if you refer to the two different approaches Intel's and AMD's patents took for the VAUNTED RHT, it shows that efficiency is the key (just like with the Carnot cycle). Theoretical analog tests can sometimes mimic actual digital OS response curves but once checks and error is applied the actual test algorithm matters as much as the results.

Section 3 describes the problem overcome by Core 2 load before stores and is a microcode based algorithm that uses "probably" 2-4B tags for order in the instruction window (damn that RHT).

By expanding the fetch size to 32B (Barcelona) AMD can add extra instructions to both data and instruction fetches so that more instructions can be loaded before non-conflicting stores - even Intel can't load a value dependent upon a store (unless they use an advanced OS cache that can update the OS stack without going through main memory) - which will enable the retire mechanism to operate more times per cycle on average.


Section 4 shows that FIFO is a key to execution in that if the prescheduler sorts according to an efficient microcode algorithm, 3 FIFOs can actually produce 6 instructions.

By adjusting latency (~resistance in a complex RC serial\parallel circuit) with varying capacitance or resistance values it is possible to optimize the total cycle usage and increase superscalar capabilities.

(sidebar)

Going faster than the speed of sound is impossible.

(endsidebar)

Section 5 describes how increasing registers and parallelization can also increase efficiency and alleviate latencies resulting in increased IPC ( 2.2 is STILL less than 2.8).

Section 6 clearly shows a theoretical 14 IPC by varying line length and buffer size. This is assuming the same 8 issue ideal core used. If we assume a 97% prediction rate for L1 hits and 3 issue while also assuming HT with two loads per cycle @ 128b provides an "unlimited" supply of data, a quick extrapolation will give approximately (14\8) = (x\3) or 5.25 IPC.


The last section (thank god) explains that associativity in caches was not considered, which allows for a theoretical application of a few % points to the HW tally.


How close is a THEORETICAL 5.25 to 6?
 
You're the one with the "holier-than-thou" attitude.


Well, I get to do it because I have pulled off the impossible in my career. My latest triumph was rewriting a very important web application in 1/6 the time it took two other devs. And I rewrote the entire UI and logic code, made it 150% faster and even helped normalize the DB.

I have invented scripting languages for complex Exchange testing, etc. If you would stay off my back my every word would be nice and non-sarcastic but you can't leave it alone so I hope I send you over the edge. Some of you are hanging by a thread.


I mean after all, the description of RHT is right on.


BTW, what does that have to do with AMD's Q3?

Oh yeah, you're jealous of them too.

Oh yeah.. how about showing off your elite coding skills and writing us a nice little program. Doesn't matter what it does.. just a program that showcases your talent?

Until you do that anything you say is and will be taken with a grain of salt. Why? Because what you claim and how you act are completely contradictory.

Only beginners do that. I only code for money.
 
I have invented scripting languages for complex Exchange testing

Writing scripts that perform automatic regression testing on a system or
unit testing is old hat.

Invented is a big word. Did you invent the internet? Because, that
has already been claimed.

Idiots like you wrote the scripts I wrote the parser and execution algorithms.
 
I have invented scripting languages for complex Exchange testing

Writing scripts that perform automatic regression testing on a system or
unit testing is old hat.

Invented is a big word. Did you invent the internet? Because, that
has already been claimed.

Here we go again, BaronMatrix invents something.... I wonder if he applied BaronMatrix Logic® and turned the regression to a progression... maybe that's how he mispredicts the future.

:) :)



it is registered as BaronMatrix nLogic®

:) what does the 'n' mean?


I would think it means "not you" but I guess that would be nyLogic. Wait, I live in ny.

😳

No or maybe it's a play on NUnit. My harness is better as it doesn't require an IDE to write the test cases, so NOT!
 
Wow, you really didn't read the same paper JumpingJack linked. Nowhere does it state that the number of instructions can go over the number of decoders. The ONLY way to do this is to use Fusion techniques, fusing TWO instructions into a single instruction which is then executed. This process is what the Core 2 Duo calls Macro Ops Fusion.

Technically the processor is still only able to decode 4 instructions per cycle, but BECAUSE one of those 4 can be comprised of TWO instructions that are fused together, the Core 2 Duo CAN at times reach 5 instructions per cycle.
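
The "4+1" arithmetic can be written out as a back-of-the-envelope sketch. It assumes the numbers quoted in this thread (4 decode slots, at most one fused pair per cycle, fusion available roughly 20% of the time), which are forum claims rather than measurements, and `decode_upper_bound` is a made-up helper name.

```python
def decode_upper_bound(decode_slots, fusion_rate, fused_pairs_per_cycle=1):
    """Upper bound on x86 instructions consumed per cycle by the decoders.

    decode_slots: decoders available each cycle (4 is the figure quoted here)
    fusion_rate: fraction of cycles in which a fused pair is available (0.0 to 1.0)
    fused_pairs_per_cycle: fused pairs allowed per cycle (assumed to be 1)
    """
    # Each fused pair packs one extra instruction into a slot that is already counted.
    return decode_slots + fusion_rate * fused_pairs_per_cycle

if __name__ == "__main__":
    print(decode_upper_bound(4, 1.0))  # best case, a fused pair every cycle
    print(decode_upper_bound(4, 0.2))  # average ceiling with the quoted 20% rate
    print(decode_upper_bound(3, 0.0))  # three decoders, no fusion
```

This is a ceiling on the front end only. Sustained IPC also depends on dependencies, cache misses and branch mispredictions, so real code lands well below it.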
 
Invent you do... stories that is... :roll:


I could invite you to my job and you'd say I stole someone's card key.

You work as a hotel janitor?

I think he does, he said he doesn't even get to see daylight. I bet he works the day shift and then when he is off it's already dark.

Have you noticed he's changed his vocabulary? He's now probably proofreading everything and copy-pasting from websites to sound more intelligent than he actually is.

I find it hard to argue with someone when they're so sure of themselves but yet so wrong.
 
Invent you do... stories that is... :roll:


I could invite you to my job and you'd say I stole someone's card key.

Why? You couldn't get a visitor's pass to show your workstation, or even talk to a co-worker?

I could invite you to my job, get you a visitor's pass, and show you the FAB I work in now, and the FAB I used to work in. Hell, I could even bring you to the cafe for a coffee, that I would gladly pay for.

Since you should get a promotion or something for inventing this great thing, I'm sure they wouldn't mind you showing some people around, right?

Otherwise, it's BS.
 
I've bolded the bullshit.

Since I'm not working nights anymore, I will read and expound upon your purported proof that AMD can't get more "IPC", meaning successfully executed and retired instructions PER SECOND (per cycle is usually a relative term dependent upon instruction length, complexity and code efficiency).
In an argument about IPC, don't try to change the issue.

As noted in Part 2 the Alpha 21264 had to in some cases re-issue instructions and it still would DESTROY X86 at half the MHz.
X86 != Alpha.

Also, if you refer to the two different approaches Intel's and AMD's patents took for the VAUNTED RHT, it shows that efficiency is the key (just like with the Carnot cycle [Did you also know that the Carnot cycle 1. has nothing to do with the argument 2. is impractical]). Theoretical analog tests can sometimes mimic actual digital OS response curves but once checks and error is applied the actual test algorithm matters as much as the results.
WTF are you smoking?

Section 3 describes the problem overcome by Core 2 load before stores and is a microcode based algorithm that uses "probably" 2-4B tags for order in the instruction window (damn that RHT).

By expanding the fetch size to 32B (Barcelona) AMD can add extra instructions to both data and instruction fetches so that more instructions can be loaded before non-conflicting stores - even Intel can't load a value dependent upon a store (unless they use an advanced OS cache that can update the OS stack without going through main memory) - which will enable the retire mechanism to operate more times per cycle on average.
Cocaine is a hell of a drug.


Section 4 shows that FIFO is a key to execution in that if the prescheduler sorts according to an efficient microcode algorithm, 3 FIFOs can actually produce 6 instructions.
Relevance? I can type gibberish too.

By adjusting latency (~resistance in a complex RC serial\parallel circuit [So, obviously Intel will go bankrupt[/sharikou]]) with varying capacitance or resistance values it is possible to optimize the total cycle usage and increase superscalar capabilities. (WTF? This whole paragraph makes no sense.)

(sidebar)

Going faster than the speed of sound is impossible.

(endsidebar)

Section 5 describes how increasing registers and parallelization can also increase efficiency and alleviate latencies resulting in increased IPC ( 2.2 is STILL less than 2.8).

Section 6 clearly shows a theoretical 14 IPC by varying line length and buffer size. This is assuming the same 8 issue ideal core used. If we assume a 97% prediction rate for L1 hits and 3 issue while also assuming HT with two loads per cycle @ 128b provides an "unlimited" supply of data, a quick extrapolation will give approximately (14\8) = (x\3) or 5.25 IPC.


The last section (thank god) explains that associativity in caches was not considered, which allows for a theoretical application of a few % points to the HW tally.


How close is a THEORETICAL 5.25 to 6?
You can increase your credibility by posting your research papers here. Since you're in computer science, I will link you to a secret tool that our favorite PhD uses.
http://pdos.csail.mit.edu/scigen/
 
I think they're all suckers because Intel knows that Barcelona will be MONSTROUS. If the "IPC-enhanced" cores give them 1.5 loads\retires per decoder, that will theoretically translate into 4.5 IPC.
What if not?
There is no such case in the BaronBS nLogic® algorithm. But it has a very powerful feature, the technology of seeing wishes as reality.
Anyway the BaronBS nLogic® is enabled only on the HORDE operating system.

gOJDO,

You seem to have an inside track on the HORDE OS. I've heard they're coming out with a new optional GUI called "Blinders"TM, which will automatically shut the computer down if it senses something the user doesn't want to see. Can you expand on that?

Thanx
No, there is a new GUI for the HORDE OS, it is now known as "Brainwash" (tm), where you see things through the green filter. The filter displays the things you want to see (there is also the same option for audio) and hides the things you don't want to see. It is a very simple, but very powerful toy for the HORDE OS. It makes your system the best system ever made. There is also such a feature with the blue filter, but it works on another OS.


BaronBS Classics :lol:

And no I don't have a link.
You don't have a brain, also. :wink:
 
No, there is a new GUI for the HORDE OS, it is now known as "Brainwash" (tm), where you see things through the green filter. The filter displays the things you want to see (there is also the same option for audio) and hides the things you don't want to see. It is a very simple, but very powerful toy for the HORDE OS. It makes your system the best system ever made. There is also such a feature with the blue filter, but it works on another OS.
The best feature is when you boot up, it displays... WELCOME GENIUS . Perfect for the HORDE ego. :wink:
 
Since I'm not working nights anymore, I will read and expound upon your purported proof that AMD can't get more "IPC", meaning successfully executed and retired instructions PER SECOND (per cycle is usually a relative term dependent upon instruction length, complexity and code efficiency).

Baron, having re-read the rubbish, I felt compelled to respond to this part.....

I am not trying to prove anything, what I am trying to explain is your ridiculous claim that a 3 issue core can achieve 4.5 IPC.... it is complete and utter nonsense.

I completely expect AMD to improve the IPC of their processor but it will not sustain above 3 instructions retired per clock, it is not possible.

Finally, you have major problems with terms. At the very least you must understand the basic fundamentals before you throw jargon around that you clearly do not understand. IPC is instructions per clock. That is clock ticks or clock cycles. IPS (instructions per second) = IPC X CLOCK SPEED (Hertz).

In basic kindergarten science class we teach things like unit analysis, so follow me here...

IPS (instructions/second) = IPC (instructions/clock) X CLOCK SPEED(clock/second)

The units on the right cancel to give you the units on the left.
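
The unit analysis can be checked with a couple of lines of Python. This is a minimal sketch: the function names and the 3 IPC at 2.6 GHz figures are made-up illustration values, not measurements of any real processor.

```python
def instructions_per_second(ipc, clock_hz):
    """IPS (instructions/second) = IPC (instructions/clock) * clock speed (clocks/second)."""
    return ipc * clock_hz

def instructions_per_clock(ips, clock_hz):
    """Inverse relation: IPC = IPS / clock speed."""
    return ips / clock_hz

if __name__ == "__main__":
    ips = instructions_per_second(3, 2.6e9)   # hypothetical 3 IPC at 2.6 GHz
    print(ips)                                # instructions per second
    print(instructions_per_clock(ips, 2.6e9)) # back to instructions per clock
```

Doubling the clock doubles IPS but leaves IPC unchanged, which is why the two terms are not interchangeable.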

You are not very bright and you should avoid making insanely ridiculous claims that put you on the intellectual level of a three-toed sloth.


Yeah right. I know the difference between IPC and IPS. I even mentioned that IPC varies depending upon instruction length and complexity, while IPS is an aggregate of clock cycle averages.

Talk about bragging.
 
Trying to drag this back to the original topic...

In the third quarter of 2005, excluding the Memory Products segment(1), AMD reported sales of $1.01 billion and operating income of $129 million. In the second quarter of 2006, AMD reported sales of $1.22 billion and operating income of $102 million.
Considering the dramatic price decreases of its desktop processors over the past three months, AMD delivered a solid third quarter result. Sales increased almost 32% from $1.01 billion to $1.33 billion year-over-year. Operating income retreated from $129 million to $119 million in the same time frame.
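
The percentages follow directly from the quoted figures. A quick sketch of the arithmetic, where `pct_change` is a made-up helper name and the inputs are the sales (in $ billions) and operating income (in $ millions) numbers quoted above:

```python
def pct_change(old, new):
    """Percentage change from old to new."""
    return (new - old) / old * 100

if __name__ == "__main__":
    # Year-over-year: Q3 2005 (excluding Memory Products) vs Q3 2006
    print(round(pct_change(1.01, 1.33), 1))  # sales
    print(round(pct_change(129, 119), 1))    # operating income
    # Quarter-over-quarter: Q2 2006 vs Q3 2006
    print(round(pct_change(1.22, 1.33), 1))  # sales
    print(round(pct_change(102, 119), 1))    # operating income
```

Sales come out at about 31.7% up year-over-year, matching the "almost 32%" in the article, while operating income is down roughly 7.8% over the same period.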

This is what will hurt AMD big time starting next quarter. As I stated in my other post they will be selling more processors (thanks to Dell) but making less profit. This is a direct consequence of Intel's strategy with Core 2 Duo pricing.

These figures are actually the way that AMD has to move forward now and is the only way that their new business next year can succeed once the costs of the ATI acquisition are taken into account.

At this point, cashflow from sales becomes far more important than profit income as there will be a far greater debt burden to manage.

It's a classic setup of a growth company expanding rapidly and the model works, but would fail when the growth is not sufficient. However, the benefit to this kind of model is of course that the creditors can't afford for you to go out of business either, so they don't let you fail. Etc etc.
 
I really don't understand why certain people dwell on a company's losses or profits, in this case AMD's or Intel's.

Money is the driving point of anything in the market. Doesn't mean a thing otherwise, as long as the company/people make their pile of money.

I mean greed is the main root of this evil, and so the only thing running through the companies' minds is to make more, not to mention those who are into the stock exchange to make a profit. How it works, why it works, how well it works doesn't matter, as long as it sells. Otherwise, I think CPU technology would be farther ahead, instead of milking people for profits.

Just a simple straight forward question:

So why burden your thoughts on company profits when the majority of us just want to find the best price/performance of certain components? Or to understand how technology works behind it?

I pretty much thought this is one reason why there's a CPU forum to learn these things. If I was into stocks, making money, I'd prolly be more into this kind of thread. :lol:
 
Invent you do... stories that is... :roll:


I could invite you to my job and you'd say I stole someone's card key.

You work as a hotel janitor?

I think he does, he said he doesn't even get to see daylight. I bet he works the day shift and then when he is off it's already dark.

Have you noticed he's changed his vocabulary? He's now probably proofreading everything and copy-pasting from websites to sound more intelligent than he actually is.

I find it hard to argue with someone when they're so sure of themselves but yet so wrong.

Sounds like the definition of your bias to me. You wanted me to take this more seriously. I told you you didn't, let's just keep it fun, but noooo.
 
Invent you do... stories that is... :roll:


I could invite you to my job and you'd say I stole someone's card key.

Why? You couldn't get a visitor's pass to show your workstation, or even talk to a co-worker?

I could invite you to my job, get you a visitor's pass, and show you the FAB I work in now, and the FAB I used to work in. Hell, I could even bring you to the cafe for a coffee, that I would gladly pay for.

Since you should get a promotion or something for inventing this great thing, I'm sure they wouldn't mind you showing some people around, right?

Otherwise, it's BS.

Ok, meet me at 55th and 6th Ave in NYC and I'll show you my Home Office and take you downtown to my current assignment with the city of NY.
 
Invent you do... stories that is... :roll:


I could invite you to my job and you'd say I stole someone's card key.

You work as a hotel janitor?


And so the fool debases himself. The besmirching of character in a serious conversation shows a lack of couth on your part that isn't surprising given your penchant for partiality.

Begone.
 
Invent you do... stories that is... :roll:


I could invite you to my job and you'd say I stole someone's card key.

Why? You couldn't get a visitor's pass to show your workstation, or even talk to a co-worker?

I could invite you to my job, get you a visitor's pass, and show you the FAB I work in now, and the FAB I used to work in. Hell, I could even bring you to the cafe for a coffee, that I would gladly pay for.

Since you should get a promotion or something for inventing this great thing, I'm sure they wouldn't mind you showing some people around, right?

Otherwise, it's BS.

Ok, meet me at 55th and 6th Ave in NYC and I'll show you my Home Office and take you downtown to my current assignment with the city of NY.

Okay. So, anyone that shows up at 55th and 6th Ave, will find you how?

You'll have some sort of sign or something to point you out? I hope so.

What time? Can we be more vague? I can post a street number too, you know.
 
Okay. So, anyone that shows up at 55th and 6th Ave, will find you how?

You'll have some sort of sign or something to point you out? I hope so.

What time? Can we be more vague? I can post a street number too, you know.
Look for confused, stupid faces with a dumb look. He is the dumbest, you can't miss him.