AMD's rate of speed increase is outpacing Intel's.
It's been shown here that AMD's "IPC", if you will, isn't as bad as some thought, and compared to BD, PD showed huge gains.
SR hints at another set of nice gains, again outdistancing Intel's growth. And just look at the first APU iterations: with no clock speed bumps, no tweaks, and using the old VLIW approach, it was miles ahead while being held back by bandwidth, where the competition wasn't even fast enough to take advantage of the bandwidth it had.
Moving up to GDDR5 with its bandwidth, using GCN 1.1 if you will, improving clocks and power, then improving power management, where AMD has been further behind Intel, we will again start seeing faster growth here as well, allowing for a more sustainable turbo, which Intel already enjoys. That is why, unless these chips are locked at a given frequency, this advantage has so far been huge for Intel, even though the growth is less on their side: HSW adds a better gfx solution and in-house VRMs, but the TDP is up in high usage, and the "IPC", if you will, is very flat.
So, AMDs gfx solutions, huge upside.
CPU, nice upside.
Power management, huge upside there as well. Until we see the full growth of the BD design, which we won't until EX, where perf is supposed to skyrocket, I can't see why people won't be picking up these chips, and future ones as well. That's a lot more than could be said before RR, when they let BD out before it was truly ready, which is why we see the current downtrend; but those chips aren't PDs or SRs, and these are right around the corner, in varying iterations.
But the difference in IPC currently is only 32 IPC (8 core AMD) vs. 36 IPC on an i7 SB...
The difference is only in the efficiency of instruction handling... or were you aware of the actual architectural IPC values?
IPC is not the gap... the front end architecture is... that's why Steamroller will be so amazing: a 30% increase in effectiveness and an IPC increase from 32 to 44... (which will be more IPC than Ivy, and likely Haswell, with the exception of the 6 core Intels; they will be 48 IPC in Ivy-E)
IPC is NOT a flat value, and varies by workload. Throw in the shared backend, and you can easily get pipeline stalls (which is blamed as one of the reasons for BD's poor single threaded performance). This is especially notable in FP workloads, as the FP scheduler isn't shared like the integer one is.
So anyone claiming to have absolute numbers on IPC is kidding themselves. You can only solve IPC PER APPLICATION.
I'll use this as an example of the math:
As it's single threaded, I can safely disregard any pipeline stalls. I'm going to do the math assuming no turbo for simplicity's sake though, so these numbers won't be perfect...
Hence why the i5 at a lower clock still beat the 4300: superior IPC. But note how IPC only dropped by 6, compared to the drop of 19 for AMD: this indicates that the i5 actually scales better than BD/PD (again: could be pipeline stalls, lack of core loading, and other factors).
Same story here: IPC drops slightly (7.5), but less than half as much as AMD's. Further indication there's a problem somewhere as the workload starts to scale (not a good sign for an arch designed to scale).
As far as this PARTICULAR application goes: Intel has far superior IPC, and scales better in the multithreaded bench, even if it loses in pure performance.
Further, you can break down IPC per core, which nets this:
AMD FX-4300: 7
Intel i5 2500k: 19.25
Intel i5 3570k: 20.375
Hence why AMD only wins when clocked higher and when all the cores are used: its per core performance is less than half as much. (Do remember that shared backend though.)
So yeah, your IPC numbers are kinda worthless in hindsight. You can only solve PER APPLICATION, as loading factors will vary with different apps.
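The arithmetic behind those per-core figures is just a division. A minimal Python sketch (the whole-chip estimates 28, 77 and 81.5 are the values implied by the per-core numbers listed above, and this is a benchmark-derived proxy, not true hardware IPC):

```python
def per_core(ipc_estimate, cores):
    # Split a whole-chip, application-level "IPC" estimate evenly
    # across cores. This is a relative proxy, not real retired-IPC.
    return ipc_estimate / cores

print(per_core(28.0, 4))    # FX-4300  -> 7.0
print(per_core(77.0, 4))    # i5-2500k -> 19.25
print(per_core(81.5, 4))    # i5-3570k -> 20.375
```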
Pixels per second is not tied to IPC at all...IPC, without having access to the base code of an application, is a Unicorn, in that you and I cannot feasibly determine this number.
IPC has engineered maximums built into the architecture of the CPU. The variance between programs stems from applications operating below the engineered maximums to varying degrees.
There are some circumstances where a program could theoretically reach the engineered maximum IPC, though this would require an extremely simple program coded beyond humanly well, where all the single instructions processed eliminated larger groups of instructions as they were processed perfectly. I am not sure that any program currently coded today can achieve this utilization of architecture. This would be..."the perfect program". Nothing is perfect...and as I said, while it may be theoretically feasible...it's not at all likely.
That link shows CPU workloads...notice pixel rendering is not among the listed items?
That measures the system's capability to run said program at X speed. It gives you a metric to compare, but you can reasonably conclude nothing from it other than that CPU X runs that program better than CPU Y, based on the performance of the system it is installed in.
That talks specifically about IPC... note... they talk about IPC values relatively, without specifics, but you can see that they clearly say that newer architectures use higher clock speeds with lower IPC. Meaning that while the AMD Athlon could theoretically process more instructions per cycle than the FX8350, the FX8350 is technically faster, because 32 IPC @ 4 billion cycles per second is more than 36 IPC @ 1 billion cycles per second. In 1 second the FX8350 can compute an engineered maximum of 128 billion instructions based on architecture, while Intel's i7-3770k can theoretically process a maximum of 122.4 billion instructions per second (both figured at stock clocks). This shows 2 things... 1.) Intel does not have an advantage comparing CPUs from a "raw hardware power" standpoint. 2.) AMD should actually pull ahead if they can get the wrinkles ironed out of their architecture, by shortening pipelines, adding a second decoder on the front end, and decreasing the time lost executing instructions that were mispredicted.
See, a CPU will make "calculated guesses" as to what needs to come next on the instruction path... if it guesses wrong... it clears its cache and starts over. This is what determines the efficiency of the CPU. If people code more specifically for your hardware or architecture design... you gain an advantage in efficiency in your CPU... making it appear faster when it's really not... it's just more efficient.
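The "engineered maximum" arithmetic in that paragraph is just instructions-per-cycle times cycles-per-second. A quick sketch, using the post's own claimed IPC ceilings (which are themselves disputed later in the thread):

```python
def max_instructions_per_sec(ipc_ceiling, clock_hz):
    # Theoretical ceiling: (instructions/cycle) * (cycles/second).
    return ipc_ceiling * clock_hz

fx8350   = max_instructions_per_sec(32, 4.0e9)  # 128 billion, as stated
i7_3770k = max_instructions_per_sec(36, 3.4e9)  # 122.4 billion
print(fx8350, i7_3770k)
```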
This is why AMD architecture in consoles is such an enormous victory for AMD, that means developers will be coding specifically for their hardware. Which means not only is their design going to become more efficient, but the coding for it will become more efficient to maximize efficiency of instructions for the games being designed.
So, you're chasing a unicorn trying to prove Intel's IPC is significantly higher than AMD's. I gave you the engineered maximums... that is what it is. You cannot change it. Right now, Intel is likely 20% more efficient in general, especially in single threaded tasks. But when Steamroller comes with 44 IPC @ 4.5 billion cycles per second, with better optimization and better software utilization of the architecture, you will see a shift in the balance of things.
I hope you're picking up what I am putting down... you're talking about system optimization and claiming it has something to do with IPC. It does not.
You did a bit of a bait and switch (probably unintentional) with your numbers. For the multi-thread you used the 6300 instead of the 4300.
It should be IPC = 727.9 / (4 * 3.8)
IPC = 727.9 / 15.2
IPC = 47.9
Per core that is 12 instead of 7.
Still not a rosy picture, but not quite as bad as your original.
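The corrected arithmetic is easy to check; a quick Python sketch (727.9 being the FX-4300 multithreaded score referred to above):

```python
score = 727.9                           # FX-4300 multithreaded result
cores, clock_ghz = 4, 3.8
ipc_est = score / (cores * clock_ghz)   # note the parentheses
per_core_est = ipc_est / cores

print(round(ipc_est, 1))      # -> 47.9
print(round(per_core_est))    # -> 12
```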
Look at how IPC is calculated... the IPC of the CPU in his example is 16... it's an Intel... there is no such IPC as something even in the 50s, much less in the 70s... unless we're talking something on the order of high end server CPUs, which we're not.
You're not anywhere near the ballpark of IPC with pixels per second... if the planet Earth was IPC... you'd be in the Andromeda galaxy, about 2 million light years away.
EDIT: To give you an idea of theoretical performance...
These are maximum theoretical single precision GFLOPS capability for both the i7-3770k and FX 8350:
i7-3770k: 112
FX8350: 256
Maximum theoretical double precision GFLOPS:
i7-3770k: 28
FX8350: 64
Now maybe you can grasp what I am talking about since we are comparing "in intel terms".
As you can see, the raw muscle available is basically double in favor of the FX8350, even including hyperthreading.
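Those GFLOPS figures follow from cores × clock × FLOPs-per-cycle-per-core. In the sketch below, the FLOPs/cycle values (8 SP, 2 DP per core) are my inference of how the quoted numbers were derived, not something stated in the post:

```python
def peak_gflops(cores, clock_ghz, flops_per_cycle):
    # Theoretical peak: cores * GHz * FLOPs per core per cycle.
    return cores * clock_ghz * flops_per_cycle

print(peak_gflops(4, 3.5, 8))   # i7-3770k SP -> 112.0
print(peak_gflops(8, 4.0, 8))   # FX-8350  SP -> 256.0
print(peak_gflops(4, 3.5, 2))   # i7-3770k DP -> 28.0
print(peak_gflops(8, 4.0, 2))   # FX-8350  DP -> 64.0
```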
It would be nice to see Kaveri compete with an 8150.
Everyone wanted the old chips to undergo shrinks etc., but they aren't as scalable as Kaveri will be, and don't have the power/perf Kaveri will have, etc.
Relying on one lopsided benchmark will give you your lopsided point of view.
How about using something that's more the norm as AMD/Intel goes? 50%, rofl. On average it's actually closer to 20%, as seen here. After all, if it's the 50% that you claim, this should never happen:
So what's the problem with iTunes? Why is it so far off from all other results? Does it actually reflect RL results? Only if you use it, otherwise NO.
Kinda funny that iTunes and Cinebench run close to the same single threaded result, and Cinebench is known to be compiled with ICC.
You are absolutely correct to state that I shouldn't have looked at a single benchmark to gauge IPC. Taking the average IPC of a few single threaded tests would have been more fair.
Where your logic fails, however, is in the multi threaded benchmarks you have posted. You don't judge the IPC of a single core with any application based multi threaded benchmarks, especially considering that the 2 architectures handle multiple threads differently (HT vs the module concept).
Again, for your single threaded Cinebench test, considering turbo speeds,
In other words, here ALSO, at equal clock speeds, Intel is 37% faster, not a mere 15-20%.
AAC encoding might have been a worst case AMD scenario or something, that was my bad. But if you use the same logic on most other single threaded stuff, you'll find that intel is usually more than 30% faster at least.
Correct me if I'm wrong.
PS: I intentionally left out turbo core from the previous calculations, because we were comparing the multi-threaded performance of a Centurion vs an SB-E, and all-core turbos are usually not that dramatic. It simplifies the guesstimates by quite a bit. However, in real life turbo scenarios, Intel will pull even further ahead, as the SB-E turbos up by 18%, while the Centurion can manage only 10%, if the 5.5GHz max turbo story is true.
Nice work noob. Whoever says Intel is 50+% faster has certainly bitten the sensationalist apple; basically, Intel and Nvidia have been able to spin up stories that are completely falsified, with the effect of taking marketshare.
Anyway, at no point was Intel 50% faster in x86 operations; that number is inflated, and I would say the best case scenario is a 15-20% top limit.
The first bit is MEME ready. "In real life, I Itunes"
Basically that's what it ultimately is. Whoever said that iTunes was ever a meaningful synthetic? First, it's Apple, and well, that about ruins it for life; honestly, have you seen how rubbish it is to navigate around? I hate to say this, but Windows Media Player is infinitely better than iTOONS.
Again, it's the way Intel is marketed, and not without the compliance of online review sites; they can mask the truth to create a sensation that Intel is 2x faster than an AMD system, when, as most of us who own both will tell you, that's hardly true, and only in isolated instances; when you scale to real world apps it never manifests into major differences.
It's pretty obvious that you're an AMD fanboy, but no worries, as I am actually a bit biased towards AMD myself. But to blindly close your eyes and not see the truth is wrong.
Your 15-20% faster is more like the lower limit, not the upper one!! As I've just shown with 2 calculations, in which Intel is 37% and 50% faster. It's a reproducible calculation with most single threaded tests, and you can run them yourself when you have time.
There's nothing wrong with having subpar single core performance, especially when you can more than make up for it in multi threaded workloads like AMD does.
For example, encoding a song is only a matter of seconds, and is irrelevant, but when re-encoding videos with x264, for example, AMD usually provides much more performance per $, and that I like. The i3 vs FX 6300 is a shining example of this.
But by today's demands, and certainly from tomorrow onwards, single core becomes less relevant.
You could say that beefing up single core perf is the wrong direction if your pricing model gets beaten in ever growing MT apps.
Now, going to more cores, of course you would see an even greater gap, and wins by Intel where it currently loses, but then again, you have the price of doing so.
I guess what we need is the old car analogy, where everyone would love to have a McLaren, or at least its engine in their car, but when a Porsche is so close.....
So again, it could be argued, Intel isn't offering better to most, but only to a few; it all depends on how you look at it.
The issues lie in your "IPC" calculations... you cannot deduce IPC from anything anyone has posted... it's not feasibly possible. The limits for IPC are hard wired into the architecture because they are hardware limited. None of the numbers are anywhere near what some of you are talking about either. IPC over 40 is a bit crazy... yes, it is dictated by a program's coding, but programs never exceed the engineered maximum... that's why overclocking has become so popular... you can't increase IPC, but you can increase the clock cycles per second.
IPC has little relevance to software... the term is thrown about far too often. IPC is an architecture term from engineering, describing the maximum theoretical capabilities of the hardware itself. Somehow or another it got attached to software, when it actually has nothing to do with software.
Coding efficiency and optimization of architecture are the advantages Intel has over AMD...those are the only advantages...
If you look at them on paper...the AMD is easily a LOT more CPU...
The issue is the hardware is not being advancely optimized by the coders...once that occurs...look out.
No he's correct, it's an eventuality not a probability. Commodity software is just now getting to the point where it can do multiple threads worth of work simultaneously.
The overuse and incorrect use of the term "IPC" is something I've been warning people about for ages. "Per-clock" means absolutely nothing, it's performance vs cost vs energy usage.
We have been saying this for a while: AMD's arch may not be perfect, but somehow it isn't as bad as it's made out to be, bar a few exceptions in synthetics which are latched onto like a pitbull on a chew toy. So if I have taken the minimum and you have stretched the maximum, it would be somewhat ironic if it's the in-between, ie: 25%, but such is life.
I just said that SR has to stay relatively close to Intel, ie: within 10%, to pose a bit of a problem for Intel, and since improvements are carried out across the line, that will mean Kaveri will have similar per core performance, which may very well be the arch that is the salvation of the desktop market. Competition is good and AMD's innovation is exceptional; driven by Keller, anything is possible. But just for the record, I haven't said that AMD will be better, nor do I expect it; Intel's process is too advanced.
[edit] The quote was Shawn's not Bobby's dunno what happened there
mayankleoboy1 :
The issue is the hardware is not being advancely optimized by the coders...IF that occurs...look out.
I don't expect SR to beat Intel in flat per core performance, just "touch wood" close the gap to within acceptable levels. That will make 2014 a very exciting year.
If stock AMD gets within 10% of stock Intel, AMD gets my money.
My question to you: "Which Intel CPU are you talking about?" Because I'm more than positive AMD is going to put the hurt on the i3s, heck, maybe even the i5s, if their 6 core Kaveri is decently clocked.
We've been 'getting there' ever since AMD hex core CPUs came out. When the heck will we actually 'get there'? It's 2013, and the majority of Thuban owners either switched to Intel or moved to PD CPUs, rendering the Thuban purchase decision at that time (for the multicore 'revolution') pointless. At that time, what those guys got were low clocked, power hungry CPUs that didn't scale with workload and needed to be overclocked. AMD hypes multicore and multithreading a lot but barely supports it in software either; that wouldn't have happened if programmers wanted to make multicore-friendly software or had any say in hardware/SDK development. For example, does anyone here know how CodeXL is coming along? Where is AMD's own compiler? The continuous negligence from red, green and blue (in seeking and promoting instruction support for the mainstream) resulted in the current skeptical and cynical mindset - something AMD needs to change. Multicore/multithreading is badly needed now more than ever.
The way I see it is, as nodes become more expensive, and in some cases less effective as well, thus becoming even more exotic (coatings etc.) and more expensive yet to retain perf/power, you will start seeing more and more SW doing MT and the like, and better use of extensions etc.
Until recently, and still today, CPUs have been brute forcing it, especially Intel, which is why we see their SMP doing so well.
This is sort of like extra money in the bank; brute force can do that.
As SW becomes more demanding, it will either slow down, or become more innovative.
Since AMD is playing catch up, and doesn't enjoy nearly as much of this perf/power advantage, the ongoing tweaks will matter more to them, as there is still some low hanging fruit for them, allowing for some brute forcing of their own, as they have solutions for MT, where they compete quite well.
Going forwards, using bigger chips for DT, which is in decline, means business models have to change, where depending on ROI, it can and will have an effect.
Will we see a future where good enough has driven the DT market into SFF with connects, and DT as we know it today becomes expensive super computers, where discrete gfx and huge CPUs still rule, but at costs much higher than we see now, only to sustain a smaller market, within which each node becomes more elaborate and expensive, with APUs smothering half of the current family of discretes, etc.?
I think it's on its way.
The current players and their respective business models will play into all this, as they're mostly publicly held, and so must maintain certain levels of ROI.
Currently, if this is to be what's going to happen, look at current pricing in the CPU and gfx markets, and where each brand chooses to set their product prices.
How is this looking going forwards, all things being somewhat equal?
(I REALLY need to stop doing major posts when I'm exiting out the door, because I always screw something up)
Technically, correct. There is one value missing from the equation I used: the number of instructions the application executed. As a result, the IPC values used are higher than what is feasibly possible.
HOWEVER:
Assuming the code paths for the CPUs are more or less the same (which basically means limited use of CPU opcodes that are Intel/AMD only, and a non-idiotic compiler), the Number of Instructions should be more or less the same between Intel and AMD (within 5% or so), so while you can't measure IPC directly, you CAN get an estimate at the relative performance difference between them, since the Number of Instructions can be estimated as just a scalar value to the rest of the equation.
For the POV-Ray multithreaded benchmark, the FX-4300 (47.9) and Intel i5-2500k (77) numbers would need to be scaled by the number of instructions the benchmark actually executes to get real IPC values. Assuming this is more or less the same for both CPUs (which may or may not be a valid assumption), you can measure the relative performance difference in IPC, which nets you:
100 - (47.9 / 77 * 100) = ~38% IPC advantage in favor of Intel for this particular benchmark.
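That percentage works out as follows; a quick sketch using the same numbers:

```python
def relative_gap_pct(slower, faster):
    # How far the slower chip trails the faster one, in percent.
    return 100 - (slower / faster * 100)

gap = relative_gap_pct(47.9, 77.0)
print(round(gap))   # -> 38
```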
And yes, again, this is why I need to stop posting when I'm walking out the door at work.
Nice try, but crude math doesn't pay off. Look at the 2nd part that you missed. Intel CPUs will never run at their rated speed unless you put them in an oven and raise the temperature to the point that they throttle. You also can't assume "max turbo"; more on that later, but let's examine "turbo values" just for the sake of it, ignoring the actual running speed.
Look at AMD's numbers:
238 for the 4300, 3.8GHz base, 4.0GHz turbo
242 for the 8320, 3.5 base, 4.0 turbo ...
252.1 for the 8350, 4.0 base, 4.2 turbo
So at turbo speeds, 238/4.0 = 59.5 per GHz for the 4300, 240.7/4.0 = 60.2 for the 8320, and 252.1/4.2 = 60.0 for the 8350.
That would mean that the 8320 is the best AMD CPU, because its IPC is highest... This is why you can't assume turbo speed = actual speed.
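The sensitivity to which clock you divide by is easy to demonstrate. A sketch using the scores above (I use the 240.7 figure for the 8320, since the post quotes both 242 and 240.7 for it):

```python
# (score, base GHz, turbo GHz) as quoted in the post
chips = {"FX-4300": (238.0, 3.8, 4.0),
         "FX-8320": (240.7, 3.5, 4.0),
         "FX-8350": (252.1, 4.0, 4.2)}

per_ghz_turbo = {k: s / t for k, (s, b, t) in chips.items()}
per_ghz_base  = {k: s / b for k, (s, b, t) in chips.items()}

# Dividing by turbo clocks makes the 8320 look "best" per GHz,
# which is the absurdity the post is pointing out:
print(max(per_ghz_turbo, key=per_ghz_turbo.get))   # -> FX-8320
```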
Now let's look at the Intel side:
312.4 for the 3770k, 3.9 turbo = 80.1
302.2 for the 3570k, 3.8 turbo = 79.5
288 for the 3470, 3.6 turbo = 80
266.7 for the 3220, no turbo, 3.3GHz = 80.8
The ratio of 80 to 60 is 33%, not 37%.
Let's look back a page: "In other words, SB is 1/0.66 => 1.51x PD IPC, ie, its 51% faster."
Let's check that statement, since you changed the comparison as Ivy didn't gain much in iTunes:
274.9 for the 2500k at 3.7 turbo = 74
So now you've lost that argument even more deeply (74/60 = 1.23, in other words 23%, not 51%; you just lost 55% of your lead). Instead, compare it to Ivy Bridge, since it's faster than SB. That's all assuming the CPUs run at their turbo speeds.
The part you misinterpreted about the multi-threaded test: in theory, if said program scales 100%, then IPC for said program = single core IPC × speed × cores. Scaling can be calculated by comparing against RL results how well the CPU actually does.
3570k: 80 (IPC per GHz) × 3.4GHz × 4 = 1088. Actual = 1108.9, a scaling of 102%. How can you scale over 100%? Remember I said there'd be more on that subject; here it is. Intel CPUs never run at their "rated speed". It's all a marketing gimmick to give you the impression that the CPU is faster than it actually is, if you assume it runs at its "rated speed". Without monitoring the CPU speed throughout the test, determining core scaling accurately is near impossible; same problem with trying to calculate "IPC" (or in this case, IPC per GHz).
But let's see what happens anyway, using the flawed values.
Let's see how well HT scales in comparison:
3770k: 80 × 3.5GHz × 4 = 1120; actual 1363.6 = ~120% scaling, minus the 102% that the 3570k showed = 18% HT boost.
What was that quote again? "Assuming HT brings in a 30% boost on average,"
Don't think I have ever seen HT boost even close to 30%.
Let's see how the 8350 scales:
60 × 4.0 × 8 = 1920; actual = 1504.4 = 78% (pretty much right where AMD stated: a module = 80% of a dual core).
So let's examine again what I said:
"After all, if its the 50% that you claim, this should never happen:"
So, assuming what you claimed is the truth, the 8350's actual number should have been (3770k IPC of 80) × 66% (1/0.66, remember, but we are going in reverse to calculate AMD) × 4.0GHz × 8 cores × 78% scaling for AMD's architecture = 1317.
Or vs SB: 2500k = 74 × 66% × 4.0GHz × 8 cores × 78% = 1219.
So if AMD's IPC really were 66% of Intel's SB (ie, Intel 51% faster), the maximum 8350 score should have been 1219, staying well behind the 3770k (ie: it should never happen).
Clearly not the case, so clearly Intel's SB is not 50% faster than AMD; not even IB is.
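That consistency check can be written out as a sketch, using the numbers from the post (80/GHz for the 3770k, the 0.66 ratio being tested, and the 78% CMT scaling observed above):

```python
def predicted_mt_score(per_ghz, clock_ghz, cores, scaling):
    # Project a multithreaded score from a per-GHz rate:
    # rate * clock * cores * thread-scaling factor.
    return per_ghz * clock_ghz * cores * scaling

# If FX per-clock throughput really were 66% of the 3770k's 80/GHz:
ceiling = predicted_mt_score(80 * 0.66, 4.0, 8, 0.78)
print(round(ceiling))   # -> 1318 (the post truncates this to 1317)

# The actual FX-8350 score quoted is 1504.4, which exceeds this
# ceiling, contradicting the "Intel is 50% faster" premise.
assert 1504.4 > ceiling
```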
It should be IPC = 727.9 / (4 * 3.8)
IPC = 727.9 / 15.2
IPC = 47.9
This assumes 100% core scaling; you can't calculate in this direction on a multithreaded test, as already shown, especially when HT or CMT is involved.
In other words, all these futile attempts to calculate "IPC" are flawed. All we can do is take the values for what they are for any given benchmark. IPC is impossible to determine with any accuracy and carry from one program to the next. Sure, you could guesstimate to within some margin of error, but does that make it correct?
It's why I refrain from getting into these "IPC" and "per core" arguments; most are so redundant in daily operations for a normal user that you cannot realistically tell the difference. Yet factors such as "SMT" are dismissed when we are clearly moving to that, and "HSA" is another very distinguishable facet that is overlooked so that we can beat on AMD's "per core" all day long. Some will throw the fanboy label around, but nobody ever said we don't want more. AMD is doing what they had to do all along when opting for a new type of system architecture: they had to roll through the road map and suffer the growing pains with it. Ultimately there are facets of AMD's arch that are very impressive, so perhaps this "fail" moniker is branded all too easily by people who seem to have an incentive to pander Intel around as the best thing since sliced bread.
I do, on the other hand, use Intel for specific system build requirements; I run a dual Xeon hex core with 580s and a 7970 for purely crunching reasons, and it does a splendid job at that. For regular consumer grade systems - gaming, media etc. - I find the Intel lineup duller than a dead cow's eyes; when the temporal boredom sets in, they begin to agitate me, so I don't buy them anymore. If we are insistent on playing the "blind eye" game, then why not focus on areas where AMD is actually doing well? Sure, the APU may be loosely defined as x86 with graphics, but only a naive person would be so narrow in mindset; the APU represents the future of heterogeneous system architecture, which has in its limited time come on leaps and bounds, already flexing its robust performance over pure x86. To do that you need a good graphics core, and AMD is giving Intel a very mighty spanking here. Should people focus a bit here, or are we an "AMD fail only zone" like the muppets at Anandtech?