AMD CPU speculation... and expert conjecture

Page 105 - Tom's Hardware community
Status
Not open for further replies.

lilcinw

Distinguished
Jan 25, 2011
833
0
19,010
I think that we would all agree that what gets thrown around as 'IPC' these days bears little resemblance to the original definition. The problem is that 'IPC' is a widely accepted term that conveys the basic idea being discussed; a given CPU can process data quantity (x) in time (y) at clockspeed (z).

Until an alternative phrase becomes generally accepted people will continue to use 'IPC' incorrectly for convenience's sake.
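The working definition above ("a given CPU can process data quantity (x) in time (y) at clockspeed (z)") can be sketched in a few lines. This is only an illustration of the colloquial 'IPC' being debated, with made-up numbers, not a real benchmark:

```python
# Rough sketch of the 'IPC' people actually compute in these threads:
# work done per GHz-second, not literal instructions per cycle.
# All numbers below are invented for illustration.

def effective_ipc(work_units: float, seconds: float, clock_ghz: float) -> float:
    """Work processed per GHz-second: x / (y * z)."""
    return work_units / (seconds * clock_ghz)

# Two hypothetical CPUs finishing the same workload:
a = effective_ipc(work_units=1000, seconds=10.0, clock_ghz=4.0)   # 25.0
b = effective_ipc(work_units=1000, seconds=12.5, clock_ghz=3.2)   # 25.0
print(a, b)  # same 'IPC' despite different clocks and runtimes
```

Note how two chips with different clocks can land on the same figure, which is exactly why the metric gets used as a clock-normalized comparison.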

P.S. Watching these threads (BD, PD, now SR) devolve into the same discussions each time is kind of like Groundhog Day. The first time it was educational since I was still newbish but now just feels like a broken record. Next we will be arguing whether or not people can see more than 60 FPS and why films get away with 24 FPS (or did we already?).
 

8350rocks

Distinguished


No, I debunked your entire argument earlier...and you just glossed over it without responding.

IPC != pixels per second.

It's not even a software term...you guys throw it about like it is...but you cannot rationally determine how many instructions each CPU is actually processing. All you can determine is how many pixels per second the system is rendering. Gaps there can stem from less optimal utilization of GPU, they can stem from less optimal utilization of memory bandwidth, they can stem from inefficient coding from a compiler (see: ICC), and many, many more things...

Utilizing the same hardware for benchmarks is scientific, but not necessarily the most accurate display of capabilities. Tom's Hardware just showed last week that AMD GPUs perform slightly worse than comparable Nvidia GPUs when paired with AMD CPUs. The same article also showed that Nvidia GPUs perform slightly worse than AMD GPUs when paired with Intel chips. Ironic, is it not? With that information in mind, even the choice of GPU can dramatically impact the pixels per second alone.

To get the most accurate comparison, you would need to take two GPUs benchmarked to show similar rendering capabilities with each CPU (they may not be the same card). You would need RAM that has XMP profiles for Intel and AMP profiles for AMD (likely not the same RAM). Those levels of optimization alone will skew the scores of such benchmarks drastically in one direction or the other.

So much of what you are saying bases its conclusions on scientific results taken out of context. Suppose those benchmarks were run with RAM that only had AMP and not XMP, and that they used an Nvidia GPU...want to bet the results wouldn't be 30% closer?

One thing I learned long ago was this: numbers and benchmarks can be manipulated to show whatever you want, whether it be intentionally or inadvertently. So take a benchmark, any benchmark, with a grain of salt.
 

lilcinw

Distinguished
Jan 25, 2011
833
0
19,010


If that rig has any downtime you should think about setting it up for Folding@Home (for Tom's team of course). You would be a welcome addition to the team.
 


It runs as part of our local Community Grid team, and it does cost a lot of money :D. It also does folding, but I never knew THG had a folding team.

 

mayankleoboy1

Distinguished
Aug 11, 2010
2,497
0
19,810


+1

Quote:
P.S. Watching these threads (BD, PD, now SR) devolve into the same discussions each time is kind of like Groundhog Day. The first time it was educational since I was still newbish but now just feels like a broken record. Next we will be arguing whether or not people can see more than 60 FPS and why films get away with 24 FPS (or did we already?).

+10

 

8350rocks

Distinguished


No, you'll get the number of properly executed instructions per cycle.

Which again breaks down to efficiency. You're comparing a refined architecture that's been improved many times to a 2nd-gen architecture. The efficiency is going to be drastically better for Intel. That's what it is...fewer mispredictions through the front end of the architecture allow Intel to run more efficiently at the CPU level. Though I still do not accept your argument entirely, on the grounds that the systems were not optimized for the individual architectures.

If Intel were 51% better than AMD...then how on earth does the FX8350 beat Intel's i5-3570K and i7-3770K at so many things? Is it just 175% better at certain things? How do you close the loop on your logic?

Your math is flawed...and so is your logic.
 

Cazalan

Distinguished
Sep 4, 2011
2,672
0
20,810


APUs aren't terribly exciting when a $100 discrete card still blows them away. When the PS4 APU goes mainstream, now THAT will be an APU worth boasting about.

AMD has a lot on the table this year but so far we haven't seen anything concrete and we're into Q2 already.
 

Are you sure you want to associate that word with current FX CPUs (covering earlier posts as well)? :D
 

$hawn

Distinguished
Oct 28, 2009
854
1
19,060
First of all, noob, cool down. I am not planning to put you down in this forum in any way; that is not my intention. Let's have a discussion, not an argument.

Secondly, let's make some things clear. When I say 'IPC', I mean the ratio of the IPCs for a given task (although IPG, like you said, would be a much better term for it), and that's why all my comparisons are in terms of IPG ratios. As you must already know, percentages are nothing but ratios expressed with 100 as the base.

Even though I shall use IPG from now on, it's still an inappropriate term, as no modern chip does around 60 instructions per GHz. However, I hope we both understand what kind of relative performance measurement IPG* is for us, and I'll base further discussion on this (hopefully) mutually understood term.

Also, we are NOT dealing with proper scientific data here, and hence minor variations are definitely bound to occur. Scientifically speaking, a 5% variance is well tolerated for most rough measurements (although I prefer values that are within 3%).

Now let me go through the points you've put through, and try to counter them.

Quote:
so at turbo speeds, 238/4.0 = 59.8 per GHz for the 4300, 240.7/4.0 = 60.7 for the 8320, and 252.1/4.2 = 60.2

That would mean that the 8320 is the best AMD CPU because its IPC is higher ... This is why you can't assume turbo speeds = actual speed.

Avg IPG for the FX chips, for that task = 180.7/3 = 60.23.
Therefore, would you still claim the 8320 to have the best IPG*? Hell no, it's not even 0.8% above the mean value!!
A more rational reading is that all FX chips have the same IPG, as they all come within a 1% tolerance. The slight variations are well within acceptable margins of error for our experiment.
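The tolerance check above can be reproduced in a few lines. This uses the per-chip IPG values quoted upthread; labelling the third chip (4.2 GHz turbo) as the FX-8350 is my assumption, since the quote doesn't name it:

```python
# Tolerance check on the IPG values quoted upthread.
# The "FX-8350" label for the 4.2 GHz chip is assumed, not stated.

ipg = {"FX-4300": 59.8, "FX-8320": 60.7, "FX-8350": 60.2}

mean_ipg = sum(ipg.values()) / len(ipg)        # 180.7 / 3 = 60.23
for chip, value in ipg.items():
    deviation = (value / mean_ipg - 1) * 100   # % above/below the mean
    print(f"{chip}: {value} IPG ({deviation:+.2f}% vs mean)")
# The largest deviation is under 0.8%, i.e. inside a ~1% tolerance.
```

This is the whole argument in miniature: all three chips sit within a fraction of a percent of the mean, so calling any one of them "best" reads noise as signal.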


Quote:
the ratio of 80 to 60 is 33, not 37

The ratio isn't 33; that's the percentage speed increase.

Again, my ratio = 1.37. Your ratio = 1.33.
What's the difference between them? 137/133 = 1.03. My value was only 3% higher than yours, which again sits very comfortably within the margin of error.
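The 33-vs-37 disagreement above reduces to two lines of arithmetic, shown here just to make the ratio/percentage distinction concrete:

```python
# "The ratio of 80 to 60": a ratio of 1.333..., i.e. a 33% increase.
ratio = 80 / 60
pct_increase = (ratio - 1) * 100
print(round(pct_increase, 1))    # 33.3

# And the gap between the two estimates (1.37 vs 1.33) themselves:
print(round(1.37 / 1.33, 3))     # 1.03 -> they differ by only ~3%
```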

Quote:
274.9 for the 2500k at 3.7 turbo = 74

so now you lost that argument even deeper (74/60 = 1.23, in other words 23% not 51%, you just lost 55% of your lead) instead compare it to Ivy Bridge since it's faster than SB.

No I didn't :) Again, you missed a very crucial point. The IPG* values of 74 and 60 which you are using here are for an entirely different workload. They are for Cinebench, NOT AAC encoding. As I'm sure you already know, IPG ratios of AMD vs Intel vary from task to task, based on the strengths and weaknesses of each architecture for the particular task in question. There is no fixed value that suits all tasks.

For Cinebench I calculated the Intel/AMD ratio as 1.37 (37% faster), and for AAC encoding 1.50 (50% faster).
Again, note that if I had used IB instead of SB for the AAC test calculations, the results would have been even more in Intel's favour, and yes, perhaps exceeding 50%!!

Please run your own calculations for single threaded AAC encoding, and tell me what numbers you arrive at. I shall be waiting for your calculation results.

Quote:
if said program scales 100%, then IPC for said program = single core IPC x speed x cores.

Perfectly correct, but IPG ratio measurements get much simpler if we just consider single cores at identical frequency.
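The perfect-scaling formula quoted above can be sketched directly. The numbers here are illustrative only (a hypothetical 60-IPG chip), not measurements from the thread:

```python
# Perfect-scaling assumption: total throughput = per-core IPG x clock x cores.
# Illustrative numbers only; real multi-core scaling is rarely 100%.

def total_throughput(ipg_per_core: float, clock_ghz: float, cores: int) -> float:
    """Score a chip would post if the workload scaled 100% across cores."""
    return ipg_per_core * clock_ghz * cores

single = total_throughput(60.0, 4.0, 1)   # 240.0 on one core
eight = total_throughput(60.0, 4.0, 8)    # 1920.0 if scaling were perfect
print(single, eight)
```

Comparing single cores at identical frequency, as the post suggests, simply sets `cores = 1` and cancels `clock_ghz` out of the ratio.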

Quote:
don't think i have ever seen ht boost even close to 30%

Regarding HT giving up to 30% more performance, I may well be wrong. My mistake.

Quote:
scaling of 102% How can you scale over 100%

2%, once more, is well within error tolerance limits. Rounding of data values is a main contributor to this kind of error.

"After all, if its the 50% that you claim, this should never happen:"

Ofcouse it won't!! You took the wrong ratio. Take 37% and see what it comes upto. It'll be closer to real life, but you'd still be off. Why? Because, your method of calculation is also flawed. Workloads scale differently among multiple cores on AMD and Intel chips. Have you considered that into your calulations?
This is the reason I stick to single threaded calculations. :)
 

Cazalan

Distinguished
Sep 4, 2011
2,672
0
20,810


The FX-8350 has a bottleneck in the instruction decoding, so it can't keep the pipes full. The backend actually has more execution resources which is why fully parallelized tasks can beat the i5/i7 quads in many things.

This is why SR is going to dual instruction decoders, which will clear up that bottleneck (hopefully) and provide more consistent performance.
 

$hawn

Distinguished
Oct 28, 2009
854
1
19,060


Considering the average of a number of workloads, I'd almost agree with that figure, but I think 65-75% is a more accurate range :)

Higher clock speeds work fine on the desktop. Laptops are where AMD needs to do something. I was looking at a few laptops the other day, and man, the A8 and A10 really do need to improve their clocks. 1.9GHz is really a joke. Plus, I've read somewhere that these APUs can't sustain their max turbo state for more than a few seconds. They really look weak against even a non-turboing mobile i3, let alone the i5s.
 

truegenius

Distinguished
BANNED

you don't need to walk, ya just need somethings

so you basically need a car
[image: Chevrolet Camaro]


and a driver :D
[image: driver]


i mean a driver which will not overload your cpu and gpu and memory (brain) and camera (eyes) and some other peripherals :whistle: ;)
like this
[image]


note: do not use the instruction MALENATURE so as to avoid the undesirable part of this reply :D :p
 

8350rocks

Distinguished


Based on GFLOPS, 1 AMD module is worth about 2.3 Intel cores...or an Intel core is worth about 44% of 1 AMD module (64/28 ≈ 2.29).

Please refer to the manufacturer's maximum GFLOPS for single and double precision:

i7-3770k SP GFLOPS: 112
i7-3770k DP GFLOPS: 28
FX8350 SP GFLOPS: 256
FX8350 DP GFLOPS: 64
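The module-vs-core arithmetic above works out as follows. The GFLOPS figures are taken at face value from the post, not verified against manufacturer datasheets:

```python
# Module-vs-core ratio from the peak DP GFLOPS figures as posted
# (taken at face value, not verified against datasheets).

fx8350_dp = 64.0      # FX-8350 peak DP GFLOPS, 4 modules (as quoted)
i7_3770k_dp = 28.0    # i7-3770K peak DP GFLOPS, 4 cores (as quoted)

module_vs_core = (fx8350_dp / 4) / (i7_3770k_dp / 4)   # same as 64/28
print(round(module_vs_core, 2))      # 2.29 -> "1 module ~ 2.3 Intel cores"
print(round(1 / module_vs_core, 2))  # 0.44 -> a core ~ 44% of a module
```

Since both chips have four modules/cores, the per-unit ratio is just the ratio of the chip totals.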
 


Sigh, constantly overlooked. So a Llano, let alone the mid-level Trinitys, is able to bash lumps of custard out of the best the x86 superpower can muster. For $100 I can get 50-odd FPS in multiplayer BF3 at 1280x760 on Low presets, the exact same settings they benched the HD4000's new drivers with yesterday, and while the HD4000 barely ever hit 30FPS, the APU is muscling along at 50FPS in a game where even Low is a tessellation junkie. Likewise, I can play F1 2012 on Ultra at 1080p at 40FPS, which is plenty, and while yes, bandwidth limitations are holding iGPUs back, at least AMD has something that is playable. I may be oversimplifying the iGPU, which is premised on future systems architecture; it is rather impressive in the few HSA-based benches.

If you look at the APU only relative to a discrete card, then perhaps you have overlooked a piece of technology that "NOBODY" can replicate within reason. Then we can point you to the few Richland benches; the parts are out in desktop and mobile form in a few days' time. Fast forward to Q4 and Kaveri, which may very well eliminate the need for entry-level cards. The early talk is 60% iGPU gains over Richland, which is itself 22-40% over Trinity. If you factor in the synthetic scores: if Richland is 2000 points in 3DMark 11, twice that is 4000, which is very mainstream, i.e. a 560 Ti scores 4200.

Since the Xbox and PS4 chips are similar, AMD have said they will be making desktop derivatives of them; Kaveri already has a 6-core variant in less power than the existing 5800K.

 


Lol

 
Quote:
P.S. Watching these threads (BD, PD, now SR) devolve into the same discussions each time is kind of like Groundhog Day. The first time it was educational since I was still newbish but now just feels like a broken record. Next we will be arguing whether or not people can see more than 60 FPS and why films get away with 24 FPS (or did we already?).

The above sounds like a bad Hobbit to have, we need 48
 

Cazalan

Distinguished
Sep 4, 2011
2,672
0
20,810


Were they the old Llano models? The newer Trinity models run faster.

http://www.staples.com/HP-Pavilion-G6-2235us-156-Laptop/product_985050

A6-4400M is 2.7/3.2GHz - not bad for $399
 

8350rocks

Distinguished


I totally agree with your insinuation: the FX8350 is far less efficient than Intel right now. If it were a question of raw computing power, such as a dead-on comparison of GFLOPS potential, Intel would be a laughing stock and the discussion would be completely inverted.

However, in efficiency, we're comparing 2nd-gen architecture to architecture that has been refined for the last 12-15+ years (since the Pentium, essentially). So of course, with 4+ generations of refinement, the Intel is far more efficient.

If AMD can get a firm grasp on the hardware front end and make it comparably efficient, it can shorten the learning curve and catch up. The raw horsepower in the Vishera CPUs is more than Intel brings to bear in Haswell, or even speculatively in Broadwell, without a drastic change in architecture and process.

AMD has the horsepower...but it's like a 1970 GTO Ram Air IV...sleek with tons of horsepower...but it only gets about 8 MPG, because it's inefficient.
 

griptwister

Distinguished
Oct 7, 2012
1,437
0
19,460
I feel if AMD would give the APUs some gosh darned L3 cache, they'd be up there with Intel. I'd like to buy a new CPU sometime after July, probably Kaveri, and I'd like to upgrade to a CPU with L3 cache. Lol, I'm not sure why they can't/don't do this; would it have a slightly higher power footprint? I really want to upgrade this Phenom II X4 840, but I don't want to waste money. (CPU has no L3.)
 

8350rocks

Distinguished


I have seen the logic flowcharts for Steamroller, and it looks incredibly promising...they predict 30% greater efficiency on mispredicted branches, combined with shorter pipelines, decreased memory latency, etc. It will be a giant leap forward...I equate it to the leap that occurred at K7/K8 in terms of percentage hardware advancement gains.
 

Cazalan

Distinguished
Sep 4, 2011
2,672
0
20,810


On the contrary. I have high hopes for APU, just not what is shipping today for my needs. Clearly AMD can put an 18 core GPU in an APU. They have done so for Sony. This means by ~2015 there will be some beefy APUs that can displace the under $100 GPU.

It may require 20nm tech to couple a higher performance CPU (SR) with an 18+ core GCN 2.0. GF should have that online by the end of 2014.
 

Cazalan

Distinguished
Sep 4, 2011
2,672
0
20,810


The need for L3 is mostly negated by the larger L2 caches AMD uses. Intel still uses much smaller 256K L2 caches where AMD is using 1024K.

The latency of the L2 cache is AMD's bigger issue, not the size.
 