AMD CPU speculation... and expert conjecture

Page 529
Status
Not open for further replies.

colinp

Honorable
Jun 27, 2012
217
0
10,680


Oooh! Look everyone! AMD news to discuss, instead of endless flaming about x86 vs ARM, Windows vs Linux, D3D vs OpenGL.

Eagerly awaiting reviews of mobile Kaveri, probably in the vain hope that someone will actually put it into a decent design rather than the nerfed ones we've seen from HP et al in the past.
 

juanrga

Distinguished
BANNED
Mar 19, 2013
5,278
0
17,790



There is an evident typo in the AnandTech graph. The frequency is 3.2 GHz, not 2.3 GHz. Your numbers are all incorrect.

Jaguar: 0.39/1.5 = 0.26
Puma+: 0.54/2.2 = 0.25

Thus Jaguar and Puma+ have the same single-thread IPC (within the 0.01 margin of error), just as expected, because both are the same muarch. Now let us look at Piledriver:

PD: 0.70/3.2 = 0.22
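For anyone who wants to re-run the division, here is a minimal Python sketch. It assumes the single-thread scores and clocks quoted above and simply divides one by the other. The names `chips` and `score_per_ghz` are illustrative only, and the result is a score per GHz, which is at best a rough stand-in for IPC.

```python
# Single-thread score and assumed clock in GHz, as quoted in the post above.
chips = {
    "Jaguar": (0.39, 1.5),
    "Puma+": (0.54, 2.2),
    "Piledriver": (0.70, 3.2),
}


def score_per_ghz(score, clock_ghz):
    """Benchmark score divided by clock: a rough proxy for relative IPC."""
    return score / clock_ghz


per_ghz = {name: score_per_ghz(s, c) for name, (s, c) in chips.items()}

for name, value in per_ghz.items():
    print(f"{name}: {value:.2f}")
```

The output is only as good as the clock assumed for each chip; if the chip did not actually hold its maximum turbo during the run, the per-GHz figure shifts accordingly.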

Those numbers agree with averages obtained by EXTECH: 103% (10 tests) and 114% (18 tests).

It also seems that it was not understood that EXTECH numbers are not limited to single thread benchmarks.

FYI, ARM's Mike Filippo stated that the A57 is targeted at 2.5 GHz on 20nm. AMD's Seattle Opteron A1100 is a 28nm chip. Thus 2.5 GHz / 1.183 = 2.11 GHz >= 2 GHz.

Your claim that the Opteron was clocked "at least 2.5 ghz" is still pure invention on your part.

I did read your link to the Xeon E3-1220L result, and it is funny. Now check this other result for another Xeon E3-1220L:

http://www.spec.org/cpu2006/results/res2011q2/cpu2006-20110524-16734.html

See? A benchmark score of only 27.2.

You already tried this kind of cherry picking in the past, with a C-ray benchmark. I explained to you then how different compiler settings and other factors can vary the score for the same processor. Did you learn? You are repeating the same mistake once again.
 

juanrga

Distinguished
BANNED
Mar 19, 2013
5,278
0
17,790


The AnandTech article about "Mobile Kaveri spec details" is simply a copy and paste of the AMD website. Scroll down the AnandTech page and click where it says "Source". You will jump to the same AMD website link that I gave earlier in this thread:



Oooh! Look everyone! AMD news to discuss... The news about mobile Kaveri is the same information that I posted on the 25th. :lol:
 

hi collin! thanks for noticing!

the lower skus have a welcome bump in memory speed and the top skus get ddr3 2133 support. both are overdue since trinity. unlike with desktop skus, the mobile skus get clockrate bump on the cpu side (over richland), which will greatly help in cpu performance and in benchmarks. imo amd should add ddr3 2133 for all of the 384 shader kaveri skus as well, even if oems use lower speed ddr3 1333 ram. it would at least give users a chance to upgrade.

this could also be chance to measure how good gcn 1.1 is compared to vliw4 since shader counts are similar in previous skus albeit with minor clockrate reductions (e.g. a10 5757m vs a10 7400P).
 

juanrga

Distinguished
BANNED
Mar 19, 2013
5,278
0
17,790


Simple, because this is an AMD thread.



It is not an excuse. It is the reality. The number and complexity of the projects that AMD is developing are far beyond those of Oculus or Calxeda. AMD is competing against Intel, Nvidia, Qualcomm, Microsoft, Google/IBM... all at once, and AMD is not pushing infinitesimal updates to existing tech/software, but is developing far more innovative projects: Skybridge, HSA, MANTLE, SBSA, K12...

Didn't you hear Keller and Papermaster at the last conference? Papermaster joked about how he cannot give more resources to Keller. AMD's resources are finite.

 


It's interesting to see how people react to that... I'll keep an eye on comments once it's released. It will be a monkey show with looooooots of poop, hahaha. Man, I'll have to get the popcorn ready.

And well, that is one game I won't buy for sure (not because of this, but I've never played that saga; not interested), so it doesn't affect me at all.

Cheers!
 

juanrga

Distinguished
BANNED
Mar 19, 2013
5,278
0
17,790


Further thoughts from a dev.:

You have this wrong. We developers have been going to Microsoft for years on end and have mostly got the middle finger, malicious laughter, and at most some leftovers thrown at us. AMD's trip to Mantle land comes out of us developers begging something or someone to throw us a bone, and now we got a big juicy one, with a twist. This bone is trip-wired with picking sides and splitting markets, and whatever the choice, it's going to be risky and piss someone off. It's a desperate move, but jumping from a burning ship at the risk of drowning in cold water seems to be the choice for many developers here.
 

juanrga

Distinguished
BANNED
Mar 19, 2013
5,278
0
17,790


http://www.forbes.com/sites/jasonevangelho/2014/05/26/why-watch-dogs-is-bad-news-for-amd-users-and-potentially-the-entire-pc-gaming-ecosystem/
 

colinp

Honorable
Jun 27, 2012
217
0
10,680


And then all you need is for the laptop manufacturer to allow fast memory in their bioses. I have a Llano laptop from HP that will only run 1600 memory at 1333 speed. And the other problem with it is that the cooling system is not up to scratch, although maybe that's to do with me overclocking the CPU and 6750m dGPU...
 
Funny how people listen to one site and make their opinions:

Ultra_02.png


290x ~ Titan. Or about where it should be. 290 and 7970 CRUSH the 770.

Ultra_03.png


Same deal, though the 770 catches up with the 7970 (likely a VRAM bottleneck on lower tier cards?).

In both cases:

Titan ~ 290x
780 > 290
7970 ~ 770

Looks about right, performance wise. I don't see anything too biased.

CPU wise, about what you'd expect:

CPU_01.png


GPU bottleneck @ 82 FPS, so Intel has a LOT of headroom to go. FX-8350 caps out at 79 FPS, just ahead of the i3-4130 @ 76 FPS, tied with the FX-6300. Kinda shocked the i3 line is still holding up, FPS wise at least.

Also worth noting is the jump from the 4320 to the 6350, so the game clearly can use those extra CPU cores. Intel seems strong enough to make do with a Quad, or even a high end Duo though. I'd really be interested to see what non-GPU-bottlenecked numbers look like though, to see how high Intel's performance goes.

Also, going to solve for IPC between the 4320/A8-7600 later, just for kicks.
 
Watch dogs: Solving IPC difference between the 4320/A8-7600*

*Not real IPC values; just looking for the DIFFERENCE. Also makes a lot of other assumptions (core loading being similar) which may not hold up. Good enough for a guesstimate though.

FX-4320: 67 FPS @ 4.0 GHz
A8-7600: 52 FPS @ 3.1 GHz

Solving for relative IPC:

Perf = IPC * Clock * Num_Cores

Num_Cores cancels in this case, since both CPUs have four, leaving us with:

Perf = IPC * Clock

FX-4320:
67 = IPC * 4.0
IPC = 16.75

A8-7600:
52 = IPC * 3.1
IPC = 16.77

Don't even have to do the divide; the IPC of the two chips is essentially equal. So Kaveri = Piledriver, as far as IPC goes in this one specific program.
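The same back-of-envelope math as a minimal Python sketch. It assumes, as the post does, that FPS scales linearly with clock and that both four-core chips are loaded similarly; `perf_per_ghz` is just an illustrative name, and the values are relative figures, not real IPC.

```python
def perf_per_ghz(fps, clock_ghz):
    """FPS divided by clock: a relative per-clock figure, not true IPC."""
    return fps / clock_ghz


fx_4320 = perf_per_ghz(67, 4.0)  # 16.75
a8_7600 = perf_per_ghz(52, 3.1)  # about 16.77

# A ratio close to 1.0 means the two chips do about the same work per clock here.
ratio = a8_7600 / fx_4320
print(f"A8-7600 vs FX-4320 per-clock ratio: {ratio:.3f}")
```

Taking the ratio rather than eyeballing the two quotients makes the "essentially equal" conclusion explicit: the two figures differ by well under 1%.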
 

logainofhades

Titan
Moderator
Kaveri's issue is the same that all of the FM2/FM2+ chips suffer from, and that is lack of L3 cache. I am more shocked by the performance difference between PhII and Kaveri. Early benchmarks suggested that Kaveri basically broke even with Phenom II, as Trinity and Richland were a bit below.
 

oems want to build as cheap as possible. i highly doubt that llano laptop is suitable for o.c. but it sure 's good to have that option. i wish intel laptops had o.c. options. most even lack basic tweaking options.
 


Kaveri has higher IPC than Phenom II (expected), but performs about the same due to lower clocks (also expected). Having L3 would help, but not THAT much. It's a red herring that people are using to try and explain the differences in performance, but L3 isn't that significantly faster than accessing main memory, now that we have dedicated super-fast buses built into the CPU.
 

etayorius

Honorable
Jan 17, 2013
331
1
10,780


Clock for clock, Kaveri seems to be faster than Phenom II. I tried setting both chips at 3.7 GHz (Turbo disabled for Kaveri) and I am seeing about 15% more performance in Skyrim, Oblivion, GTA4 and other benchmarks such as "PerformanceTest". I would say that Kaveri is about 7% faster than Phenom II at the same frequency even with the lack of L3; add L3 to Kaveri and it may gain another 3-5% performance against Phenom II. L3 does not make that much of a difference... I still have an Athlon II 240 (2.8 GHz), which does not have L3 at all, and I can manage to push it to 3.5 GHz on air. Setting the Phenom II at 3.5 GHz, the difference between the two seems to be less than 6% in favor of the Phenom.

Still, I think Kaveri is a fail since it can barely beat a CPU from 2009. I would have expected Kaveri (2014) to at least match the i5 2500 (2011). mKaveri looks great though; I would not mind building some really cheap and fast PCs for friends and family who are on a budget or who just want to do all the basic stuff with good performance.
 

logainofhades

Titan
Moderator


For basic stuff, desktop Kabini 5350 might work. I am considering one for an HTPC.
 

it was likely from techspot.
http://www.techspot.com/review/827-watch-dogs-benchmarks/
here's pclab's
http://pclab.pl/art57916.html
looks a bit system memory bound (edit: TS used 2x 4GB ddr3 2400 vs pclab's ddr3 1866). maybe the disrupt engine is similar to dunia.

edit2: Moar benches

gamegpu's utterly moronic cpu benches (at stock settings save for fx9k cpus) with r9 295x2 and 780ti SLI at 1080p to impose a huge cpu bottleneck, LOL.
http://gamegpu.ru/action-/-fps-/-tps/watch-dogs-test-gpu.html
at least they learned to use windows 8.1... or to write that they used win 8.1....

and pcgameshardware.de's usual low res stuff
http://www.pcgameshardware.de/Watch-Dogs-PC-249220/Specials/Watch-Dogs-Test-GPU-CPU-Benchmarks-1122327/
 
Math time! Using the same math as before:

PII X4 980: 50 @ 3.7 GHz
A8-7600: 52 @ 3.1 GHz

Perf = IPC * Clock

PII X4 980:
50 = IPC * 3.7
IPC = 13.51

A8-7600:
52 = IPC * 3.1
IPC = 16.77

Solving % Difference:

= ( |13.51 - 16.77| / ( (13.51 + 16.77) / 2 ) ) * 100
= ( 3.26 / (30.28 / 2) ) * 100
= ( 3.26 / 15.14 ) * 100
= 0.215324 * 100

= 21.5324% difference

So Kaveri is about 21% faster IPC wise than Phenom II in this program.
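One caveat on the percentage: there are two common conventions, and they give different answers for the same pair of numbers. The minimal Python sketch below (function names are illustrative, inputs are the rounded per-clock figures from this post) computes both. The symmetric percent difference divides by the average of the two values; the percent increase divides by the baseline, which is what "X% faster than" usually implies.

```python
def symmetric_percent_difference(a, b):
    """Absolute difference relative to the mean of the two values."""
    return abs(a - b) / ((a + b) / 2) * 100


def percent_increase(new, baseline):
    """How much larger `new` is than `baseline`, relative to `baseline`."""
    return (new - baseline) / baseline * 100


phenom_ii = 13.51  # PII X4 980 per-clock figure from the post
kaveri = 16.77     # A8-7600 per-clock figure from the post

sym = symmetric_percent_difference(phenom_ii, kaveri)
inc = percent_increase(kaveri, phenom_ii)

print(f"symmetric difference: {sym:.2f}%")
print(f"increase over Phenom II: {inc:.2f}%")
```

With these inputs the symmetric figure comes out near 21.5% and the increase over the Phenom II baseline near 24.1%, so which number gets quoted depends on which question is being asked.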
 

logainofhades

Titan
Moderator
Based on your math, then, the PhII X6 has a better IPC than the X4. Must be because of the multithreaded nature of this title. Does Kaveri have instruction sets or features that PhII does not? If so, that could explain the gap between them, if this particular title is taking advantage of those.
 


The multithreading helps; that's why I'm limiting this to 4 core chips, as it's a fairer comparison. I'd have to throw in a variable for the number of cores, but then core loading comes into effect, and the math gets a LOT ickier. When comparing chips with the same number of cores, all that ickiness goes away.

Speaking of which, for kicks:

A8-7600: 52 @ 3.1
IPC = 16.77

i5-4670k: 82 @ 3.4 [Note this is clearly GPU bottlenecked, so the outcome is likely understated]
Perf = IPC * Clock
82 = IPC * 3.4
IPC = 24.12

Perf Difference: 35.95%

As mentioned before: a GPU bottleneck is limiting Intel to 82 FPS, so it's likely the outcome is UNDERSTATED. The real IPC difference is probably north of 40%; we just can't see it because the GPU can't go that fast.

Hence why AMD can double the cores and not match Intel's performance: Intel's cores are literally twice as fast.
 

noob2222

Distinguished
Nov 19, 2007
2,722
0
20,860
@juan

Instead of ignoring the rest of the discussion, let me repeat it. The A10 4600 is 2.3 GHz base, 3.2 GHz max turbo. AT did a discussion on AMD's turbo. With the 8150 on Cinebench single thread, it ran at an average of 3.93 GHz, not 4.2.

By the same token, the A10 4600 will run between 2.3 GHz and 3.2 GHz, not just at 3.2. Same with Puma+: 1.2 base, 2.2 turbo. Actual speed: somewhere in between. Look at the full Puma+ Cinebench results before you try to say Puma is the same muarch. It is at 1.2 GHz, scoring 0.03 less than the 1.5 GHz Jaguar.
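To put numbers on the "somewhere in between" point, here is a minimal Python sketch. It assumes the single-thread scores quoted earlier in the thread (0.70 for the A10 4600, 0.54 for Puma+) and the base/turbo clocks given above; `per_ghz_range` and `ranges_overlap` are hypothetical helper names. It bounds the score per GHz instead of picking a single clock.

```python
def per_ghz_range(score, base_ghz, turbo_ghz):
    """Return (low, high) bounds on score per GHz.

    The low bound assumes the chip held max turbo for the whole run,
    the high bound assumes it sat at base clock.
    """
    return score / turbo_ghz, score / base_ghz


def ranges_overlap(r1, r2):
    """True if two (low, high) intervals share any value."""
    return r1[0] <= r2[1] and r2[0] <= r1[1]


a10_4600 = per_ghz_range(0.70, base_ghz=2.3, turbo_ghz=3.2)
puma_plus = per_ghz_range(0.54, base_ghz=1.2, turbo_ghz=2.2)

print(f"A10 4600: {a10_4600[0]:.2f} to {a10_4600[1]:.2f} per GHz")
print(f"Puma+:    {puma_plus[0]:.2f} to {puma_plus[1]:.2f} per GHz")
print("ranges overlap:", ranges_overlap(a10_4600, puma_plus))
```

Without a measured average clock during the run, the two intervals overlap, so the scores alone do not pin down which chip does more work per clock.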

As for pretending that you can do IPC in multithreaded tests: lol. Look up core scaling. That's not part of IPC.

As for cherry picking, try to tell me you don't. I'm proving you can make any software come up with different results. Does that mean that marketing is telling you the absolute truth? No, marketing's job is to cherry pick numbers.

P.S. Go back a page and look at the RL test from esrever.
 
As for pretending that you can do IPC in multithreaded tests: lol. Look up core scaling. That's not part of IPC.

You can get away with it if the cores are all loaded to 100% in the benchmark, but otherwise calculating IPC like that is impossible. That's why I limited my math above to quads, so I could safely do a comparison. But I'd never be able to figure out the "real" IPC with my math.
 

juanrga

Distinguished
BANNED
Mar 19, 2013
5,278
0
17,790


The percentage is incorrect. I have not checked all your computations but if 13.51 and 16.77 are correct numbers the difference is 24.13%

(( 16.77 - 13.51 ) / 13.51 ) * 100 = 24.13

So Kaveri is about 24% faster IPC wise.



Same mistake, the Haswell i5 has 43.83% more IPC, assuming your IPC values are correct.

About your Intel vs AMD rant: there are benchmarks that show an FX-8350 destroying an i7-3770k. Of course, this is in integer workloads where (2 cores > 1 core + HT). However, note that there is only one shared FPU per CMT module. Thus an FX-8350 has four FPUs, not eight, and in floating point benchmarks there is no doubling of the core count...

Now Excavator is supposed to bring doubled FPU per module, closing the gap with Ivy Bridge.
 

juanrga

Distinguished
BANNED
Mar 19, 2013
5,278
0
17,790


AT's ancient discussion about FX-8150 turbo means little for Piledriver (one of whose improvements over Bulldozer was better turbo: Turbo Core 3 vs Turbo Core 2).

As a consequence, your same source, AT, correctly used the turbo frequencies in the graph that you brought here. The only problem was that they made a typo when reporting the turbo frequency of the Piledriver A10 4600M, a typo which has since been corrected.

The computation made before shows that even if the effective average frequency is a bit lower than the max turbo, the difference in IPC is 0.01 units and thus irrelevant.

I don't have to look at anything that is in your head, because I know that Puma+ is the same muarch as Jaguar. Stop inventing IPC numbers by mixing turbo frequencies and base clocks to pretend that Puma+ has higher IPC. Again from your AT source:



Clearly, the IPC is the same for both.

And finally, IPC does not mean Instruction per Core. The formula for multithreaded performance is

Perf = IPC * Clock * cores

This is the formula used in the EXTECH article.
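A minimal Python sketch of that formula and its inverse, for illustration only. It assumes perfect scaling across cores with every core fully loaded, which is exactly the assumption disputed elsewhere in this thread; `perf` and `implied_ipc` are illustrative names, and the inputs are the Watch Dogs figures quoted earlier for the two quad-core chips.

```python
def perf(ipc, clock_ghz, cores):
    """Perf = IPC * Clock * cores (assumes perfect multicore scaling)."""
    return ipc * clock_ghz * cores


def implied_ipc(perf_value, clock_ghz, cores):
    """Invert the formula to get a per-core, per-clock figure."""
    return perf_value / (clock_ghz * cores)


# Watch Dogs FPS figures quoted earlier in the thread; both chips have 4 cores.
fx_4320 = implied_ipc(67, 4.0, 4)
a8_7600 = implied_ipc(52, 3.1, 4)

# With equal core counts the cores term cancels out of the ratio.
ratio_with_cores = a8_7600 / fx_4320
ratio_without_cores = (52 / 3.1) / (67 / 4.0)

print(f"ratio with cores term:    {ratio_with_cores:.4f}")
print(f"ratio without cores term: {ratio_without_cores:.4f}")
```

The two ratios match, which is why comparing chips with the same core count sidesteps the core-scaling question. For chips with different core counts, the formula only holds to the extent that the workload actually scales across all of them.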
 