AMD CPU speculation... and expert conjecture

Status
Not open for further replies.

juanrga

Distinguished
BANNED
Mar 19, 2013
5,278
0
17,790
(i) 4.9GHz for Kaveri is within the expected range.

(ii) We know that the FPU in Steamroller has been redesigned.

(iii) In my Kaveri article I considered 20% IPC, but I said I was being conservative. In one footnote, I mentioned different rumors and leaks suggesting that Steamroller could be up to 40% faster (IPC).

(iv) As mentioned before (scores such as this were also given),

Himeno-Benchmark-kaveri-pre.png


Kaveri cannot compete with an i5-2500k in pure FPU workloads. No surprise here because it has half the units: one FPU per module (A10) against one FPU per core (i5).

(v) Does this mean that the leak is legit? No, but if it is fake the guy who faked it is well-informed. As mentioned in wcctech, he or she would even fake an incorrect detection of the number of cores.
 

juanrga

Distinguished
BANNED
Mar 19, 2013
5,278
0
17,790


Multicore integer performance is 24% better.
Multicore floating point performance is 24% better.

But...

(A) In this same page we discussed how APUs run faster under W8.1 than under W7. Kaveri ES is running W7, the Trinity APU is running W8.1.

(B) The Trinity APU ran with a more aggressive turbo than the Kaveri ES.

(C) I suspect that the Kaveri ES Northbridge was underclocked.

(D) Both run single memory channel ultralow DIMMs, but Kaveri was probably more bottlenecked because it requires faster RAM.

Combining all of the above, Kaveri is >30% faster (IPC) than Trinity.
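To make the arithmetic behind that estimate explicit, here is a minimal sketch of how separate corrections would compound on top of the measured 24%. The function names are made up for illustration, and the thread gives no per-factor values for (A) to (D), so the only inputs used are the 24% and 30% figures quoted above.

```python
def compound(measured_gain, corrections):
    """Compound a measured gain with multiplicative correction factors.

    All values are fractions (0.24 means 24%).
    """
    total = 1.0 + measured_gain
    for c in corrections:
        total *= 1.0 + c
    return total - 1.0


def required_correction(measured_gain, target_gain):
    """Combined correction needed to turn measured_gain into target_gain."""
    return (1.0 + target_gain) / (1.0 + measured_gain) - 1.0


if __name__ == "__main__":
    # 24% measured and 30% claimed are the figures from the post above.
    needed = required_correction(0.24, 0.30)
    print(f"combined correction needed: {needed:.1%}")
    print(f"check: {compound(0.24, [needed]):.1%}")
```

On these figures, the four factors together would need to add a bit under 5% for the 24% result to correspond to a 30% gain.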
 


Like I said, 4Chan is the place under the bridge where all trolls feed and create/plan their horrible deeds. I lurk in there, so I know what they're capable of, haha.

Anyway, take an extra doses (dosis?) of salt for that screenshot :p

Cheers!
 
Multicore integer performance is 24% better.
Multicore floating point performance is 24% better.

Not according to basic math.

(A) In this same page we discussed how APUs run faster under W8.1 than under W7. Kaveri ES is running W7, the Trinity APU is running W8.1.

Valid point, if we were discussing the APU as a whole. We're not; we're talking the CPU portion only.

(B) The Trinity APU ran with a more aggressive turbo than the Kaveri ES.

And...seems like an excuse to me. Turbo is a fact of life now, and comparing chip to chip, you factor it in, and if Trinity has more aggressive Turbo, then so be it.

(C) I suspect that the Kaveri ES Northbridge was underclocked.

Based on what exactly? Memory scores were basically the same, which we would expect when using the same RAM. If you UC'd the NB, it would show in the memory benchmarks.

(D) Both run single memory channel ultralow DIMMs, but Kaveri was probably more bottlenecked because it requires faster RAM.

The GPU portion of the APU would be starved, yes. The CPU portion wouldn't be, so the memory setup used should be affecting Trinity and Kaveri the same way.

-------------

So now, I think we're at the point where the first benchmarks may be hitting, performance is less than expected by some, and denial is starting to kick in.
 

Cazalan

Distinguished
Sep 4, 2011
2,672
0
20,810


Take a screen shot or make a graph out of it. It automatically makes the data 10x more reliable according to interwebnews. ;)

~15% is in line with what AMD says.

I think the variance will be quite high on Kaveri, but AMD's main advantage is that Intel isn't updating their competing part until towards the end of 2014. Broadwell is targeting a much lower power range, and the Haswell refresh is an almost insignificant bump in speed.

For those without dGPU Kaveri is a solid win (Edit: If you care about graphics at all).
 

jdwii

Splendid


But that is a hard market to please. That part will probably cost a good $1200+, and it would be more useful in a workstation; by that point you could build a server with a server board and so on. Useless for gamers, useless for anyone. If AMD actually shrank the FX-8350, kept the same performance, and brought it down to a 65-95 W TDP, that would really satisfy a lot of people who want to use it as a server part.

As it stands, I can get five FX-8350 processors for the price of one Haswell-E, most likely. That is 40 cores, and for something like a big web server that would be fast and stable (more so if AMD cut the TDP down to 65 W).
 

juanrga

Distinguished
BANNED
Mar 19, 2013
5,278
0
17,790
I have just tweeted this

http://www.digitimes.com/news/a20131218PD214.html

It confirms Carrizo being an FM2+ update to Kaveri.

This part is interesting:

AMD's Kaveri APUs will succeed previous-generation Richland-based APUs. The processors are made by Globalfoundries on a 28nm process, targeting performance and mainstream desktop markets, the sources said.

This part is as well:

Currently, FM2-based processors account for over 80% of AMD's total shipments
 

Master-flaw

Honorable
Dec 15, 2013
297
0
10,860

Watched this last night...
Seems a lot more important for AMD than Kaveri on the gaming end of things..
Looks like he's running 10,000 independent detailed units at once...AND the bottleneck is at the GPU with 120 frames, with what looks to be a 4K or higher res...all amazingly on an 8350...
My 8350 would be screaming at me if I tried that with DirectX...probably pull off mid lower 20's zoomed out.

 

Ags1

Honorable
Apr 26, 2012
255
0
10,790


All I'm saying is that your post would have been improved by NOT including the AMD graphic, just the numbers :)
 

anxiousinfusion

Distinguished
Jul 1, 2011
1,035
0
19,360


I normally avoid the videos on the firm's own channels because so much of it is pure marketing, but this was worth watching. The demo ran on an FX-8350, R9 290(X) system and I kept watching that unit number climb waiting for the framerate to choke out. It never did and they go on to explain that this is only the very first implementation of Mantle with no specific optimizations made to that demo yet.
 
Like I said, 4Chan is the place under the bridge where all trolls feed and create/plan their horrible deeds. I lurk in there, so I know what they're capable of, haha.

Anyway, take an extra doses (dosis?) of salt for that screenshot

Cheers!

Screw that, you'd need an entire salt mine with anything from there. 4chan is the hidden dungeon under the heart of the seedy underbelly of the internet. Of all the sources of information on the internet 4chan is the absolute worst to trust.

Need I remind everyone that this is the place that published the instructions on how to modify your XBONE to run 360 games.
 

Master-flaw

Honorable
Dec 15, 2013
297
0
10,860


Can't wait for the first massive RTS to be made on Mantle....The RTS genre has been severely held back due to CPU limitations.

@tourist...I guess it's an infused chipset(MoBo/CPU in one combo)...looked it up as soon as I heard him say that.
 

juanrga

Distinguished
BANNED
Mar 19, 2013
5,278
0
17,790


I use advanced math. :sarcastic:



I am discussing the improved scheduler in W8.1/W8, which is specifically designed to improve AMD CPUs based on the CMT architecture. The APU runs faster because the CPUs inside are running faster. :sarcastic:



Kaveri turbo is 4.0GHz, but that engineering sample was running with lower turbo speed. Get it now? :sarcastic:



As I said, it is only a guess based on the details of the ES. I cannot confirm it. In any case, your argument assumes that Kaveri's memory subsystem is the same as Trinity's. It is possible that Kaveri has a better IMC and that, even with the NB underclocked, it still offers about the same memory scores as Trinity at stock.

I know that Kaveri L2 cache is ~20% faster than L2 cache in Trinity/Richland, just saying it.



CPU performance is affected by memory speed. Benchmarks show that an FX CPU runs games and other applications ~5% faster with faster RAM. It is the same for the Piledriver CPUs inside APUs.

You are the poster who gave the link about Broadwell-EP Xeons. Those are CPUs; there is no iGPU, right? If you pay attention to the link that you gave, the new Xeons use faster RAM than previous Xeons. This is because faster RAM increases the performance of the CPU:

Intel’s internal benchmarks show around 15-25 percent gains by moving to DDR4-2400 from DDR3-1866, while keeping all else same, on SPEC and Stream benchmarks, so yes it seems to be worth it, not forgetting the power savings from 1.2v memory operation.

I am not saying that CPU in Kaveri will see those gains from moving to 2400MHz (in fact I am only considering ~5% gain), but it seems evident that it needs faster ram than CPU in Trinity. :sarcastic:



Fixed it for you. :sarcastic:

It is fantastic to see that in some concrete cases (e.g. JPEG Decompress Multicore) Steamroller is >45% faster (IPC) than Piledriver, when some people here were expecting Steamroller to be 37% slower than Bulldozer, whereas others claimed up to 10% faster than Piledriver at best...
 
If they fixed the prefetch and decreased the horrid latency in the cache architecture then there is no reason they couldn't catch up on Intel in a number of areas, but the fact remains that the individual cores are simpler, do less real work for a given load, and the fab process is still 1.5 generations behind Intel in size and optimisation.

So I'm just saying don't get your panties all in a knot like Baron did, TWICE, and then have a brain aneurysm here when it all fails to quite meet the hype.

Be optimistic by all means, but you don't want to be the guy in charge of the Russian moon program ... hint hint ... nothing got off the pad intact.

Before you flame me I was a bit of an AMD troll ... ask the oldtimers ... I had plenty of egg on my face.

Juan ... this is advice for you ... you're more excited than a fat kid at Willy Wonka's chocolate factory.

:)
 


I didn't see anything too impressive. Lasers didn't seem to be affected by light/shadows, so they're basically just blue lines; not THAT hard to do. Likewise, we already have games that can handle more ships at a time than that. Really didn't see anything too impressive in that demo.

Also, I'd love to know how you think the demo "looks like" 4k when watching a compressed YouTube video on a screen less than 4k...
 
I use advanced math. :sarcastic:

Show your work for full credit please.

I am discussing the improved scheduler in W8.1/W8, which is specifically designed to improve AMD CPUs based on the CMT architecture. The APU runs faster because the CPUs inside are running faster. :sarcastic:

Not faster, just scheduled differently. Essentially, Windows treats the CMT cores as HTT cores, using them LAST, resulting in more aggressive turbo clocks. Which was my recommendation on how AMD should implement CMT in the first place (go back to the BD thread). AMD could have accomplished the same thing years ago by just setting the HTT bit on its CPUs.

I also note that under full load, since all the cores are used anyway, you wouldn't see any performance improvements out of this. Likewise, single-core loads use only a single core, and thus the performance penalty of CMT doesn't occur. So it's unlikely that this type of benchmarking would be affected by the Windows scheduler. Nice try though.
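As a rough illustration of that argument, here is a toy model of the two placement orders. It is only a sketch under simple assumptions (a two-module, four-core part, threads filled strictly in order), not how the Windows scheduler is actually implemented, and the function names are invented for the example.

```python
def placement_order(num_modules, cores_per_module=2, cmt_aware=True):
    """Return core IDs in the order a toy scheduler would fill them.

    Core ID = module * cores_per_module + index within the module.
    """
    cores = [(m, c) for m in range(num_modules) for c in range(cores_per_module)]
    if cmt_aware:
        # Use the first core of every module before any second core,
        # i.e. treat the second core of a module like an HTT sibling.
        cores.sort(key=lambda mc: (mc[1], mc[0]))
    return [m * cores_per_module + c for m, c in cores]


def modules_shared(num_threads, order, cores_per_module=2):
    """Count modules that end up with more than one busy core."""
    busy = {}
    for core in order[:num_threads]:
        module = core // cores_per_module
        busy[module] = busy.get(module, 0) + 1
    return sum(1 for n in busy.values() if n > 1)


if __name__ == "__main__":
    naive = placement_order(2, cmt_aware=False)
    aware = placement_order(2, cmt_aware=True)
    for threads in (1, 2, 4):
        print(threads, "threads:",
              "naive shares", modules_shared(threads, naive), "module(s),",
              "aware shares", modules_shared(threads, aware), "module(s)")
```

In this model the two orders only differ for partial loads (two threads here); with one thread or with all four cores busy, the number of shared modules is the same either way.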

Kaveri turbo is 4.0GHz, but that engineering sample was running with lower turbo speed. Get it now? :sarcastic:

We'll see when the parts release, won't we?

As I said, it is only a guess based on the details of the ES. I cannot confirm it. In any case, your argument assumes that Kaveri's memory subsystem is the same as Trinity's. It is possible that Kaveri has a better IMC and that, even with the NB underclocked, it still offers about the same memory scores as Trinity at stock.

I know that Kaveri L2 cache is ~20% faster than L2 cache in Trinity/Richland, just saying it.

If AMD redesigned how its memory subsystem works for Kaveri, I'm relatively sure we'd know about it by now. You're totally BSing on this one.

CPU performance is affected by memory speed. Benchmarks show that an FX CPU runs games and other applications ~5% faster with faster RAM. It is the same for the Piledriver CPUs inside APUs.

And the same as Intel CPUs, VIA CPUs, and ARM CPUs. But when using the SAME RAM setup, you wouldn't expect any variance in performance between similar processors. If Trinity is starved, then Kaveri is likely starved by the same amount, so, for doing a comparison between the two, the numbers are perfectly valid.

You are the poster who gave the link about Broadwell-EP Xeons. Those are CPUs; there is no iGPU, right? If you pay attention to the link that you gave, the new Xeons use faster RAM than previous Xeons. This is because faster RAM increases the performance of the CPU:

Because they are used in tasks that are actually memory constrained. In typical benchmarking, memory access speed is not a major constraint in terms of performance, and unless Kaveri is REALLY sensitive to RAM speeds, which I doubt (and which isn't backed up by the results posted), then Kaveri likely has similar bottlenecks to Trinity in CPU-bound tasks.

Intel’s internal benchmarks show around 15-25 percent gains by moving to DDR4-2400 from DDR3-1866, while keeping all else same, on SPEC and Stream benchmarks, so yes it seems to be worth it, not forgetting the power savings from 1.2v memory operation.

So...in Memory benchmarks, using faster RAM increases scores? Wow, what a concept.

In actual applications? Not so much. Especially when you can cram everything into the L2/L3 cache.

It is fantastic to see that in some concrete cases (e.g. JPEG Decompress Multicore) Steamroller is >45% faster (IPC) than Piledriver, when some people here were expecting Steamroller to be 37% slower than Bulldozer, whereas others claimed up to 10% faster than Piledriver at best...

Your numbers are WAY off.

JPEG Decompress Multicore was a 31.75662219% increase in performance compared to Trinity, one of AMD's best increases. Which makes sense, given how you can scale the thing to the number of cores [expected linear performance gains]. But the typical gains were half of that. Workload matters.
 
been reading this, prolly was posted here already:
http://www.digitimes.com/news/a20131218PD214.html
i don't know if the author/editor has poor proofreading or just misinformation. first the bad stuff -
amd doesn't have socket fm3 out yet. kaveri and richland use socket fm2+.
carrizo won't (strongly likely) use socket fm2.
having many high volume products is not good enough for amd. they need $200-$500 halo products like radeon 7970 or r9 290x or 7990. they better not count centurions as flagships.
athlons are gettin 4 digit numbers and semprons are coming back. after pushing moar cores for 2 years..... :p however, those have better brand image than intel's pentium and celeron.
the llano debacle forced amd to sell those apus well into 2013 and in 2014. don't do that again please.
suspicious stuff-
beema has puma+ ...? what upgrades did it get?
good/interesting stuff -
beema's successor may be called nolan. nice.
why do kabini and temash have such low share in shipments? maybe amd is channeling all jags to the consoles (sure sales are way better than "maybe").
amd's d.i.y. global marketshare is 30%, similar in china (30-35%) - but that Does Not mean that 30% is made up of fx cpus.
760g chipset being phased out means, old am3 owners finally have to either get an apu pc or am3+ pc. then again, amd quietly launched 960G chipset earlier, asrock even has a motherboard... i think.
there's a bunch of good stuff in that article... but the mistakes make me doubt the credibility. now i want amd to make a bga 6 core jag or puma + gcn 2.0 igpu with 768 cores, at least. with optional 3rd party cooling. it.will.rock.
 

szatkus

Honorable
Jul 9, 2013
382
0
10,780


Funny, it was fixed shortly after Bulldozer's release, but you had to install that patch by yourself. In Windows 8 it's OOTB.



I forgot to write that the ES was clocked at 3.5GHz/3.9GHz. The A10-5700 is 3.4GHz/4.0GHz; that's why I chose it for the comparison.



As I remember, there was some information about tweaking the cache, but without concrete numbers.
 
Funny, it was fixed shortly after Bulldozer's release, but you had to install that patch by yourself. In Windows 8 it's OOTB.

That too.

I forgot to write that the ES was clocked at 3.5GHz/3.9GHz. The A10-5700 is 3.4GHz/4.0GHz; that's why I chose it for the comparison.

Using that logic, since Kaveri has the higher base clock, and Turbo doesn't kick in when all cores are being stressed, then wouldn't Kaveri have a slight advantage in the multithreaded benchmarks due to higher base clocks? In that case, the numbers I computed are slightly high, since I didn't correct for clock speed differences. That puts the increase in performance closer to 15%, more in line with the single-threaded numbers.
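For anyone who wants to redo that correction, here is a minimal sketch of a clock-for-clock normalization. It assumes the score scales linearly with clock speed and that both chips sit at base clock under an all-core load, which is a simplification. The base clocks are the ones quoted above; the raw gain is just an input, and the function name is made up for the example.

```python
def per_clock_gain(raw_gain, new_clock_ghz, old_clock_ghz):
    """Convert a raw score gain into a per-clock (IPC-style) gain.

    raw_gain is a fraction (0.24 means the new chip scored 24% higher).
    Assumes the benchmark score scales linearly with clock speed.
    """
    score_ratio = 1.0 + raw_gain
    clock_ratio = new_clock_ghz / old_clock_ghz
    return score_ratio / clock_ratio - 1.0


if __name__ == "__main__":
    # Base clocks from the thread: Kaveri ES 3.5 GHz, A10-5700 3.4 GHz.
    # The 24% raw gain is the multicore figure claimed earlier in the thread;
    # substitute whatever raw gain you computed yourself.
    gain = per_clock_gain(0.24, new_clock_ghz=3.5, old_clock_ghz=3.4)
    print(f"per-clock gain: {gain:.1%}")
```

The higher the new chip's clock relative to the old one, the more of the raw gain is explained by frequency alone, so the per-clock figure always comes out below the raw one in this case.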

I remember there was some information about tweaking the cache, but without concrete numbers.

They likely decreased the latency somewhat; I'd imagine that's where a LOT of the IPC gains are coming from. For memory benchmarks though, which use memory transfers too large to fit in the cache, this shouldn't be having an effect on performance.
 
Don't understand the comparison to the Bulldozer launch. Everything AMD did wrong on Bulldozer, they are getting right so far with Steamroller.

No overhyped marketing. AMD has been marketing it, but not excessively so, as with Bulldozer.

By releasing the mainstream part first they allow the APU to hide any first revision issues. This hits the bulk of the pc market and allows any launch issues, ala Barcelona and Bulldozer, to be addressed before enthusiast parts are launched.

I think this is exactly what they should be doing.
 


Gaming results are the most interesting...well, not really, because I've been saying this for years now:

assassins-creed-iv-percentile2.png


assassins-creed-iv-fps3.png


Huh, must be the last gen version. Surely BF4 and Frostbite will favor AMD...

battlefield-4-fps2.png


battlefield-4-percentile2.png


Uhh...OK. But surely it will choke to death in Crysis 3!

crysis-3-percentile2.png


crysis-3-fps2.png


Damn it, this isn't working. Metro maybe?

metro-last-light-fps2.png


metro-last-light-percentile2.png


Ok, let's try an AMD sponsored game this time!

tomb-raider-percentile2.png


tomb-raider-fps2.png


...

CPU wise, it is now impossible to recommend any AMD CPU at any price point.

Good thing AMD has the lead when using the IGP...

assassins-creed-iv-fps4.png


battlefield-4-fps3.png


crysis-3-fps3.png


metro-last-light-fps3.png


tomb-raider-fps3.png


----------------------------------

AMD loses across the board. If these numbers are real, AMD has a massive, massive problem performance wise.
 