Overclockability on these new APUs is subject to skepticism at this point. I have a feeling we will see Intel-esque variability on a wide scale (i.e. the odds of hitting the silicon lottery with the new APUs will be significantly lower than they are with Richland).
First, it was fear about AMD going bulk, then it was skepticism about clocks (your why-did-they-only-test-1.8GHz argument). Now that we know it clocks at a high 3.7GHz, you have bad feelings about OC. What will be next?
Well, let's break that down shall we...?
1. BECAUSE AMD WENT BULK the clocks are not even what they achieved in Trinity (3.8 GHz + Turbo) much less Richland. Which means I was right, and my assessment was correct. Clock speeds decreased in Kaveri. I told you they would not even break 4.0 GHz in stock configuration, and here I am proven right.
2. Why would AMD not at least put the flagship part at the SAME clockspeed as the flagship Richland unless they had difficulties with performance? If AMD is not getting the performance they want, then that means the headroom must be quite a bit less for them to be at a 400 MHz clockspeed disadvantage to start. AMD has historically pushed clockspeeds to the limits, this time we see a regression. The OC tests will tell the tale.
Why would they lose 9% clockspeed unless they had to? Especially when the architecture gains are roughly ~20% from what we know. That means flagship Kaveri is only a ~11% improvement over Richland as near as we can tell currently. It may end up being less once the cherry picked benchmarks are tested against the more rigorous benchmarks for CPU performance.
In short...they're losing HALF their performance gain by giving up that much clockspeed. THAT is what going to bulk gets you.
Had it been FD-SOI, then it would really be a flat out ~20% gain we would be testing against other benchmarks to determine the range of improvement. As it sits, this generation is likely to see something on the order of a 1-15% improvement because they gave up too much clockspeed going bulk...all when it could have been 10-25% improvement had they been able to keep the clockspeed and make the uarch improvements.
Who was wrong and who was right?
This generation, in tests where flagship Kaveri would really be ~10% better per clock than Richland, it will end up only ~1% faster due to the loss of clockspeed.
They should have gone FD-SOI, as I said all along.
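A quick back-of-the-envelope of that trade-off (a minimal sketch in Python; the ~20% IPC gain and the 4.1GHz vs 3.7GHz flagship clocks are the assumptions argued in this thread, not measured numbers):

```python
# Rough model: net per-core speedup ~= (1 + IPC gain) * (new clock / old clock).
# All inputs are the assumptions from this thread, not benchmark results.
richland_clock = 4.1   # GHz, flagship Richland base clock
kaveri_clock   = 3.7   # GHz, flagship Kaveri base clock
ipc_gain       = 0.20  # claimed ~20% IPC uplift for Steamroller

net = (1 + ipc_gain) * (kaveri_clock / richland_clock)
print(f"net per-core speedup vs Richland: {net:.2f}x ({(net - 1) * 100:.0f}%)")
# ~1.08x, i.e. roughly the ~10% net gain argued above; with only a 10% IPC
# gain the clock deficit would eat nearly the whole improvement.
```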
1. The 3.7GHz virtually matches the 3.8GHz, and we don't know the turbo yet. Yes, we expected a small decrease in clock speeds, but all the fear and negative claims were about it being a very low-clocked part. I recall people saying it would be clocked at 2.6GHz. I recall an Italian site, mentioned here, saying it was 2.9GHz, and so on. I recall someone saying that 3.5GHz couldn't be surpassed because that was the maximum frequency of Intel chips on bulk... I also recall your bad feelings.
The fact that they're not comparing stock-clocked parts is a bit disconcerting. I feel that bodes poorly for what future clockspeeds will be.
The truth is that they have managed to maintain high frequencies without SOI.
2. I can imagine many reasons. First, Richland is made on a mature 32nm process, tweaked over years. Kaveri is the first product released by GloFo on their new 28nm process; given more years, they could no doubt offer higher frequencies. Second, Kaveri has a lower TDP constraint than Trinity/Richland: Kaveri is a 95W-or-less APU. Third, we don't know if AMD is following Intel's approach of lower base clocks and more aggressive turbos. Fourth, Kaveri has a bigger GPU that consumes much more power, so the CPU cannot be clocked as high as it could be with a small GPU if the total TDP is 95W or less.
This will not break OC records, but we have heard rumors that it can OC up to 4.5GHz.
Had they gone for FD-SOI, they would have had to pay more; split resources between two processes (one for dGPUs, another for iGPUs; one for Steamroller, another for Jaguar); follow a dead end (no evolution beyond 20nm at GloFo, while everyone else moves to FinFETs); be delayed (GloFo does not have 28nm SOI ready) and then compete against Broadwell instead of Haswell; and be locked into GloFo (one of the weaker foundries at the moment)...
The move to bulk was smart and means they pay less; concentrate resources on a single 28nm bulk process for all products; follow a rapidly evolving roadmap (20nm bulk for 2015 and then 14nm FinFETs); sell the first Kaveri chips now and prepare its Carrizo successor for Broadwell's arrival by 2015; and stay open to more reliable and aggressive foundries such as TSMC, which has the 16nm node almost ready and is already taping out chips on 10nm FinFET.
A part that caught my attention was how AMD would pair an APU with a dGPU. Previously in this thread two options were discussed. In one, the iGPU was devoted entirely to compute and the dGPU to graphics. In the second option the iGPU and the dGPU would work in tandem, doing compute or graphics or both. This is from the talk:
The application also acquires the ability to take control of the multi-GPU setup and to decide where to run each command issued. That is why AMD has provided in Mantle access to the CrossFire compositing engine, data transfers between GPUs, etc. This will allow multi-GPU modes that go beyond AFR and adapt better, for example, to the use of GPU compute in games or to asymmetric multi-GPU systems, as is the case for an APU combined with a dGPU. For example, it is possible to imagine the dGPU handling the rendering load while the APU handles the post-processing.
Which is exactly where I've been saying things were moving toward, and how I see this all being applied: The APU handles Physics/Compute/OpenCL, while the dGPU focuses on graphics.
But it is the other possibility ;-) "The application also acquires the ability to take control of the multi-GPU setup and to decide where to run each command issued". Therefore both iGPU and dGPU work in tandem, as I said, and the workload is split dynamically. In some cases it will make sense to use the iGPU for compute and the dGPU for graphics, in other cases it will make sense for both the iGPU and dGPU to execute graphics work, and in other cases part of the dGPU can assist the iGPU with compute...
Which is exactly how I DON'T want HSA implemented, because this will never work in the real world.
Remember, first and foremost, you cannot assume that no other application has access to the hardware you are trying to use. If you are not VERY careful, you can easily tank performance in a multi-application environment compared to what it would be if you stuck with a more traditional approach.
Secondly, you will run into problems in regards to future scalability. Today, if you had a GTX 610, you would obviously want the APU doing rendering, and most of the compute work going to the dGPU, based on how fast each one is. But how about the GTX 910; does that assumption still hold? You run into the very dangerous situation where the assumptions you make about the hardware are no longer true in a year or two, which results in your very fine tuned model costing possibly significant performance.
Frankly, what AMD should have done was get MSFT to modify their WDDM model to allow separate devices to be selected as the default GPU device, and default compute device. That solves the problem with having to do this application side, and forcing the developers to make guesses about the state of the hardware.
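For illustration only, here is a minimal sketch of the kind of application-side device selection being criticized above, written with pyopencl (my assumption; nothing here is an AMD or Microsoft API). The hard-coded policy at the end is exactly the sort of guess that can go stale as hardware changes:

```python
# Hypothetical sketch: pick a "render" device and a "compute" device with a
# crude heuristic. The surrounding argument is that any such baked-in guess
# can become wrong on future hardware.
import pyopencl as cl

gpus = [d for p in cl.get_platforms()
          for d in p.get_devices(device_type=cl.device_type.GPU)]
if not gpus:
    raise RuntimeError("no OpenCL GPU devices found")

def is_integrated(dev):
    # Unified host memory is a reasonable (but not guaranteed) hint that the
    # device is an iGPU/APU rather than a discrete card.
    return bool(getattr(dev, "host_unified_memory", False))

igpus = [d for d in gpus if is_integrated(d)]
dgpus = [d for d in gpus if not is_integrated(d)]

# Naive policy from the discussion above: dGPU renders, iGPU does compute.
render_dev  = (dgpus or gpus)[0]
compute_dev = (igpus or gpus)[0]
print("render on :", render_dev.name)
print("compute on:", compute_dev.name)
```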
Oh! So Mantle doesn't actually improve game performance, fps, etc.; it just makes games easier for developers to make.
Still good though.
No. MANTLE improves game performance by eliminating the overhead of the Microsoft DirectX API.
In consoles, avoiding the DirectX overhead allows for about 2x more performance on the same hardware. I don't think we will see that on the PC, but a 30-50% increase in performance seems feasible.
DICE developers have said that the Radeon + MANTLE enabled version of BF4 will ridicule the Nvidia Titan. We will see.
As I've said before, DX11 doesn't have nearly as much overhead as DX9. You are losing, worst case, ~10% maximum performance in non-CPU-bottlenecked situations.
Secondly, when developing for consoles you can make assumptions about the hardware that you can NOT make on PCs. If you are not careful, fine-tuning too much can tank performance on PCs, because you, the developer, do not have final say on when your application even runs, let alone where resources are allocated or when your worker thread actually starts. There's a reason we went away from cooperatively scheduled threads.
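To make the disagreement concrete, here is a toy frame-time model (Python, with entirely made-up numbers) showing why lower per-draw-call API overhead only matters much once the CPU side is the bottleneck:

```python
# Toy model: CPU frame cost = game logic + draw_calls * per-draw API overhead.
# Frame rate is limited by whichever side (CPU or GPU) is slower.
# All numbers are invented for illustration, not measurements of any API.
def fps(draw_calls, api_overhead_us, game_logic_ms=6.0, gpu_frame_ms=12.0):
    cpu_ms = game_logic_ms + draw_calls * api_overhead_us / 1000.0
    return 1000.0 / max(cpu_ms, gpu_frame_ms)

for label, overhead_us in [("high-overhead API", 20.0),
                           ("batched calls",      8.0),
                           ("thin API",           2.0)]:
    print(f"{label:17s}  200 draws: {fps(200, overhead_us):5.1f} fps   "
          f"10k draws: {fps(10000, overhead_us):5.1f} fps")
# With few draw calls every API is GPU-bound and performs the same; with many
# draw calls the CPU becomes the bottleneck and a thinner API wins big, which
# is where the conflicting overhead figures in this thread come from.
```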
Disagreeing is fine, but his writing "to save their ass", "a crappy cpu", "and an idle IGP", "weak ass cores"... clearly denotes hate.
Especially when he is plain wrong. As shown in my article about Kaveri, a 3.7GHz SR CPU performs like a SB/IB i5 in ordinary CPU workloads, losing in the FP-intensive ones but outperforming in the integer workloads. There I assumed 20% IPC over PD, but some late leaks suggest that the final improvement is >30%. So add to the scores I published if the leak is true.
you want to claim I am wrong yet you only offer your opinion. APU cores are weak, always have been.
the lowly fx-4100 is 46% faster than the A10 5700 and clocked slower. The fx-4320 is 58% faster, and the fx-8350 and the i5 3470 are 73% faster. good luck catching that i5.
This is the same architecture. I'd call that a weak ass core any day.
now tell me using these RL results how removing the L3 cache in favor of making the cpu weaker and adding an IGP is such a great thing to ever happen to the computing industry?
How is kaveri going to catch the I5? 8350? ... heck even the 4320 ...
I'd say the a10 5700 is a pretty good spot to start with for Kaveri figures since they both clock at 3.7 ghz. As you put it, "in ordinary cpu workloads" shown above, it's starting at negative 40% already just to get to the 4320.
Kaveri is not going to be the answer you're hoping it will be. It will not catch the i5 3470 very often (if ever); it may here and there, but that will take some luck "in ordinary cpu workloads".
Tell me just exactly how I am "just plain wrong"?
I already said this to you. Also, you continue using essentially the same logic as in your previous attack on the AMD ARM line.
Here you pick very old benchmarks (not optimized for AMD architecture; one of them is an Nvidia-sponsored game) and an old 5700 Trinity APU, and you claim this is how Kaveri will perform.
You miss the estimation of Kaveri CPU performance in the BSN* article, you miss the leaked benchmarks comparing Kaveri to Bulldozer and Piledriver FX, and you miss the BF4 benchmark given by AMD during the October talk, where a Richland APU got 98% of the performance of an FX-6350 and 96% of the performance of the FX-8350 (all three using an R9 280X and playing at 1080p ultra). I think you also continue missing the subsequent discussion on multiplayer BF4. And you miss that AMD is in the consoles now, with a CPU based on Jaguar cores.
Can you compare the performance of the PS4 CPU to Kaveri CPU? I can.
Maybe you still don't understand this, but game developers will be offloading the consoles' CPUs and running the heavy computations on the consoles' GPUs. That is why both consoles have GPGPU abilities and HSA support.
You also miss that MANTLE aims to remove some CPU bottlenecks that exist in current gaming technology. This is from the Oxide talk at APU13:
Mantle Unleashed: How Mantle changes the fundamentals of what is possible on a PC. Over the last 5 years, GPUs have become so fast that it has become increasingly difficult for the CPU to utilize them. Developers expend considerable effort reducing CPU overhead and often are forced to make compromises to fully utilize the GPU. This talk will discuss real-world results on how Mantle enables game engines to fully and efficiently utilize all the cores on the CPU, and how its efficient architecture can eliminate the problems of being CPU bound once and for all.
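Purely as an illustration of the "offload the heavy computations to the GPU" point a couple of paragraphs up, here is a minimal GPGPU sketch (OpenCL via pyopencl, my assumption; console titles use their own platform APIs, and this is not HSA):

```python
# Minimal offload sketch: square a large array on whatever OpenCL device is
# available instead of on the CPU. Illustrative only.
import numpy as np
import pyopencl as cl

a = np.random.rand(1_000_000).astype(np.float32)

ctx = cl.create_some_context()   # picks an available OpenCL device
queue = cl.CommandQueue(ctx)
prog = cl.Program(ctx, """
__kernel void square(__global const float *src, __global float *dst) {
    int i = get_global_id(0);
    dst[i] = src[i] * src[i];
}
""").build()

mf = cl.mem_flags
a_buf   = cl.Buffer(ctx, mf.READ_ONLY | mf.COPY_HOST_PTR, hostbuf=a)
out_buf = cl.Buffer(ctx, mf.WRITE_ONLY, a.nbytes)

prog.square(queue, a.shape, None, a_buf, out_buf)

result = np.empty_like(a)
cl.enqueue_copy(queue, result, out_buf)
assert np.allclose(result, a * a)
```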
get over yourself. Very old benchmarks? I specifically looked for games released in 2013 ... ermago ... thats soo friggin old.
a10 5700 = piledriver without l3 cache
4320 = piledriver fx
8350 = piledriver fx
what's so old about the a10 5700? all that richland brought was higher clock speeds over trinity, and kaveri brought lower clock speeds. you can't compare the 4.2 ghz richland to the 3.7 ghz kaveri and say it's going to be 30% faster on top of being clocked slower. 3.7 ghz piledriver vs 3.7 ghz kaveri is a fair comparison.
If anything I'd describe AMD's APU as a gpu with an integrated cpu, while Intel is making cpus with an integrated gpu.
The reason you're stuck on this "BF4 ERMAGO BENCHMARK" is because it doesn't stress the cpu AT ALL. It's a gpu-bound benchmark. That's why it was chosen; it's a tactic called marketing.
As for the BSN article, it was pulled from your website.
Agreed that "very old" was an exaggeration on my part, but it was needed to compensate for your exaggerated attack on the Kaveri APU (I mean that one where you wrote "ass", "crappy", and "ass").
In the same paragraph I also wrote what I meant by "old" (= "not optimized for AMD architecture"). Sorry, but no, I was not referring to whether it was released in early 2013 or in late 2009.
I am not discussing marketing, you are. I am discussing the technical and economic details behind a company's marketing and execution plans. I am explaining to you that the whole master plan (of which I have given you some elements) is to offload the CPU more and more. In case you don't get it: offloading the CPU means you push more work onto the GPU. Therefore a GPU-bound benchmark is more characteristic of how next-gen games will behave than old Intel+Nvidia games.
I note how you avoided my question about the jaguar-based CPU in the consoles...
noob2222 :
juanrga :
Ranth :
Juan, we do agree that Richland and Piledriver are basically the same, except for power management and the like, right? What does the Richland APU have that the FX doesn't? And specifically, what is it that makes the APU the only one receiving performance increases?
The six factors: except for the Piledriver -> Steamroller/HSA one (which won't help in single-threaded), doesn't everything else apply to FX and older APUs too..?
The APU lacks L3 cache. I don't know if there are other differences which AMD has not disclosed. What I know is what follows.
The BF4 benchmark given at the October talk to OEMs shows a Richland APU performing as well as FX-6350 and FX-8350. The APU gives a 98% and 96% of the performance of each FX respectively.
The 'old' argument that FX is 30-50% faster because it has L3 cache doesn't apply here.
The argument that the Suez map is single player and not well multithreaded doesn't mean anything, because an FX-4350 will not be 30-50% faster than the FX-6350 and FX-8350.
meaningless ... really ...
ya, ... multiplayer really favors the APU core in the 750k doesn't it.
Just to clarify,
750k = piledriver without l3 cache and no IGP.
4350 = piledriver fx
6300 = piledriver fx
8350 = piledriver fx
let's compare the 4.5 ghz piledriver 750k to the 4.7 ghz fx. That only accounts for a 4.5% difference in clock speed; assuming perfect scaling (which it's not), subtract 4.5% from the results.
4350 = 18% faster than the l3 cacheless APU core
6300 = 47% faster than the l3 cacheless APU core
8350 = 76% faster than the l3 cacheless APU core
How many times do you have to be proven wrong? The single player benchmark is meaningless because it doesn't stress the cpu AT ALL. It's strictly a gpu benchmark until you drop below 3.0 ghz.
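For what it's worth, the clock-speed correction being applied in that post is just a ratio; a tiny sketch (Python) using the figures quoted above as inputs:

```python
# Normalize the quoted leads for the 750K's 4.5GHz vs the FX chips' 4.7GHz.
# Assumes perfect clock scaling, which (as noted above) it is not.
clock_ratio = 4.7 / 4.5   # ~1.044, i.e. a ~4.4% clock edge for the FX parts
quoted_lead = {"FX-4350": 0.18, "FX-6300": 0.47, "FX-8350": 0.76}

for chip, lead in quoted_lead.items():
    # Divide out the clock advantage to estimate the lead at equal clocks.
    per_clock = (1 + lead) / clock_ratio - 1
    print(f"{chip}: quoted +{lead:.0%}, roughly +{per_clock:.0%} at equal clocks")
```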
More of the same. Here you pick again the same benchmark that I commented on before in a reply to you, when you linked it the first time. And then you repeat the same thing without reading what was said to you. So typical.
And you always change the story. You stated that Kaveri will equal an i5 in ordinary cpu workloads. Now that you've been proven wrong, it's only in apu-specialized software.
You may be happy with weaker cores coupled with a software solution; it will not go over well when the reviews come out.
And unsurprisingly you come with the same misunderstanding again, one that I have corrected again and again...
In my BSN* article, which evidently you didn't read, I show how Kaveri performs like an i5 using ordinary cpu workloads. I am not using "apu specialized software" as you claim.
Finally, I am happy with the idea of software using my hardware. All my hardware (or most of it), and not only one half or one third of it. There is another person here with similar ideas to mine. He compiled a program for his hardware and now it runs 2x faster; said another way, the former unoptimized program was using only 50% of the performance of his hardware.
You seem to prefer the other way: if the software only uses 50% of the hardware, you upgrade to hardware that is 2x more powerful so that the same software can now ignore 50% of even more.
I mentioned this elsewhere, but the software side of things are very good for AMD right now in regards to gaming.
Remember when FX 8350 first launched? You'd have to dig forever to find good FX 8350 benchmarks and the AMD guys were floating the same few benchmarks which showed it doing well over and over again while there was a bombardment of Skyrim, Shogun 2, Starcraft 2, Warcraft, etc.
Now look at how things have changed. GamerK is trying to disprove the FX and the best he can do is post a few niche examples of modern games that don't run well on FX.
I do think this is why AMD is not in a rush to push SR on HEDT. PD has gone from trailing Intel 3570k by 30% in reviews in all the gaming benchmarks to beating 3770k. Nothing on the chip has changed.
AMD simply has absolutely no reason to even release SR on HEDT. Why release a chip that is 15% faster when you just got a 30% improvement out of software increases by nipping the whole "game developers optimizing for Intel" thing in the bud?
After watching the APU13 keynote on Mantle it's quite clear that AMD is designing Mantle to not only scale where GPU on APU does some other calculation besides rendering, but that this will scale between TWO dGPUs.
Watch the whole thing, I think Johan screwed up. He said you'd see a Mantle situation where one GPU calculated global illumination whilst one just rendered the scene.
Do you know what's missing from that equation?
An APU.
And I hate to beat a dead horse, but it's quite clear AM3+ is completely incapable of HSA.
Yet at the same time running two dGPUs on APU platform is a complete waste because of the PCIe lanes available.
So why talk about Mantle using two dGPUs? AMD doesn't have a publicly available platform that supports everything required for Mantle to use two dGPUs.
Johan dun goofed.
When did Crysis 3 and BF4 become niche examples again?
The ONLY FX that is competitive is the FX-8350; the others, even the lower clocked 8 core variants (8320, etc.) all lag behind IB i5's. The weak individual cores hobble the architecture, even in games that scale well, like BF4 and Crysis 3.
The same trend exists: the FX-8350 still lags behind the i5-3570k, and even the FX-6350 comes in just ahead of the i3 lineup, trailing the i5's by a decent margin. Even the 8 core FX-8320 can't match even the cheapest i5, the 3350p, in BF4.
Hence my point: FX's architecture deficiencies are hidden in part due to clock speed. You typically see, at max OC, Intel pulling farther ahead. You also see the 4-core FX being noncompetitive, and even the 6-core being one step up from the i3-lineup.
Further, the FX-8350, in both Crysis 3 and BF4, lost to the i5-3570k, which comes in just $10 more expensive. And at max OC, the i5-3570k pulls farther ahead (except in the case of a GPU bottleneck being reached, as seen in the second image). So I'd have to recommend the i5-3570k over the FX-8350 in EVERY instance right now. The 6350 is attractive for its price (~$140), but you accept sub-i5 performance by going that route.
That is not the sign of a good architecture, that with a clock speed edge and DOUBLE the cores, you still lag in performance, even in titles that use more than a dozen threads. The only time FX matches high end IB chips is when a GPU bottleneck suppresses the results.
anyone know how they did 2x the performance per watt while still being on 28nm? They should be competitive with baytrail with numbers like this if true.
noob2222 :
esrever :
anyone know how they did 2x the performance per watt while still being on 28nm? They should be competitive with baytrail with numbers like this if true.
Wow , that's just strange. Beema and Mullins weren't canceled to make way for arm cpus? Wonder who said they would be.
the problem with that statement is that amd compared the 15 watt beema to the 25 watt kabini, which we know is the less efficient kabini chip. if amd compared it to the 15 watt kabini (a4 5000), or 25 watt beema vs 25 watt kabini, then we would have a better picture of the improvement. is there a pcmark 8 home result for the a4 5000?
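That objection is easy to put in numbers; a small sketch (Python, with purely hypothetical scores) showing how the choice of baseline TDP changes a "performance per watt" multiplier:

```python
# Hypothetical scores invented purely to show the arithmetic: comparing a 15W
# part against a 25W baseline inflates the perf/W ratio versus a same-TDP
# comparison. None of these numbers are real benchmark results.
scores = {
    "Kabini 25W": 2300,   # hypothetical
    "Kabini 15W": 2000,   # hypothetical (A4-5000 class)
    "Beema 15W":  2400,   # hypothetical
}

def perf_per_watt(name, watts):
    return scores[name] / watts

beema = perf_per_watt("Beema 15W", 15)
print(f"vs 25W Kabini: {beema / perf_per_watt('Kabini 25W', 25):.2f}x perf/W")
print(f"vs 15W Kabini: {beema / perf_per_watt('Kabini 15W', 15):.2f}x perf/W")
# Same Beema numbers, very different headline multiplier depending on baseline.
```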
But Juan, that is a single case out of many. Okay, so in BF4 it won't be 50%, but what about the million other applications? (Which, btw, I don't think is right either; 10-25% is more likely.)
"It's not reflected" in a canned benchmark by AMD, who is trying to sell APUs. Would it not be odd if AMD chose a benchmark where the Richland would perform poorly? That would be terrible marketing. And before you go claim that "this is how it's going to be", you have to have more of the picture. Meaning more than one benchmark.
Of course it is only a single case, but I recall that when I started my discussion of this, I clearly said that AMD was taking it as a "prototype" for "next gen" games.
Yes, it is a benchmark picked by AMD, who is trying to sell APUs at the expense of the FX chips. They could have compared Richland to an Intel chip, but they compared it to their own FX chips. The FX-4350 was not even mentioned.
Of course, look at the die sizes on the APUs...where do you think the margins are better? That 315mm^2 die on the FX series makes them money, but APUs selling for ~70% of the cost, with 60% of the die size makes tons of sense. They can get better yields out of the product because they don't need the big dies like they have in FX.
Now, that of course precludes the fact that FX is still a better CPU when you need the raw horsepower, and so I anticipate they will still sell them by the truck load to boutique builders and DIY PC builders.
You talked about FX 8 cores being 0.4% of the Steam hardware survey, I think it was, and when you consider that there are 50 million people on Steam, most of them on mobile solutions, I would think you would see something along those lines...that's still 200k machines with FX 8 cores using Steam. Considering that there are many people who do not have Steam (myself included), I think that the prediction those numbers give is likely only 10-15% of the actual number of people running such systems in the US. You are also neglecting the very prevalent productivity types with that figure, and I think that plays a large part in the small representation of that sample.
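The arithmetic behind that estimate, spelled out (Python; the 0.4% share and ~50M Steam users are the figures quoted above, and the 10-15% coverage guess is the poster's own assumption):

```python
# Back-of-the-envelope from the figures quoted above.
steam_users  = 50_000_000
fx8_share    = 0.004                  # 0.4% in the hardware survey
fx8_on_steam = steam_users * fx8_share
print(f"FX 8-core machines on Steam: {fx8_on_steam:,.0f}")   # 200,000

# If Steam only captures 10-15% of such systems (the poster's guess):
for coverage in (0.10, 0.15):
    print(f"implied total at {coverage:.0%} coverage: {fx8_on_steam / coverage:,.0f}")
```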
If the demand for FX-8350 CPUs were higher than it is, AMD's margin would be better. Look at the 9000 series. They were overpriced, nobody was purchasing them, and recently their prices dropped by giant amounts. It is always the same: lack of demand.
I consider that the Steam statistics must be pretty accurate from my knowledge of the local market, but look at AMD's own numbers: the FX 8-core represents about 2% of the total revenue generated by both APUs and CPUs. Now try to compute its percentage of AMD's total revenue.
8350rocks :
juanrga :
de5_Roy :
then he kept repeatedly bringing it up saying that benchmark is the reason amd cancelled higher core SR-FX cpus and apparently that's what amd 'told' oem representatives at that event (even thought he failed to provide an audio transcript after repeated asking).
This is pure and simply false (*). The reasons why AMD is not releasing SR FX line are multiple and I explained them here before: transition to APU, reorganization of server plans, and lack of demand. I gave details and further explanations for each one.
What I said about that October talk slide was that it clearly reflects AMD's plans about migrating to an APU strategy, and guess what: Lisa Su (AMD vice president) confirmed my thoughts this week during the opening keynote:
Lisa Su, Senior VP & GM of Global Business Units at AMD, delivered the opening keynote and the message was clear: AMD is positioning its Accelerated Processing Units (APUs) -- which combine traditional multi-core CPUs and a discrete multi-core graphics processing unit on a single chip -- to dominate the market from smartphones to servers.
(*) Like when you said that I never wrote the PFs for the slide but I did in one of my posts in this thread.
LOL...When did AMD say, specifically, that they were not putting out a new FX successor? Without you reading between the lines and interpreting...
I will have word from AMD about future FX successor within the next week. So don't go counting chickens before they hatch again...
And where in the AMD vice president's words does it say that they will release an FX successor?
What if I already have in my hands the official desktop roadmap for 2014?
8350rocks :
juanrga :
palladin9479 :
Anyone attempting to argue the new APU's will be remotely close to a high end dGPU is deluded. That's just not how CPU's work. The highest end APU will be about equal to the low end dCPU's but with a moderately good GPU bolted on. The entire purpose of an APU isn't to replace dCPU's, it's to do budget / compact computing: SFF and similar uses where power and space are limited and going with an all-inclusive solution works out the best.
Anyone attempting to argue that the entire purpose of an APU is to satisfy the budget / compact computing market is deluded.
AMD APUs have been competing as a low-cost alternative in much of the consumer market because the APU concept had not yet been completely developed. Kaveri is the first APU that fulfills that 'ancient' AMD dream that was born with the acquisition of ATI.
It has still not been completely developed for two basic reasons: memory bandwidth and software. But the second is a consequence of the first. Once the memory bandwidth problem is solved with the almost-ready stacked memory, we will see the birth of ultra-high-performance APUs.
P.S.: the bandwidth provided by this new RAM technology is superior to that of the L3 cache in FX chips.
HMC is superior to even GDDR5; however, that technology is years away from widespread adoption in consumer PCs. Sure, supercomputers costing millions and millions of dollars use it...however, there isn't even a standard for a way to integrate it into the motherboard yet...
I didn't mean this to be happening tomorrow. Kaveri is a 2014 product and uses DDR3. Its successor, Carrizo, comes in 2015 and uses DDR3/4. The new high-bandwidth memory technology is something for 2016/17. Then we will start to see ultra-high-end performance APUs (which will not be mainstream). Nvidia is developing an APU with cube memory that is projected to give 10x the performance of the GTX Titan.
If that was the case, we would have already seen the slide for it 1000x because you would be trying to convince us that there is no dCPU coming, and that would be your "holy grail" of evidence. However, I am more willing to bet you have the official 2014 Desktop APU roadmap that everyone else has...
@logain: you don't have to be a native citizen to find a website unreliable. i am not defending pclab.pl; multiplayer games are very hard to consistently benchmark anyway. but.. at least they did it, and used the dice-recommended o.s. their test hardware was more well-rounded than the other site that benched bf4 mp and crysis 3 - gamegpu.ru.
....
now what if some russian poster said that they considered gamegpu.ru unreliable?
Other members from Poland have labeled that site as untrustworthy for results.
And I know a few Russian gamers on other forums who claim the same about GameGPU. I'm sure "biased" in both groups' minds means "doesn't agree with my pre-conceived result set".
Here's the main issue here: I've got very limited samples to choose from here. And the only two that bench significant numbers of CPU's conflict, come from foreign sources (PcLab and GameGPU), and most annoyingly, don't match up with any other set of benchmarks I can find.
Toms and PcLabs match in terms of ordering (2500k ahead of FX-8350), but Toms didn't bench nearly enough CPU's to validate PcLabs' entire result set. The same thing occurred with the Crysis 3 results (and I noted it at the time), in that GameGPU tended to place the FX-8350 and other FX chips higher than other sites (Toms, etc). So we get the situation where no one's result sets agree, and NO ONE is investigating why that's happening.
Point being, until some other site gets serious about benching CPU's, we're stuck with two conflicting result sets.
One thing I AM finding though, significant in the case of BF4, is that Win7.1 and Win8 results tend to differ, significantly in some cases. So we may have some OS effects in this case, which could explain everything. But I doubt any site is going to investigate that issue...
Point being, since the ordering with PcLab tends to agree with Toms (as far as ordering is concerned), I'm sticking with them unless someone can prove the results aren't legit. And no, saying "the results aren't legit" doesn't count by itself.
AMD's 2014 product roadmap will release shortly if not today...
Piledriver FX is going to be the HEDT line through 2014 until further notice. AMD concedes Kaveri is not going to replace FX in HEDT, and they know the segment is crucial for gaming, and it's the only segment growing at this time in DT PC.
AMD expects to see the PD arch gain performance boosts in several facets and remain mostly competitive at the price points they have established through software optimization and other means.
The next HEDT platform already has an internal codename, though he could/would not tell me what it was. There were no promises that we would see the HEDT successor in SR arch, it may not be until excavator that they move forward. The issue is simply that they will not do HEDT on bulk given the way things went with Kaveri. If they had been able to completely meet/exceed the design goals on bulk with Kaveri they would have moved forward with HEDT. However, he expressed the sentiment that they regretted being forced to downgrade their process to bulk and that they know they lost performance to release the APUs in a timely manner.
Next HEDT line may come on 28nm or 20nm, he wasn't clear there, but he was certain it would not be bulk, and felt like they would move back toward FD-SOI if GF could get their "issues" sorted out for 28/20nm SHP. He could not tell me if the next iteration of their HEDT line would be a FM2+/FM3 product, or if they would in fact move forward with AM4 when DDR4 arrives, though he expects that the timing of DDR4 and the new HEDT platform would be pretty close. Supposedly, there is a new IMC in the works for next gen HEDT hardware (to coincide with DDR4).
if amd is so fab-bound and using bulk substrate, it makes sense for them to wait for fd-soi at 20nm and include ddr4 support to move on to an entirely new socket and new 'big' core for hedt. one catch is that current info shows st micro licensing their fd-soi booster tech for 28 and 14nm (xm version) only.
Again, if only AMD could invest in its own fabs, rather than waiting for GloFo to handle its own business...
They hit their release targets more reliably before they spun off GF.
I agree with the sentiment...though at the point where they are, launching a private fab would be near disastrous to their bottom line. They would need to be at a point where they could offset the startup losses from the fab end with product revenue. It will be interesting to see if they go that route down the road.
The XM version is really a hybrid, using 14nm front end and 20nm back end with a gate last approach.
My point was that it was a mistake to spin GF in the first place.
Well, you could argue that it was...
However, you could also argue that it wasn't...since GF has lost significant money since it was spun off, where would AMD be right now? Hemorrhaging cash right now...that's where. From a business perspective it made sense in some ways, and not in others. I think all we are doing is arm chair quarterbacking at this point.
Which is exactly my point. You keep referring to the same website and the same benchmarks to make your point.
When FX 8350 came out, you could find games FX sucked at all over the place.
Going back in time to see FX 8350 reviews, here are the results in google and the games tested
1. Tom's hardware: BF3, Skyrim, Warcraft
2. PCGamer: Shogun 2
3. PCMag: no benchmarks
4. Amazon: no benchmarks
5. TechReport: Skyrim, Batman, BF3, Crysis 2
6. Tom's hardware with the same gaming benchmarks again
7. Engadget roundup
8. Newegg link
9. Anandtech: Skyrim, Diablo 3, Dragon Age, Dawn of War, WoW, SC2.
My point is that to show FX sucked at gaming when FX first came out, there were plentiful resources and pages upon pages of benchmarks showing that FX was behind the competition. BF3 and Crysis 2 were really the only strong games for it back then. Basically, just a handful of outliers.
Now you are stuck posting the same benchmarks from the same website saying "see, nothing has changed!!!!!" If multi-core is not catching on and it's impossible for it to catch on, I want to see pages of google results of modern games where FX is still significantly behind.
GamerK you really need to grow up. The old 10ghz Nehalem died a long time ago and we're never going to see anything like that.
Intel can no longer squeeze more single core performance out of their CPUs. They are even going to add "MOAR COARS" with Haswell-E. AMD has just finally caught up to older Intels in IPC with SR (going by rumors).
There is a big problem here that chips can no longer scale in the ways they previously have, and that we have to do something else to make a difference.
The path you are suggesting, sticking to single-core rendering and not looking for alternatives, leads to a world where there's no longer a reason to upgrade your CPU, because Intel can't make anything faster and AMD is still catching up in single thread.
Tell me how well that works out for AMD and Intel when they need to sell CPUs? You are so backwards thinking it makes my brain hurt. I can imagine you as a crotchety old man sitting in front of his computer going "these stupid kids and their Pentium 1s, it all went downhill after the 486, I wish I could go back to the golden age of computing!"
The fact that it is Linux/OSX native is the biggest boon. Now you could do AAA titles on all 3 major OS platforms...breaking the monopoly Windows has on high end PC Gaming.
and seemingly shift the monopoly to amd. doesn't look like a good thing in long term.
Except AMD has clearly stated that they want to build a standard around it, offering it to Nvidia and Intel.
de5_Roy :
juanrga :
de5_Roy :
then he kept repeatedly bringing it up saying that benchmark is the reason amd cancelled higher core SR-FX cpus and apparently that's what amd 'told' oem representatives at that event (even thought he failed to provide an audio transcript after repeated asking).
This is pure and simply false (*). The reasons why AMD is not releasing SR FX line are multiple and I explained them here before: transition to APU, reorganization of server plans, and lack of demand. I gave details and further explanations for each one.
oh really? false, is it? here's what you said in your post, verbatim:
juanrga :
In their talk they were saying to OEMs that the new APU gives >90% of the FX-8350 performance at a fraction of the cost. That is what the slide #13 says. That doesn't look like promoting the FX line. The FX-4350 wasn't even mentioned by AMD.
I see two options: either (i) AMD will be abandoning the FX-4000 and refreshing the FX-6000/8000/9000 a la Warsaw, or (ii) AMD will drop the entire FX line completely and focus on an APU line only.
I am not assuming anything, I asked you a specific question.
but wait.. there is more, in case you have conveniently forgotten (along with the poorly executed deflection when i asked for the audio version):
juanrga :
de5_Roy :
juanrga :
In their talk they were saying to OEMs that the new APU gives >90% of the FX-8350 performance at a fraction of the cost. That is what the slide #13 says. That doesn't look like promoting the FX line. The FX-4350 wasn't even mentioned by AMD.
talk? there's an audio version? i didn't listen to that. if amd said that the new apu(6790k) gives >90% of fx8350's performance, they were intentionally being vague (>90% of what? what tasks?) to pitch 6790k. benchmarketing at play - largely irrelevant, since independent reviews will reflect the real world perf/price.
^^ bolded the relevant part.
The part that you bolded is "That is what the slide #13 says."
I am saying what message the slide gives, not that the slide is the reason for AMD's plans. You got it backwards.
The reasons why AMD is abandoning the FX line (and not releasing an improvement or refresh) were given by me before:
The reasons why AMD is not releasing SR FX line are multiple and I explained them here before: transition to APU, reorganization of server plans, and lack of demand. I gave details and further explanations for each one.
I love the idea of utilizing software, but not at the expense of only releasing low-end hardware and relying solely on the software to make up for the lack of good hardware. We need both. APUs are not high end. Period.
APUs are a little of both, and a master of none. Without the specialized software, as I have shown, Kaveri needs an 80% or greater boost to catch the low-end i5. It needs 40% to catch the 4320. This is in ordinary gaming cpu workloads.
You're the one not looking at the data I have provided and coming up with lame excuses instead of trying to convince me with hard evidence. Your website and marketing slides that are designed to only show the best possible scenario do not count as evidence. Those will only be true in a few select cases and are not representative of "ordinary workloads".
Kaveri is not "low end hardware".
And as explained before, the benchmarks given use "ordinary workloads". They are not "specialized software" for APUs, nor the "best case".
You don't understand what was made/measured in the BSN* article. I have corrected your misunderstandings again and again and again. Still you insist that those benchmarks are "specialized software" for APUs, when they are not.
anyone know how they did 2x the performance per watt while still being on 28nm? They should be competitive with baytrail with numbers like this if true.
Wow , that's just strange. Beema and Mullins weren't canceled to make way for arm cpus? Wonder who said they would be.
You? Because AMD said "jaguar servers" were replaced by "arm servers"... :lol:
from those slides, another sentence that caught my eye was "thin layer of abstraction". but how thin? amd claims it is extendable to other uarches and forward compatible. doesn't that mean it will eventually become bloated? maybe i'm being too skeptical.
Extendable to other uarches doesn't mean adding layers on top. MANTLE consists of two layers: the API plus the driver. If you substitute the MANTLE driver for GCN with another one for Intel or Nvidia, the MANTLE API will work on top of non-AMD hardware.
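Conceptually (and only conceptually; these names are invented and are not the real Mantle interfaces), that two-layer split is a thin front-end API delegating to interchangeable back-end drivers. A toy sketch in Python:

```python
# Toy illustration of "thin API + swappable driver". All names are invented.
class GpuDriver:
    """Back-end layer: one implementation per vendor/architecture."""
    def submit(self, command_buffer):
        raise NotImplementedError

class GcnDriver(GpuDriver):
    def submit(self, command_buffer):
        print("GCN driver executing", len(command_buffer), "commands")

class OtherVendorDriver(GpuDriver):
    def submit(self, command_buffer):
        print("non-AMD driver executing", len(command_buffer), "commands")

class ThinApi:
    """Front-end layer: the same application-facing calls for any driver."""
    def __init__(self, driver):
        self.driver = driver
        self.commands = []
    def draw(self, mesh):
        self.commands.append(("draw", mesh))
    def dispatch(self, kernel):
        self.commands.append(("compute", kernel))
    def flush(self):
        self.driver.submit(self.commands)
        self.commands = []

api = ThinApi(GcnDriver())   # swap in OtherVendorDriver() without touching app code
api.draw("terrain")
api.dispatch("post_fx")
api.flush()
```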
We are discussing MANTLE not HSA...
As I said before, the overhead in DX9 was ~10x or higher. In DX11 it is reduced to something like 2x, and only if you use batched calls, which reduce game richness and developer freedom.
Not sure why you repeat what I said about optimization on the PC (~30-50%) being less than in consoles (~100%).