AMD CPU speculation... and expert conjecture



1. The 3.7GHz virtually matches the 3.8GHz, and we don't know the turbo. Yes, we expected a small decrease in clock speeds, but recall all the fear and the negative claims that it would be a very low-clocked part. I recall people saying that it would be clocked at 2.6GHz. I recall an Italian site, mentioned here, saying it was 2.9GHz, and so on. I recall someone saying that 3.5GHz couldn't be surpassed because that was the maximum frequency of Intel chips on bulk... I also recall your bad feelings.



The truth is that they have managed to maintain high frequencies without SOI.

2. I can imagine many reasons. First, Richland is made on a mature 32nm process, tweaked over the years, whereas Kaveri is the first product released by Glofo on their new 28nm process. Given a few more years, that process could no doubt offer higher frequencies. Second, Kaveri has a lower TDP constraint than Trinity/Richland: Kaveri is a 95W-or-less APU. Third, we don't know if AMD is following Intel's approach of lower base clocks and more aggressive turbos. Fourth, Kaveri has a bigger GPU that consumes much more power, so the CPU cannot be clocked as high as it could be with a small GPU if the total TDP is 95W or less.

This will not break OC records, but we have heard rumors that it can OC up to 4.5GHz.

Had they gone for FD-SOI, they would have had to pay more; split resources between two processes (one for dGPUs, another for iGPUs; one for Steamroller, another for Jaguar); follow a dead end (no evolution beyond 20nm at Glofo, while everyone else moves to FinFETs); be delayed (Glofo does not have 28nm SOI ready) and then compete against Broadwell instead of Haswell; and be locked to Glofo (one of the weaker foundries at the moment)...

The move to bulk was smart: it means they pay less; concentrate resources on a single 28nm bulk process for all products; follow a rapidly evolving roadmap (20nm bulk for 2015 and then 14nm FinFETs); sell the first Kaveri chips now and prepare the Carrizo successor for Broadwell's arrival by 2015; and stay open to more reliable and aggressive foundries such as TSMC, which has the 16nm node almost ready and is already taping out 10nm FinFET chips.
 


Are you aware that I am posting here under my real name? But I can tell you that rocks is wrong about the FX replacement thing.
 


Which is exactly how I DON'T want HSA implemented, because this will never work in the real world.

Remember, first and foremost, you cannot assume that no other application has access to the hardware you are trying to use. If you are not VERY careful, you can easily tank performance in a multi-application environment compared to what it would be if you stuck with a more traditional approach.

Secondly, you will run into problems in regards to future scalability. Today, if you had a GTX 610, you would obviously want the APU doing rendering, and most of the compute work going to the dGPU, based on how fast each one is. But how about the GTX 910; does that assumption still hold? You run into the very dangerous situation where the assumptions you make about the hardware are no longer true in a year or two, which results in your very fine tuned model costing possibly significant performance.

Frankly, what AMD should have done was get MSFT to modify their WDDM model to allow separate devices to be selected as the default GPU device and the default compute device. That solves the problem of having to do this application-side and forcing developers to make guesses about the state of the hardware.
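The scalability concern above really comes down to hardcoding which device is "the fast one". As a minimal sketch only, here is a C++/OpenCL example of querying that at runtime instead; it does not solve the multi-application contention problem, and the capability score is my own crude heuristic, not anything AMD or Microsoft actually specifies:

```cpp
// Sketch: pick a compute device by querying the runtime, rather than assuming
// "APU renders, dGPU computes" will stay true for future hardware.
#include <CL/cl.h>
#include <cstdio>
#include <vector>

int main() {
    cl_uint num_platforms = 0;
    clGetPlatformIDs(0, nullptr, &num_platforms);
    std::vector<cl_platform_id> platforms(num_platforms);
    clGetPlatformIDs(num_platforms, platforms.data(), nullptr);

    cl_device_id best = nullptr;
    cl_ulong best_score = 0;

    for (cl_platform_id p : platforms) {
        cl_uint num_devices = 0;
        if (clGetDeviceIDs(p, CL_DEVICE_TYPE_GPU, 0, nullptr, &num_devices) != CL_SUCCESS)
            continue;
        std::vector<cl_device_id> devices(num_devices);
        clGetDeviceIDs(p, CL_DEVICE_TYPE_GPU, num_devices, devices.data(), nullptr);

        for (cl_device_id d : devices) {
            cl_uint cus = 0, mhz = 0;
            clGetDeviceInfo(d, CL_DEVICE_MAX_COMPUTE_UNITS, sizeof(cus), &cus, nullptr);
            clGetDeviceInfo(d, CL_DEVICE_MAX_CLOCK_FREQUENCY, sizeof(mhz), &mhz, nullptr);
            cl_ulong score = (cl_ulong)cus * mhz;  // crude capability estimate (assumption)
            if (score > best_score) { best_score = score; best = d; }
        }
    }

    if (best) {
        char name[256] = {0};
        clGetDeviceInfo(best, CL_DEVICE_NAME, sizeof(name), name, nullptr);
        std::printf("Dispatching compute work to: %s\n", name);
    }
    return 0;
}
```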
 


As I've said before, DX11 doesn't have nearly as much overhead as DX9. You are losing, worst case, ~10% maximum performance in non-CPU-bottlenecked situations.

Secondly, when developing for consoles you can make assumptions about the hardware that you can NOT make on PCs. If you are not careful, fine-tuning too much can tank performance on PCs, because you, the developer, do not have final say on when your application even runs, let alone where resources are allocated or when your worker thread actually starts. There's a reason we went away from cooperatively scheduling threads.
 


When did Crysis 3 and BF4 become niche examples again?

The ONLY FX that is competitive is the FX-8350; the others, even the lower clocked 8 core variants (8230, etc) all lag behind IB i5's. The weak individual cores hobble the architecture, even in games that scale well, like BF4 and Crysis 3.

http://pclab.pl/art52489-9.html

crysis3_cpu_evil_1920.png


FX-8350 lagging behind the i5-3570k.

crysis3_cpu_jungle_1920.png


Obvious GPU bottleneck at 46 FPS. Worth noting the FX-6300 lags behind the 3570k.

crysis3_cpu_human_1920.png


8350 lagging behind the 3570k.

And I already posted a GPU analysis of Crysis 3; it uses 13 threads, so you can't say it won't scale. In Crysis 3, 3570k > FX-8350

How about BF4?

http://pclab.pl/art55318-3.html

bf4_cpu_geforce.png


The same trend exists: the FX-8350 still lags behind the i5-3570k, and even the FX-6350 comes in just ahead of the i3 lineup, trailing the i5s by a decent margin. Even the 8-core FX-8320 can't match the cheapest i5, the 3350P, in BF4.

Hence my point: FX's architecture deficiencies are hidden in part due to clock speed. You typically see, at max OC, Intel pulling farther ahead. You also see the 4-core FX being noncompetitive, and even the 6-core being one step up from the i3-lineup.

Further, the FX-8350, in both Crysis 3 and BF4, lost to the i5-3570k, which comes in just $10 more expensive. And at max OC, the i5-3570k pulls farther ahead (except when a GPU bottleneck is reached, as seen in the second image). So I'd have to recommend the i5-3570k over the FX-8350 in EVERY instance right now. The 6350 is attractive for its price (~$140), but you accept sub-i5 performance by going that route.

That is not the sign of a good architecture: with a clock speed edge and DOUBLE the cores, you still lag in performance, even in titles that use more than a dozen threads. The only time FX matches high-end IB chips is when a GPU bottleneck suppresses the results.
 




The problem with that statement is that AMD compared the 15-watt Beema to the 25-watt Kabini, which we know is the less efficient Kabini chip. If AMD had compared it to the 15-watt Kabini (A4-5000), or 25-watt Beema vs 25-watt Kabini, then we would have a better picture of the improvement. Is there a PCMark 8 Home result for the A4-5000?
 

If that was the case, we would have already seen the slide for it 1000x because you would be trying to convince us that there is no dCPU coming, and that would be your "holy grail" of evidence. However, I am more willing to bet you have the official 2014 Desktop APU roadmap that everyone else has...

 
@logain: you don't have to be a native citizen to find a website unreliable. i am not defending pclab.pl; multiplayer games are very hard to consistently benchmark anyway. but.. at least they did it, and used dice's recommended o.s. their test hardware was more well-rounded than the other site that benched bf4 mp and crysis 3 - gamegpu.ru.
....
now what if some russian poster said that they considered gamegpu.ru unreliable? :)
 


And I know a few Russian gamers on other forums who claim the same about GameGPU. I'm sure "biased" in both groups' minds means "doesn't agree with my preconceived result set".

Here's the main issue here: I've got very limited samples to choose from here. And the only two that bench significant numbers of CPU's conflict, come from foreign sources (PcLab and GameGPU), and most annoyingly, don't match up with any other set of benchmarks I can find.

Toms and PcLab match in terms of ordering (2500k ahead of FX-8350), but Toms didn't bench nearly enough CPUs to validate PcLab's entire result set. The same thing occurred with the Crysis 3 results (and I noted it at the time), in that GameGPU tended to place the FX-8350 and other FX chips higher than other sites (Toms, etc.) did. So we get a situation where no one's result sets agree, and NO ONE is investigating why that's happening.

Point being, until some other site gets serious about benching CPU's, we're stuck with two conflicting result sets.

One thing I AM finding though, significant in the case of BF4, is that Win7.1 and Win8 results tend to differ, significantly in some cases. So we may have some OS effects in this case, which could explain everything. But I doubt any site is going to investigate that issue...

Point being, since the ordering with PcLab tends to agree with Toms (as far as ordering is concerned), I'm sticking with them unless someone can prove the results aren't legit. And no, saying "the results aren't legit" doesn't count by itself.
 
Ok, here's what I know...

AMD's 2014 product roadmap will release shortly if not today...

Piledriver FX is going to be the HEDT line through 2014 until further notice. AMD concedes Kaveri is not going to replace FX in HEDT, and they know the segment is crucial for gaming, and it's the only segment growing at this time in DT PC.

AMD expects the PD arch to gain performance boosts in several facets, through software optimization and other means, and to remain mostly competitive at the price points they have established.

The next HEDT platform already has an internal codename, though he could/would not tell me what it was. There were no promises that we would see the HEDT successor on the SR arch; it may not be until Excavator that they move forward. The issue is simply that they will not do HEDT on bulk given the way things went with Kaveri. If they had been able to completely meet/exceed the design goals on bulk with Kaveri, they would have moved forward with HEDT. However, he expressed the sentiment that they regretted being forced to downgrade their process to bulk and that they know they lost performance to release the APUs in a timely manner.

Next HEDT line may come on 28nm or 20nm, he wasn't clear there, but he was certain it would not be bulk, and felt like they would move back toward FD-SOI if GF could get their "issues" sorted out for 28/20nm SHP. He could not tell me if the next iteration of their HEDT line would be a FM2+/FM3 product, or if they would in fact move forward with AM4 when DDR4 arrives, though he expects that the timing of DDR4 and the new HEDT platform would be pretty close. Supposedly, there is a new IMC in the works for next gen HEDT hardware (to coincide with DDR4).

That's all I could get...

 
if amd is so fab-bound and using bulk substrate, it makes sense for them to wait for fd-soi at 20nm and include ddr4 support to move on to an entirely new socket and new 'big' core for hedt. one catch is that current info shows st micro licensing their fd-soi booster tech for 28 and 14nm (xm version) only.
 


They hit their release targets more reliably before they spun off GF.

I agree with the sentiment...though at the point where they are, launching a private fab would be near disastrous to their bottom line. They would need to be at a point where they could offset the startup losses from the fab end with product revenue. It will be interesting to see if they go that route down the road.
 


The XM version is really a hybrid, using 14nm front end and 20nm back end with a gate last approach.
 


My point was that it was a mistake to spin GF in the first place.
 


Well, you could argue that it was...

However, you could also argue that it wasn't...since GF has lost significant money since it was spun off, where would AMD be right now? Hemorrhaging cash right now...that's where. From a business perspective it made sense in some ways, and not in others. I think all we are doing is arm chair quarterbacking at this point.
 

Which is exactly my point. You keep referring to the same website and the same benchmarks to make your point.

When FX 8350 came out, you could find games FX sucked at all over the place.

Going back in time to see FX 8350 reviews, here are the results in google and the games tested

1. Tom's hardware: BF3, Skyrim, Warcraft
2. PCGamer: Shogun 2
3. PCMag: no benchmarks
4. Amazon: no benchmarks
5. TechReport: Skyrim, Batman, BF3, Crysis 2
6. Tom's hardware with the same gaming benchmarks again
7. Engadget roundup
8. Newegg link
9. Anandtech: Skyrim, Diablo 3, Dragon Age, Dawn of War, WoW, SC2.

My point is that when FX first came out, there were plentiful resources and pages upon pages of benchmarks showing that FX was behind the competition in gaming. BF3 and Crysis 2 were really the only strong games for it back then; basically, just a handful of outliers.

Now you are stuck posting the same benchmarks from the same website saying "see, nothing has changed!!!!!" If multi-core is not catching on and it's impossible for it to catch on, I want to see pages of google results of modern games where FX is still significantly behind.

GamerK, you really need to grow up. The old 10GHz Nehalem dream died a long time ago, and we're never going to see anything like that.

Intel can no longer squeeze more single core performance out of their CPUs. They are even going to add "MOAR COARS" with Haswell-E. AMD has just finally caught up to older Intels in IPC with SR (going by rumors).

There is a big problem here that chips can no longer scale in the ways they previously have, and that we have to do something else to make a difference.

The path you are suggesting, sticking to single-core rendering and not looking for alternatives, would create a world where there's no longer a reason to upgrade your CPU, because Intel can't make anything faster and AMD is still catching up in single-thread performance.

Tell me how well that works out for AMD and Intel when they need to sell CPUs. You are so backwards-thinking it makes my brain hurt. I can imagine you as a crotchety old man sitting in front of his computer going "these stupid kids and their Pentium 1s, it all went downhill after the 486, I wish I could go back to the golden age of computing!"
 


Except AMD has clearly stated that they want to build a standard around it, offering it to Nvidia and Intel.

 


Kaveri is not "low end hardware".

And as explained before, the benchmarks given use "ordinary workloads". They are neither "specialized software" for APUs nor the "best case".

You don't understand what was made/measured in the BSN* article. I have corrected your misunderstandings again and again and again. Still you insist that those benchmarks are "specialized software" for APUs, when they are not.

Pure AMD hate...
 


You? Because AMD said "Jaguar servers" would be replaced by "ARM servers"... :lol:
 


Being extendable to other uarches doesn't mean adding layers on top. MANTLE consists of two layers: the API plus the driver. If you substitute the MANTLE driver for GCN with one for Intel or Nvidia, the MANTLE API will work on top of non-AMD hardware.
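Purely as a conceptual sketch of that two-layer split (the class names below are hypothetical stand-ins, not real MANTLE headers): the thin API layer only talks to an abstract driver interface, so swapping the driver swaps the hardware underneath.

```cpp
// Conceptual sketch: "API layer" over an abstract "driver layer".
// All names are hypothetical, illustrating the split described above.
#include <cstdio>
#include <memory>

struct MantleStyleDriver {                     // driver layer: vendor-specific
    virtual ~MantleStyleDriver() = default;
    virtual void submitCommandBuffer(const char* cmds) = 0;
};

struct GcnDriver : MantleStyleDriver {         // AMD GCN back end
    void submitCommandBuffer(const char* cmds) override {
        std::printf("[GCN] executing: %s\n", cmds);
    }
};

struct OtherVendorDriver : MantleStyleDriver { // hypothetical Intel/Nvidia back end
    void submitCommandBuffer(const char* cmds) override {
        std::printf("[other vendor] executing: %s\n", cmds);
    }
};

// API layer: the part a game talks to; it never names the hardware.
class ThinApi {
    std::unique_ptr<MantleStyleDriver> driver_;
public:
    explicit ThinApi(std::unique_ptr<MantleStyleDriver> d) : driver_(std::move(d)) {}
    void draw() { driver_->submitCommandBuffer("draw batch 0"); }
};

int main() {
    ThinApi onAmd(std::make_unique<GcnDriver>());
    ThinApi onOther(std::make_unique<OtherVendorDriver>());
    onAmd.draw();    // same API call...
    onOther.draw();  // ...different driver, different hardware
    return 0;
}
```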

 


We are discussing MANTLE not HSA...



As I said before, the overhead in DX9 was ~10x or higher. In DX11 it is reduced to something like 2x, but only if you use batched calls (sketched below), which reduce game richness and developer freedom.

Not sure why you repeat what I said about optimization on PC (~30-50%) being less than on consoles (~100%).
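For what it's worth, here is a toy illustration of why batching reduces that overhead: the per-call CPU/driver cost is paid once per submission, so folding N similar objects into a single instanced call amortizes it. The Renderer type below is a hypothetical stand-in, not a real Direct3D interface.

```cpp
// Toy model of draw-call overhead: count API submissions for a naive
// one-call-per-object loop vs. a single batched (instanced) call.
#include <cstdio>
#include <vector>

struct Renderer {
    int apiCalls = 0;  // stand-in for per-call driver cost
    void drawMesh(int /*meshId*/)                         { ++apiCalls; }  // one object per call
    void drawMeshInstanced(int /*meshId*/, int /*count*/) { ++apiCalls; }  // many objects per call
};

int main() {
    const int kRocks = 10000;
    std::vector<int> rocks(kRocks, /*meshId=*/7);      // 10000 identical objects

    Renderer naive;
    for (int mesh : rocks) naive.drawMesh(mesh);       // 10000 submissions of overhead

    Renderer batched;
    batched.drawMeshInstanced(7, kRocks);              // 1 submission of overhead

    std::printf("naive: %d API calls, batched: %d API call(s)\n",
                naive.apiCalls, batched.apiCalls);
    return 0;
}
```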
 