AMD CPU speculation... and expert conjecture



Anyone attempting to argue that the entire purpose of an APU is to satisfy the budget / compact computing market is deluded.

AMD APUs have been competing as a low-cost alternative in much of the consumer market because the APU concept had not yet been fully developed. Kaveri is the first APU that fulfills that 'ancient' AMD dream born with the acquisition of ATI.

The concept has not been fully developed for two basic reasons: memory bandwidth and software. But the second is a consequence of the first. Once the memory bandwidth problem is solved with the almost-ready stacked memory, we will see the birth of ultra-high-performance APUs.

http://www.marketwatch.com/story/microns-hybrid-memory-cube-earns-high-praise-in-next-generation-supercomputer-2013-11-07

P.S.: the bandwidth provided by this new RAM technology is superior to that of the L3 cache in FX chips.
 


But it is the other possibility ;-) "The application also acquires the ability to take control of the multi-GPU and to decide where to run each command issued". Therefore both iGPU and dGPU work in tandem, as I said, and the workload is split dynamically. In some cases it will make sense to use the iGPU for compute and the dGPU for graphics; in other cases it will make sense for both iGPU and dGPU to execute graphics work; in other cases part of the dGPU can assist the iGPU with compute...
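To make that concrete, here is a toy sketch of an application-controlled split. The names and stubs are my own invention for illustration; Mantle's actual entry points were not public at the time, so treat this as the shape of the idea, not the real API:

```c
#include <stdio.h>

/* Hypothetical sketch of an app deciding where each batch of work
 * runs, in the spirit of a Mantle-like explicit multi-GPU API.
 * Device and submit() are illustrative stubs, not real Mantle calls. */
typedef struct { const char *name; } Device;

static Device devices[2] = { {"iGPU"}, {"dGPU"} };

/* stand-in for recording and submitting a command buffer to a device */
static void submit(Device *dev, const char *work) {
    printf("%s <- %s\n", dev->name, work);
}

int main(void) {
    Device *igpu = &devices[0];
    Device *dgpu = &devices[1];

    /* The application, not the driver, chooses the split. One frame
     * might route compute to the iGPU while the dGPU renders... */
    submit(igpu, "compute (physics, global illumination)");
    submit(dgpu, "graphics (scene rendering)");

    /* ...while another frame could give both devices graphics work,
     * or have part of the dGPU assist the iGPU with compute. */
    return 0;
}
```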
 


HMC is superior even to GDDR5; however, that technology is years away from widespread adoption in consumer PCs. Sure, supercomputers costing millions and millions of dollars use it... however, there isn't even a standard for integrating it onto a motherboard yet...
 


Except that it was mentioned several times here and it also appears in the slide: an R9 280X.



First it was fear about AMD going bulk, then it was skepticism about clocks (your why-did-they-only-test-at-1.8GHz argument). Now that we know it clocks at a high frequency of 3.7GHz, you have bad feelings about OC. What will be next?
 


No. MANTLE improves game performance by eliminating the overhead in Microsoft's DirectX API.

In consoles, avoiding the DirectX overhead allows for about 2x more performance on the same hardware. I don't think we will see that on the PC, but a 30-50% increase in performance seems feasible.

DICE developers have said that a Radeon + MANTLE-enabled version of BF4 will ridicule the Nvidia Titan. We will see.
 


Well, let's break that down shall we...?

1. BECAUSE AMD WENT BULK, the clocks are not even what they achieved in Trinity (3.8 GHz + Turbo), much less Richland. Which means I was right, and my assessment was correct. Clock speeds decreased in Kaveri. I told you they would not even break 4.0 GHz in stock configuration, and here I am proven right.

2. Why would AMD not at least put the flagship part at the SAME clockspeed as the flagship Richland unless they had difficulties with performance? If AMD is not getting the performance they want, then the headroom must be quite a bit less for them to start at a 400 MHz clockspeed disadvantage. AMD has historically pushed clockspeeds to the limits; this time we see a regression. The OC tests will tell the tale.

Why would they lose 9% clockspeed unless they had to? Especially when the architecture gains are roughly ~20% from what we know. That means flagship Kaveri is only a ~11% improvement over Richland, as near as we can tell currently (quick math below). It may end up being less once the cherry-picked benchmarks are tested against more rigorous CPU benchmarks.
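For concreteness, the back-of-envelope math, assuming performance scales linearly with clock and using the 4.1 GHz Richland 6800K base vs. 3.7 GHz Kaveri figures from this thread:

```c
#include <stdio.h>

int main(void) {
    double uarch_gain  = 1.20;       /* ~20% architecture gain, the post's figure     */
    double clock_ratio = 3.7 / 4.1;  /* Kaveri 3.7 GHz vs Richland 6800K 4.1 GHz base */

    /* ~9.8% clock loss; the linear shortcut 20% - 9% gives the ~11%
     * above, while the multiplicative estimate lands closer to ~8%. */
    printf("clock loss: %.1f%%\n", (1.0 - clock_ratio) * 100.0);
    printf("net gain:   %.1f%%\n", (uarch_gain * clock_ratio - 1.0) * 100.0);
    return 0;
}
```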

In short...they're losing HALF their performance gain by giving up that much clockspeed. THAT is what going to bulk gets you.

Had it been FD-SOI, then it would really be a flat out ~20% gain we would be testing against other benchmarks to determine the range of improvement. As it sits, this generation is likely to see something on the order of a 1-15% improvement because they gave up too much clockspeed going bulk...all when it could have been 10-25% improvement had they been able to keep the clockspeed and make the uarch improvements.

Who was wrong and who was right?

This generation, in tests where the flagship Kaveri is really 10% better in efficiency versus Richland, it will be 1% better due to loss of clockspeed.

They should have gone FD-SOI, as I said all along.
 


The sad fact is there is a lack of electrical engineers. The tools are expensive, complex, and more difficult to self-teach. You can't just compile and run stuff in your apartment like you can with software. Hardware requires fabs and expensive oscilloscopes, logic analyzers, etc.

Go to any university and the ratio of computer science majors to electrical engineers is likely 5:1 or higher. Of the EE peers I still keep in touch with, only 1 out of 10 (me) still deals with hardware at all. The rest all switched from hardware to software.

So pushing software more may just be their only option. At least they're doing it through foundations and open source and not going it alone.
 


If the demand for FX-8350 CPUs were higher than it is, AMD's margins would be better. Look at the 9000 series: they were overpriced, nobody was purchasing them, and recently their prices dropped by giant amounts. It is always the same: lack of demand.

I consider the Steam statistics to be pretty accurate based on my knowledge of the local market, but look at AMD's own numbers: the FX 8-core represents about 2% of the total revenue generated by both APUs and CPUs. Now try to compute its percentage of AMD's total revenue.



And where in the AMD vice-president's words does it say that they will release an FX successor?

What if I already have in my hands the official desktop roadmap for 2014?



I didn't mean this to be happening tomorrow. Kaveri is a 2014 product and uses DDR3. Its successor, Carrizo, comes in 2015 and uses DDR3/4. The new high-bandwidth memory technology is something for 2016/17. Then we will start to see ultra-high-end performance APUs (which will not be mainstream). Nvidia is developing an APU with cube memory that is projected to give 10x the performance of the GTX Titan.
 


And you always change the story. You stated that Kaveri would equal an i5 in ordinary CPU workloads. Now that you've been proven wrong, it's only in APU-specialized software.

You may be happy with weaker cores coupled with a software solution, but it will not go over well when the reviews come out.
 


And unsurprisingly you come with the same misunderstanding again, one that I have corrected again and again...

In my BSN* article, which evidently you didn't read, I show how Kaveri performs like an i5 in ordinary CPU workloads. I am not using "APU-specialized software" as you claim.

Finally, I am happy with the idea of software using my hardware. All my hardware (or most of it), and not only one half or one third of it. There is another person here with ideas similar to mine. He compiled a program for his hardware and now it runs 2x faster; said another way, the former unoptimized program was using only 50% of his hardware's performance.

You seem to prefer the other way around: if the software only uses 50% of the hardware, you upgrade to hardware 2x more powerful, so that the same software can now ignore 50% of even more.
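For anyone curious what "compiling for your hardware" means in practice, here is a minimal sketch. The gcc flags illustrate the general idea; the actual speedup (2x in the anecdote above) depends entirely on the workload and CPU:

```c
/* Build generically:          gcc -O3 saxpy.c -o saxpy
 * Build for the host machine: gcc -O3 -march=native saxpy.c -o saxpy
 * The second build lets the compiler emit the SIMD/FMA instructions
 * that the generic build leaves unused. */
#include <stdio.h>

#define N 10000000

static float x[N], y[N];

int main(void) {
    for (int i = 0; i < N; i++) { x[i] = 1.0f; y[i] = 2.0f; }

    /* y = a*x + y: a loop the compiler can auto-vectorize when it is
     * allowed to target the CPU's full instruction set. */
    const float a = 3.0f;
    for (int i = 0; i < N; i++)
        y[i] = a * x[i] + y[i];

    printf("%f\n", y[0]);  /* keep the result live so the loop isn't removed */
    return 0;
}
```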
 
I personally don't trust the Steam Hardware Survey; it never asks to survey my FX 8350, but it always asks on my Core 2 Duo laptop running Gentoo.

Can anyone answer my question about the 22nm SOI process that GloFo was working on?
 


If you have the roadmap, let's see a pic.
 


I mentioned this elsewhere, but the software side of things is very good for AMD right now in regards to gaming.

Remember when the FX 8350 first launched? You'd have to dig forever to find good FX 8350 benchmarks, and the AMD guys were floating the same few benchmarks which showed it doing well over and over again, while there was a bombardment of Skyrim, Shogun 2, Starcraft 2, Warcraft, etc.

Now look at how things have changed. GamerK is trying to disprove the FX and the best he can do is post a few niche examples of modern games that don't run well on FX.

I do think this is why AMD is not in a rush to push SR on HEDT. PD has gone from trailing the Intel 3570K by 30% in all the gaming benchmarks in reviews to beating the 3770K. Nothing on the chip has changed.

AMD simply has absolutely no reason to even release SR on HEDT. Why release a chip that is 15% faster when you just got a 30% improvement out of software increases by nipping the whole "game developers optimizing for Intel" thing right in the bud?

After watching the APU13 keynote on Mantle, it's quite clear that AMD is designing Mantle not only to scale so that the GPU on the APU does some calculation other than rendering, but also to scale between TWO dGPUs.

http://www.youtube.com/watch?feature=player_embedded&v=tDPgJB2x7dQ

Watch the whole thing; I think Johan screwed up. He said you'd see a Mantle situation where one GPU calculated global illumination while the other just rendered the scene.

Do you know what's missing from that equation?

An APU.

And I hate to beat a dead horse, but it's quite clear AM3+ is completely incapable of HSA.

Yet at the same time, running two dGPUs on an APU platform is a complete waste because of the limited PCIe lanes available.

So why talk about Mantle using two dGPUs? AMD doesn't have a publicly available platform that supports everything required for Mantle to use two dGPUs.

Johan dun goofed.
 

the said bench was done with an a10 6790k, fx6350, and fx8350 (all stock) with 64 bit windows 8.1, 16 GB (maybe, i don't remember precisely) ddr3 2133 ram, and a radeon R9 280X. the benchmark was the BF4 suez map, at 1080p at ultra settings. strangely, the apu delivered over 50 fps (average, not minimum), quite close to the other two with minimal variance, very, very strongly suggesting it was a scripted, offline, single-player, gpu-bound benchmark. as a comparison, earlier i saw a pclab.pl bf4 mp bench where an athlon 750k was struggling to break 25 fps minimum and 30-35 fps avg. (forgot the precise numbers and the map).


and seemingly shift the monopoly to amd. doesn't look like a good thing in the long term.

it won't? i wonder why not... maybe because the majority of the work would be performed by the gfx card. why so much aversion to cpu benchmarks in a cpu discussion?

oh really? false, is it? here's what you said in your post, verbatim:

but wait.. there is more, in case you have conveniently forgotten (along with the poorly executed deflection when i asked for the audio version):
i didn't have to bold or underline parts, it's all there.
yeah... is she saying that amd cancelled sr-fx cpus because the 6790k delivered close performance to higher-core fx cpus in a single-player offline benchmark (as you incessantly claim, not speculate)? i don't see that here. it's also ridiculous to think amd would promote cpus at a conference literally called APU13. amd has been migrating to apus since 2011, while their cpu lineups co-exist.

to be precise, i said that you never posted those values to me (the context-twisting is clearly on your part). not only that, you repeatedly avoided posting the values, intermediate results and calculations. 'one of my posts' is such a blanket term....anywho, please post them asap.
 
AMD's 2014 mobile roadmap is built around low power, security logic
http://www.pcworld.com/article/2062137/amds-2014-mobile-roadmap-is-built-around-low-power-security-logic.html
this has been posted before. here's the roadmap, notice the mullins and temash entries:
http://images.techhive.com/images/article/2013/11/amd_2013_mobility_roadmap-100068083-orig.png
but i don't think anyone has noticed/discussed that amd has joined intel in this SDP scam. notice in the roadmap where amd specifies sdp but strangely abstains from specifying other chips' tdp wattage (e.g. kaveri's) even though they're mentioned. this is similar to what intel did while introducing the y-series core i cpus aimed at tablets.
allegedly, SDP stands for scenario design power/point/idontcare. basically, amd/intel sets a lower stock clockrate so that the tdp can be stated lower, while in reality it isn't. i support amd's mobile apus more than anyone here... but this is uncool. the apus were already better than intel's; they didn't have to lower themselves to intel's level. otoh, bay trail doesn't have sdp (it might, but the tdp is still lower), while intel's y-series core i cpus do have sdp.
amd's definition of SDP may be different. i am speculating based on intel's definition of SDP.
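as a rough illustration of how a lower "scenario" clock shrinks the quoted power figure, assuming the textbook dynamic-power relation P ≈ C·V²·f (my assumption here, not anything amd or intel publish for these parts):

```c
#include <stdio.h>

int main(void) {
    /* toy numbers, made up for illustration; not real chip specs */
    double f_sdp = 0.8;   /* scenario clock, normalized to full clock = 1.0 */
    double v_sdp = 0.9;   /* voltage drops a little along with the clock    */

    /* dynamic power scales roughly as C * V^2 * f, so even a modest
     * clock/voltage cut makes the quoted "power" look much smaller */
    double ratio = (v_sdp * v_sdp * f_sdp) / (1.0 * 1.0 * 1.0);
    printf("SDP-scenario power: ~%.0f%% of the full-clock figure\n", ratio * 100.0);
    return 0;
}
```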

another implication is the 28nm mention. it might mean that tsmc's (who historically has fabbed amd's mobile apus/socs) 20nm mobile bulk process isn't ready yet.. or amd is not partnering with them, or apple/qualcomm has blocked them out, or both of the last two.
 


I love the idea of utilizing software, but not at the expense of only releasing low-end hardware and relying solely on the software to make up for the lack of good hardware. We need both. APUs are not high end. Period.

APUs are a little of both, and a master of none. Without the specialized software, as I have shown, Kaveri needs an 80% or greater boost to catch the low-end i5, and a 40% boost to catch the 4320. This is in ordinary gaming CPU workloads.

You're the one not looking at the data I have provided and coming up with lame excuses instead of trying to convince me with hard evidence. Your website and marketing slides that are designed to show only the best possible scenario do not count as evidence. Those will be true only in a few select cases and are not representative of "ordinary workloads".
 


Wow, that's just strange. Beema and Mullins weren't canceled to make way for ARM CPUs? Wonder who said they would be.
 
I thought the whole point of Kaveri was to increase IPC AND take away the 'hit' from using both cores in the same module, so the single-threaded performance increase would transfer to multithreaded???
In a month or so we'll get benchmarks from review sites, I'm sure...
 


That's Steamroller you're describing. Kaveri is an APU, and generally speaking APUs are slower than their CPU counterparts: the L3 cache is removed to make room for the GPU, which sacrifices performance.

Couple that with a reduction in clock speed on the final silicon, and things don't look good.
 

yeah, the key phrase here is "gpu bound". it does not mean the fx8350 is comparable to the i7 4770k or vice versa.
looks like mantle is turning out to be amd's admission that they cannot reduce cpu overhead in their current catalyst drivers, blaming dx instead (partly justifiable). it's great for amd and gcn-based gpus' performance.
 
^^Dice thinks, or would like, nvidia to jump on board, going by the talk last night; one of the last slides also specifically stated that mantle wasn't just for gcn and was forward-compatible, open to *any* vendor or platform.
This is the best thing for pc gaming in a long time imo, hope it all works out.
 
from those slides, another sentence that caught my eye was "thin layer of abstraction", but how thin? amd claims it is extendable to other uarches and forward-compatible. doesn't that mean it will eventually become bloated? maybe i'm being too skeptical.
 