AMD CPU speculation... and expert conjecture

Page 672

jdwii

Splendid
It's unfair to compare any Intel extreme CPU after the Pentium 4 days to AMD, so it's irrelevant. When you mention a 4790 it's way over AMD's head in performance per watt. Then again, one is $340 while the other is $150.
 
This is why performance per watt per dollar is more important than perf/watt alone. Since vendors often charge more for more efficient parts, customers (from the customer's point of view) may have to trade off efficiency against budget to strike a balance in perf per watt per dollar. You can also trade off both efficiency and budget for raw performance alone. These choices depend more on the customer's priorities and overall requirements than on actual technological or technical facts.
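As a rough illustration, here is a minimal Python sketch of how such a combined metric could be computed. The part names, prices, wattages, and performance scores are placeholder assumptions, not benchmark data:

```python
# Hypothetical example: ranking parts by perf/W and by perf/W/$.
# All numbers below are made-up placeholders, not real benchmark results.
parts = [
    # (name, relative performance, typical power draw in watts, price in USD)
    ("efficient_part", 100, 88, 340),
    ("budget_part",     85, 125, 150),
]

for name, perf, watts, price in parts:
    perf_per_watt = perf / watts
    perf_per_watt_per_dollar = perf_per_watt / price
    print(f"{name}: perf/W = {perf_per_watt:.3f}, "
          f"perf/W/$ = {perf_per_watt_per_dollar:.5f}")
```

With numbers like these, the cheaper part can win on perf/W/$ even while losing on perf/W alone, which is the trade-off described above.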
 

Ranth

Honorable
May 3, 2012
144
0
10,680


If the 220 W CPU were blowing all other processors out of the water we'd see many of them, but that is not the case. And in the case of AMD's 220 W part, it's not really amazingly fast and is way too expensive for the performance (perf/$ again).
 

juanrga

Distinguished
BANNED
Mar 19, 2013
5,278
0
17,790


No. It is not "getting very hard"; it is physically impossible because of physical constraints. Accelerating an electron to the speed of light is not "very hard" but physically impossible, because the laws of physics dictate that an electron cannot travel at that speed. CPUs also follow the laws of physics, and the energy-scaling laws of silicon explain why traditional scaling is dead and why we can no longer rely on node shrinks to increase performance at historical rates.

Traditional scaling was (70%, 70%). If you read the excerpt that I quoted before, scaling now is (70%, 20%). That 50% deficit is a consequence of the physical properties of silicon at smaller lengths; it is an intrinsic characteristic of Nature and cannot be avoided by working a bit harder.

Efficiency is now a key parameter for getting higher performance on future nodes.

The consequence of the laws of physics is not what you believe. Future HEDT systems will be power constrained exactly as they are now, at about 1000 W. The difference is that those future HEDT systems will be about 25 times more efficient and thus will run much faster than today's.

This is an AMD thread and I have discussed only AMD's plans to increase efficiency by 25x by the year 2020. But Nvidia and Intel have the same plans, because the laws of physics are the same for everyone. Below is a slide from Nvidia; you can check for yourself that Nvidia plans to increase the efficiency of its top devices by 25x, the power by 2x, and the performance by 50x for the same year 2020.

DARPA-goals.png
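To make the arithmetic behind those targets explicit, here is a minimal sketch treating performance as efficiency multiplied by power; the 25x/2x/50x figures are the ones quoted above, everything else is normalized:

```python
# Performance scales as efficiency (perf/W) multiplied by power (W),
# so the quoted targets are self-consistent: 25x efficiency * 2x power = 50x perf.
efficiency_gain = 25        # projected efficiency improvement by 2020
power_gain = 2              # projected power increase by 2020

performance_gain = efficiency_gain * power_gain
print(performance_gain)     # 50, matching the 50x performance target

# For a power-constrained HEDT box stuck at ~1000 W, power_gain is ~1x,
# so any speedup comes almost entirely from the 25x efficiency gain.
```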
 

juanrga

Distinguished
BANNED
Mar 19, 2013
5,278
0
17,790


And then a customer will meet AMD's semi-custom division and will reject that hypothetical future card (for the same reasons that Sony chose an 18 CU APU instead of an 18 CU discrete card). According to AMD's data, 90% of new PC sales are APUs and only 10% are CPUs, which means fewer than 10% of PC users would select the hypothetical future card. And AMD's server division will sell the 18 CU APU (for the same reasons it is now selling Berlin APUs), so what do you guess AMD will use for future workstations? (Hint: check the next slide)

1416466970774057.jpg


In the end, that hypothetical future card will be useless for AMD and not released to the market.
 

wh3resmycar

Distinguished
Yawners... it takes 3 GTX 980s to do smooth 4K circa 2014 for Crysis 3. Imagine how demanding a game in 2020 will be. HSA will not make my video games faster, so I couldn't care less. Mantle is nearing its 1st birthday and I still can't find a game worth its full price on its release date. I moved from a 260X > 650 Ti > 280X > GTX 970 and back to a 260X (I'll wait for the die-shrink Maxwell) and I still can't find a darn good Mantle-enabled game. TressFX only works for shampoo acceleration for Lara Croft. Instead of SLI'ing GPUs, people will use multi-APU setups perhaps? Quad-socket consumer boards?

I still giggle every time someone posts his cute equation explaining why GPUs will be obsolete.

I've seen enough AMD slides in my time, and I've learned that every time you get hyped by one you need to ponder and pull yourself back to reality.

juanrga could've been so right, if only we hadn't moved on from the Quake 3 engine.
 


You love arguing even when someone agrees with what you're saying, don't you?

But in any case, I'll tell you again: I AM NOT SAYING OTHERWISE TO WHAT YOU ARE EXPLAINING.

HEDT folks will try to push ANY piece of hardware to a higher power ceiling, REGARDLESS OF IT GETTING LESS EFFICIENT, and as a consequence get more raw performance. The comparison between the sweet spot's performance and the higher-consumption point is valid of course, and again, Piledriver is an excellent example. You get more out of an Intel design when overclocking than out of AMD, so HEDT will turn to Intel for raw performance. For perf/price, I'd say (and agree with others) that it is budget constrained (that's why Tom's has SBMs) and AMD is not so bad against Intel; and notice efficiency there means squat, because you're in low enough power territory that it's not an issue.

And wh3resmycar, even though Juan is arguing based on marketing slides, I do believe AMD wants to replace at least low-end GPUs with APUs. So far, that's what the roadmaps are telling us. As for high-end GPUs, I believe palladin's opinion has more weight and I agree with him. AMD will have to make a quantum leap (or work magic) to squeeze that much compute into a lower thermal envelope.

Cheers!
 

noob2222

Distinguished
Nov 19, 2007
2,722
0
20,860
Efficiency is a requirement for making future nodes feasible.

Die shrinks increase thermal density faster than the inherent efficiency-per-area gain can offset. That means if you don't make the design itself more efficient, it will just continue to run hotter and force the CPU to be downclocked, meaning raw performance gets worse.

Efficiency is key to getting even equal performance from future nodes.

To put it another way: take Piledriver to 14 nm without changing anything else, and it will run at 60°C at 3.5 GHz instead of 4 GHz. Sure, power and efficiency are better, but raw perf is worse. A prime example is HEDT people wanting to OC the 2500K rather than buying the more efficient 3570K.

Edit: part of this is what I've said in the past. HEDT is slow because raw performance has stalled, and it will continue to stall because we are on some sort of crusade to make things "more efficient," not faster. And no, those two aren't the same thing without addressing the main constraint on speed: heat transfer. Sure, in a server you can just slap in more cores to make it seem faster, but IPC has barely changed, and that's what drives HEDT sales.
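A back-of-the-envelope sketch of the thermal-density argument above; the die size, power, and scaling factors are illustrative assumptions, not measured values for any real node:

```python
# Toy model: shrink a die without redesigning it and watch power density climb.
# All figures are illustrative assumptions, not data for any real process.
old_area_mm2 = 315.0      # hypothetical die area on the old node
old_power_w = 125.0       # hypothetical power at the old node's stock clock

area_scale = 0.5          # assume the shrink halves die area
power_scale = 0.8         # assume power per chip only drops 20% at the same clock

new_area_mm2 = old_area_mm2 * area_scale
new_power_w = old_power_w * power_scale

old_density = old_power_w / old_area_mm2   # ~0.40 W/mm^2
new_density = new_power_w / new_area_mm2   # ~0.63 W/mm^2

print(f"power density: {old_density:.2f} -> {new_density:.2f} W/mm^2")
# Higher W/mm^2 under the same cooler means the chip has to clock down,
# which is the "better efficiency, worse raw performance" trade-off described above.
```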
 

blackkstar

Honorable
Sep 30, 2012
468
0
10,780
AMD isn't going to sell good low-end GPUs anymore, because they know that instead of buying an APU, people would buy an Intel CPU with the AMD GPU. They are being a bit greedy, but they need market share for HSA-enabled rigs to get developers to make HSA software. And Intel CPU + AMD GPU is not the answer.

I don't think you are understanding what I am getting at, Juan. Efficiency is important. But a comparison between an FX 9590 at 200 W+ and a 4790K is not what I am talking about; they are apples and oranges. I am talking about something like the 4790K and the 5960X. In HEDT, that 140 W CPU looks awfully good compared to something like a ~50 W Pentium G.

HEDT is also important because, from a marketing standpoint, the people buying HEDT computers are the people who tell less informed people what to buy. They are usually the token tech guys whom people run to for answers to computer problems and for buying decisions. Having the mindshare of HEDT customers behind your products is very valuable, as it translates into other organic sales. It is also a place for Intel, AMD, and Nvidia to sell the leaky reject parts from their server, compute, mobile/embedded (like the iMac getting the good Tonga GPUs), and workstation lines.

My point is that in HEDT, performance is king. If the FX 9590 creamed everything that Intel had, it would sell well. But it's slower, so the fact that it's less efficient is irrelevant to my claims, because I'm claiming raw performance is the deciding factor.

Yes, you are right when it comes to comparing the efficiency of something like Piledriver to Haswell. But you're completely missing my point when I'm talking about comparing a 50 W Haswell to a 140 W Haswell. HEDT is about taking the most efficient parts and binning them for the most performance regardless of power consumption, heat, etc., which is where my statement that HEDT doesn't care about efficiency comes in. People will go for whatever gives the best performance, and whether it's good perf/$ is up in the air (people buy Titans and such even though they are awful perf/$, just because they have good performance relative to everything else).

You also ignore the demands of HEDT users. My 24-core Opteron rig is faster than my overclocked FX 8350, and much faster in multi-thread, while using less power. It's more efficient by far. But ask this forum how many people would be all over a 24-core 2.4 GHz K10 system versus something like a 5 GHz FX 8350, and they'd definitely take the FX 8350, because it trades the efficiency of many smaller, more efficient chips for better single-thread performance, which is something HEDT people need.

And yes, I realize you're going to say the 5960X is far more efficient, and it is, even though my Opteron rig trades blows in MT with a stock 5960X. But I also paid $8.50 for each Opteron 8431, and the 5960X is 30.8 times more expensive than what I bought. So I throw efficiency out the window for performance first and performance/$ second.
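For what it's worth, a quick sketch of that price comparison; it assumes four 6-core Opteron 8431 sockets for the 24 cores and an approximate ~$1050 price for the 5960X, so treat the figures as rough:

```python
# Rough cost comparison between the used-Opteron rig and a 5960X.
# Socket count (4 x 6-core Opteron 8431 = 24 cores) and the ~$1050 5960X price
# are assumptions for illustration.
opteron_price_each = 8.50
opteron_sockets = 4
opteron_total = opteron_price_each * opteron_sockets    # $34.00 for all CPUs

i7_5960x_price = 1050.0
ratio = i7_5960x_price / opteron_total

print(f"Opterons: ${opteron_total:.2f}, 5960X: ${i7_5960x_price:.2f}, "
      f"ratio ~{ratio:.1f}x")                           # ~30.9x, close to the 30.8x quoted
```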

It's like I said: no one cares about efficiency for HEDT. The most important metric is performance, and then the majority of people still care about perf/$, though some people will ignore it and spend a bunch of money on something that's just a bit faster (it's why HEDT prices don't scale linearly with performance).

We are always at odds with each other, because I actually have computing needs that I have a difficult time satisfying, and what you always suggest is directly at odds with the things I need. You may find this difficult to fathom, but I'm far from the only person who wants more power yet doesn't have something massive like Disney's 55k-core, 1.5-megawatt render farm. The problem is there are no marketing slides that represent needs like mine, so you just think those needs don't exist.

And it's a shame, because those needs definitely exist, yet you refuse to acknowledge them because you think the only types of computers are either gigantic datacenters, where 10% better efficiency means saving tens of thousands of dollars in electricity a month, or machines for people who want 25 W ARM SoCs so they get battery life that lasts for days. There's a big gap between these two markets that you consistently overlook and even refuse to acknowledge.

And as a bonus on how wrong you are about these specific markets, consider the state of the Xbone and PS4 and the fact that they are where they are because new power requirements forced the consoles to hit certain efficiency levels. Do you think the PS4 and Xbone would be in better shape if they could consume power like the Xbox 360 and PS3 did at launch? But I thought efficiency was the only valuable metric! We should all be selling our gaming PCs and buying Xbones and PS4s! Sell your Tahitis, Hawaiis, GM204s, GK110s, GK104s, Haswell-Es, Piledrivers, etc. Efficiency is king and the only valuable metric, so buy a PS4 or Xbone for the superior gaming experience!
 


Yes, you are correct, just like Juan is correct. We're discussing slightly different things.

I won't deepen the argument and we'll have to agree to disagree.

Cheers!
 

colinp

Honorable
Jun 27, 2012
217
0
10,680
Isn't this the 12th time we've had this argument on this forum this year? Each time it's started by the same person, and each time it goes the same way.
 

juanrga

Distinguished
BANNED
Mar 19, 2013
5,278
0
17,790
Some people got my point about efficiency. Others are ignoring my point or misinterpreting my claims.

Also, a final remark: I am not "arguing based on marketing slides" as some claim. I am using physical laws to explain what the future CPUs, GPUs, APUs, and SoCs of AMD will be like. Evidently the "Future of Compute" talk slides cannot ignore the underlying physical reality, because engineers are constrained by the laws of physics.
 

Reepca

Honorable
Dec 5, 2012
156
0
10,680


Where could I find the "future of compute brief" article in English? It's a lot easier to comprehend all the slides when there's nice readable text surrounding them to provide context. :~)
 

Google Translate could help.
The text could be useless if the author is not quite tech-knowledgeable; you might end up with WCCF-level "analysis".
 

etayorius

Honorable
Jan 17, 2013
331
1
10,780



So AMD changed the "14+ titles by the end of the year" to "20+ Games Launched or in Development"; the only ones where performance shines are Thief and Civilization 5. There are not many benchmarks for Dragon Age: Inquisition with MANTLE, and Need for Speed Rivals' Mantle support never arrived even though the game's tech support acknowledged MANTLE was coming for the title.

We have no info on the minimum system requirements for GTA V or whether it will use MANTLE at all, and we are already just 2 months before release... which probably means the release will be pushed back even further... gosh.
 


And so would the APU.

The graphics components of current APUs are limited by the same factor the DDR3 7750s were: the memory interface. The 7750 had DDR3-1600 as its memory; APUs now have DDR3-2133 (or 1600 if you're cheap), and in both cases the memory limits the performance of the graphics processor. This is evident from the significant speed increase the GDDR5 7750 got over the DDR3 version: same graphics processor, but the memory no longer holds it back. Any technology developed to increase the power of low-end graphics hardware will also be applicable to combined APUs.
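To put rough numbers on that memory-interface limit, a small sketch of peak bandwidth figures; the DDR3 entries assume a 128-bit (dual-channel) interface and the GDDR5 entry assumes the 7750's 128-bit bus at 4.5 Gbps, so treat them as ballpark values:

```python
# Peak theoretical bandwidth = (bus width in bytes) * (effective transfer rate).
# Bus widths and transfer rates below are typical/assumed values, not measurements.
def peak_bandwidth_gbs(bus_width_bits, transfer_rate_mtps):
    return (bus_width_bits / 8) * transfer_rate_mtps / 1000  # GB/s

configs = {
    "APU, dual-channel DDR3-1600": (128, 1600),
    "APU, dual-channel DDR3-2133": (128, 2133),
    "HD 7750, 128-bit GDDR5 @ 4.5 Gbps": (128, 4500),
}

for name, (width, rate) in configs.items():
    print(f"{name}: ~{peak_bandwidth_gbs(width, rate):.1f} GB/s")
# ~25.6, ~34.1 and ~72.0 GB/s: the GDDR5 card's extra bandwidth is why the same
# GPU runs so much faster, and why DDR3 holds APU graphics back.
```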

To those who asked about multi-socket APUs, that would be a negative. The engineering cost for multi-socket systems is astronomical due to the complexity of running all those traces: multiple memory subsystems, lots of PCIe lanes, along with whatever additional I/O chips are needed to bridge everything. This additional cost is warranted in expensive high-performance server/workstation builds, but not in the value-oriented market that APUs are targeted at. Even for computational servers, a customized design that puts APU chips onto circuit cards would be cheaper than trying to make multi-socket boards.
 

blackkstar

Honorable
Sep 30, 2012
468
0
10,780


I was bringing up multi-APU systems not for budget builds, but for those professional markets. A workstation with 2 or 4 APUs and full HSA would be extremely fast and AMD could command a premium for it, especially if it ends up faster than something like dGPGPU + CPU. Even cloud computing that's heavily compute-bound would be a good fit.

Also, as a side note, I recall reading somewhere earlier (I forget where, but it was a forum) that someone is claiming AMD is eventually going to break an APU down into parts and stick them together with an interposer on a single package, i.e. each core is an individual die and so is the GPU.

I guess, in a roundabout way, what I've been getting at is that HSA is going to make it so you don't have to keep everything on one die. If everything has access to the same memory, the interconnects already exist.

I.e. if the CPU can access a memory pool somewhere (HBM, RAM, GDDR, whatever) and the GPU can access the same pool, it makes it easier to do things like put separate dies on a single package, or even go as far as multiple packages in different slots (dual/quad APU, a PCIe GPU accessing the same memory as the CPU, etc.). I realize these methods all have significant drawbacks, but the hurdle of having to transfer the entire working set over a bus has essentially been removed, and that opens up a lot of potential. Yes, it has problems: latency, slow buses, a lot of added complexity, etc.

But we've seen what HSA can do with JPEG decoding and with OpenOffice. It's more than likely worth it in the long run if AMD can find someone interested in it. And all of that works very well with AMD being semi-custom: it basically opens the door for AMD to go around telling people they have solutions to their problems if they're willing to pay for semi-custom. And the more modular AMD makes all of this, the more problems they can solve.

Off the top of my head: GPU on one die, CPU on another, both in the same package. The CPU is on an aggressive process good for CPUs, the GPU on one good for GPUs. They don't have to be on the same die anymore because they can directly access memory, whether it's some big HBM cache or system RAM or whatever. Right now the APU has to make trade-offs in fab process, and we get stuck with things like Kaveri's process being tuned for GPU and CPU at the same time (what they each want is at odds with the other) and you lose clock speed.

I'm just trying to ask what HSA can be used to solve. I really doubt AMD is pushing HSA so hard just so their all-in-one prototypes (we all know how well their good prototypes make it to market *coughmullinstabletscough*) can decode JPEGs faster and you can sort big lists in OpenOffice faster.
 

juanrga

Distinguished
BANNED
Mar 19, 2013
5,278
0
17,790


I am still waiting for Mantle on Linux
 
I'm just trying to ask what HSA can be used to solve

Not much, honestly. It's a convenience, nothing more. Anything you can do with HSA you could already do with OpenCL/DirectCompute. Having everything in the same memory context is nice, but you run into the same problem I said you'd have from day 1: slow system RAM. That's why you still have, and will continue to have, dGPUs, since the memory you can put on them will always be faster than what you use as main system memory. The only overhead is the initial transfer, which is small compared to the speedup of GDDR5 versus DDR3.

You have one thing that's good at serial workloads and one thing that's good at parallel ones. Trying to mix and match will dilute both and advance neither.
 


The whole purpose of HSA is to try to correct this issue. As it stands, when you're working on a piece of software that isn't 3D graphics but has large blocks of parallel code in it, that code runs on the serial-focused CPU, which is slow. Now, if the portions of code are fairly short, using OpenCL to manually move the work to the GPU, perform your calculation, and then move it back again will cost more time than the time saved by having the GPU do the work. With HSA (and an included GPU portion on die), the code can be passed to the GPU for the parallel portion with no moves or copying. This allows the best processing resource to work on the code when required, without the inherent cost of shoving data across a slow bus.

HSA is about letting the correct tool do the correct job. Now, that doesn't mean it does anything to improve graphics performance really, as graphics are already tied to a dedicated accelerator and the job is large enough for it to stay there (although I can envisage shared memory access being a benefit in some circumstances, like bringing the GPU in for more non-graphics work such as physics calculations, audio processing and the like).
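A toy break-even model of the trade-off described in the last two posts; the timings and speedup are invented placeholders, not measurements of any real OpenCL or HSA implementation:

```python
# Crude model of offloading a short parallel section: with explicit copies
# (OpenCL-style over a slow bus) the transfer cost can swamp the GPU's speedup,
# while a shared-memory (HSA-style) hand-off avoids the copies entirely.
# All timings are made-up placeholders for illustration only.
cpu_time_ms = 4.0          # time to run the parallel section on the CPU
gpu_speedup = 8.0          # assumed GPU speedup for this section
gpu_time_ms = cpu_time_ms / gpu_speedup

copy_in_ms = 2.5           # copy inputs over the bus
copy_out_ms = 2.5          # copy results back

explicit_copy_total = copy_in_ms + gpu_time_ms + copy_out_ms   # 5.5 ms: slower than the CPU
shared_memory_total = gpu_time_ms                               # 0.5 ms: no copies needed

print(f"CPU only:           {cpu_time_ms:.1f} ms")
print(f"GPU + copies:       {explicit_copy_total:.1f} ms")
print(f"GPU, shared memory: {shared_memory_total:.1f} ms")
# For short sections the copies dominate, which is why zero-copy shared memory is
# HSA's selling point; for huge jobs (graphics), the copy overhead is small relative
# to the work, and a dGPU with fast GDDR5 still wins.
```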
 


No, the point of HSA is to push computing forward in areas where power matters. The drive for heterogeneous compute is going to happen either way; it is just more efficient for a SoC to be able to use all of its processors for all tasks rather than only for specific hardware-accelerated tasks. It is the only real solution to the dark silicon problem we will be faced with in the coming years. Everyone will have to have heterogeneous cores in the next few years simply due to efficiency.
 

etayorius

Honorable
Jan 17, 2013
331
1
10,780


Well, they first said NO, then they said maybe, then they said OK, and finally AMD said NO to MANTLE on Linux...
 

8350rocks

Distinguished


They have not said no to MANTLE on Linux. On the last conference call I had with Richard Huddy, he said specifically that they were looking at getting people on board with an open source project, but it would be after they opened it up as an open standard...

The biggest issue they have with AMD doing something for Linux and calling it a unilateral Linux implementation is the number of things that can change between installs of the same distro. They went with an open source standard so that the development teams behind the separate distros could implement the MANTLE API in their distros as they saw fit, to support AMD graphics.

Unfortunately, AMD cannot write a driver for every Linux branch out there, let alone every separate distro...
 