AMD CPU speculation... and expert conjecture


juanrga



No. Carrizo was always planned for the 28nm node. I gave the correct node for Carrizo in this thread many months before it was made official by AMD, and I repeated it again yesterday. The reason why AMD is not using 20nm is the same reason why others are avoiding the 20nm node from any foundry: cost.


The cost of the 20nm node is not a problem for a rich company such as Apple, but AMD, Nvidia, and others will remain at the 28nm node for economic reasons.

The 14/16nm nodes (GloFo, Samsung, TSMC) will also be expensive compared to 28nm planar, but they will bring many advantages from using FinFETs, and almost every company will migrate to a 14/16nm node by 2016 or so.

AMD's Zen and K12 will be built on a FinFET node. Jim Keller has already said that he is happy with the new FinFET node.

This new node (especially the FF+ version at TSMC) will be only a half node away from Intel's 14nm node in basic parameters such as M1HP. Thus AMD (and APM, Nvidia, Apple, Broadcom, and others) will be very competitive against Intel in a couple of years. They don't need to "hype" anything; the reality is that Intel's foundry advantage will keep shrinking. In fact, even Intel admitted this week, in a conference, that the rest of the foundries (TSMC, GloFo, Samsung) will catch up with Intel at the 10nm node.

AMD put "APU optimized process" on a Kaveri slide because GloFo was optimizing the node (the reason for the delay) for the unusually high density required by Kaveri. That high density was a result of using HDL (high-density libraries) on the GPU. Not even Intel's 22nm node can provide the density required by Kaveri... The info I have is that the Carrizo core is smaller than Steamroller because AMD is now using HDL on the CPU as well. This will bring another quantum leap in transistors per unit area. Thus, I would not be surprised if AMD emphasizes the use of an "optimized process" during the presentation of Carrizo.
 

cemerian



Not a chance. It still uses the same module design as Kaveri and Richland before it, only now with the last of the module-design cores, Excavator. In the promotional video they already stated the cores are much smaller, which again means more power efficiency but pretty much the same performance as Steamroller (a 5-10% IPC improvement and, again, less potential for overclocking). Its graphics portion should be interesting, but the CPU will be pretty much the same story as with everything else they have launched on the module-based cores, so no, it won't.
 

Reepca



Name of the game is "doing the same with less". On the CPU side, anyway - so far, the GPU hasn't been strong enough to be bottlenecked by the CPU (in games, at least), mostly because the GPU is already bottlenecked by memory bandwidth. Would it be safe to assume that if they're continuing to dedicate more space to the GPU, they probably have some way to handle the bandwidth issue? I've heard good things about some lossless-compression Tonga dealio; is that likely to change gaming performance?

Also, for many (non-render, compute) tasks that could be performed on the iGPU using HSA, would bandwidth remain an issue? If it were being used for collision detection, pathfinding, AI, etc.?
 

juanrga



It is expected that the compression technique can increase the effective bandwidth by up to ~40% compared to Kaveri. How much will it change gaming performance? It depends, because not all games have the same requirements. I believe that BF4 and similar games could run about 20-30% better.
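
To put rough numbers on that, here is a back-of-envelope sketch (the DDR3 bandwidth and compression savings below are my own assumptions for illustration, not AMD figures): if lossless color compression cuts the bytes actually moved by roughly 29%, the same physical memory behaves like a ~40% wider bus.

```python
# Back-of-envelope sketch (illustrative numbers, not AMD figures): if lossless
# framebuffer compression cuts the bytes actually moved by a fraction
# `savings`, the same physical bus behaves like a wider one.

raw_bandwidth_gbs = 25.6   # assumed dual-channel DDR3-1600, roughly Kaveri-class
savings = 0.29             # assumed average compression savings on color traffic

effective_bandwidth_gbs = raw_bandwidth_gbs / (1.0 - savings)
gain = effective_bandwidth_gbs / raw_bandwidth_gbs - 1.0

print(f"effective bandwidth: {effective_bandwidth_gbs:.1f} GB/s (+{gain:.0%})")
# -> effective bandwidth: 36.1 GB/s (+41%)
```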



As stated in the "Future of Compute" slide about HSA workstations that I shared before, copying data between the dCPU and dGPU "kills performance". What AMD means is that the APU runs the computations faster than the dCPU+dGPU combination. Some studies show that even non-HSA APUs can be at least an order of magnitude faster than a dGPU:

http://ieeexplore.ieee.org/xpl/login.jsp?tp=&arnumber=6031577&url=http%3A%2F%2Fieeexplore.ieee.org%2Fxpls%2Fabs_all.jsp%3Farnumber%3D6031577
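
For a feel of why the round-trip copies matter, here is a minimal sketch of the timing model (the PCIe throughput, FLOPS figures, and workload sizes are assumptions I picked for illustration; they are not taken from the paper above):

```python
# Minimal sketch of the "copying kills performance" point: a dGPU's total time
# includes host<->device transfers over PCIe, while an HSA APU can work on the
# data in place. All numbers are assumptions chosen for illustration.

def dgpu_time(bytes_moved, flops, pcie_bps=12e9, dgpu_flops=3e12):
    transfer = 2 * bytes_moved / pcie_bps   # copy input in, copy result back
    compute = flops / dgpu_flops
    return transfer + compute

def apu_time(flops, apu_flops=0.7e12):
    return flops / apu_flops                # zero-copy: no PCIe transfer at all

# One small, data-heavy step per frame (think collision detection or pathfinding):
data = 64e6   # 64 MB touched
work = 2e9    # 2 GFLOP of arithmetic on it

print(f"dGPU: {dgpu_time(data, work) * 1e3:.2f} ms, APU: {apu_time(work) * 1e3:.2f} ms")
# With these assumptions the dGPU spends ~10.7 ms just on PCIe copies, so the
# much slower APU still finishes first (~2.9 ms vs ~11.3 ms).
```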
 

noob2222

Wasn't AMD moving away from GF?

http://wccftech.com/samsung-14nm-process-candidate-amd-zen-architecture/

I can see AMD going with Samsung while Samsung is trying to help GF get their 16/14nm working, not AMD waiting for the latter. If AMD has to wait on GF, that's going to make AMD even later on product launches, making AMD that much less likely to even need GF in the future, hurting GF if they try to deny AMD the use of Samsung.
 

juanrga



No. AMD has moved GPUs and console APUs from TSMC to GF:

http://www.amd.com/en-us/press-releases/Pages/amd-amends-wafer-2014apr1.aspx



Samsung is not "trying to help". Samsung has licensed its 14FF node to GF.

http://www.globalfoundries.com/newsroom/press-releases/2014/04/17/samsung-and-globalfoundries-forge-strategic-collaboration-to-deliver-multi-sourced-offering-of-14nm-finfet-semiconductor-technology

The 14nm node will be the same for both GF and Samsung; that is why Samsung will use GF for fabricating the future Apple A9 chips.

Not that I would doubt a rumour spread by the infallible WCCFTECH (<== sarcasm), but wouldn't it be a bit weird (and economically suicidal) if AMD gave its products to Samsung and then Samsung fabricated them at GF?
 

Reepca



What I mean is that I've read that iGPUs in traditional rendering scenarios (where the iGPU is being used specifically as a graphics processor) are bottlenecked by system memory bandwidth (hence why faster RAM helps quite a bit). Would this limitation apply to other compute cases that aren't traditional rendering scenarios?
 

juanrga



Right, but I didn't say AMD is abandoning TSMC. Someone asked if AMD was "moving away from GF" and I showed that AMD has moved production from TSMC to GF this year.

I think the reason why AMD has moved only some GPUs to GF is because GF lacks the experience and technology for fabricating the bigger and more complex GPUs.
 

juanrga



Yes, I understood it, and I gave you two examples (the "Future of Compute" slide about oil and gas simulation, and the IEEE Xplore paper) showing how APUs can be up to 70 times faster than dGPUs for certain compute workloads. The GDDR5 memory used by the dGPU gives more bandwidth than the DDR3 memory used by the APU, but the PCIe interconnect of the dGPU creates a bottleneck when moving data to and from the CPU.

In any case, just note that the bandwidth problem of APUs is temporary. Future APUs will use stacked DRAM with lots of bandwidth. For instance, I mentioned before that AMD has announced that it will use HSA-enabled APUs for its future extreme-scale supercomputers. The announcement can be found here:

http://ir.amd.com/phoenix.zhtml?c=74093&p=RssLanding&cat=news&id=1989919

I know the details of the project and I can tell you that the engineers claim 4 TB/s of bandwidth for the APU.
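
As a rough illustration of what that kind of bandwidth means for bandwidth-bound code, here is a roofline-style sketch; the DDR3/GDDR5 numbers, peaks, and arithmetic intensity are typical values I am assuming, and only the 4 TB/s figure comes from the claim above:

```python
# Roofline-style sketch: for a memory-bound kernel, attainable throughput is
# min(peak compute, bandwidth * arithmetic intensity). The DDR3/GDDR5 numbers
# and the peaks are typical 2014-era values I am assuming; 4 TB/s is the
# stacked-DRAM claim quoted above.

def attainable_gflops(bandwidth_gbs, flop_per_byte, peak_gflops):
    return min(peak_gflops, bandwidth_gbs * flop_per_byte)

intensity = 1.0  # FLOP per byte, e.g. a streaming, stencil-like kernel

for name, bw_gbs, peak in [("Kaveri-class APU (DDR3)",      25.6,    860),
                           ("dGPU (GDDR5)",                 320.0,  5000),
                           ("future APU (stacked DRAM)",   4000.0, 10000)]:
    print(f"{name:28s} -> {attainable_gflops(bw_gbs, intensity, peak):7.1f} GFLOPS")
```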
 

noob2222

A bit of adding what's not there, as usual.

The AMD wafer agreement was March 2014, for the remainder of 2014, meaning it is strictly for 28nm parts. The Samsung-AMD rumors stated on multiple sites (not just the parrot WCCF site) are for 2015 manufacturing at 14nm.

Where in that other article does it even suggest that Samsung is utilizing GF for the Apple SoC? Samsung owns the fab in Austin.

GF hasn't been very reliable about being on time with their roadmap. I don't see them getting there without Samsung's help, not the other way around, hence why it's "Developed by Samsung and licensed to GLOBALFOUNDRIES, the 14nm FinFET..." This isn't GF manufacturing for Samsung.
 
What I mean is that I've read that iGPUs in traditional rendering scenarios (where the iGPU is being used specifically as a graphics processor) are bottlenecked by system memory bandwidth (hence why faster RAM helps quite a bit). Would this limitation apply to other compute cases that aren't traditional rendering scenarios?

Anything that needs to transfer data to the GPU over the main memory bus is going to be affected. Why do you think we've added so much cache with each CPU generation? It's a hack to work around slow memory access times. dGPUs get around this by putting GBs of super-fast RAM on the card, so you only pay the latency of the initial transfer into VRAM. That's the primary reason dGPUs will ALWAYS outperform iGPUs/APUs, unless we ever go to putting memory directly on the CPU (which is CRAZY expensive to do).
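
A quick sketch of that amortization argument, with purely illustrative numbers I am assuming: the dGPU pays the PCIe upload once and then reuses its local VRAM, while the iGPU re-reads shared DDR3 on every pass.

```python
# Sketch of the "pay the PCIe cost once, then reuse fast VRAM" argument.
# All figures are illustrative assumptions.

pcie_gbs   = 12.0    # assumed effective PCIe 3.0 x16 throughput
vram_gbs   = 320.0   # assumed GDDR5 bandwidth on the dGPU
sysram_gbs = 25.6    # assumed DDR3 bandwidth shared by an iGPU/APU

dataset_gb = 1.0     # resident working set (textures, buffers)
passes     = 100     # how many times the GPU re-reads that data (e.g. frames)

dgpu_s = dataset_gb / pcie_gbs + passes * dataset_gb / vram_gbs  # upload once, then VRAM
igpu_s = passes * dataset_gb / sysram_gbs                        # every pass hits DDR3

print(f"dGPU: {dgpu_s:.2f} s of memory traffic, iGPU: {igpu_s:.2f} s")
# -> dGPU: 0.40 s, iGPU: 3.91 s with these assumptions; the gap grows with reuse.
```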
 

szatkus



GPUs also have caches. GDDR5 has higher latencies than DDR3. Maybe HBM will solve all those problems, because right now it is better than GDDR5 and DDR3/4 in every aspect except price.
 

wh3resmycar

So you guys are back on the "2020: GPU extinction" topic again, hilarious. I just bought a GTX 970 two days ago, and I doubt any APU can match this card in six years' time.

It's not like GPUs will stay at GDDR5 forever either, so can anyone explain how stacked-DDR4 APUs will catch up against stacked (Volta onwards) GDDR6 (or 7)?
 

szatkus


I don't think that dGPUs will go extinct by 2020. It's another class of problems, where clusters with APUs will be better than CPU+GPU. For gaming, traditional GPUs will be safe for at least the next 10 years.

They will more likely use HBM and/or HMC.
 

con635


Maybe with a die shrink they could fit 4-6 SR/EX cores onto a 290, add a bit of HBM, and bam, your 970 is matched within a year?
 
^^ There might not be a future GDDR VRAM. We'll likely see another level of cache memory tacked onto CPUs and GPUs, as well as stacked, high-pin-count, high-bandwidth memory techs. Near-future node shrinks will open up enough space. The problem is whether a foundry like GloFo will be competent enough to fabricate such complex chips or not. TSMC, Intel, and Samsung can, and they have a head start on GloFo.

edit:
AMD Mobile "Carrizo" Family of APUs Arrive in 2015
http://www.techpowerup.com/207481/amd-mobile-carrizo-family-of-apus-arrive-in-2015.html

Prices for AMD's powerful Radeon R9 295X2 graphics card plunge another $200
http://www.pcworld.com/article/2850870/you-can-now-get-amds-radeon-r9-295x2-graphics-card-for-800.html
Low cost, high resolution: Prices for 4K displays sink below $500
http://www.pcworld.com/article/2851372/prices-for-4k-monitors-sink-below-500.html ;)
 


Simple answer: They won't.
 

juanrga



It is not random that Samsung has licensed the 14nm node to GloFo instead of to anyone else. The GloFo link I gave before contains AMD's words about this:



Emphasis mine. AMD wouldn't need to say that if the plan were to use only Samsung for the 14nm products...
 

blackkstar



Stacked memory uses a lot less power, so it lets you use more power on things like GPU cores. I would have to agree, I don't think GDDR has much of a future.

Is there hope for Mantle to exploit a dual GPU card in a way where it doesn't have to mirror everything in each GPU's memory? That seems like a pretty good place for HSA and Mantle to solve a long standing problem.

I am holding my breath for the A and A- 4K IPS panels to show up on eBay. I have a Catleap and it's the best choice I ever made for a monitor. After going to a plasma TV, though, even IPS looks awful. I don't know if I could go back to TN again.
 

Rafael Luik


When did the i3 become able to play anything at higher than 15 FPS with the IGP? Oh yeah. :)
The A10-7700 and A10-7850K IGPs give you 30 FPS at worst and 45~60 FPS in quite modern games at medium/high 720p settings... Some people don't need a dGPU; I think the price is justified.
 
Is there hope for Mantle to exploit a dual GPU card in a way where it doesn't have to mirror everything in each GPU's memory? That seems like a pretty good place for HSA and Mantle to solve a long standing problem.

Depends on how you split the workload. Take Alternate Frame Rendering: you have one card doing one frame, and another card doing the next frame. It's feasible even for DX to not have to duplicate the VRAM contents, since the two cards are working on different frames. There's just a lot of plumbing involved if you want to do it that way (remember: one card is still the "primary" card). For Split-Frame Rendering, however, since both cards are working on the same frame, it's more or less a requirement that they share the same memory context.
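
For what it's worth, here is a minimal sketch of the AFR dispatch idea described above; the class and method names are hypothetical stand-ins, not a real Mantle or DirectX API:

```python
# Minimal sketch of the Alternate Frame Rendering idea described above: even
# frames go to GPU 0, odd frames to GPU 1, so each GPU only ever needs the
# resources for the frames it renders. The names below are hypothetical
# stand-ins, not a real Mantle or DirectX API.

class FakeGpu:
    """Stand-in for one GPU's command queue and local memory."""
    def __init__(self, name):
        self.name = name
        self.frames_rendered = []

    def render(self, frame_index):
        # In a real renderer this would record and submit a command list.
        self.frames_rendered.append(frame_index)

gpus = [FakeGpu("GPU0"), FakeGpu("GPU1")]

for frame in range(8):
    gpus[frame % 2].render(frame)   # AFR: alternate whole frames between cards

for gpu in gpus:
    print(gpu.name, "rendered frames", gpu.frames_rendered)
# GPU0 rendered frames [0, 2, 4, 6]
# GPU1 rendered frames [1, 3, 5, 7]
```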
 