AMD CPUs, SoC Rumors and Speculations Temp. thread 2

cdrkf · Jul 21, 2016

gamerk316 :

Yes, but we aren't talking 40% IPC over an FX8350. We are talking 40% IPC gains over an Excavator-based Athlon X4 845 (which itself is 15 - 25% higher in IPC than the Piledriver-based FX, depending on workload).

That means we are talking around a 60% IPC uplift from the last FX platform, rather than 40%.... it will be a really nice improvement over PD.

Also worth noting that this is in single thread. In multi thread we have 8 *full cores* and multi threading, which means:
1: Zen will mop the floor with PD in FP code (as we have 8 FP units, not 4)
2: Zen is not only faster per core, but we should expect another nice boost from the inclusion of multi threading. I predict that in heavily threaded apps the 8 core, 16 thread Zen is going to be anywhere from 50% to 80% faster than an FX8350 depending on the type of code (as it has 2x the FP throughput, 2x the decode capability as each core has a decoder, and double the number of threads). The *only* disadvantage is a reduction in clocks.

Edit some numbers:
Cinebench R15 Single thread score for X4 845 (assuming running at turbo speed of 3.8ghz): 94
http://stablecomputer.com/amd-fx-8350-review/

Cinebench R15 single thread score for FX8350 at 4.2ghz turbo: 96
http://stablecomputer.com/amd-fx-8350-review/

Assuming zen single thread will run at 3.5ghz (looks plausible given current rumors):
(94/3800) x 3500 = 87
87 x 1.4 (for 40% ipc boost of excavator) = 122

122 / 96 (to compare to PD) = 1.27, i.e. 27% uplift in outright performance in single thread.
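For anyone who wants to rerun the single-thread estimate above, here is a minimal Python sketch. The function name and the assumption that the score scales linearly with clock speed are mine, for illustration only; the inputs are the Cinebench figures quoted in the post. Because the sketch does not round the intermediate 87, it lands on roughly 121 and 1.26 rather than 122 and 1.27.

```python
def projected_single_thread(base_score, base_mhz, target_mhz, ipc_gain):
    """Scale a score linearly to a new clock, then apply an IPC gain.

    Hypothetical helper: linear clock scaling is an assumption, not a
    measured property of Cinebench.
    """
    clock_scaled = base_score / base_mhz * target_mhz
    return clock_scaled * (1 + ipc_gain)


x4_845_score = 94    # Cinebench R15 single thread at 3.8 GHz turbo (from the post)
fx8350_score = 96    # Cinebench R15 single thread at 4.2 GHz turbo (from the post)

# Rumored 3.5 GHz Zen clock and the claimed 40% IPC gain over Excavator
zen_estimate = projected_single_thread(x4_845_score, 3800, 3500, 0.40)
uplift_vs_pd = zen_estimate / fx8350_score

print(f"Estimated Zen single thread score: {zen_estimate:.1f}")
print(f"Uplift over FX8350: {uplift_vs_pd:.2f}x")
```

Changing the target clock or the IPC gain shows how sensitive the final number is to the rumored values.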

Now let's look at what happens with multi thread:
This link shows Cinebench scores for many Intel CPUs, including the same gen i5 and i7 (this is to get an idea of the potential gain due to SMT):
http://www.kitguru.net/components/cpu/luke-hill/intel-core-i7-6700k-i5-6600k-skylake-cpu-review/5/

i5 6600 at stock scores 654
i7 6700 at stock scores 926, however is clocked slightly higher so normalized for clock speed = 860

860 / 654 = 1.31 i.e. 31% uplift from HT. Maybe that is unfair to expect AMD to get quite that much, but I'll go with this number for now...

So, FX 8350 in multi thread scores 628 from previous link.
628 x 1.27 (from higher per core performance) x 1.31 (boost due to SMT) = 1045 multi thread score in CB (which is 66% faster than a FX 8350 in same workload).
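The multi-thread estimate above can be reproduced with a short Python sketch. The variable names are hypothetical, and borrowing Intel's Hyper-Threading gain as a stand-in for Zen's SMT gain is the post's own assumption, not a measurement. The sketch keeps the unrounded SMT factor, so it gives a result a few points above the 1045 in the post.

```python
# Figures quoted in the post (Cinebench R15 multi thread)
i5_6600_score = 654          # 4 cores, no Hyper-Threading
i7_6700_normalized = 860     # 4 cores with Hyper-Threading, clock-normalized
fx8350_multi = 628
per_core_uplift = 1.27       # single-thread estimate from earlier in the post

# Assumption: Zen's SMT gain matches Intel's Hyper-Threading gain
smt_gain = i7_6700_normalized / i5_6600_score

zen_multi = fx8350_multi * per_core_uplift * smt_gain
speedup = zen_multi / fx8350_multi

print(f"SMT gain: {smt_gain:.3f}")
print(f"Estimated Zen multi thread score: {zen_multi:.0f}")
print(f"Speedup over FX8350: {speedup:.2f}x")
```

Note that this still ignores the CMT penalty on PD, so the result is a floor under these assumptions rather than a prediction.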

Edit 2: That multi thread score suggests an 8 core Zen sits close to a Haswell 5820k (6 core). That said, I haven't taken into account the CMT performance penalty PD has, so it could arguably scale higher than that in multi thread. Either way, that puts Zen comfortably above the current i5 / i7 in multi thread (although due to clocks it will be slower in single thread).

jimmysmitty · Jul 21, 2016

-Fran- :

simon12 :

Yes.

simon12 :

Since it is scalable, like I said, the maker can choose whatever suit him best.

simon12 :

Complexity vs cost I'd say is the correct answer to this question. You always need to strike a balance between simplifying the MoBo circuitry so it is cheaper to manufacture (chipset logic included) and how much stuff you can slap into the APU (on-core and in same package).

Putting it to a more concrete scenario, and one that I'd love to see:
APU: Zen + GCN v5 (+ 2GB HBM ~ 256GB/s in-package).
MoBo: ITX / mATX + DDR4 (2 slots) + bazillion USB3.

That would be quite the potent little machine for a living room IMO.

Cheers!

EDIT: Formatting.

Plausible, but to be honest I think Intel will have some sort of built-in memory before AMD does, if only because of the costs involved and the experience: Intel already has the large L4 cache with Broadwell under its belt.

-Fran- · Jul 21, 2016

jimmysmitty :

There is one key difference though... Please correct me if I'm wrong, but from what I know of HMC, it doesn't go as high in terms of density as HBM (GB per mm^3). So if your goal is a dual approach serving both iGPU and CPU, then you would need 8GB+, but getting that amount into the same package would send the price through the roof. So it would only be feasible as a server solution and that's about it. At least, initially. For CPU-only tasks, I don't know if the extra bandwidth is needed.

Now... You can put it as main memory and make the iGPU use it externally just like regular RAM though. I don't think there are any plans for HBM to be used like that.

Cheers!

jimmysmitty · Jul 21, 2016

-Fran- :

I don't expect Intel to use HMC as anything more than a cache and possibly as memory for the iGPU, especially now with 3D XPoint.

That said, I can't find anything that specifies how much memory per HMC module they could have.

HBM will not be used in that way, no. At best it will be, to start, an MCM design with it on package but not on die. That will come later, but I don't know if that will even be cost effective yet.

cdrkf · Jul 21, 2016

jimmysmitty :

If they were to build an APU with HBM, the HBM would have to be on an interposer as a discrete package, due to the fact it's stacked vs a planar design for the rest of the APU. I don't think it can be stacked directly onto the die (at least not yet; I think there may well be ideas for that in the future).

It certainly wouldn't be a budget part, the advantage of it would be space saved for a given performance level compared to discrete components.

juanrga · Jul 21, 2016

cdrkf :

juanrga :

3.0ghz with boost speed of 3.6 wouldn't be a disaster, although I'd hope to see a higher max boost speed for a single core than that. Base speed of 3ghz on an 8 core part wouldn't be a bad thing.

The clock problem will be minimized in an octo-core but will amplify in quad-cores.

juanrga · Jul 21, 2016

gamerk316 :

Are you talking about single thread or multi-thread? If it is the former then it is better to use max turbo clocks. If it is about the latter then the IPC gain is not 40%, but 40% plus the SMT gain plus the gain from PD-->Excavator

1.4 x 1.2 x ~1.2 = ~ 2.0

This is close to the 2x drawn in the Orochi vs Summit slide
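As a quick check of the multiplication above, here is a tiny Python sketch. The dictionary keys are just labels I chose; the three factors are the approximate values given in the post, not measured data.

```python
import math

# Approximate factors from the post
factors = {
    "IPC gain over Excavator": 1.4,
    "Piledriver to Excavator": 1.2,
    "SMT gain": 1.2,
}

# Independent multiplicative gains compound by multiplication
total_gain = math.prod(factors.values())

print(f"Combined multi-thread gain: {total_gain:.2f}x")
```

The gains compound multiplicatively rather than add, which is why three modest factors reach roughly 2x.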

Cazalan · Jul 21, 2016

-Fran- :

The underlying memory arrays are largely the same, so the capacities can be similar if they choose. The main difference between HBM and HMC is the physical interface. HBM is a very wide parallel interface (speeds similar to DDR2). HMC uses very fast serial interfaces (about PCIe 4.0 level). Both could be used as main memory if they choose, but the capacities probably aren't there yet for the projects they're being used on.

If you consider today that Vega/Pascal will be using 4GB HBM2 they're getting 16GB this year with 4 modules. There are supposed to be 8GB versions due out later which would yield 32GB.

My main system only has 16GB RAM but it rarely uses more than 8GB, so if you built an expensive APU with 16GB/32GB I think you'd be set for memory for many workloads.

Heck, I've got a laptop that gets by with 4GB and an SSD. It's fine for most everyday tasks.

juanrga · Jul 21, 2016

jimmysmitty :

Intel must be using the 2GB HMC 1.0 modules in the KNL CPU. I don't know the capacity of HMC 2.0 modules, but the 3.0 spec is expected to increase this to 8GB. However, module capacity doesn't tell the whole story. HMC has a far-link mode that allows modules to be connected to other modules to increase overall capacity. We have to recall that HMC has been designed to replace DDR4 as main memory, whereas HBM hasn't.

gamerk316 · Jul 26, 2016

Several sites are reporting the Nintendo NX is basically a portable tablet running on Tegra Hardware, so it looks like AMD is shut out in that front.

Also, confirms what I suspected over a year ago.

juanrga · Jul 26, 2016

AMD got the console business thanks to a good amount of luck. Both Microsoft and Sony wanted ARM hardware for the consoles, but ARM was only 32-bit when they started to design the PS4 and the Xbox One. There was a technical "bake-off" between ARM hardware and x86 hardware, and in the end both companies decided that ARM was not ready and that 32-bit wasn't enough. They then went the x86 route, which left Nvidia out of the competition and AMD and Intel as the only players. Intel declined to provide custom SoCs, and so AMD won both console designs. Today, with ARM64 hardware, the consoles wouldn't be using Jaguar cores. In fact, even AMD has replaced Jaguar cores with ARM cores in its server SoCs.

With ARM64 now a reality and mature, I am not surprised that Nintendo chose ARM hardware and that Nvidia won out over AMD. I have been predicting this kind of stuff for a while now.

-Fran- · Jul 26, 2016

Pats on the back for everyone!

Cheers!

jimmysmitty · Jul 26, 2016

juanrga :

The funny thing is I think this was a blessing for Microsoft in the long run, which will also be a blessing for AMD. They are probably going to stick with x86, as they plan to make it so the XB1, XB1 S and Scorpio all work with the same games and that games also have cross platform play with PC.

I think they will use Windows 10 to help unify that, which will be a massive benefit to PC.

Of course I can't say they will always stick with AMD or even x86 as better options may come along.

ComputerSecurityGuy · Jul 26, 2016

I was suspecting the NX would be ARM, but I thought it would be AMD. So somebody is spending hundreds of millions of dollars on a semi custom SoC from AMD, and it is ARM based. And we don't know who.

Vogner16 · Jul 27, 2016

Regarding your discussion of an APU with HBM... if we are seeing leaks of HBM in an APU in 2018, isn't that when AMD's "next gen memory" graphics cards come out?

I have serious doubts an HBM APU will ever see the light of day. More likely AMD will do a new socket with quad memory channels like Intel's 2011 socket.

juanrga · Jul 28, 2016

Vogner16 :

The HPC APU with HBM already uses a quad-channel DDR4 socket.

jdwii · Jul 28, 2016

ComputerSecurityGuy :

I basically thought they would be using Arm plus an AMD custom GPU. But that would probably have cost even more, since it would have to be a heavily custom SoC. Plus Nintendo needed the best possible solution for performance per watt, and that leaves AMD so far off the table it's not even funny.

Jaguar isn't even that great either. I actually suspect the NX will have a competitive CPU, or something slightly better than what is in the PS4/Xbox One. In terms of GPU, lol, well, if they don't make an actual console I do not expect much above the Wii U currently. The Tegra X1 tablet only has 365 GFLOPS, which is basically even with the Wii U. The amount of RAM in the NX could be up to 8GB, but again, as it's a portable, I do not expect it to be GDDR5 memory. Unless the dock that the portable plugs into offers more power, or a way to cool the tablet more so it can offer more power, I do not expect much.

AMD would have won this battle if they had made a decent Arm design like they said they would. Instead it seems like they just threw Arm off the table.

ComputerSecurityGuy · Jul 28, 2016

Well we have confirmation on three big SoC orders, two AMD64 and one ARM. So somebody placed an ARM order.

Cazalan · Aug 2, 2016

ComputerSecurityGuy :

With the recent XBOne S release now we know of 3 new ones.

XBone S(lim)
XBone Scorpio
PS4 Neo

Unless they are not counting Scorpio, as that is due fall 2017.

ComputerSecurityGuy · Aug 2, 2016

Well, I doubt it's ARM. I wouldn't be surprised if the Xbox1/Xbox1S actually use the same SoC, unless somebody has confirmation of something else. Just a faster clock due to a more mature process, and 4K video support included via the driver.

Cazalan · Aug 3, 2016

ComputerSecurityGuy :

Xbox1S is a new 16nm SoC from AMD made at TSMC.

http://www.eurogamer.net/articles/digitalfoundry-2016-inside-xbox-one-s-tech-interview

Largely just a die shrink + 4k capabilities. Uses much less power as the CPU clocks were kept the same. GPU clocks got a slight boost (853 MHz to 914 MHz).
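To put the "slight boost" in numbers, here is a minimal Python sketch using only the two GPU clocks quoted above. The variable names are my own.

```python
# GPU clocks quoted above, in MHz
xbox_one_gpu_mhz = 853
xbox_one_s_gpu_mhz = 914

# Relative increase of the new clock over the old one, in percent
boost_pct = (xbox_one_s_gpu_mhz - xbox_one_gpu_mhz) / xbox_one_gpu_mhz * 100

print(f"GPU clock boost: {boost_pct:.1f}%")
```

That works out to roughly a 7% clock increase.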

jimmysmitty · Aug 3, 2016

Cazalan :

Strange that they went with TSMC's 16nm and not GloFo's 14nm. Could that be a sign that yields are not high enough, or that GloFo can't handle that big of an order on top of the Radeon and Zen they already have?

ComputerSecurityGuy · Aug 4, 2016

Could just be price. That's crucial in consoles.

gamerk316 · Aug 4, 2016

jimmysmitty :

Cazalan :

Strange that they went with TSMC's 16nm and not GloFo's 14nm. Could that be a sign that yields are not high enough, or that GloFo can't handle that big of an order on top of the Radeon and Zen they already have?

Who produced the original chip? If TSMC, it could just be a prior contractual agreement.

jimmysmitty · Aug 4, 2016

gamerk316 :

jimmysmitty :

Who produced the original chip? If TSMC, it could just be a prior contractual agreement.

Hard to say, but I would assume agreements would stop at a certain process, because I am sure AMD would want the best process for the new APU.

At least I know I would, and Polaris is not showing me that GloFo's 14nm is very power efficient; the latest 470 shows power numbers near a 1070.
