AMD CPUs, SoC Rumors and Speculations Temp. thread 2

Status
Not open for further replies.


Yes, but we aren't talking 40% IPC over an FX8350. We are talking 40% IPC gains over an Excavator based Athlon X4 845 (which itself is 15 - 25% higher in IPC than the Piledriver based FX, depending on workload).

That means we are talking around a 60% IPC uplift from the last FX platform, rather than 40%.... it will be a really nice improvement over PD.
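A quick sketch of how the two claimed gains compound, using only the figures quoted above (40% over Excavator, 15 - 25% Excavator over Piledriver). The variable names are made up for illustration, and this assumes the gains simply multiply, which is a simplification:

```python
# Compound the two claimed IPC gains (figures taken from the post above).
zen_over_excavator = 1.40                   # claimed 40% IPC gain over Excavator
excavator_over_piledriver = (1.15, 1.25)    # claimed 15 - 25% gain over Piledriver

# Assumption: the gains multiply cleanly, which real workloads rarely do.
low, high = (zen_over_excavator * g for g in excavator_over_piledriver)

print(f"Zen over Piledriver IPC: +{(low - 1) * 100:.0f}% to +{(high - 1) * 100:.0f}%")
```

So "around 60%" is the low end of the range if Excavator really is 15 - 25% ahead of Piledriver.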

Also worth noting: that is in single thread. In multi thread we have 8 *full cores* plus simultaneous multithreading (SMT), which means:
1: Zen will mop the floor with PD in FP code (as we have 8 FP units, not 4)
2: Zen is not only faster per core, but we should expect another nice boost from the inclusion of SMT. I predict that in heavily threaded apps the 8 core, 16 thread Zen is going to be anywhere from 50% to 80% faster than an FX8350, depending on the type of code (it has 2x the FP throughput, 2x the decode capability since each core has its own decoder, and double the number of threads). The *only* disadvantage is a reduction in clocks.

Edit some numbers:
Cinebench R15 Single thread score for X4 845 (assuming running at turbo speed of 3.8ghz): 94
http://stablecomputer.com/amd-fx-8350-review/

Cinebench R15 single thread score for FX8350 at 4.2ghz turbo: 96
http://stablecomputer.com/amd-fx-8350-review/

Assuming zen single thread will run at 3.5ghz (looks plausible given current rumors):
(94/3800) x 3500 = 87
87 x 1.4 (for 40% ipc boost of excavator) = 122

122 / 96 (to compare to PD) = 1.27, i.e. 27% uplift in outright performance in single thread.
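For anyone who wants to play with the inputs, the single thread estimate above can be sketched like this. It assumes scores scale linearly with clock speed (a simplification), and `scale_score` is just a helper name made up for this example:

```python
def scale_score(score, from_mhz, to_mhz):
    """Scale a benchmark score linearly with clock speed (a simplification)."""
    return score / from_mhz * to_mhz

x4_845_score = 94      # Cinebench R15 single thread at 3800 MHz (from the post)
fx8350_score = 96      # Cinebench R15 single thread at 4200 MHz turbo (from the post)
zen_clock_mhz = 3500   # rumored Zen clock, an assumption
ipc_gain = 1.40        # claimed IPC gain over Excavator

# Excavator score normalized to the assumed Zen clock, then the IPC gain applied.
excavator_at_zen_clock = scale_score(x4_845_score, 3800, zen_clock_mhz)
zen_estimate = excavator_at_zen_clock * ipc_gain
uplift = zen_estimate / fx8350_score

print(f"Excavator at {zen_clock_mhz} MHz: {excavator_at_zen_clock:.1f}")
print(f"Zen estimate: {zen_estimate:.1f}")
print(f"Uplift over FX8350: {uplift:.2f}x")
```

Carrying full precision gives an uplift of about 1.26 rather than 1.27; the difference comes from rounding to 87 and 122 at each step in the figures above.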

Now let's look at what happens with multi thread:
This link shows Cinebench scores for many Intel CPUs, including the same gen i5 and i7 (this is to get an idea of the potential gain due to SMT):
http://www.kitguru.net/components/cpu/luke-hill/intel-core-i7-6700k-i5-6600k-skylake-cpu-review/5/

i5 6600 at stock scores 654
i7 6700 at stock scores 926; however, it is clocked slightly higher, so normalized for clock speed that comes to 860

860 / 654 = 1.31 i.e. 31% uplift from HT. Maybe that is unfair to expect AMD to get quite that much, but I'll go with this number for now...

So, FX 8350 in multi thread scores 628 from previous link.
628 x 1.27 (from higher per core performance) x 1.31 (boost due to SMT) = 1045 multi thread score in CB (which is 66% faster than a FX 8350 in same workload).
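The multi thread projection can be sketched the same way. All inputs are the numbers quoted above; the variable names are made up for illustration, and the SMT gain is rounded to two decimals to match the arithmetic in the post:

```python
fx8350_multi = 628          # Cinebench R15 multi thread score for FX8350 (from the post)
per_core_uplift = 1.27      # single thread uplift estimated earlier in the post
i5_score = 654              # i5 at stock (4 cores, no HT)
i7_normalized = 860         # i7 score after clock normalization (4 cores with HT)

# SMT gain inferred from the Intel pair; assuming AMD matches it is optimistic.
smt_gain = round(i7_normalized / i5_score, 2)

zen_multi = fx8350_multi * per_core_uplift * smt_gain
ratio = zen_multi / fx8350_multi

print(f"SMT gain: {smt_gain:.2f}x")
print(f"Projected Zen multi thread score: {zen_multi:.0f}")
print(f"Relative to FX8350: {ratio:.2f}x")
```

Lowering `smt_gain` to something more conservative is an easy way to see how much of the projection depends on AMD matching Intel's HT scaling.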

Edit 2: That multi thread score suggests an 8 core Zen sits close to a Haswell 5820k (6 core). That said, I haven't taken into account the CMT performance penalty PD has, so it could arguably scale higher than that in multi thread. Either way, that puts Zen comfortably above the current i5 / i7 in multi thread (although due to clocks it will be slower in single thread).
 


Plausible, but to be honest I think Intel will have some sort of built-in memory before AMD does, if only due to the costs involved and the experience: Intel already has the large L4 cache with Broadwell under its belt.
 


There is one key difference though... Please correct me if I'm wrong, but from what I know of HMC, it doesn't go as high in terms of density as HBM (GB per mm^3). So if your goal is a dual approach serving both iGPU and CPU, you would need 8GB+, but getting that amount into the same package would send the price through the roof. So it would only be feasible as a server solution, at least initially. For CPU-only tasks, I don't know if the extra bandwidth is needed.

Now... You can put it as main memory and make the iGPU use it externally just like regular RAM though. I don't think there are any plans for HBM to be used like that.

Cheers!
 


I don't expect Intel to use HMC as anything more than a cache and possibly as memory for the iGPU, especially now with 3D XPoint.

That said, I can't find anything that specifies how much memory per HMC module they could have.

No, HBM will not be used in that way. At best it will start as an MCM design, with the memory on package but not on die. On die will come later, but I don't know if that will even be cost effective yet.
 


If they were to build an APU with HBM, the HBM would have to sit on an interposer as a discrete package, since it's stacked whereas the rest of the APU is a planar design. I don't think it can be stacked directly onto the die (at least not yet; there may well be ideas for that in the future).

It certainly wouldn't be a budget part, the advantage of it would be space saved for a given performance level compared to discrete components.
 


The clock problem will be minimized in an octo-core but will amplify in quad-cores.
 


Are you talking about single thread or multi-thread? If it is the former, then it is better to use max turbo clocks. If it is the latter, then the gain is not 40%, but 40% compounded with the SMT gain and the gain from PD to Excavator:

1.4 x 1.2 x ~1.2 = ~ 2.0

This is close to the 2x drawn in the Orochi vs Summit slide

AMD-Zen-Performance-Double-FX-83501.jpg
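The product behind that ~2x figure, as a small sketch. The three factors are the rough values from the post above, the labels are made up for illustration, and treating them as independent multipliers is an assumption:

```python
import math

# Rough multipliers from the post; treating them as independent is an assumption.
gains = {
    "Zen IPC over Excavator": 1.4,
    "Excavator over Piledriver": 1.2,
    "SMT": 1.2,
}

total = math.prod(gains.values())
print(f"Combined multi thread gain over Piledriver: {total:.2f}x")
```

Note this compares per clock throughput, so any clock speed deficit would come off the top.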
 


The underlying memory arrays are largely the same, so the capacities can be similar if they choose. The main difference between HBM and HMC is the physical interface. HBM is a very wide parallel interface (speeds similar to DDR2). HMC uses very fast serial interfaces (about PCIe 4.0 level). Both could be used as main memory if they choose, but the capacities probably aren't there yet for the projects they're being used on.

Consider that Vega/Pascal will be using 4GB HBM2: with 4 modules they're getting 16GB this year. There are supposed to be 8GB versions due out later, which would yield 32GB.

My main system only has 16GB RAM but it rarely uses more than 8GB, so if you built an expensive APU with 16GB/32GB I think you'd be set for memory for many workloads.

Heck, I've got a laptop that gets by with 4GB and an SSD. It's fine for most everyday tasks.
 


dram_table_lrg.png


Intel must be using the 2GB HMC 1.0 modules in the KNL CPU. I don't know the capacity of HMC 2.0 modules, but the 3.0 spec is expected to increase this to 8GB. However, module capacity doesn't tell the whole story. HMC has a far-link mode that allows modules to be connected to other modules to increase overall capacity. We have to recall that HMC was designed to replace DDR4 as main memory, whereas HBM was not.
 
AMD got the console business thanks to a good amount of luck. Both Microsoft and Sony wanted ARM hardware for the consoles, but ARM was only 32-bit when they started to design the PS4 and the Xbox One. There was a technical "bake-off" between ARM hardware and x86 hardware, and in the end both companies decided that ARM was not ready and that 32-bit wasn't enough. Then they went the x86 route, which left Nvidia out of the competition and AMD and Intel as the only players. Intel declined to provide custom SoCs, and so AMD won both console designs. Today, with ARM64 hardware, the consoles wouldn't be using Jaguar cores. In fact, even AMD has replaced Jaguar cores with ARM cores in its server SoCs.

With ARM64 now a reality and mature, I am not surprised that Nintendo chose ARM hardware and that Nvidia won out over AMD. I have been predicting this kind of stuff for a while now.

 


The funny thing is I think this was a blessing for Microsoft in the long run, which will also be a blessing for AMD. They are probably going to stick with x86, as they plan to make the XB1, XB1 S and Scorpio all work with the same games and to give those games cross platform play with PC.

I think they will use Windows 10 to help unify that, which will be a massive benefit to PC.

Of course I can't say they will always stick with AMD or even x86 as better options may come along.
 
Regarding your discussion of an APU with HBM... if we are seeing leaks of HBM in an APU in 2018, isn't that when AMD's "next gen memory" graphics cards come out?

I have serious doubts that an HBM APU will ever see the light of day. More likely AMD will do a new socket with quad memory channels, like Intel's 2011 socket.
 


The HPC APU with HBM already uses a quad-channel DDR4 socket.
 


I basically thought they would be using ARM plus an AMD custom GPU. But that would probably have cost even more, since it would have to be a heavily customized SoC. Plus Nintendo needed the best possible performance per watt, and that leaves AMD so far off the table it's not even funny.

Jaguar isn't even that great either; I actually suspect the NX will have a CPU that is competitive with, or slightly better than, what is in the PS4/Xbox One. In terms of GPU, well, if they don't make an actual console I do not expect much above the Wii U. The Tegra X1 tablet only has 365 GFLOPS, which is basically even with the Wii U. The amount of RAM in the NX could be up to 8GB, but again, as it's a portable I do not expect it to be GDDR5. Unless the dock that the portable plugs into offers more power, or a way to cool the tablet better so it can run faster, I do not expect much.

AMD would have won this battle if they had made a decent ARM design like they said they would; instead it seems like they just threw ARM off the table.
 


With the recent XBOne S release now we know of 3 new ones.

XBone S(lim)
XBone Scorpio
PS4 Neo

Unless they are not counting Scorpio as that is due fall 2017.
 
Well, I doubt it's ARM. I wouldn't be surprised if the Xbox1/Xbox1S actually use the same SoC, unless somebody has confirmation of something else: just a faster clock thanks to a more mature process, plus 4K video support added in the driver.
 


Xbox1S is a new 16nm SoC from AMD made at TSMC.

http://www.eurogamer.net/articles/digitalfoundry-2016-inside-xbox-one-s-tech-interview

Largely just a die shrink + 4k capabilities. Uses much less power as the CPU clocks were kept the same. GPU clocks got a slight boost (853 MHz to 914 MHz).

 


Strange that they went with TSMC's 16nm and not GloFo's 14nm. Could that be a sign that yields are not high enough, or that GloFo can't handle that big of an order on top of the Radeon and Zen orders they already have?
 


Who produced the original chip? If TSMC, it could just be a prior contractual agreement.
 


Hard to say, but I would assume agreements would stop at a certain process, because I am sure AMD would want the best process for the new APU.

At least I know I would, and Polaris is not showing me that GloFo's 14nm is very power efficient: the latest 470 shows power numbers near a 1070.
 