AMD CPUs, SoC Rumors and Speculations Temp. thread 2

Status
Not open for further replies.


Considering that BD/PD were long pipeline/high clock speed designs, and AMD's new uarch will be shorter/wider pipelines...there will certainly be some decrease in clock speed from the FX9590 to Zen...though the reality is that things are coming along quite smoothly...and expectations are high.

However, you imply the process at Samsung is going to be terrible for high clocks...based on the feedback I am hearing...that is really just not the case.

As for the AM4 platform...I am a bit confused. How can you sit and say that is going to have any impact on the clockspeed of the CPU architecture, when sufficient details to look at flow charts, etc. are not really available right now? I know such technical engineering diagrams would not be out in the open any time soon...

Please elaborate on that technomancy...
 


He is referring to the compromises taken by AMD engineers when designing the common AM4 platform. You cannot design a socket that is at once efficient, cheap, and high performance, because those are contradictory requirements. That is why Intel split its sockets into mainstream and HEDT, CPU and APU,...

Skybridge was canceled because it was nonsense of giant proportions, and the Carrizo/Carrizo-L common platform turned out to be a disaster, with OEMs releasing single-channel Carrizo boards. The Stilt discussed some time ago precisely the design mistakes AMD made on the Carrizo/Carrizo-L common platform.

AMD promised a common platform to reduce costs for OEMs. The problem is that OEMs were confronted with a dilemma: either design two different boards (one for Carrizo and another for Carrizo-L chips), killing the advantage of the common socket, or design a common board using Carrizo-L as the lowest common denominator. Most OEMs chose the latter option. The problem was in wiring the memory signals from the socket to the board. If engineers connect the dual DDR channels to the corresponding Carrizo pins, then one of the channels is automatically disabled when you insert a Carrizo-L chip lacking that feature, which means people would believe one of the slots in their hardware is broken, when in reality it is merely disabled. The other option is to connect both memory slots to the same channel on the socket. Then both slots are always enabled, but Carrizo chips are limited to single-channel memory access, which kills iGPU performance.
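That dilemma can be sketched as a toy model. The function and option names below are made up purely for illustration; they are not actual board or BIOS terminology:

```python
# Hypothetical model of the two board-wiring options described above,
# for a 2-slot board. Names are invented for illustration.
# Option A: one slot per channel; a single-channel Carrizo-L then
#           leaves one slot dead, looking "broken" to the user.
# Option B: both slots on one channel; every chip sees both slots,
#           but dual-channel Carrizo is forced down to one channel.
def active_setup(wiring, chip_channels):
    """Return (usable_slots, usable_channels) for a given chip."""
    if wiring == "one_slot_per_channel":      # Option A
        return chip_channels, chip_channels
    if wiring == "both_slots_one_channel":    # Option B
        return 2, 1
    raise ValueError(wiring)

# Carrizo supports 2 memory channels, Carrizo-L only 1:
print(active_setup("one_slot_per_channel", 1))    # (1, 1): one slot looks broken
print(active_setup("both_slots_one_channel", 2))  # (2, 1): iGPU starved
```

Either wiring choice penalizes one of the two chips, which is the "taking compromises" point being made.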
 


Sure...however...Zen and the APUs will be dual channel memory.

So, you will not have the same issues as Carrizo/Carrizo-L.

Additionally, those same lessons learned are being applied elsewhere. The minimum specs are geared toward the flagship part, obviously.

As for the iGPUs, I am not exceedingly familiar with how they are handling that, however, I expect it would have something to do with adding ~200 pins to the socket...
 


I know it will be dual-channel... I predicted that years ago.

I am not saying that the AM4 common platform will have the same issues as Carrizo/Carrizo-L. I only used the common socket approach on the Carrizo platform to illustrate how a common design implies making compromises. The issues that AM4 will have are the limitation to dual-channel memory, the overclocking limits,...

As for the iGPU, I wrote a post about how AM4 includes extra power planes (not found in AM3+) for the iGPU, but some mod deleted my post....
 


If those are specifically for iGPU on APUs, I am sure that it adds complexity, but not such that it would make overclocking more difficult.

As my understanding goes, AM4 has more in common with FM2+ than it does with AM3+. Given that being the case, I do not foresee lots of problems. Many APUs with deactivated iGPUs overclocked well on such a platform.

I think that apprehension over complexity is best reserved until it launches; then we can see what happens with the platform from there.
 
Each CPU core is backed by 512K of L2 cache, with 32MB of L3 cache across the entire chip. Interestingly, the L3 cache is shown as 8MB contiguous blocks rather than a unified design. This suggests that Zen inherits its L3 structure from Bulldozer, which used a similar approach, though hopefully the cache has been overhauled for improved performance. The integrated GPU also supposedly offers double-precision floating point at 1/2 single-precision speed.

 

Bulldozer had unified L3 across the whole chip with L2 shared across modules. Zen will be like the console APUs where the last level cache is shared across only 4 cores. That design will help the speed for the L3 which is probably why they went with a small L2. Just looking at the design here will tell you cache speed will probably be increased. Smaller cache = faster cache in general.
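Taking the rumored figures in the two posts above at face value, a quick consistency check (assuming nothing beyond the quoted numbers: 512 KB of L2 per core, 32 MB of L3 in 8 MB blocks, each block shared by 4 cores):

```python
# Arithmetic on the rumored cache figures, taken at face value.
L3_TOTAL_MB = 32     # total L3 quoted in the leak
L3_BLOCK_MB = 8      # size of each contiguous L3 block
CORES_PER_BLOCK = 4  # cores sharing a block, per the console-APU comparison
L2_PER_CORE_KB = 512 # private L2 per core

blocks = L3_TOTAL_MB // L3_BLOCK_MB           # 4 L3 blocks
cores = blocks * CORES_PER_BLOCK              # implies a 16-core part
total_l2_mb = cores * L2_PER_CORE_KB // 1024  # 8 MB of L2 overall
print(blocks, cores, total_l2_mb)             # 4 16 8
```

So the quoted numbers only hang together for a 16-core die; a consumer 8-core part would carry half that L3 (two 8 MB blocks).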
 


Even with smaller cache sizes they still have a lot of work to do on speeding it up. Not only that, but they need to lower the latency. Intel's cache has always been much faster, which helps a lot.

They also need to implement the same cache tech where the last-level L3 cache keeps instructions so the core doesn't have to access main memory again to re-run them.
 


They would have to change from a victim cache to do that. However, they can store non-duplicated instructions in L3 as it sits now. Missed branches just hurt a lot more in their current cache setup. According to what I have read/heard/seen, branch prediction should improve by a very large amount, which was part of the performance deficit before anyway.
 


Branch prediction is a part of IPC; you're double dipping on your math.
 


Most of the IPC increase seen on Excavator is due to the smaller L2 cache. Zen's L2 cache is the same size as Excavator's, but there is another important aspect that affects performance: the L2 cache on Zen is private, which will help reduce latency compared to Excavator.

In any case, it is worth mentioning that Haswell/Broadwell has a smaller 256KB L2 cache, which suggests that either AMD engineers are not targeting the same high clocks as Intel's designs or the latency of Zen's cache design is higher.
 


No. As explained to you hundreds of times, performance is ( IPC * frequency ). Therefore if you increase IPC by 40%, the only other way to increase performance further is by increasing clocks.

As correctly noted by gamerk, branch prediction is already included in the IPC. In fact, IPC can be approximated to first order by

IPC ~ alpha * sqrt (W) / L

where alpha is a program-dependent parameter, W is the OOOE window depth, and L is the average instruction latency. The effects of the branch misprediction penalty are included in the parameter L.

Construction cores had good branch prediction hardware. The problem was the long pipeline, which targeted unreachable clocks while increasing the branch misprediction penalty, which reduced the IPC.
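The trade-off can be illustrated numerically with the quoted first-order model. The concrete alpha, W, L, and frequency values below are invented purely to show the shape of the argument, not measurements of any real core:

```python
import math

# First-order model quoted above:
#   performance = IPC * frequency,   IPC ~ alpha * sqrt(W) / L
# alpha: program-dependent parameter, W: out-of-order window depth,
# L: average instruction latency, including the amortized
# branch-misprediction penalty. All numbers below are made up.
def ipc(alpha, window, avg_latency):
    return alpha * math.sqrt(window) / avg_latency

def performance(alpha, window, avg_latency, freq_ghz):
    return ipc(alpha, window, avg_latency) * freq_ghz

# A longer pipeline can clock higher, but each mispredict costs more
# cycles, inflating L. Here the clock gain is wiped out by the L hit:
short_pipe = performance(alpha=1.0, window=64, avg_latency=4.0, freq_ghz=4.0)
long_pipe  = performance(alpha=1.0, window=64, avg_latency=5.5, freq_ghz=4.7)
print(short_pipe > long_pipe)   # True: the IPC loss outweighed the clock bump
```

With these toy numbers the longer pipeline loses despite its higher clock, which is exactly the construction-core story being told above.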
 


Yeah, I was thinking about 2 separate issues while I was typing that post, and it actually kind of co-mingled into that.

That is what I get for trying to multi-task too hard...
 
Polaris info...

http://wccftech.com/amd-radeon-r9-480-470-polaris-10-polaris-11/

+NaCl

New mobile processors drop:

http://www.amd.com/en-us/press-releases/Pages/amd-accelerates-availability-2016apr05.aspx

Bristol Ridge has arrived.

Interesting if true:

http://finance.yahoo.com/news/behind-amd-polaris-nvida-pascal-200542139.html

Specifically, Pascal will use 16FP mixed-precision computing, which has lower accuracy than the standard 32FP, making it unsuitable for modern gaming applications. But 16FP should deliver strong power efficiency, making it more suitable for mobile devices.

Keep that pinch of salt out.
 
Specifically, Pascal will use 16FP mixed-precision computing, which has lower accuracy than the standard 32FP, making it unsuitable for modern gaming applications. But 16FP should deliver strong power efficiency, making it more suitable for mobile devices.

What I see here is a finance site not knowing what they're looking at. It's more than likely that several mobile versions would use FP16 for the power savings, but there's literally between a zero and zero percent chance consumer boards would go this route.
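For what it's worth, the precision gap the article is talking about is easy to see with NumPy: FP16 keeps 10 mantissa bits against FP32's 23, so it already runs out of integer precision above 2048.

```python
import numpy as np

# Machine epsilon: the smallest relative step each format can represent.
eps16 = np.finfo(np.float16).eps   # ~9.77e-04 (10 mantissa bits)
eps32 = np.finfo(np.float32).eps   # ~1.19e-07 (23 mantissa bits)
print(eps16 / eps32)               # FP16 is roughly 8000x coarser

# Above 2048, consecutive FP16 values are 2 apart, so adding 1 is
# lost entirely to round-to-nearest-even:
print(np.float16(2048) + np.float16(1))   # 2048.0 (the +1 vanished)
print(np.float32(2048) + np.float32(1))   # 2049.0
```

Fine for things like texture blending or inference weights, but it shows why FP16 alone is a poor fit wherever full-precision results matter.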
 


Yeah, I usually take the tech bits from finance sites with a grain of salt, if I bother to take them into consideration at all.

What does intrigue me though, is Nvidia running gpu compute through software emulation, and not through hardware abstraction.
 


Yet.

Remember AMD/ATI had tessellation before nVidia. I don't think I need to revisit that part of history.

My point is, AMD might have an advantage in the hardware component now, but given history, they need to capitalize on it before we can say they succeeded on that front. Especially with no wide support of said feature.

Cheers!
 


The problem with most of these features is, much like anything with a GPU or even a CPU, they normally all end up so close that the performance difference is meh.

Even if AMD capitalized on it, it won't be long before nVidia has a competitive product, and they might be able to do it better. I remember when ATI had the better workstation GPU, but then nVidia dropped CUDA and everything changed.

That is a good thing though. Having them go back and forth is a great way for us to benefit.
 


Considering the doors Mantle/DX12/Vulkan are already opening...the use of the word yet indicates they are already behind.

Wait and see...when Polaris drops there will be enough DX12-capable software out there to show off.
 
Well, yeah to both of you.

There is zero chance nVidia will not join the party, and DX12 will become relevant now with all the VR. These upcoming months will be interesting in terms of what games come out and the "feature set" they support from DX12 and devices.

Related: what does LiquidVR really do? Is it another framework for building VR-centric products (something like Blender, but for VR)? Or just a bundle of tools that complement graphics engines using AMD cards?

Cheers!
 
S|A Bristol Ridge update analysis: http://semiaccurate.com/2016/04/05/amd-announces-its-7th-generation-of-a-series-apus/

Interestingly, AM4 will support 140W+ chips...meaning some very high end options may well be in order.

 


140W is not even mentioned in the link...
 


No, it does not...

This is the source of the 140W info: http://www.eteknix.com/amd-am4-will-feature-140w-support-and-%CE%BCopga/
 


Liquid VR is a complete package in terms of 3D support.

Liquid VR will be a driver designed to set up and render the multiple separate displays for VR coherently, and at high frame rates. It also has built-in features that allow developers to better design VR/3D-oriented games by giving them access to more options in VR.
 