AMD Piledriver rumours ... and expert conjecture

Page 297
Status
Not open for further replies.
We have had several requests for a sticky on AMD's yet-to-be-released Piledriver architecture ... so here it is.

I want to make a few things clear though.

Post a question relevant to the topic, or information about the topic, or it will be deleted.

Post any negative personal comments about another user ... and they will be deleted.

Post flame-baiting comments about the blue, red and green teams and they will be deleted.

Enjoy ...
 


A couple are linked above; they mentioned TrustZone, but talked about third-party IP assisting HSA. Nobody knows which GCN core will be used, so the speculation is Sea Islands: its release is soon, while Steamroller is slated for Q4, which provides enough time to implement a far more power-efficient GCN core on die.

I guess the bottleneck is no less related to the existing memory sharing with the CPU. The articles state dedicated memory for the iGPU component, along with a much improved IMC; they mentioned a unified IMC.

That is what AMD targeted; that is what they want. No numbers are being bandied around, but at face value this looks like a monumental leap forward in iGPU performance, with architectural changes to the CPU, GPU and IMC components. With the impressive results Trinity has shown, there is nothing to suggest that Kaveri won't continue the trend.

I think the bigger question is what this means for FX. I wouldn't be surprised if it is terminated, as APUs will already deliver similar performance; it makes no sense flogging something that AMD sees no future in.
 


First of all, unless AMD has a new promo slide out on Steamroller, all they are promising is 15% performance gains each generation. PD was 5% IPC and 10% base clock = 15% total on average. Where do you get 30% IPC alone for Steamy??

2nd, what do you think 850 Radeon cores will draw TDP-wise?? Remember, AMD and Intel are both going after the mobile and ultra-mobile markets. Unless GF or maybe TSMC can get their 28nm transistor power down by 75% (extremely implausible), then with double the iGPU transistors (based on roughly double the number of Radeon cores over Trinity) the power draw would likely go up, not down. Not the direction AMD wants to go if they are after the low-power, all-day-battery-life market.

3rd, ARM is supposed to be for their SeaMicro server fabric according to the news reports.

4th, AMD CPUs seem to average around 9 months to a year's delay for the last couple of generations. Given their target date of Q4 of next year, a 9-month delay would put it in the 2nd half of 2014. Given AMD's difficulties in finding a reliable foundry partner who can actually deliver yield ramps on schedule, I think my bet will turn out pretty close.

No offense, but you seem to take paper announcements from AMD as a done deal, when history proves that a reasonable person should take them with a shaker of salt 😛.
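
For reference, here is the arithmetic behind the 15% figure as a minimal Python sketch. It assumes performance scales as IPC times clock speed, which is a simplification and not something AMD has stated. The 5% and 10% inputs are the numbers quoted in this post, and `combined_gain` and `required_ipc_gain` are hypothetical helper names.

```python
def combined_gain(ipc_gain, clock_gain):
    """Overall speedup, assuming performance ~ IPC x clock (a simplification)."""
    return (1 + ipc_gain) * (1 + clock_gain) - 1


def required_ipc_gain(target_gain, clock_gain):
    """IPC gain needed to reach target_gain for a given clock gain (hypothetical helper)."""
    return (1 + target_gain) / (1 + clock_gain) - 1


# Figures quoted in the post: Piledriver at +5% IPC and +10% base clock
pd_total = combined_gain(0.05, 0.10)

# If clocks stay flat next generation, the 15% target falls entirely on IPC
ipc_needed_flat_clock = required_ipc_gain(0.15, 0.0)

print(f"Piledriver combined gain: {pd_total:.1%}")
print(f"IPC gain needed at flat clocks: {ipc_needed_flat_clock:.1%}")
```

Strictly, 5% and 10% compound to about 15.5% rather than adding to 15%, and with flat clocks the whole 15% would have to come from IPC.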
 


You can play with TDP by lowering the speed. So, 850 units in there can be clocked very low to have reasonable TDP numbers. Also, there's the process. What process will they use for it?
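
A rough Python sketch of that trade-off, assuming the textbook CMOS approximation that dynamic power scales with unit count x voltage squared x frequency. It ignores leakage and process differences, and the ratios below are made-up illustrations, not AMD figures; `relative_power` is a hypothetical helper.

```python
def relative_power(units_ratio, clock_ratio, voltage_ratio=1.0):
    """Dynamic power relative to a baseline, assuming P ~ units * V^2 * f.

    Textbook approximation only: leakage and process changes are ignored.
    """
    return units_ratio * clock_ratio * voltage_ratio ** 2


# Illustrative ratios (assumptions, not vendor data)
same_voltage = relative_power(units_ratio=2.0, clock_ratio=0.5)
lower_voltage = relative_power(units_ratio=2.0, clock_ratio=0.5, voltage_ratio=0.85)

print(f"2x units at half clock, same voltage: {same_voltage:.2f}x baseline power")
print(f"2x units at half clock, 85% voltage:  {lower_voltage:.2f}x baseline power")
```

Under this model, twice the units at half the clock draws the same dynamic power, and any voltage reduction the lower clock allows pushes it below baseline.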

And people going by Intel's statements on the GT3 parts are just as bad as people buying AMD's statements without question. Fans tend to overblow both parties' statements IMO, so we should take whatever AMD and Intel say with a grain of salt.

Cheers!
 



There is a "," in his statement, so the 30% refers to cache latency, not IPC.

+30% IPC would be a miracle, but +7% to +10% is achievable, and more if they fix all the problems they are talking about: cache latency, branch prediction and power consumption (+MHz). That's a lot of things to fix in only one gen.
 


come on dude, quit trying to make an ant hill out of thin air. first off, it's the 99th percentile, i.e. the slowest 1%. 98% is ignored, you're only looking at the 99th percentile.

it's ok to just say it. the longest frame time = the slowest fps.
 


AFAIK only Intel has mentioned the iGPU performance of Haswell, and they stated "2X" that of Ivy's HD4K at IDF. If AMD has made a similar statement about Kaveri, I haven't seen it. So there is some basis about speculating on Haswell's iGPU performance; and not so much on Kaveri..
 
+30% is very feasible. We are looking at a complete decoder change, cache changes, and branch prediction improvements. All of which are massive bottlenecks for Piledriver.

You guys who are pooh-poohing AMD are forgetting that AMD has a brand new architecture with lots and lots of room for improvement. Intel is basically milking a Pentium M as hard as they can. Pentium M was based off of Pentium 3 Tualatin, which came out in 2001.

AMD is improving a brand new architecture that is on its second revision, and Intel is working on an architecture that is already nearly 12 years old. It is absolutely reasonable to expect AMD to make much bigger gains in performance than Intel, even given Intel's massive R&D advantage.

Piledriver is not bad, but there are still lots of things wrong with it. Decoders, cache, branch prediction, no uOP cache, etc. I'm sure people more knowledgeable could list even more things.

Now, name what is wrong with Ivy Bridge and Sandy Bridge? The TIM on Ivy Bridge is bad so it makes more heat? That's about all I have for IB and SB having big bottlenecks. Further proof that Intel knows there isn't much room for improvement lies in the fact that they're not bothering with it anymore. Haswell's focus isn't x86 performance, it's on the GT iGPU and power consumption.

The smartest thing AMD can do in the future is offer ARM 64 bit CPUs for those who want microservers with low power and to offer vastly improved x86 Opterons for people who want a chip with better price to performance than Intel while not being a complete power hog, and let that trickle over to the desktop and workstation CPUs. AMD needs to get cozy with Adobe and Autodesk to get OpenCL acceleration working on GCN Firepros so they can start cramming x86 CPUs in workstations. APUs would be incredibly strong in that market, and a high end APU that can run OpenCL code would beat the crap out of Intel MIC and Nvidia CUDA on price to performance.

AMD has so many opportunities to do good things in the future, but it's up to management to see if they will actually do something about it or not. It's all up to Lisa Su and Rory Read for now to make the right choices.
 


Yep, you are right - good catch or good eyesight 😀..
 

Not very big, considering they are going down half a node and they really aren't adding many things that take die space. Depending on how much L3, if any, is in there, it should be smaller than Trinity.
 


For the last time, FRAME LATENCY HAS NOTHING TO DO WITH HOW MANY FPS THE GPU IS CAPABLE OF OUTPUTTING. Hence the long explanation showing this.

Secondly: the 99% is NOT the bottom 1%, it's the nominal 99%. You have it backwards, again.
 

you're killing me, Smalls

bf3-latency.gif


bf3-99th.gif


so you're saying the more you average the frame latency, the higher the numbers get?

anyone should be able to see that the nominal lag time in that chart is below 15ms, but ALL of the 99th percentile is above 15ms.

99th percentile IS the slowest frame times.
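
To make the terms in this argument concrete, here is a small Python sketch using a made-up frame-time trace (not the BF3 data from the charts). It uses the nearest-rank definition, under which the 99th-percentile frame time is the value that 99% of frames come in at or under. `percentile_nearest_rank` is a hypothetical helper, and review sites may interpolate differently.

```python
def percentile_nearest_rank(values, pct):
    """Smallest value such that at least pct percent of values are <= it."""
    ordered = sorted(values)
    # Integer ceiling of pct/100 * n, avoiding floating-point rounding
    rank = max(1, (pct * len(ordered) + 99) // 100)
    return ordered[rank - 1]


# Made-up trace: 95 smooth frames at 10 ms and 5 hitches at 40 ms
frame_times_ms = [10.0] * 95 + [40.0] * 5

mean_ms = sum(frame_times_ms) / len(frame_times_ms)
avg_fps = 1000.0 / mean_ms
p50_ms = percentile_nearest_rank(frame_times_ms, 50)
p99_ms = percentile_nearest_rank(frame_times_ms, 99)

print(f"mean frame time: {mean_ms:.1f} ms (average {avg_fps:.1f} FPS)")
print(f"median frame time: {p50_ms:.1f} ms")
print(f"99th percentile frame time: {p99_ms:.1f} ms")
```

In this trace the average works out to 11.5 ms and the median to 10 ms, while the 99th-percentile frame time is 40 ms, so the percentile figure is set by the slow tail rather than by the typical frame.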
 
http://wccftech.com/amds-kaveri-based-28nm-richland-apu-features-steamroller-cores-compatibility-fm2-socket/

Overall, Richland offers 15-25% increase in Clock-Per-Clock performance.

Now, that would be nice. I doubt they are going to improve the clock speeds much from where they are for next generation, so if they are going to make their 15% improvement generation-over-generation, it is going to have to be from IPC. AMD is talking a lot of changes for Steamy, but it may pay off for them.
 

physically the same size (CPU in dimensions / diameter) I would think.
 
Well, the die area will be the same, but if the transistors are smaller they will be able to fit more. BUT L3 and more than double the GCN cores (A10 has 384) at the same time... I don't see that.
 


They're not doubling the number of cores according to the article.

"Richland would consist of the GCN based 8000 series IGP featuring the same core count as Trinity of 384 but a new architecture"
 
^^+1

Has anyone seen any OC comparisons between 8350 and 8320s? The local MC has tons of the latter and I am wondering if there is reason to wait for the 8350s to come in. If the difference is 4.6 vs 4.5 on air I see no reason to wait.

Never mind, I found the answer here. There doesn't appear to be a great deal of difference.
 


Assuming a perfect shrink, going from 32nm to 28nm theoretically yields about a 23.4% decrease in transistor size, since (28/32)^2 = 0.766. I dunno how much of a Trinity die the iGPU occupies, but if we guess maybe half, then doubling the number of iGPU transistors in Kaveri would mean an areal increase of about 50% over Trinity's iGPU (2 x 0.766 = 1.53). Throw in some L3 and more transistors for the improvements and voila - double the die size.
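
The shrink arithmetic above can be checked with a short Python sketch. The inputs are the post's own guesses (a perfect 32nm to 28nm shrink, the iGPU taking half of the Trinity die, and double the iGPU transistors), plus the added assumption that the CPU half keeps the same transistor count. L3 and other additions are left out.

```python
def area_scale(old_nm, new_nm):
    """Area scaling factor for an ideal shrink: (new/old)^2."""
    return (new_nm / old_nm) ** 2


scale = area_scale(32, 28)       # ideal 32nm -> 28nm shrink
shrink_pct = (1 - scale) * 100   # percent reduction in transistor area

igpu_fraction = 0.5              # the post's guess: iGPU is half the Trinity die
igpu_transistor_ratio = 2.0      # the post's scenario: double the iGPU transistors

# New iGPU area relative to Trinity's iGPU area
igpu_area_ratio = igpu_transistor_ratio * scale

# Whole die relative to Trinity, assuming the CPU half is only shrunk
die_area_ratio = (1 - igpu_fraction) * scale + igpu_fraction * igpu_area_ratio

print(f"area scale: {scale:.3f} ({shrink_pct:.1f}% smaller per transistor)")
print(f"iGPU area vs Trinity iGPU: {igpu_area_ratio:.2f}x")
print(f"whole die vs Trinity: {die_area_ratio:.2f}x")
```

Under these assumptions the iGPU alone grows to about 1.53x its Trinity area and the whole die to about 1.15x, before any L3 or other additions are counted.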
 
I misread some article and thought they were going to cram 800 GCN cores together. That's why it didn't make sense.



Another thing that I can't understand is how Intel is going to put 64 MB of cache on Haswell, and that's only for the iGPU.

http://vr-zone.com/articles/amd-afds-seattle-trinity-follow-on-kaveri-to-have-true-shared-memory/16258.html


If this is the die of an i7 3770K with "only" 8 MB of shared L3
ivb-die,0101-334808-0-2-3-1-jpg-.html


Where do they plan on putting 64 MB using the same 22nm process??? Or have they already developed 3D chips, and now is the time to build them??
 