AMD Piledriver rumours ... and expert conjecture

We have had several requests for a sticky on AMD's yet-to-be-released Piledriver architecture ... so here it is.

I want to make a few things clear though.

Keep posts relevant to the topic - questions or information - or they will be deleted.

Post any negative personal comments about another user ... and they will be deleted.

Post flame-baiting comments about the blue, red, and green teams and they will be deleted.

Enjoy ...
 


I DID say for a specific benchmark. Different benchmarks would yield different IPC measurements. Taken across a majority of benchmarks, you can generally "guesstimate" an IPC difference between two different CPUs.
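To make that concrete, here's a quick sketch of the kind of guesstimate I mean (every score and clock below is a made-up placeholder, not a real measurement): normalize each benchmark's score by clock speed, then take the geometric mean of the per-benchmark ratios so no single benchmark dominates.

```cpp
// Rough IPC-difference "guesstimate" across several benchmarks.
// Scores and clocks below are invented placeholders, not real data.
#include <cmath>
#include <cstdio>
#include <vector>

int main() {
    // Per-benchmark single-threaded scores for two hypothetical CPUs.
    struct Result { const char* name; double score_a; double score_b; };
    std::vector<Result> results = {
        {"bench1", 100.0, 123.0},
        {"bench2",  80.0, 104.0},
        {"bench3",  95.0, 110.0},
    };
    const double clock_a = 3.6e9, clock_b = 3.4e9;  // Hz, placeholders

    // Per-clock performance ratio for each benchmark, combined with a
    // geometric mean so no single benchmark dominates the estimate.
    double log_sum = 0.0;
    for (const auto& r : results) {
        double per_clock_a = r.score_a / clock_a;
        double per_clock_b = r.score_b / clock_b;
        log_sum += std::log(per_clock_b / per_clock_a);
    }
    double ipc_ratio = std::exp(log_sum / results.size());
    std::printf("Estimated IPC advantage of CPU B: %+.1f%%\n",
                (ipc_ratio - 1.0) * 100.0);
    return 0;
}
```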
 

That's why AMD wanted the initial reviews to cover the APU as a whole, and NOT CPU-only benchmarks: on the CPU side, Trinity still can't touch Sandy Bridge i3s. So with anything much above a GT 640, the i3 is a better buy.

Trinity makes sense for mom-and-pop PCs and mobile devices, but that's it. APUs aren't ready for decent gaming ... yet.
 


Nice find, even if the author's butchery of the English language is amusing. I think that is the first time I have seen the relationship between cache latency/write speeds and core performance explained in a way that made me understand why it is such a major problem.

It's good to see that he actually suggests improvements instead of just bashing the chip.
 



Yeah, when it comes to Bulldozer and its design, it's hard not to bash it a little. I can think of at least 20 things that were horrible ideas, but then again I can think of a couple of things that were smart. I still personally think the design was flawed from the beginning, and I guess some engineers at AMD did as well, which is why they shelved Bulldozer for a while before bringing it back.

But maybe in a couple of years they might prove me wrong by delivering comparable performance per watt in CPU tasks. If you think about it, adding two decoders instead of one pretty much admits that sharing resources is a bad idea, and sharing the L2 cache is a bad idea as well: it hurts latency, and the cores are always fighting over the L2. I really don't know who came up with Bulldozer's cache system, but they must have been on some epic drugs.
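If anyone wants to poke at that L2 contention themselves, here's a crude probe I'd try (a sketch only: the buffer size, the stride, and the attribution to shared L2 are all assumptions, and the threads aren't pinned to one module, so treat any result with suspicion):

```cpp
// Crude probe for cache contention: walk a buffer that roughly fits in
// half of a Bulldozer module's 2 MiB L2, alone and then with a second
// thread doing the same. On a shared-L2 design the two-thread case may
// slow down if both threads land on the same module (no pinning here).
#include <chrono>
#include <cstdio>
#include <functional>
#include <thread>
#include <vector>

constexpr size_t kBufInts = 1 << 18;  // ~1 MiB of ints
constexpr int kPasses = 2000;

void walk(std::vector<int>& buf, long long* sink) {
    long long sum = 0;
    for (int p = 0; p < kPasses; ++p)
        for (size_t i = 0; i < buf.size(); i += 16)  // one int per 64 B line
            sum += buf[i];
    *sink = sum;  // keep the loop from being optimized away
}

double timed_run(int threads) {
    std::vector<std::vector<int>> bufs(threads, std::vector<int>(kBufInts, 1));
    std::vector<long long> sinks(threads);
    auto t0 = std::chrono::steady_clock::now();
    std::vector<std::thread> pool;
    for (int t = 0; t < threads; ++t)
        pool.emplace_back(walk, std::ref(bufs[t]), &sinks[t]);
    for (auto& th : pool) th.join();
    return std::chrono::duration<double>(
               std::chrono::steady_clock::now() - t0).count();
}

int main() {
    // Each thread does the same fixed work, so equal wall times would
    // suggest no contention; a slower two-thread run suggests fighting
    // over a shared resource (cache, decoders, or something else).
    std::printf("1 thread : %.3f s\n", timed_run(1));
    std::printf("2 threads: %.3f s\n", timed_run(2));
    return 0;
}
```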

Unlike some, I do think their idea of sharing the FPU is good: most programs barely use it, and sharing saves on both power consumption and die size.
 


•Graphics card: NVIDIA GeForce GTX 680 (2 GB/256-bit GDDR5, 1006/6008 MHz).

Sorry, but who in their right mind is going to pair an APU with a top-end Nvidia card?

Xbit's article is no more than another "if we do this for Intel, it's better" article.

$500 GPU + $100 CPU = epic fail. Notice that even the slowest i5s kill even the i3s (which are supposedly a great choice for gaming).

[attached chart: farcry.png - Far Cry benchmark results]
 



Nice article and nice conclusions; instead of bashing, it proposes solutions.


I think the BD design is actually quite good, at least the idea behind it. The problems are the lack of money to put into R&D and it being a first-gen part, and GloFo is surely another reason why BD didn't shine. But with some minor tweaks they can improve performance a lot if they fix the cache latencies. Sharing is good for increasing performance without increasing die size, but sharing too many resources ended up hurting performance - why only one decoder per module? I would also increase the bit length.

Also, on the FPU part: if AMD is going ahead with Fusion as planned with Excavator, they shouldn't need an FPU at all - just offload everything FPU-related to the iGPU.
 


Actually, the entire current Intel lineup (and Sandy Bridge, for that matter) is an APU.



One solid point for my "dual cores are not a good long-term choice" mantra, haha.

Oh, and I just received my new lappy: a Sager/Clevo with an Ivy Bridge i7 and a GTX 675M. I think it's faster than my Phenom II, hahaha. Gonna do some testing.

Cheers!
 



"GlobalFoundries's silicon low physical performance lead to excessive heat generation"


People jump on GloFo, yet they delivered 32nm CPUs running at higher clocks than Intel's chips. The design is more important than the process tech. As AMD showed with Trinity, the clock circuitry can be improved significantly to cut power consumption; they just didn't plan far enough ahead for it.

Transistor-count-wise the module approach has promise, and the Steamroller improvements should help a lot.

Power-wise I think they're shooting themselves in the foot. Intel can clock each core up or down as necessary, while the Bulldozer modules are tied together. For servers with medium/heavy workloads that is fine, but in the desktop and mobile arenas it will keep hurting them.
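On Linux you can actually watch this: each core exposes its current frequency through the standard cpufreq sysfs node, so you can load one core and see which ones ramp up (a quick sketch; whether the nodes exist depends on the cpufreq driver, and the granularity you'll observe - per core, per module, or per package - depends on the hardware):

```cpp
// Print each core's current frequency from Linux's cpufreq sysfs nodes.
// The granularity at which cores can sit at different speeds varies by
// design; the post above argues Bulldozer ties cores together per module.
#include <cstdio>
#include <fstream>
#include <string>

int main() {
    for (int cpu = 0; ; ++cpu) {
        std::string path = "/sys/devices/system/cpu/cpu" +
                           std::to_string(cpu) + "/cpufreq/scaling_cur_freq";
        std::ifstream f(path);
        if (!f) break;  // no more CPUs (or no cpufreq driver at all)
        long khz = 0;
        f >> khz;       // the node reports kHz
        std::printf("cpu%-2d : %.2f GHz\n", cpu, khz / 1e6);
    }
    return 0;
}
```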

 



Well yeah! My friend has an i7 Sandy Bridge laptop clocked at only 2.3 GHz, and in single-threaded benchmarks it beats mine by 5-10% - with my 1100T clocked at 3.9 GHz!!!

Mine is, however, faster in multithreaded apps by around 15%.
 

GloFo's process also improved with Trinity. It was sad that the 32nm Llano chips couldn't clock as high as the Phenoms.
 



Not only did GF keep Bulldozer from clocking at its intended speed, GF was also the number one reason Bulldozer was delayed for so long.
 

heh, this is similar to the reaction after toms' sub-$200 gaming cpu roundup article was published. they conveniently forgot why a powerful gfx card (radeon hd 7970) was used in a cpu comparison article - to remove/ease the gpu bottleneck, so that the cpus/apus can perform without the gpu limiting them. they went on to make excuses like 'unrealistic configuration', 'paid by intel', 'real gamers and enthusiasts don't care...' and so on. people who have deep knowledge of how chips work and are 'familiar with the enthusiast community' should know this basic stuff at least. XD
still, that kind of pairing isn't unheard of. i have read about plenty of llano users wanting to put 7970s and 7950s in their systems. some actually do run those cards. lots of people run an hd 7970/7950 or gtx 680/670 in their current phenom rigs.

i guess a radeon hd 7950 should be a bit more 'realistic', yes?
http://techreport.com/review/23662/amd-a10-5800k-and-a8-5600k-trinity-apus-reviewed/3
to repeat, the purpose of using a powerful card in a cpu review is to ease the gpu bottleneck and thus let the cpus/apus flex their muscles as much as possible. in desktops, it is quite possible to build such a config. most people who own any quad-core cpu will want to upgrade their gfx card first. that includes older core2quad users as well.
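the arithmetic behind that methodology fits in a few lines: frame rate is capped by whichever of the cpu and gpu takes longer per frame, so only a fast gpu lets cpu differences show up (all the millisecond numbers below are invented for illustration):

```cpp
// Why CPU reviews pair cheap CPUs with fast GPUs: the frame rate is set
// by the slower of the two per-frame costs. All numbers are illustrative.
#include <algorithm>
#include <cstdio>

double fps(double cpu_ms, double gpu_ms) {
    return 1000.0 / std::max(cpu_ms, gpu_ms);
}

int main() {
    const double fast_cpu_ms = 8.0, slow_cpu_ms = 14.0;     // per-frame CPU cost
    const double midrange_gpu_ms = 20.0, highend_gpu_ms = 7.0;

    // With a mid-range GPU, both CPUs hide behind the GPU bottleneck.
    std::printf("mid-range GPU : fast CPU %.0f fps, slow CPU %.0f fps\n",
                fps(fast_cpu_ms, midrange_gpu_ms),
                fps(slow_cpu_ms, midrange_gpu_ms));
    // With a high-end GPU, the CPU difference becomes visible.
    std::printf("high-end GPU  : fast CPU %.0f fps, slow CPU %.0f fps\n",
                fps(fast_cpu_ms, highend_gpu_ms),
                fps(slow_cpu_ms, highend_gpu_ms));
    return 0;
}
```

with the mid-range gpu both cpus land on the same 50 fps; with the high-end one the gap (125 vs 71 fps) finally shows.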
i should add that techreport has fallen out of amd fanboys' favor ever since they published their first gaming cpu roundup and then an editorial on how amd approached them to make their trinity review a two-tier affair. :)
 



By all means argue your case, but let's all keep it civil and friendly ... that's all the mod team asks of the users.




 
I've been leaning towards Piledriver because of the multi-thread performance. Plus, I want something cheap I can beat on with overclocking.

But where do you guys think we're going to see the performance gains?

Will Piledriver catch up in gaming? Will it pull further ahead in multi-thread and not move in single? Can we extrapolate from Trinity's strongest and weakest points, or is that moot because of a lack of L3?
 

I think you completely missed the point, but anyway: what that shows is that Intel CPUs get over 85% of their 3DMark score from CPU physics and around 10-15% from the iGPU, while AMD is almost 50-50 - and bear in mind that's with a much weaker x86 core and a rather low-end GPU by AMD's standards. But fear not: when you see what will be on Steamroller APUs, you will see HD resolutions at mainstream settings on a minor budget, and then there are Excavator APUs to follow. By Q3 next year you will start to hear about the iGPU solution on that; I think by that stage Intel will stop making iGPUs 😀

In case I didn't say so before: I have spoken to AMD liaisons, and there is a new roadmap after Excavator. So the claim that AMD is going to stop making CPUs is a complete fallacy, cooked up by half-baked sources like OBR and his Coolaler cronies, or by anyone who simply refuses to accept that what AMD is trying to achieve matters for computing as a whole.

I just grow tired of the constant fallback to x86 performance; I thought we were past that a while ago. Heck, AMD have even said it directly from the top, in front of the media, that they will not be focusing on pure x86 performance alone.

But since we are on APUs: Trinity beat the Stars-based Llanos in single-threaded Cinebench scores by some margin. It lost out a bit in multithreaded performance, as one might expect - Llano's individual cores and separate FPUs help there - but the gap was minimal, and in every other regard Trinity hands-down beat Llano. Comparing apples to apples, that's impressive. It should also be accepted that APUs and FX parts will not perform alike, so it's pointless to take a Trinity result and claim Vishera will be exactly the same. Back to APUs: their fundamental strength is offloading FPU-intensive calculation to the GPU component, and in 3D rendering and image creation the results are mind-blowing.
 

pd is likely to carry on performing well in highly threaded applications that can properly utilize the higher-end pd's 8 integer units.
performance gains (compared to bulldozer) might be apparent in 2-pass x264 video encoding, encryption, archiving/extracting, maybe some 3d rendering apps etc.
gaming - catch up to what? that question is vague, and the answer 'not really' can easily be misinterpreted by biased people. it will definitely outperform core2duo cpus... phenom ii... maybe not so much.
it will improve (on bulldozer) in both single- and multi-core performance. toms' showed this already in the trinity review.
 


From what I understood from the leaked AMD roadmaps, the plan for Excavator is that any FPU calculation in the code is automagically transferred to the iGPU. That is the main idea of HSA: the iGPU and the CPU are treated as a single compute unit, transparent to the coder. The coder just writes the same code he always writes, but during compilation/execution the processor 'guesses' what the instruction is doing and redirects it to the appropriate 'section' of the chip.
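Nobody outside AMD has tooling to show that yet, so here's just a conceptual sketch of the dispatch idea (every name in it is made up for illustration; this is not a real AMD or HSA API): the caller writes one ordinary call, and a stand-in "runtime" guesses where the work should run.

```cpp
// Conceptual sketch of HSA-style transparent dispatch. All names are
// invented for illustration; this is not a real AMD API. The caller
// writes ordinary FP code; a tiny "runtime" decides where it executes.
#include <cstdio>
#include <vector>

namespace fake_runtime {
    // Stand-in for a GPU kernel launch. A real HSA toolchain would emit
    // an actual GPU kernel for this loop; here we just tag the path.
    void saxpy_gpu(float a, std::vector<float>& y, const std::vector<float>& x) {
        std::puts("[dispatched to iGPU path]");
        for (size_t i = 0; i < y.size(); ++i) y[i] += a * x[i];
    }
    void saxpy_cpu(float a, std::vector<float>& y, const std::vector<float>& x) {
        std::puts("[dispatched to CPU path]");
        for (size_t i = 0; i < y.size(); ++i) y[i] += a * x[i];
    }
    // The "guess": bulk FP work goes to the iGPU, small work stays on the CPU.
    void saxpy(float a, std::vector<float>& y, const std::vector<float>& x) {
        if (y.size() >= 4096) saxpy_gpu(a, y, x);
        else                  saxpy_cpu(a, y, x);
    }
}

int main() {
    std::vector<float> x(100000, 1.0f), y(100000, 2.0f);
    // The coder writes one call; the runtime picks the execution unit.
    fake_runtime::saxpy(3.0f, y, x);
    std::printf("y[0] = %.1f\n", y[0]);  // 2 + 3*1 = 5
    return 0;
}
```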
 


Well, AMD have experience with CPU/GPU integration in FirePro environments, so I am sure it's more than guesswork - but hopefully time will lend itself to greater knowledge on this.
 


Which, I again stress, is VERY dangerous. Simple example: we (finally) get dynamic physics engines that can handle multiple-object collisions. The API uses the GPU, since this type of physics engine would scale to a reasonable degree (think PhysX).

Now AMD steps in, and on top of rendering and physics, every FP computation is also offloaded to the iGPU.

Moving the bottleneck from one component to another is not a good long-term solution.
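To put rough numbers on that worry (all invented for illustration): if the iGPU is already spending its frame budget on rendering and physics, piling the offloaded FP work on top can cost more than the CPU time it frees.

```cpp
// Illustrative frame-budget arithmetic for the "bottleneck moves to the
// iGPU" concern above. All millisecond costs are invented for the example.
#include <algorithm>
#include <cstdio>

int main() {
    const double cpu_ms = 16.0;                  // CPU-bound before offload
    const double render_ms = 10.0, physics_ms = 4.0;
    const double offloaded_fp_ms = 5.0;          // FP work pushed to the iGPU
    const double cpu_saving_ms = 6.0;            // what the CPU gains back

    const double gpu_before = render_ms + physics_ms;        // 14 ms
    const double gpu_after  = gpu_before + offloaded_fp_ms;  // 19 ms
    const double cpu_after  = cpu_ms - cpu_saving_ms;        // 10 ms

    // Frame rate is limited by the slower side in each case.
    std::printf("before offload: %.0f fps (CPU is the bottleneck)\n",
                1000.0 / std::max(cpu_ms, gpu_before));
    std::printf("after offload : %.0f fps (bottleneck moved to the GPU)\n",
                1000.0 / std::max(cpu_after, gpu_after));
    return 0;
}
```

Even though the CPU got faster, the frame rate drops from 63 to 53 fps in this toy model, because the slowest component sets the pace.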
 