AMD Piledriver rumours ... and expert conjecture

Page 184
Status
Not open for further replies.
We have had several requests for a sticky on AMD's yet to be released Piledriver architecture ... so here it is.

I want to make a few things clear though.

Post a question relevant to the topic, or information about the topic, or it will be deleted.

Post any negative personal comments about another user ... and they will be deleted.

Post flame baiting comments about the blue, red and green team and they will be deleted.

Enjoy ...
 
90% of the reviews I saw show the 1100T ahead of the 980; only one or a couple of games at most show better fps with the 980. As for PD, it really seems like a good improvement. I hope the 8350 is not priced beyond $250.
Watch them price it at $300+... The 8150 was worse off, but they still priced it at $250 or so.
 
Sweet.

I bought FarCry 2 for like $5 ... man what a rubbish game.

I seem to spend so much time sleeping in a shack and my AK-47 is about as reliable as an Ivy-Bridge overclock.

I threw it away.

Wow $5, I wouldn't have paid that much for FC2 :lol: The only decent part about the whole game is the editor but that quickly wears off.
That game is worth like $0.02 at best.
 
Is that representative across the board, or the two best results?

I'd say it's a best case scenario, since that particular bench is usually used to show "single core performance" and "how nice is the turbo boost in CPUs", haha.

Across the board, we'll have to balance loads and stuff, so that 15% will not be 15% when all the modules/cores are taxed equally. Maybe something along the lines of 7-10% across is more feasible.

Still an improvement on BD indeed. I'm still wondering how far this thing will OC with no GPU on it... If what AMD told Chris is partially true (4.8GHz on air), then we should see something similar with PD or a tad better.

Cheers!

EDIT: Typos.
 
I don't know. Looking at those benchies I'm really not happy. 3.8GHz vs 2.9? You're talking about a ~31% advantage in clock speed alone. And considering it was clocked so much faster, it wasn't faster by much. Temps were also pointed out, but isn't Llano 40nm? I am really not happy with where AMD is taking "my" company.
 
Llano was 32nm. Same process as Trinity.

If Trinity was 3THz and performed 'only' 300% faster than the 3770K in everything while using the same amount of power, would you say it sucked?
 
Why is anyone surprised that the PD cores require higher clocks to achieve similar performance to K10? We already knew that BD was designed for high clocks. The fact that they are able to fit the higher clocks into the same TDP on the same process is impressive to me. Even if it doesn't increase performance as high as clocking Stars to that level would theoretically deliver.

As far as Chris's comparisons to BD go, he locked both CPUs to the same clocks, eliminating any Turbo concerns, and disabled two modules from the BD to achieve thread parity. This should give the closest comparison we will have until we see actual 8150 to 8350 results and he showed a 15% improvement which I for one am happy with.

 
Llano was 32nm. Same process as Trinity.

If Trinity was 3THz and performed 'only' 300% faster than the 3770K in everything while using the same amount of power, would you say it sucked?

Precisely.

This is where the P4 comparison falls short today. With Trinity, AMD is actually keeping things in control regarding temps and consumption. Something we all know P4 wasn't good at.

Trinity is showing that PD won't be the game changer we all want, but at least it's not another flop. It might actually be considered an upgrade over K10 if they keep up the good showing.

In a simple car analogy, a turbo'ed I4 engine that manages to get the car to 200mph with the exact same times as a supercharged V8 is still good in anyone's book. There will be other disparities between the 2 products, but that's where differentiation comes into play. The I4 could have better mileage, but the V8 could have better end torque for pulling big loads and stuff. That's why if you take the Hertz as the full metric of performance, you'll lose sight of the overall picture. Unfortunately for Intel, with the P4 they couldn't make the winner they wanted, but they did get good lessons from NetBurst.

Cheers!
 
http://www.tomshardware.com/reviews/a10-5800k-a8-5600k-a6-5400k,3224-21.html

A majority of our benchmarks favor Trinity over Llano thanks to IPC improvements and significantly higher clock rates. Piledriver still gives up significant instruction per cycle throughput compared to the older Stars design, but is better able to compensate than Bulldozer. The result, then, is modest x86 performance. It’s better than Bulldozer, but only a slight step up from what you get from Llano. And that’s if we ignore the competition entirely. I didn’t have an appropriately-priced Intel chip to test, but just received a Core i3-2100 from Newegg that comes close to matching an A8-3870K’s price tag. Tests commence on that tonight.

PD basically looks to be a clock bump with minor IPC improvements. Doesn't look like it will close the gap on SB. And it looks like Cache latency and memory access times actually degraded compared to Bulldozer...
 
I'd say it's a best case scenario, since that particular bench is usually used to show "single core performance" and "how nice is the turbo boost in CPUs", haha.

Across the board, we'll have to balance loads and stuff, so that 15% will not be 15% when all the modules/cores are taxed equally. Maybe something along the lines of 7-10% across is more feasible.

Still an improvement on BD indeed. I'm still wondering how far this thing will OC with no GPU on it... If what AMD told Chris is partially true (4.8GHz on air), then we should see something similar with PD or a tad better.

Cheers!

EDIT: Typos.


For starters they disabled turbo and ran trinity and bulldozer at the same clock and the results showed a 12.5-19% increase in IPC.
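
If you want to sanity-check that sort of same-clock number yourself, here's a rough Python sketch. The function name and the scores are made up for illustration (they are not figures from the review); the only assumption is that both chips ran at the same clock and thread count, so the score ratio approximates the per-clock difference.

```python
def ipc_gain(score_new, score_old, higher_is_better=True):
    """Relative per-clock gain between two chips run at the same clock and thread count."""
    if higher_is_better:
        return score_new / score_old - 1.0
    # For timed tests, a lower number is better, so invert the ratio.
    return score_old / score_new - 1.0


# Hypothetical results, NOT taken from the review.
print(f"Score-based test: {ipc_gain(115.0, 100.0):.1%}")
print(f"Timed test:       {ipc_gain(80.0, 92.0, higher_is_better=False):.1%}")
```

Keep in mind this only holds when clocks and thread counts really are matched; with Turbo enabled the ratio mixes clock and IPC together.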


Why is anyone surprised that the PD cores require higher clocks to achieve similar performance to K10? We already knew that BD was designed for high clocks. The fact that they are able to fit the higher clocks into the same TDP on the same process is impressive to me. Even if it doesn't increase performance as high as clocking Stars to that level would theoretically deliver.

As far as Chris's comparisons to BD go, he locked both CPUs to the same clocks, eliminating any Turbo concerns, and disabled two modules from the BD to achieve thread parity. This should give the closest comparison we will have until we see actual 8150 to 8350 results and he showed a 15% improvement which I for one am happy with.

I completely agree. I mean, if AMD can increase performance on the same die with a lower TDP, I'd say +1, and if they need a higher clock, who cares? People need to wake up and understand that AMD is worth WAY less than Intel, and they actually made a bigger jump this generation than Intel did.

To sit there and expect miracles is just sad. I got what I was expecting from Piledriver and maybe a little more. x86 performance is getting less important every day, and after looking at AMD's progress with OpenCL and software developers, I'm pretty sure AMD is on the right track, as long as they can release their processors ON TIME and price their parts accordingly!

Piledriver will finally beat Phenom and most likely be worth upgrading to (if you're an AMD fan), and I expect my 8-core PD in the 4th quarter of this year.

It's also nice to see AMD getting involved with software devs. All I can say is it's about time!

PD basically looks to be a clock bump with minor IPC improvements. Doesn't look like it will close the gap on SB. And it looks like Cache latency and memory access times actually degraded compared to Bulldozer...

AMD only said there would be a 10-15% increase in IPC/TDP, and it is pretty impressive that they actually did that; I was expecting a modest 7% increase in IPC. Not only that, Piledriver is shaping up to be a great overclocker. If priced accordingly, Piledriver will be a decent product. As for the memory, I'd say that's because the graphics part of Trinity is eating bandwidth the processor would otherwise get, and after looking at Bulldozer we know that it loves fast RAM.
 
Considering that adding an L3 should improve performance by up to an additional 5%, I would say Piledriver is much faster than Bulldozer at the same clock.
[Image: 026_itunes.png]

L3 would be even more useful at higher clock speeds.

I predict at minimum a 15% IPC improvement over Bulldozer. If they can clock the 8-core Piledriver at 4.2GHz, the performance increase would be at least 30% over Bulldozer pretty much across the board.
 
Based on the results, what do you think Trinity will offer? Is it a great processor to consider?

Are you referring to Trinity or Piledriver?

If Piledriver, it is still early days to be trying to make definitive predictions, and we don't yet know what the final clock speeds will be.

If Trinity, then Tom's review speaks for itself.
 
Considering that adding an L3 should improve performance by up to an additional 5%, I would say Piledriver is much faster than Bulldozer at the same clock.
http://media.bestofmicro.com/H/H/223541/original/026_itunes.png
L3 would be even more useful at higher clock speeds.

I predict at minimum a 15% IPC improvement over Bulldozer. If they can clock the 8-core Piledriver at 4.2GHz, the performance increase would be at least 30% over Bulldozer pretty much across the board.

Yeah, but remember that clock speed does not scale perfectly with performance. Piledriver will most likely be clocked 15% higher, but this does not mean it will be 15% faster; most likely it would only be 7-10% faster on average from the clock speed alone (improved turbo maybe 10% compared to Bulldozer). Looking at the IPC increase, I would say the performance increase on average based on IPC will be 12-15%, so I'll have to raise my performance expectations from 15-20% to 20-25% on average. I still can't give a good guess on efficiency, since we all know L3 cache adds a decent amount of TDP, not to mention we know AMD is going to clock PD as high as they can for performance, so I expect the power consumption improvement to be minor, like 10-15%, maybe even 5-10%.
 
Actually, the Bulldozer architecture scales perfectly with clock speed. Based on the OC results, this goes to 5GHz or more.

Turbo should be improved in Piledriver as well, so I would expect Piledriver turbo to bring more performance than Bulldozer turbo. Hard to say how much, but it probably won't be much.

15% IPC and 15% clock = about 32% improvement by itself. An additional 5% from the L3 would put Piledriver even higher, so I'd say that 30% is actually a pretty reasonable estimate.

L3 won't add much to the TDP, I would imagine. Judging by Trinity, the CPU portion of the 5800K would be less than half of the 100W TDP, so 8 cores at the same clock speed could probably fit into 80W; add L3 and a clock bump, and 125W should be pretty standard. Efficiency results will probably be higher at lower clock speeds, though. It's hard to really say about power consumption, but judging by Trinity compared to Llano, I would think Piledriver is quite a bit more efficient than Bulldozer.
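
For anyone who wants to check the compounding, here's a minimal Python sketch. It assumes the gains are independent and simply multiply, which is the best case; the helper name is made up, and the 15% / 15% / 5% inputs are the guesses from this post, not measurements.

```python
from math import prod


def combined_gain(*gains):
    """Compound independent fractional gains, where 0.15 means +15%."""
    return prod(1.0 + g for g in gains) - 1.0


# Guesses from the discussion above, not measured values.
ipc, clock, l3 = 0.15, 0.15, 0.05

print(f"IPC + clock:      {combined_gain(ipc, clock):.1%}")
print(f"IPC + clock + L3: {combined_gain(ipc, clock, l3):.1%}")
```

By that arithmetic the first case works out to roughly 32% and the second to about 39%, but only if performance really scales one-to-one with clock.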
 
And it turns out PD did exactly what they said it would, 10~15% increase in performance at the same clock while reducing power usage.

And as predicted the haters and bandwagon jumpers have begun their typical bashing, finding ways to twist words and numbers to try to make PD look bad.

Anywho, looks like a solid revision of the uArch. Cache latencies are still high, and they're still sharing L2, neither of which I like, but they seem to have made a better predictor which masks those latencies. Most of the benefit will come from the increased TDP headroom allowing high clocks, something we haven't really seen fleshed out yet. Interested in seeing what they can do in highly threaded workloads.
 
Actually, the Bulldozer architecture scales perfectly with clock speed. Based on the OC results, this goes to 5GHz or more.

Turbo should be improved in Piledriver as well, so I would expect Piledriver turbo to bring more performance than Bulldozer turbo. Hard to say how much, but it probably won't be much.

15% IPC and 15% clock = about 32% improvement by itself. An additional 5% from the L3 would put Piledriver even higher, so I'd say that 30% is actually a pretty reasonable estimate.

L3 won't add much to the TDP, I would imagine. Judging by Trinity, the CPU portion of the 5800K would be less than half of the 100W TDP, so 8 cores at the same clock speed could probably fit into 80W; add L3 and a clock bump, and 125W should be pretty standard. Efficiency results will probably be higher at lower clock speeds, though. It's hard to really say about power consumption, but judging by Trinity compared to Llano, I would think Piledriver is quite a bit more efficient than Bulldozer.

A few things: I doubt BD scales perfectly. I have never seen a CPU scale perfectly with clock speed, performance-wise or in power consumption. It's near impossible, as most software is still not very optimized.

30% is a very high estimate. There could be many variables that get it there, such as new instructions, but I will wait and see a PD vs BD comparison. Trinity will give a nice look, but won't guarantee a thing at all.

As for the L3 not adding to TDP, it shouldn't, but it will add to power draw, at least a bit. And if I remember right, the new clock mesh is only efficient up to a certain clock speed.

I think it's too hard to compare BD to Trinity, as they are very different in what they are targeted towards.

Still, I think the review was a bit flat. They could have done a lot more to get real performance comparisons, as the people with Llano do have to upgrade their motherboards since the FM2 boards just came out. It's better to have an overall comparison so they can decide if it's worth the upgrade or if they should go a different route.

I do think it's a nice step up, but then again this might not even be final silicon, and that can change things, since it's not out for a while (paper launch).
 
I agree with most of that, but I think most software scales pretty well with frequency, maybe not perfectly but well enough to be viewed as linear. I'm still not sure about the clock mesh; it seems to be good for stock but might not work well when pushed.

I remember reading somewhere that the Piledriver inside Vishera will be more advanced than Trinity's as well. Hopefully that pans out and the 30% won't be too optimistic.
 
Okay, I realized my mistake. When I was reading the review late last night, I saw that Trinity beat Llano by fairly small margins in most of the tests done. Productivity, content creation, and file compression were all close. I figured that PD would at least match PII IPC, and seeing it fall close to Llano with such higher clock speeds didn't look too great.

Then I remembered that small little detail about the BD arch, CMT. It all makes sense now!

Taking a second look, the PD cores are definitely a step up in perf/watt even over Husky cores, and way better than BD. Add some L3 cache (~3-5% performance boost) and you've got quite a good step up for anyone on BD or PII. Hope there is still some OC potential left for PD.
 
Wonder which will be better for price/performance in folding, trinity, piledriver, or haswell.

Power usage too...

Shouldn't a video card be the best Folding option?

In any case, it seems that Haswell is going to be very good at HSA, but it's way too soon to make guesstimations on where it could even land. PD seems like a safe bet for folding for a "fair" price in the short/mid term.

Cheers!
 
Yeah, but I'm wondering about Trinity. Since it's an APU, should that mean I can fold with both the IGP and the CPU cores!? But I want to be able to make a Folding@home farm for cheap, just a ton of CPUs folding :pt1cable:
 
I agree with most of that, but I think most software scales pretty well with frequency, maybe not perfectly but well enough to be viewed as linear. I'm still not sure about the clock mesh; it seems to be good for stock but might not work well when pushed.

I remember reading somewhere that the Piledriver inside Vishera will be more advanced than Trinity's as well. Hopefully that pans out and the 30% won't be too optimistic.


The bit JS said about software not scaling with CPU frequency is nonsense; we're no longer in the days of DOS, where software handled I/O interrupts and required a raw boot disk to work. Software doesn't always scale with core count; this is where software limits come into play. Computers are very complex devices, and trying to summarize their workings into 10-second statements that a goldfish can understand is impossible.

Take two CPUs of the same uArch and core count, with CPU A having 50% more cycles than CPU B: CPU A is then by definition 50% faster than CPU B and has 50% more processing power. Will your application get a 50% increase in "performance"? That highly depends on which measuring stick you're using. If it's the number of frames copied to a frame buffer per second, then no, it won't be linear; too many non-CPU-related functions are involved. If it's something like calculating the number of molecules in a neutron star a few galaxies over, then yes, the scaling will be linear or nearly linear (accounting for memory latency / bandwidth).
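
To put rough numbers on that, here's a small Python sketch using an Amdahl's-law-style model. The model is my own simplification for illustration, not something from the review: it assumes run time splits cleanly into a part that scales with CPU clock and a part that doesn't (GPU, disk, memory waits). The function name and the CPU-bound shares are made up.

```python
def app_speedup(clock_ratio, cpu_bound_fraction):
    """Overall speedup when only the CPU-bound share of run time scales with clock."""
    if not 0.0 <= cpu_bound_fraction <= 1.0:
        raise ValueError("cpu_bound_fraction must be between 0 and 1")
    scaled = cpu_bound_fraction / clock_ratio  # share that shrinks as clock rises
    fixed = 1.0 - cpu_bound_fraction           # share the CPU clock cannot touch
    return 1.0 / (scaled + fixed)


# CPU A has 50% more cycles than CPU B; the shares below are hypothetical.
for share in (1.0, 0.9, 0.5):
    print(f"CPU-bound share {share:.0%}: {app_speedup(1.5, share):.2f}x")
```

A fully CPU-bound job gets the whole 1.5x, while a workload that spends half its time waiting on something else lands well short of it, which is the frame-buffer case described above.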

This article was a quick show and tell on some hardware they just got into the lab. At the very end it stated they were doing further tests and would get back to us when they were finished. As such, this is a preview meant to demonstrate that PD is indeed significantly faster than BD. Haters will use the words minor / slight / underwhelming or try to shift focus away from Power vs Speed vs Cost (instructions per watt per USD), and to be honest it's rather disturbing. Trinity is not full-blown Piledriver, at least not according to AMD's information. It's supposed to be enhanced BD, having many of the PD modifications but not all.

I want to see what a full 6~8 "core" PD could do performance-wise, especially if there is the capability to down-clock cores in software. I want to see Tom's redo some of those single-thread benchmarks using AMD's PSCheck to force two to three cores to 800MHz and force one core to 4.5~5GHz. Doing that with my Llano yielded impressive results.
 