You forgot one important thing here.
The way developers utilize CPUs, and the way AMD designed theirs, could change in the next few years, meaning newer programs would make better use of the FX modules and thus bring in better performance numbers in AMD's favor.
You seem to be counting on software from the future, like when you say this:
Speaking as a developer:
We don't code to cores. We create threads, and let the Windows scheduler figure out the rest.
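To make that concrete, here's a minimal Python sketch (toy code, not actual game code; the subsystem names are made up for illustration). Note that nothing pins a thread to a particular core: you spawn threads, and the OS scheduler decides where they run.

```python
import threading

results = []

def render():
    # Hypothetical rendering work; just records that it ran.
    results.append("render")

def simulate_ai():
    # Hypothetical AI work.
    results.append("ai")

# Create one thread per subsystem. We never say "run on core 3" --
# the OS scheduler picks a core for each thread on its own.
threads = [threading.Thread(target=f) for f in (render, simulate_ai)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

The same pattern holds in C++ or C# on Windows: unless you explicitly set thread affinity, placement is entirely the scheduler's call.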
And I AGAIN note that if AMD simply labeled the second core of a BD Module a logical core, it would behave (as far as scheduling goes) EXACTLY like HTT, fixing AMD's performance problem.
Who cares if a CPU is rocking 8 cores or 20 cores? If most games/apps only use 2-4 cores, then it's pointless.
The problem with that statement is that in two years, that will most definitely change.
Not really. Games already use dozens of threads (40+ on average in a few sample games I bothered to check), but the main problem is that the heavy lifting is limited to a few key threads (specifically, the main rendering, AI, and physics threads). There's also a LOT of interplay that goes on, which forces a lot of the code to run in a serial manner.
If your code is serial, you aren't going to scale. Period.
Simple example: Look how AI has to work under the hood. Some event happens, the AI then reacts.
For instance, the AI "sees" a target, then chases after it.
What that means is, you need the results of the rendering engine in order to determine whether the AI has line of sight. You need at least the landscape geometry and a depth buffer to do this. So you can't process this event until AFTER the geometry has been generated.
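That ordering dependency is easy to sketch. In the toy Python below (hypothetical names; real engines use their own job/signal systems, not necessarily `threading.Event`), the AI thread can start whenever it likes, but its line-of-sight check is forced to wait until the renderer has produced the geometry:

```python
import threading

geometry_ready = threading.Event()
frame = {}
out = []

def render_thread():
    # Produce the data the AI needs (stand-ins for the
    # landscape geometry and depth buffer).
    frame["geometry"] = "landscape mesh"
    frame["depth_buffer"] = "depth values"
    geometry_ready.set()  # signal: geometry is now available

def ai_thread():
    # The LOS check CANNOT run until the renderer is done --
    # a serial ordering hiding inside "parallel" threads.
    geometry_ready.wait()
    out.append("LOS check using " + frame["geometry"])

t_ai = threading.Thread(target=ai_thread)
t_render = threading.Thread(target=render_thread)
t_ai.start()      # AI thread starts first...
t_render.start()  # ...but still has to wait for this one
t_ai.join()
t_render.join()
```

Two threads, but the interesting work still happens in a fixed order, so a second core buys you very little here.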
Another example: the AI "hears" a gunshot, and goes to investigate.
Same problem. You have to establish whether the AI can "hear" the sound effect, so you need to establish distance. Next, assuming 3D audio effects, you need to establish whether the sound reaches the AI with enough power to be distinguishable. So again, you can't process audio until AFTER the scenescape has been created. You have a situation where the audio engine needs data from the rendering engine, and then the AI needs that data to process. Notice there's an order of things that has to occur?
Now, let's throw in the physics engine. See how messy things are getting with regard to program flow?
Now, all these threads are going on at the same time. So you have a lot of pre-empting as events start to happen. You fire a gun, now the Audio engine needs to calculate how the sound travels, and the AI engine needs to figure out what effects that has on the active AI objects. Etc. This involves a LOT of inter-thread interaction, which implies a lot of synchronization. This kills performance, and limits performance gains.
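Here's what that synchronization cost looks like in miniature (again a toy Python sketch with made-up names). Eight "AI" threads all want to push events onto a shared queue, so every one of them has to take the same lock, and that section of code runs serially no matter how many cores you have:

```python
import threading

lock = threading.Lock()
events = []  # shared event queue: gunshots, AI reactions, etc.

def fire_gun(shooter):
    # Any subsystem touching shared state must hold the lock.
    # While one thread is inside this block, the other seven
    # just sit and wait -- the lock serializes them.
    with lock:
        events.append(shooter + " fired")

threads = [threading.Thread(target=fire_gun, args=("ai_" + str(i),))
           for i in range(8)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

Every lock, queue, and signal between the rendering, audio, AI, and physics threads is another spot where "parallel" code quietly becomes serial.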
At the end of the day, your speedup is limited by how much of the program can be made parallel. If 60% of the program is serial, that 60% will NEVER benefit from more cores, and by Amdahl's law your maximum speedup is capped at 1/0.6, about 1.67x, no matter how many cores you put into a system.
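You can see how brutally this cap bites with a few lines of Python implementing Amdahl's law directly:

```python
def amdahl_speedup(serial_fraction, cores):
    """Max speedup of a program where serial_fraction of the
    work cannot be parallelized (Amdahl's law)."""
    return 1.0 / (serial_fraction + (1.0 - serial_fraction) / cores)

# With 60% of the program serial, piling on cores stops helping fast:
for cores in (2, 4, 8, 1000):
    print(cores, "cores ->", round(amdahl_speedup(0.6, cores), 2), "x")
```

Going from 8 cores to 1000 cores barely moves the number; the ceiling is set by the serial fraction, not the core count.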