AMD Piledriver rumours ... and expert conjecture

Status
Not open for further replies.
We have had several requests for a sticky on AMD's yet to be released Piledriver architecture ... so here it is.

I want to make a few things clear though.

Post a question relevant to the topic, or information about the topic, or it will be deleted.

Post any negative personal comments about another user ... and they will be deleted.

Post flame baiting comments about the blue, red and green team and they will be deleted.

Enjoy ...
 
You forgot one important thing here.

The way developers utilize CPUs, and how AMD designed theirs, could change in the next few years, meaning newer programs could make better use of the FX modules and thus bring in better performance numbers in AMD's favor.

You seem to write off software from the future, for example by saying this:

Speaking as a developer:

We don't code to cores. We create threads, and let the Windows scheduler figure out the rest.

And I AGAIN note that if AMD simply labeled the second core of a BD Module a logical core, it would behave (as far as scheduling goes) EXACTLY like HTT, fixing AMD's performance problem.

Who cares if a CPU is rocking 8 cores or 20 cores? If most games/apps use 2-4 cores, then it's pointless.

The problem with that statement is that in 2 years, that will most definitely change.

Not really. Games already use dozens of threads (40+ on average in a few sample games I bothered to check), but the main problem is that the heavy lifting is limited to a few key threads (specifically, the main rendering, AI, and physics threads). There's also a LOT of interplay going on, which forces a lot of the code to run in a serial manner.

If your code is serial, you aren't going to scale. Period.

Simple example: Look how AI has to work under the hood. Some event happens, the AI then reacts.

For instance, the AI "sees" a target, then chases after it.

What that means is, you need the results of the rendering engine in order to determine if the AI has line of sight. You need at least the landscape geometry and a depth buffer in order to do this. So you can't process this event until AFTER the geometry has been made.

Another example: the AI "hears" a gunshot, and goes to investigate.

Same problem. You have to establish whether or not the AI can "hear" the sound effect. So you need to establish distance. Next, assuming 3D audio effects, you need to establish whether the sound reaches the AI with enough power to be distinguishable. So again, you can't process audio until AFTER the scenescape has been created. So you have a situation where the audio engine needs data from the rendering engine, then the AI needs that data to process. Notice there's an order of things that has to occur?

Now, let's throw in the physics engine. See how messy things are getting with regard to program flow?

Now, all these threads are going on at the same time. So you have a lot of pre-empting as events start to happen. You fire a gun, now the Audio engine needs to calculate how the sound travels, and the AI engine needs to figure out what effects that has on the active AI objects. Etc. This involves a LOT of inter-thread interaction, which implies a lot of synchronization. This kills performance, and limits performance gains.
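
To put rough numbers on that ordering problem, here is a minimal sketch. The stage costs are made up, and treating physics as independent of the other stages is purely an assumption for illustration; only the "audio needs the scene, AI needs the scene and the audio" ordering comes from the description above. It computes the longest dependency chain, which is the best frame time you could get even with unlimited cores.

```python
# Hypothetical per-frame stage costs in milliseconds (made-up numbers).
STAGE_MS = {"geometry": 6.0, "audio": 2.0, "physics": 3.0, "ai": 4.0}

# Which results each stage has to wait for. Physics is assumed to be
# independent here, which real engines generally cannot promise.
DEPENDS_ON = {
    "geometry": [],
    "physics": [],
    "audio": ["geometry"],
    "ai": ["geometry", "audio"],
}


def earliest_finish(stage, cache=None):
    """Earliest time a stage can finish if every stage gets its own core."""
    if cache is None:
        cache = {}
    if stage not in cache:
        # A stage can only start once all of its inputs are done.
        start = max(
            (earliest_finish(dep, cache) for dep in DEPENDS_ON[stage]),
            default=0.0,
        )
        cache[stage] = start + STAGE_MS[stage]
    return cache[stage]


def frame_time_unlimited_cores():
    """Length of the longest dependency chain (the critical path)."""
    return max(earliest_finish(stage) for stage in STAGE_MS)


def frame_time_single_core():
    """Everything run back to back on one core."""
    return sum(STAGE_MS.values())


print(frame_time_single_core(), frame_time_unlimited_cores())
```

With these made-up costs the single-core frame takes 15 ms and the unlimited-core frame still takes 12 ms, because geometry, audio and AI have to run one after another. Locking and synchronization overhead would come on top of that and is not modelled here.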

At the end of the day, your speedup is limited by how much of the program can be made parallel. If 60% of the program is serial, that 60% will NEVER benefit from more cores: the most you can do is cut the run time by 40% (a speedup of about 1.67x), no matter how many cores you put into a system.
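
This limit is usually called Amdahl's law. A minimal sketch, assuming the 60% serial / 40% parallel split from the example above and ignoring all synchronization overhead (the function names are just for illustration):

```python
def amdahl_speedup(parallel_fraction, cores):
    """Speedup over one core when only part of the work can be spread out."""
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / cores)


def max_speedup(parallel_fraction):
    """Ceiling on the speedup as the core count goes to infinity."""
    return 1.0 / (1.0 - parallel_fraction)


# 40% of the program is parallel, 60% is serial.
for cores in (1, 2, 4, 8, 1024):
    print(f"{cores:>4} cores: {amdahl_speedup(0.4, cores):.2f}x")

print(f"ceiling: {max_speedup(0.4):.2f}x")
```

Two cores give 1.25x, and the ceiling works out to 1/0.6, or roughly 1.67x. Put differently, the run time can shrink by at most 40%, and most of that gain is already used up by the time you reach 4 to 8 cores.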
 
Speaking as a developer:

We don't code to cores. We create threads, and let the Windows scheduler figure out the rest.

And I AGAIN note that if AMD simply labeled the second core of a BD Module a logical core, it would behave (as far as scheduling goes) EXACTLY like HTT, fixing AMD's performance problem.
That would work if the second core of a module were only as useful as an HT thread. Problem is, it's much, much stronger (relatively). Windows schedules to HT threads last; it doesn't treat them as equal to the physical cores, because they aren't.

Another thing to note: the two cores in a module are IDENTICAL. The second core is not in any way inferior to the first. It is only when they work together that they show, in some cases, a performance decrease compared to two separate cores.

At the end of the day, your speedup is limited by how much of the program can be made parallel. If 60% of the program is serial, that 60% will NEVER benefit from more cores: the most you can do is cut the run time by 40% (a speedup of about 1.67x), no matter how many cores you put into a system.
I believe that putting care and effort into your coding can yield incredible results. E.g., Source.
 
http://www.obr-hardware.com/2012/07/knock-knock-is-anybody-else-there-amd.html

Some facts: AMD Trinity has serious design bug in current revision, error will not affect only lower-performance models with low frequencies (laptops). Desktop FM2 version in A1 revision are buggy as hell. FX2 based on Piledriver core is delayed to december/january 2013. Radeon HD 7970 GHz Edition is still only paper-launched. New Radeons HD 7950/7990 are delayed again. Look at stocks, lowest price now for the last several years.

this guy has been spot-on on inside information for some time now.
 
Speaking as a developer:


We don't code to cores. We create threads, and let the Windows scheduler figure out the rest.


And I AGAIN note that if AMD simply labeled the second core of a BD Module a logical core, it would behave (as far as scheduling goes) EXACTLY like HTT, fixing AMD's performance problem.
It's called a Windows hotfix, and it's now part of the Windows Update package. Run Cinebench: 4c/8t. Run Skyrim: it loads cores 0 2 4 6 and not 1 3 5 7. Oh, that's right, only me and vindi can do that; everyone else just talks about what they can't see, because we were dumb enough to buy the 8120 FX.

It's already implemented, but not tested other than during the beta phase.

BTW, the Folding at Home thread is stuck at page 120 also. I was finally able to get to this page by "showing right column" and clicking on the active thread there. Otherwise I can't get here, just to the bottom of page 120.
 
I CAN READ THE PILEDRIVER THREAD AGAIN!!!!!!
thank you very much noob!!!
people, use the right column to open the thread.
anyway @obrhardware's news: what's the basis of hardware bugs (that's what i'm assuming) in trinity apus? imo that writer was visibly pissed at amd. i thought that trinity was delayed due to too many unsold apus. i did read about a 'design issue' that affected socket fm2 motherboards and delayed launch of top unlocked trinity apus. nothing about hardware bugs... need more info and articles not written by angry writers. :)
in other news, investors give amd an f..
http://www.tomshardware.com/news/amd-investors-sell-stock-rating,16430.html
 
Speaking as a developer:

We don't code to cores. We create threads, and let the Windows scheduler figure out the rest.

And I AGAIN note that if AMD simply labeled the second core of a BD Module a logical core, it would behave (as far as scheduling goes) EXACTLY like HTT, fixing AMD's performance problem.
That would work if the second core of a module were only as useful as an HT thread. Problem is, it's much, much stronger (relatively). Windows schedules to HT threads last; it doesn't treat them as equal to the physical cores, because they aren't.

Another thing to note: the two cores in a module are IDENTICAL. The second core is not in any way inferior to the first. It is only when they work together that they show, in some cases, a performance decrease compared to two separate cores.

It has nothing to do with HTT/Non-HTT; it has to do with core loading. Remember that using the second core of a BD module, because of resource sharing, is only about 80% effective. Labeling the second core of a BD module as a logical core would tell the OS to use the physical cores first, bypassing that 20% performance penalty. The Windows scheduler treats all BD cores as equal, so you run into that performance hit the second a thread is offloaded to the second core of a module.

That being said, in a lightly threaded situation, you may want to do the reverse, so you can turbo boost.
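
A toy model of the two placement orders, to show where the penalty comes from. It assumes cores 2m and 2m+1 share module m (which matches the 0 2 4 6 vs 1 3 5 7 pattern mentioned earlier in the thread) and reads the "about 80% effective" figure as "a second busy core in a module adds 0.8 of a core". Both are simplifications for illustration, not how the Windows scheduler is actually implemented.

```python
MODULES = 4                    # FX-8120 style: 4 modules, 8 cores (0-7)
SECOND_CORE_EFFICIENCY = 0.8   # the "about 80% effective" figure above


def spread_order(modules=MODULES):
    """One core per module first (0 2 4 6), then the siblings (1 3 5 7)."""
    first = [2 * m for m in range(modules)]
    second = [2 * m + 1 for m in range(modules)]
    return first + second


def packed_order(modules=MODULES):
    """Fill both cores of a module before touching the next (0 1 2 3 ...)."""
    return list(range(2 * modules))


def relative_throughput(order, threads):
    """Toy estimate: first busy core in a module counts 1.0, second 0.8."""
    busy_per_module = {}
    total = 0.0
    for core in order[:threads]:
        module = core // 2
        busy_per_module[module] = busy_per_module.get(module, 0) + 1
        if busy_per_module[module] == 1:
            total += 1.0
        else:
            total += SECOND_CORE_EFFICIENCY
    return total


for threads in (2, 4, 8):
    spread = relative_throughput(spread_order(), threads)
    packed = relative_throughput(packed_order(), threads)
    print(f"{threads} threads: spread {spread:.1f}, packed {packed:.1f}")
```

In this model, four heavy threads spread across modules come out at 4.0 "cores" of work versus about 3.6 when packed into two modules, and the gap disappears once all eight cores are busy. The packed order is the one you would pick in the lightly threaded case, since it leaves whole modules idle for turbo; the model does not try to estimate how much clock speed that buys.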


At the end of the day, your speedup is limited by how much of the program can be made parallel. If 60% of the program is serial, that 60% will NEVER benefit from more cores: the most you can do is cut the run time by 40% (a speedup of about 1.67x), no matter how many cores you put into a system.
I believe that putting care and effort into your coding can yield incredible results. E.g., Source.

Source only scales to about 4 cores, with limited scaling after that. Not the best example.

It's trivial to make PARTS of a program run in parallel with each other. I can make every single game AI totally independent of the rest. But because the AI engine is tied to both rendering and audio processing to some degree, you are going to have roadblocks that limit the performance gained by adding more cores. You may still see very high core usage in Task Manager, but you won't be able to get more than about 80% loading or so, simply due to software locking between threads.

I really can't see games ever scaling past 8 cores under any circumstances, hence why I doubt we'll see consumer CPUs with more cores than that.
 
It's called a Windows hotfix, and it's now part of the Windows Update package. Run Cinebench: 4c/8t. Run Skyrim: it loads cores 0 2 4 6 and not 1 3 5 7. Oh, that's right, only me and vindi can do that; everyone else just talks about what they can't see, because we were dumb enough to buy the 8120 FX.

Which is EXACTLY what calling the cores logical cores would have done, just without needing to wait a few months for a MS Hotfix. The Windows Scheduler loads logical cores last.
 
AMD just received a devastating certificate from the investment industry. AMD has been crowned "sell of the week" by AAII Journal, as investors do not seem to have much confidence in the company right now. Navellier Ratings has given AMD another downgrade, from "sell" last week to "strong sell" this week, as the company's financial performance was graded with six "F" performances in eight disciplines.


It's no surprise. Look at the key statistics of AMD vs say Nvidia.

They need help in a big way. Like a Microsoft bailing out Apple in the 90s way.

The only positive is they're employing 11,000 people.

http://finance.yahoo.com/q/ks?s=AMD+Key+Statistics

http://finance.yahoo.com/q/ks?s=NVDA+Key+Statistics
 
I hardly see a difference based on L3 in there, to be honest. I wouldn't say that it's a 10% advantage, either. It's a good thing to have L3, but it's not worth a 10% improvement at the same clock speed.

This assumption is inaccurate if we go by Tom's own previous testing.

With regard to gaming, L3 might provide up to 17% more performance at the same clock speed, according to what has already been tested. I think there were a few more articles that covered this topic back at Athlon II's launch, but, nevertheless, here is Tom's:

http://www.tomshardware.com/reviews/athlon-l3-cache,2416-6.html
 
I have just found another article that proves the point regarding L3 being able to make a significant difference in gaming. If we compare the Phenom II X2 550 BE (3.1GHz) to the Athlon II X2 250 (3.0GHz) adjusting for the small clock difference, here is the advantage that the Phenom II has over the Athlon II:

7% in Fallout 3
13% in Left 4 Dead
11% in FarCry 2
4% in Crysis Warhead

http://www.anandtech.com/show/2836/8

Also, if you head back to the index of this same article, you will see a table with performance data and the conclusion that "At the same clock speed the Athlon II X4 should offer roughly 90% of the performance of a Phenom II X4."
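
For anyone wanting to redo the clock adjustment, a minimal sketch of the arithmetic. It assumes frame rate scales linearly with clock speed, which is only a rough approximation, and the frame rates below are hypothetical placeholders, NOT numbers from the linked article.

```python
def clock_adjusted_advantage(fps_a, clock_a_ghz, fps_b, clock_b_ghz):
    """Fractional lead of CPU A over CPU B after scaling out the clock gap.

    Assumes performance scales linearly with clock speed.
    """
    per_ghz_a = fps_a / clock_a_ghz
    per_ghz_b = fps_b / clock_b_ghz
    return per_ghz_a / per_ghz_b - 1.0


# Hypothetical frame rates for a 3.1GHz part with L3 and a 3.0GHz part without.
raw_lead = 65.1 / 60.0 - 1.0
adjusted_lead = clock_adjusted_advantage(65.1, 3.1, 60.0, 3.0)

print(f"raw lead: {raw_lead:.1%}, clock-adjusted lead: {adjusted_lead:.1%}")
```

With these placeholder numbers a raw lead of about 8.5% shrinks to about 5% once the 100MHz difference is taken out. The same arithmetic applied to the "roughly 90%" figure quoted above means the part with L3 is about 11% faster (1/0.9), not 10%.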
 
Which is EXACTLY what calling the cores logical cores would have done, just without needing to wait a few months for a MS Hotfix. The Windows Scheduler loads logical cores last.
Somewhat true. AMD could have also just copied Intel's CPU itself. Or maybe Intel's logical cores are already coded into Windows to begin with (except for Windows XP without the HT hotfix), and since AMD likely couldn't infringe on whatever patent is out there, they had to wait until MS coded for their "3/4 core".

The thing is you can't expect AMD to be allowed to ride on Intel's software/hardware implementation. I would venture to bet that AMD had to do it the way it happened. Just more fuel to the marketing blunder.
 
I have just found another article that proves the point regarding L3 being able to make a significant difference in gaming. If we compare the Phenom II X2 550 BE (3.1GHz) to the Athlon II X2 250 (3.0GHz) adjusting for the small clock difference, here is the advantage that the Phenom II has over the Athlon II:

7% in Fallout 3
13% in Left 4 Dead
11% in FarCry 2
4% in Crysis Warhead

http://www.anandtech.com/show/2836/8

Also, if you head back to the index of this same article, you will see a table with performance data and the conclusion that "At the same clock speed the Athlon II X4 should offer roughly 90% of the performance of a Phenom II X4."

L3 cache is able to make a difference IF the architecture is "well written".
Sometimes the L3 cache can introduce enough extra lag or latency in processing to counter its advantages.
 
L3 cache is able to make a difference IF the architecture is "well written".
Sometimes the L3 cache can introduce enough extra lag or latency in processing to counter its advantages.

There is no justifiable reason to believe it won't improve Vishera's results vs Trinity's, though.

The main point was that it is perfectly reasonable to expect a ~10% increase in performance (especially in gaming) because of Piledriver's added L3. This is not the same as expecting a "BIOS fix" or "microcode update" to completely change the leaked benchmarks of a brand new architecture, such as we saw back at Bulldozer's impending launch.
 
Yes:
http://blogs.amd.com/play/2011/10/13/our-take-on-amd-fx/

I wouldn't call BD a failure, but rather a disappointment for some (if not most). BD needs work, but I try not to give up ALL hope on AMD.

i dont see why people all say bulldozers a fail its baised of an old architecture amd bought rights too the company they got it from liquidised 10 years back they used to make expensive proccessors that were hand made every silicon bit and all that so you can see why they were expensive but they were insane however as its such an old architecture amd have yet to bring it to todays top standards but for a first run they have done well, i have the fx 6100 and its much better then reviewers say theyre so corrupt these days they all get paid to say what they say now my 6100 does proberly 15-20fps better in games then what they reviewers say.
 
Translation:

I don't see why people all say bulldozers a fail. It's based an old architecture. AMD bought rights to the company they got it from liquidised 10 years back. They used to make expensive processors that were hand made every silicon bit and all that. So you can see why they were expensive, but they were insane. However, as it's such an old architecture, AMD have yet to bring it to today's top standards. But for a first run, they have done well. I have the FX-6100 and it's much better then reviewers say they're. So corrupt these days, they all get paid to say what they say now. My 6100 does properly 15-20fps better in games then what the reviewers say.

baised of an old architecture amd bought rights too the company they got it from liquidised 10 years back they used to make expensive proccessors that were hand made every silicon bit and all that

?? That bit was translated but still doesn't sound right.
 