AMD Piledriver rumours ... and expert conjecture

We have had several requests for a sticky on AMD's yet to be released Piledriver architecture ... so here it is.

I want to make a few things clear though.

Post questions or information relevant to the topic; anything else will be deleted.

Post any negative personal comments about another user ... and they will be deleted.

Post flame baiting comments about the blue, red and green team and they will be deleted.

Enjoy ...
 
Win 8 will not be on my desktop for some time to come.
I see it more for the mobile sector...

but my soon-to-be-ordered FX-4170 (come on AMD) will do serious battle with my 965BE and i5-760...😀

Right on, right on. Yeah, I'm not upgrading to Win 8; I'm not into it being so linked to the web, and Win 7 is bad enough lol. Yeah, my 8120 I'll be giving to my uncle when the Piledrivers come out.
 
There is no negativity. I am just looking at the facts. The fact is that AMD designed a CPU that the OS could not be optimized for in time - or rather, they changed it and pushed it back too many times to give MS the time to optimize Windows 7 for it.

Microsoft decided not to change their timing or thread scheduling mechanism for Windows 7 but to put those changes into Windows 8. That is actually not surprising; it represents a large enough change that they would definitely want to do some additional testing before releasing. AMD cannot force Microsoft to update anything; they can provide the data and ask for changes. Although I do expect that we will eventually see a back-ported patch after they finish testing the changes in Windows 8. (Which is completely awful and crashes a lot at this time.)


I guess there has to be a reason for it. If it were Intel doing the same thing, I would have been saying the same thing: you create an arch and make sure it works well with the OS that's current.

The bolded comment made me literally spit coffee: I couldn't believe that you typed something so utterly devoid of reality. A designer would only take the action you outlined if their company enjoyed staying with old technology and never creating anything innovative. If everyone followed your guidelines we would still be running single-core chips. You do realize that Windows never supported multi-core chips until after they became readily available. Actually, I remember having to buy Windows XP Professional because that was what was required to run my dual-CPU machine. They didn't add dual-core support until well after that.


The 7 kernel is just a highly optimized Vista kernel. That's why you can "upgrade" from Vista to 7, although I never would, nor would I recommend it. On the other hand, the XP and Vista/7 kernels are very different, and that's why a clean-install "upgrade" is required from XP -> Vista/7.

The main difference between XP and Vista was the removal of direct access to Ring 0. The other major differences were changes to thread scheduling. (EDIT: Although true optimizations for hyperthreading didn't come until Windows 7.)

But there really is no connection between the ability to "upgrade" and the requirement to wipe and reload clean. In many cases Microsoft attempts to provide the ability to "upgrade" without wiping, but anybody knowledgeable will encourage people to wipe and load regardless of whether that ability exists. (Personally, I have never trusted their "upgrades", even when they allowed it from Windows 3.0 to 3.1... or it might have been 3.1 to 3.11. I tried it one time and will never be stupid enough to do it again.)
 
The main difference between XP and Vista was the removal of direct access to Ring 0. The other major differences were changes to thread scheduling.

And a new video driver model, a completely re-worked audio stack, major changes to how UI was handled, etc.

Oddly, I was able to do an upgrade from Vista->7 without any issues whatsoever. And yes, I went into it expecting to have to format for some reason, but to my shock, it worked fine.
 
One thing we do know is that the scheduler in Windows 7 isn't any help. The OS is completely unaware of how Bulldozer modules work; it sees only eight equal cores and schedules threads on them evenly. AMD says Windows 8 will address these issues, but outside of developer previews and such, that OS probably won't be available for another year, at least.

Again, simple fix: simply label the second core of a BD module as a logical processor. Done. If AMD advertises 8 individual CPU cores, they shouldn't be shocked when the OS issues threads across them evenly. Otherwise, the onus is suddenly on MS to figure out which processors need "special" treatment to run well.

Which brings us to the next question: how is MS supposed to know which processors to treat in this manner? I mean, I HOPE they won't have to hardcode specific processors/models into the scheduler, as that would make the logic a mess...

Sorry, but rather than make the OS scheduler a mess, how about AMD simply call a logical core a logical core and be done with it?
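
Just to illustrate what treating the second core of each module like an HT sibling buys you, here's a rough Python toy model (my own sketch, not real Windows scheduler code; the assumption that cores (0,1), (2,3), (4,5), (6,7) pair up into modules is mine for illustration):

# Toy model of module-aware thread placement, NOT real scheduler code.
# Assumed module pairs for an 8-core BD chip:
MODULES = [(0, 1), (2, 3), (4, 5), (6, 7)]

def pick_core(busy):
    # Prefer a module with both cores idle...
    for a, b in MODULES:
        if a not in busy and b not in busy:
            return a
    # ...and only fall back to the idle sibling of a busy module afterwards.
    for a, b in MODULES:
        if a not in busy:
            return a
        if b not in busy:
            return b
    return None  # everything is busy

busy = set()
for _ in range(4):
    busy.add(pick_core(busy))
print(sorted(busy))  # [0, 2, 4, 6] - four threads, four separate modules

A module-unaware scheduler that just sees "8 equal cores" is free to drop those same four threads onto cores 0-3, i.e. two per module, which is exactly the contention everyone is complaining about.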

EDIT

In fact, we're a little surprised AMD hasn't attempted to piggyback on Intel's Hyper-Threading infrastructure by making Bulldozer processors present themselves to the OS as four physical cores with eight logical threads. One would think that might be a nice BIOS menu option, at least. (Hmm. Mobo makers, are you listening?)

Exactly what I've been saying...
 
Can you say SUPERSCALAR OPERATION? Nearly every modern CPU is designed to do more than one operation per cycle. Some may even reach nearly two executions per cycle.

The FE should have two ports. If not, THAT would be the issue, as one port would have to achieve 2 executions per cycle (fetch, decode, etc.). I would think they're smarter than that. Plus, the majority of operations are memory operations, so by decoupling the AGU/ALU they don't have switching penalties and need less scratch space to maintain the last state (load/store or execute). This is one reason why different parts of the CPU run at different speeds.

Win 8 sees latency go down by A LOT (check out the PCStats link from Chad). Of course, it would be better if it were ready for Win 7, or vice versa, but for heavy everyday use FX can't be beat. There are places where it even beats 12 threads (990X).

I think most people here have understood what superscalar means for years now. :sarcastic:

As for your speculation on the front end: http://semiaccurate.com/2011/10/17/bulldozer-doesnt-have-just-a-single-problem/ . There were several sites that mentioned BD's front end effectively having a 2-issue decoder per core when both cores in a module are active.

Finally, the PCStats article that Beenthere linked (not Chad) shows SiSoft Sandra's "multicore efficiency latency" go down by 16%. Not sure what that means - maybe you can explain it to us. However, you should note the article's conclusion once again: "on the whole the AMD FX-8150 processor was slightly faster under the Windows 8 Developer environment." So it seems Win8 won't be quite the universal fix-it for BD after all...
 
Your die size argument is not valid. Much of BD's die size comes from cache; it has almost twice the cache of Thuban or Sandy Bridge.

In order to properly measure the effect of CMT on die size, we need to be able to compare the units on equal terms. Difficult to do right now because of the cache size difference.

On the same basis, the original statement that "die size increases show CMT to be a worthy approach to get this extra perf with minimal die space and power usage" also has no basis in fact, given that we don't know how much die space the extra BD cache takes up. All I've seen is either "5%" or "12%" extra die space for the extra cores that BD's CMT uses. I do recall some Nehalem articles stating that HT uses less than 5% extra die space, however.

As for power usage, I don't know of any published info stating how much power is used by BD's cache vs. the extra cores. I would think that cache memory, being static RAM, would have lower power usage than the ALU/AGU execution pipes in the extra cores. And isn't the cache located in the 'uncore' part anyway (lower clocks, lower voltage)? About the only comparison we can make here is that SB runs 8 threads at a similar clock speed to BD's 8 threads, at lower overall CPU power usage.

As for performance, the difference between SB and Bulldozer has more to do with the design of the processors' execution components than with CMT. I will remind you that AMD was well behind Intel long before CMT.

Well, at this point I don't think anybody knows the whole story, except perhaps AMD's engineers. However, looking at the Anandtech benchmark page, I don't see any multithreaded vs. single-threaded tests where BD scales up by 640% (8 x 80%) as alleged in this thread. At most, the Cinebench R10 score goes up by 541% from 1 to 8 cores, while the R11.5 score goes up by 587% from 1 to 8 cores. On a per-core basis, that amounts to 67% to 73%. For SB's HT, which Intel claims offers up to 30% improvement with 'light' multithreading, R10 shows a 48% improvement and R11.5 a 56% improvement per thread. Admittedly 2 data points don't make a case, but for these 2, Intel's HT performs above its advertised max while AMD's CMT performs below its advertised max.
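
If anyone wants to check the per-core arithmetic, here it is in plain Python, using only the percentages quoted above and nothing else:

# 1-core -> 8-core speedups from the Anandtech Cinebench numbers quoted above.
def per_core(total_speedup_pct, cores=8):
    return total_speedup_pct / cores

print(per_core(541))   # R10:   ~67.6% average contribution per core
print(per_core(587))   # R11.5: ~73.4% average contribution per core
print(per_core(640))   # the "8 x 80%" claim alleged in this thread = 80.0 per core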

I could go on and compare by increase in die size - 12% vs. 5%, but that's probably stretching a 2-data-point example too far 😛.
 
jimmysmitty wrote:
I guess there has to be a reason for it. If it were Intel doing the same thing, I would have been saying the same thing: you create an arch and make sure it works well with the OS that's current.
The bolded comment made me literally spit coffee: I couldn't believe that you typed something so utterly devoid of reality. A designer would only take the action you outlined if their company enjoyed staying with old technology and never creating anything innovative. If everyone followed your guidelines we would still be running single-core chips. You do realize that Windows never supported multi-core chips until after they became readily available. Actually, I remember having to buy Windows XP Professional because that was what was required to run my dual-CPU machine. They didn't add dual-core support until well after that.

Actually, considering the quick & easy BIOS fix suggested by the Techreport article, AMD was really deficient in not addressing this issue long before BD was released. Seriously, they left a 10-20% performance improvement on the table?? What were they doing during the last 2 years when Win7 betas were out and available? That's what I find "devoid of reality"..
 
@ fazers.... please don't try and burst his bubble...
he might get mad and use that 'feed the beast' line again..


Any trolling or snide comments from you and you're off for a 3-day holiday, mal ... ditto for anyone else trying to drag the topic into the e-peen arena or wanting to get personal.

Focus on the topic of the thread, thanks.
 
The bolded comment made me literally spit coffee: I couldn't believe that you typed something so utterly devoid of reality. A designer would only take the action you outlined if their company enjoyed staying with old technology and never creating anything innovative. If everyone followed your guidelines we would still be running single-core chips. You do realize that Windows never supported multi-core chips until after they became readily available. Actually, I remember having to buy Windows XP Professional because that was what was required to run my dual-CPU machine. They didn't add dual-core support until well after that.

So let's put a scenario out there:

What if Microsoft had not made these changes to the 7 kernel for Windows 8 (we all know the 8 kernel is an improved 7 kernel)? What happens if Windows 8 did not have the scheduling fix and MS decided to wait for Windows 9 (or whatever they call it)?

My point is that you can never create a product expecting someone else to utilize it properly unless you take the necessary actions yourself.

Don't get me wrong, I am glad MS will make the changes to help boost BD's performance a bit. I just think it's a bad move to release a product that cannot be properly utilized until the end of next year, which gives Intel time to release two new CPUs and, not long after, another new one.

I applaud AMD for trying something new, but I don't think they did it the right way. I think they could have just skipped BD, shrunk Phenom II to 32nm, and worked on PD instead, releasing it when 8 comes out so it gets a decent performance review.

Currently it is a bit unfair, but that was AMD's own doing.
 
In 1993, it seemed completely obvious that the next version of Windows was going to be fully 32-bit. It seemed equally obvious that designing a CPU for anything but this was old school thinking, and decisions were made in the P6 architecture which optimized it for 32-bit code and made it perform marginally worse on 16-bit code than the older architecture. After all, Microsoft themselves said it would be a 32-bit OS (like WinNT)... Designing for it seemed the right thing to do.

Then Windows 95 actually arrived and the Pentium Pro, though inarguably a better CPU for 32-bit code, didn't have the same performance on older 16-bit code-- which Win95 was still designed around. Of course, Intel recovered relatively quickly since MS broadcast its intentions in time to intercept later designs (Pentium II), but the lesson was learned.

Designing for the next generation of SW is not a bad idea... but it carries with it some significant risks unless you are in control of the next generation of SW. We learned that in 1995, much to our chagrin, and have endeavored to avoid repeating the same mistake again.
 
In 1993, it seemed completely obvious that the next version of Windows was going to be fully 32-bit. It seemed equally obvious that designing a CPU for anything but this was old school thinking, and decisions were made in the P6 architecture which optimized it for 32-bit code and made it perform marginally worse on 16-bit code than the older architecture. After all, Microsoft themselves said it would be a 32-bit OS (like WinNT)... Designing for it seemed the right thing to do.

Then Windows 95 actually arrived and the Pentium Pro, though inarguably a better CPU for 32-bit code, didn't have the same performance on older 16-bit code-- which Win95 was still designed around. Of course, Intel recovered relatively quickly since MS broadcast its intentions in time to intercept later designs (Pentium II), but the lesson was learned.

Designing for the next generation of SW is not a bad idea... but it carries with it some significant risks unless you are in control of the next generation of SW. We learned that in 1995, much to our chagrin, and have endeavored to avoid repeating the same mistake again.


I'm odd like that; I like reading about hardware from the '90s and before. Thank you, I did not know this.
 
My point is that you can never create a product expecting someone else to utilize it properly unless you take the necessary actions yourself.

And what I quoted above is exactly the point where we completely disagree. You are focusing on the product as an entity, and I am only focusing on a new feature of something that already exists (i.e., a CPU, not the architecture).

In a similar light: what operating system first supported the new x64 architecture, and which operating system didn't really support it for a long time? Did AMD wait until Windows fully supported x64 before releasing it? The only difference between that older change and the new BD architecture would be if the addition of x64 had created a slight decrease in x86 functionality. (Which I think actually did happen... but the amount was small enough that it was a non-issue.)

There is really nothing AMD could have done in the past short of giving them a working BD chip in about 2008 that would force Microsoft to make these updates to Windows 7.

The main Microsoft development teams are now focusing their efforts on Windows 8 for everything new, so that is where anybody who understands their software development process would expect these changes; they generally never make changes of this magnitude (kernel level) to older operating systems. The older OS will only get bug fixes and minor changes. If the new functionality is added to the Windows 8 kernel, goes through all testing, and the changes can be back-ported to Windows 7, then that might happen. However, we can't expect anything like that until after these changes are fully tested in the Windows 8 kernel.

(BTW: My response cut out your strawman argument. Sorry but that wasn't really worth responding to.)
 
^^ Again, the question I have is simple: why should MS be forced to re-write its scheduler because AMD is basically lying to the OS about BD's capabilities:

OS: How many cores do you have?
BD: Eight
OS: HTT/SMT?
BD: Nope 😀
OS: Ok, I'll use all your cores equally
BD: ...why do that? Use the odd ones less!
OS: ???
 
^^ Again, the question I have is simple: why should MS be forced to re-write its scheduler because AMD is basically lying to the OS about BD's capabilities:

Using the word "lying" would be incorrect since the needed functionality is not the same as what is used for hyperthreading.

And nobody "forces" Microsoft to do anything. They either adopt new technology or they don't. Look at how many CPU extensions have come and gone without being adopted by Microsoft.

We know that Microsoft has chosen to make changes to their kernel for Windows 8, so they have chosen to support BD. And we know that Windows 8 is the earliest kernel in which anybody knowledgeable about the subject would expect those updates.

Although these things are true, we can still expect even better performance from PD.

EDIT: I think people keep forgetting that Windows 7 was the first version that fully supported Core i7's hyperthreading. Does anybody remember how long after the i7 was released it took before Windows 7 came out? (About the same amount of time as it will be before Windows 8 is available.)
 
^^ Again, the question I have is simple: why should MS be forced to re-write its scheduler because AMD is basically lying to the OS about BD's capabilities:

OS: How many cores do you have?
BD: Eight
OS: HTT/SMT?
BD: Nope 😀
OS: Ok, I'll use all your cores equally
BD: ...why do that? Use the odd ones less!
OS: ???


Funny. W8 will probably just let BD run at full turbo more often, so the 8-core can be clocked at 4.2GHz more of the time instead of just a 3.9GHz turbo, and maybe fix some small things that are wrong with the sharing. Really, I do think CMT on BD works well as far as how it performs; it could be better, as I only see about a 60% boost from CMT, not 90% scaling. The issue is that it's 2 IPC per core, with two ALUs and two AGUs per core, plus the slow cache. I'm sorry, but that's just it. When AMD can improve on these mistakes they will have a better processor, but they have to do it at a lower TDP.

You may ask why I think CMT works pretty well (not perfectly!). Well, clock for clock BD is around 15-25% slower than Phenom II, but the 8-core is about on par with the 6-core Phenom, maybe a little better at times. That's pretty good for CMT (essentially a 4-module processor presenting 8 cores), even with less performance per core.
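
If you want to see how I'm eyeballing that, here's the back-of-the-envelope version in Python (the 60% CMT boost and the 15-25% per-core deficit are just my figures from above; the rest is multiplication):

# 4 modules; the second core in each module adds ~60%; each BD core is roughly
# 75-85% of a Phenom II core clock-for-clock (i.e. 15-25% slower).
modules = 4
cmt_boost = 0.60
bd_core_equiv = modules * (1 + cmt_boost)   # 6.4 "BD cores" worth of throughput

for factor in (0.75, 0.85):
    print(round(bd_core_equiv * factor, 1))  # ~4.8 and ~5.4 Phenom-core equivalents

# That lands in the same ballpark as a 6-core Phenom II, which is roughly
# what the benchmarks show.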
 
Using the word "lying" would be incorrect since the needed functionality is not the same as what is used for hyperthreading.

...

EDIT: I think people keep forgetting that Windows 7 was the first version that fully supported Core i7's hyperthreading. Does anybody remember how long after the i7 was released it took before Windows 7 came out? (About the same amount of time as it will be before Windows 8 is available.)

Technically, Vista was the first MS OS with *proper* HTT support. And HTT, like CMT, is just a form of SMT. If BD is telling the OS that it has 8 cores, but 4 of those cores are really only 80% efficient, you shouldn't be upset when the OS uses those cores the "wrong" way. That's EXACTLY why we have the CPUID field, but AMD, for whatever reason, refuses to call its processors a 4+4 approach...

AMD could fix the scheduling problem TODAY with a simple BIOS update that would report the second core of a BD module as a logical core.
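
And just so it's clear what "calling a logical core a logical core" buys you: any OS that reads the topology can see which cores are siblings. Here's a small sketch of reading it (Linux in this example, purely because its topology files are easy to print from a script; Windows gets the equivalent information through its own topology API, which is the part the BIOS/CPUID change would have to feed):

# Print which CPUs the kernel considers siblings of the same physical core.
# On an SMT/HT-enabled chip you see pairs like "cpu0 -> 0,4"; a chip that
# reports only independent cores lists each CPU by itself.
import glob

for path in sorted(glob.glob(
        "/sys/devices/system/cpu/cpu[0-9]*/topology/thread_siblings_list")):
    cpu = path.split("/")[5]          # e.g. "cpu0"
    with open(path) as f:
        print(cpu, "->", f.read().strip())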
 