AMD Piledriver rumours ... and expert conjecture

We have had several requests for a sticky on AMD's yet-to-be-released Piledriver architecture ... so here it is.

I want to make a few things clear though.

Post questions or information relevant to the topic, or your post will be deleted.

Post any negative personal comments about another user ... and they will be deleted.

Post flame-bait comments about the blue, red, and green teams and they will be deleted.

Enjoy ...
 
Regarding the PD improvements and games... The 15% IPC gain over BD that Tom's showed won't translate directly to games, and I side a little with gamerk there (just not so extreme, lol). It will be 7% to 10% across the board, tops. The rest will come from higher clocks and won't account for more than 20% overall, IMO (thinking of a flagship with a 3.8 GHz base and 4.2 GHz Turbo).

What I am sure of, though, is the reduced overall power consumption PD will show. Trinity proved that much at least.
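
As a quick sanity check on those numbers (all of them my own guesses, nothing official), IPC and clock gains multiply rather than add:

[code]
# Back-of-the-envelope check of the guesses above: IPC and clock gains multiply.
ipc_gain = 1.10              # assume ~10% better IPC in games (upper end of my guess)
bd_base  = 3.6               # FX-8150 base clock, GHz
pd_base  = 3.8               # hypothetical PD flagship base clock, GHz

clock_gain   = pd_base / bd_base        # ~1.06
overall_gain = ipc_gain * clock_gain    # ~1.16 -> roughly 16% overall

print(f"clock gain {clock_gain:.2f}, overall {overall_gain:.2f}")
[/code]

So even the optimistic end of that IPC guess needs a real clock bump on top to get anywhere near 20%.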

Cheers!

EDIT: Reworded a bit.
EDIT2: MU_Engy, if you're gonna edit, do it correctly.
 
For Piledriver and the expected outcries and gnashing of teeth from the blue-painted folk, I'll say this: the only thing that matters is performance vs cost vs power. I couldn't care less if it's 1 GHz, 3 GHz, 10 GHz, or 1 THz; just provide performance vs cost vs power. Everything after that is just blue vs green warfare.
 
For Piledriver and the expected outcries and gnashing of teeth from the blue-painted folk, I'll say this: the only thing that matters is performance vs cost vs power. I couldn't care less if it's 1 GHz, 3 GHz, 10 GHz, or 1 THz; just provide performance vs cost vs power. Everything after that is just blue vs green warfare.


Agreed, but they have to get the performance from somewhere to make the performance/TDP/cost equation work.
 
Well, I think that's what VMware does.

What the guys here are talking about is running very low-powered tasks on the ARM CPU to save battery life and then waking up the big x86 one when needed to provide more power for more intensive tasks. VMware won't help you with that as you would be running both OSes at the same time on only one of the CPUs. Running an ARM OS inside of an emulator on an x86 CPU running an x86 OS wouldn't be of any benefit because you're already firing up the x86 CPU and have a much more fully-featured OS already running. Running an x86 OS inside of an emulator on an ARM CPU would save power and provide better functionality than the ARM OS itself, but it would be *impossibly* slow.

I think the best way to go forward with this is to stick with x86 cores but continue the aggressive Turbo and clock gating of unneeded parts. That way you get most of the benefit of a super-weak, low-powered CPU but without the problems associated with asymmetric multiprocessing. Also, if you really want to cut power consumption, look at other parts of the computer. RAM, the chipset, disk drives, and especially the display all suck down power.

I run the PhII at 4 GHz with CnQ and C1E enabled. It idles at 800 MHz and 0.9 V when not in use. So PD would have to beat that as well, which I'm sure it will.

Bulldozer already does better than the Phenom IIs with idle power due to the 32 nm process and aggressive C-states and clock gating. It's the load power that is higher than Phenom II's, and that's what everybody complains about. Piledriver should do even better in both respects.

 

I mean an extra core besides the main cores.
Example:
A tweaked x86 core running at 0.5-1.0 GHz @ ~0.4 V alongside the other 4 powerful cores in the APU (i.e. a 4+1 core APU),
plus support in the OS to automatically and manually switch between the low-power core and full-performance mode. Then there would be no need to bring ARM and such into Windows (apart from ARM's security :??: , I dunno about that 😗 ).
 
^ How about a weak x86 core that uses less power, specially designed for light tasks, with lower IPC and clocks?

Because Windows messes with Turbo and C'n'Q, so they don't work effectively.

You could do that, and clock gate off the "fat" cores until they are needed. You will have some latency involved in firing up the cores, but it shouldn't be any worse than with current CPUs, as they flush caches before they power off. However, thread scheduling will possibly (probably) be an issue with there being one weak core and several much more powerful ones. In theory, running an asymmetrical setup should be pretty straightforward: if the system load exceeds a set amount, fire up a fat core, move everything to it, and then do not schedule anything on the weak core until the fat core's utilization drops below another set amount. At that point, move all threads to the weak core and shut off the fat one. That is pretty much how a sane OS would handle Turbo and CnQ as well, but as you note, Windows stinks at that. MS will surely try to invent some needlessly complex way to address this (probably involving "the cloud" or expensive hardware) and will fail miserably to do what other OSes coded by unemployed guys in their mothers' basements managed to accomplish some time ago.
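
In toy code, that policy is just a hysteresis loop; a rough sketch (the thresholds and core names are made up, and no real scheduler is this simple):

[code]
# Toy sketch of the big/little policy described above: wake the fat core when
# load crosses a high threshold, drop back to the weak core below a low one.
# The threshold values are invented for illustration.

WAKE_FAT_ABOVE  = 0.80   # utilization of the weak core that triggers a wake-up
SLEEP_FAT_BELOW = 0.30   # utilization of the fat core that lets it power down

def pick_core(load, on_fat_core):
    """Decide which core should run the workload for the next interval."""
    if not on_fat_core and load > WAKE_FAT_ABOVE:
        return "fat"     # power up a fat core and migrate everything to it
    if on_fat_core and load < SLEEP_FAT_BELOW:
        return "weak"    # migrate back and clock-gate the fat core
    return "fat" if on_fat_core else "weak"   # inside the hysteresis band: no change
[/code]

The gap between the two thresholds is what keeps it from ping-ponging between cores on a bursty load.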
 
You're completely wrong. Games benefit from clock speed and IPC, both of which improved in PD. Now, will it be better than the Phenom when it comes to gaming? Yes, I think so; not by much, but by 5-10%. I don't even understand why you would say this; of course it's not going to be "identical". Will it beat Sandy? No, but 90% of the time it's your GPU that's the bottleneck; even with my Phenom I can come close to maxing out Skyrim. Overclock Piledriver to 4.6 GHz and you will game fine, maybe not at 120 FPS, but you will game better on PD than you will on a 360/PS3/Wii U.

People, at least try to look things up before making statements like this.

Except you fail to understand that no matter how fast a CPU is, if you are sending more data than the GPU can process, you are not going to see FPS increase. Barring a few engines like Source, or a handful of VERY well threaded games [Lost Planet, etc.], you aren't going to see any difference between BD, PD, SB, or IB in gaming, for no other reason than that the GPU is the slower component.
 
In theory, running an asymmetrical setup should be pretty straightforward: if the system load exceeds a set amount, fire up a fat core, move everything to it, and then do not schedule anything on the weak core until the fat core's utilization drops below another set amount.

Here's where the tablet market could help out the desktop market. With Tegra 3 running Windows 8 RT, Microsoft has potentially had 8+ months of experience with asymmetrical cores.

How much advantage they will take of it is anyone's guess, but they do have to compete with Android and Apple. If the same hardware only gets 8 hours of battery life on Win8 but gets 20 hours on Android, they're going to have a hard sell.

 
Except you fail to understand that no matter how fast a CPU is, if you are sending more data than the GPU can process, you are not going to see FPS increase. Barring a few engines like Source, or a handful of VERY well threaded games [Lost Planet, etc.], you aren't going to see any difference between BD, PD, SB, or IB in gaming, for no other reason than that the GPU is the slower component.
Plus, even in well-threaded or lightly taxing games or engines, there isn't any difference between 100 fps and 130 fps, or 10,000 fps for that matter, when running that many frames isn't even as smooth as running 60 with Vsync.
 
Except you fail to understand that no matter how fast a CPU is, if you are sending more data than the GPU can process, you are not going to see FPS increase. Barring a few engines like Source, or a handful of VERY well threaded games [Lost Planet, etc.], you aren't going to see any difference between BD, PD, SB, or IB in gaming, for no other reason than that the GPU is the slower component.


This is true, you will see little benefit when this is an issue, but when the game does scale with the CPU, PD will most likely show an improvement, nothing too major though. I remember two years ago Tom's did an article showing that an i7 920 and a single 5850 did better in gaming than two 5870s and an Athlon II X3 chip.

http://www.tomshardware.com/reviews/athlon-ii-x3-440-gaming-performance,2619-9.html

 
Plus, even in well-threaded or lightly taxing games or engines, there isn't any difference between 100 fps and 130 fps, or 10,000 fps for that matter, when running that many frames isn't even as smooth as running 60 with Vsync.

What you notice most in a game is not the max FPS but the minimum FPS. When checking benchmarks, that's what I look at most: min/avg FPS. Those periods of heavy calculation are where the "IPC" boost helps the most. Granted, it's not always calculation causing that delay; it can be disk I/O loading new models or general inefficiency in the game engine.

I agree that 100 fps vs 130 fps doesn't make much difference unless you're doing 3D gaming (60 fps per eye), but generally a setup with a higher peak FPS also has a higher minimum FPS.
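
For what it's worth, if you dump frame times yourself (Fraps and the like can log them), the avg/min figures are easy to pull out; a quick sketch with made-up numbers:

[code]
# Sketch: turn a list of frame times (ms) into the avg / min FPS figures
# reviewers quote. The frame times below are invented for illustration.
frame_times_ms = [16.7, 15.9, 18.2, 33.4, 16.1, 17.0, 41.0, 16.5]

fps_per_frame = [1000.0 / t for t in frame_times_ms]
avg_fps = len(frame_times_ms) / (sum(frame_times_ms) / 1000.0)  # frames / total seconds
min_fps = min(fps_per_frame)                                    # worst single frame

print(f"avg: {avg_fps:.1f} fps, min: {min_fps:.1f} fps")
[/code]

The two long frames in that list barely move the average but drag the minimum right down, which is exactly why the min number tells you more about stutter.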
 
This is true, you will see little benefit when this is an issue, but when the game does scale with the CPU, PD will most likely show an improvement, nothing too major though. I remember two years ago Tom's did an article showing that an i7 920 and a single 5850 did better in gaming than two 5870s and an Athlon II X3 chip.

http://www.tomshardware.com/reviews/athlon-ii-x3-440-gaming-performance,2619-9.html

Far Cry 2: CPU bottleneck until ~1080p. Above that, the GPU setup is more important, and the CrossFired GPUs with the Athlon II X3 pull it ahead of the i7 920.

Stalker: CPU bottleneck at 1280x1024, GPU bottleneck at higher resolutions.

Crysis: CPU bottleneck at 1080p, GPU bottleneck at higher resolutions.

WiC: CPU bottleneck at 1080p, GPU bottleneck at higher resolutions.

See the trend line here? Up to a point, a single i7 920 + ATI 5850 overpowers an Athlon II X3 + CF'd ATI 5870s. Above that point, the GPUs are far more important to overall performance, and the GPU configuration starts to matter more. Right around 1080p is where GPU performance starts to become the major overall factor. Below that, raw CPU horsepower matters more.

That's why, when you benchmark CPUs, you HAVE to include low-resolution tests: with the same GPU config at 1080p, most games are going to give near-identical FPS.
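
A crude way to picture it (all numbers invented, just to illustrate the trend above): the frame rate is set by whichever of the CPU or GPU takes longer per frame, and only the GPU side scales with resolution.

[code]
# Toy bottleneck model: frame time = max(CPU time, GPU time). CPU time per frame
# is roughly resolution-independent; GPU time grows with the pixel count.
# The millisecond figures are made up purely to show the crossover.

def fps(cpu_ms, gpu_ms_at_1080p, width, height):
    gpu_ms = gpu_ms_at_1080p * (width * height) / (1920 * 1080)
    return 1000.0 / max(cpu_ms, gpu_ms)

for w, h in [(1280, 1024), (1920, 1080), (2560, 1600)]:
    fast = fps(cpu_ms=8.0,  gpu_ms_at_1080p=12.0, width=w, height=h)   # strong CPU
    slow = fps(cpu_ms=14.0, gpu_ms_at_1080p=12.0, width=w, height=h)   # weak CPU
    print(f"{w}x{h}: strong CPU {fast:.0f} fps, weak CPU {slow:.0f} fps")
[/code]

At the low resolution the two CPUs end up roughly 50 fps apart; at the highest one they land on the same number, because the GPU term dominates the max().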
 
Far Cry 2: CPU bottleneck until ~1080p. Above that, the GPU setup is more important, and the CrossFired GPUs with the Athlon II X3 pull it ahead of the i7 920.

Stalker: CPU bottleneck at 1280x1024, GPU bottleneck at higher resolutions.

Crysis: CPU bottleneck at 1080p, GPU bottleneck at higher resolutions.

WiC: CPU bottleneck at 1080p, GPU bottleneck at higher resolutions.

See the trend line here? Up to a point, a single i7 920 + ATI 5850 overpowers an Athlon II X3 + CF'd ATI 5870s. Above that point, the GPUs are far more important to overall performance, and the GPU configuration starts to matter more. Right around 1080p is where GPU performance starts to become the major overall factor. Below that, raw CPU horsepower matters more.

That's why, when you benchmark CPUs, you HAVE to include low-resolution tests: with the same GPU config at 1080p, most games are going to give near-identical FPS.

Well, I'm just getting into the more technical aspects of UE3 inside TERA, and it seems the game is really badly coded... It barely uses 4 cores, and what spreading there is seems to be Windows bouncing the threads around rather than TERA balancing them out or actually using more than 2 threads.

Here's a long thread of people detailing the problems, lol.

https://forum.tera-europe.com/showthread.php?t=70699&page=31

I'm still playing with the config and found some sweet spots for the game, but I can say this much: a friend of mine has an i7 2600K @ 4.4 GHz with a 7970 and he never lags when I do. I have a GTX 670 and the PhII at 4 GHz (which should be more than plenty for 99% of the games out there @ 1080p), but I get horrible lag spikes in some places (mostly heavily crowded events). This is where the higher IPC of the i7 triumphs over the PhII, no questions asked (or maybe it's UE3's Intel affinity, I don't know), and I'm willing to bet that BD or PD will not solve the problem with more hertz. I will test it at some point, but this gets weirder and weirder (the deeper I go into the dark swamp that is UE3's config, lol).
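
If you want to check whether TERA really is stuck on two threads or whether Windows is just shuffling them around, something like this shows the per-core load and the game's affinity mask while it runs (psutil; "TERA.exe" is my guess at the process name, adjust to whatever Task Manager shows):

[code]
# Sketch: watch per-core load and a game's CPU affinity with psutil.
# "TERA.exe" is a guessed process name; change it to match Task Manager.
import psutil

game = next((p for p in psutil.process_iter(["name"])
             if p.info["name"] and p.info["name"].lower() == "tera.exe"), None)

for _ in range(10):
    per_core = psutil.cpu_percent(interval=1.0, percpu=True)  # one sample per second
    print("per-core load %:", per_core)
    if game:
        print("game affinity:", game.cpu_affinity(), "game CPU %:", game.cpu_percent())
[/code]

If only two of the per-core numbers are ever busy and they hop around from sample to sample, that matches the "Windows bouncing the threads" theory.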

Cheers!

EDIT: Forgot the link xD!
 
Plus, even in well-threaded or lightly taxing games or engines, there isn't any difference between 100 fps and 130 fps, or 10,000 fps for that matter, when running that many frames isn't even as smooth as running 60 with Vsync.

Why do people say this? There is a massive difference in how smooth the game looks and feels. This is coming from experiencing CoD4 at 60 fps vs 250 fps on a 60 Hz screen; I can't even play that game at 60 FPS with Vsync anymore because it looks like it's lagging. If you've got a 120 Hz screen it's even more noticeable; some of my buddies said they can never go back to a 60 Hz screen for gaming.

 
Far Cry 2: CPU bottleneck until ~1080p. Above that, the GPU setup is more important, and the CrossFired GPUs with the Athlon II X3 pull it ahead of the i7 920.

Stalker: CPU bottleneck at 1280x1024, GPU bottleneck at higher resolutions.

Crysis: CPU bottleneck at 1080p, GPU bottleneck at higher resolutions.

WiC: CPU bottleneck at 1080p, GPU bottleneck at higher resolutions.

See the trend line here? Up to a point, a single i7 920 + ATI 5850 overpowers an Athlon II X3 + CF'd ATI 5870s. Above that point, the GPUs are far more important to overall performance, and the GPU configuration starts to matter more. Right around 1080p is where GPU performance starts to become the major overall factor. Below that, raw CPU horsepower matters more.

That's why, when you benchmark CPUs, you HAVE to include low-resolution tests: with the same GPU config at 1080p, most games are going to give near-identical FPS.


Since no one (unless they're on a laptop) really games at resolutions under 1080p (or over, for most people at least; I don't 😉 ), all we're doing is stretching my original point, which is that I can game fine with my 6950 + 1100T. Sure, I have them overclocked, and in return this setup is not the best for performance per watt, but my PSU is running fine and the electricity bill is cheap.

This is why gaming is not first on my list when it comes to buying a CPU. I really can't get myself to spend more than $100-130 on a gaming CPU, but I could see myself spending $400-500 on a beast of a gaming GPU. Since I do some encoding/rendering, though, I can see myself spending up to $250-300 on a CPU, nothing more, and I would rather spend less. I'm really hoping an 8-core Piledriver will compete well and even beat a 4C/HT i7 in encoding/rendering. I also like overclocking and I'm always a person who likes tweaking my stuff, so I hope it has decent efficiency as well, unlike BD.

 
I mean an extra core besides the main cores.
Example:
A tweaked x86 core running at 0.5-1.0 GHz @ ~0.4 V alongside the other 4 powerful cores in the APU (i.e. a 4+1 core APU),
plus support in the OS to automatically and manually switch between the low-power core and full-performance mode. Then there would be no need to bring ARM and such into Windows (apart from ARM's security :??: , I dunno about that 😗 ).


Well, why don't they just take a normal BD/PD core and let the multiplier drop way below the current 7x floor, say to around 2x 😀? Do you think there are any stability issues with the cores below 1.4 GHz? Funny actually, I don't think there were any P4s clocked below 1.4 GHz either :)

What annoys me the most is that AMD can only gate off a whole module, not switch off one core in a module while keeping the FPU and the other core active. I guess it's just too complex to implement. 🙁 If it were possible though, it would lead to even higher single-core Turbos and even lower idle power draw :)
 
Well, why don't they just take a normal BD/PD core and let the multiplier drop way below the current 7x floor, say to around 2x 😀? Do you think there are any stability issues with the cores below 1.4 GHz? Funny actually, I don't think there were any P4s clocked below 1.4 GHz either :)

AMD previously had idle clock speeds of 800 MHz in the Phenom II generation. The rumor I heard as to why they settled on 1.40 GHz as the lowest clock speed was that there were some complaints of lagginess with the idle speed being only 800 MHz. Plus, with the clock gating and C6 sleep states, Bulldozer at 1.40 GHz uses much less power at idle than the lower-clocked Phenom IIs did.
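
(On Linux, at least, you can read the clock floor and the current clock straight out of sysfs if you want to see where a chip actually idles; a quick sketch, assuming the cpufreq driver is loaded:)

[code]
# Sketch (Linux only): read the minimum, maximum and current clock that the
# cpufreq driver exposes for core 0, to see where the idle floor sits.
from pathlib import Path

base = Path("/sys/devices/system/cpu/cpu0/cpufreq")
for name in ("cpuinfo_min_freq", "cpuinfo_max_freq", "scaling_cur_freq"):
    khz = int((base / name).read_text())          # values are reported in kHz
    print(f"{name}: {khz / 1_000_000:.2f} GHz")
[/code]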

There was one P4 clocked below 1.4 GHz: an original Willamette clocked at 1.30 GHz. The Mobile P4-Ms also reduced their clock speeds to 800 MHz or 1.20 GHz at idle.

What annoys me the most is that AMD can only gate off a whole module, not switch off one core in a module while keeping the FPU and the other core active. I guess it's just too complex to implement. 🙁 If it were possible though, it would lead to even higher single-core Turbos and even lower idle power draw :)

I wonder how much of a benefit there would be in switching off one set of integer pipes in a module. The big power-hungry things in the module are the decoder and FPU, and there is only one of each of those per module. I don't work for AMD so I don't know the exact answer, but my guess is that there would be minimal benefit in being able to clock gate off one set of integer pipes. If you could get any increase in single-core Turbo speeds, it would probably be 100 MHz, if that. There would certainly be more complexity in having yet more clock domains to handle, though. And as we all discussed earlier, that would be one more thing for the poor Windows scheduler to goof up.
 
Hmm, depending on how PD turns out, I might end up getting one just to hold me over till next year's system refresh. My biggest concern is how flexible the clock rates are. I've become addicted to using K10stat to boost clock rates higher than they should be for programs that refuse to utilize more than one core.
 
Far Cry 2: CPU bottleneck until ~1080p. Above that, the GPU setup is more important, and the CrossFired GPUs with the Athlon II X3 pull it ahead of the i7 920.

Stalker: CPU bottleneck at 1280x1024, GPU bottleneck at higher resolutions.

Crysis: CPU bottleneck at 1080p, GPU bottleneck at higher resolutions.

WiC: CPU bottleneck at 1080p, GPU bottleneck at higher resolutions.

See the trend line here? Up to a point, a single i7 920 + ATI 5850 overpowers an Athlon II X3 + CF'd ATI 5870s. Above that point, the GPUs are far more important to overall performance, and the GPU configuration starts to matter more. Right around 1080p is where GPU performance starts to become the major overall factor. Below that, raw CPU horsepower matters more.

That's why, when you benchmark CPUs, you HAVE to include low-resolution tests: with the same GPU config at 1080p, most games are going to give near-identical FPS.
The painful truth of this is that, not long ago, those res/CPU/GPU crossover points were different; a much smaller resolution was the tipping point.
GPUs have been outpacing CPUs for their joint workload, and it appears that, at least for the next several nodes, that will continue. It's one more reason Intel is on this bandwagon as well, as that's where growth/performance/potential resides by wider margins.
 

I clocked my old Athlon 7750 down to 130 MHz 😀 (130 MHz FSB x 1 multiplier) on both cores (below 130 MHz FSB it is unstable); still stable, but massively slow. On my 1090T system, though, the lowest CPU multiplier available is 5x with the 200 MHz base clock.


After downclocking my CPU (1090T) to 1.0 GHz (C&Q enabled), it idles at 0.8 GHz, and at those clocks (0.8 and 1.0) the audio from VLC feels like an old scratch disk in an old player
😗
http://www.juneberry78s.com/sounds/78turntable.jpg
 
After downclocking my CPU (1090T) to 1.0 GHz (C&Q enabled), it idles at 0.8 GHz, and at those clocks (0.8 and 1.0) the audio from VLC feels like an old scratch disk in an old player
😗
http://www.juneberry78s.com/sounds/78turntable.jpg

"Old scratch disk in an old player?!" 😱

http://xfinity.comcast.net/blogs/tv/files/2011/06/Pawn_Stars_-_Old_Man_2.jpg


These damn kids nowadays don't even know what a record player and a turntable are and how they are supposed to be used. The music they used to put on vinyl actually sounded good, not that Auto-Tuned garbage kids listen to nowadays. Next thing they will be saying is that the first computer they used was a 2 GHz P4 and that it was "really, really old", and they'll stand around with their jaws dropped when we tell them that cell phones used to actually be used to talk to people instead of texting. Kids these days :pfff:
 
"Old scratch disk in an old player?!" 😱

http://xfinity.comcast.net/blogs/tv/files/2011/06/Pawn_Stars_-_Old_Man_2.jpg

These damn kids nowadays don't even know what a record player and a turntable are and how they are supposed to be used. The music they used to put on vinyl actually sounded good, not that Auto-Tuned garbage kids listen to nowadays. Next thing they will be saying is that the first computer they used was a 2 GHz P4 and that it was "really, really old", and they'll stand around with their jaws dropped when we tell them that cell phones used to actually be used to [strike]talk to people[/strike] emailing, chatting [strike]instead of[/strike] and texting. Kids these days :pfff:

:lol:

cell phones used to actually be used to talk to people
Talk? :heink: What is that? Is it fun? 😗




music players :sol:



Next thing they will be saying is that the first computer they used was a 2 GHz P4 and that it was "really, really old"
My first PC had an Athlon X2 7750 (K10) with only 1 GB of RAM and it was tooooo slow 😉

P4= powerstate 4 😗
:ange:

Found this on Facebook (the only book that I read 😀 ); it fits the situation perfectly 😗
 