AMD Piledriver rumours ... and expert conjecture

Status
Not open for further replies.
We have had several requests for a sticky on AMD's yet to be released Piledriver architecture ... so here it is.

I want to make a few things clear though.

Post a question or information relevant to the topic, or it will be deleted.

Post any negative personal comments about another user ... and they will be deleted.

Post flame baiting comments about the blue, red and green team and they will be deleted.

Enjoy ...
 
I had to say something before this thread degenerates into Skynet taking over.

If AMD doesn't see the hoped-for increases down the road, we will know they made a mistake; it's too early to truly say at this point.

I would add that going forward, with the fusion of GPU and CPU into an SoC, this design may be required, and it's completely to AMD's advantage.
I would suspect so, as AMD does know graphics......
 
hey you malmentaaaaaaaaaal,
having a personality teribaaal,
the one behind all this messs,
is your 'ipc is the key' caaaall.

I know that everyone here is curious to rate it 100 on a scale of 10 😛 , but the mod will not like us going off topic, so you can hit 'thumbs up' 😀

When I use hwinfo32 to test the FPU of my Thuban, it scores higher (50% more) than an i7-980X 😱

The i5-2300 costs 180 while the 955 is at ~125, and you only need to clock the 955 to 3.7-4.2GHz to beat the i5 :) ; also, the 955 is two years old.

After disabling 2 cores on my Thuban and undervolting to 1.2V, cpuwizard says that the TDP is 62W 😱
 
Lol, so true. I'm just going to clock my Phenom to, let's see, 4,000,000GHz since the number seems to be endless. :kaola:
One of the many flaws of the Phenom II architecture is that it's limited to under 4.2GHz for general use; above that, the gains from overclocking become extremely small. Phenom II also doesn't physically go much higher no matter how hard you push it, so it all ends up being OK anyway. Bulldozer doesn't suffer from this: if the transistors can handle it and the heat can be removed fast enough, it can be clocked very high, much higher than any Phenom II chip.

Different archs run at different speeds. Bulldozer seems to do perfectly fine at higher clocks, albeit with the heat and power use. Much of that comes from transistor leakage.

Some people seem unable to see the good in Bulldozer or accept some basic things that could make it good. Weird.
 
Just for kicks, I'm going to compare an FX-8150 and an i7-2600K, at stock, to highlight IPC.

For this discussion, I'm going to take a random benchmark, take the speed/performance difference of the CPUs, and calculate IPC for that benchmark. I'm assuming a CMT core is 80% efficient, an HTT core is 10% efficient, and 100% core loading. [Big assumption there, I know]

http://images.anandtech.com/graphs/graph4955/41709.png

i7 2600:
Speed: 3400
Performance: 119

BD 8150:
Speed: 3600
Performance: 76.2

Reducing to 1:

i7 2600:
Speed: 1
Performance: 1.56

BD 8150:
Speed: 1.06
Performance: 1

Solving for IPC:

i7 2600
Performance = (IPC * Clockspeed) * 4 + ((IPC * Clockspeed) * 4) * .10 [HTT cores]
1.56 = IPC*4 + IPC*.4
1.56 = 4.4IPC
IPC = 1.56/4.4
IPC = .3545

FX 8150
Performance = (IPC * Clockspeed) * 4 + ((IPC * Clockspeed) * 4) * .80 [CMT cores]
1 = 1.06IPC*4 + 1.06IPC*3.2
1 = 4.24IPC + 3.392IPC
1 = 7.632IPC
IPC = 1/7.632
IPC = .1310

Reducing to 1:
BD 8150 IPC = 1
SB i7 2600 IPC = .3545 / .1310 = 2.71

So, assuming full core loading [a huge assumption, I know], SB's IPC is ~2.7 times BD's for this particular benchmark. If not 100% loading, then the results will be skewed even further in favor of SB.

Conclusion: IPC MATTERS.
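
For anyone who wants to plug in other scores, here's the same back-of-the-envelope model as a quick C snippet (same 80% CMT / 10% HTT assumptions as above; "IPC" here is just score per MHz per effective core, not a real measurement):

[code]
/* Back-of-the-envelope IPC estimate, using the same 80% CMT / 10% HTT
 * scaling assumptions as the post above. Purely illustrative. */
#include <stdio.h>

int main(void)
{
    /* Raw numbers from the Anandtech graph linked above */
    const double i7_clock = 3400.0, i7_score = 119.0;  /* i7-2600K */
    const double fx_clock = 3600.0, fx_score = 76.2;   /* FX-8150  */

    /* Effective core counts under the stated assumptions:
     * i7: 4 real cores + 4 HTT threads at 10% -> 4.4
     * FX: 4 "full" cores + 4 CMT cores at 80% -> 7.2 */
    const double i7_cores = 4.0 + 4.0 * 0.10;
    const double fx_cores = 4.0 + 4.0 * 0.80;

    const double i7_ipc = i7_score / (i7_clock * i7_cores);
    const double fx_ipc = fx_score / (fx_clock * fx_cores);

    printf("i7-2600K: %.5f score/MHz/core\n", i7_ipc);
    printf("FX-8150 : %.5f score/MHz/core\n", fx_ipc);
    printf("SB/BD IPC ratio: %.2f\n", i7_ipc / fx_ipc);  /* ~2.7 */
    return 0;
}
[/code]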


Try these benchmarks then ...

7Zip
x264 HD
AES-128

My point is that, depending on the application and compiler used, your IPC changes ... on any of the above there is relative parity ... however, the AMD CPU still has a slightly faster clock.

I do like gaming benchies though ... picking a few of them and averaging the result gives a good general idea.

Remember, IPC is something AMD highlighted when they brought out the Athlons ... and to be fair to Intel, the REPLAY function caused most of the Pentium 4's grief in that regard.

http://en.wikipedia.org/wiki/Replay_system
 
If you look at it that way, the IPC of a 41xx is much higher than an 81xx, and likewise a dual-core SB is much higher than a 2600K. That's not testing instructions per clock per core, that's testing instructions per CPU and dividing the results in a way that's not even remotely accurate.

I'm taking into account that HTT/CMT don't offer 100% of the performance a true core does, so you have to separate them out. And I specifically stated that, for simplicity's sake, I was assuming a 100% full core load.

My point is that, depending on the application and compiler used, your IPC changes ... on any of the above there is relative parity ... however, the AMD CPU still has a slightly faster clock.

Which I specifically stated on several occasions. IPC varies by benchmark.

And I note again, Clockspeed is HALF the equation; AMD has a slightly faster clock, but significantly slower IPC. Thus, unless all the CPU cores are used, BD is slower.

 
IPC is not all that matters, but when it comes to BD, its issue is PURE CORE PERFORMANCE (which means IPC/CPI; BD is already clocked high enough!)

Core Performance = Clock Speed * IPC

If it's clocked 6% higher but is 50% slower per core, guess what? You still end up at roughly half the per-core performance (1.06 × 0.5 ≈ 0.53). Performance sucks.

Most people scream and say the benchmarks aren't done right?? First of all, I never take any type of synthetic benchmark too seriously, only programs that people actually use, such as video encoding, and even there it at best ties with a 4-core processor from Intel that costs less money and produces less heat, while it costs AMD more to make since it has a bigger die size.

Because it's one of the few tasks that actually scales beyond a few cores. In that case, because you get 100% scaling, the extra cores of BD [and much more effective SMT implementation] overcome Intel's per-core performance advantage.

Then I care about gaming, and in that regard BD is worse than the Phenom, and the Phenom can't even compete head-on with first-gen i7s and some high-end i5s.
I remember someone here complaining about skippy FPS in Skyrim when he had a 7970 + FX-8150; there is no excuse for that when an i3 can play it fine.

Which makes sense, since games don't scale well (GPU matters more) and BD's per-core performance decreased.

Again, AMD does not need to have the same per-core performance as Intel. Heck, I would even take 20% less performance per core but twice the cores for the same price, for the future; but not 50% as powerful per core (compared to Intel) with twice the cores.

You assume software scales 100% across those cores. If that were the case, then you wouldn't see i3s being faster in gaming benchmarks.

Again, at 100% CPU load, BD can be competitive with SB, and is probably even faster in pure performance. The problem is SOFTWARE DOES NOT SCALE. If those extra cores don't get used, you get no performance benefit. Hence why BD consistently lags until you get to 6+ cores being used, and then starts to narrow the gap between itself and SB.
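
To put rough numbers on "software does not scale", here's a small Amdahl's law sketch. The parallel fractions are illustrative picks, not measurements of any real game or application:

[code]
/* Rough illustration of "software does not scale", via Amdahl's law.
 * The parallel fractions below are made-up examples, not measurements. */
#include <stdio.h>

/* Speedup on n cores when a fraction p of the work can run in parallel */
static double amdahl(double p, int n)
{
    return 1.0 / ((1.0 - p) + p / (double)n);
}

int main(void)
{
    const double fractions[] = { 0.50, 0.80, 0.95 };
    const int cores[] = { 2, 4, 8 };

    for (int i = 0; i < 3; i++)
        for (int j = 0; j < 3; j++)
            printf("parallel fraction %.2f, %d cores -> %.2fx speedup\n",
                   fractions[i], cores[j], amdahl(fractions[i], cores[j]));

    /* Even at 80% parallel, 8 cores only give ~3.3x, and at 50% parallel
     * the extra cores barely help at all, which is the point above. */
    return 0;
}
[/code]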

I really hope the people working at AMD know better than some of the people in this forum; if not, we won't ever see competition in CPUs again at any price point.

If AMD did what some AMD fanatics on this forum wanted, they'd have been bankrupt long ago. Like it or not, Phenom was a dead end design. BD was at least a step forward, but unless IPC improves or software design is fundamentally reworked, it isn't going to show its power outside the server world.
 
Where are you coming up with 50%? The only time it's even close to 50% is when it's compiled with Intel's help, but I guess that's AMD's fault too.

In all my years of software design, I've not once run into a company using Intel's compiler. Everyone uses MSVC, or in the rare case they also do Linux builds, GCC. The "Intel compiler" argument is the type of argument people make to try and rationalize AMD's poor performance, because it's easier to believe Intel is evil than that AMD screwed up its design.

Right now, AMD fans have moved from depression to rationalization.
 
Erm... 😛

In multithreading, BD will suffer a further loss in performance due to the modular design, so it will need 5.3GHz (maybe up to 5.5, as performance doesn't inc........ blah blah); in single-threaded tasks it will only need around 5GHz.

Your post contains only two words, but still:
""Message edited by malmental on
04-27-2012 at 08:18:57 AM""
:lol:
 
In all my years of software design, I've not once run into a company using Intel's compiler. Everyone uses MSVC, or in the rare case they also do Linux builds, GCC. The "Intel compiler" argument is the type of argument people make to try and rationalize AMD's poor performance, because it's easier to believe Intel is evil than that AMD screwed up its design.

Right now, AMD fans have moved from depression to rationalization.
What does "optimized for Intel" mean then? Easiest thing Intel can do to optimize for their cpu is offer the dev their compiler with a little help to use it.

http://www.intel.co.uk/content/www/uk/en/gaming/overclocking/games-optimized-for-intel.html

If absolutely everyone uses MSVC then why does Intel even have their own compiler?

And here is Intel's fix, following AMD's lawsuit over their "defective compiler":

/QxO -xO Enables SSE3, SSE2 and SSE instruction sets optimizations for non-Intel CPUs

This discussion has already been done, ~10 pages ago, but the fact remains that these "games optimized for Intel" sabotaged AMD support. Can these games be taken seriously as to how well AMD CPUs perform on, say, the compiler everyone else uses?

If you refuse to believe that Intel's agenda is to get everyone to believe how slow AMD is, then there isn't any more we can discuss.
 
I do get the feeling people have a tendency to blow things out of proportion, AMD's Zambezi's can do everything a Intel 2nd/3rd generation Core I can do at the same price point, just a tad slower. What bakes the brain is that this is not a universal, there are instances where AMD's Zambezi's are faster, almost as perplexing as to how one quantifies performance. Zambezi will be regarded as a disappointment and that is granted, but it is not that it pales in insignificance to anything team blue throws out at the price point.




 
What does "optimized for Intel" mean then? Easiest thing Intel can do to optimize for their cpu is offer the dev their compiler with a little help to use it.

http://www.intel.co.uk/content/www/uk/en/gaming/overclocking/games-optimized-for-intel.html

If absolutely everyone uses MSVC then why does Intel even have their own compiler?

And here is Intel's fix, following AMD's lawsuit over their "defective compiler":

/QxO -xO Enables SSE3, SSE2 and SSE instruction sets optimizations for non-Intel CPUs

This discussion has already been done, ~10 pages ago, but the fact remains that these "games optimized for Intel" sabotaged AMD support. Can these games be taken seriously as to how well AMD CPUs perform on, say, the compiler everyone else uses?

If you refuse to believe that Intel's agenda is to get everyone to believe how slow AMD is, then there isn't any more we can discuss.

Optimized for Intel does NOT equate to not optimized for AMD.
 
If absolutely everyone uses MSVC then why does Intel even have their own compiler?

1. Intel's compiler is generally the best compiler for squeezing the last bit of performance out of Intel chips. The commercial-licensed version of Intel's compiler isn't cheap, so Intel makes money from selling their compiler to people who want that last little bit of extra performance.

2. Intel's compiler makes Intel's chips look good (and possibly makes non-Intel chips look bad), so if Intel can get more software compiled with their compiler instead of a third-party compiler such as MSVC, GCC, etc., it makes Intel's CPUs look better than the competition and Intel sells more CPUs.

3. Not everybody uses MSVC. GCC in particular has a good following.

And here is Intel's fix, following AMD's lawsuit over their "defective compiler":

/QxO -xO Enables SSE3, SSE2 and SSE instruction sets optimizations for non-Intel CPUs

This discussion has already been done, ~10 pages ago, but the fact remains that these "games optimized for Intel" sabotaged AMD support. Can these games be taken seriously as to how well AMD CPUs perform on, say, the compiler everyone else uses?

Game performance in current games is largely determined by these factors in this rough order:

1. Your graphics card's performance in that particular game
2. Having an adequate amount of RAM
3. Single-threaded throughput of the CPU cores
4. Having a CPU that can handle more than 2 threads
5. How well a particular game runs on your particular CPU's architecture (hits the FPU a lot vs. not a lot, likes a lot of cache vs. doesn't really care, single-threaded vs. uses a lot of threads well, sensitive to memory latency vs. not, etc.)
6. The performance of your computer's platform (chipset, RAM speed [as long as it's not insanely slow], disks, etc.)
7. Your OS/windowing manager/graphics subsystem
8. What compiler was used to compile the game

There is fairly little real difference in current games between current enthusiast CPUs from AMD and Intel. They can all handle games just fine if paired up with a good GPU. Your GPU is going to be the biggest determinant of performance, period. Sure, you have all seen the graphs in game benchmarks showing that a CPU like the i7-3770K can produce something like 20-30 more FPS than an FX-8150 in a certain title...at 1366x768 resolution with low detail where no chip is pushing less than 100 fps. That sort of benchmark is truly just academic, as the game is obviously playable on any of those CPUs (you need a minimum of only 30 fps) and any real person is going to be using something like 1920x1080, where the differences between CPUs are only a few insignificant FPS.

But, if you wonder why the Intel CPUs are at the top of those "academic use only" low-resolution game benchmarks, it is because games generally only really hit two cores of a CPU very hard. SB/IB has better single-threaded performance per clock than Bulldozer and the lightly-loaded clock speeds aren't that different between the two CPUs. Bulldozer's prowess is in heavily multithreaded applications, which most games are not.

If you refuse to believe that Intel's agenda is to get everyone to believe how slow AMD is, then there isn't any more we can discuss.

Of course Intel's agenda is to get everybody to believe that they are better than AMD. That's exactly the same as when Ford tries to convince everybody that they make better vehicles than GM and the Democrats try to convince everybody that the Republicans are wrong.

Optimized for Intel does NOT equate to not optimized for AMD.

Mmmmm, that's debatable. Intel was supposed to stop that behavior, but there is quite a bit of evidence they are still refusing to use more recent optimizations on non-Intel CPUs. I can understand Intel not supporting AMD-only extensions such as SSE4a in ICC, but ICC should ONLY query the feature flags when deciding which SIMD extensions to turn on. For example, the post above gives an optional switch to allow non-Intel CPUs to use SSE, SSE2, and SSE3. Meanwhile, current AMD CPUs like Bulldozer support SSSE3, SSE4.x, and AVX in addition to SSE/SSE2/SSE3. Not allowing them to be used on a non-Intel CPU when the binary WILL use them on an Intel CPU is in fact crippling the non-Intel CPU.
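
To make that concrete, here's a minimal sketch of the difference between dispatching on the CPU's own feature flags and dispatching on the vendor string. It uses GCC's <cpuid.h> purely as an illustration; it is not how ICC's dispatcher is actually written:

[code]
/* Minimal sketch of feature-flag dispatch vs. vendor-string dispatch.
 * Illustrative only (GCC/Clang on x86); NOT any real compiler's code. */
#include <stdio.h>
#include <string.h>
#include <cpuid.h>

int main(void)
{
    unsigned int eax, ebx, ecx, edx;

    if (!__get_cpuid(1, &eax, &ebx, &ecx, &edx))
        return 1;

    /* The fair way: trust the feature bits the CPU itself reports.
     * (AVX in real code also needs the OSXSAVE/XGETBV check.) */
    int sse2  = (edx >> 26) & 1;
    int ssse3 = (ecx >> 9)  & 1;
    int sse42 = (ecx >> 20) & 1;
    int avx   = (ecx >> 28) & 1;
    printf("SSE2=%d SSSE3=%d SSE4.2=%d AVX=%d\n", sse2, ssse3, sse42, avx);

    /* The complained-about way: look at the vendor string first and only
     * take the fast path on "GenuineIntel", even when the feature bits
     * above say a non-Intel CPU supports the same instructions. */
    char vendor[13] = { 0 };
    __get_cpuid(0, &eax, &ebx, &ecx, &edx);
    memcpy(vendor + 0, &ebx, 4);
    memcpy(vendor + 4, &edx, 4);
    memcpy(vendor + 8, &ecx, 4);
    printf("Vendor: %s\n", vendor);

    return 0;
}
[/code]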
 
I wonder how many people waited and bought the Z77 MoBos... Hell, even waited for IB.

What you said there, works both ways.

Except for overclockers, IB is not a step sideways or even backwards. The 3770K is a bit cheaper than the 2700K, slightly outperforms it in CPU benchmarks and greatly outperforms it in QS and GPU. Plus there's USB 3.0 and PCIe 3.0. Besides, the Z77 mobos did not appear and then just sit around for 6 months waiting for IB to release, unlike AM3+ and BD.

IIRC SB wasn't that great an overclocker either when it first released - I think the average review pegged it around 4.6 or 4.7GHz. Most people expect IB to similarly improve with time.

While IB for desktop does not make sense as an upgrade for anyone with SB, it does for those of us still using C2Q's 😛..
 
games generally only really hit two cores of a CPU very hard.
Random +1 to Valve. 9-year-old games from them push 4 of my cores equally. I love good programming.

On topic: I think that saying BD should have been clocked higher is true. Most every FX-8xxx can reach 4.2GHz or more. I think AMD left the clocks lower because of the already excessive power use. That being said, the fact that my FX-8120's voltage supposedly goes up to 1.55V under max turbo blows my mind. I could probably undervolt the chip to 1.1-1.2V with few problems. Power use must drop some by doing that, so why doesn't AMD do it?
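
For a rough ballpark: at the same clock, dynamic power scales with roughly the square of the voltage, so dropping from 1.55V to 1.2V would cut dynamic power to about 60% of stock, leakage aside. A quick sketch of that arithmetic:

[code]
/* Ballpark for the undervolting question above, using the usual dynamic
 * power approximation P ~ C * V^2 * f. Leakage/static power is ignored,
 * so treat this as a rough illustration, not a measurement. */
#include <stdio.h>

int main(void)
{
    const double v_stock = 1.55;  /* reported max-turbo voltage from the post */
    const double v_under = 1.20;  /* hypothetical undervolt */

    /* Same clock assumed, so dynamic power scales with voltage squared */
    const double ratio = (v_under * v_under) / (v_stock * v_stock);

    printf("Dynamic power at %.2fV vs %.2fV: %.0f%% of stock (~%.0f%% less)\n",
           v_under, v_stock, ratio * 100.0, (1.0 - ratio) * 100.0);
    /* ~60% of stock, i.e. roughly 40% less dynamic power at the same clock.
     * The likely catch: not every chip is stable that low, which is
     * presumably why the factory voltage is set so conservatively. */
    return 0;
}
[/code]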
 