Lock-free programs have their own issues to overcome. Using your own links:
http://www.eetimes.com/design/embedded/4214763/Is-lock-free-programming-practical-for-multicore-?pageNumber=2
What CAS does is to compare the current content of var with an expected value. If it is as expected, then var is updated to a new value. If it is not as expected, var is not updated. All of this is done as a single noninterruptable, nonpreemptable operation.
The way this is used in the development of lock-free code is to begin by copying the original value of a variable (or pointer) into expected. Then a second copy is used in a (possibly long and complex) calculation to arrive at a new value for the variable. But during the time of the processing, another task—perhaps on another core—may have already modified the variable from its original value. We would like to update the variable with the new value only if it has not been "tampered with" by another task in the interim. CAS does the combined check and update together, without the need for a locking mechanism.
Emphasis mine. So if two different threads want to access my geometry matrix, you are telling me I have to make a copy OF THE ENTIRE WORLD GEOMETRY? Good luck with that. Sure, if you assume infinite memory and trivial copy times, you could get away with it, but for large data structures the approach is infeasible. Regardless, memory usage will increase linearly with the number of threads. The more threads, the more memory you use. The more memory you use, the more likely data will be paged to disk. The more data paged to disk, the more likely you'll lose 100,000 or so clock cycles cleaning up the resulting page fault.
Secondly, you have the dreaded ABA problem:
http://en.wikipedia.org/wiki/ABA_problem
For simple operations (increment/decrement), we already have interlocked operations that are guaranteed to be atomic, so for those at least, locking isn't a significant issue.
For reference:
http://en.wikipedia.org/wiki/Lock-free_and_wait-free_algorithms
All the algorithms have their own problems. There's no magic bullet here.
Wait-Free: Slow, and eats RAM
Lock-Free: Executing thread slows down (Is the cost of decreased thread performance worth it in order to avoid locks? Depends on the app in question)
Obstruction-freedom (or Optimistic Concurrency Control, for you database guys): Sucks performance if a conflict is detected (fast otherwise, but the performance cost in the case of a conflict is SIGNIFICANTLY more than the cost of locks in most cases)