AMD CPU speculation... and expert conjecture


tbh, i've had a suspicion like that as well. when i saw the ps vita's specs and price, i said it would do badly, and people jumped on me singing sony and console praise. smartphones and tablets have successfully disrupted the traditional pc and console ecosystem in many ways, but the current consoles are aiming at the specific areas where smartphones and tablets fall short: party games on a large high-def screen (1080p tv, 3d - gimmicky as it is) at 60 fps, 4k playback, hardcore fps, multiplayer. at least the ps4 has stronger media consumption capability than tablets, and sony can even couple the psv/ps2 with the ps4 the way the wii u does with its tablet-sticks. devs seem to like the ps4 as well... more than win 8 and the wii u, i think. without jobs, ios will have a harder and harder time reinventing itself. the ps4 can easily do well in japan. the u.s. might be a tougher sale, but keep expectations low and you won't be disappointed. :)
still, amd can end up making less money if the deals themselves aren't made in amd's favor. i think this happened with ati before. then i'll blame the people who made those deals... wait, amd already fired them.
 

Cazalan

Distinguished
There's always going to be a bottleneck somewhere. AMD Richland has some "new" dynamic (CPU<>GPU) clocking based on workload. It looks good on paper, but I wonder how good the microcode that handles it will be.
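A minimal sketch of the general idea in Python (purely illustrative, not AMD's actual microcode or algorithm): shift a shared power budget toward whichever side of the chip is busier, and let clocks follow the budget.

# Illustrative toy only, not AMD's real algorithm: split a shared power
# budget in proportion to recent utilization, with a minimum floor per side.
def balance_budget(cpu_util, gpu_util, total_watts=100.0, floor=20.0):
    total_util = cpu_util + gpu_util
    if total_util == 0:
        return total_watts / 2, total_watts / 2
    spare = total_watts - 2 * floor
    cpu_watts = floor + spare * (cpu_util / total_util)
    return cpu_watts, total_watts - cpu_watts

# A GPU-heavy frame pushes most of the budget (and clock headroom) to the GPU.
print(balance_budget(cpu_util=0.3, gpu_util=0.9))  # (35.0, 65.0)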
 

Cazalan

Distinguished


It will come down to marketing and what games are available. With a 5-year upgrade cycle, the new consoles have a decent chance of creating excitement in that sector again. Of course, Microsoft or Sony could kill that with draconian DRM and always-online requirements.
 

viridiancrystal

Distinguished


You're actually not making any sense right now. The GPU could, in theory, compute those problems faster, but if it is already working at full capacity rendering textures and shadows, then having the CPU do the work instead is more efficient.

Imagine you are cooking omelettes with two different pans. One pan can cook an omelette in 5 minutes; the other takes 20 minutes. If the first pan is constantly cooking (at 100% workload), then waiting to cook every omelette in that one pan alone is not faster than also using the other pan. By using both pans, you finish 5 omelettes in 20 minutes instead of 25, i.e. 80% of the time the fast pan alone would need.
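The same arithmetic in code form (a toy sketch using the analogy's numbers, nothing more): greedily hand each job to whichever unit would finish it soonest, and the total time is set by whichever unit finishes last.

# Toy illustration of the omelette analogy in Python: once the fast pan is
# saturated, offloading to the slow pan still shortens the total time.
def finish_time(jobs, fast=5, slow=20, use_slow_pan=False):
    fast_free, slow_free = 0, 0
    for _ in range(jobs):
        # Give the omelette to whichever pan would finish it soonest.
        if not use_slow_pan or fast_free + fast <= slow_free + slow:
            fast_free += fast
        else:
            slow_free += slow
    return max(fast_free, slow_free)

print(finish_time(5))                     # 25 minutes on the fast pan alone
print(finish_time(5, use_slow_pan=True))  # 20 minutes with both pans (80%)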

On topic: Does anyone know when we may see some Richland benchmarks?
 

noob2222

Distinguished


So in other words, you don't like new ways of writing code. Fight the future, make any excuse possible to avoid making the CPU work harder and essentially making dual-core CPUs obsolete; instead let's rely on going SLI/Crossfire so the GPU can handle the extra workload ... oh, that's right, you don't like those either ...

Let's look at your theory another way. Say you get game X in the future with a GeForce 890. At medium settings you're at 100% GPU usage, pushing 120 fps, with the CPU utilizing 2 cores at 75%. Now crank it to ultra and make the GPU do the calculating: the GPU stays at 100%, fps drops to 30, and the CPU still has 6 cores idling (4 being HT) and just 2 working ...

So now, to get the game to actually be playable, we need to run SLI, since the CPU isn't overworked, and so that the i3 is still usable.

"We can't alienate our dual-core CPUs." <--- This way of thinking needs to die in order for programming to move forward. Yes, I know that means your product will not sell to those with i3 CPUs; after all, it's all about making money and not looking to the future.
 

mayankleoboy1

Distinguished
I have a redesigned, highly optimized Pentium 4 processor with power gating on each register. Plus, each part of the chip has a separate voltage plane, so each part can boost its speed according to the load. It can boost 20% over existing P4s for a longer period of time.
It performs in the best-of-class category.

Would you care to buy?

/trolling.
 


No, I'm against ways of coding that artificially slow down processing, because that's exactly what the Crysis devs did.

The rendering code in question that was offloaded to the CPU will execute faster on a GPU. GPUs are gaining performance faster than CPUs. So how does offloading that code to the CPU make ANY sense whatsoever?

Fight the future, make any excuse possible to avoid making the CPU work harder and essentially making dual-core CPUs obsolete; instead let's rely on going SLI/Crossfire so the GPU can handle the extra workload ... oh, that's right, you don't like those either ...

Because of the latency problems, which sites are now starting to reveal. How many threads over the years have been along the lines of "I'm getting 80 FPS on my SLI/CF setup, so why does the game feel choppy?" As long as SLI/CF is implemented by copying one card's VRAM to the other, introducing unacceptable latency into the processing, I don't view it as a "good" implementation.

Secondly, I'm all for using the power CPUs have, provided it DOESN'T SLOW DOWN THE APPLICATION, OPERATING SYSTEM, OR OTHER APPLICATIONS THAT MAY OR MAY NOT BE RUNNING.

Let's look at your theory another way. Say you get game X in the future with a GeForce 890. At medium settings you're at 100% GPU usage, pushing 120 fps, with the CPU utilizing 2 cores at 75%. Now crank it to ultra and make the GPU do the calculating: the GPU stays at 100%, fps drops to 30, and the CPU still has 6 cores idling (4 being HT) and just 2 working ...

Except the GPU wouldn't be that loaded to begin with, especially since my 570 isn't hitting 100% load at medium (V-sync enabled). So those performance numbers you pulled out of your rear aren't even remotely valid.

Secondly, follow this REALLY simple logic:

1: The year-over-year performance increase in GPUs is significantly greater than in CPUs.
2: GPUs excel at executing parallel code.
3: Rendering code is very parallel.

Therefore, which component should handle the processing for rendering?

So look at it this way instead: two years from now, CPUs have increased in power by about 20-25% (not unreasonable at current trends). Meanwhile, GPU performance has doubled (not unreasonable either; look at the performance increases over the past two GPU generations). Now look at that render code that was offloaded to the CPU: you have a CPU bottleneck for every non-8-core processor (and even those are tasked to capacity), while the GPU sits at around 50% load, waiting on the CPU to finish its share of the rendering.

Congrats. You just slowed down your application for no reason whatsoever for 90% of all CPUs on the market, with no performance increase for the other 10%.
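As a toy frame-time model of that projection (hypothetical numbers only, just restating the argument in Python): the slower side sets the frame time, so work pinned to the component that improves more slowly eventually becomes the limit.

# Hypothetical numbers only: frame time is set by the slower side.
def fps(cpu_ms, gpu_ms):
    return 1000 / max(cpu_ms, gpu_ms)

cpu_ms, gpu_ms = 10.0, 12.0                 # today: GPU-bound, ~83 fps
print(fps(cpu_ms, gpu_ms))

# Two years on, per the assumption above: CPU +25%, GPU 2x.
cpu2, gpu2 = cpu_ms / 1.25, gpu_ms / 2.0    # 8 ms vs 6 ms
print(fps(cpu2, gpu2))                      # ~125 fps, now CPU-bound
print(gpu2 / cpu2)                          # GPU busy only ~75% of each frame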

So, if I understand you right, coding in a way that REDUCES FPS is a GOOD thing.

"We can't alienate our dual-core CPUs." <--- This way of thinking needs to die in order for programming to move forward. Yes, I know that means your product will not sell to those with i3 CPUs; after all, it's all about making money and not looking to the future.

Great. How about this, then: we remove all GPUs and move back to executing all code on the CPU, in order to force CPUs to move to 64 cores? There's a reason the entire rendering stack was moved to GPUs in the first place: at rendering, GPUs are faster than CPUs.

Now, if some dev can make a game engine that is more parallel in a way that makes programming sense, I'm all for it. But moving code back to the CPU from the GPU, for tasks that execute faster on the GPU in the first place, and causing a CPU bottleneck instead of a GPU bottleneck, is the WRONG approach.
 
No, I'm against ways of coding that artificially slow down processing, because that's exactly what the Crysis devs did.

The rendering code in question that was offloaded to the CPU will execute faster on a GPU. GPUs are gaining performance faster than CPUs. So how does offloading that code to the CPU make ANY sense whatsoever?

Gamer, you're better than this. Their implementation resulted in their program running faster and utilizing system resources more efficiently than it otherwise would have. Nobody can deny this; it's evident from the various tests and performance profiles that have since been done. For a long time it's been said that multi-core CPUs were underutilized, with most of their capability going to waste. This is most evident when an i3 matches an i5/i7 at the same clock speed even though it has only 50% of the CPU resources. At the same time, GPUs are being pushed harder and harder, mostly because their workload is highly conducive to parallel operations. Crysis 3 is no exception: lots and lots of benchmarks have demonstrated that it scales perfectly with additional GPU resources.

Any sane, rational individual can look at this scenario and see that the balance between CPU and GPU usage is highly lopsided towards the GPU. That's exactly what your i3 + 570 scenario shows (please tell me you didn't actually build this), where you're taking a budget CPU and mating it with a mainstream-performance GPU, or in some people's cases even a high-end GPU. The C3 devs just balanced it out by pushing more onto the CPU to take load off the GPU so it can do more work. In effect they turned the CPU into a co-processor for the GPU, which, while hilarious, is somewhat fitting considering the context.

So while you may take a purist position, and that's your right, you cannot in good faith claim it is an implementation that results in less performance. Benchmarks and performance profiling have already empirically demonstrated otherwise.
 

mayankleoboy1

Distinguished


I think the Crytek devs had to resort to using the CPU for rendering because the GPU in the consoles was getting maxed out. Contrary to what the Crytek devs say, I don't think the PC was really their main platform (and who can blame them, when console sales are higher?).

Anyone remember the Crytek devs saying that "after Crysis 3, there wasn't even 1% spare computation power left in the consoles"?

Regarding doing rendering on 6/8-core processors: PLEASE read up on the LLVMpipe drivers (and how they are good for 2D but awfully inadequate for even semi-serious 3D work). There is a reason the industry went to 1000 tiny 'cores' rather than 6 big cores.

palladin9479 said:
So while you may take a purist position, and that's your right, you cannot in good faith claim it is an implementation that results in less performance. Benchmarks and performance profiling have already empirically demonstrated otherwise.

What do the benchmarks show? That Crysis 3 gets performance gains from more cores and more clock. They do not show that the FPS would be higher if the same work were done on the GPU.


Plus, in the next gen of GPUs, expect a conservative ~30-50% perf increase. When was the last time you saw a 30% improvement in CPU processing? The reason: it's sort of trivial to add moar cores to the GPU, and the workload scales effortlessly.
 
What do the benchmarks show? That Crysis 3 gets performance gains from more cores and more clock. They do not show that the FPS would be higher if the same work were done on the GPU.

Look at the sources posted previously in the thread. On anything with four or more cores the game scales with GPU performance; this is definitely a "GPU-limited" game. It really does seem to use the CPU as a "coprocessor".

http://www.gamegpu.ru/action-/-fps-/-tps/crysis-3-test-gpu.html

http://www.overclock.net/t/1362591/gamegpu-crysis-3-final-gpu-cpu-scaling

http://www.techspot.com/review/642-crysis-3-performance/
 
OK, after doing some digging around, I can see what they did and what makes this different from previous games.

Games tend to have one primary render thread along with a handful of other threads for various tasks that need to be done. Most games implement those secondary threads in a very lock-step, serialized manner: each one needs data from the previous one to do its job. So even though you've created other threads, they can rarely run at the same time. This is where the "games only use, and will only ever use, two threads" idea comes from.

What C3 did was implement those additional threads in a non-serialized manner. There is still one primary render thread that uses as much CPU as it possibly can, and it often needs data from those other threads to do its job. It doesn't need all the data all the time, though, and the exact scene you're looking at largely determines how much extra CPU power is required. The more grass, vegetation, environmental effects, and physics effects present, the more additional CPU power you're going to need to keep that primary thread fed, which in turn keeps your GPU fed.

That explains why dual-core CPUs fall so hard here: they simply do not have the raw processing resources required to keep the GPU fed with data. Once you have four cores, it comes down to the exact scene and how much additional work needs to be done; the more work, the more raw CPU resources are needed (versus single-threaded performance). It also explains the disparity between benchmarks: different scenes, resolutions, detail settings, and camera angles result in different amounts of work to be done.
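A rough Python sketch of that difference (my own toy example, not Crytek's code): worker threads publish results as they finish, and the main thread grabs whatever has arrived each frame instead of waiting lock-step on every stage.

# Toy example, not Crytek's engine: decoupled workers push results into queues;
# the "render" thread consumes whatever is ready and reuses stale data otherwise.
import queue, threading, time

def worker(name, cost, out_q, frames=3):
    for frame in range(frames):
        time.sleep(cost)                      # pretend to do vegetation/physics/etc.
        out_q.put((name, frame))

queues = {n: queue.Queue() for n in ("physics", "vegetation", "particles")}
for name, cost in (("physics", 0.02), ("vegetation", 0.05), ("particles", 0.01)):
    threading.Thread(target=worker, args=(name, cost, queues[name]), daemon=True).start()

latest = {}
for frame in range(3):
    time.sleep(0.03)                          # the main thread's own per-frame work
    for name, q in queues.items():
        try:
            while True:                       # drain: keep only the freshest result
                latest[name] = q.get_nowait()
        except queue.Empty:
            pass
    print(f"frame {frame}: submitted with {latest}")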
 

jdwii

Splendid
Crysis 3 runs better than Crysis 2 on my system and looks better while doing so. I was quite amazed, actually. I'm still not getting how he keeps saying the bottleneck is the CPU when the CPUs are barely being used and the GPU is still at 100% on dual graphics cards; I'm just not seeing gamer's point.
 


More or less correct. The render thread is typically done VERY early in processing so the data can be passed to the GPU. While the GPU is busy creating the current frame, the audio, physics, AI, and UI engines do their work, which will be reflected in the next in-game frame. Since sound, AI, physics, and UI typically aren't high-workload threads, this results in one or two threads doing the vast majority (>90%) of the work.

What C3 did was implement those additional threads in a non-serialized manner. There is still one primary render thread that uses as much CPU as it possibly can, and it often needs data from those other threads to do its job. It doesn't need all the data all the time, though, and the exact scene you're looking at largely determines how much extra CPU power is required. The more grass, vegetation, environmental effects, and physics effects present, the more additional CPU power you're going to need to keep that primary thread fed, which in turn keeps your GPU fed.

There are downsides, though. You get into issues with folding the other threads into the main render thread. Specifically, do the worker threads send the main thread data (in which case the main thread risks stalling if it executes too fast), or does the main thread request data from the workers (same problem, in reverse)? So you need a very robust thread-management system to make a scheme like this work well. Keeping the main thread fed in this scheme is a LOT harder to accomplish. (Basically, this is very close to coding for the PS3: keeping the 6 SPEs fed with data is probably the hardest challenge of coding for that platform.)
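For what it's worth, one generic (not engine-specific) way to handle that stall risk is a small bounded queue with timeouts, so each side can notice the other falling behind and degrade gracefully instead of blocking forever; a minimal sketch, with made-up frame names:

# Generic producer/consumer sketch of the stall problem: the bounded queue
# gives backpressure, and timeouts let either side notice the other is behind.
import queue, threading, time

work = queue.Queue(maxsize=4)                 # small buffer between worker and main thread

def producer(frames=20):
    for frame in range(frames):
        try:
            work.put(f"frame-{frame}-data", timeout=0.01)
        except queue.Full:
            pass                              # drop/coalesce stale work rather than stall

threading.Thread(target=producer, daemon=True).start()

for _ in range(10):
    try:
        print("consumed", work.get(timeout=0.01))   # don't stall the main loop
    except queue.Empty:
        print("no fresh data this frame; reuse last frame's results")
    time.sleep(0.005)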

And again, I dislike moving rendering code off the GPU. That's its own separate debate.


There are other things that should be threaded but typically aren't, due to simplistic implementations. Take SimCity: the game uses really simple pathfinding for traffic, specifically "shortest distance," with a small weight toward highways when applicable. This results in some hilarious traffic patterns, especially when you put two highways side by side: one is gridlocked and the other never used. "Shortest time" pathfinding would solve this issue, but it's a LOT more computationally expensive, since a path would need to be analyzed for EVERY traffic element. While each path is simple to compute, the sheer workload will affect FPS due to the overhead. Something like this is a perfect threading opportunity, as each car and each possible path is more or less independent of the rest. But in today's implementations? Just a simple "shortest distance" equation that takes next to no processing power.
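To make the difference concrete (toy graph and weights of my own, not SimCity's actual code): the pathfinder is identical and only the edge weight changes, but "shortest time" has to be re-run per car as congestion shifts, and each car's query is independent of the others, which is exactly the kind of work that spreads across cores.

# Toy example, not SimCity's code: same Dijkstra, different edge weight.
import heapq

# node -> [(neighbor, distance_km, minutes_at_current_congestion)]
roads = {
    "home":    [("main_st", 2, 30), ("highway", 5, 6)],
    "main_st": [("work", 1, 15)],
    "highway": [("work", 4, 5)],
    "work":    [],
}

def dijkstra(start, goal, weight_index):
    best, heap = {start: 0}, [(0, start)]
    while heap:
        cost, node = heapq.heappop(heap)
        if node == goal:
            return cost
        for nxt, km, minutes in roads[node]:
            w = (km, minutes)[weight_index]
            if cost + w < best.get(nxt, float("inf")):
                best[nxt] = cost + w
                heapq.heappush(heap, (cost + w, nxt))

print(dijkstra("home", "work", 0))  # shortest distance: 3 km via gridlocked main_st
print(dijkstra("home", "work", 1))  # shortest time: 11 minutes via the empty highway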

It's things like this that annoy me, because they actually affect gameplay. That's the type of stuff I want fixed going forward.
 

Yet we have every indication that .... it's working exactly as intended. They somehow "solved" all the problems you indicated, and the result is a high-performance game that actually takes advantage of modern CPU capabilities. On several occasions I've said that in order for programming to move forward, people need to rethink and redefine the problem from the beginning, not try to take an already serialized methodology and make it parallel. Expect more of this over the next few years. The era of "you just need a dual core" is ending.
 


Haha, you have just earned a quote in my sig.

Cheers! :p
 

Cazalan

Distinguished


Ideally, GPUs would do all the work. The world is moving to APUs, though, and their GPUs are going to be relatively constrained compared to discrete cards.
 

griptwister

Distinguished
My thoughts: if you're using a GTX 890, you're not going to be pairing it with a 2600K. Sooo, any more excuses? Or are we going to accept the fact that games will be using more than 4 cores now?

Palladin's comment was genius. Hahaha!
 