AMD CPU speculation... and expert conjecture

Status
Not open for further replies.
On optimizations...

As a practical mater, no console-oriented optimizations will be seen on PC. This has absolutely nothing to do with the hardware and everything to do with the nature of the operating systems involved. Consoles use what's known as a Real Time Operating System (RTOS) while PCs don't. An RTOS is special because the system knows exactly how long each task will take prior to execution and keeps extremely tight timing. Task switching happens on a regular, predictable schedule, and every running program knows exactly how much time it has on any given processor. Because the developer knows this, they can write extremely tight code that wastes very few CPU resources and hits very few systemic bottlenecks or stalls. On a non-RTOS system, programmers have zero guarantees about how much CPU time their program will get, nor can the program predict how long it has before the OS's task scheduler preempts it.

In an RTOS, things like AV and background services are timed so that foreground programs always know the frequency of task swapping and how much CPU time they have. The downside is that an RTOS isn't as flexible in the face of sudden changes in user demand (user-originated task switching). That predictability makes them ideal for application-driven devices: sensors, control systems, media players and other non-multitasking devices. This is also why modern consoles have "dedicated" CPU resources for the OS and background services; it ensures application programmers have a known amount of resources at their disposal.
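To make the "known slice, known order" point concrete, here is a toy sketch (not a real RTOS, and the task names and slice lengths are invented for illustration) of a fixed cyclic schedule, where every task can compute a priori exactly when and for how long it runs:

```python
# Toy illustration of a fixed cyclic schedule: tasks run in a fixed order
# with fixed slice lengths, so timing is fully predictable. All names and
# durations here are made up for the example.
SCHEDULE = [("game_logic", 8), ("audio_mix", 2), ("os_services", 2)]  # (task, ms)
CYCLE_MS = sum(ms for _, ms in SCHEDULE)  # 12 ms major cycle, fixed forever

def slice_start(task):
    """Offset in ms within each cycle at which `task` begins -- knowable a priori."""
    offset = 0
    for name, ms in SCHEDULE:
        if name == task:
            return offset
        offset += ms
    raise KeyError(task)
```

A task like `audio_mix` knows it starts 8 ms into every 12 ms cycle and gets exactly 2 ms, which is the guarantee a desktop OS scheduler never gives you.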

So no matter what hardware is used, the fact that the OSes are radically different than what you would use on a PC is going to prevent the majority of the performance optimizations from carrying over. The only ones that would cross over are the multi-processing ones, so we will definitely see an increase in the number of CPU resources utilized.

As for the perceived "weakness" of the console HW, that is laughable. Current consoles have a ridiculous amount of processing resources at their disposal, much more than the previous generation. Multi-processing isn't anything new, and it's particularly well suited to console RTOS environments. Most of the oft-discussed obstacles to multicore processing in games simply don't apply to consoles, because you always, without question, know exactly what the system will be doing at any given time; it's 100% predictable. Consoles have always used cheap components; price limits are what restrict them so heavily. It's entirely feasible to build a console around a $300 USD GPU and get stunning graphics, but it would be an expensive console that nobody would buy. So instead they use low-priced components, the stuff we would expect inside a $400 USD or less build; hell, Sony even sprang for GDDR5, which is very expensive compared to DDR3. Then there is the power budget to deal with: consoles don't have anywhere near the power inputs nor the thermal dissipation to handle higher-end components. You guys talk about i5-level power, but an i5 CPU is way past a console's capabilities, same with GPU chips.

Simply put, you're not going to see 1080p gaming on a console without dramatically reducing the number of objects and textures rendered on the screen. The resolution increase alone from 720p to 1080p means 2.25 times as many pixels to render per frame. Further increasing the objects, textures, special effects and complexity drives the requirement even higher. So it's not that the console HW isn't keeping up with comparable non-console HW; it's that expectations are increasing at a rate faster than the industry can meet. All because a specific combination of three letters was used as the vendor name.
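The arithmetic behind that pixel-count jump is straightforward:

```python
# Pixel counts behind the 720p -> 1080p resolution jump.
pixels_720p = 1280 * 720    # 921,600 pixels per frame
pixels_1080p = 1920 * 1080  # 2,073,600 pixels per frame
ratio = pixels_1080p / pixels_720p  # 2.25x as many pixels to shade
```

That 2.25x is a floor: it assumes per-pixel cost stays constant, which richer scenes and effects do not.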

And gamer, no way in hell can the 8-core Jaguar be harder to program for than the abomination that was IBM's Cell, and yet the PS3 was a success.
 

Reepca

Honorable
Dec 5, 2012
156
0
10,680


Excellent information, but I'm definitely going to quote you sometime on "As a practical mater" :~).

So would it be a valid statement to say that the console OSes have little to no actual multitasking, and therefore it is very easy and predictable to spread a load across multiple cores, since the program knows it will only ever be trying to run one thing on those cores at a time (or, if it is running more than one thing on those cores at a time, it knows exactly how many and how much time each will take; everything goes exactly as planned, basically)?
 

There can be multitasking; it's just extremely predictable multitasking, with no surprise evictions in the middle of code execution. In the case of consoles they go a step further and segregate the CPU resources so that the application always has a set amount of resources. I believe the PS4 has two of the Jaguar cores dedicated to the OS and background services while the other six are always available for the game, so games are being coded to maximize resource utilization on six cores, not eight. The low-level kernel-mode drivers that manage the hardware run on the OS cores, along with all memory management tasks and downloads. RTOSes are used extensively in device automation; most cars today have an RTOS that runs the engine timing and fuel injection. Calculators, digital watches, smart TVs: those are all examples of an RTOS. Because they are so stripped down and specialized, they can afford to be extremely small. QNX is a well-known RTOS that has a "free for personal use" type download. Try it out if you want to see what the differences are.
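A rough sketch of that core segregation, assuming the 2-OS/6-game split reported above (which is hearsay in this thread, not an official spec). On Linux a process can pin itself to a core set like this; a console does the equivalent in the kernel, invisibly to the game:

```python
import os

# Hypothetical PS4-style partition of 8 Jaguar cores.
OS_CORES = {0, 1}                      # reserved for OS + background services
GAME_CORES = set(range(8)) - OS_CORES  # the six cores the game can count on

# On Linux, restrict the current process to the "game" cores. Guarded so the
# sketch still runs on machines with fewer cores or without this syscall.
if hasattr(os, "sched_setaffinity"):
    allowed = GAME_CORES & os.sched_getaffinity(0)
    if allowed:
        os.sched_setaffinity(0, allowed)
```

The point is not the syscall itself but the invariant: the game always sees the same six cores, never contending with a download or an OS service.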
 

sapperastro

Honorable
Jan 28, 2014
191
0
10,710
Bethesda hasn't progressed much in the programming department in years. I am totally unsurprised at how poorly that game runs. I think too many people are blaming the consoles rather than the sloppy programming (more like brute-force porting) of these developers.

Does nobody find it amusing to compare the specs required for The Evil Within with those for the new Dragon Age?

I suppose everyone has forgotten GTA4? There are many other examples of piss-poor porting efforts from before the latest consoles.
 

Embra

Distinguished


I have really been looking forward to Dragon Age on PC. I just hope they don't ruin it:

http://venturebeat.com/2014/10/15/dragon-age-inquisition-producer-talks-about-xbox-one-vs-ps4-performance-framerate-and-more/
 

blackkstar

Honorable
Sep 30, 2012
468
0
10,780
I think some of you are forgetting that regulations regarding power consumption have changed a lot.

The PlayStation 3 would draw around 200W when gaming:
https://en.wikipedia.org/wiki/PlayStation_3_technical_specifications#Form_and_power_consumption

The PlayStation 4 peaks at 140W when gaming and using menus:
https://en.wikipedia.org/wiki/PlayStation_4_technical_specifications#Power_usage

The PlayStation 3 uses ~43% more power than the PlayStation 4, and the EU is clamping down on the power consumption of these sorts of devices:
http://ec.europa.eu/enterprise/policies/sustainable-business/ecodesign/product-groups/sound-imaging/files/console_maker_proposal_annex_en.pdf
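The ~43% figure checks out against the two Wikipedia numbers above:

```python
# PS3 vs PS4 gaming power draw, from the figures cited above.
ps3_watts, ps4_watts = 200, 140
extra_pct = (ps3_watts - ps4_watts) / ps4_watts * 100  # PS3 draws ~42.9% more
```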

Here is some more research on the subject. I cannot find original Xbox or PlayStation numbers; I think that's because at the time nobody cared about power consumption.
https://wpweb2.tepper.cmu.edu/ceic/pdfs/CEIC_11_01.pdf

As for optimized software, I have Gentoo on my A4-5000 laptop, which runs at 1.55 GHz. In the Blender BMW benchmark it scores around 12 minutes with 4 cores, so 8 cores should be around 6 or 7 minutes.

Similar times include (from the Blender BMW thread):
Intel Core i5-2410M 2.30GHz: 6 min 36 sec (CPU, 4 threads)
Intel Core 2 Quad 2.9GHz: 8 min 9 sec (CPU)
AMD Phenom II X4 960T 3.0GHz: 5 min 15 sec
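The "6 or 7 minutes" estimate is simple core scaling; a back-of-envelope version, where the 85% efficiency figure is an assumption I've added to show why perfect halving is optimistic:

```python
# Naive core-scaling estimate for the Blender BMW times above.
def scaled_minutes(minutes, cores_from, cores_to, efficiency=1.0):
    """Render time after scaling core count, with an efficiency fudge factor."""
    speedup = (cores_to / cores_from) * efficiency
    return minutes / speedup

ideal = scaled_minutes(12, 4, 8)            # 6.0 min at perfect scaling
realistic = scaled_minutes(12, 4, 8, 0.85)  # ~7.1 min at an assumed 85% efficiency
```

Tile-based renderers like Blender's scale close to linearly, which is why the honest range is "6 or 7" rather than a single number.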

Clearly, considering the new console power requirements, the chip is not so bad. And that's with generic x86 code, letting GCC do all the optimizations, instead of optimizing for the hardware itself.

I think some of you are underestimating things. I tried as much as possible to pick numbers from a generic Windows environment to compare to an optimized Linux one, because I am quite sure the vast majority of gaming systems are still Windows systems.

Cat cores are very good if you consider that power consumption is a huge part of their performance metric.
 

szatkus

Honorable
Jul 9, 2013
382
0
10,780


Don't worry about that one. Compilers have been doing that for us for many, many years. Devs rarely write assembly, because compilers are usually better at it than people. The advantage of consoles is that the program is compiled for one specific CPU; on PC, one binary must work even on 7-year-old CPUs.
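A sketch of how PC software squares that "one binary, many CPUs" circle: ship several code paths and pick one at runtime based on detected CPU features (compilers automate this as function multi-versioning). The feature names and functions below are illustrative, not a real API:

```python
def dot_generic(a, b):
    # Baseline path any old x86 can run.
    return sum(x * y for x, y in zip(a, b))

def dot_vectorized(a, b):
    # Stand-in for the wider SIMD path a compiler would emit for newer
    # CPUs; same result, just faster on real hardware.
    return sum(x * y for x, y in zip(a, b))

def select_dot(cpu_features):
    # Dispatch once, at load time, based on detected CPU features.
    return dot_vectorized if "avx" in cpu_features else dot_generic

dot = select_dot({"sse2", "avx"})  # on an AVX machine, picks the fast path
```

A console compiler skips the dispatch entirely and emits only the one path the known CPU supports.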
 
Very interesting information.

I totally forgot about the consoles having RTOSes. Good point, palladin. And I never knew the PS4 (and prolly the XB1) had to meet new power requirements from the EU. That explains the HW decisions to some extent, but doesn't justify the shenanigans MS and Sony are pulling, according to those devs (from Ubricrap, was it?).

Now then, moving the conversation a little to Linux and the SteamOS effort. Given the big array of HW combinations available for a PC, the chances of Gabe & Co. getting the Linux kernel to an RTOS level of "preemptivity" are almost nil. The closest would be simplifying the kernel to the point where you get optimized drivers, and modularizing it to the point where you just load what you need and play with the slices of CPU time (much like BFS tries to).

That is just from the design perspective, but as for the CPUs themselves, I'm sure Intel CPUs (like it or not) would have better performance in games thanks to cache latencies and prediction. Now, something I don't really remember: does branch prediction exist in RISC uArchs?

Cheers!
 

Cazalan

Distinguished
Sep 4, 2011
2,672
0
20,810


Most reasonable people realize that, but whenever software engineers are on one team and hardware engineers on another, one always blames the other. Both can be right at times, but hardware deadlines are generally stricter and compromises must be made. 150W is a rather low target for a design that probably reached closure a full 2 years ago now.

We're still only a year into these new consoles. Plenty of tricks/optimizations to learn still. They said they started at 9fps @720p right? lol
 

szatkus

Honorable
Jul 9, 2013
382
0
10,780

That's just business crap. Technically, consoles shouldn't bottleneck PC games for a few years.



I don't think so. A vendor must control the whole platform to do things like that. SteamOS will run on Intel/AMD + Radeon/GeForce + OpenGL + "normal" Linux drivers. They may do some optimizations in a few places, but nothing spectacular.


Which RISC? Cortex? Cell? Sure, it's nothing new in the CPU world.
 


Actually, you still see a LOT of cute optimizations on consoles. For example, there was a problem discovered recently in the Dolphin emulator with a UE3-based title. Apparently the devs for that game were setting a HW-defined bit in the GPU and gaining performance via an undocumented feature of the GPU HW. That's the type of thing you NEVER see on PCs, since the HW can change, but on consoles it's still done to squeeze out performance. This example may be somewhat extreme, but it's not like 2009 is the dark ages.

Coding on embedded HW is a totally different world. You're expected to have an understanding of how the HW works, and of how to best take advantage of it.
 

szatkus

Honorable
Jul 9, 2013
382
0
10,780


A few years ago I read an article about the most insane console optimizations (of course it was about consoles like the Dreamcast or GameCube, because of NDAs). It's amazing what those guys do to squeeze everything from those devices :D
 


Gotta love those "undocumented features". I f*cking hate when devs use those and then I have to go figure out what broke after a system update. PC devs still love to use them, but it's software "features" instead of hardware ones: screwing around with weird parameters for functions, or trying to outsmart memory management and making unfounded assumptions.
 

colinp

Honorable
Jun 27, 2012
217
0
10,680
Lisa Su has delivered the bad news: missed targets, red numbers almost everywhere, and a 7% headcount reduction. And no sign of how AMD is going to get itself back on track.

It has massive problems, really. Intel is miles ahead, Nvidia has made everything that isn't Maxwell obsolete, and ARM is new, unproven ground.

I think that AMD as we know it may not exist in a couple of years.
 

Cazalan

Distinguished
Sep 4, 2011
2,672
0
20,810



The report wasn't as bad as I thought it would be. They were actually profitable in Q3. I was expecting a loss after Rory was fired.

They had to bring the headcount down to what their business model will support. Trim the SKUs a bit. They did get their goal of 2 new semi-custom design wins (1 x86, 1 ARM).


 

colinp

Honorable
Jun 27, 2012
217
0
10,680
The worrying thing is that their forecasts are very pessimistic, and there is nothing to counter very strong products from their competitors. I don't know what AMD is doing in the consumer space, to be honest. Sure, they won Apple, but there are no Kaveri laptops available in the UK and it's been MONTHS since that launched. The job of the CEO is to pick up the phone to other CEOs and make things happen, so I hope Su has some friends in the industry.
 

cemerian

Honorable
Jul 29, 2013
1,011
0
11,660

ROFL, do you have any idea how slow those Jaguars actually are? They are meant to compete with Atoms:
http://www.anandtech.com/show/6974/amd-kabini-review/3
 


Remember early in the PS3's life, how it seemed every Firmware upgrade from Sony broke some game or another? I'd wager that was caused by some "aggressive programming" on the part of Devs.
 

blackkstar

Honorable
Sep 30, 2012
468
0
10,780


Yeah, well, that's sort of expected. Their big x86 cores are in a bad spot. They have been depending on Piledriver for two years now, and it looks set to continue. It's killing them in servers. Desktop is also losing market share. I don't expect much chance of a change in course until Zen comes out, and even then it needs to be competitive.

AMD has been focusing only on APUs and mobile, and this quarter reflects how that strategy is not viable. The whole "the future is mobile, big cores are dead" idea is the strategy AMD is following right now, and it's not working.

Missing out on the big CPU margins in HEDT, server, workstation, etc. is hurting them badly. You can see they are trying to make it up by charging a premium for the 7850K, but that's not nearly enough; 2M/4C is too weak for a decent gaming system.

Their mobile is fighting Intel contra-revenue as well as ARM. They can't find many good design wins at all. They are still selling two-year-old Tahiti chips as new products, while Nvidia has released new ones.

They are in a bad position, and the worst part of it is their HEDT and professional markets. So this quarterly report should show you all that APU-and-mobile-only is not a future for AMD. I would expect APU + mobile + ARM to yield similar results. The truth is they need to be competitive in the x86 HEDT and professional markets to be profitable.



Please see my other post. There are new power requirements in the EU that forced Sony and MS to go for a system in the ARM/Atom/Jaguar power envelope. You are not going to see a 200W-peak console again in your life, so get used to consoles lagging further behind HEDT.

If you are ever expecting a modified HEDT CPU in a console again, you're going to be disappointed. The CPUs in the PS4 and Xbone are limited to around 30W; a Pentium G3220 is a 53W CPU.

The Pentium III was around 30W, but x86 CPUs have since gotten much larger and much more power hungry.

The absolute best Intel could have done would be a mobile Haswell 4c/8t at 2.2GHz, and that's at 37W. And then they'd need another GPU from somewhere. But I shouldn't have to tell you that Haswell wasn't out by the time the PS4 and Xbone were being developed; it would really have to be a Sandy Bridge-era chip.

http://ark.intel.com/products/codename/29900/Sandy-Bridge#@All

Their options at the time were:

A weak, old ARM core
Jaguar
Intel Atom
An Intel SB-era chip with 2c/4t at under a 2.5GHz base clock

Jaguar was the best they could do. Not to mention an Intel CPU in a console would probably be pretty expensive.
 

con635

Honorable
Oct 3, 2013
644
0
11,010

Power consumption is pretty and all, but in HEDT who really cares, especially if it lacks brute-force performance? For me Maxwell has been a letdown, and Nvidia had to price it the way they did; it's a testament to their marketing that people are buying them. Also, rumor has it that Nvidia have been forced to use AMD's HBM due to HMC delays, and that AMD have 1 year of exclusive use. Another reason for aggressive pricing, maybe?

 

con635

Honorable
Oct 3, 2013
644
0
11,010

I've seen similar 4K results, with the 290X in a clear lead; can't remember where though. I just find the NV pricing strange (although here in the UK it seems sellers have added their own NV tax), and NV using/licensing AMD's memory tech; it's not like them.
edit: here we go:
http://www.gamegpu.ru/images/stories/Test_GPU/Action/Ryse_Son_of_Rome/test/Ryse_3840.jpg


 

jdwii

Splendid
^ Dude, it's one game. Also, on my comments about Jaguar being a weak CPU, I stick to that point, and I once again say it's not anything to brag about. It was picked for power consumption reasons. With HBM memory and APUs, I think we all know what the next-gen consoles will bring.

http://www.techpowerup.com/reviews/NVIDIA/GeForce_GTX_980/26.html

Again, never pick one game to say what is better or not. In general the 980 series is amazing, and it's made on the same fabrication process; AMD should have something soon to compete with it.
The 970 alone is better than a 290X at 1080p gaming; don't take my word for it, look at the results.
 