AMD CPU speculation... and expert conjecture


anxiousinfusion

Distinguished
Jul 1, 2011
1,035
0
19,360
http://blogs.amd.com/fusion/2013/04/29/amds-much-beloved-fx-product-line-welcomes-two-new-additions-to-the-lineup/

This part is all about bringing maximum multitasking with up to 10% more performance

So is this the new metric by which the Steamroller cores must exceed Piledriver?
 

8350rocks

Distinguished
That's talking about performance of PD vs. BD. A 10% gain for PD over BD is just a little light, I think, but it's OK.

[iii] AMD FX 4350 or AMD FX 4300 Processor with AMD Radeon™ HD 6670 with 2x4GB DDR3-1600, 990FX Motherboard, Windows 7 64bit, Driver 8.982. Benchmarking by AMD Labs using Cinebench 11.5 (3.64 vs 3.3), MainConcept 2.1 measuring time to transcode hd to flash 720p (156.6s vs 171.74), POV Ray 3.7 (3154 vs 2859), Handbrake time to transcode file WMV 1920×1080.wmv to MP4 High Profile (614s vs 555.33s). DVT-8

There's the fine print.
 


Ryan Smith is a bit of a tit. The ASRock A85X ITX has been out for 3 months and is barely more expensive than that MSI, and yet he has no knowledge of it. Oh wait, AnandTech had an article about this board. SMH.

The APU platform is flexible enough to throw in a discrete HD 7790 to lower the power draw relative to a 7850, while also offering Dual Graphics with a much cheaper 6670. Granted, none of those will beat the 3570K, but they will all be significantly cheaper. And at the very least AMD is ready for gaming on integrated graphics; the same can't be said of the Intel part, where gaming is a stutter fest.

 

juanrga

Distinguished
BANNED
Mar 19, 2013
5,278
0
17,790


But isn't it one of Intel's marketing tricks to label GPUs with different performance as HD 4000?

The HD 4000 on an i3 will not perform like the HD 4000 on an i7.
 

juanrga

Distinguished
BANNED
Mar 19, 2013
5,278
0
17,790


What will the TDP be? 375W? :)
 

juanrga

Distinguished
BANNED
Mar 19, 2013
5,278
0
17,790


Well, OK. It is AnandTech and I would add a disclaimer due to their well-known bias...

If I were to build an A10-based rig, I would not use their hardware recommendations. I did not like the chosen mobo, case... Regarding performance, I would use a fast 2133 G.Skill kit because it is almost as cheap in my country.

But my point was simply that the A10 is a good choice for an SFF build, and many people are using Trinity chips in their builds.
 

8350rocks

Distinguished


Just FYI: that's the system being utilized in the PS4 and Kabini already... in the SR APUs it should be good for a big performance increase over Trinity/Richland.
 

anxiousinfusion

Distinguished
Jul 1, 2011
1,035
0
19,360


A juicy bit for those of you that didn't read the article:

"people will be able to build APUs with either type of memory [DDR or GDDR] and then share any type of memory between the different processing cores on the APU."

For a second, I thought they were going to say both memory types can be used simultaneously. But it seems their GDDR5 pitch will only be one of two options.
 


Medium Settings @ 720p != Maxed Out
 


We've had consoles with lots of cores for a LONG time now, remember?

Xbox 360: Tri-core with 2-way SMT (2 threads per core = 6 threads total)
PS3: 1 PPE plus 6 SPEs available for developer use (a seventh SPE reserved for the OS, the eighth disabled due to low yields)

So the hardware added two more cores. I suspect one will be reserved for the OS in both the Infinity's and PS4's case, leaving 7 (one more than current consoles). In the Infinity's case, you also have the Kinect, which may eat up a core all by itself, bringing you to six (same as the 360).

Odds are you'll see some scaling from the more advanced features that more powerful consoles enable (better physics, etc.) making their way into PC ports, but I'm not expecting anything magical.
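To put those thread counts side by side, here's a quick back-of-the-envelope sketch in Python; the OS and Kinect reservations are my assumptions based on the post above, not confirmed figures:

```python
# Rough thread budget for game code; reservation counts are assumptions
# from the discussion above, not official figures.
def game_threads(cores, smt=1, reserved=0):
    """Hardware threads left over for game code."""
    return cores * smt - reserved

consoles = {
    "Xbox 360 (3 cores, 2-way SMT)": game_threads(3, smt=2),
    "PS4 (8 cores, 1 reserved for OS?)": game_threads(8, reserved=1),
    "Next Xbox (8 cores, OS + Kinect?)": game_threads(8, reserved=2),
}

for name, threads in consoles.items():
    print(f"{name}: ~{threads} threads for games")
```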
 

8350rocks

Distinguished


You can't play it with HD4K maxed out on any Intel CPU either... so what was your point? It's just as playable on either platform... at similar settings, no less...
 

8350rocks

Distinguished


Actually... Sony has a separate CPU on board to deal with the OS; according to Mark Cerny, the hardware specified at release will be 100% dedicated to games only. He also states that there will be massive parallel capabilities in the hardware (Note: Sony is a founding member of HSA). The system also utilizes hUMA, or heterogeneous unified memory architecture, allowing the CPU and GPU to act in unison on the exact same memory addresses. The GPU will also have much higher GPGPU capabilities, with a system bus running directly from it to the memory.

http://www.gamasutra.com/view/feature/191007/inside_the_playstation_4_with_mark_.php

Mark Cerny discusses unified memory here:

"The 'supercharged' part, a lot of that comes from the use of the single unified pool of high-speed memory," said Cerny. The PS4 packs 8GB of GDDR5 RAM that's easily and fully addressable by both the CPU and GPU.

If you look at a PC, said Cerny, "if it had 8 gigabytes of memory on it, the CPU or GPU could only share about 1 percent of that memory on any given frame. That's simply a limit imposed by the speed of the PCIe. So, yes, there is substantial benefit to having a unified architecture on PS4, and it’s a very straightforward benefit that you get even on your first day of coding with the system. The growth in the system in later years will come more from having the enhanced PC GPU. And I guess that conversation gets into everything we did to enhance it."
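Cerny's "about 1 percent per frame" claim is easy to sanity-check. A rough sketch, using an assumed ~8 GB/s of effective PCIe bandwidth and 60 fps (both numbers are illustrative, not from the article):

```python
# Back-of-the-envelope check of how much of an 8 GB pool a discrete GPU
# could touch per frame over PCIe. Bandwidth and frame rate are assumed,
# illustrative values, not figures from the article.
total_memory_gb = 8.0
pcie_bandwidth_gbps = 8.0    # assumed effective PCIe throughput, GB/s
frames_per_second = 60

gb_per_frame = pcie_bandwidth_gbps / frames_per_second
share = gb_per_frame / total_memory_gb

print(f"~{gb_per_frame * 1024:.0f} MB transferable per frame "
      f"(~{share:.1%} of the 8 GB pool)")
# For comparison, 176 GB/s of unified GDDR5 would let roughly 37% of the
# pool be touched per frame by the same arithmetic.
```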

Here he discusses the GPGPU in the PS4 in greater detail:

Cerny envisions "a dozen programs running simultaneously on that GPU" -- using it to "perform physics computations, to perform collision calculations, to do ray tracing for audio."

But that vision created a major challenge: "Once we have this vision of asynchronous compute in the middle of the console lifecycle, the question then becomes, 'How do we create hardware to support it?'"

One barrier to this in a traditional PC hardware environment, he said, is communication between the CPU, GPU, and RAM. The PS4 architecture is designed to address that problem.

"A typical PC GPU has two buses," said Cerny. "There’s a bus the GPU uses to access VRAM, and there is a second bus that goes over the PCI Express that the GPU uses to access system memory. But whichever bus is used, the internal caches of the GPU become a significant barrier to CPU/GPU communication -- any time the GPU wants to read information the CPU wrote, or the GPU wants to write information so that the CPU can see it, time-consuming flushes of the GPU internal caches are required."

The list of major adaptations to the hardware by Sony specifically for the PS4:

The three "major modifications" Sony did to the architecture to support this vision are as follows, in Cerny's words:
•"First, we added another bus to the GPU that allows it to read directly from system memory or write directly to system memory, bypassing its own L1 and L2 caches. As a result, if the data that's being passed back and forth between CPU and GPU is small, you don't have issues with synchronization between them anymore. And by small, I just mean small in next-gen terms. We can pass almost 20 gigabytes a second down that bus. That's not very small in today’s terms -- it’s larger than the PCIe on most PCs!
•"Next, to support the case where you want to use the GPU L2 cache simultaneously for both graphics processing and asynchronous compute, we have added a bit in the tags of the cache lines, we call it the 'volatile' bit. You can then selectively mark all accesses by compute as 'volatile,' and when it's time for compute to read from system memory, it can invalidate, selectively, the lines it uses in the L2. When it comes time to write back the results, it can write back selectively the lines that it uses. This innovation allows compute to use the GPU L2 cache and perform the required operations without significantly impacting the graphics operations going on at the same time -- in other words, it radically reduces the overhead of running compute and graphics together on the GPU."
•Thirdly, said Cerny, "The original AMD GCN architecture allowed for one source of graphics commands, and two sources of compute commands. For PS4, we’ve worked with AMD to increase the limit to 64 sources of compute commands -- the idea is if you have some asynchronous compute you want to perform, you put commands in one of these 64 queues, and then there are multiple levels of arbitration in the hardware to determine what runs, how it runs, and when it runs, alongside the graphics that's in the system."
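Cerny's second point, the 'volatile' bit, is the easiest to picture with a toy model. This is purely an illustration in Python of selective invalidation and write-back; it is not Sony's or AMD's actual cache hardware:

```python
# Toy model of the 'volatile' tag bit Cerny describes: compute marks its
# cache lines volatile, so invalidation/write-back can touch only those
# lines and leave graphics lines in the GPU L2 untouched.
class CacheLine:
    def __init__(self, addr, data, volatile=False):
        self.addr, self.data, self.volatile = addr, data, volatile
        self.dirty = False

class ToyL2:
    def __init__(self):
        self.lines = {}

    def write(self, addr, data, from_compute=False):
        line = self.lines.setdefault(addr, CacheLine(addr, data, from_compute))
        line.data, line.dirty = data, True
        line.volatile = line.volatile or from_compute

    def invalidate_volatile(self):
        """Drop only compute-tagged lines before compute re-reads memory."""
        self.lines = {a: l for a, l in self.lines.items() if not l.volatile}

    def writeback_volatile(self, memory):
        """Flush only dirty compute-tagged lines back to system memory."""
        for line in self.lines.values():
            if line.volatile and line.dirty:
                memory[line.addr] = line.data
                line.dirty = False

memory = {}
l2 = ToyL2()
l2.write(0x100, "graphics texel")                      # graphics traffic
l2.write(0x200, "physics result", from_compute=True)   # async compute traffic
l2.writeback_volatile(memory)    # only 0x200 is flushed to memory
l2.invalidate_volatile()         # graphics line at 0x100 survives
print(memory, list(l2.lines))
```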

Discussing launch titles:

"The launch lineup for PlayStation 4 -- though I unfortunately can’t give the title count -- is going to be stronger than any prior PlayStation hardware. And that's a result of that familiarity," Cerny said. But "if your timeframe is 2015, by another way of thinking, you really need to be doing that customization, because your competition will be doing that customization."

So while it takes "weeks, not months" to port a game engine from the PC to the PlayStation 4 according to Cerny, down the road, dedicated console developers can grasp the capabilities of the PlayStation 4, customize their technology, and really reap the benefits.

"There are many, many ways to control how the resources within the GPU are allocated between graphics and compute. Of course, what you can do, and what most launch titles will do, is allocate all of the resources to graphics. And that’s perfectly fine, that's great. It's just that the vision is that by the middle of the console lifecycle, that there's a bit more going on with compute."

Discussing dedicated units to reduce game overhead on the hardware:

Freeing Up Resources: The PS4's Dedicated Units

Another thing the PlayStation 4 team did to increase the flexibility of the console is to put many of its basic functions on dedicated units on the board -- that way, you don't have to allocate resources to handling these things.

"The reason we use dedicated units is it means the overhead as far as games are concerned is very low," said Cerny. "It also establishes a baseline that we can use in our user experience."


"For example, by having the hardware dedicated unit for audio, that means we can support audio chat without the games needing to dedicate any significant resources to them. The same thing for compression and decompression of video." The audio unit also handles decompression of "a very large number" of MP3 streams for in-game audio, Cerny added.

Here he discusses how they addressed the system's bottlenecks:

One thing Cerny was not at all shy about discussing are the system's bottlenecks -- because, in his view, he and his engineers have done a great job of devising ways to work around them.

"With graphics, the first bottleneck you’re likely to run into is memory bandwidth. Given that 10 or more textures per object will be standard in this generation, it’s very easy to run into that bottleneck," he said. "Quite a few phases of rendering become memory bound, and beyond shifting to lower bit-per-texel textures, there’s not a whole lot you can do. Our strategy has been simply to make sure that we were using GDDR5 for the system memory and therefore have a lot of bandwidth."

That's one down. "If you're not bottlenecked by memory, it's very possible -- if you have dense meshes in your objects -- to be bottlenecked on vertices. And you can try to ask your artists to use larger triangles, but as a practical matter, it's difficult to achieve that. It's quite common to be displaying graphics where much of what you see on the screen is triangles that are just a single pixel in size. In which case, yes, vertex bottlenecks can be large."

"There are a broad variety of techniques we've come up with to reduce the vertex bottlenecks, in some cases they are enhancements to the hardware," said Cerny. "The most interesting of those is that you can use compute as a frontend for your graphics."

This technique, he said, is "a mix of hardware, firmware inside of the GPU, and compiler technology. What happens is you take your vertex shader, and you compile it twice, once as a compute shader, once as a vertex shader. The compute shader does a triangle sieve -- it just does the position computations from the original vertex shader and sees if the triangle is backfaced, or the like. And it's generating, on the fly, a reduced set of triangles for the vertex shader to use. This compute shader and the vertex shader are very, very tightly linked inside of the hardware."
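Stripped to its essence, the "triangle sieve" is just running the position part of the vertex shader early and throwing away triangles that can't be visible. A minimal CPU-side sketch in Python (real implementations run as GPU compute shaders on indexed vertex buffers; the data and transform here are made up):

```python
# Minimal illustration of a "triangle sieve": run only the position
# computation, cull backfacing triangles, and emit a reduced index list
# for the real vertex/pixel work. Data and transform are made up.
def project(v):
    # Stand-in for the position half of a vertex shader: drop z.
    x, y, z = v
    return (x, y)

def signed_area(a, b, c):
    return (b[0] - a[0]) * (c[1] - a[1]) - (b[1] - a[1]) * (c[0] - a[0])

def triangle_sieve(vertices, triangles):
    """Return only the triangles that face the camera (CCW winding)."""
    kept = []
    for i0, i1, i2 in triangles:
        a, b, c = (project(vertices[i]) for i in (i0, i1, i2))
        if signed_area(a, b, c) > 0:   # CCW -> front-facing, keep it
            kept.append((i0, i1, i2))  # CW or degenerate -> culled
    return kept

verts = [(0, 0, 1), (1, 0, 1), (0, 1, 1), (1, 1, 1)]
tris = [(0, 1, 2), (1, 0, 3)]          # second one winds the other way
print(triangle_sieve(verts, tris))     # -> [(0, 1, 2)]
```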

Here he discusses using the dev kit to design a game for the hardware, so they better understand what developers are dealing with:

There's another way Cerny is working to understand what developers need from the hardware.

"When I pitched Sony originally on the idea that I would be lead system architect in late 2007, I had the idea that I'd be mostly doing hardware but still keep doing a bit of software at the time," he said. "And then I got busy with the hardware."

That detachment did not last. "I ended up having a conversation with Akira Sato, who was the chairman of Sony Computer Entertainment for many years. And his strong advice was, 'Don't give up the software, because your value is so much higher to the process, whatever it is -- whether it's hardware design, the development environment, or the tool chain -- as long as you're making a game.'"

That's the birth of Knack, Cerny's PlayStation 4 game, which he unveiled during the system reveal in New York City. And it's his link to understanding the practical problems of developing for the PlayStation 4 in an intimate way.




 


Damn right it's excellent for SFF, and that is where it excels. I know I say this a lot, but seven (7) SATA ports on ITX is just creamy dreamy. The other aspect, which I touched on, is that the APU's graphics are strong enough with the right RAM kit to comfortably play today's titles, from lower presets and resolutions all the way to ultra presets, game dependent. If integrated isn't the thing, then throw on a low-powered HD 7750 or 7790 and get, notably with the 7790, high-to-ultra settings on lower power and heat. If you don't have $100-150 to spend, a $50 6670 gives the option of Dual Graphics support, which, game dependent and with new drivers, is a feature that makes the APU unrivaled despite the x86 advantage.

 

juanrga

Distinguished
BANNED
Mar 19, 2013
5,278
0
17,790


The PS3 had a memory subsystem with about 25GB/s of bandwidth for CPU and about 15GB/s for the GPU. The PS4 has about 176GB/s of unified memory bandwidth.
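The 176 GB/s figure falls straight out of the widely reported configuration, a 256-bit GDDR5 bus at an effective 5.5 Gbps per pin. A quick check, treating those two numbers as assumptions:

```python
# Sanity check on the PS4's quoted 176 GB/s: 256-bit GDDR5 bus at an
# effective 5.5 Gbps per pin (widely reported figures, assumed here).
bus_width_bits = 256
data_rate_gbps = 5.5          # effective transfers per pin, Gbit/s

bandwidth_gbs = bus_width_bits * data_rate_gbps / 8   # bits -> bytes
print(f"{bandwidth_gbs:.0f} GB/s")                     # -> 176 GB/s
```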

 

Cazalan

Distinguished
Sep 4, 2011
2,672
0
20,810


Put your marketing filters on. There is just the one APU, and it's from AMD. That's why they can do hUMA: it's all in the same chip.
 

8350rocks

Distinguished


Of course, you would only need a minimal CPU to simply run background processes, something like a simple single or dual core would easily suffice for a console.
 

juanrga

Distinguished
BANNED
Mar 19, 2013
5,278
0
17,790


That quote is from a post of mine!!!

The PS3 had a single general-purpose core (PPE) which handled most of the computational workload. This core controlled seven coprocessors (SPEs) that assisted it in accelerating 3D and multimedia tasks.

The PS4 has eight general-purpose cores (AMD64) which can be assisted by the 18 compute units (GCN) via the new HSA technology.
 

Cazalan

Distinguished
Sep 4, 2011
2,672
0
20,810


You're not making sense. If Sony wanted more cores for the OS, they would have contracted AMD to make a 10-core APU. You took the comment "100% dedicated to games only" too literally.
 