AMD CPU speculation... and expert conjecture

Status
Not open for further replies.

juanrga

Distinguished
BANNED
Mar 19, 2013
5,278
0
17,790


Everyone knew that 176 GB/s is the maximum bandwidth allowed by the memory. The effective bandwidth depends on many factors, including the programmer's ability to use the hardware. If your code is bad or unoptimized, the hardware will not optimize it for you.

The geniuses at wccftech took a slide as the basis for their typical 'professional'-level ruminations. :sarcastic:
 
PS4 GPU has an Effective Bandwidth of 120 to 140 GB/s, Not 176 GB/s As Previously Reported
http://wccftech.com/sony-ps4-effective-bandwidth-140-gbs-disproportionate-cpu-gpu-scaling/

Not shocked; you have the CPU and GPU both fighting for the same memory resources. So when one is getting data, the other is probably blocked from accessing the bus.
 

noob2222

Distinguished
Nov 19, 2007
2,722
0
20,860


Depends on the use. If it's just an HTPC, you could go with an embedded solution; Sapphire has a 4"x4" board with CPUs that go from 6 W to 25 W. At that size you could mount it to the back of the TV and hide all the wires. It just depends on the application.

http://www.sapphiretech.com/embedded/product.aspx?pid=embedded_boards

Pair it with one of these if you need a drive.

http://www.storagereview.com/samsung_intros_wireless_optical_drive_and_portable_bluray_writer

 

The bus is a lot smarter than you give it credit for.
 

8350rocks

Distinguished


Yes, HTX can have multiple lanes, and the PS4 hardware was already known to have HTX from CPU to memory, and GPU to memory, bypassing the northbridge to get data directly.

Now, the interesting question is: are developers using the direct route to memory efficiently? That is likely where this generation's performance gains will come from: programming for higher efficiency on the direct route to memory, among other optimizations for the consoles.
 


Ah, but you forget the case where the CPU/GPU want to access the same data. That will kill your bandwidth in a hurry, since the memory access will fail due to software locks. So one component has to sit and wait for the other to finish.

You're never going to get perfect theoretical numbers anyway, simply due to overhead.
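The serialization being described is easy to sketch. This is only a toy illustration (the lock, the hold time, and the "client" framing are all invented for the example, not how console firmware actually works), but it shows why two parties sharing one lock can never overlap their accesses:

```python
import threading
import time

lock = threading.Lock()
HOLD_S = 0.05  # stand-in for the duration of one component's turn on the shared data

def client():
    with lock:              # only one of "CPU"/"GPU" can be in here at a time
        time.sleep(HOLD_S)  # pretend to read/write the shared data

t0 = time.perf_counter()
threads = [threading.Thread(target=client) for _ in range(2)]
for t in threads:
    t.start()
for t in threads:
    t.join()
elapsed = time.perf_counter() - t0

# The two accesses cannot overlap, so the pair takes at least 2 * HOLD_S:
# one component sat and waited for the other to finish.
print(elapsed >= 2 * HOLD_S)  # -> True
```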
 
I thought the throttling was due to the CPU (under load) eating into the iGPU's thermal budget and causing clock-rate throttling, thus lowering the BW. The CPU and the iGPU should have full "visibility" of each other's caches, so if one needed to read the other's data, it could.
 


Did you not read? Why the hell would I "get rid of" consoles that I actually use just to satisfy some internet guy's feelings? And no, you can't change the parameters of the problem to fit your preconceived solution; the real world doesn't work that way. The living room PC is a cross between an HTPC and a casual gaming box, so why in the hell would I over-engineer it by making it some large blockish device just for "internet coolz points!!!"? Small, stylish (or hidden), low power and low noise are my requirements. Those kinds of cases and coolers have a 65 W CPU cap, with 100~120 W being total system usage; otherwise you start to need more airflow, which in turn means larger cases and/or more fans. That is what makes the A8-7600 stand out: it provides about 90% of the actual graphics power of the 7800/7850, but with a low enough TDP to make use of ULP components.

I guess the point I'm making is that I entertain guests, and having my living room look nice is a requirement; it can't look like a gamer's shack or a geek's den. That is what my lab is for (that picture was taken when I was testing a bunch of stuff; in day-to-day use those game boxes and much of that wiring aren't visible).
 


Testing memory speeds with a single-threaded program is a very bad way to go about it; all it does is test how good the cache is at prediction. The best way to test is multiple independent memory copies happening simultaneously: create an imaginary 10 GB data source, split it into four 2.5 GB chunks, then do four bulk copies at 16 MB block sizes. That should completely bypass all cache and prefetch tricks on all systems and give you an absolute memory bandwidth.

This is important because while the CPU tends to access memory in a serial fashion, GPUs access it in very large bulk transfers in parallel. So just because a single thread on the CPU gets ~60% cache efficiency doesn't mean the GPU would get the same, especially since the GPU doesn't use the CPU's caching mechanism at all.

In the case of AMD uarchs, the limiter on memory access isn't the IMC but the prefetcher. The IMC will sit there idle, waiting for the prefetcher to tell it to fetch something.
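A scaled-down sketch of that test in Python (sizes shrunk from the 10 GB / 2.5 GB figures to keep the demo small; `ctypes.memmove` releases the GIL, so the four copies can actually run in parallel on plain threads):

```python
import ctypes
import threading
import time

CHUNK = 32 * 1024 * 1024   # stand-in for the 2.5 GB chunks (scaled down)
BLOCK = 16 * 1024 * 1024   # 16 MB block size, as suggested above
NCOPIES = 4                # four independent simultaneous copies

def bulk_copy(src, dst):
    # Copy CHUNK bytes in BLOCK-sized pieces, like a GPU-style bulk transfer.
    for off in range(0, CHUNK, BLOCK):
        ctypes.memmove(ctypes.addressof(dst) + off,
                       ctypes.addressof(src) + off, BLOCK)

srcs = [(ctypes.c_char * CHUNK)() for _ in range(NCOPIES)]
dsts = [(ctypes.c_char * CHUNK)() for _ in range(NCOPIES)]
for s in srcs:
    ctypes.memset(s, 0x5A, CHUNK)  # fill with a pattern so the copy is verifiable

t0 = time.perf_counter()
threads = [threading.Thread(target=bulk_copy, args=(s, d))
           for s, d in zip(srcs, dsts)]
for t in threads:
    t.start()
for t in threads:
    t.join()
elapsed = time.perf_counter() - t0

total_gb = NCOPIES * CHUNK / 1e9
print(f"copied {total_gb:.3f} GB in {elapsed:.3f} s "
      f"-> {total_gb / elapsed:.1f} GB/s")
```

The blocks are far larger than any cache line or prefetch stream, which is the point of the exercise: the number you get reflects the memory subsystem, not the cache.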
 

jdwii

Splendid
Well, it's all up to personal preference I guess. Obviously there is a market for it; otherwise these Steam boxes, or those types of cases, wouldn't exist. I know some people are very picky about what is in their living room; it's usually called WAF. I personally don't care to hide that I'm a geek around others, I would feel like I was lying to them; if they didn't like it I would show them a map I can print out with a layout of my house and the door.
 

juanrga

Distinguished
BANNED
Mar 19, 2013
5,278
0
17,790


Indeed! And there are other improvements beyond the main bus, such as a second bus that allows the GPU to read/write directly to system memory, eliminating synchronization with the CPU. This second bus can pass almost 20 gigabytes a second.

It is worth mentioning that the unified memory pool of the PS4 was the largest piece of feedback the company got from game developers, because it solves "a common bottleneck where data has to be shuffled from main memory to graphics memory and back again in non-unified designs".
 

blackkstar

Honorable
Sep 30, 2012
468
0
10,780
As far as game software and hardware are concerned, we've hit a standstill. We're not strong enough for ray tracing, yet we've got enough shader power to let mid-range cards perform at desired resolutions. CPUs have nowhere to go, because we've reached the point where single-core performance is not really improving drastically while games are stuck running only a few threads.

DirectX has stagnated beyond belief. I remember getting a DX9 card and playing UT2k4 and having my mind get blown at all the cool new effects. That sort of thing hasn't existed in a long time. It's to the point where you can use a 6950 or GTX 480 and not miss out on anything significant while playing games.

There are two things that can happen: either this market gets abandoned entirely, or the software problems get solved via things like Mantle. GPUs need to be stressed in ways beyond shader power, and CPUs need some task that scales to a lot of cores. PC gaming needs some sort of killer feature that makes people want to upgrade. It's grown way too stagnant, as I said before. People are content with mid-range cards and mid-range parts. The raw number of pixels a card has to push hasn't really gone anywhere. Adopting LCD was probably one of the worst things to happen to enthusiast GPUs.

I worry about shifting so hard to mobile. We've seen what can happen there: eventually the smaller, cheaper chips catch up (like MediaTek) and it starts a big race to the bottom on device prices, and it has left a lot of ARM markets with small margins. I realize some of you really believe mobile is 100% the future, but Nvidia has spent five product cycles trying to push a premium, high-performance ARM part and it hasn't gone anywhere. So they can either race MediaTek and company to the bottom or remain uncompetitive. Yes, the market is growing, but is it the kind of market you want to be in? Competing with a bunch of cheap Chinese companies and anyone else who can buy an ARM license? Compare that to a high-end product like a large GPU, where the barrier to entry is so massive that a new company can't just show up with a competitive part. The fact that ARM and the ARM chip makers are pushing ARM into other things, like servers, tells me they want out of the mobile market: they either see it going downhill as it gets eaten from the bottom up, or they think demand is going to die down soon.

I do think that if Mantle, OGLNG, and DX12 end up letting game developers push hardware in ways they couldn't before, with *some things* able to scale to more cores and existing hardware pushed much harder, it could turn sales around. If we got games that made the mid-range cards cry and run slow, instead of running fine at 1080p, sales would change. The market is there but the demand is not. And if ARM mobile devices keep going the way they are going, they will end up the same.
 
AMD APUs are getting price drops too, though not as much as I expected.
http://www.xbitlabs.com/news/cpu/display/20140821121037_AMD_to_Lower_Prices_of_A_Series_APUs_for_Back_to_School_Season.html

Avexir Readies 3.40GHz DDR4 Memory Modules.
DDR4 Could Hit 3.40GHz This Year
http://www.xbitlabs.com/news/memory/display/20140821223332_Avexir_Readies_3_40GHz_DDR4_Memory_Modules.html
This, I wanna see running on a Carrizo PC.

AMD Radeon R7 250XE GPU Is Targetted Directly at 1st Generation Maxwell – No Power Connector
http://wccftech.com/amd-radeon-r7-250xe-no-power-connector/
 

jdwii

Splendid


Yeah, the 7850K is still the same price, man, I have no idea WTF they are thinking. The 250XE seems like a simple downclock; I thought they would make an update to their GCN design and go right after the 750 Ti. When are we supposed to see an update to their GPU series? I'm pretty sure the 290X is nothing more than a 7970 GHz Edition with more cores.
 


7700K to $140 puts it inside the price range where it's actually worthwhile. And WTF at the 7600 going down to $100.

DDR4 is going to do very interesting things for APUs. 3400 MT/s nets you 27.2 GB/s per channel, so 54.4 GB/s in a typical dual-channel configuration. Of course we won't see that at value pricing for another year or two; that will be when iGPUs get another jump in capability.
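The arithmetic behind those numbers is just transfers per second times bus width (a 64-bit channel moves 8 bytes per transfer). A quick sketch; the helper name is mine, not any real API:

```python
def channel_bw_gb_s(mt_per_s: float, channels: int = 1, bus_bytes: int = 8) -> float:
    """Peak bandwidth in GB/s: transfers/s x bytes per transfer x channels."""
    return mt_per_s * 1e6 * bus_bytes * channels / 1e9

print(channel_bw_gb_s(3400))              # DDR4-3400, one channel -> 27.2
print(channel_bw_gb_s(3400, channels=2))  # dual channel           -> 54.4
```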

Got done reading and do people still not understand the relationship between CL and clock rate?

For those wondering why DDR4 has such higher "latency": it's because physics places a real limit on the refresh of a memory cell in a DRAM configuration. That limit is about 7 ns on really good silicon; you can get faster if you go to SRAM, but it gets really expensive really fast. Since timings are measured in clock ticks, the faster the clock, the higher the CL needs to be to stay above that 7 ns barrier.

DDR3-1600 CL8 is 10 ns, DDR3-2133 CL11 is ~10.3 ns, DDR3-2400 CL12 is 10 ns, and so on. So the currently advertised latencies are about what you'd expect from a new technology, and once it matures you'll see the same 10 ns for mainstream parts, with ~7 ns at the expensive end.
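The conversion being used above, spelled out (helper name is mine): CAS latency in nanoseconds is the CL cycle count divided by the memory clock, where the clock is half the MT/s rate because DDR transfers twice per clock.

```python
def cas_ns(mt_per_s: float, cl: int) -> float:
    clock_mhz = mt_per_s / 2         # DDR: two transfers per clock tick
    return cl / clock_mhz * 1000     # cycles / MHz -> nanoseconds

for rate, cl in [(1600, 8), (2133, 11), (2400, 12)]:
    print(f"DDR3-{rate} CL{cl}: {cas_ns(rate, cl):.2f} ns")
# prints 10.00 ns, 10.31 ns, 10.00 ns respectively
```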
 

sapperastro

Honorable
Jan 28, 2014
191
0
10,710


Be careful what you wish for. When prices were going through the roof and GPUs were outdated before the first 12 months had passed, I knew a hell of a lot of people who leaped from them to consoles and mobile devices. Those people have been streaming back because of the affordability these days. It used to take a LOT of money to keep up with the top end 8+ years ago, and even I used to get frustrated when my 12-month-old super PC became a cheap hooker in a year or so's time.
 

szatkus

Honorable
Jul 9, 2013
382
0
10,780


Kaveri probably has quite a nice memory controller... for GDDR5.

http://www.chip-architect.com/news/Kaveri_Trinity_2014-01-07.jpg

Unfortunately, they've never released a mobo with GDDR5.
 

Slobodan-888

Reputable
Jul 17, 2014
417
0
4,860
I don't quite understand you.

Someone is planning to release a motherboard with integrated GDDR5 memory that can be used by the APU's iGPU (instead of using system RAM)?

Edit: OK, I see what you mean. But why is it so limited with DDR3? And can you, perhaps, help me with my problem?
 

szatkus

Honorable
Jul 9, 2013
382
0
10,780


I heard that AMD expected lower prices for GDDR5 in 2014. Maybe they'll release something with GDDR5 in the future. For now DDR3 is the only option, and the A10-7850K can't show its whole potential.
 


I very much doubt we will ever see Kaveri with GDDR5 on the desktop. There is a slim possibility that it may show up in mobile though (I don't think mobile Kaveri parts are out yet?)...
 

szatkus

Honorable
Jul 9, 2013
382
0
10,780

Mobile parts are out, just no devices on the market yet.

I have to correct myself:
http://www.anandtech.com/show/7702/amd-kaveri-docs-reference-quadchannel-memory-interface-gddr5-option
It's a DDR3 controller, but made wider so it could also be paired with GDDR5.
 


Ah, so is that to allow it to Crossfire with a GDDR5 dGPU then?
 

juanrga

Distinguished
BANNED
Mar 19, 2013
5,278
0
17,790
The group that designed AMD's Bulldozer family had problems with both the caches and the system memory. To get better caches and better memory controllers, you need both smart people and money. AMD lacked both, and Bulldozer is what it is.

It is worth mentioning that one of Keller's main tasks during his recent years at AMD has been the cache subsystem. AMD has developed dozens of new techniques to improve caches. Part of this cache work will be transferred to the Excavator modules, but the full set of improvements is aimed at the new architectures: K12/Zen.

Like other companies, AMD supports the official DDR specs. The latest DDR3-3300 modules are not part of the official JEDEC specs, and thus neither AMD nor Intel supports them officially. Overclocking is not officially supported, and it depends on the silicon lottery.

DDR3 is at its physical limits. AMD wouldn't waste time and money developing an improved memory controller when the modules cannot scale up enough and DDR3 is going to be replaced soon. This is why they turned their eyes towards GDDR5 memory. That memory is much faster than DDR3, and a six-core version of Kaveri with a more powerful iGPU and GDDR5 as system memory was planned, but it had to be abandoned at the last minute because one of the companies making the GDDR5 DIMMs went out of business.

The GDDR5 memory controller is still in the Kaveri die, but it was fused out. AMD docs for Kaveri still mention the quad memory controllers: DCT0, DCT1, DCT2, DCT3.

No future APU will use GDDR5 as system memory, because the company that would make the DIMMs remains out of business. AMD is now moving to HBM, a new JEDEC standard, which can provide more bandwidth and better efficiency than GDDR5.

Finally, I mention that Carrizo officially supports DDR4-2400. Thus DDR4 will not bring any bandwidth benefit to Carrizo APUs and, in fact, Carrizo mobile will probably only support DDR3. DDR4 support makes sense for the server version of Carrizo: Toronto.

We should see improvements, however, from the new cache subsystem (which, it seems, also reduces latencies).
 

CooLWoLF

Distinguished
The new 95 W 8370E is a very interesting addition to the FX line. Considering how well the 8320/8350s overclock, I am excited to see how far someone can push the 8370E with the extra headroom its reduced TDP provides.
 