AMD CPU speculation... and expert conjecture

de5_Roy · May 1, 2014

AMD begins Private Mantle SDK Beta
http://semiaccurate.com/2014/05/01/amd-begins-private-mantle-sdk-beta/
took them over 6 months but it's here. hopefully linux support will follow soon.

blackkstar · May 1, 2014

con635 :

As far as speculation goes, I saw a video where someone from AMD was discussing HPC applications. They were actually specifically mentioning a scenario where you would have multiple APUs working together in a system with HSA.

It would work something like this:

APU with 2m/4c design and a larger GPU, maybe around 1024 GCN cores.

You'd buy a motherboard with 2 or 4 sockets for APU.

If you want to upgrade your GPU, you can add another APU and get 2048 GCN cores. You also get 2 more modules and 4 more cores. The HSA and traditional overall computational power grows equally. It is relatively ideal if you ask me.

It would actually prove rather cost effective as well. 290x has around 2800 GCN cores. A 14nm 2m/4c part with 1024 GCN cores would probably cost at most what 7850k costs. So you'd easily be able to get 4096 GCN cores for the current price of R9 290 (maybe x?) while also getting a 8m/16c CPU.

The only missing piece is single thread performance.

But I would also expect some Hypertransport or Infiniband slots for adding pure GPU as well. They'd have to be pin compatible with PCIe though so they'd still work with Intel platforms. I don't think AMD would want to completely give Nvidia dGPU market on Intel platform.

AH, forgot this link: http://hexus.net/gaming/news/industry/69141-global-pc-games-market-revenue-overtakes-consoles/

8350rocks · May 1, 2014

juanrga :

Juan, the discussions I had with someone at AMD regarding bulk vs FD-SOI were what gave me the impression I had. Their engineers told me going to bulk would be "suicide", because the properties of the substrate would be "catastrophic" for the BD uarch design. It NEEDS high clocks, it was DESIGNED FROM ONSET to have high clocks to offset the single thread performance deficit in the name of parallel threading.

Now look, we have Kaveri, and the highest clocked part does not even hit the stock clock of the flagship part introduced 2 YEARS AGO. Was my prediction that Kaveri on bulk would compromise clock speed so greatly that any gains in performance would be negated by loss of clockspeed wrong? No. Thank you for playing *bows*

juanrga :

1.) AMD designed a modular architecture designed to run massively parallel code with high clock speeds from the manufacturer to offset the less stellar single thread performance deficit. Except now, they no longer have a clockspeed advantage to offset some of that performance, and the uarch design tweaks were not sufficient to bring it up to where it could have been with higher clocks.

2.) The PC market will grow more quickly again when it has a reason to upgrade. No one in the software or hardware world has provided a sufficient enough reason for people to give up 3-4 year old systems. When does the average consumer upgrade? When their PC does not do what they want with the software they need anymore. That is not happening unless the machine craters. Your average gamer is a different breed, and that segment of the PC market is actually growing albeit not quite as quickly as your average Joe Q Public PC owner group is slowing down, but as pointed out by Palladin, we have reached saturation at this point. Until the latest and greatest drives more purchases it will be this way for a while.

3.) If that is the case, then why did AMD just release a new workstation GPU with 12 GB VRAM? Because their APU is going to scale so well it would outrun a R9-290X on steroids right? No. Wrong.

4.) This remains to be seen. Your crystal ball has failed you several times already. (The NVidia ARM praise ring a bell anyone?) If it is an APU, I would tend to think it would be more of a coprocessor design based on what I hear, but then, Jim Keller has free reign over this, so we will see what comes.

8350rocks · May 1, 2014

blackkstar :

con635 :

As far as speculation goes, I saw a video where someone from AMD was discussing HPC applications. They were actually specifically mentioning a scenario where you would have multiple APUs working together in a system with HSA.

It would work something like this:

APU with 2m/4c design and a larger GPU, maybe around 1024 GCN cores.

You'd buy a motherboard with 2 or 4 sockets for APU.

If you want to upgrade your GPU, you can add another APU and get 2048 GCN cores. You also get 2 more modules and 4 more cores. The HSA and traditional overall computational power grows equally. It is relatively ideal if you ask me.

It would actually prove rather cost effective as well. 290x has around 2800 GCN cores. A 14nm 2m/4c part with 1024 GCN cores would probably cost at most what 7850k costs. So you'd easily be able to get 4096 GCN cores for the current price of R9 290 (maybe x?) while also getting a 8m/16c CPU.

The only missing piece is single thread performance.

But I would also expect some Hypertransport or Infiniband slots for adding pure GPU as well. They'd have to be pin compatible with PCIe though so they'd still work with Intel platforms. I don't think AMD would want to completely give Nvidia dGPU market on Intel platform.

AH, forgot this link: http://hexus.net/gaming/news/industry/69141-global-pc-games-market-revenue-overtakes-consoles/

The issue is, when AMD can do that...you will probably be looking at GPUs with 8000+ GCN cores in the dGPU segment.

GPU performance is growing at roughly 30% gain per generation, while CPU performance is growing around 10% per generation. Eventually dGPU performance will be so staggeringly ahead of what you can fit on an APU (the gap will only increase) that no matter how you want to stack them the dGPU will be more performance per watt AND per dollar.

8350rocks · May 1, 2014

juanrga :

palladin9479 :

I already showed you why your physics is incorrect. I gave you some correct physics numbers.

I gave you a quote from Nvidia Research Team mentioning how they predict that dGPUs will disappear in favor of iGPUs integrated in APUs.

I gave you a SC13 slide from Intel showing how their current Phi discrete card will be replaced by a standalone 'CPU' the next year. I gave you three links with Intel and others people predicting that gaming dGPUs will be killed in some few years.

I also mentioned you that AMD, Nvidia, and everyone else in the research community is designing APUs for supercomputers, because a dCPU+gGPU doesn't scale up. I already explained you why doesn't scale up.

I already gave details of the ultra-high-performance APUs currently designed by AMD and Nvidia. I repeat: the Nvidia one is rated at 300W and has a size of 290mm.

I got finally the design by Intel for exascale. Their expected die size is 400 mm^2. I could give more details if anyone is interested.

The PC market is dying

http://news.yahoo.com/pc-dying-death-thousand-cuts-040011100.html

Gardner last reports admits it is "the worst decline in PC market history."

And each player in the traditional PC market has a plan to migrate to alternative markets.

Microsoft plan are tablets and phones.

Logitech pivoted to tablets

http://qz.com/170096/pcs-are-dying-so-logitech-pivoted-to-tablets-and-had-a-surprisingly-profitable-quarter/

Apple did to tablets and phones

http://www.cultofmac.com/273905/pc-market-dying-even-apple-apple-plan/

Intel did to phones/tablets and buzzyword "internet of things"

Nvidia did to tablets/phones and embedded (cars) and is transforming itself from a graphics PC company to a mobile company:

If you need any more evidence that we have moved into a post-PC era, the quote from NVIDIA President and CEO Jen-Hsun Huang should be a clear indication for you. [...] Huang also dropped numbers about how NVIDIA was doing with an increase of 13% bringing the company $1.2 billion from just this last quarter. It was mentioned that the Tegra line, was making up for the downturn in PC sales numbers.

http://www.droid-life.com/2012/11/09/president-of-nvidia-says-a-great-tablet-is-better-than-a-cheap-pc/

AMD transition outside of the dying PC market has been explained again and again and again and again, but some people don't get it still. I will repeat it once again:

Now on the last stage, AMD is transforming from a company where 96% of its revenue relied on the dying PC, into one that will incorporate its products into high-growth markets that will make up 50% of its revenue by the end of 2015

CEO Rory Read shines light on this transformation to the public

Earlier this month, Rory Read was able to enlighten the public on this transition thanks to interviews by Bloomberg and CNBC. Spiking shares by as much as 6% that day, it was clear that this transition, along with Rory's confidence was not common knowledge. Rory heavily emphasized on how AMD was transforming its company's dependency from primarily on the PC into a business model that features a half-PC, half-diverse product portfolio, which focuses on high-growth markets, such as dense cloud servers, gaming, medical, semi-custom chips, and more. This is a very promising approach, as the old AMD was risky because all it took in a weakening PC market to be knocked out was a set of inferior products - which is exactly what happened. Plummeting PC sales, along with Intel's superior product lineup blew AMD near extinction.

http://seekingalpha.com/article/2115183-a-new-amd-is-emerging

I LOVE the part where you say "physics is wrong"...that means I can TL;DR the rest of this absurd text wall...LOL!!!

truegenius · May 1, 2014

^ whats this ? mods are taking nap and you are using this opportunity

gamerk316 :

history is repeating itself, as now fanboys are promising apus as gaming dgpus killer

8350rocks :

and here are some signs of more cores
AMD R9 390X Bermuda XTX 4224 cores
http://www.guru3d.com/news_story/amd_pirate_islands_radeon_r9_300_series.html

Red_Sun · May 1, 2014

With the FX/AM3+ road map being what it is, any thoughts if we will see price drops on the upper end of the FX line any time soon? I have a phenom II x4 980 and am a bit curious to see what I can squeeze out of the 8350 for gaming.

logainofhades · May 1, 2014

I would wait till excavator. Your CPU is still fairly decent, all things considered.

Red_Sun · May 1, 2014

logainofhades :

Very true.

I've been trying to talk myself out of a needless upgrade for about a week now...and can't quite justify upgrading for ~$200 of marginal gaming gains for the sake of curiosity. BUT, if I got a good deal, the temptation might be too great...lol

logainofhades · May 1, 2014

Kinda how I have been with the 1230v3 mini-itx build I have been wanting to do. No need for it really. Was thinking about a Kabini HTPC, but that is kinda pointless since I still don't have an HDTV. I wanted the Asrock board with the 19v DC jack on it. Maybe by the time I am ready, there will be some boards out with that and a M-SATA slot as well. That way, the only cable I would need is for a blu-ray drive.

gamerk316 · May 1, 2014

and here are some signs of more cores
AMD R9 390X Bermuda XTX 4224 cores

Which you can do while you can still die shrink. But even GPU's are nearing the end; they're already at 28nm. How many more shrinks can they easily get? Two? Three? GPU's can't keep throwing more cores to gain performance for too much longer.

jdwii · May 1, 2014

juanrga :

jdwii :

Under your belowed Windows, which is terribly coded, badly threaded, and doesn't understand the CMT architecture a FX-8350 suffers against an i5 or even an i3. Under a modern and efficient operative system as linux, which is very well threaded, and CMT aware, the FX-8350 is toe to toe with an i7-3770k.

The i7 wins in FP tasks and the FX wins in integer tasks. The average being the i7 only about a 10% faster.

Everyone who owns an 8-core FX and use it in both Windows and linux knows how well the chip runs under linux.

Similar stuff about games. On badly threaded games, the 8-core has difficulties to stand against an i3. On well threaded games? The FX outperforms the i5 and matches the i7.

I couldn't stop laughing at the horribly inaccurate statements being made

Cazalan · May 1, 2014

gamerk316 :

Except that Phi is what the next generation of HPC are being built on because it will be the first to market (2015) with 3D memory.

NERSC-8 a 9300+ node system with Xeon + Phi (Cray XC based)

http://www.hpcwire.com/2014/04/29/emerging-system-sets-stage-exascale-science/

jdwii · May 1, 2014

gamerk316 :

GamerK, you do realize Bulldozer was designed to be a server chip first and foremost

I win. The forums now accepts my analysis from the initial BD reveal.

Under your belowed Windows, which is terribly coded, badly threaded, and doesn't understand the CMT architecture a FX-8350 suffers against an i5 or even an i3. Under a modern and efficient operative system as linux, which is very well threaded, and CMT aware, the FX-8350 is toe to toe with an i7-3770k.

I'm assuming you're referring to this:

http://www.phoronix.com/scan.php?page=article&item=amd_fx8350_visherabdver2&num=1

By a quick review, Intel wins overall 9-7-3 (W/L/T). Though its worth noting many of the benchmarks can be run in Windows, with the SAME RESULTS, blowing a hole in your "The OS is the problem" theory. AMD problem is, depending on workload, you need to handle CMT differently. In light threaded tasks, you need to schedule on one module if you can, so you can turbo clocks. In highly threaded tasks, you need to schedule on as many modules as you can and avoid the 2nd modules 20% performance hit. And since the OS doesn't know ahead of time what type of workload you have, there is a 50% chance the scheduler will be wrong. The BD scheduling patch basically made the scheduler treat CMT the same as HTT, improving the highly threaded workload case, but reducing performance for the single threaded workload case.

Its also worth noting the tests AMD wins are the tests that are naturally well threaded, which is expected.

The main argument against BD has always been "Why does a newer processor with a 600MHz speed advantage and twice as many cores generally only do almost as good as a chip produced by Intel two years ago?". That's why BD has been so underwhelming. Part of that is because some people hyped the arch to death, promising an Intel killer. BD was never going to be that.

Similar stuff about games. On badly threaded games, the 8-core has difficulties to stand against an i3. On well threaded games? The FX outperforms the i5 and matches the i7.

Even though "well threaded" games like BF4 favor Intel?

To be fair CMT is a flawed design at least with HT you don't have to add much space to the CPU die if any. Juan really is confused when it comes to threading as well as he can't understand Microsoft doesn't usually make a big system update every 6 months. Again its called imaginationland some people just can't live their others like juan and ken ham can.

Cazalan · May 1, 2014

gamerk316 :

"At the 2013 EUVL Workshop, Intel announced that EUV would still be under development in 2015, and hence would be targeted for 2017 7 nm HVM"

So Intel will continue using DUV (Deep UltraViolet) at 14nm and 10nm, although with double or maybe triple patterning.

juanrga · May 1, 2014

gamerk316 :

If you are running something as Windows and your AMD processor is about 50% slower than intel processor but then you change the OS to linux and 'magically' the AMD processor turn to be only a 10% slower, where do you believe is the problem? On the Windows OS? On the oil price? On the the phase of the Moon? Hint: Wintel.

Yes I know that you predicted that dual-cores would be at the top of chart for gaming this round, but reality is that FX-8350 is at the top, above the i5-4670k, on the normal edition. And it performs better under the MANTLE edition.

gamerk316 :

Neither Larrabe nor Phi are "multicore CPU architecture". In the first place it is a many-core architecture. In the second place it is optimized for throughput unlike CPUs such as Xeon which are optimized for latency.

Those who did claim that Larrabe is dead will be very surprised in a pair of years, because it returns for consumers products.

Intel Phi is the excellent for parallel computing. Nvidia is getting nervous, because the Phi already beats its best GPGPUs and has the plus of the single-ISA approach. Are you aware that the fastest supercomputer in the world uses Phi instead Nvidias K40?

juanrga · May 1, 2014

8350rocks :

Yes, the 'engineer' supposedly said you wrong stuff, which you parroted here, but was replied then. I recall explaining to you during many months that Kaveri was bulk no SOI and you pretending otherwise again and again.

I also recall your FUD about clocks and how the world come to an end. I replied you that Kaveri could hit 4.5Ghz on air. Then the first leaks of an engineering sample appeared and you posted FUD about those. Your crazy argument was that AMD benchmarking an ES at only 1.8GHz was implying that Kaveri couldn't be clocked much higher than that. I still laugh today.

I already explained to you why clocks are lower. Kaveri CPU is clocked at 3.7GHz because (i) it has lower TDP than Richland/Trinity and (ii) it has a bigger and powerful iGPU sharing that lower TDP.

If you increase the TDP to 100W you can hit the 4GHz without any problem. In fact despite having lower clocks at stock, Kaveri has higher OC factor.

Richland has an OC factor of 19.7% on air.
Kaveri has an OC factor of 20.9% on air.

8350rocks :

Are you really asking me to explain you why AMD releases an dGPU today, when they didn't select any for their future exascale supercomputer? Interesting.

Then you would be also very confused on why Nvidia Research Team claims that dGPUs will be replaced by APUs about 2018 but the same Team is releasing a Titan Z dGPU those days.

I suppose that both AMD and Nvidia engineers know how to use a calendar. :sarcastic:

wh3resmycar · May 1, 2014

Yes I know that you predicted that dual-cores would be at the top of chart for gaming this round, but reality is that FX-8350 is at the top, above the i5-4670k, on the normal edition. And it performs better under the MANTLE edition.

care to show the chart where that 8350 is on top?

http://www.bit-tech.net/hardware/2013/11/14/intel-core-i3-4130-haswell-review/5

http://

juanrga · May 1, 2014

wh3resmycar :

Therefore, I have to explain to you something said about BF4 and the i5-4670k using a... crysis 3 benchmark where the i5-4670k doesn't appear?

Revolutionary concept! I will try:

palladin9479 · May 1, 2014

blackkstar :

Making multi-socket systems can be extremely expensive due to having to route the interconnects. Typical motherboard design has the NB and CPU as the central components with everything else branching off those, adding another CPU socket increases the complexity by a factor of 2, adding a third socket jacks it up to a factor of 4. Also remember because threads can be dynamically switched from one CPU to another, all of their cache's need to be coherent with each other. Otherwise CPU 1 and CPU 2 maybe have different cached valued for a specific memory address and when the thread gets switched you get data corruption.

This is why multi-core chips have been so successful. It's 100+ times easier to to maintain cache coherency between L1 and L2 caches inside a chip then it is to maintain L1, L2 and L3 cache's across multiple chips. Cache is extremely sensitive to latency and which means you can't wait around for the MMU to ensure the cache contents are correct during a transaction, it needs to be done preemptively. So every time you get a cache write on one cache it needs to be reflected across every level of cache in the entire system. With four sockets, each containing four cores, that gets expensive quick. It's why chips from IBM and Oracle emphasize the link bandwidth between sockets and devote so much space and effort on coherency controllers.

So no, we won't be seeing any cheap consumer multi-socket motherboards. You won't be wiring four APU's together on a single board, it would be extremely inefficient. Instead you'd want to put the APU on an add-in card with dedicated high speed memory wired into it and treat it as an external co-processor. There is no need to maintain cache coherency with it as system software manually controls the workload, there is no pseudo-random dynamic task switching.

palladin9479 · May 1, 2014

a chart @ 720p, revolutionary indeed.

The resolution of the framebuffer has relatively little effect on the level of CPU effort required. It's often used to isolate raw CPU performance separate from graphics performance.

That being said, APU's are just entry level CPU's with bolted on entry level GPU's. You will get similar performance with a $70 USD CPU and a $70 USD dPGU. So comparing them to a $250 USD CPU + dGPU setup isn't very intelligent. Ultimately costs and heat are what's important here. That is what makes the 7850K such a bad purchase, it's priced too high. The 6800K ($140 USD) and 7700K ($155 USD) are far better values, and the 7600 ($125 USD) is insanely good value. Though to be fair $140~150 is about the price limit for APU's, anything higher is rarely a good buy as you can get a 1GB GDDR5 dGPU for $80~100 USD.

juanrga · May 1, 2014

About 40 developer studies have signed the recently opened AMD beta program for MANTLE SDK. And that despite some here predicting that "nobody will use mantel", and that "DX12 killed MANTLE"!

palladin9479 · May 1, 2014

so from his chart vs toms, you can easily draw 2 different conclusions. platform wise @ 1080p, where you will actually be playing, that 8350 is a pretty bad idea. the reason why every single day you'd see a lot folks here in this very forums, trying to upgrade out of it.

WTF are you talking about? Or are you just trying to troll folks.

I'm running a FX8350 with 2 GTX 780 Hydro's inside an 900D with a custom WC loop. I steamroll everything at 1920x1080x120hz. When it comes to resolution and graphics settings, your GPU's are almost always going to be far more important then your CPU.

The 720 vs 1080p discussion is about APU's not CPU's.

juanrga · May 1, 2014

wh3resmycar :

I like the double standard where 720p benchmarks are accepted when Intel wins, but are rejected when AMD wins.

But no problem with giving one at your solicited 1080p

and here BF4 with the FX-8350 above the Haswell i5-4670k

http--www.gamegpu.ru-images-stories-Test_GPU-Action-Battlefield_4-test-bf4_proz_2.jpg

juanrga · May 2, 2014

First product of the openpower initiative

http://www.zdnet.com/google-eyes-power-chips-amazon-arm-both-add-up-to-intel-headaches-7000028881/

It is shown once again that AMD head did the correct when abandoned competing in the traditional Opteron server products (Warsaw CPU is only for legacy customers) and migrated to new HSA APU (Berlin) plus ARM (Seattle) strategy.

Some time ago I predicted that new gen games would increase hardware requirements towards the top end. Wolfenstein recommends an i7 or FX 8-core

http://gamingbolt.com/wolfenstein-the-new-order-minimum-system-specs-revealed

AM1 vs Bay-Trail:

When looking at the benchmarks, it appears that the CPUs within the SoCs, each of which consists of four relatively slow cores (AMD Jaguar and Intel Silvermont), are closely matched with regards to performance. Sometimes Intel is slightly ahead, but most of the time AMD manages to be slightly faster. You can consider the CPU performance to be roughly half that of a Core i3 from the Haswell generation.

AMD clearly has the better graphical performance: the difference to Intel is larger in the 3DMark and game benchmarks. AMD appears to be better suited for multimedia work. In the short amount of time that we had, we were unable to run video tests, but considering the implemented technologies we are willing to predict that AMD will take the lead here as well. We will revisit this at a later point in time.

As for power consumption, both platforms are far more energy efficient than the Intel Socket 1150 or AMD Socket FM2(+) platforms. Intel is a little more energy efficient under load, but AMD wins when the processors are idle.

As far as we're concerned, AMD is also the victor with regards to connectivity: the AM1 platform has an extra USB 3.0 port, a Serial ATA 600 port instead of a Serial ATA 300 one, and support for DisplayPort 1.2.

The battle of AMD AM1 versus Intel Bay Trail-D ends, as far as we're concerned, in a resounding victory for AMD.

http://uk.hardware.info/reviews/5334/28/amd-am1-vs-intel-bay-trail-d-review-cheap-desktop-platforms-conclusion

AMD CPU speculation... and expert conjecture

Splendid

Honorable

Distinguished

Distinguished

Distinguished

Distinguished

Honorable

Titan

Honorable

Titan

Glorious

Splendid

Distinguished

Splendid

Distinguished

Distinguished

Distinguished

Distinguished

Distinguished

Splendid

Splendid

Distinguished

Splendid

Distinguished

Distinguished

Share this page