AMD CPU speculation... and expert conjecture


jdwii

Jimmy, the thing is, what if one day AMD makes an APU that is comparable in performance to mid-range Nvidia Quadros? That would be amazing. Honestly, I'm with Juan in thinking that Nvidia is in the worst position; maybe they can continue with ARM + Nvidia graphics and push toward servers with a chip that might work.
 
It will be interesting to see if Intel can finally push out an iGPU that can beat AMD's, or at least compete evenly with it. Since they have a process advantage and already have the ability to put stacked DRAM on CPUs, they might come out on top if the iGPU itself is actually good.

Even if Intel creates the hardware, they still lack the software experience to create corresponding drivers and application performance profiles. NVidia and AMD (ATI) have a very large lead in this field, especially NVidia. We can all attest to the amount of performance increase you can get from an optimized application profile and tweaked drivers. Intel could possibly catch up in about two years.
 

juanrga



AMD claims it will support both DX12 and MANTLE:

AMD's take and the future of Mantle
From the day Microsoft announced DirectX 12, AMD has made it clear that it's fully behind the new API. Its message is simple: Direct3D 12 "supports and celebrates" the push toward lower-level abstraction that AMD began with Mantle last year—but D3D12 won't be ready right away, and in the meantime, developers can use Mantle in order to get some of the same gains out of AMD hardware.

At GDC, AMD's Corpus elaborated a little bit on that message. He told me Direct3D 12's arrival won't spell the end of Mantle. D3D12 doesn't get quite as close to the metal of AMD's Graphics Core Next GPUs as Mantle does, he claimed, and Mantle "will do some things faster." Mantle may also be quicker to take advantage of new hardware, since AMD will be able to update the API independently without waiting on Microsoft to release a new version of Direct3D. Finally, AMD is talking to developers about bringing Mantle to Linux, where it would have no competition from Microsoft.

Corpus was adamant that developers will see value in adopting Mantle even today, with D3D12 on the horizon and no explicit support for Linux or future AMD GPUs. Because the API is similar to D3D12, it will give developers a "big head start," he said, and we may see D3D12 launch titles "very early" as a result.

http://techreport.com/review/26239/a-closer-look-at-directx-12/3



AMD will use ARM to steal server market share from Intel. In fact AMD predicts that ARM will win in servers.

http://www.itproportal.com/2013/06/18/why-arm-will-win-long-run-game-changing-amd-slide-no-one-published-yet/

Ubuntu Server already supports ARM, and AMD has just demoed an ARM server running on Fedora.

The ARM CPU in Seattle is already 'strong': each A57 core has about 30% more IPC than a Jaguar core, and Jaguar has more IPC than Piledriver. Add that those A57 cores start clocked at Jaguar's maximum frequency and that they scale up to 16 cores, and you get a strong product.

Sixteen A57 cores @ 3GHz would be faster than the fastest 16-core Piledriver Opteron, the 6386 SE, while consuming only a fraction of the power.
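As a rough sanity check on that claim, here is the back-of-the-envelope arithmetic using the numbers above (the 6386 SE's 2.8GHz base clock is my assumption, not something stated in this post):

```python
# Back-of-the-envelope throughput comparison using the post's claims
# (these are assumptions/estimates, not measurements).
A57_IPC_VS_PILEDRIVER = 1.3      # claimed ~30% more IPC per core
A57_CLOCK_GHZ = 3.0              # claimed Seattle clock
OPTERON_6386SE_CLOCK_GHZ = 2.8   # 6386 SE base clock (assumed)
CORES = 16                       # both parts compared at 16 cores

a57 = CORES * A57_CLOCK_GHZ * A57_IPC_VS_PILEDRIVER
piledriver = CORES * OPTERON_6386SE_CLOCK_GHZ * 1.0
print(f"Relative aggregate throughput: {a57 / piledriver:.2f}x")
# -> roughly 1.4x in favour of the A57 part, if the IPC claim holds and
#    both chips sustain those clocks on all cores.
```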



AMD FirePro cards are competing against Intel Phi products.

Two heterogeneous systems, based on NVIDIA’s Kepler K20 GPU accelerators, claim the top two positions and break through the three-billion floating-point operations per second (gigaflops or GFLOPS) per watt barrier. Eurora, located at Cineca, debuts at the top of the Green500 at 3.21 gigaflops/watt, followed closely by Aurora Tigon at 3.18 gigaflops/watt. The energy efficiency of these machines, manufactured by Eurotech, improves upon the previous greenest supercomputer in the world by nearly 30%. Two other heterogeneous systems, Beacon with an efficiency of 2.449 gigaflops/watt* and SANAM with an efficiency of 2.35 gigaflops/watt, come in at numbers 3 and 4 on the Green500. The former is based on Intel Xeon Phi 5110P coprocessors while the latter is based on AMD FirePro S10000 GPUs.

http://www.green500.org/news/green500-list-june-2013



If the rumours are correct, Broadwell's GPU will be faster than Kaveri's. Skylake would raise that bar further, unless it turns into a Larrabee 2.0 fiasco. :sarcastic:

About new CPUs: Keller is not working on a new architecture just to get minimal gains over Excavator.



Nvidia is losing market share in everything, from consoles to the new Apple workstation. The new FirePro W9100 beats anything from Nvidia:

http://techreport.com/news/26231/amd-hawaii-gpu-gives-up-amateur-status-joins-firepro-w9100

AMD Mantle is destroying Nvidia GeForce numbers. Why do you believe Nvidia lied about the performance of its new driver?

Moreover, their Denver project looks worse every day, after a decade of cancellations and delays. During the Tegra K1 presentation I expected the Denver core to be roughly twice as fast as the A15 core. However, recent leaks show that Nvidia had to clock the core up to 3GHz just to match the performance of (and lose in some tests to) the A15-based K1.

In fact, Nvidia has lowered expectations a lot, and they are now talking about how Denver will be faster than the A57:

http://www.fudzilla.com/home/item/34348-denver-tegra-k1-64-bit-to-beat-a57

LOL! Everyone is expecting Denver to be faster than the A57 used in AMD's Seattle and Hierofalcon; what is the point of spending years on a custom Denver core if it is not faster than a standard, cheaper core?

My conclusion is that Denver will be a good product, but not good enough to compete against the Apple A7 and other custom cores.

Finally, don't forget that next year Intel will be selling a 'CPU' that is much faster than the GTX 780Ti.
 

Swede69

I have no idea why people still bother with AMD CPUs... they're so far behind Intel it's not even funny...

Then there's AMD's GPUs, which are a total joke. The 290X, which just released, broke records for the hottest graphics card out of the box at 95C!!! Yet the 780 Ti only reaches 80C at its hottest point!
-http://www.techradar.com/reviews/pc-mac/pc-components/graphics-cards/nvidia-geforce-gtx-780-ti-1197839/review

So we have AMD 8-10 core CPUs which are lagging sooo far behind Intel's quad cores it's not even funny!!
The 290X is probably the worst GPU AMD has ever released!!
Mantle turned out to be an utter flop (no shocker).
I don't get it. Why would builders want to pay more for less??
Here's a video of the FX-8350 @ 5.0GHz showing a huge bottleneck coupled to a 7970... a stock Intel 2500K @ 3.3GHz with a 660 Ti killed the AMD setup, which had a much faster GPU I might add...
https://www.youtube.com/watch?v=HX6N9uJtzPA
 
If they were banking on Mantle, then they wouldn't be supporting DX12 and would instead be pushing Mantle forward. Since they are supporting DX12, it is obvious they don't expect Mantle to take over.

To be fair, if they dropped DX support, the choice would be to support EITHER NVIDIA or AMD. And since NVIDIA has a much higher share, which one do you think devs would choose? [Never mind that the full DX feature set will be usable on tablets/phones, another incentive to stick with DX.] AMD HAS to support DX.
 


Troll ... wander off or I'll send you on a two-week holiday. There is an Intel sticky you can roll around in with joy ... just don't post here unless you have something to contribute other than poo.

Begone.
 
amd uses gpu compute to decode jpeg images
http://semiaccurate.com/2014/04/17/amd-makes-jpegs-suck-less/
accompanied by two big, almost blank promo slides showing speed improvements. at first i thought the images didn't load properly and refreshed the page a couple of times. :s

steamroller-b based "berlin" opteron now has a model number - x2100
http://vr-zone.com/articles/amd-demos-hsa-opteron-x-series-processor/76169.html
jaguar based opterons have x1150 and x2150 for the cpu version and the apu version respectively.
strangely, amd uses "opteron x2100 series" to refer to the jaguar based apus only and assigned a lower model number to berlin. maybe they'll update it later.

Global Foundries and Samsung sync up 14nm processes
http://semiaccurate.com/2014/04/17/global-foundries-samsung-sync-14nm-processes/
i had absolutely no idea common platform alliance was dead. wth.
 

8350rocks

While I am pleased to see 14nm coming along, I am honestly a bit thrown off by the fact that no one is really doing SHP nodes anymore. It is almost as if no one is even bothering to try anymore. That saddens me...the death of SOI at 28nm seems to be the death knell for all the SHP nodes that would have come after it...I guess IBM will be the last one using SOI on STMicro tech for some time...at least until we can further discern AMD's future plans...
 

Cazalan



It's unfortunate, but sometimes there are advantages to having fewer options. Instead of splitting time between designing cell libraries for both SOI and bulk, they can produce more refined bulk libraries. Tool sets should get cheaper or better.

You could imagine if we were still fighting HD-DVD vs Blu-ray. Players would still be much more expensive.
 

Cazalan



Well, if GF is buying IBM's fabs, that would effectively take IBM out of the foundry business. I don't know why he would say it's dead, though. It has just shrunk to fewer players.

It benefits AMD as well if they ever have a need to increase volumes.
 

designasaurus



I don't know about other browsers, but in Opera if you just disable Javascript in the page preferences, the full articles become available and you don't get asked to subscribe anymore. Kind of makes me doubt seekingalpha's tech savvy...
 

juanrga

I recall someone in this thread predicted more than a year ago that SOI was a dead end and that AMD's migration to bulk was the correct decision. Time is proving him right.

Future AMD CPUs will use bulk and then finfets (on bulk).
 


And that is possible. Of course it all depends on how AMD wants to compete. Honestly I don't see them doing that mainly due to their current position.

As well, it will still be much more expensive than a standard CPU, and that's a bread-and-butter market, much like server CPUs.



Intel has a larger software dev team than even Microsoft and has been making drivers longer than NVidia or AMD. I don't think their graphics drivers are the best, not by far, but to assume they wouldn't put resources into improving them for something as big as Skylake will be to them? That is a pretty low blow.

I don't think they will drop the ball. Their drivers have gotten quite a bit better even in the past few years. Intel needs to be able to compete on that level with AMD in the APU market in order to keep market share.


My point is that I don't see AMD continuing Mantle if DX12 does what Mantle does. It would be pointless and a waste of resources, since most devs will use whatever allows them to hit the largest market. Mantle is only supported on AMD GPUs, and specifically only GCN, so that limits devs to 20-30% of the market depending on the adoption of GCN GPUs. It doesn't even include consoles, which limits it again, as the PS4 uses its own API while the XB1 currently uses DX11.x and will be updated to DX12, giving DX an advantage.

If AMD wants to keep trying to support it, that is fine, but it could be a pointless endeavor; I can tell you the big developers will push the one API that gives them the most coverage.

As for ARM, I won't get into that. ARM is fine for what it is, but if AMD thinks ARM will take over? That's their own thing. I would rather see a real comparison between a 16-core ARM server and a 16-core x86 server in Hyper-V situations, where there are a bunch of virtual servers running multiple applications and backups while being accessed by multiple clients. Considering how my 4-core ARM-based S4's CPU runs when keeping multiple apps open, I don't think it will fare as well.

Of course, we will have to wait and see, as this is how it always goes: one company thinks its way is the best way to go and eventually only one is right. Intel with NetBurst and high clocks, AMD with SOI, and now with ARM for servers.

I still think NVidia is fine for now. I never said they won't have trouble, just that they will survive as long as Intel and AMD don't find a way to shove a top-end GPU with dedicated VRAM into a single CPU.
 
Intel has a larger software dev team than even Microsoft and has been making drivers longer than NVidia or AMD. I don't think their graphics drivers are the best, not by far, but to assume they wouldn't put resources into improving them for something as big as Skylake will be to them? That is a pretty low blow.

I don't think they will drop the ball. Their drivers have gotten quite a bit better even in the past few years. Intel needs to be able to compete on that level with AMD in the APU market in order to keep market share.

They could have a dev team 1000x bigger than MS's and they still wouldn't have caught up. Adding more people cannot make up for documented experience and trade secrets.

This isn't "drivers", this is "graphics drivers", specifically the kind that accelerate rasterization and GP compute. I say this because Intel's drivers are years behind ATI's (AMD's) and maybe as much as a decade behind nVidia's. I say ATI because AMD didn't dismantle the ATI graphics development team when they bought the company; they kept it intact and let it continue to work the same as it did before. All that accumulated experience and those trade secrets were maintained. And that's before going into performance profiles, which, again, require experience and time to get right. Time is the central requirement, and while Intel is catching up, they aren't there yet, which is why I say two to three years from now.
 

juanrga



The link that I gave you explains that DX12 doesn't do everything that MANTLE does. It explains that MANTLE will be faster than DX12. The link also says that MANTLE will work on Linux (DX12 will probably be a Windows 9 exclusive) and that porting from MANTLE to DX12 is trivial (which I already expected due to the strong similarities between the two).

There is another reason why abandoning MANTLE is not a good option. As I showed before, Microsoft didn't plan a new DX for the PC; Microsoft was forced to develop DX12 in response to MANTLE. If AMD abandoned MANTLE, Microsoft would relax again, and AMD wants continuous evolution of the software to match its new GPU architectures. In fact, the link that I gave you contains this quote:

Mantle may also be quicker to take advantage of new hardware, since AMD will be able to update the API independently without waiting on Microsoft to release a new version of Direct3D.

I want competition and innovation.



I gave you a link where the head of AMD's server division makes explicit that their last 16-core x86 server chip (Warsaw) was released only for legacy customers who will be slow to migrate to the new ARM servers.

AMD has already shown an ARM server chip that is much faster and more efficient than the x86 chip it replaces. The ARM chip was about 30% faster (IPC) in standard server benchmarks, and that chip used standard Cortex cores; custom cores will be much faster.

Several builders are preparing their own ARM servers; AMD is far from being alone. Dell is working on an ARM supercomputer prototype, and I already gave a link to a talk (given at SC13) where the authors (from a well-known supercomputer center) predicted that ARM will take over from x86 in supercomputers.

Nvidia is already in trouble. Next year Intel will be releasing a 'CPU' with much more performance than top-end Nvidia GPUs,* an estimated efficiency of 14-16 GFLOPS/watt (the best Nvidia card today tops out at ~6 GFLOPS/watt), and 8-16 GB of stacked RAM with 500 GB/s of bandwidth (the best Nvidia HPC card, the K40, tops out at 288 GB/s).

If Nvidia had a high-performance Denver + CUDA SoC with high-bandwidth stacked RAM ready... but they don't. For this reason they joined with IBM. I believe many of the new x86 supercomputers will use Intel Phi and that we will see Nvidia cards in PowerPC supercomputers.

* You mentioned the 780Ti. Ok. The 780Ti offers 210 GFLOPS (DP). Intel will offer 3000 GFLOPS (DP).
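Just to make the size of that claim explicit, here is the arithmetic on the post's own figures (no new data, just the ratios):

```python
# Ratios implied by the figures quoted above (the post's numbers, not benchmarks).
gtx_780ti_dp_gflops = 210        # 780 Ti double-precision figure from the post
intel_cpu_dp_gflops = 3000       # claimed figure for next year's Intel 'CPU'

nvidia_gflops_per_watt = 6       # "best Nvidia card today", per the post
intel_gflops_per_watt = 15       # midpoint of the quoted 14-16 range

print(f"DP throughput ratio: {intel_cpu_dp_gflops / gtx_780ti_dp_gflops:.1f}x")   # ~14.3x
print(f"Efficiency ratio:    {intel_gflops_per_watt / nvidia_gflops_per_watt:.1f}x")  # ~2.5x
```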
 

blackkstar

I don't see DX12 beating out Mantle either.

Game developers, for the most part, want out of their reliance on the Windows ecosystem, and staying with DirectX isn't going to free them from that. Metro scared a lot of game developers, because Microsoft showed that it has no problem stabbing them in the back to chase a new market.

DX also suffers from the same problem as OpenGL: a group of people who don't know as much as Nvidia and AMD about GPUs are defining the software which controls the GPU.

Also, I can guarantee you that DX12 won't be showing up on Windows 7. Microsoft will use it as a tool to push Windows 8 and Windows 9 sales. Why do you think they added the Start button back? It's not because you wanted it; it's to try to make upgrading to Windows 8 more tolerable for those who want to use DX12. It is very similar to what happened with Vista.

If you had a hyperthreaded CPU, the XP scheduler was not smart enough to spread two threads across different physical cores. If you had two threads each using 100% of a core, the XP scheduler could put one on logical CPU 0 and one on logical CPU 1, which are two logical processors on the same physical core, leaving the other physical core idle.

Microsoft has no problems with screwing people over to get them to upgrade.

Which brings me to my point: AMD's GCN share is going to be a lot bigger than the Windows 8 gaming PC share or the Windows 9 share (if MS releases DX12 with Windows 9). It doesn't matter if Nvidia has 99% of the GPU market; if Windows 9 is 1% of the market, that would give Nvidia a 0.99% effective market share for DX12 graphics.

I understand you Nvidia guys are going through the whole "sour grapes" and buyer's remorse thing over this. I would be really mad too if I had to sit around and watch a company do little things like HBAO and shader anti-aliasing while the smaller company that gets lambasted as "the crappy driver alternative" actually created their own API and it's gaining traction. It's funny to me, because Nvidia runs around paying people to add PhysX support or to optimize for their hardware and AMD is talking about how game developers are approaching them for an alternative.
 


That's not how an API works. The API simply defines the input and output structures for each function call. It's up to the hardware makers to implement the API, which is why NVIDIA and AMD have significantly different HW designs (though they have been growing closer in recent years). They execute the same code differently, but get the same results in the end.

Point being, NVIDIA and AMD can implement the API in HW however they want. All the API does is give the specifications for what needs to be accomplished.
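To make that concrete, a minimal sketch (the class and method names here are made up for illustration; the point is one API contract, two different implementations, identical results):

```python
from abc import ABC, abstractmethod

# The "API": it only specifies the calls and their inputs/outputs.
class GraphicsAPI(ABC):
    @abstractmethod
    def draw_triangles(self, vertices):
        """Submit a list of vertices; return the number of triangles drawn."""

# Two hypothetical vendors implement the same contract with different internals.
class VendorA(GraphicsAPI):
    def draw_triangles(self, vertices):
        # imagine wide SIMD units and a hardware scheduler behind this call
        return len(vertices) // 3

class VendorB(GraphicsAPI):
    def draw_triangles(self, vertices):
        # imagine a completely different execution pipeline behind this one
        return len(vertices) // 3

# The application codes against the API, not the hardware, and gets the same answer.
for gpu in (VendorA(), VendorB()):
    assert gpu.draw_triangles([(0, 0, 0), (1, 0, 0), (0, 1, 0)]) == 1
```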

Also, I can guarantee you that DX12 won't be showing up on Windows 7. Microsoft will use it as a tool to push Windows 8 and Windows 9 sales. Why do you think they added the Start button back? It's not because you wanted it; it's to try to make upgrading to Windows 8 more tolerable for those who want to use DX12. It is very similar to what happened with Vista.

Even DX11.2 didn't show up on Vista/7, so yeah, it's probably a Win8/9 feature. That's typical, though.


If you had a hyperthreaded CPU, the XP scheduler was not smart enough to spread two threads across different physical cores. If you had two threads each using 100% of a core, the XP scheduler could put one on logical CPU 0 and one on logical CPU 1, which are two logical processors on the same physical core, leaving the other physical core idle.

To be fair, no one really saw Intel HTT coming. XP itself handles multiple physical cores fine [due to being NT-based]; it was just brain-dead about how to handle logical cores. AMD ran into the exact same problem with CMT a decade later.

I understand you Nvidia guys are going through the whole "sour grapes" and buyer's remorse thing over this. I would be really mad too if I had to sit around and watch a company do little things like HBAO and shader anti-aliasing while the smaller company that gets lambasted as "the crappy driver alternative" actually created their own API and it's gaining traction. It's funny to me, because Nvidia runs around paying people to add PhysX support or to optimize for their hardware and AMD is talking about how game developers are approaching them for an alternative.

Different case. There wasn't (and still isn't) a viable GPU-accelerated physics API alternative to PhysX, which was VERY advanced at the time (Bullet has caught up in some areas in recent years, though). The issue with Mantle is the potential for a major API war, which is BAD for all parties. I remember the days when Glide got all the graphical features, DX got the highest resolution, and OpenGL was somewhere in the middle, and the feature set you got varied depending on your card's manufacturer. I do NOT want to return to those days.
 

juanrga



But AMD had problems only on Windows-based PCs. Microsoft released a pair of FX hotfixes for AMD hardware, but they didn't work. They supposedly improved the Windows 8 scheduler, but again it isn't working.

On the other hand, people running Linux have no problems with the CMT architecture of AMD CPUs/APUs. You don't need to download and install any CMT fix to get maximum performance from an FX-8350.

No surprise here: Microsoft is all about stagnant development. This is the same reason why AMD has just demoed its HSA server (the new Berlin APU) using a Linux system instead of Windows Server.
 


Because due to CMT, there's a design tradeoff no matter what:

If you use both cores of a Module, you can Turbo Boost more effectively, but you also incur a 20% performance hit on the second core.

So are you better off with the higher clocks, or with avoiding the 20% penalty?

The correct answer is: It depends. And because it depends on specific application workloads, there isn't much the scheduler can do about it.

Also, don't forget CMT had its issues at launch in Linux too; the only difference is Linux updates its kernel more regularly than Windows does.

As far as the Linux Scheduler goes (I'll limit this discussion to CFS on a single-user system for simplicity):
http://www.linuxjournal.com/magazine/completely-fair-scheduler

CFS basically attempts to ensure every task gets executed for the same amount of time. Great for latency, since it should be more or less constant. For the execution time of a single process, though, CFS is inferior to the Windows scheduler (or even the O(1) scheduler used in the 2.6 kernel), since you would have high-priority tasks (say, a game) constantly booted in favor of lower-priority tasks.

Compare that to the Windows scheduler. The Windows scheduler is effectively priority-based (just with a LOT of factors being used to determine priority dynamically), which ensures the highest-priority tasks (say, any foreground app) run for longer periods of time than lower-priority tasks (say, your AV program running in the background). However, when a task DOES get replaced by the scheduler, it is indeterminate when it actually gets to run again, which can cause occasional performance hiccups.

For a single, long-duration, high-priority task, Windows has the better scheduler. CFS is superior when you want multiple tasks to execute at the same time, however. They each have their own pros and cons. CFS would tend to do better on multi-core chips, though, since you can basically guarantee execution time for up to eight tasks at once without needing to worry about tasks jumping cores (unlike on Windows), so BD/PD would tend to do better on CFS as a result (for that environment anyway).

Understand: Software is 10% code, 90% design tradeoffs. Windows is optimized for doing one heavy workload at a time. Linux is optimized for doing many things at once.
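A toy illustration of the CFS principle described above, assuming equal weights for every task (this is nothing like the real kernel code, just the "always run whoever has had the least CPU time so far" idea):

```python
import heapq

def toy_cfs(tasks, slice_ms=10, total_ms=120):
    """Pick the task with the smallest accumulated runtime, run it for one
    slice, then put it back. Every task ends up with ~equal CPU time."""
    heap = [(0, name) for name in tasks]   # (vruntime_ms, task_name)
    heapq.heapify(heap)
    timeline = []
    for _ in range(total_ms // slice_ms):
        vruntime, name = heapq.heappop(heap)      # least-run task goes next
        timeline.append(name)
        heapq.heappush(heap, (vruntime + slice_ms, name))
    return timeline

# The "game" gets no more CPU time than the background tasks:
print(toy_cfs(["game", "antivirus", "indexer"]))
```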
 
As an aside, I just had a minor brainstorm:

All schedulers at their root operate on the same principle: No matter how high priority a task, at some point, the lower priority processes need a chance to run. This is based on the age old problem of having multiple tasks running on a single physical CPU.

You may have noticed, for the most part, we no longer have just single physical CPU's.

Which begs the question: Does preemption in all cases make sense anymore? Wouldn't it make sense, for multi-core systems, to allow the programmer to specify threads that can NEVER be preempted unless manually specified, for the purposes of maximizing performance?

Take your prototypical game that runs mainly on two threads, each at ~80% on a given CPU. Under the current schedulers, at some point the two main threads will be preempted by some other threads, costing you performance (latency or FPS). But if you have a quad-core CPU and no other high-priority/long-duration tasks running, does it make sense to preempt these two threads? Even on a tri-core system, this would be a tough sell. Heck, on a dual-core system, would it be worth it to keep one of the two threads constantly running?

Now, I am NOT talking about going back to cooperative threading (dear god no), but simply adding a flag at thread creation to never allow preemption of that thread. This would be a request which may or may not be honored by the OS depending on system config (how many cores, how many non-preemptable threads are running, etc.).

Huh...might be a good exercise to try out...

*goes off to create a new Linux scheduler*
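For what it's worth, you can get part of the way there on Linux today without writing a new scheduler, by pinning a thread to a spare core and giving it a real-time policy. A rough sketch (needs root or CAP_SYS_NICE, Linux-only, and the core is still not truly exempt from kernel threads and interrupts):

```python
import os

DEDICATED_CORE = 3   # assumption: a core we can afford to dedicate

# 1. Pin the calling process/thread to one core.
os.sched_setaffinity(0, {DEDICATED_CORE})

# 2. Give it the SCHED_FIFO real-time policy so ordinary SCHED_OTHER tasks
#    on that core can never preempt it (only higher-priority RT tasks can).
os.sched_setscheduler(0, os.SCHED_FIFO, os.sched_param(50))

# ... run the latency-critical game loop here ...
print("Pinned to core", DEDICATED_CORE, "with policy", os.sched_getscheduler(0))
```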
 


You will always have more threads than physical processors, so you will still need schedulers that have queuing rules for threads.

But you do have a point about how old the current paradigm is (at least for the NT kernel). On the other hand, the Linux kernel has always been multi-CPU friendly (to call it something), and Solaris even more so.

And isn't "thread priority" what you're describing?

Cheers!
 