AMD CPU speculation... and expert conjecture



Not really. As DX is just the API layer that interacts with the graphics drivers, NVIDIA, AMD, and Qualcomm would have been involved in its development. As a result, AMD would have had full access to the API and the low-level design details during its development. All AMD would have to do is simply re-implement a subset of the API and release a driver, which they likely already had as part of the DX12 development process. Heck, it would have been trivial to do, really.
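
To make the "thin API layer over the driver" point concrete, here's a minimal, hypothetical sketch in Python (none of these class or method names are real DX or Mantle interfaces): the API just records work and hands it to whichever vendor driver sits underneath, which is why a vendor that already ships the driver half could expose its own API subset without much effort.

```python
# Minimal, hypothetical sketch of an API layer over vendor drivers.
# None of these names are real DX or Mantle interfaces.
from abc import ABC, abstractmethod

class VendorDriver(ABC):
    """What each IHV (AMD, NVIDIA, Qualcomm, ...) implements under the API."""
    @abstractmethod
    def create_command_list(self) -> list: ...
    @abstractmethod
    def submit(self, command_list: list) -> None: ...

class GraphicsAPI:
    """The thin API layer applications code against."""
    def __init__(self, driver: VendorDriver):
        self.driver = driver                  # all real work is delegated

    def draw(self, scene: str) -> None:
        cmds = self.driver.create_command_list()
        cmds.append(("draw", scene))          # record the call
        self.driver.submit(cmds)              # driver talks to the hardware

class FakeAMDDriver(VendorDriver):
    def create_command_list(self) -> list:
        return []
    def submit(self, command_list: list) -> None:
        print(f"GPU executes {len(command_list)} command(s)")

# A vendor that already ships the driver half can expose a different API
# (a Mantle-like subset) on top of the same entry points with little extra work.
GraphicsAPI(FakeAMDDriver()).draw("triangle")
```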

Let's look at the timeline here:

~4 Years ago: MSFT begins working on DX12
~3-4 Years ago: MSFT defines HW for XB1, mandates Full DX12 support.
~3 Years ago: First details on AMD GCN based GPUs released to public

Look at that timeline for a second. Let's assume MSFT mandated future DX12 support for the XB1 GPU. And look at the time frame where GCN was likely under development. If you take the ~4 year dev time for DX12 as true, then you also have to accept GCN started very shortly afterward. Coincidence? [I for one would be VERY interested to see if GCN supports the fixed-function pipelines that full DX12 support needs; I bet it does.]

~1 year ago: AMD declares there will be no DX12

It's worth noting that MSFT was non-committal, and NVIDIA was strangely silent. MSFT's stance makes sense; they don't want to tick off their partner by saying "you lie", and NVIDIA (like AMD) would have been under NDA until news of DX12 dropped. So no one could really stand up and correct AMD.

6 months ago: MANTLE

AMD releases Mantle, and the internet goes insane.

2 weeks ago: DX12

The internet claims MSFT "stole" Mantle, despite MSFT and NVIDIA claiming DX12 had been in the works for "over three years".

Timeline-wise, it's QUITE possible AMD not only borrowed parts of DX12 for Mantle (reimplementing a subset of the DX12 API with minor changes would be simple, given the AMD driver team would almost certainly have had a working DX12 driver internally), but also built the GCN arch around the proposed DX12 spec (kind of like they did with the 4000 series being built around the proposed DX10 spec, including the proposed tessellation engine that never made it in until DX11). Which had the (totally intended) side benefit of guaranteeing them the XB1 GPU.

So yes, timeline-wise, it works. Whether it's true or not is just conjecture, but the timeline works.
 

colinp

Honorable
Jun 27, 2012


That's a lovely story.
 
@gamerk: What you said seems plausible and the timeline kinda sorta fits. What doesn't fit is the boldness from AMD. AMD has been playing it safe since 2011; what would corner them so much that they'd pull off a stunt like this now? IMO everything AMD tried to do with the Mantle hype got mostly undone by the mining craze. AMD did make money, but it was short-term, unlike what the Mantle hype would have gained them. And the way developers and AMD are handling Mantle's software support is simply subpar and negligent.
 

Cazalan

Distinguished
Sep 4, 2011

Well, you keep ignoring that there is an NVLink 1.0 and an NVLink 2.0. So sure, the NVLink available in 2018+ will be much faster than PCIe from 2010. !SHOCKING REVELATION! :sarcastic:

But the first product, Pascal, coming out in late 2016 will be NVLink 1.0, which is a mere 20GT/s. So yes, two and a half years from now, NVLink 1.0 will be 25% faster than PCIe 4.0.

If you're willing to wait half a decade, then NVLink 2.0 will be 215% faster. Kudos! And I'm sure the whole industry will just sit still and let that happen for NVidia. ;)
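
For what it's worth, those percentages fall straight out of the per-lane signaling rates being quoted in this thread, taking NVLink 1.0 = 20 GT/s, NVLink 2.0 = 50 GT/s, and PCIe 4.0 = 16 GT/s at face value:

```python
# Sanity check on the percentages, using the per-lane rates quoted in this
# thread (NVLink 1.0 = 20 GT/s, NVLink 2.0 = 50 GT/s, PCIe 4.0 = 16 GT/s).
rates = {"PCIe 4.0": 16, "NVLink 1.0": 20, "NVLink 2.0": 50}   # GT/s per lane

baseline = rates["PCIe 4.0"]
for name in ("NVLink 1.0", "NVLink 2.0"):
    print(f"{name}: {(rates[name] / baseline - 1) * 100:.1f}% faster per lane")
# -> NVLink 1.0: 25.0% faster, NVLink 2.0: 212.5% faster (the ~215% above)
```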


Oh, now you finally use the word "future", yet PCIe 4.0 will be out well before Pascal. If you think 25% faster is much, much faster, then fine.

I don't have a 2018 crystal ball. Roadmaps change and slip constantly, just like NVidia's recent flip-flop in both of their product lines.


I'm still waiting for the source that IBM bought NVLink. It's a consortium. They openly collaborated on it.

When technologies are purchased you see headlines like:
"Intel acquires HPC interconnect assets from Cray for $140 million"
"Intel buys QLogic's InfiniBand assets for $125 million"
"AMD licenses Cyclos resonant clock mesh"


Of course IBM knows of PCIe and InfiniBand. IBM's primary CPU interface, CAPI (Coherent Accelerator Processor Interface), runs over PCIe 3.0. There are 12 IBM products listed on Mellanox's website, which makes for lots of InfiniBand products. And they just announced their 100Gb/s product.

Anyway, you do realize how bad proprietary interfaces are for the industry, right? It would be 10x better FOR US CONSUMERS if they just collaborated on PCIe 5.0 or 6.0. Mezzanine connectors have been around since the birth of PCI (yes, the one before PCIe). The reason they aren't used as often is that they're more expensive and easier to damage than the typical PCI slot.
 


Not surprised, since even Intel has had yield issues before and will again. Shrinking the process tech is no walk in the park, especially when we are nanometering closer to the point where the materials we use will no longer work properly.



ASICs are already out for Litecoin. A friend has a few that are running 360KH/s and use way less power than his R9 290s do. As well, those same ones will be updated to push out 16MH/s, which is about 16x faster than an R9 290X can even push out, and soon a 100MH/s one will hit.

The problem is that there are other crypto-coins out there, and if one catches fire, its price jumps up like Litecoin's did, and it also works well on GPUs, we will be in the same boat again.

On the bright side, the Asus R9 290X DCII is priced pretty decently compared to the rest.



Not sure if I believe it. It may be for a server-based version, but I doubt it will be a desktop-based version. DDR4 boards are going to be pretty expensive to start with, and that would kill the price/performance/platform angle AMD has been working, especially if the CPU is not that powerful.

As well, the speed will be nearly pointless for us. I don't even know if games will benefit. They should, as that speed should push out 51.2GB/s, much like Intel's current quad-channel DDR3 does.
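
For reference, the 51.2GB/s figure checks out with simple peak-bandwidth math, assuming dual-channel DDR4-3200 on the AMD side and quad-channel DDR3-1600 for the Intel comparison (both are my assumptions about the configurations being discussed):

```python
# Peak DRAM bandwidth = channels * bus width (bytes) * transfer rate (MT/s).
# Assuming dual-channel DDR4-3200 here and quad-channel DDR3-1600 for the
# Intel comparison -- both assumptions, not quoted specs.
def peak_bw_gb_s(channels: int, mt_per_s: int, bus_bytes: int = 8) -> float:
    return channels * bus_bytes * mt_per_s / 1000          # GB/s

print(peak_bw_gb_s(2, 3200))   # dual-channel DDR4-3200 -> 51.2 GB/s
print(peak_bw_gb_s(4, 1600))   # quad-channel DDR3-1600 -> 51.2 GB/s
```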



I can see AMD doing that. Especially since their GPUs perform better when there is less CPU overhead, and their CPUs tend to bottleneck their GPUs before Intel's do. This is a free performance boost for their platform ahead of NVidia, and a way to sell their GPUs.

What will be sad, though, is that once DX12 is out the advantage will be gone, as it will work for every GPU, not just theirs.
 
ASICs are already out for Litecoin. A friend has a few that are running 360KH/s and use way less power than his R9 290s do. As well, those same ones will be updated to push out 16MH/s, which is about 16x faster than an R9 290X can even push out, and soon a 100MH/s one will hit.

I don't think they'll be that common or cheap. Making a chip that can do the computation isn't hard; it's keeping it fed with a high-speed memory bus that forces the costs up. The LTC algorithm was designed specifically to require large amounts of memory bandwidth as a way to prevent cheap custom chips from doing the computation entirely inside local registers.
 
Are they actually ASICs or pre-programmed FPGAs? How can these mining companies turn out silicon faster than AMD?

They aren't. AMD is making a general-purpose multi-core scalar processor that can combine units to function as a vector processor. Their GPUs must do a wide variety of functions. Coin mining, on the other hand, is usually just doing a single computation a bazillion times on different sets of data. It's trivial to program a chip to do a single function without any special control logic, and that is what the miners have been doing. BTC is just SHA-256 hashes done over and over again; its weakness is that all the data required for the hash can fit inside the CPU registers, so you can program a chip to just continuously do those hashes while feeding different data to the registers. LTC, on the other hand, uses a scrypt-based algorithm that requires large amounts of memory bandwidth.

Now, you can build a custom chip and give it a huge pool of fast, expensive memory, which is exactly what the current batch of scrypt ASICs are doing. It's gambling on the price of LTC skyrocketing in a year so that you can make a profit, since the machines aren't cheap to build.
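
A toy sketch of the difference being described, using Litecoin's actual scrypt parameters (N=1024, r=1, p=1); the header bytes and the nonce sweep are made up for illustration:

```python
# Toy version of the BTC-vs-LTC point above: repeated double SHA-256 fits in
# registers, while scrypt (Litecoin uses N=1024, r=1, p=1) needs a ~128 KB
# scratchpad per hash. The header bytes below are made up for illustration.
import hashlib

HEADER = b"example-block-header"

def btc_style_hash(nonce: int) -> bytes:
    data = HEADER + nonce.to_bytes(4, "little")
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()   # double SHA-256

def ltc_style_hash(nonce: int) -> bytes:
    data = HEADER + nonce.to_bytes(4, "little")
    # scrypt touches roughly 128 * N * r bytes of memory per call, which is
    # what keeps register-only custom chips from doing this cheaply.
    return hashlib.scrypt(data, salt=data, n=1024, r=1, p=1, dklen=32)

# "Mining" is just sweeping the nonce and keeping hashes under a target.
for nonce in range(3):
    print(nonce, btc_style_hash(nonce).hex()[:16], ltc_style_hash(nonce).hex()[:16])
```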
 
http://www.tomshardware.com/news/nvidia-maxwell-20nm-delay,26419.html
20nm Maxwell is possibly delayed till next year. I wonder if Apple fab-blocked AMD too, since AMD also depends on TSMC for GPUs.

http://www.techpowerup.com/199452/amd-amends-wafer-supply-agreement-with-globalfoundries.html
Under this amendment AMD expects to pay GLOBALFOUNDRIES approximately $1.2 billion in 2014. These purchases contemplate AMD's current PC market expectations and the manufacturing of certain Graphics Processor Units (GPUs) and semi-custom game console products at GLOBALFOUNDRIES in 2014. The 2014 amendment does not impact AMD's 2014 financial goals including gross margin.
GPUs too? :eek: If GloFo can give AMD access to its 20nm node before TSMC for GPUs (despite GloFo's well-known track record of failures)... :whistle:

AMD’s Jaguar Microarchitecture
http://www.realworldtech.com/jaguar/
Edit: If PD uses Jaguar's branch predictor (among other tweaks)... doesn't that mean PD is a standalone CPU, unlike the minor upgrade (over BD) it seemed previously?
 

juanrga

Distinguished
BANNED
Mar 19, 2013


Yes, let us see the timeline.

- Developers meet Microsoft and solicit advances in DX. Microsoft declines.

- Microsoft plans to abandon the traditional PC game market and moves towards tablets with Windows 8, without a Start menu, and other nuisances.

- Microsoft also plans to abandon traditional console gaming and releases the Xbox1, which is presented more as an all-in-one home gadget than a true game console.

- Some game developers announce Windows gaming is dead and migrate to Linux.

- Other game developers contact AMD to develop MANTLE with requirements rejected by Microsoft.

- AMD announces Microsoft has no plans for DX12.

- Microsoft replies but doesn't announce DX12.

- Xbox1 launches and is destroyed in both marketing and sales by the PS4.

- Xbox1 uses DX11.x and lacks some low-level optimizations available on PS4.

- AMD presents MANTLE, developed in collaboration with game developers.

- All major players adopt MANTLE. The last one was Crytek.

- Microsoft and Nvidia get nervous.

- Microsoft contacts AMD and licenses MANTLE.

- Microsoft presents DX12.

- DX12 is introduced on the Xbox1. Microsoft expects a 20% improvement in console performance, which will close the gap with the PS4 and help with sales.

- Nvidia claims that DX12 was developed over three or four years and that everyone was informed about it.

- Game developers asked about DX12 admit that they didn't know of its existence even at MANTLE's launch!

- Techreport confirms that Microsoft's interest in DX12 is "recent" (post-MANTLE).

- SA confirms that DX12 is basically MANTLE.

- MANTLE is available today. DX12 will be released later because it is adapted/copied from MANTLE.

- The same people who hated MANTLE now praise DX12 for doing exactly the same things as MANTLE. E.g. low-level access was evil for MANTLE but is a fantastic feature for DX12.

- Conspiracy theories about AMD stealing DX12 from Microsoft are started by the same AMD/MANTLE haters.
 

juanrga

Distinguished
BANNED
Mar 19, 2013


Pascal will have 2.5x more bandwidth than the PCIe 4 slots in your future mobo. The key, however, is that NVLINK can be scaled beyond PCIe limits, as explained before.

A link was already given explaining who designed NVLINK. Hint: "NV".

As explained before, NVLINK has been developed to break PCIe limits. Thus no hypothetical PCIe 5.0 or 6.0 can provide the same bandwidth and features.

NVLINK will bring competition to the HPC market. That is good.
 


To be fair, no one really expected the mining craze; I can't blame AMD for that. As for dev software support, as I said from the beginning: devs aren't going to bend over backward to support TWO different graphics renderers in their games, hence why you see Mantle getting tacked on after the fact, and the gains are less than what many here were expecting (though well in line with my predictions).

But from EA's perspective, I think they're quite happy with the couple million they pocketed from AMD to support Mantle.
 
It is safe to say that Excavator will only be around by 2015, with some luck in Q1, unless AMD is waiting on GF/TSMC to reach yields at 22nm or below. If Carrizo is the lineup for 2014, it appears to be another intervening step between evolutions (i.e., like Richland), so maybe there is hope that we will get a die shrink.

For me to be happy with an Excavator APU, I would be looking at 15% performance gains at lower clocks than Kaveri: say 10-15% faster x86 performance at 3.3GHz and a TDP around 90W, plus GPU gains of 20-40%, mostly based on better power management, maybe a turbo. But it would need an IMC improvement of around 10-20%. Possible.
 


Honestly, I don't see GPUs getting 40% gains anymore simply due to die shrinks. They are going to run into the same issues CPUs are running headlong into. We are fast approaching the limit of what die shrinks can attain, so until we start to rethink CPU/GPU design and communication, we're going to hit our maximum performance in a few years.

CPU side? 10% IPC gains at the same clock, based on nothing but past results. The GPU side depends on a few factors, but since I don't see the memory bottleneck going away anytime soon, I figure that limits AMD to 20% gains at most.
 

juanrga

Distinguished
BANNED
Mar 19, 2013


Excavator's maximum TDP will be 65W (according to roadmaps). Integer performance is still not known, but floating-point performance would be about 60-80% better than Steamroller's at the same clocks.
 

juanrga

Distinguished
BANNED
Mar 19, 2013


Your predictions agree very well with reality when we ignore the huge discrepancy between the two; otherwise they look like your "nobody will use mantel (sic)" prediction.
 


I call it as I see it. And like it or not, I'm one of the more consistent people on Tom's when it comes to predictions on hardware performance.

Which is sad, given my background is software, not hardware.
 
In other news: Mantle BF4 Frame Pacing, yay!

http://www.pcper.com/reviews/Graphics-Cards/Frame-Rating-Battlefield-4-Mantle-CrossFire-Early-Performance-FCAT/High-End-C

[Charts: BF4_1920x1080_OFPS_0.png, BF4_1920x1080_PLOT_0.png]


What is especially worth noting is the consistency of frame times for Mantle, but it is achieving that at the cost of absolute FPS (observed and real). The same behavior occurs at 1440, even if the two results are now almost identical due to other bottlenecks.

Now for AMD:

[Charts: BF4_1920x1080_OFPS_1.png, BF4_1920x1080_PLOT_1.png]


Things flip here: Mantle CFX becomes the fastest config, but CFX DX11 becomes the SLOWEST, even slower than a single card running DX11. Things look better at 1440, as CFX DX11 comes in second, but it is closer to the single-card solutions than to CFX Mantle. Likely the same issue as at 1080. Something is clearly mucked up with the driver here.

Even more interesting are the AMD vs Intel result sets:

[Chart: BF4_1920x1080_OFPS_2.png]


Intel is faster with DX, and AMD is faster with Mantle, which should NOT be the case; both CPUs should benefit to some degree.

I really don't know what to make of these results; they don't make sense on multiple fronts. But combine that with AMD's very poor DX11 results in Thief, and we now have TWO games that use Mantle where it looks like AMD's D3D driver is performing worse than it should.

And before someone says "there's the anti-AMD bias again", answer me why AMD's DX11 performance SUCKS in Thief, or why BF4 DX11 CFX results look so poor.
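
To put the frame-time-consistency vs absolute-FPS trade-off into numbers, here is a rough sketch; the two traces below are invented, not taken from the pcper data:

```python
# Invented frame-time traces (ms) to show why "consistent but slower" can read
# worse in average FPS yet better in worst-case frame times.
from statistics import mean

dx11_ms   = [11, 11, 11, 30, 11, 11, 11, 30, 11, 11]   # higher avg FPS, spiky
mantle_ms = [16, 15, 16, 16, 15, 16, 16, 15, 16, 16]   # lower avg FPS, consistent

def summarize(name: str, frame_times_ms: list) -> None:
    avg_fps = 1000 / mean(frame_times_ms)
    print(f"{name}: avg {avg_fps:.1f} FPS, worst frame {max(frame_times_ms)} ms")

summarize("DX11-like  ", dx11_ms)     # ~67.6 FPS, but 30 ms spikes
summarize("Mantle-like", mantle_ms)   # ~63.7 FPS, 16 ms worst case
```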
 

Cazalan

Distinguished
Sep 4, 2011

Why is there a limit to PCIe if they wish to continue it? At the physical level (I/O) they're exactly the same, which is why NVLink can plug right into PCIe slots.

PCIe 4.0 is 16GT/s
PCIe 5.0 would be 32GT/s
PCIe 6.0 would be 64GT/s

Which means NVLink 2.0 (50GT/s) sits between PCIe 5 and 6.
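
Taking the per-lane rates listed above at face value, the placement is easy to check:

```python
# Where 50 GT/s lands, using the per-lane rates listed above at face value.
pcie = {"4.0": 16, "5.0": 32, "6.0": 64}   # GT/s per lane
nvlink2 = 50

for gen, rate in pcie.items():
    print(f"NVLink 2.0 is {nvlink2 / rate:.2f}x the PCIe {gen} per-lane rate")
# -> 3.12x, 1.56x, 0.78x: between PCIe 5.0 and 6.0, as stated.
```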

You say it's bringing competition, but it's really just more exclusivity, like PhysX. I doubt Intel or AMD will be putting NVLink ports in any processors. And I doubt anyone besides NVidia will be making NVLink GPUs.
 

jdwii

Splendid


Yeah no
Edit
Yeah, I don't expect any improvements on the CPU side, except maybe what juan is saying, but of course I don't agree with his figure unless we use synthetic benchmarks. Gaming might actually see a boost, since I'm pretty sure that's why AMD CPUs suck so much compared to Intel in gaming, in performance per clock or core to core. The shared FPU will always cripple their design; that's why Jaguar comes so close to the PD design in IPC. As for the GPU, I have no idea; I don't think they talked about it, but I don't expect much, except maybe a quad-channel option for the RAM, and even that seems unlikely to me.
 

blackkstar

Honorable
Sep 30, 2012
Falling for the Nvidia "let's compare our future products to the competition's current ones and then imply that the competition will never come out with anything else" marketing tactic.

What year is this? Is it still 2014? It feels like I am in the Groundhog Day movie, except every time the thing I'm stuck in changes. It goes from Tegra, Tegra 2, Tegra 3, Tegra 4, Tegra K1, Maxwell, NVLink, etc. The ride never ends!

"ITS OVER MANTEL IS FINISHED DX15 ON VOLTA WILL DESTROY THE APU IN XBONE AMD IS DEAD!!!!"

I'm just so thankful Nvidia is saving us from proprietary "Mantel" technology by bringing us NVLink! Nvidia knows how to save computing from the evil clutches of proprietary AMD. And it's even better that Microsoft is helping to save us from vendor lock-in from "Mantel" by ensuring that everyone stays on Windows forever. I might have to learn something new besides Visual Studio and DX, and since I can't learn that (it's over my head!), no one else will be able to learn it either.

I just pray for the day where we stop getting more CPU cores and Microsoft and Nvidia save us all from proprietary hardware and software! It will be glorious! Praise Microsoft and Nvidia! DEATH TO MANTEL!
 
On the GPU side I mentioned 20-40%, which is more or less what AMD achieved from Richland to Kaveri; maybe, if it's DDR4, it is possible to see higher with some bandwidth opening up.

On the x86 side, Kaveri at lower clocks was either better than or more or less the same as Richland clocked higher, so 10% doesn't seem like that much of an ask.

Gaming-wise, maybe 10-15% combined performance over Kaveri; depending on the game, that may be a lot or a little.
 


People, single cards don't even max out PCI-E 1.1 x16 yet; why are we talking about PCI-E bandwidth like it's a massive limiting factor here?
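
For reference, here is the rough per-direction bandwidth of an x16 slot by PCIe generation (spec-level encoding math only; whether a given GPU actually saturates a slot is the claim above, not something this calculates):

```python
# Rough per-direction bandwidth of a full x16 slot by PCIe generation.
# Encoding overhead (8b/10b for gens 1-2, 128b/130b from gen 3 on) is from
# the spec; GPU utilization of that bandwidth is a separate question.
GENS = {            # generation: (GT/s per lane, encoding efficiency)
    "1.1": (2.5, 8 / 10),
    "2.0": (5.0, 8 / 10),
    "3.0": (8.0, 128 / 130),
    "4.0": (16.0, 128 / 130),
}

for gen, (gt_s, eff) in GENS.items():
    gb_s = gt_s * eff / 8 * 16          # bits -> bytes, times 16 lanes
    print(f"PCIe {gen} x16: {gb_s:.1f} GB/s per direction")
# -> 4.0, 8.0, 15.8, 31.5 GB/s
```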

The real savings to NVLINK are power related, not performance related.
 