InvalidError :
bit_user :
Moot point, given that it hasn't got tensor cores. Okay, I'm assuming it doesn't have tensor cores, but I'm pretty sure she'd have mentioned them if it did.
A "tensor core" is just a fancy name for a fixed-function matrix multiply-accumulate unit. AMD could probably tweak the shader architecture to achieve comparable performance without dedicating a large chunk of die area to fixed-function math, albeit at the expense of power efficiency on tensor-heavy workloads.
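To be concrete, the op a tensor core hardwires is a small fused matrix multiply-accumulate - on V100 it's D = A*B + C over 4x4 tiles, fp16 inputs with fp32 accumulation. A shader can express the exact same math, just without the dedicated silicon (numpy sketch, tile size illustrative):

```python
import numpy as np

# One 4x4 tensor-core-style tile op: D = A @ B + C,
# fp16 operands, fp32 accumulate (V100-style; tile size is illustrative).
A = np.arange(16, dtype=np.float16).reshape(4, 4)
B = np.eye(4, dtype=np.float16)     # identity, so the product is easy to check
C = np.ones((4, 4), dtype=np.float32)

# Accumulate in fp32, as the tensor core does:
D = A.astype(np.float32) @ B.astype(np.float32) + C
```

Done in plain shader ALUs, that's the same 64 multiply-adds per tile - it's the dedicated datapath, not novel math, that buys the throughput.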
Vega already had packed fp16 math, and (as I implied) I've already seen enough of the LLVM patches for the new instructions to know that they won't significantly change its fp16 throughput.
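For context, "packed" fp16 just means two half-precision values share one 32-bit register lane, so each instruction does double duty. A quick illustration of the packing itself (little-endian layout assumed):

```python
import numpy as np

# Two fp16 values packed into a single 32-bit lane
# (little-endian layout assumed; 1.5 and -2.0 are exactly representable in fp16).
pair = np.array([1.5, -2.0], dtype=np.float16)
packed = pair.view(np.uint32)[0]          # one 32-bit "register" holding both halves

# Unpack: reinterpret the same 32 bits as two fp16 values.
unpacked = np.array([packed], dtype=np.uint32).view(np.float16)
```

One packed instruction operating on such a lane performs both fp16 ops at once - which is why Vega already had 2x fp16 rate, and why the new instructions don't move that needle.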
So, the only way it gets more than the stated 35% performance boost @ training is by some fixed-function hardware that wasn't mentioned - a pretty big deal to gloss over, but it's possible they're keeping that bit under wraps. Otherwise, the V100 will still be over 3x as fast.
As for inference, their new 8-bit instructions net them a mere 67 TOPS, compared with the V100's 110 TFLOPS. I also doubt its efficiency improved enough to sustain 67 TOPS at a mere 150 W, which is roughly what they'd have to hit to reach parity with the V100's efficiency. Plus, lots of fixed-function hardware targeting inference is coming to market (or is already in use, such as Google's TPUv2).
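That ~150 W figure is just proportional scaling of the headline numbers - assuming a 250 W board power for the V100 (the PCIe part; SXM2 is 300 W, which would make the bar even lower):

```python
# Back-of-envelope efficiency-parity check (250 W V100 board power assumed).
v100_ops = 110            # headline throughput quoted above
v100_watts = 250
amd_ops = 67

# Power at which 67 TOPS matches the V100's ops-per-watt:
parity_watts = amd_ops / (v100_ops / v100_watts)
print(round(parity_watts))   # → 152
```

So "about 150 W for 67 TOPS" is exactly the parity point - anything above that and it loses on efficiency as well as raw throughput.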
Interestingly, the new chip has packed 4-bit arithmetic, which we'll probably be hearing more about. However, 4 bits is so coarse that you'd probably have to compensate for the quantization noise by adding significantly more nodes to the layers using it.
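Rough intuition for why: int4 gives you only 16 representable levels, so the worst-case rounding error on any weight is half a quantization step - a big chunk of noise for the network to absorb. Toy symmetric quantizer (illustrative only, not whatever scheme AMD would actually ship):

```python
import numpy as np

# Toy symmetric 4-bit quantization of Gaussian "weights".
rng = np.random.default_rng(0)
w = rng.normal(size=1000).astype(np.float32)

scale = np.abs(w).max() / 7            # map the range onto int4 levels -7..7
q = np.clip(np.round(w / scale), -8, 7)  # at most 16 distinct codes
w_hat = q * scale                       # dequantized weights
err = np.abs(w - w_hat).max()           # worst-case error ≈ scale / 2
```

With weights spanning ±3 sigma or so, that per-weight error is on the order of 20% of a standard deviation - hence needing wider layers (or careful retraining) to claw the accuracy back.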