Welcome to the Volta & Turing MegaThread! I know it is strange to mash two different architectures into one thread, but given how similar the Volta and Turing GPUs are, plus the fact that Turing doesn’t succeed Volta, it’s much easier to discuss both architectures in one place.
VOLTA:
The Volta architecture was designed around the need for a cheaper and simpler way to power AI/Deep Learning compute tasks. Traditional deep learning workloads require large server arrays that cost hundreds of thousands of dollars to buy and tens of thousands of dollars to power each month.
Running these resource-hungry AI algorithms on regular CPUs and GPUs isn’t the most effective way of processing AI, since normal processing units are designed to handle many different kinds of tasks and not just one specific workload. This is where Tensor cores come into play: these cores are designed specifically for deep learning and machine learning compute tasks. The result is a core that runs FAR more efficiently and performs better than traditional CPUs (in AI computing). Now a server with several Volta GPUs can process just as much data as a whole line of servers reaching from wall to wall in a large warehouse.
However, Volta didn’t exactly take off as many expected. Outside of the Tesla V100 data center accelerator, there are only two variants of the Volta core: a Quadro variant (the Quadro GV100) and a prosumer variant, the Titan V.
Titan V:
Price: (Around) $3000
CUDA Cores: 5120
Streaming Multiprocessors (SMs): 80
Texture Units: 320
Base Clock: 1200 MHz
Boost Clock: 1455 MHz
Memory Clock & Data Rate: 850 MHz & 1.7 Gbps per pin
Memory Bus: 3072-bit
VRAM: 12GB HBM2
TURING:
While Volta may have been focused almost exclusively on machine learning, Nvidia’s latest architecture, Turing, is focused back on the pure graphical performance of a GPU, this time in the form of Real Time Ray Tracing.
Turing is a special architecture compared to all (known) previous Nvidia architectures: it has been in the making for over 10 years (yes, that means Nvidia has been developing Turing since the days of Fermi). You might be asking, is Ray Tracing that important? Nvidia seems to think so.
Without going too deep, Ray Tracing is incredibly intensive to run on normal GPUs since it simulates the paths of actual light rays. It can take a day, several days, or even a WEEK to generate ONE ray traced image. This is why the movie industry has been the only place to take advantage of ray tracing, since they can afford to wait that long for a single ray traced image.
Turing on the other hand can generate Ray Traced images in REAL TIME. Through Nvidia’s new RT core and optimizations for Ray Tracing, real time ray tracing can now be done.
This is why we are now seeing Nvidia pushing ray tracing into video games.
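To give a feel for what "simulating light rays" involves, here is a minimal Python sketch of the most basic building block: checking whether one ray hits one sphere. The function names and the little scene are made up for illustration. This is not how Nvidia's RT cores work internally, just the kind of math that has to be repeated for every single ray.

```python
import math

def ray_sphere_hit(origin, direction, center, radius):
    """Distance along the ray to the nearest hit, or None on a miss.

    Assumes `direction` is a unit vector and the ray starts outside the sphere.
    """
    oc = tuple(o - c for o, c in zip(origin, center))
    # Quadratic t^2 + b*t + c = 0 for the points where the ray meets the sphere.
    b = 2.0 * sum(d * x for d, x in zip(direction, oc))
    c = sum(x * x for x in oc) - radius * radius
    disc = b * b - 4.0 * c
    if disc < 0:
        return None  # the ray passes beside the sphere
    t = (-b - math.sqrt(disc)) / 2.0
    return t if t > 0 else None  # ignore hits behind the ray origin

def primary_rays_per_second(width, height, fps, rays_per_pixel=1):
    """Camera rays needed per second, before any bounces or shadow rays."""
    return width * height * fps * rays_per_pixel

# Hypothetical scene: camera at the origin, unit sphere 3 units down the -z axis.
print(ray_sphere_hit((0, 0, 0), (0, 0, -1), (0, 0, -3), 1.0))  # hit
print(ray_sphere_hit((0, 0, 0), (0, 1, 0), (0, 0, -3), 1.0))   # miss
print(primary_rays_per_second(1920, 1080, 60))
```

Even at just one ray per pixel, 1080p at 60 fps works out to over 124 million camera rays every second, and each of those has to be tested against the scene before any reflections, shadows or bounces are added on top.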
Of course ray tracing isn’t the only thing Turing and the RTX gaming cards are good at. These cards also have a big bump in CUDA cores compared to their Pascal predecessors and include an unknown number of Tensor cores (most likely for Nvidia’s new deep learning anti-aliasing tech).
So for the first time ever in a GPU, we have three different cores designed for three separate functions, but all somehow work towards one goal. Pretty interesting stuff.
For now Nvidia has launched three RTX gaming cards (the RTX 2080 Ti, RTX 2080 and RTX 2070) and three more Turing cards in the Quadro family for the enterprise space.
Note: Turing GPUs list support for GPU Boost 4.0 in their spec sheets, something Nvidia hasn’t shed light on (yet).
RTX 2080 Ti:
Price: Around $1150
CUDA cores: 4352
Base Clock: 1350 MHz
Boost Clock: 1545 MHz (1635 MHz OC on FE)
VRAM: 11GB GDDR6
Memory Speed: 14Gbps
Memory Bus: 352-bit
Memory Bandwidth: 616GB/s
TDP: 250-260W
SPCs: Dual 8 pin connectors.
RTX 2080:
Price: Around $750
CUDA cores: 2944
Base Clock: 1515 MHz
Boost Clock: 1710 MHz (1800 MHz OC on FE)
VRAM: 8GB GDDR6
Memory Speed: 14Gbps
Memory Bus: 256-bit
Memory Bandwidth: 448GB/s
TDP: 215-225W
SPCs: 8 pin + 6 pin connectors (FE only, AIBs can use different configurations)
RTX 2070:
Price: Around $550
CUDA cores: 2304
Base Clock: 1410 MHz
Boost Clock: 1620 MHz (1710 MHz OC on FE)
VRAM: 8GB GDDR6
Memory Speed: 14Gbps
Memory Bus: 256-bit
Memory Bandwidth: 448GB/s
TDP: 175-185W
SPCs: 8 pin connector (FE only, AIBs can use different configurations)
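
If you want to sanity check the memory bandwidth numbers in the spec lists above, they follow from the memory speed and bus width. Here is a quick Python sketch, assuming the usual rule of thumb (per-pin data rate times bus width, divided by 8 bits per byte) and treating the Titan V's 1.7 Gbps figure as a per-pin rate. The function name is made up for this example.

```python
def peak_bandwidth_gb_s(data_rate_gbps, bus_width_bits):
    """Theoretical peak bandwidth in GB/s: per-pin rate * pins / 8 bits per byte."""
    return data_rate_gbps * bus_width_bits / 8

# (memory speed in Gbps per pin, bus width in bits), taken from the lists in this post.
cards = {
    "Titan V": (1.7, 3072),
    "RTX 2080 Ti": (14.0, 352),
    "RTX 2080": (14.0, 256),
    "RTX 2070": (14.0, 256),
}

for name, (rate, bus) in cards.items():
    print(f"{name}: {peak_bandwidth_gb_s(rate, bus):.1f} GB/s")
```

For the three RTX cards this reproduces the 616GB/s and 448GB/s figures listed above, and for the Titan V it comes out to roughly 653 GB/s. These are theoretical peaks, so real-world throughput will be lower.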