Question How can newer GPUs be faster with fewer cores?

Jan 5, 2020
Hello, this isn't about a specific issue; it's more of a general question for those of you who know more about this than I do.

I'm asking here because googling the same question does not give me an answer.

I'm not that into tech, but I'm getting there. As a novice PC gamer, I've been doing some very basic research on GPUs and what makes them faster and more efficient. As I came to understand the importance of shading units, memory bus width and bandwidth, clock speed, turbo speeds, floating-point performance, and so on, I started looking back at older GPUs, and the most noticeable thing is how newer GPUs are able to do so much more with so much less. Take, for example, the original GeForce GTX Titan, based on the Kepler architecture at 28nm, and the GTX 970, based on Maxwell and also at 28nm.

The Titan was a high-end GPU that blew everything else out of the water when it came out, but one year later it was easily made to look dumb by the 970, which was built on the same lithography but with a different architecture. What's more impressive is that the 970 sports only a fraction of the shading units, 1664 compared to the Titan's 2688. Not only that, but the memory bandwidth on the 970 is a little lower too.

How come the Titan is so inefficient with those 1,000+ extra cores compared to the GTX 970? Even with an architecture change, how can the 970 be faster with a thousand fewer cores? On paper, the Titan has 1.8 TFLOPS of extra GPU power compared to the 970. It still boggles my mind how this works. Is there an IPC type of thing similar to what we have with CPUs?
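For reference, here is a rough sketch of how I understand the on-paper FLOPS number is usually worked out: shading units × 2 operations per clock (one fused multiply-add) × clock speed. The function name `peak_tflops` is just something I made up, and the clock values are my assumptions (reference base clocks as far as I can tell), so the gap it prints may not match the figure I quoted above.

```python
def peak_tflops(shading_units: int, clock_mhz: float, flops_per_cycle: int = 2) -> float:
    """Theoretical FP32 peak: units x FLOPs per unit per cycle x clock."""
    return shading_units * flops_per_cycle * clock_mhz * 1e6 / 1e12

# Shading-unit counts are from the post above.
# Clocks are ASSUMED reference base clocks, not measured values.
titan = peak_tflops(2688, 837.0)
gtx970 = peak_tflops(1664, 1050.0)

print(f"Titan:   {titan:.2f} TFLOPS")
print(f"GTX 970: {gtx970:.2f} TFLOPS")
print(f"Gap:     {titan - gtx970:.2f} TFLOPS")
```

As far as I can tell, this paper number only counts how many operations the cores could do per second if every one of them were busy all the time. It says nothing about how well each architecture actually keeps its cores fed, which is the part I'm asking about.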

Thanks in advance. Have a good one.