This struck me as a rather silly argument to make, in the context of gaming:
It’s also worth considering this on a purely theoretical basis. If you have eight cores, you could reduce your processing time to 12.5% if everything shared perfectly, saving you 87.5% compared to one core. But if you add another eight cores, that only takes you down to 6.25%, only saving you a tiny amount. In fact, the biggest saving comes from the first few cores you add, because there will always be work for them to do.
It's technically correct, but nobody is going to consider gaming on a single core, so using that as the baseline is ridiculous. Second, it's not like this is some render that could take either 10 minutes or 5 minutes, where you just have to decide whether saving the extra 5 minutes is worth the money. What we're talking about is up to 2x the throughput. So, if a game is CPU-bottlenecked, doubling the core count could mean up to a 2x framerate improvement.
That said, he's right that the benefit of each added core decreases as the core count grows, though that's more by virtue of the fact that scaling is always sub-linear, even in well-written software. To his credit, he acknowledges this in his brick-layer analogy.
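To put a rough number on that sub-linearity: Amdahl's law gives the speedup from n cores as S(n) = 1 / ((1 - p) + p/n), where p is the parallel fraction of the work. The p here is purely illustrative, not measured from any game: with p = 0.9, you get about 4.7x from 8 cores and 6.4x from 16, so the second set of eight cores adds less than half of what the first eight did, even with well-threaded code.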
BTW, the success of the i7-6700K on Project Cars suggests their load-balancing isn't great, since Skylake cores simply aren't that much faster than Broadwell, per clock.
Remember those huge pauses that plagued the i3? They’re ironed out when The Witcher 3 has four threads to work with.
Those actually suggest lock contention, or races involving lock-free data structures. Either way, I'd chalk it up to deficiencies in the software's design. That might also go some way towards explaining why enabling HT increased the average framerate quite so much.
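To illustrate what I mean by lock contention (just a toy sketch, not anything from The Witcher 3's actual code, which I obviously haven't seen): when every worker thread funnels through one coarse mutex, the work is serialised no matter how many cores you throw at it, and threads stall waiting their turn, which is exactly the kind of thing that shows up as frame-time spikes.

```cpp
#include <algorithm>
#include <chrono>
#include <cstdio>
#include <mutex>
#include <thread>
#include <vector>

std::mutex g_world_mutex;           // one coarse lock around shared world state
std::vector<int> g_world(1024, 0);  // shared state every worker mutates

// Each worker grabs the same mutex for every update, so the updates from
// all threads execute one at a time regardless of how many cores exist.
void worker(int updates) {
    for (int i = 0; i < updates; ++i) {
        std::lock_guard<std::mutex> lock(g_world_mutex);
        for (int& v : g_world) ++v;  // the "work", done entirely under the lock
    }
}

int main() {
    const int total_updates = 8000;  // fixed amount of work, split across threads
    for (int threads : {1, 2, 4, 8}) {
        std::fill(g_world.begin(), g_world.end(), 0);
        auto start = std::chrono::steady_clock::now();

        std::vector<std::thread> pool;
        for (int t = 0; t < threads; ++t)
            pool.emplace_back(worker, total_updates / threads);
        for (auto& t : pool) t.join();

        auto ms = std::chrono::duration_cast<std::chrono::milliseconds>(
                      std::chrono::steady_clock::now() - start)
                      .count();
        std::printf("%d threads: %lld ms\n", threads, static_cast<long long>(ms));
    }
    return 0;
}
```

The point of keeping the total work fixed is that wall-clock time barely improves as the thread count rises, and can even get worse from contention overhead; the usual fix is finer-grained locking or partitioning the data so threads rarely touch the same state.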