tim.hotze :
The issue isn't cores per die - it's how the die interfaces with the rest of the system. The AM4 socket only supports dual-channel memory, and you'd physically need to change the pin layout of the socket to support more (which AMD isn't likely to do since it'd break compatibility).
Threadripper is more than just extra cores - it's also a different socket (TR4) with more pins and support for quad-channel memory. Search online for dual- vs. quad-channel benchmarks on TR4 - the extra channels make a significant difference in speed.
With support for just 2 channels of RAM, I don't think you can move data to/from memory quickly enough to keep 16 cores busy most of the time, so you'd likely see little/no performance improvement over, say, 12 cores at the same clock speed. That's not to say AMD wouldn't sell a 16-core product, just that it'd have little performance advantage over the 12-core, even for highly threaded work.
One thing this article leaves out is that the high-end 16-core chips are rumored to have 32 MB of L3 cache. That would go a long way toward addressing memory bandwidth problems.
A lesser-known issue plaguing the first two Ryzen releases is an underwhelming front end that can't keep the cores fully fed, so they end up underutilized quite regularly. Ryzen cores have more execution resources than Skylake cores, but because their front ends starve them, they deliver lower IPC. If AMD fixes the front end, that should be enough to recover most of the IPC that seems to be lacking.
If the engineers balance the chip right, and that's a big if, dual-channel memory support shouldn't be all that big of an issue, just as it isn't an issue now. Can you even say how much bandwidth you need per core? Probably not, especially as this changes with each piece of software you run on a CPU. Not all well-threaded software is written to keep the important bits in the relevant caches and avoid costly trips to system memory, and not all poorly threaded software will end up needing to go outside the bounds of the CPU.
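For a rough sense of what is available per core (not what any given workload needs), here is a back-of-the-envelope sketch in Python. It assumes DDR4-3200 and the standard 64-bit channel width; the memory speed is my assumption, not something from the article, and real sustained bandwidth will be lower than these theoretical peaks.

```python
# Theoretical peak DRAM bandwidth per core.
# Assumption: DDR4-3200 (3200 MT/s); real sustained bandwidth is lower.

BYTES_PER_TRANSFER = 8  # each DDR4 channel is 64 bits (8 bytes) wide


def peak_bandwidth_gbs(channels: int, mega_transfers_per_s: int = 3200) -> float:
    """Peak bandwidth in GB/s for the given number of memory channels."""
    return channels * mega_transfers_per_s * BYTES_PER_TRANSFER / 1000


def per_core_gbs(channels: int, cores: int, mega_transfers_per_s: int = 3200) -> float:
    """Peak bandwidth share per core if every core streams from memory at once."""
    return peak_bandwidth_gbs(channels, mega_transfers_per_s) / cores


if __name__ == "__main__":
    for label, channels, cores in [
        ("AM4 dual-channel, 12 cores", 2, 12),
        ("AM4 dual-channel, 16 cores", 2, 16),
        ("TR4 quad-channel, 16 cores", 4, 16),
    ]:
        print(f"{label}: {per_core_gbs(channels, cores):.2f} GB/s per core")
```

Under those assumptions, dual-channel works out to 51.2 GB/s total, so roughly 4.27 GB/s per core at 12 cores and 3.20 GB/s at 16, versus 6.40 GB/s per core for quad-channel at 16 cores. Whether 3.2 GB/s per core is a bottleneck depends entirely on how often the workload misses cache.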
There is still a benefit from the extra threads. Having such wide execution resources goes a long way toward letting the system run at much lower clocks while still handling massively inefficient software loads, such as the current bloat that is Windows 10 or any number of user systems that have been filled with junkware.
People forget, are ignorant, or maybe just prefer the shock and awe of hyperbolic statements, but just because a CPU is rated at 125 watts or higher doesn't mean it's running at that wattage all of the time.