"AMD made mistakes with Bulldozer, but the biggest one was actually that they expected software to utilize more cores a decade before it really happened (so more like 2020 is when cores will be properly utilized)."
Even then, though, full utilization still only really happens in ideal core-saturating workloads such as an H.265 encode, and rarely in everyday use; you would think chip engineers and executives could have figured that out if we laypeople all could.
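A quick back-of-envelope shows why: under Amdahl's law, even a modest serial fraction caps multi-core scaling hard. A minimal sketch, where the serial fractions are illustrative guesses rather than measurements:

```python
# Amdahl's law: speedup(n) = 1 / (serial + (1 - serial) / n)
# The serial fractions below are hypothetical, chosen to contrast
# an embarrassingly parallel encode with a typical desktop app.
def speedup(serial_fraction: float, cores: int) -> float:
    return 1.0 / (serial_fraction + (1.0 - serial_fraction) / cores)

for label, serial in [("video encode (~5% serial)", 0.05),
                      ("typical desktop app (~50% serial)", 0.50)]:
    gains = ", ".join(f"{n} cores: {speedup(serial, n):.1f}x" for n in (2, 4, 8))
    print(f"{label}: {gains}")
```

With ~5% serial work, 8 cores get you close to 6x; with ~50% serial, 8 cores barely manage 1.8x, which is why piling on cores rarely paid off outside encodes and renders.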
"Intel's P4 and Netburst was a BAD architecture because of critical flaws with it's queue and branch prediction, at least that's what I remember. Implementing HT actually helped a lot there because it alleviated their backed up queue problems. Now HT is worse when cores are properly loaded...and I think they have much shorter queues."
Well, that and its pipeline depth, which they *thought* was going to let them hit high GHz before they ran into that brick wall with the furnace known as Prescott. In simplest terms, they were chasing clock speed instead of IPC, which is roughly the trouble AMD got into with BD/PD: higher physical core count at the expense of single-thread performance. And when they pushed the clock speed to offset Intel's IPC/single-thread advantage, we got the atrocity that is my 9590, a 220 W monster that got beaten in most things by my roughly 80 W 3770K.
Edit: That should read my "old 3770K," as I now have a 4790K in my Intel boxes.
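For anyone curious why pipeline depth hurt so much: every mispredicted branch flushes the pipeline, so a deeper pipeline pays a bigger penalty per miss and needs a higher clock just to break even. A rough model (the stage counts are ballpark public figures; the base IPC, branch frequency, and miss rate are illustrative assumptions):

```python
# Crude effective-IPC model: amortize the flush penalty of mispredicted
# branches over all instructions. Real pipelines are far messier.
def effective_ipc(base_ipc: float, branch_frac: float,
                  mispredict_rate: float, pipeline_depth: int) -> float:
    # CPI = base CPI + (branches/instr) * (misses/branch) * (flush cycles/miss)
    cpi = 1.0 / base_ipc + branch_frac * mispredict_rate * pipeline_depth
    return 1.0 / cpi

for name, depth in [("P6-style (~12 stages)", 12),
                    ("NetBurst Prescott (~31 stages)", 31)]:
    ipc = effective_ipc(base_ipc=1.0, branch_frac=0.2,
                        mispredict_rate=0.05, pipeline_depth=depth)
    print(f"{name}: effective IPC ~{ipc:.2f}")
```

Under those made-up rates the deeper pipeline loses roughly 15% of its IPC to flushes alone, which is the gap NetBurst was supposed to paper over with raw clock speed until Prescott's heat made that impossible.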