Actually, it's not a bug. As most people here would tell you, especially imgod2u, who has said so many times, the x86 architecture simply isn't well suited to parallelism. The P4 already has low IPC, and a good portion of its execution resources goes unused because branch mispredictions, latency, and data dependencies cause pipeline bubbles (which are probably even more significant if the schedulers are weak). This is where the K7 shines: it has 9 units, 50% more than the P4, so it can extract a bit more parallelism, which is why the P4 so desperately needs HT's help. The K7 can most likely use 5 of its units on average, which is about 80% of the P4's total. If Intel says the P4's units run at 35% of capacity on average, you know the P4 has a lot of untapped potential. Of course, if you've thought about it as I did, K7+HT=Deadly weapon. I don't doubt it, and I'm sure that if AMD succeeds in the server market and pulls in some cash for R&D, they'd be doing the right thing by investing in their own multithreading support, as it would probably yield higher average performance boosts than HT does on the P4.
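To put rough numbers on that, here is a quick back-of-the-envelope sketch in Python. Nothing in it is measured. The P4 unit count of 6 is my assumption, implied by "9 units is 50% more", and I'm reading the 35% figure as average execution-unit utilization, which may not be exactly what Intel meant. The names are made up for illustration.

```python
# Back-of-the-envelope numbers from the post above; none of these are measured.
K7_UNITS = 9            # stated in the post
P4_UNITS = 6            # assumption: implied by "9 units is 50% more than the P4"
K7_BUSY_UNITS = 5       # the post's guess at average units in use on the K7
P4_UTILIZATION = 0.35   # assumption: the 35% figure read as average unit utilization


def summarize():
    """Return the ratios quoted in the post, computed from the constants above."""
    return {
        "k7_extra_units_pct": (K7_UNITS / P4_UNITS - 1) * 100,
        "k7_busy_vs_p4_total_pct": K7_BUSY_UNITS / P4_UNITS * 100,
        "k7_utilization_pct": K7_BUSY_UNITS / K7_UNITS * 100,
        "p4_busy_units": P4_UNITS * P4_UTILIZATION,
        "p4_idle_units": P4_UNITS * (1 - P4_UTILIZATION),
    }


if __name__ == "__main__":
    for name, value in summarize().items():
        print(f"{name}: {value:.1f}")
```

Under those assumptions the "about 80%" figure is closer to 83%, and the P4 would be keeping only about two of its units busy on average, which is the idle capacity HT is meant to fill.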
Indeed, it's not easy to explain to someone, and the explanation you gave would work. I'd put it more like this: it makes sure the processor runs more efficiently.
--
I guess I just see the world from a fisheye. -Eden