The Celeron got a bad rap because it was, indeed, very often badly crippled at factory settings. It was crippled in two ways:
- much less L2 cache
- a slower FSB.
Historically, the Celeron first appeared as an L2 cache-less Pentium II/300 (at the time, L2 cache sat outside the core); there was no Celeron for the Pentium/586 and /MMX variants.
The Celeron 300A was a boon due to its VERY easy overclocking: the chip was tailored for 500 MHz, its multiplier was set at 4.5, and its integrated L2 cache, while 4 times smaller than its P2 counterpart's, was much faster. Run at 450 MHz (on a 100 MHz FSB instead of 66), it performed slightly better in some cases than a much more expensive P2/450. But it was an exception (the 333 had a much lower overclock success rate because it was much more borderline).
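To make the arithmetic behind that overclock explicit, here is a minimal sketch: the core clock is just the FSB clock times the multiplier, so keeping the 4.5 multiplier and moving from a 66 MHz to a 100 MHz bus gives 450 MHz. The function name and constants are made up for illustration, and I'm assuming the nominal "66 MHz" bus really runs at 66.67 MHz.

```python
def core_clock_mhz(fsb_mhz: float, multiplier: float) -> float:
    """Core clock = front-side bus clock x multiplier."""
    return fsb_mhz * multiplier


# Assumption: the "66 MHz" FSB is nominally 66.67 MHz (200/3).
FSB_66 = 200 / 3
FSB_100 = 100.0
MULTIPLIER_300A = 4.5  # multiplier quoted above for the Celeron 300A

stock = core_clock_mhz(FSB_66, MULTIPLIER_300A)         # factory settings
overclocked = core_clock_mhz(FSB_100, MULTIPLIER_300A)  # same chip, 100 MHz bus

gain = overclocked / stock - 1
print(f"{stock:.0f} MHz -> {overclocked:.0f} MHz (+{gain:.0%})")
# prints: 300 MHz -> 450 MHz (+50%)
```

The same formula shows why a 66 MHz FSB hurts: with the multiplier fixed, the bus clock is the only knob, and the jump from 66 to 100 MHz is a 50% overclock in one step.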
Later Celys (P3 cores with half of the L2 disabled) were usually stuck with a 66 MHz FSB on cores that didn't allow an easy 50% overclock and, even worse, at a time when the FSB was switching to 133 MHz (the laptop versions were decent performers thanks to their 100 MHz FSB). Their crap reputation started then. Moreover, AMD CPUs were cheaper and more efficient at the time: the Duron, for example, could have its multiplier unlocked and its FSB raised, and K7 cores cared less about a small L2 cache than about RAM latency, so feeding a high FSB and fast RAM to a Duron would get performance very close to that of an Athlon at the same clock.
When Intel switched to the P4, the remaining Celys were designed to upgrade older Socket 370 mobos. They were nothing more than relabelled P3s with a somewhat smaller L2 cache, but the new production process, decent L2 cache and 100 MHz FSB made them better overclockers, and the generally better-performing P3 architecture made them much better than much pricier, higher-clocked P4s (a 1.4 GHz Cely would trounce a 1.8 GHz P4).
For the P4 (which was a crapola architecture all the way), the Celys were even worse performers: the only things the P4 had going for it were its big L2 cache and memory throughput, and with both crippled on the P4 Celys, those chips were even worse than VIA's (they performed no better and sucked up more juice). Overclocking them usually just... failed: with the small L2 cache unable to keep the deep pipeline fed, extra MHz were usually like giving marmalade to pigs.