[citation][nom]warmon6[/nom]Although the better question is, is the FPU's in each core strong or are the weak? That'll make more of a difference.[/citation]
This is something I was considering as well. As impressive as "100-core" sounds to a lot of people, cramming that many "cores" onto a 40 nm chip means the transistor budget per core has to suffer. Ignoring cache size (which also impacts performance), each core could be at best as complex as a .35 µm (350 nm) CPU, and potentially even simpler. For reference, .35 µm in x86 was what gave us the late-cycle Pentium MMX and Pentium Pro, as well as the early Pentium IIs.
This very, VERY heavily suggests that each core will not, in fact, have a vector unit; the first x86 CPUs to have proper, un-gimped 4x32 FP support were the 180 nm Coppermine Pentium IIIs, fully four times as complex.
That would put the CPU at a disadvantage: its theoretical peak FP performance would be equivalent to that of a mere 25-core CPU with 4x32 vector support. I'm also wagering that per-clock performance will be worse, since the CPU has a good chance of sporting less cache per core. Lastly, I question its competitiveness on economies of scale: I'm not saying the CPU is doomed, but I highly doubt it'll reach the volumes we see with Opteron and Xeon processors, so prices will likely be rather high. In the end, the performance-per-dollar may very well be worse.
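To put rough numbers on that scalar-vs-vector comparison (the per-clock figures here are my own illustrative assumptions: one 32-bit FP op per core per cycle on the scalar side, versus a 4x32 SSE-style vector unit):

```python
# Back-of-the-envelope: peak 32-bit FLOPs per clock, ignoring clock speed
# and cache effects. Figures are illustrative assumptions, not published specs.
tilera_cores = 100
tilera_fp_lanes = 1   # assumed: scalar FP only, no vector unit
vector_width = 4      # 4x32 SSE-style vector unit for the comparison CPU

tilera_peak = tilera_cores * tilera_fp_lanes          # 100 FLOPs/clock
equivalent_vector_cores = tilera_peak / vector_width  # cores needed with SIMD

print(equivalent_vector_cores)  # -> 25.0
```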
The Memcached benchmark does strike me as suspicious: there's no mention of comparative clocks, but the author's reference to a "400 W server" suggests that the Tilera's TDP is, in fact, higher. If, say, a 300 W CPU manages only a 67% increase over a 130 W Xeon (that's the highest TDP Nehalem/Sandy Bridge Xeons go, AND the article referred to the compared Xeon as a "low-power" chip, suggesting an even lower TDP), you've got yourself a disaster of a CPU: at best, 72.4% of the Xeon's performance per watt.
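Working out where that 72.4% comes from (the 300 W figure for the Tilera is my assumption from the "400 W server" remark; the Xeon is the 130 W baseline):

```python
# Performance-per-watt comparison under the assumed TDPs.
xeon_perf, xeon_tdp = 1.0, 130       # Xeon as the baseline, 130 W
tilera_perf, tilera_tdp = 1.67, 300  # 67% faster, assumed 300 W TDP

ratio = (tilera_perf / tilera_tdp) / (xeon_perf / xeon_tdp)
print(f"{ratio:.1%}")  # -> 72.4%
```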