[citation][nom]g00ey[/nom]I'm not impressed at all. Don't GPUs perform over 2 Teraflops by now? Also, the "ASCII Red" from 1997 was NOT the first system capable of over 1 TFLOP. The first known system was the Cray T3E that came by the end of 1995, and it was advertised to deliver "over" 1.6 TFLOPS.[/citation]
Actually, as I mentioned, the highest-performing single-GPU card only does about 675 GigaFLOPS. Keep in mind that the numbers differ depending on the level of precision: supercomputers are measured using DOUBLE-precision floating point, aka 64-bit FP. That level of precision is what scientific and engineering workloads need. Meanwhile, standard 3D rendering, gaming, and media tasks are fine with 32-bit single-precision FP. Hence, a lot of consumer-targeted hardware is rated in single precision; the teraflop figures from AMD (as well as the entirely made-up teraflop figures for the consoles) refer to single-precision throughput.
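To see why the precision distinction matters, here's a tiny illustration (Python with numpy, purely for demonstration): single precision carries a 24-bit mantissa, so once a running total reaches 2^24 it can't even register an increment of 1, while double precision absorbs it without blinking.
[code]
import numpy as np

# float32 has a 24-bit mantissa; at 2**24, an added 1.0 is rounded away.
a = np.float32(16_777_216)          # 2**24
print(a + np.float32(1) == a)       # True: the increment vanishes

# float64 has a 53-bit mantissa and registers the same increment just fine.
b = np.float64(16_777_216)
print(b + np.float64(1) == b)       # False: double precision keeps it
[/code]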
Depending on the architecture, single-precision FP units can be combined to produce double-precision results, so double-precision throughput works out to some fraction of single-precision: as much as half (some types of units, such as x86 FPUs that use AVX, namely Sandy Bridge and Bulldozer, as well as the PowerXCell 8i), a quarter (Radeon 6000-series GPUs), a fifth (Radeon 5000-series GPUs, newer nVidia GPUs), or as low as a tenth (older nVidia GPUs, the PS3's Cell).
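Here's that arithmetic as a quick back-of-the-envelope sketch; the 2.7-teraflop card at the bottom is a hypothetical placeholder, so plug in real spec-sheet numbers:
[code]
# Peak DP throughput = peak SP throughput x the architecture's DP:SP ratio.
DP_SP_RATIO = {
    "half":    1 / 2,   # e.g. AVX-capable x86 FPUs, PowerXCell 8i
    "quarter": 1 / 4,   # e.g. Radeon 6000-series
    "fifth":   1 / 5,   # e.g. Radeon 5000-series, newer nVidia GPUs
    "tenth":   1 / 10,  # e.g. older nVidia GPUs, the PS3's Cell
}

def peak_dp_gflops(sp_gflops, ratio_class):
    """Derive peak double-precision GFLOPS from a single-precision rating."""
    return sp_gflops * DP_SP_RATIO[ratio_class]

# Hypothetical GPU marketed at 2.7 SP teraflops with fifth-rate DP hardware:
print(peak_dp_gflops(2700, "fifth"))  # -> 540.0 DP GFLOPS
[/code]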
And no, ASCI Red was the first computer to actually pass 1 teraFLOP. Just because Cray advertised the performance doesn't mean everyone was getting it; supercomputer performance on the TOP500 list isn't ranked off of theoretical peak numbers (Rpeak), but off of actual, real-world LINPACK benchmark results (Rmax). This allows for measurements of just how PRACTICAL those math units are, and what they can ACTUALLY achieve. (As a note, currently Intel CPUs tend to get slightly closer to their theoretical numbers than AMD CPUs do, and GPGPUs don't get anywhere near close.) The first Cray T3E that passed 1 TFLOP wasn't built until 1998.
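The peak-versus-measured gap is easy to put into numbers; the two entries below are made-up placeholders for illustration, not real TOP500 data:
[code]
# Efficiency = measured LINPACK result (Rmax) / theoretical peak (Rpeak).
# These entries are illustrative placeholders, not real TOP500 systems.
systems = {
    "cpu_system":   {"rmax_tflops": 0.85, "rpeak_tflops": 1.0},
    "gpgpu_system": {"rmax_tflops": 0.55, "rpeak_tflops": 1.0},
}

for name, s in systems.items():
    efficiency = s["rmax_tflops"] / s["rpeak_tflops"]
    print(f"{name}: {efficiency:.0%} of theoretical peak")
[/code]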
[citation][nom]g00ey[/nom]Well then they have managed to bring a better CPU to the market which is what R&D is all about after all. But, alas, it's still a mere 8 cores (with an enhanced variant of hyperthreading/SMT) no matter how you paint it, and not the 16 as they are falsely advertising.[/citation]
It's hardly so clear-cut. Yes, Bulldozer blurs the line on what level of parallelism is being done here, but those are very much cores, given that each module has the complete hardware capability to run two threads, not merely "virtualize" two threads the way Hyper-Threading does. Each module has TWO sets of L1 data cache and can run two floating-point threads in hardware (with the shared FPU being the contentious point here). The only exception, technically, is AVX, but Sandy Bridge can't truly support a full thread of AVX in a single core either.
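For what it's worth, even the OS's own bookkeeping shows how fuzzy the distinction is. Here's a Linux-only sketch that compares hardware threads against what the kernel reports as cores; how a given kernel version classifies a Bulldozer module as one core or two has itself varied, which rather proves the point:
[code]
# Compare logical CPUs (hardware threads) against the kernel's idea of
# physical cores, as reported in /proc/cpuinfo (Linux-only sketch).
def cpu_topology(path="/proc/cpuinfo"):
    logical = 0
    packages = set()
    cores_per_package = 1
    with open(path) as f:
        for line in f:
            key, _, value = line.partition(":")
            key = key.strip()
            if key == "processor":        # one entry per logical CPU
                logical += 1
            elif key == "physical id":    # one id per socket/package
                packages.add(value.strip())
            elif key == "cpu cores":      # cores the kernel counts per package
                cores_per_package = int(value)
    return logical, max(len(packages), 1) * cores_per_package

threads, cores = cpu_topology()
print(f"{threads} hardware threads, {cores} kernel-reported cores")
[/code]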
And of course, keep in mind that the concept of a "core" isn't fully defined either; it just kind of emerged as an alternative to saying "extra CPU" around 2005, when the Pentium D and Athlon 64 X2 came out... And there was bickering then, too. Yet no one deemed that a chip failed to be "dual-core" if it merged or removed redundant parts, such as the memory interface or I/O, or shared a pool of cache. By that philosophy, Bulldozer's modules are the next logical step in the evolution of paired cores.