SSE was roughly equivalent to 3dNow! in performance.
3D Now! is not equivalent to SSE. It is much closer to MMX than it is to SSE.
Sorry, but you're wrong.
MMX is a SIMD instruction set for processing vector integer code; 3dNow! was instead aimed at FP vector code.
SSE was Intel's response to 3dNow!. Back then AMD wasn't really able to push its technology to become an industry standard, as it managed to do more recently with AMD64.
Back in those days, I wrote a technical article (for an Italian tech website) comparing 3dNow!, SSE and AltiVec (the streaming SIMD set of PowerPC), which was later referenced by Jon "Hannibal" Stokes of Ars Technica.
3DNow! (along with Enhanced 3DNow!, 3DNow! Professional, SSE, SSE2, SSE3 and SSSE3) is an extension to MMX. Its purpose, like that of MMX, is to improve the performance of 3D games and multimedia. Because of the K6's weak FPU, 3DNow! served to cover for it in the competition with the P2. It was later expanded with Enhanced 3D Now! and 3D Now! Professional on the K6-III & K7, but it never achieved its goal because software did not support it. SSE came as Intel's response to 3D Now!. It was much more advanced and faster. 3D Now! is not even close to being equivalent to SSE.
3D Now! provides
21 vector instructions (integer & FP) that operate on
64-bit registers,
divided into two 32-bit single-precision FP words, supporting
only the round-to-nearest rounding mode.
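To make the register layout above concrete, here is a minimal sketch in portable C. It is only a model, not real 3DNow! code: the `packed2f` type and the `packed2f_add` helper are hypothetical names invented for this illustration. They mimic a 64-bit register split into two single-precision lanes and a lane-wise add in the spirit of a packed FP add such as PFADD.

```c
/* Hypothetical model of one 64-bit 3DNow!-style register:
   two packed 32-bit single-precision lanes. Plain C, not real
   3DNow! instructions. */
typedef struct {
    float lane[2];
} packed2f;

/* Lane-wise add: each 32-bit lane is added independently,
   which is what a packed FP add does in a single instruction. */
static packed2f packed2f_add(packed2f a, packed2f b)
{
    packed2f r;
    for (int i = 0; i < 2; ++i) {
        r.lane[i] = a.lane[i] + b.lane[i];
    }
    return r;
}
```

On common platforms `sizeof(packed2f)` is 8 bytes, matching the 64-bit register width: one register-sized value carries two floats, so one packed operation does the work of two scalar ones.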
The K6-2 implements eight 64-bit 3D Now! registers, mapped onto the FP registers just like the MMX registers. Aliasing the 3D Now! registers onto the floating-point stack makes it possible to write x86 programs containing integer, MMX and SIMD FP instructions with no performance penalty for switching between the integer MMX and the floating-point 3D Now! units (such a switch costs 100-150 cycles on the P-MMX).
SSE provides
8 cache control instructions and
70 vector instructions (integer & FP) that operate on
128-bit registers,
divided into four 32-bit single precision FP words, supporting
all four rounding modes of the IEEE standard.
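The four-lane layout and the selectable rounding modes described above can be sketched with the SSE intrinsics from `<xmmintrin.h>`. This assumes an x86 compiler with SSE enabled (the default on x86-64). The helper names `add4` and `round_with_mode` are made up for this example, while the intrinsics and `_MM_ROUND_*` constants are the standard ones from that header.

```c
#include <xmmintrin.h>  /* SSE intrinsics; assumes an x86 target with SSE */

/* Add four packed single-precision floats in one 128-bit operation. */
static void add4(const float a[4], const float b[4], float out[4])
{
    __m128 va = _mm_loadu_ps(a);            /* unaligned load of 4 floats */
    __m128 vb = _mm_loadu_ps(b);
    _mm_storeu_ps(out, _mm_add_ps(va, vb)); /* lane-wise add, then store */
}

/* Convert a float to int under a chosen rounding mode. `mode` is one of
   _MM_ROUND_NEAREST, _MM_ROUND_DOWN, _MM_ROUND_UP, _MM_ROUND_TOWARD_ZERO. */
static int round_with_mode(float x, unsigned int mode)
{
    unsigned int saved = _MM_GET_ROUNDING_MODE();
    volatile float v = x;      /* volatile: stop the compiler folding the conversion */
    _MM_SET_ROUNDING_MODE(mode);
    int r = _mm_cvtss_si32(_mm_set_ss(v));  /* conversion uses the current mode */
    _MM_SET_ROUNDING_MODE(saved);           /* restore the caller's mode */
    return r;
}
```

For instance, `round_with_mode(2.5f, _MM_ROUND_NEAREST)` gives 2 (ties go to even), while `round_with_mode(2.5f, _MM_ROUND_UP)` gives 3. That is the kind of control the single round-to-nearest mode of 3D Now! does not offer.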
The P3 (Katmai) implements eight 128-bit SSE registers (XMM0-XMM7). Unlike the MMX registers, these are new, separate registers and are not mapped onto the FP registers.
The eight 64-bit MMX registers are aliased on top of the eight FPU registers, which allows SIMD integer operations to run in parallel with SSE (no penalty for switching).
So 3D Now! and SSE are not equal, nor are their instructions compatible.