JAYDEEJOHN
Champion
piesquared :
Phenom good for? It looks like it might depend on which compiler is used. If it's Intel's compiler, performance may be crippled on AMD processors, yet again.
http://aceshardware.freeforums.org/cpuid-family-bits-added-because-of-flaw-in-intel-compiler-t428.html
I just tried Intel C++ compiler version 10.1 with option /QxO as you suggested. It generates the following versions of code for common mathematical functions: SSE2, SSE3, SSE4.1 and non-Intel SSE2. It doesn't work on any CPU prior to SSE2. This is the only compiler option that makes it run reasonably on an AMD, but why are there two different SSE2 versions, one for Intel and one for AMD? When I hack the CPU dispatcher and make it believe that it is an Intel, it runs 50-100% faster. This means that the Intel SSE2 version is faster than the AMD SSE2 version when running on an AMD processor!
There are also options that work on any processor, for example /QaxB. This option runs non-vectorized SSE2 code on Intel processors and old 8087 code on AMD processors. I measured this to be 5-10 times slower than the /QxO option on an AMD Opteron.
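To make the quoted complaint concrete, here is a rough toy model of the two ways a CPU dispatcher could pick a code path. This is not Intel's actual dispatcher code. The names (`Cpu`, `PATHS`, `dispatch_by_vendor`, `dispatch_by_features`) are made up for illustration, and the feature set given to the example Opteron is an assumption. It only mirrors the behavior described above: checking the vendor string first sends a non-Intel chip to the slow x87 path even when it reports SSE2.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Cpu:
    vendor: str          # CPUID vendor string, e.g. "GenuineIntel" or "AuthenticAMD"
    features: frozenset  # instruction-set extensions the CPU reports (assumed here)


# Code paths ordered fastest first, paired with the extension each one needs.
# None means the path runs on any x86 CPU.
PATHS = [("sse4.1", "sse4.1"), ("sse3", "sse3"), ("sse2", "sse2"), ("x87", None)]


def dispatch_by_features(cpu):
    """Pick the fastest path whose required extension the CPU reports."""
    for name, needed in PATHS:
        if needed is None or needed in cpu.features:
            return name
    return "x87"


def dispatch_by_vendor(cpu):
    """Hypothetical dispatcher that only trusts feature flags on one vendor."""
    if cpu.vendor != "GenuineIntel":
        return "x87"  # non-Intel CPUs get the generic fallback
    return dispatch_by_features(cpu)


# An AMD chip that reports SSE2, and the same chip with a spoofed vendor string.
opteron = Cpu("AuthenticAMD", frozenset({"sse2"}))
spoofed = Cpu("GenuineIntel", opteron.features)

print(dispatch_by_vendor(opteron))    # x87
print(dispatch_by_features(opteron))  # sse2
print(dispatch_by_vendor(spoofed))    # sse2
```

The `spoofed` case corresponds to the "hack the CPU dispatcher" experiment in the quote: same hardware, different vendor string, faster path.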