gamerk316 :
Looking for compiler benchies; very few exist, but here's one from 2009:
http://keyj.emphy.de/compiler-benchmark/
ICC crushing it.
Only other one I could find that is meaningful:
http://www.g-truc.net/post-0372.html
GCC/LLVM/MSVC, AMD vs Intel. The breakdown is what you'd expect, namely:
C2Q < PII X6 < Core i5
GCC < LLVM < MSVC
EDIT:
Give up; can't find anything run with an AMD CPU. Best one I could find, again for Intel:
http://willus.com/ccomp_benchmark2.shtml?p18+s14
Note the trend:
ICC 11 x64 > GCC 4.6.3 > ICC 11 x32 > MSVC 2010 x64 > MSVC 2010 x32 > GCC 4.6.3 x32.
EDIT2:
Found one. From 2006, but still:
http://www.principledtechnologies.com/Intel/CompComp.pdf
ICC 9.1 > GCC 4.1.1 for AMD.
http://blog.pgaddict.com/posts/postgresql-performance-with-gcc-clang-and-icc
Did you miss this one? GCC 4.9.1 is faster than ICC there. Older versions of GCC were basically lacking a lot of features that the proprietary compilers had; old GCC simply wasn't optimizing code the way the new releases do. That's (mostly) fixed now, and GCC has auto-vectorization and goodies like that.
Meanwhile, some applications do run better on ICC:
http://stackoverflow.com/questions/1733627/anyone-here-has-benchmarked-intel-c-compiler-and-gcc
So it's a mixed bag; it's not so clear cut. But regardless of which compiler is "best" overall, it reinforces the point that different compilers do better at different tasks on different CPUs.
AMD basically abandoned Open64 (it hasn't been updated in almost 2 years) and they just contribute to GCC now. All of this is new since 2013.
Compilers do far more than just decide which instructions get executed. With GCC's command-line switches, I can tell the compiler to do things like optimize for specific cache sizes. And as Juan's link showed, compiler tuning can determine how much of the theoretical maximum IPC you actually get, by reducing cache misses and the like. There's a lot here:
https://gcc.gnu.org/onlinedocs/gcc/Option-Summary.html
GCC itself has seen improvements of 15% or more simply by going from version 4.1 to 4.9. There's far more to it than just which instructions the compiler emits. ICC's massive lead over the competition back then, and its remaining lead in some areas, didn't come just from running different instructions. It came from generating code that works best with the target architecture.
Your benchmarks are outdated. GCC has changed a lot:
https://www.p8952.info/ruby/2014/12/12/benchmarking-ruby-with-gcc-and-clang.html
The difference between GCC 4.4 and 4.9 is around 70%! That's massive! Like I said, GCC has come an extremely long way since the older versions, which were definitely missing optimization features.
LTO alone is a big change that earlier GCC versions completely lacked while other compilers already had it.
Are you saying AMD is going to give up in the dGPU market completely or just for HPC? I can see them dropping it for HPC but for HEDT and semi-custom I'm not sure it makes that much sense for a while still. When we hit 7nm an x86 CPU is going to be so small that it's barely going to contribute to the overall TDP and die area anyways.
I can see dGPU going away when the x86 core is small enough that it barely takes up die space. But that's not going to happen for a while. Even a Zen core on 14nm is rumored at a 95 W TDP maximum with ~5 to ~8 mm^2 per core (not including caches and such). The CPU cores are a tiny part of the Carrizo die, but look:
Even with the CPU cores being so small, there's a ton of other stuff taking up space on Carrizo. It's a ~245 mm^2 die; less than half is GPU, and the rest is CPU and associated logic.