I don't know when *it* actually happened, but AMD/ATI has been steadily improving OpenGL performance over the last 3-4 years, and Trinity seems to bring everything together.
I don't know why or how :lol: but it's a combination of chip logic and arch, drivers and the API itself. Because OpenGL is an open standard, implementing it in hardware and software seems to have advantages over 'DirectX' in compute right now, but you can be sure that MS is fighting back with refinements to DirectCompute 11.
If you check out the OpenGL scores for the Trinity APU in Cinebench, you start to see significant evidence of this. The Turks cores in combination with the Piledriver cores just work well. This is a big deal. An integrated SIMD engine on the APU is blowing past multi-core CPUs paired with significant discrete graphics.
If you re-check the Tom's OpenCL/GL article (where they interviewed folks from Corel, Adobe, etc.), one of those guys used calls to those Turks APU cores via OpenGL. I'm not smart enough to define the intricacies of coding OpenCL/GL in this fashion, but the AMD Trinity APU hardware seems to work well at it - and AMD is only half-way through its unified memory *HSA* ultimate hardware design.
Check out Sony Vegas with the FirePro V4900. I don't know what part OpenCL/GL plays in the accelerated encoding, video FX, transitions, compositing, pan/crop, and track motion, but the impact is significant over and above the pretty dang good multi-threaded performance of the Bulldozer CPU cores.
😀 I forgot to mention that the FirePro V4900 has the same 800MHz Turks core clocks and 128-bit interface as the Turks SIMD engine cores on the Trinity APU. The difference is 480<-->384 Turks shaders, or using desktop discrete as an example the Turks HD6670<--->HD6570.
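For a rough sense of what that 480 vs 384 gap means at the same 800MHz clock, here's a back-of-envelope sketch. It assumes the usual rule of thumb of 2 FLOPs per shader per clock (one multiply-add); `peak_gflops` is just a name made up for illustration, and theoretical peak says nothing about what you'd actually see in Vegas or Cinebench.

```python
# Back-of-envelope peak single-precision throughput.
# Assumption: each shader retires one multiply-add (2 FLOPs) per clock.

def peak_gflops(shaders: int, clock_mhz: float, flops_per_clock: int = 2) -> float:
    """Theoretical peak in GFLOPS: shaders x FLOPs per clock x clock (MHz) / 1000."""
    return shaders * flops_per_clock * clock_mhz / 1000.0

CLOCK_MHZ = 800  # same core clock quoted above for both parts

v4900_peak = peak_gflops(480, CLOCK_MHZ)    # FirePro V4900 shader count from the post
trinity_peak = peak_gflops(384, CLOCK_MHZ)  # Trinity APU shader count from the post
ratio = trinity_peak / v4900_peak

print(f"V4900:   {v4900_peak:.1f} GFLOPS")
print(f"Trinity: {trinity_peak:.1f} GFLOPS")
print(f"Trinity / V4900: {ratio:.0%}")
```

So on paper the APU's SIMD engine sits at about four fifths of the discrete card's peak, before memory bandwidth and drivers get a say.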
As far as BD to PD improvements go, it's typical of AMD in a die-shrink (see: original Phenom 45nm). From the original stepping, they make a 15-20% leap in performance and efficiency when they refine the arch. The changes on Piledriver are almost too numerous to list: logic, split bi-directional power between the APU x86/GPU 'SIMD' cores, the UNB (unified northbridge), IOMMU v2, smoothing PLL *jitter*, etc.
And that kinda ties back into the thread topic 😱 when AMD makes a big leap in efficiency, some folks can use it (and some LN2) to clock 7.3GHz ...
edit: I messed up your quotes and just slashed 'em out - LOL - hope this makes sense