cdrkf :
Yeah it's interesting. What I have noticed though is that whilst DX12 is proving to be somewhat lacking, Vulkan looks to be *much* stronger. Playing the new DOOM on my now ancient R9 280 yields a 20+ fps frame rate boost at 1080p on high settings. It's glorious, as the game is now running consistently well over my screen refresh rate, whereas I was having to lower settings under the older render path to keep things smooth.
I guess the caveat to that is that DOOM is written by ID, who are some of the most talented engine developers in the industry. Still on the flip side it looks like Vulkan is likely to gain a strong footing in the mobile games market where developers are typically working with weaker hardware and as a result need to be cleverer about how they use resources.
When it comes to this low level business, OpenGL and Vulkan can definitely do it better than DirectX for one reason: extensions. The id Tech developers mentioned that because Vulkan supports vendor extensions (just like OGL), they were able to optimize Doom for specific hardware. That is how low level should be done: tweak the software around each architecture's specifics and strengths. DirectX has nothing like this. The funny thing is I see some people suggesting the Khronos Group ditch extensions for Vulkan, because extensions were one of the reasons developers avoided OpenGL in the past. Without them we would probably see Vulkan running into the same problems DX12 has right now.
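To make that concrete, here is a minimal C sketch of the general pattern (not id's actual code): probe the device for a vendor extension, here VK_AMD_shader_ballot as an example, and only request it when the driver reports it. The helper names and the omitted queue/surface setup are my own placeholders.

```c
/* Sketch: enable a vendor extension only when the driver offers it.
 * Error handling and the rest of device setup are omitted. */
#include <string.h>
#include <stdlib.h>
#include <stdbool.h>
#include <vulkan/vulkan.h>

static bool has_extension(VkPhysicalDevice gpu, const char *name)
{
    uint32_t count = 0;
    vkEnumerateDeviceExtensionProperties(gpu, NULL, &count, NULL);
    VkExtensionProperties *props = malloc(count * sizeof(*props));
    vkEnumerateDeviceExtensionProperties(gpu, NULL, &count, props);

    bool found = false;
    for (uint32_t i = 0; i < count; ++i) {
        if (strcmp(props[i].extensionName, name) == 0) {
            found = true;
            break;
        }
    }
    free(props);
    return found;
}

void create_device(VkPhysicalDevice gpu, const VkDeviceQueueCreateInfo *queue_info)
{
    const char *extensions[4];
    uint32_t ext_count = 0;

    /* Request the AMD extension only on hardware that exposes it; the shader
     * side would then pick the matching code path at pipeline build time. */
    if (has_extension(gpu, "VK_AMD_shader_ballot"))
        extensions[ext_count++] = "VK_AMD_shader_ballot";

    VkDeviceCreateInfo info = {
        .sType = VK_STRUCTURE_TYPE_DEVICE_CREATE_INFO,
        .queueCreateInfoCount = 1,
        .pQueueCreateInfos = queue_info,
        .enabledExtensionCount = ext_count,
        .ppEnabledExtensionNames = extensions,
    };

    VkDevice device;
    vkCreateDevice(gpu, &info, NULL, &device);
}
```

The key design point is that the portable path still works everywhere; the vendor path is strictly an opt-in fast path, which is exactly the per-architecture tuning being described above.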
Vulkan on mobile is an interesting topic. I think the majority of mobile game developers will stick with OpenGL ES 2.0 for a very long time. While Vulkan can lift performance, support will probably only come to the most recent hardware, not to very old devices. And mobile developers tend to want their games to reach the widest possible audience, including people with very dated hardware, so they build with that in mind rather than pushing the limits. Just look at OpenGL ES 3.0: Android has supported it since 2013, but how many games actually use it?
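For what it's worth, when studios do bother with the newer API level, the usual approach is a runtime fallback rather than a hard requirement. A rough C/EGL sketch of that pattern follows; the function name and the omitted display/config setup are assumptions of mine.

```c
/* Sketch: ask EGL for an ES 3.0 context first, fall back to ES 2.0 if the
 * device or driver can't provide one. Display, config and surface setup are
 * assumed to have happened already. */
#include <EGL/egl.h>

EGLContext create_best_context(EGLDisplay display, EGLConfig config)
{
    const EGLint es3_attribs[] = { EGL_CONTEXT_CLIENT_VERSION, 3, EGL_NONE };
    const EGLint es2_attribs[] = { EGL_CONTEXT_CLIENT_VERSION, 2, EGL_NONE };

    EGLContext ctx = eglCreateContext(display, config, EGL_NO_CONTEXT, es3_attribs);
    if (ctx == EGL_NO_CONTEXT) {
        /* Older device: take the ES 2.0 path. Maintaining two render paths is
         * exactly the cost many studios avoid by authoring everything for
         * ES 2.0 in the first place. */
        ctx = eglCreateContext(display, config, EGL_NO_CONTEXT, es2_attribs);
    }
    return ctx;
}
```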
cdrkf :
Final thought, I don't think AMD's 'power efficiency' issue is really related to them 'sticking' to GCN. There are a number of examples of GCN cards that are actually as efficient or more efficient than nVidia's in each generation, so the design isn't as bad as people think. The issue is more that nVidia got better performance than I think AMD was expecting, which has forced them to push the clocks way higher than the design is intended for, putting it firmly into 'exponential increase in power vs performance increase' territory and making the cards rather inefficient. From what I've seen, GCN parts clocked in the 800 mhz range offer good performance whilst sipping power. Push them to 1ghz and they start to get hungry, push them higher- well you get the 390X monster.
That's why they need to make big changes and not just minor ones. GCN is more power efficient at lower clocks, but it can't hold that efficiency when AMD wants faster performance out of it. For the power the RX 480 uses at around 1.3 GHz, nVidia can offer you GTX 1070 base-clock performance. Try lowering an nVidia card's clocks and undervolting it and we might see a similar result.
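The 'exponential' part is easy to see from the usual dynamic power relation, roughly P ∝ C·V²·f: raising the clock usually means raising the voltage too, and the voltage term is squared. Below is a tiny C sketch with made-up but plausible operating points, purely to show the shape of the curve, not measured RX 480 numbers.

```c
/* Illustrative numbers only, not measured data: dynamic power scales roughly
 * as C * V^2 * f, so a clock bump that also needs a voltage bump costs far
 * more power than the performance it buys. */
#include <stdio.h>

int main(void)
{
    /* Hypothetical operating points for the same chip. */
    double f_low = 0.8,  v_low = 0.95;   /* 800 MHz at 0.95 V  */
    double f_high = 1.3, v_high = 1.15;  /* 1300 MHz at 1.15 V */

    /* Relative dynamic power; the capacitance term cancels out. */
    double scale = (v_high * v_high * f_high) / (v_low * v_low * f_low);

    printf("~%.0f%% more clock for ~%.0f%% more dynamic power\n",
           (f_high / f_low - 1.0) * 100.0, (scale - 1.0) * 100.0);
    return 0;
}
```

With these example numbers the clock goes up by roughly 60% while dynamic power more than doubles, which is the same trade-off you reverse when you undervolt and downclock.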
cdrkf :
When you look at what nVidia did to get their massive perf / w gain- essentially they leveraged their work on mobile graphics SoCs. They implemented tile based rendering and a few other techniques, whilst also stripping out the compute capability of their cards, which together helped them bring power down significantly. I think if anything nVidia responded much better to the delay in the silicon node transition- when they realized they were stuck with 28nm they had to make some drastic changes to effectively gain a process-node-type jump without one, resulting in the ludicrously efficient 900 series. Definitely a clever move on their part- I think in this situation AMD just didn't adapt quickly enough (effectively shoehorning Fiji onto 28nm when it was always supposed to be a 20nm part).
Their venture into the mobile market really forced them to make more power efficient designs, even though that venture ended up being a failure. nVidia also learnt their lesson back at 40nm; because of that, they always plan for the possibility that what TSMC delivers won't give them the results they want.