I don't think so... I mean, those are not extraordinary results, but representative, from a conservative POV, of *REAL* HSA software. I happen to have stumbled across a lot of patents, and the JIT (just-in-time) compiler and the "runtime" they are building are pretty advanced... proactive loop unrolling, branch elimination, extensive prefetching, HTM (hardware transactional memory), and with HTM even some forms of spMT (speculative multithreading), and this on both latency processors (CPU, integer) and parallel processors (APD, accelerated processing devices, i.e. the heterogeneous parts).
And their GPGPUs will be "processors" for all intents and purposes: they will have an MMU (memory management unit) with TLB, extensive context save states with preemption, exception handling, centralized advanced interrupt control and even adaptive caches like in CPUs. Yes, AMD is building GPUs exactly as if they were advanced CPUs, meaning those GCN CUs (compute units) will be prepared to fit "inside" a BD *module* without much trouble; the next FlexFPU could very well be a GCN CU of sorts... all served by stacked (3D, 2.5D) High Bandwidth Memory, with advanced interconnects.
4 modules with 4 CUs, running at 4 GHz, 4x the clock of a typical GPU card (1 GHz), could be equivalent to or better than 16 CUs of current GPUs, that is 16 x 64 = 1024 SPs; beating an HD 7770 is perfectly within reach.
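To make the back-of-the-envelope math above explicit, here is a rough sketch. It assumes throughput scales linearly with clock speed, which is optimistic (memory bandwidth and scheduling would get in the way). The function name `equivalent_sp` and the constants are just illustrative; the 640 SP figure for the HD 7770 comes from its public spec (10 CUs x 64 SPs at 1 GHz), not from the linked thread.

```python
# Back-of-the-envelope comparison, assuming linear scaling with clock speed.
SP_PER_CU = 64        # stream processors per GCN compute unit
GPU_CLOCK_GHZ = 1.0   # typical discrete GPU clock assumed in the post
HD7770_SP = 640       # HD 7770: 10 CUs x 64 SPs (public spec)


def equivalent_sp(cus, clock_ghz, ref_clock_ghz=GPU_CLOCK_GHZ):
    """SP count at the reference clock that would match `cus` CUs at `clock_ghz`."""
    equivalent_cus = cus * (clock_ghz / ref_clock_ghz)
    return int(equivalent_cus * SP_PER_CU)


apu_sp = equivalent_sp(cus=4, clock_ghz=4.0)
print(apu_sp)               # 1024
print(apu_sp > HD7770_SP)   # True
```

So 4 CUs at 4 GHz would be worth about 16 CUs at 1 GHz under that assumption, which is where the 1024 SP number comes from.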
That is why I think this might not be too far off base
http://diybbs.zol.com.cn/11/11_106489.html
Good thread
http://www.xtremesystems.org/forums/showthread.php?282723-AMD-quot-Steamroller-Excavator-quot-info-speculations-and-experience/page7
Don't know... it could be an April 1st joke... but whoever invented it has good knowledge of the subject; all that JIT-oriented software fits well in a "dataflow" paradigm... the GPU will be equal to the CPUs, that is, the FlexFPU can be of the GCN CU kind... and spMT (speculative multithreading) based on HTM (thread-level data speculation) is in AMD patents...
If it is all well built, 500% for HSA software (it must be built to take full advantage of all that) might even be low for the "average"... but it must be HSA software...
EDIT:
This is already being offered by OpenCL.
No, OpenCL is NOT HSA... OpenCL is a Khronos standard for compute jobs. HSA can do C/C++, Fortran, C#, Java, Python, Ada, Ruby, etc., and OpenCL. All HSA software could run on non-HSA platforms, it only needs the proper runtime (which is software), but at full speed only on HSA hardware (minimal hardware requirements). Likewise, OpenCL-based HSA software can run on all platforms for which there is a runtime; it will not run without the runtime. Similarly, all the OpenCL that is starting to show up now, *that is non-HSA*, will run on HSA platforms, only it will not show the full potential... not even close.
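The compatibility rules above can be written down as a tiny decision function. This is only a sketch of how I read the situation, not anything from an actual HSA runtime; the function `run_mode` and its string labels are made up for illustration.

```python
def run_mode(software, has_runtime, hsa_hardware):
    """Hypothetical summary of the rules in the post.

    software: "hsa" (including OpenCL-based HSA software) or "plain_opencl"
    has_runtime: whether a suitable runtime exists on the platform
    hsa_hardware: whether the platform meets the HSA hardware requirements
    """
    if not has_runtime:
        return "does not run"
    if software == "hsa" and hsa_hardware:
        return "full speed"
    # HSA software on non-HSA hardware, or non-HSA OpenCL on any platform
    return "runs, reduced speed"


print(run_mode("hsa", True, True))            # full speed
print(run_mode("hsa", True, False))           # runs, reduced speed
print(run_mode("plain_opencl", True, True))   # runs, reduced speed
print(run_mode("hsa", False, True))           # does not run
```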