It’s noticeably more efficient on EPIC than on x86, given x86’s register limits as opposed to EPIC’s.
The number of registers has nothing to do with the EPIC concept. E.g. SPARC has as many registers as Itanium (it even has register windows) and is still an OOO machine. BTW, it does not work all that well for SPARC either...
It is true that having only 8 GPRs somewhat limits IA-32 performance (but not by that much). Anyway, that problem is solved by AMD64; there is little to complain about in the AMD64 ISA performance-wise.
Here I will side a bit with the pro-EPIC party.

Well, yes, the extremely limited number of registers in x86 has nothing to do with OOO architectures in general.
However, the idea behind EPIC is the following.
OOO is cool and powerful, but it has a significant cost in terms of HW.
So EPIC says: let's do without OOO, use that silicon to improve the execution resources instead, and leave the scheduling to the compiler.
For example, instead of using 88 registers only for dynamic scheduling, I could expose them to the programmer, who could then also use them as storage space, say for local variables, or to speed up procedure calls.
Same goes for the deep buffers, schedulers, and extra control logic, which become much more complicated; that silicon could be used for extra ALUs, which can give a higher peak performance.
It's a bit like the CISC -> RISC transition: with CISC you had complex instructions that could even make you a coffee; RISC broke those down into sequences of simpler instructions, simplifying HW design and the control logic of the CPU. This in turn meant that hand-writing assembler code became more difficult, and compilers had to become more sophisticated.
But the HW overall became faster and RISC was a big success.
However, EPIC has yet to prove to be the next big thing compared to OOO RISC.
Let's take the predication example: with EPIC you use redundant execution resources to process both sides of a branch and then kill the wrong one; however, it could be more efficient to use those resources to process a second thread instead, as in SMT and Hyper-Threading.
And there are cases where compiler analysis cannot discover dependencies, because they arise only at run time.
Let's take the following example:
array[n] = array[j] * a
array[k] = array[j] + c
Now, depending on the run-time values of n, j, and k, we can have several possible dependencies, which can turn into hazards:
n = j : true (or flow) dependency (read after write)
k = j : anti (or blocking) dependency (write after read)
k = n : output (or write) dependency (write after write)
Since the values of n, j, and k are not known at compile time, a compiler cannot do much to optimize that code, while OOO HW can dynamically schedule it at run time.
IMO, what made OOO so popular is that different CPUs in the same family can share the same compiled code, and each of them executes it as fast as possible given its own microarchitecture.
Without OOO, AMD probably would have had no chance against Intel: there was no way they could convince developers to really optimize code for their CPUs, so the only option was for the CPU to optimize the code itself.