Sorry amigo... the best you'll get is a phone with enough horsepower, on a good enough network, to give you a proper remote connection to a PC. That, or MS finally releasing their vapourware Surface phones.
What you're asking for, unfortunately, is just not within the realm of possibility at present. Almost all mobile devices use the ARM processor architecture, while almost every proper PC game is built for x86. The fundamentals of the two are completely different, so code written for one doesn't cross over to the other very nicely. As in, not at all. It's like asking an English text-to-speech engine to read you a Chinese website.
Dolphin, as an example, actually runs fairly poorly, all things considered. It has been coded from the ground up to emulate a specific console's operating system, processor, and other key hardware, translating all of it on the fly, and that translation is quite CPU intensive. That's one reason the Dolphin for Android emu states it runs like trash, and almost certainly will for quite some time.
Running full x86 emulation at playable speeds on a phone is basically beyond the tech we have right now. It's just too challenging to convert the instructions (x86 has a huge number of them, as does ARM) for every situation that running a game requires, and even more so to convert them fast enough that it goes unnoticed.
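To give a feel for where the speed goes, here is a minimal sketch of the simplest kind of emulator, an interpreter loop, in Python. The guest "instruction set" here is a toy I made up for illustration (three operations, four registers), not x86 or any real chip. The point is only that every single guest instruction costs the host a fetch, a decode, a chain of comparisons and then the actual work.

```python
def run(program, max_steps=1000):
    """Interpret a toy guest program; returns (registers, steps executed)."""
    regs = [0] * 4          # toy guest registers r0..r3
    pc = 0                  # index of the next guest instruction
    steps = 0
    while pc < len(program) and steps < max_steps:
        op, a, b, c = program[pc]      # fetch + decode
        pc += 1
        steps += 1
        if op == "addi":               # regs[a] = regs[b] + constant c
            regs[a] = regs[b] + c
        elif op == "add":              # regs[a] = regs[b] + regs[c]
            regs[a] = regs[b] + regs[c]
        elif op == "beq":              # if regs[a] == regs[b], jump to index c
            if regs[a] == regs[b]:
                pc = c
        else:
            raise ValueError(f"unknown guest instruction: {op}")
    return regs, steps

program = [
    ("addi", 1, 0, 5),   # r1 = r0 + 5
    ("add", 2, 1, 1),    # r2 = r1 + r1
    ("beq", 2, 1, 4),    # skip the next instruction if r2 == r1
    ("addi", 2, 1, 5),   # r2 = r1 + 5
]
regs, steps = run(program)
```

Four guest instructions, and the host has already done several times that many operations of its own. Real emulators use far smarter tricks than this (translating whole blocks of code ahead of time, for one), but the overhead never fully goes away.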
Technical Explanation of why this is an awkward question, if you're interested
An instruction is the single most basic thing a processor can do. It's the foundation everything else builds on, if you will.
Take as an example some really basic MIPS (for MIPS processors by Imagination Technologies) instructions:
addi $1, $0, 5 # add 5 to R0 (always zero) and place in R1, res: 5
add $2, $1, $1 # add R1 to R1 and place in R2, res: 10
beq $2, $1, 0xFF0ABC # if R2 == R1, branch to the instruction at 0xFF0ABC (a real assembler would take a label here)
addi $2, $1, 5 # otherwise, add 5 to R1 and put the result in R2
Anyhow, that's for MIPS. Keep in mind, each of these commands translates to a binary code 32 bits long, so the first one comes out as 001000 00000 00001 0000000000000101. The first section is something called the "op code" (001000 means addi), and it tells your processor what instruction it's even doing. The rest, in this case, are the source register, the destination register, and the number 5.
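If you do bother counting the digits, the encoding of that first instruction can be worked out with a few shifts. A minimal sketch in Python, assuming the standard MIPS32 I-type layout (6-bit opcode, 5-bit source register, 5-bit destination register, 16-bit immediate) and the addi opcode 001000. The function names are just mine for illustration.

```python
ADDI_OPCODE = 0b001000  # MIPS32 opcode for addi

def encode_itype(opcode, rs, rt, imm):
    """Pack an I-type instruction: opcode(6) | rs(5) | rt(5) | imm(16)."""
    return (opcode << 26) | (rs << 21) | (rt << 16) | (imm & 0xFFFF)

def decode_itype(word):
    """Split a 32-bit word back into its I-type fields."""
    return {
        "opcode": (word >> 26) & 0x3F,
        "rs": (word >> 21) & 0x1F,
        "rt": (word >> 16) & 0x1F,
        "imm": word & 0xFFFF,
    }

# addi $1, $0, 5: rs = 0 (source), rt = 1 (destination), imm = 5
word = encode_itype(ADDI_OPCODE, rs=0, rt=1, imm=5)
bits = format(word, "032b")   # the 32 binary digits as a string
fields = decode_itype(word)   # and back again
```

Here `word` is 0x20010005, and decoding it hands back the same opcode, registers and immediate that went in. The processor does that decode step in hardware; an emulator has to do it in software, for every instruction.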
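Intel (x86) may repeat functionality, hell, it may even use the same names in its assembly language, but here's the problem. Pretend, purely for illustration, that Intel had this exact instruction (it doesn't; real x86 syntax and registers look different):
addi $1, $0, 5 # add 5 to 0 and place in register 1
The actual binary would still have a different opcode and layout, as encodings aren't standardised across instruction sets, only within one. Something like this made-up pattern:
10001 001 000 110
Real x86 is messier still: its instructions aren't even a fixed size, ranging from 1 to 15 bytes each.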
Now you have to identify and translate all that into something matching your MIPS instruction, and you have to do it accurately. There are other hiccups and tricks in machine code that mean it isn't even this simple, including things like undocumented instructions. However, if you take nothing else from the example: this is one instruction. x86 has several hundred instructions, if not thousands, some for functionality that can't be replicated in a single step on a different platform. The host platform ALSO has a large number of instructions. These all have to be translated for emulation to work, and... that's hard.
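The translation step for that one instruction can be sketched like this in Python. Big caveat: the "foreign" instruction set below is entirely invented for the example (8-bit opcode, 4-bit registers, opcode 0x91 for add-immediate), and only the MIPS side uses a real layout and the real addi opcode. The names are mine too.

```python
# Hypothetical "foreign" ISA, invented for this example:
#   opcode(8) | dest(4) | src(4) | imm(16)
FOREIGN_ADD_IMM = 0x91

MIPS_ADDI = 0b001000  # real MIPS32 opcode for addi

def translate_add_imm(dest, src, imm):
    # MIPS I-type: opcode(6) | rs(5) | rt(5) | imm(16); rt is the destination
    return (MIPS_ADDI << 26) | (src << 21) | (dest << 16) | imm

# One translator per foreign opcode; a real ISA needs hundreds of these.
TRANSLATORS = {FOREIGN_ADD_IMM: translate_add_imm}

def translate(word):
    """Re-encode one foreign instruction word as a MIPS instruction word."""
    opcode = (word >> 24) & 0xFF
    dest = (word >> 20) & 0xF
    src = (word >> 16) & 0xF
    imm = word & 0xFFFF
    handler = TRANSLATORS.get(opcode)
    if handler is None:
        raise NotImplementedError(f"no translation for opcode {opcode:#04x}")
    return handler(dest, src, imm)

# Foreign "add 5 to register 0, store in register 1"
foreign_word = (FOREIGN_ADD_IMM << 24) | (1 << 20) | (0 << 16) | 5
mips_word = translate(foreign_word)
```

That's one table entry, for the friendliest possible case where both sides have an instruction that does exactly the same thing. Anything not in the table just fails, and anything without a one-to-one match needs a whole sequence of host instructions instead of a single re-packed word.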