So, to make our point, we have to perform a magic trick. All magic tricks have three acts. The first part is called "The Pledge." That's where we do something ordinary: talk about CPU architecture. Any editorial team can do that. The second act is called "The Turn." We take our ordinary article and make it do something extraordinary. This is where we get into the details of chip fabrication and the history of mobile GPUs, something only a few editorial teams can do.
...
Let’s talk about raw performance before we discuss power consumption. There is no question that Intel has the resources to build the fastest processors. ARM and Qualcomm are going to face the same growing pains that the x86 world has already struggled through.
In its next processor generation, Qualcomm is transitioning from its partially out-of-order Scorpion architecture to Krait, a full out-of-order design. By letting independent instructions execute while earlier ones wait on their operands, Krait should keep its execution units busier and get more work done per clock.
At the same time, Qualcomm is now navigating uncharted territory, where its engineers have less expertise. ARM already has some experience with its Cortex-A9, which is out-of-order-capable. But even with the upcoming Cortex-A15, the company will be relying on dedicated reservation stations (the queues that hold instructions until they can issue) for each of the execution units. While Intel and AMD used dedicated reservation stations in the past, both now employ unified reservation stations to improve performance and utilization. Unlike ARM, Qualcomm is attempting to jump directly to a unified reservation station design. The original Pentium Pro used a unified reservation station, so it’s not inconceivable that a company could pull this off successfully.
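To make the utilization argument concrete, here is a minimal toy sketch in Python. It is not a model of any real core: the function names, the two unit types, and the queue sizes are all assumptions chosen for illustration, and it ignores issue, wakeup, and retirement entirely. It only counts how many instructions an in-order front end can buffer before it stalls, once with a fixed queue per execution unit and once with a single shared pool of the same total size.

```python
from collections import Counter


def buffered_with_dedicated(stream, per_unit_capacity):
    """Count instructions buffered before dispatch stalls when each
    execution-unit type has its own fixed-size queue (hypothetical model)."""
    used = Counter()
    accepted = 0
    for unit in stream:
        if used[unit] == per_unit_capacity:
            break  # this unit's queue is full, so in-order dispatch stalls
        used[unit] += 1
        accepted += 1
    return accepted


def buffered_with_unified(stream, total_capacity):
    """Count instructions buffered before dispatch stalls when all unit
    types share one pool of entries (hypothetical model)."""
    return min(len(stream), total_capacity)


# A bursty stream: six ALU operations followed by two memory operations.
bursty = ["alu"] * 6 + ["mem"] * 2
# A balanced stream: ALU and memory operations alternate.
balanced = ["alu", "mem"] * 4

# Same total storage in both cases: 2 queues x 4 entries vs. 1 pool x 8 entries.
dedicated_bursty = buffered_with_dedicated(bursty, per_unit_capacity=4)
unified_bursty = buffered_with_unified(bursty, total_capacity=8)
dedicated_balanced = buffered_with_dedicated(balanced, per_unit_capacity=4)
unified_balanced = buffered_with_unified(balanced, total_capacity=8)
```

With the bursty stream, the dedicated layout stalls once the ALU queue fills and leaves the memory queue empty, while the shared pool absorbs the whole burst. With the balanced stream the two layouts behave the same, which is why the benefit of a unified design depends on the instruction mix.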
The Atom architecture doesn’t incorporate any of Intel’s advanced technology. It’s a single-core, in-order design that is more reminiscent of the original Pentium than anything modern. But here’s the thing: it’s already faster than the ARM-based competition. As performance demands increase, Intel has decades of expertise to drop into Atom. We’ve heard that Atom would go to an out-of-order core within five years of its launch, which puts it around 2013. So, ignoring power consumption, there is little doubt that Intel can put out faster processor designs.
...
Intel, on the other hand, has always done pretty well with the performance of its platforms (just look at its current Sandy Bridge-E architecture). Again, the challenge for Intel is power consumption, rather than performance.
The reason ARM is dominant on the mobile side is that, to date, Intel has been unable to demonstrate a power-efficient MSoC. In this world, trumpeting impressive performance-per-watt numbers isn’t enough. You actually have to be able to show off a full day’s worth of talk time and impressive standby numbers to be viable. With Medfield, Intel demonstrates that its team has the technical know-how to produce an MSoC within striking distance of ARM-based designs. As Intel put it, Medfield buys the company a seat at the table.
What happens in the next three years, though? Given that Medfield is competitive with currently-shipping ARM MSoCs, we have to look to the next generation. On the previous page, we suggested that ARM and Qualcomm face at least as significant a challenge in scaling performance up as Intel faces in scaling power down.
We can perhaps be most objective when looking at manufacturing technology. Intel has the best chip fabs in the industry, which allowed it to out-compete AMD during the K6 and K7 era, and maintain its position when AMD introduced the successful K8-era processors. Medfield is currently based on a 32 nm node and is already competitive with ARM-based solutions. Intel’s next move is to make a jump to 22 nm on a 3D FinFET design, representing two steps forward in process technology. Intel has never failed to execute with a fabrication process, and it will already have plenty of experience from its Ivy Bridge-based processors. If the company stays on track, it’s about 18 months ahead of the competition in manufacturing. As soon as the competition starts shipping 28 nm, Intel will follow with 22 nm, and it will be even longer before competing fabs can implement FinFET. This gives Intel another 20 to 30% improvement in power consumption over its current technology, while roughly doubling transistor density.
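The density figure follows from simple geometry, which the short Python sketch below works through. It assumes ideal scaling, where transistor area shrinks with the square of the node name; real processes rarely scale this cleanly, and the function name is ours, not anything from Intel.

```python
def ideal_density_gain(old_node_nm, new_node_nm):
    """Density gain under the idealized assumption that transistor area
    scales with the square of the node dimension."""
    return (old_node_nm / new_node_nm) ** 2


# Intel's planned move from 32 nm to 22 nm.
intel_gain = ideal_density_gain(32, 22)

# The same idealized comparison for a competitor's 28 nm against Intel's 22 nm.
lead_over_28nm = ideal_density_gain(28, 22)

print(f"32 nm -> 22 nm: about {intel_gain:.2f}x density")
print(f"28 nm vs 22 nm: about {lead_over_28nm:.2f}x density")
```

Under that assumption, the 32 nm to 22 nm shrink gives a little over 2x the transistors in the same area, and 22 nm still holds a sizable edge over 28 nm before any FinFET benefit is counted.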