All uArchs lose efficiency as they scale up. It's the natural consequence of adding more transistors that do important control work. I mentioned long, long ago that the vast majority of transistors in a modern x86 CPU aren't for doing math but for controlling the flow of data in order to feed the high-powered engines.
Let me state this for everyone to know: binary math is not hard, complex, or power intensive. Binary math is so fast and easy to do that it scales linearly with clock speed. The problem with complex, powerful processor designs isn't the speed of the math but getting enough information to actually do the math in the first place. Engineers have invented all sorts of cool and interesting methods to speed up the control of the math, everything from extremely fast local storage of data (L1 cache) and layered secondary and tertiary storage (L2/L3 cache) to building an electronic crystal ball to have what you need present before you need it (branch prediction and prefetch). The improvements from these tricks are so significant that the engineers now spend more time and energy creating and applying these tricks than actually building a faster math machine.
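To put a rough shape on the "feeding the math is the hard part" point, here is a toy model of a direct-mapped cache in Python. It is only a sketch: the function name `simulate_cache`, the 64-line by 64-byte geometry, and the two access patterns are all assumptions picked for illustration, not a model of any real CPU. The same number of loads is issued both times, and the only thing that changes is whether the data is already sitting in the cache when it is asked for.

```python
def simulate_cache(addresses, num_lines=64, line_bytes=64):
    """Count (hits, misses) for byte addresses on a toy direct-mapped cache.

    The geometry (64 lines of 64 bytes) is a made-up illustration value.
    """
    tags = [None] * num_lines          # which memory block each line holds
    hits = 0
    for addr in addresses:
        block = addr // line_bytes     # memory block this byte belongs to
        index = block % num_lines      # the one cache line it can live in
        if tags[index] == block:
            hits += 1                  # data already present: math can proceed
        else:
            tags[index] = block        # miss: fetch the block, evict the old one
    return hits, len(addresses) - hits


# 4096 loads of 8-byte values laid out back to back (8 values per line).
sequential = [i * 8 for i in range(4096)]

# 4096 loads that each land on a different 64-byte line.
strided = [i * 64 for i in range(4096)]

seq_hits, seq_misses = simulate_cache(sequential)
str_hits, str_misses = simulate_cache(strided)

print("sequential:", seq_hits, "hits,", seq_misses, "misses")
print("strided:   ", str_hits, "hits,", str_misses, "misses")
```

In this toy setup the sequential walk hits on 7 of every 8 loads, while the strided walk misses every single time, even though the "math" being fed is identical in both cases.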
AMD and Intel both do binary math the same way. It's so simple that there is no difference in how the math is executed, and yet Intel commands a strong lead in raw efficiency. This isn't because they sacrifice kittens to some dark satanic digital god, it's because their engineers have found better tricks and have built a better electronic crystal ball than AMD's engineers. Massive vector processors (GPUs) operate under the same principles: nearly all the advances are now in how to feed more data efficiently to the processors rather than making the processors faster. We are now devoting large swaths of silicon to the control and input/output functions over adding more execution functions.
So yes, ARM, like PPC, SPARC, MIPS, and x86, will experience the same scaling issue whereby they make the math units faster but then need to add exponentially more control logic to keep them active.