I see modern RISC processors needing just as much magic HW and SW to get them to "play nice," too. At least any that can compete at the workstation level. They require some fairly complex/expensive I/O systems and database/grid-compute software to get that performance.
IBM processors are usually packaged in giant MCMs with 100s of MB of cache/eDRAM. The MCM itself needs 800 Watts just to run it.
This is an old pic but similar is done with the latest Power7.
http://en.wikipedia.org/wiki/File:Power5.jpg
SPARC modules required similarly large and expensive amounts of cache to be useful. This is partly why they pretty much died off until Oracle breathed some life back into them. The price/performance just wasn't worth it.
As long as price/performance keeps getting better the underlying ISA is just 1 aspect of 1 piece of the puzzle.
http://www.theregister.co.uk/2012/03/08/supercomputing_vs_home_usage/page2.html
*Cough*
Considering I have several SUN stacks sitting about 50 feet from my desk I might have just a little bit of hands on experience with this.
To someone who doesn't know what they're looking at, the interconnects between a SPARC, PPC, and x86 board look the same. They aren't, not by a long shot.
SPARC itself is modular: you can pull a SPARC CPU, memory, or peripheral device out of a system and it won't crash. That circuitry you see is what allows that to happen. SPARCs now ship looking exactly like Xeons: a chip either in a flat socket package or part of a CPU card assembly (for hot-swap usage). Depending on the backplane you use, you can keep getting bigger and bigger; I've worked with 64-way systems before.
I'm really shocked that so few people actually know what happened to SUN. Their hardware was never the problem; it wasn't expensive or clunky. You could buy an Enterprise 25K if you wanted, an entire rack of equipment for a mainframe. But that era's most common product was actually the SunFire V240, a 2U dual-socket server designed for low-to-medium processing. After that were the V490s, which are 5U with two dual-socket CPU cards. The 880s and 890s were rare and expensive; you only bought them if you had a specific purpose for them. Each was just a bigger version of the 480/490: four CPU cards instead of two and twelve FC HDD bays instead of two, with lots more I/O capability due to additional PCI buses.
Which brings us to one of the starkest differences: x86 systems tend to have only one PCI bus, while SPARCs tend to have three to four separate PCI buses. That's what the extra I/O circuitry is for; it handles all the additional components you'll put in there. I have a box sitting outside in one of my racks that has four dual-port 4Gbps FC cards in it. Two form a double loop to the local FC disk array, and the other two form a double loop to the enterprise SAN. That's four 4Gbps pipes that get muxed via mpxio on both the back end and the front end, for 16Gbps in all directions. That kind of I/O isn't possible unless each set of cards gets its own bus. And this is an older V490 system. Newer T2/T3/T4 boxes can do 16GFC and dual 10GE.
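The bandwidth arithmetic above can be sketched as a toy calculation (numbers taken from the setup described; mpxio load-balancing details are simplified away, and this is illustration, not a sizing tool):

```python
# Toy arithmetic for the multipath setup described above:
# four dual-port 4Gbps FC HBAs, two cards per double loop,
# paths aggregated by mpxio (Solaris multipathing) on top.

PORT_GBPS = 4        # per-port line rate
CARDS = 4            # dual-port HBAs installed
PORTS_PER_CARD = 2

total_ports = CARDS * PORTS_PER_CARD   # 8 ports across both loops
raw_gbps = total_ports * PORT_GBPS     # 32 Gbps of raw port bandwidth

# What the post counts: four 4Gbps "pipes" muxed by mpxio,
# giving 16 Gbps of aggregate throughput in each direction.
aggregate_gbps = CARDS * PORT_GBPS

print(f"raw port bandwidth: {raw_gbps} Gbps")
print(f"mpxio aggregate per direction: {aggregate_gbps} Gbps")
```

The point of splitting cards across separate PCI buses is that each bus can feed its pair of ports at full rate; hang all four cards off one shared bus and the aggregate number above becomes theoretical.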
Oracle bought SUN because SUN had a history of bad management decisions. Solaris is difficult and archaic to work with; it's unforgiving and makes only the slightest attempt at a GUI. Oracle changed out the management and infused cash to get the T3s finished and out the door while furthering production of the T4s. And while I think Oracle is shady when it comes to customer service (you must pay for security updates), they're the king of databases for a reason. Oracle bought SUN because they knew they could make a profit with the company and further integrate SPARC + Solaris + Oracle Database + Oracle Middleware (formerly BEA WebLogic).
Anyhow, this was all about the x86 ISA being old and not scalable. It was designed in the 1970s and has stayed the same to maintain backwards compatibility. Only extensions have been added, in the form of coprocessors (SIMD / FPU), with the 64-bit iteration being just an extension of the registers to 64 bits and everything else kept the same. To get more performance out of the metal, CPU engineers have to design complex decoders and translators to act as middlemen between x86 binary and the internal RISC-like language of the CPUs. CPUs would be faster / cheaper / smaller if they didn't need to do that.
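That decode/translate step can be sketched like this. Purely illustrative: real decoders are hardware, the internal micro-op encodings are proprietary, and the instruction strings here are just stand-ins.

```python
# Toy sketch of a front-end decoder: turn x86-style instructions
# (which can mix a memory operand into an ALU op) into RISC-like
# micro-ops (load/store separated from arithmetic). Illustrative
# only; not any real CPU's micro-op format.

def decode(insn):
    op, *args = insn.split()
    dst = args[0].rstrip(",")
    if op == "add" and args[1].startswith("["):
        # CISC "add reg, [mem]" cracks into two micro-ops:
        # a load into a temp register, then a register-register add.
        mem = args[1].strip("[]")
        return [f"load tmp0, {mem}",
                f"add {dst}, {dst}, tmp0"]
    # Simple register-register ops can pass through roughly 1:1.
    return [insn]

for uop in decode("add eax, [rbx]"):
    print(uop)
```

The cost the post is pointing at is exactly this middle layer: die area and pipeline stages spent cracking variable-length CISC instructions into fixed micro-ops, which a native RISC front end wouldn't need.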