This may help a bit:
http://techreport.com/reviews/2006q1/aiw-x1900/index.x?pg=1
The 'pipelines' are so called because most people count them based on everything, including texture units (TMUs), but even nVidia has moved away from that method. While the X1800 had 16 shader units and 16 pipelines, ATi has now uncoupled the two (48 shaders, 16 'pipes') so that the chip can do more complex operations that may require multiple passes, while saving on transistor space.
Think about it like this.
Suppose you had a hypothetical design with 48 full pipes and even 48 ROPs (call it the Full48), and compare it to the 48/16/16 (shaders/TMUs/ROPs) design of the X1900 and the 24/24/16 design of the GF7900. The Full48 would likely require 2x or more the silicon/transistors, and therefore be much more expensive, and it would likely also have a lower yield simply based on failure percentages (more transistors usually means more failures/duds). So the X1900 tries to get some of the advantage of the Full48 in a smaller package.
Take a simple scene, just colour + Z: the 48/48/48 machine outputs 48 pixels per clock cycle; the GF7 processes 24 pixels but can only output 16 to the screen; the X1800 and X1900 both also output 16 pixels.
Now take a theoretical scene that requires 48 operations, done progressively in 3 passes, which results in 16 final pixels of output. Both the X1900 and the Full48 perform at the same speed. The GF7800/7900 takes 2x the clock cycles (or perhaps even loops, depending on complexity) to perform the operation (staggered), and the X1800 takes 3x the clock cycles.
Now nothing's ever that 'perfect' for balancing, but it does show you how the newer design favours complex workloads.
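To put numbers on the two scenes above, here is a rough Python sketch. It is only a toy model, not how any of these chips actually schedules work: it assumes every shader unit does one operation per clock, every ROP writes one pixel per clock, and the slower of the two stages sets the pace. The names `Design`, `ceil_div` and `clocks_needed` are made up for illustration, and the 48/16/16 style figures are read as shaders/TMUs/ROPs.

```python
from typing import NamedTuple


class Design(NamedTuple):
    name: str
    shaders: int  # pixel shader units, assumed 1 op per clock each
    tmus: int     # texture units (ignored in this shader/ROP-only sketch)
    rops: int     # assumed 1 pixel written out per clock each


def ceil_div(a: int, b: int) -> int:
    """Integer division rounded up."""
    return -(-a // b)


def clocks_needed(design: Design, pixels: int, ops_per_pixel: int) -> int:
    """Clocks to finish a batch, limited by the slower of shading and output."""
    shader_clocks = ceil_div(pixels * ops_per_pixel, design.shaders)
    rop_clocks = ceil_div(pixels, design.rops)
    return max(shader_clocks, rop_clocks)


DESIGNS = [
    Design("Full48", 48, 48, 48),  # the hypothetical full design
    Design("X1900", 48, 16, 16),
    Design("GF7900", 24, 24, 16),
    Design("X1800", 16, 16, 16),
]

# Simple scene: 48 pixels, 1 op each (colour + Z only).
simple_scene = {d.name: clocks_needed(d, pixels=48, ops_per_pixel=1) for d in DESIGNS}

# Complex scene: 16 final pixels, 3 ops each (48 operations in total).
complex_scene = {d.name: clocks_needed(d, pixels=16, ops_per_pixel=3) for d in DESIGNS}

for name in simple_scene:
    print(f"{name}: simple={simple_scene[name]} clocks, complex={complex_scene[name]} clocks")
```

Under those assumptions the complex scene comes out at 1 clock for the Full48 and X1900, 2 for the GF7900 and 3 for the X1800, while on the simple scene only the Full48 gets ahead, because everything else is stuck behind 16 ROPs.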
The problem is balancing what is most important. nV was favouring the texture side of the equation, having more TMUs in their 'unbalanced' designs (starting with the GF6600); ATi's unbalanced design favours pixel shaders. Both have the same number of ROPs, so at the very simplest they can only output the same number of pixels. Future designs like the R600 will be more unified, with the number of pixel and vertex shaders changeable depending on requirements/load. Unfortunately the ROP count remains 16 (or so rumour has it) and the TMU count is also rumoured to be fixed at 16. This means that in order to outperform clock for clock, the R600 needs more complex scenes than the X1900 and X1800 before it, and a texture-heavy situation would even favour the older GF7800/7900 design (and likely G80) because they are able to do more texture ops per clock. (The R600 will likely be faster than the GF7800/7900, so overall it may still come out ahead, but it wouldn't be as 'efficient', let's say.)
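The texture-heavy case can be sketched the same way. This assumes one texture op per TMU per clock and ignores clock speed, caching and everything else; the 96-fetch batch is an arbitrary number picked for illustration, the R600's 16 TMUs is only the rumour mentioned above, and `texture_clocks` is a made-up helper.

```python
def texture_clocks(texture_fetches: int, tmus: int) -> int:
    """Clocks needed if the batch is limited purely by texture units."""
    return -(-texture_fetches // tmus)  # integer division rounded up


# TMU counts as discussed in the post; the R600 figure is a rumour.
TMUS = {"GF7900": 24, "X1900": 16, "R600 (rumoured)": 16}

FETCHES = 96  # arbitrary texture-heavy batch, for illustration only

clocks = {name: texture_clocks(FETCHES, count) for name, count in TMUS.items()}

for name, n in clocks.items():
    print(f"{name}: {n} clocks for {FETCHES} texture fetches")
```

With 24 TMUs the batch takes 4 clocks against 6 for a 16-TMU part, which is the clock-for-clock edge on texture ops; a higher core clock could still close that gap in real time.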