Yes, it has to lose fillrate with more than 1 texture layer, because each extra layer takes another cycle: it only has 1 TMU per pipe, and that's why it loses half its fillrate with 2 texture layers. The other cards there don't lose half their fillrate with 2 textures because they have 2 TMUs per pipe. But with any more than 2 textures, the fillrate of all those cards drops in half too, and their memory bandwidth takes an even bigger hit, because more than 2 textures means a second pass, which wastes a lot of memory bandwidth.

The Kyro II doing 8 texture layers will need 8 cycles, so its fillrate will drop to an eighth: 44 Mpixels/s (352 Mpixels/s from that Anand single-texturing test, divided by 8). But remember this is the rate after all 8 textures are taken care of, so to picture the performance you have to think of it as a 44 Mpixels/s card playing a single-texture game (and it doesn't render hidden pixels either, so 44 Mpixels/s is worth more than it looks). You also have to remember that it won't need to write to the framebuffer 8 times, and it won't have to reload or retransform the polys 8 times. It does all of that once, even with 8 texture layers, so its bandwidth won't drop anywhere near as much as its fillrate.

Now with, say, an MX you're looking at 88 Mpixels/s theoretically, because it has 2 TMUs per pipe, so its fillrate only drops to a quarter. (That 88 is a quarter of its theoretical peak; its actual single-texture score in that Anand test you posted was 142 Mpixels/s, and a quarter of that is only about 36 Mpixels/s.) But in reality, since it has to make 4 passes to the framebuffer for each pixel and has to re-upload the polys 4 times, its already limited bandwidth will die completely. Its bandwidth will drop faster than its fillrate: it already needs 8 times the texture bandwidth for 8 times the textures, and once you add 4 passes for every pixel and 4 times the poly bandwidth, its bandwidth is completely gone. In that single-texturing test the MX only gets about 40% of its theoretical peak fillrate, and that's with only 1 texture layer, 1 pass and no poly reloading, so imagine how much of its fillrate it will actually be able to use after all those extra passes and after reloading the polys 4 times. I'd be surprised if it gets more than around 10% of its fillrate in those circumstances (by fillrate I mean the 88 Mpixels/s it could theoretically reach with 8 texture layers, not its peak single-texture fillrate), meaning it would be able to use about 11 Mpixels/s. Yes, this is a very, very rough estimate; there's no way I can sit down and work out exactly how efficient its bandwidth will be. It may be able to use 22 Mpixels/s, maybe 33 Mpixels/s, but I'd call that optimistic considering the terrible memory bandwidth problems it has even in single texturing with only 1 pass. With all those extra passes its bandwidth will be far, far less efficient than it is in single or dual texturing tests (40% efficiency in the Anand test).

One of the reasons the Kyro II is so dominant in the Serious Sam tests in that Anand review is that Serious Sam uses 4 texture layers (the fillrate tests in that review only use 1 and 2 texture layers, though). Theoretically that should take the Kyro II's fillrate down to 88 Mpixels/s in Serious Sam (352 divided by 4 texture layers), but look at those fps results: the Kyro II gets 61.5 fps at 1024x768x32 while the MX gets 31 fps. Now add 4 more texture layers to that (yes, I know this is simplistic) and you'd get something like this: Kyro II 31 fps, MX 16 fps, GTS 24 fps, Ultra 29 fps. So the Kyro II won't be super fast in an 8 texture layer game, but it should be playable, while on the MX it's unplayable. Again, these are all estimates from the benchmarks we have now, but at the moment that's the best I can do. Phew, thank god that's over :o) It was tough writing that, too much thinking, I need a drink :o) I hope you can understand what I was trying to say.
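To make the arithmetic above easier to check, here is a rough Python sketch of it. The function names are made up for illustration. The MX peak of 352 Mpixels/s is an assumption back-calculated from the 88 Mpixels/s figure (88 x 4), not a number taken from the review. The model ignores memory bandwidth completely, which is exactly the part that hurts the MX.

```python
from math import ceil


def effective_fillrate(single_texture_mpixels, layers, tmus_per_pipe):
    """Pixel fillrate left once every texture layer has been applied.

    Ignores memory bandwidth entirely. Each pipe applies `tmus_per_pipe`
    layers per cycle, so a pixel costs ceil(layers / tmus_per_pipe) cycles.
    """
    cycles_per_pixel = ceil(layers / tmus_per_pipe)
    return single_texture_mpixels / cycles_per_pixel


def scale_fps(fps, old_layers, new_layers):
    """Crude extrapolation: frame rate falls in proportion to layer count."""
    return fps * old_layers / new_layers


KYRO_SINGLE = 352  # Mpixels/s, single-texture figure quoted in the post
MX_PEAK = 352      # Mpixels/s, ASSUMED: back-calculated from 88 x 4
MX_MEASURED = 142  # Mpixels/s, measured single-texture score quoted in the post

kyro_8 = effective_fillrate(KYRO_SINGLE, layers=8, tmus_per_pipe=1)
mx_8_peak = effective_fillrate(MX_PEAK, layers=8, tmus_per_pipe=2)
mx_8_measured = effective_fillrate(MX_MEASURED, layers=8, tmus_per_pipe=2)

print("Kyro II, 8 layers:", kyro_8)
print("MX, 8 layers (from peak):", mx_8_peak)
print("MX, 8 layers (from measured score):", mx_8_measured)

# Serious Sam scores at 4 layers, extrapolated to 8 layers.
print("Kyro II fps:", scale_fps(61.5, 4, 8))
print("MX fps:", scale_fps(31, 4, 8))
```

The fps extrapolation is just a straight halving, so it says nothing about whether the MX would really hold up that well once the extra passes eat its bandwidth.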
And of course all of this is forgetting that if you make multiple passes to the framebuffer you're going to get horrible colour inaccuracies. So forget 16-bit colour on the GTS in games that use lots of texture layers, while on the Kyro II you could actually use 16-bit and it would look just as nice as it does in a single-texture game (it looks almost as nice as 32-bit colour). I'm pretty sure that even 32-bit colour will deteriorate if loads of passes are needed, like the 4 passes the GTS would need for an 8 texture layer game.
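Here is a toy Python sketch of why extra framebuffer passes cost colour precision in 16-bit. It is a deliberately simplified model with several assumptions that are mine, not from any spec: one 5-bit colour channel, every layer simply multiplies the colour by a hypothetical 0.9, and the framebuffer truncates with no dithering each time a pass is written out.

```python
import math

LEVELS_5BIT = 31  # highest value a 5-bit colour channel can hold


def quantise(value, levels=LEVELS_5BIT):
    """Truncate a 0..1 intensity down to a framebuffer step (ASSUMED: no dithering)."""
    return math.floor(value * levels) / levels


def blend(factors, layers_per_pass, levels=LEVELS_5BIT):
    """Multiply the layers together, quantising after each framebuffer pass."""
    colour = 1.0
    for start in range(0, len(factors), layers_per_pass):
        for factor in factors[start:start + layers_per_pass]:
            colour *= factor            # full precision inside the chip
        colour = quantise(colour, levels)  # precision lost on every write-out
    return colour


layers = [0.9] * 8  # hypothetical layer values, chosen only for illustration

exact = math.prod(layers)                      # no quantisation at all
one_pass = blend(layers, layers_per_pass=8)    # all 8 layers, written once
four_pass = blend(layers, layers_per_pass=2)   # 2 layers per pass, 4 write-outs

print("error with 1 pass:  ", abs(exact - one_pass))
print("error with 4 passes:", abs(exact - four_pass))
```

With these made-up values the 4-pass result lands a full step below the 1-pass result, because the rounding loss from each write-out carries into the next pass.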
BTW, in case you didn't know, the reason the Kyro II can do 8 texture layers in 1 pass is that it keeps the tile it's working on inside the chip until it has finished adding everything to that tile. Only after it has added all 8 textures does it send the tile to RAM, once. So apart from having to read 8 times the amount of textures, the memory bandwidth is unaffected on the Kyro II.
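As a rough illustration of that difference, this Python sketch counts framebuffer traffic and geometry submissions for one 1024x768x32 frame. It is a simplified model under my own assumptions: no overdraw, no Z-buffer or texture traffic, and every pass after the first reads the previous result back before blending onto it. The function name is hypothetical.

```python
from math import ceil


def frame_traffic(pixels, layers, layers_per_pass, bytes_per_pixel=4):
    """Framebuffer bytes moved and geometry submissions for one frame."""
    passes = ceil(layers / layers_per_pass)
    frame_bytes = pixels * bytes_per_pixel
    writes = passes * frame_bytes        # every pass writes the frame out
    reads = (passes - 1) * frame_bytes   # ASSUMED: later passes read it back to blend
    return {
        "passes": passes,
        "geometry_submissions": passes,  # polys are re-sent once per pass
        "framebuffer_bytes": writes + reads,
    }


PIXELS = 1024 * 768

# Tile kept on chip: all 8 layers in one pass, tile written to RAM once.
tiler = frame_traffic(PIXELS, layers=8, layers_per_pass=8)

# 2 TMUs per pipe: 2 layers per pass, so 4 passes for 8 layers.
multipass = frame_traffic(PIXELS, layers=8, layers_per_pass=2)

print("tiler:    ", tiler)
print("multipass:", multipass)
print("traffic ratio:", multipass["framebuffer_bytes"] / tiler["framebuffer_bytes"])
```

Texture reads are left out on purpose, since both approaches have to fetch the same 8 layers of texels either way.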