First of all, I like this idea of evaluating performance in particular games.
I think most everyone is aware that drivers/hardware are often optimized to take advantage of benchmarks. Although I wish they would just do level run-throughs at high/med/low resolutions and settings and do away with the "playable" thing. The other thing is that the reader is forced to assume there is no company bias, i.e. on one card the tester runs through the level looking at the ground as much as possible, and on the other card stares directly at explosions/effects throughout the level. Now if they want to get extensive: test two cards, run through each level 3-4 times, and provide min/max/avg for each run-through on each card, then an overall figure for each level on each card, then an overall figure for the entire game. Of course by unbiased testers. I think that would paint a pretty accurate picture.

For those of you defending synthetics: they can be easily manipulated, the equivalent of me getting a copy of an upcoming exam beforehand and scoring well on the test. My result on said test does not reflect my real-world knowledge of the topic. Had the exam been changed prior to my taking it, a much more realistic portrayal of my knowledge would have been recorded.
I mean, it's up to you: if card A scored better on a synthetic benchmark than card B, but card B performed better in rigorous real-world application than card A, you get to decide which you'd rather have, good on paper or good in application.
I would prefer repeated real-world testing (i.e. running the same level multiple times on each card and recording the values for each run).
I mean, come on, how can you argue against real-world testing? If I run a game's built-in benchmark on a card and get a low of 30 FPS, a high of 80, and an average of 50, but when I actually play the game I get a low of 15, a high of 50, and an average of 30, that's a little misleading, no? Especially if a card that benchmarked lower actually does better in the real world.
Not even the benchmarks are 100 percent consistent every time: run a benchmark 3 times in a row and tell me if you get the same score each time.
Barring somebody purposely influencing the real-world test (i.e. exploiting high-frame-rate situations by looking at the ground/sky, etc., for an extended period of time), I think you are going to get pretty accurate information.
Also, why isn't this scientific? Say you are at a pool table and hit the cue ball into the 8 ball 3 times, each time from the same distance, with the same force, etc., and record the direction, speed, and final resting position of the 8 ball after each hit. Then you use a computer physics simulation to do the same thing. How is the former not scientific? It's the same situation as comparing real-world game performance to a synthetic benchmark. Of course the real-world application won't be perfect every time, but the REAL WORLD ISN'T PERFECT, and what works in a simulation may prove not as good in actual application. The key is repetition and using averages. If the reviewers played through a level 10 times and handed you an average low, an average high, and a general average over all ten instances, would you still not trust it over the synthetic?!
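To make the "repetition and averages" idea concrete, here is a minimal sketch of how a reviewer could summarize repeated run-throughs. The function name and the FPS numbers are hypothetical, made up purely for illustration; real testing would log per-frame data from an actual capture tool.

```python
def summarize_runs(runs):
    """Each run is a list of FPS samples from one play-through of a level.
    Returns per-run (low, high, avg) tuples plus the averages across runs:
    average low, average high, and general average, as described above."""
    per_run = [(min(r), max(r), sum(r) / len(r)) for r in runs]
    avg_low = sum(low for low, _, _ in per_run) / len(per_run)
    avg_high = sum(high for _, high, _ in per_run) / len(per_run)
    overall_avg = sum(avg for _, _, avg in per_run) / len(per_run)
    return per_run, (avg_low, avg_high, overall_avg)

# Hypothetical data: three run-throughs of the same level on one card.
runs = [
    [28, 45, 60, 52, 33],
    [30, 48, 58, 50, 35],
    [25, 44, 62, 49, 31],
]
per_run, (avg_low, avg_high, overall) = summarize_runs(runs)
print(f"avg low={avg_low:.1f}  avg high={avg_high:.1f}  overall avg={overall:.1f}")
```

Run this for each card and compare the three summary numbers side by side; the more runs you average over, the less any single lucky or unlucky play-through skews the result.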
Sorry for the long rant; I just cannot understand why anyone would prefer a synthetic benchmark to rigorous real-world testing.
Again, I am not a big fan of the way they do it, but I would still prefer real-world benchmarks to synthetics.