The premise of the testing methodology was good, but the execution was flawed. You began the test with an example that was NOT a true test, and you admitted it. By running the multi-monitor test first, you gave the test participants a much higher likelihood of detecting which system used which card. To be accurate, the testing should have started with the single-monitor comparison, so that opinions began forming on something closer to a level playing field.
Remember that the participants who tested Eyefinity vs. Surround were different from the ones who tested single 30" vs. single 30".