You can certainly track down each generation's marketing material. It generally compares the new generation to the previous one, but the numbers are usually pretty bogus because they focus on whatever makes the new part look best.
It's pretty rare for someone to take the time to put all the generations together and run the same tests on them. That is why I recommend PassMark. You'll get a single-core and a multicore score for each CPU. You can easily graph those, and you can look up what their test consists of.
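If you want numbers rather than a graph, here is a minimal Python sketch that turns a list of scores into generation-over-generation percentage gains. The function name `generation_gains`, the labels, and the scores are all made up for illustration; they are placeholders, not real PassMark results, so swap in the scores you look up yourself.

```python
def generation_gains(scores):
    """Return (name, percent change vs. the previous entry) for each entry after the first.

    `scores` is a list of (name, score) pairs ordered oldest to newest.
    """
    gains = []
    for (_, prev), (name, cur) in zip(scores, scores[1:]):
        gains.append((name, round((cur - prev) / prev * 100, 1)))
    return gains


# Placeholder values only, NOT real benchmark results.
single_core = [("gen A", 2000), ("gen B", 2200), ("gen C", 2750)]

for name, pct in generation_gains(single_core):
    print(f"{name}: {pct:+.1f}% vs previous generation")
```

Run it once with the single-core scores and once with the multicore scores, since the two can move quite differently from one generation to the next.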
As for the more complex benchmarks you are after, you are probably going to have to track down individual reviews and compare them. Zip and some sort of render test are usually included in standard test suites, but the exact tests may vary from generation to generation.