My counterarguments to this are:
However, I will say that the data set would be better if they added a frame time graph.
- This example has too small a sample size to be useful. I'm nitpicking here, sure, but if the upward spike was intermittent, it doesn't matter over the long run.
- Consider this: a typical benchmark run is about 60 seconds. If performance averages 100 FPS, that's a sample of roughly 6,000 frames. Even if one of those seconds ran at 200 FPS, the overall average would only rise by about 1.67 FPS (see the sketch after this list).
- Unless there's a blip where the camera is staring at an empty skybox, most games won't suddenly shoot up in FPS. I also can't imagine a scenario where one CPU would have such a blip and another wouldn't.
- Practically all benchmarks report an average, which is the number most people will use because it's right there. If you have a problem with that, take it up with the benchmark developers.
- Peering further down this rabbit hole, it appears some benchmark utilities are not even using arithmetic means to report an "average", if this Reddit post is correct: https://www.reddit.com/r/hardware/comments/kyw1oe/comment/gjk28jc/?utm_source=share&utm_medium=web2x&context=3
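To put numbers on the second bullet above, here's a quick sketch with made-up values showing how little a single one-second spike moves a 60-second average:

```python
# Sanity check: a single 200 FPS second in an otherwise steady 100 FPS,
# 60-second run barely moves the average. All numbers are illustrative.
baseline_seconds = 59      # seconds spent at the baseline frame rate
baseline_fps = 100
spike_seconds = 1          # one intermittent spike
spike_fps = 200

total_frames = baseline_seconds * baseline_fps + spike_seconds * spike_fps
total_time = baseline_seconds + spike_seconds
average_fps = total_frames / total_time

print(f"Average FPS with spike: {average_fps:.2f}")                  # ~101.67
print(f"Increase over baseline: {average_fps - baseline_fps:.2f}")   # ~1.67
```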
Textures don't reside in CPU cache. Also, calibrating to some arbitrary FPS and seeing what quality settings you can get is not really a useful metric when benchmarking the processor. The goal is to see how much performance you can get out of the processor, period, not a combination of performance and image quality.
As an example: if I'm getting 100 FPS, I've identified that my CPU is the limiting factor, and I want to know which CPU gets me, say, 240 FPS in a game (because I happen to own a 240 Hz monitor), then if everything is "calibrated" to 144, how do I know which CPU to get?
They're using a geometric mean for the specific purpose of lessening the effect of those outliers; see https://sciencing.com/differences-arithmetic-geometric-mean-6009565.html for the difference between the two.
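Roughly, the effect looks like this (the per-game FPS numbers are invented purely to show the behavior):

```python
# Why a geometric mean dampens outliers when summarizing FPS across games.
# The per-game results below are made up for illustration.
import math

fps_results = [95, 102, 110, 98, 400]   # one outlier result

arithmetic_mean = sum(fps_results) / len(fps_results)
geometric_mean = math.prod(fps_results) ** (1 / len(fps_results))

print(f"Arithmetic mean: {arithmetic_mean:.1f}")   # 161.0, dragged up by the outlier
print(f"Geometric mean:  {geometric_mean:.1f}")    # ~133, far less affected
```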
The previous post is already too long and I can't quote everything, so I'll just put my points one by one here.
The only reason I put a few suggestions on how to benchmark properly is that we've never had such a large L3 cache before. Now that we have the 5800 3D, we DO need to change the way we benchmark so the results reflect the CPU's real performance instead of just its L3 cache performance.
1. On the average FPS vs. the 99th percentile FPS.
For example, in Tom's test here:
Since the 5800 3D has a large L3 cache, its peak FPS is much higher than its 99th percentile FPS. That's because while the working set fits in the L3 cache the FPS gains a lot, but when the L3 is reloading and accesses miss the cache lines, the FPS also drops a lot. That's why the 5800 3D has a low 99th percentile FPS and a high average FPS. From a player's point of view, the 99th percentile figure is the real one, and a very high peak FPS is not a good thing, because the frame rate is unstable and fluctuates too much, especially for FPS shooter gamers, who hate jitter.
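For reference, here's a minimal sketch of how those two figures are typically derived from per-frame times (the frame times below are invented, not taken from Tom's data):

```python
# Average FPS vs. a 99th-percentile ("1% low") figure from per-frame times.
# 6000 frames: 98% render in 10 ms, 2% stutter at 25 ms (made-up numbers).
frame_times_ms = [10.0] * 5880 + [25.0] * 120

total_seconds = sum(frame_times_ms) / 1000.0
average_fps = len(frame_times_ms) / total_seconds            # ~97 FPS

# The frame time that 99% of frames beat; reviewers often report its
# FPS equivalent as the "1% low" / 99th percentile figure.
p99_frame_time = sorted(frame_times_ms)[int(len(frame_times_ms) * 0.99)]
percentile_99_fps = 1000.0 / p99_frame_time                  # 40 FPS here

print(f"Average FPS:         {average_fps:.1f}")
print(f"99th percentile FPS: {percentile_99_fps:.1f}")
```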
2. On graphics settings and calibrating to 144 Hz, my explanation is here:
The texture loading path in a game is: texture file on disk -> CPU -> RAM -> CPU -> PCIe -> GPU VRAM -> the CU's unified cache (L1).
Before we had the AMD 5800 3D, every CPU benchmark was testing how fast the CPU could process rendering requests from the DX API, so unlimited FPS made sense and it was a fair game.
But since the 5800 3D has extra L3 cache, things have changed a lot. Unlimited FPS no longer tests only how fast the CPU can process rendering requests from the API; the 5800 3D also caches texture data along the loading path mentioned above. So if you still run at unlimited FPS, you would need to test every graphics setting instead of just one like Tom's did here, otherwise the results won't show how the CPU really performs once textures get larger (extreme graphics levels), and gamers still have no idea which CPU is best for their own settings. Even after reading Tom's benchmark, I still don't know which CPU I should pick if I play at 4K extreme on a 144 Hz 32" monitor. The reason is that FPS across different graphics settings is not linear: if the working set overflows the L3 cache, the 5800 3D's FPS will be slower than the 12900K's, but if the textures are small and everything hits the L3 cache, the 5800 3D's FPS gains a lot.
3. How to synthesize the multi-game result.
If Tom's test already uses an arithmetic mean, that's great, but people buy new CPUs for new games. All the tested games, as far as I know, are multiple years old and NONE of them are in the Steam top 10; they don't include Elden Ring, New World, etc. Old games and new games really perform totally differently in favor of different CPUs. My story: when I picked my CPU for New World based on Tom's testing, I went with the AMD 5600X, but after I got an Intel 11700K, I found the 11700K delivered 30% higher FPS than the AMD 5600X.
So please update Tom's benchmark game list; the games tested are really too old to still have many players nowadays.
Another thought about the multi-game result: even if it's already an arithmetic mean, could weights be added per game? For example, if 1M gamers play Elden Ring but only 1K gamers play Watch Dogs, giving those two games the same weight in the mean FPS is very misleading for gamers (see the sketch below).
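A small sketch of what that weighting could look like; both the FPS figures and the player counts here are invented for illustration:

```python
# Equal-weight mean vs. a player-count-weighted mean of per-game FPS.
# All numbers are hypothetical, not taken from any review or player stats.
results = {
    # game:       (avg_fps, players)
    "Elden Ring": (120, 1_000_000),
    "Watch Dogs": (150, 1_000),
    "Game C":     (90, 250_000),
}

equal_weight_mean = sum(fps for fps, _ in results.values()) / len(results)

total_players = sum(players for _, players in results.values())
weighted_mean = sum(fps * players for fps, players in results.values()) / total_players

print(f"Equal-weight mean FPS:    {equal_weight_mean:.1f}")   # 120.0
print(f"Player-weighted mean FPS: {weighted_mean:.1f}")       # ~114.0
```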