army_ant7:
I guess that's nice to know, that Intel processors are configured nicely, though I have heard before how they use RAM more efficiently compared to AMD when run at the same speed. Does L3 cache have to be full-speed to be optimal? I'm guessing it's situation-dependent, but does it have to "keep up" with the other parts of the CPU? I've mostly heard from you that Bulldozer (and I guess Piledriver as well) don't have theirs running at full speed. I'm wondering if making it go faster than the core clock rate (let's say on BD and PD) would have any advantages, or if even having it substantially lower would be enough for it to no longer bottleneck the CPU's potential performance...
That method of measuring frame latency sounds nice, but I was wondering how regular benchmarking programs (maybe like Fraps) do it, though I probably shouldn't be asking here since maybe only the actual programmers know, unless they've told anyone else. Hehe...
How much of an impact the L3 cache has on performance is definitely situation-dependent. For example, the FX-4300 and the A10-5800K have nearly identical performance in many workloads, but in some workloads, the FX-4300's L3 cache (slow as it may be) can give it a substantial advantage, around 30% in some cases. The latency, bandwidth, and capacity of the cache are all factors that different programs favor differently, and to make it even more complicated, how much of an impact the cache has also depends on the same properties of the L1 and L2 caches and of main memory. For example, the L3 cache's capacity can be more important for capacity-bound programs when there isn't much L2 cache, but if the L2 cache is sufficient for a capacity-bound application, the L3's capacity might not matter much.
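To illustrate that last point, here's a toy sketch of why L3 capacity only matters when the working set spills out of L2. All of the sizes and latencies below are made-up illustrative numbers, not measurements of any real CPU:

```python
# Toy model: which cache level the working set fits in sets the effective
# latency it sees. All sizes/latencies are made-up illustrative numbers.

def effective_latency_ns(working_set_kib, l2_kib=2048, l3_kib=8192,
                         l2_ns=4.0, l3_ns=15.0, mem_ns=60.0):
    if working_set_kib <= l2_kib:
        return l2_ns    # fits in L2; L3 capacity barely matters
    if working_set_kib <= l3_kib:
        return l3_ns    # only fits thanks to L3; its capacity is the win
    return mem_ns       # spills to RAM with or without L3

print(effective_latency_ns(512))    # 4.0  -> L2 is enough
print(effective_latency_ns(6144))   # 15.0 -> L3 capacity saves the day
print(effective_latency_ns(32768))  # 60.0 -> too big even for L3
```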
The cache's frequency is just one factor in its bandwidth and latency. Increasing the frequency increases bandwidth while decreasing latency, kind of like how going from DDR3-1600 9-9-9-24 to DDR3-2133 9-9-9-24 increases bandwidth while decreasing latency, because timings are measured in clock cycles, not in units of time.
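To put rough numbers on the RAM analogy, here's the back-of-the-envelope arithmetic for the two modules named above:

```python
# Back-of-the-envelope CAS latency for the DDR3 example above.
# DDR3-1600 does 1600 MT/s on an 800 MHz I/O clock (two transfers per
# clock), and a timing like CL9 is counted in those clock cycles.

def cas_latency_ns(transfer_rate_mts, cas_cycles):
    clock_mhz = transfer_rate_mts / 2.0       # DDR -> actual clock
    return cas_cycles / clock_mhz * 1000.0    # cycles at MHz -> nanoseconds

print(cas_latency_ns(1600, 9))  # 11.25 ns for DDR3-1600 CL9
print(cas_latency_ns(2133, 9))  # ~8.44 ns for DDR3-2133 CL9
```

Same cycle count, higher clock, so both bandwidth and real-time latency improve at once.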
Having the L3 cache run at the CPU frequency should help quite a bit compared to running it far below the CPU frequency. More bandwidth and lower latency never hurt, AFAIK, and in many things, such as gaming, they really help. Running the L3 cache above the frequency of the CPU cores would probably still help performance, but I think that running the cache at a frequency that is equal to, or a multiple of, the CPU frequency is probably ideal. Going over the CPU frequency by, say, double would probably help performance because it'd still be increasing bandwidth while decreasing latency, although a 6+ GHz cache would probably require some high voltage, and if so, it'd probably consume a lot of power while generating a lot of heat.
About having to keep up with the rest of the CPU: well, yes, the cache hierarchy should. If a CPU's caches fail to keep up well enough, they can hurt performance, and judging by the gains you get from raising the L3 cache frequency of AMD CPUs, performance most certainly is being hurt by AMD's incredibly slow caches.
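To make "keeping up" a bit more concrete, here's a toy average memory access time (AMAT) calculation. The hit rates and latencies are invented purely for illustration, not measured Bulldozer or Piledriver figures:

```python
# AMAT = L1 hit time + L1 miss rate * (L2 hit time + L2 miss rate *
#        (L3 hit time + L3 miss rate * memory time)); all times in ns.
# Every number below is made up purely to show the shape of the effect.

def amat_ns(l1=1.0, l2=4.0, l3=15.0, mem=60.0,
            l1_miss=0.10, l2_miss=0.40, l3_miss=0.30):
    return l1 + l1_miss * (l2 + l2_miss * (l3 + l3_miss * mem))

print(amat_ns(l3=15.0))  # 2.72 ns with a slow (e.g. half-speed) L3
print(amat_ns(l3=7.5))   # 2.42 ns with the same L3 at double the clock
```

Even though the L3 only sees the accesses that miss L1 and L2, shaving its latency still moves the average noticeably, which is the same mechanism behind the gains from raising the L3 clock.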
The original Phenoms were the first CPUs from AMD that had L3 cache, IIRC. AMD has not improved L3 cache frequency since then, and it seems like their L3 cache latency has not improved as measured in real time either, at least not since Phenom II, hence my distaste for AMD's current L3 cache situation. Did they really think that keeping the cache at roughly the same performance it had five years ago, albeit with higher capacity now, was a good idea? We can prove that it hurts performance significantly, so I'd say that it's not, and like I've said before, IDK why AMD keeps using such slow cache. What some might find funny is that if Bulldozer had the performance of Intel's L3 cache, Bulldozer might have far higher performance, both multi-threaded and per core.
I think that regular benchmarking programs just count the frames rendered in each second, or over some still fairly large sub-second interval, but I haven't really looked into the specifics (see the sketch after the links below for the general idea). I know that they don't measure individual frame latencies, and that's why you can't see stutter by looking at FPS numbers. If you want to read more about it, you could go back to the article that I linked; here's the page talking about their measurement method:
http://techreport.com/review/23662/amd-a10-5800k-and-a8-5600k-trinity-apus-reviewed/4
Also, here's a link from that page that goes further in-depth on their methodology:
http://techreport.com/review/21516/inside-the-second-a-new-look-at-game-benchmarking
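And since I brought it up above, here's my rough guess at the difference between an FPS counter and frame-latency logging. This is just a sketch of the general idea, not how Fraps or any real tool actually does it:

```python
# FPS counting vs. per-frame latency: 59 fast frames plus one 200 ms
# hitch still looks like ~60 FPS, but the latency list exposes the spike.

def fps_per_second(frame_times):
    """Bucket frame timestamps into whole seconds and count them."""
    counts = {}
    for t in frame_times:
        counts[int(t)] = counts.get(int(t), 0) + 1
    return counts

def frame_latencies_ms(frame_times):
    """Time between consecutive frames; spikes here are the stutter."""
    return [(b - a) * 1000.0 for a, b in zip(frame_times, frame_times[1:])]

# Fake timestamps: a steady ~13.6 ms per frame with one 200 ms hitch.
times, t = [], 0.0
for i in range(60):
    t += 0.200 if i == 30 else 0.0136
    times.append(t)

print(fps_per_second(times))           # ~59-60 frames in the first second
print(max(frame_latencies_ms(times)))  # ~200 ms spike, invisible in FPS
```

The average looks fine in both cases; only the per-frame numbers show the hitch, which is exactly the point the TechReport articles make.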