xtreme5:
The cache doesn't really matter much in the dual-core series, which have 2MB or 4MB of cache, but those amounts are plenty for those CPUs.
Cache is just like RAM, but cache is much faster memory than RAM. Manufacturers use less cache in newer CPUs; if they used more, it would just be wasted. An example: the Core 2 Quad Q9650 has 12MB of cache, while the i3 2100 uses only 3MB of cache but has 100MHz more clock speed. Now if we compare those two CPUs, the Q9650 is beaten by the i3 2100 in some benchmarks, even with that huge difference: in some ways, 3MB beats 12MB.
www.anandtech.com/bench/Product/289?vs=49
On the other hand, the Q9650 has 4 cores, whereas the i3 uses only 2 cores.
While I don't disagree that obnoxiously large cache sizes do increase latency, there are two problems with your comparison of the Core 2 Quad Q9650 and the i3 2100:
1. You're assuming clock speed = performance. All clock speed means is that the CPU executes that many cycles per second. It does not in ANY way indicate how many instructions it performs in that time, or, for that matter, how much work one instruction actually does (though instructions-per-second does tend to be a better metric for modern CPUs than pure clock rate). If CPU A runs at 5 times the clock speed of CPU B, but CPU B can perform every instruction in one clock cycle while CPU A takes 6 clock cycles at minimum, B has 20% higher performance than A, despite cycling 5 times "slower." (The first sketch after this list puts numbers on that.)
2. You're linking the clock speed to the cache access speed. Cache is memory. Registers are also memory, but they are SIGNIFICANTLY faster than cache. In fact, it is entirely possible to keep an entire computation in registers and execute some VERY complex code without once pushing data into or pulling data from memory. This is most easily done in assembly language, where you can directly command the CPU to access things in a very specific pattern. That said, there is still SOME cache access going on, but not in the way you'd think: the best you can do is restrict it to instruction fetches, and if done right, most CPUs will pre-load all of that into the instruction cache, which tends to sit at the L1/L2 levels (the fastest ones). I promise you, not ONE company is going to measure their CPU's core clock rate based on higher-level data-cache latency. Not only is that too specific a problem (it's not always relevant, and it's not really related to the base-level speed of the CPU), but it also tends to make them LOOK bad, because it's going to be slower. Cache latency and cache architecture are fundamentally separate from CPU clock rate, and are not to be confused. (The second sketch after this list shows a register-only loop.)
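To put concrete numbers on point 1, here's a minimal sketch in C. The clock speeds and cycle counts are the made-up ones from the example above, not measurements of any real CPU:

    #include <stdio.h>

    int main(void) {
        /* Hypothetical CPUs from point 1: B runs at 1 GHz, A at 5x that. */
        double clock_b = 1e9;
        double clock_a = 5.0 * clock_b;
        double cycles_per_insn_b = 1.0;  /* B: every instruction in one cycle */
        double cycles_per_insn_a = 6.0;  /* A: six cycles minimum */

        /* Throughput = cycles per second / cycles per instruction. */
        double ips_a = clock_a / cycles_per_insn_a;  /* ~833 million insns/sec */
        double ips_b = clock_b / cycles_per_insn_b;  /* 1000 million insns/sec */

        printf("B outruns A by %.0f%%\n", (ips_b / ips_a - 1.0) * 100.0);
        return 0;
    }

Run it and it prints 20%: the chip with the "slower" clock wins on pure throughput.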
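And for point 2, here's a sketch of a loop whose entire working set fits in registers. It's plain C rather than assembly, and register allocation is ultimately up to the compiler, but any optimizing compiler will typically keep all three variables in registers, so the loop body generates no data-cache traffic at all; only instruction fetches touch cache:

    /* Sum of squares of 1..n. With optimizations on, acc, i, and n
       typically live entirely in registers: no loads, no stores, and
       therefore no data-cache (let alone RAM) access inside the loop. */
    long sum_of_squares(long n) {
        long acc = 0;
        for (long i = 1; i <= n; i++)
            acc += i * i;  /* pure register arithmetic */
        return acc;
    }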
That all being said, a bigger cache can still quite often be a much better thing. You might wonder why, and the short answer is simply that RAM access is almost ALWAYS going to be CONSIDERABLY slower.
You might be wondering, "well, how much slower?" The answer, as one might expect, is that it varies. On average, though, modern RAM takes about 3 times longer than the SLOWEST cache level, and that level is, on many architectures, about 9 times slower than the FASTEST cache level, making main memory access a whopping 27 times slower than the fastest cache access.
I don't know about you, but I'd definitely be on board with a cache at even half the speed of the highest level if it meant a 30% boost in memory access time and fewer cache misses in well-designed software. I specify well-designed because it is fairly easy to build software that misses cache on pretty much EVERY access and has to constantly go out to main memory (the sketch after the next paragraph shows exactly that).
To put this in perspective, an average CPU can perform a square-root operation in approximately 27 cycles. This operation is NOTORIOUSLY slow, as standard instructions go, in the programming community. However, how long does a cache miss take? Somewhere in the ballpark of 200-500 cycles. I could perform almost 30 square roots in the time it takes for you to simply ACCESS something. However, if that thing was already cached, then depending on how deep in the caching layers it sat, you could have it in about the time of a single square root.
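Here's a minimal sketch of what "missing cache on EVERY access" looks like. C lays out 2D arrays row by row, so walking a big matrix column-first jumps 16KB between consecutive accesses; once the matrix dwarfs the cache, nearly every one of those accesses is a miss. The sizes here (64-byte cache lines, a 64MB matrix) are typical-but-assumed numbers, not gospel:

    #include <stddef.h>

    #define N 4096            /* 4096 x 4096 ints = 64MB, way bigger than any cache */
    static int grid[N][N];

    /* Cache-friendly: consecutive accesses are adjacent in memory, so one
       64-byte line serves 16 ints before the next line is even needed. */
    long sum_by_rows(void) {
        long sum = 0;
        for (size_t r = 0; r < N; r++)
            for (size_t c = 0; c < N; c++)
                sum += grid[r][c];
        return sum;
    }

    /* Cache-hostile: consecutive accesses are N * sizeof(int) = 16KB apart,
       so nearly every access lands on an uncached line and pays the full
       couple-hundred-cycle round trip to main memory. */
    long sum_by_cols(void) {
        long sum = 0;
        for (size_t c = 0; c < N; c++)
            for (size_t r = 0; r < N; r++)
                sum += grid[r][c];
        return sum;
    }

Same data, same arithmetic, wildly different runtimes, purely because of the access pattern.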
At the end of the day, smaller cache size does not mean faster run speed. It simply means faster cache access speed. If I had a cache with only enough space for a single "word" of data (4 bytes, or 32 bits, by x86 standards, at least), it would likely be comparable in speed to a register, which is the fastest form of "memory" there is: it passes data around inside the CPU's actual processing machinery, such as the ALU, or "arithmetic logic unit." However, such a cache would be virtually pointless, because the point of cache is not to be as fast as humanly possible. What would be even faster is to not use memory at all, but that's simply not reasonable from a programming standpoint, since it would limit our programs to only as much memory as fits in registers (on the order of less than half a kilobyte on even the most elaborate CPUs). So why do we use cache? Again, because it's WAY faster than main memory. So, more or less, larger cache = faster run speed (usually).
However, there is actually somewhat of a cap to this speed boost, and that cap happens to be the clock rate. Basically, it comes down to this: cache is organized into "lines" of memory, and if you're running straight down a cache line as you execute your program, the CPU will generally notice and automatically go grab the next line FOR you. That means the next line will be ready to go, or at the very least already on the way, by the time you reach the end of the current one (see the sketch below). So there is a point where having MORE cache wouldn't do you much good anyway, because it would take more time to fetch it than to run through it, or significantly more time to run through it simply because it's so long.
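That "running down a cache line" case looks like this in C. Nothing here explicitly asks for prefetching; the claim (a safe bet on most modern CPUs, but still an assumption about the hardware) is that the prefetcher recognizes the sequential pattern and fetches ahead on its own:

    #include <stddef.h>

    /* Streaming straight through memory, front to back. On most modern
       CPUs the hardware prefetcher spots this sequential pattern and
       starts fetching line k+1 (or further ahead) while we're still
       summing line k, so the loop rarely stalls on memory, and extra
       cache capacity would buy almost nothing for this access pattern. */
    long stream_sum(const int *buf, size_t n) {
        long sum = 0;
        for (size_t i = 0; i < n; i++)
            sum += buf[i];
        return sum;
    }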
Intel (in case you haven't noticed) has settled into a sort of niche of cache size over the last few generations. While caches were actually getting LARGER four or five generations of CPU ago, they're now holding at roughly 4MB. And that's not the only thing: the cache is also split into more levels than before. This way, missing in one level of cache increases the waiting time more linearly, and we don't get such big leaps in performance loss.
So while yes, it is somewhat true that a smaller cache is faster than a bigger one, again, it's not about how fast the cache is. It's about how well it performs with regard to not MISSING cache lines and needing to walk AAALLL the way over to main memory to bring the data back to the cache.