Six-Core Analysis: Intel Core i7-980X Scaling
Intel's Turbo Boost technology provides a mechanism for improving system performance, most significantly in lightly threaded apps, but also at peak loads. What is the feature's impact on a Gulftown-based Core i7-980X processor with different core counts?
After having looked at AMD’s six-core Phenom II X6 across all of its possible core configurations, it’s time to do the same with Intel’s Core i7-980X flagship. How do performance, power consumption, and power efficiency change when fewer cores are utilized? Is the 32 nm Core i7 top model best with all six cores, or does some combination of fewer cores deliver the optimal experience?
The testing we did on AMD's Thuban-based six-core chip revealed two important things.
First, we had to realize that many applications still don't benefit from multiple cores. Users would realize much more performance if software did a better job of supporting the available hardware. We find it frustrating to see AMD and Intel deliver extremely powerful CPUs only to have their potential remain underutilized, especially in mainstream applications.
Our second finding was about efficiency. The Phenom II X6 shows the best performance per watt with all six cores active, as performance gains are more substantial than the additional power required to operate the higher core count.
Is this also the case on Intel’s flagship? Does using all six cores provide the best power efficiency? Will power consumption in idle decrease if we switch off individual cores? Let’s find out.
Both processor flagships from AMD and Intel are equipped with performance-enhancing features that allow the CPUs to increase clock speeds when two requirements are met. First, CPU load has to go through the roof, and secondly, there has to be sufficient thermal headroom for increasing the clock rate. The features, however, are implemented differently.
AMD’s Turbo CORE function only knows one acceleration mode, while Intel implements two (at least on this particular model; other CPUs are more dynamic). The first mode applies when all cores are accelerated (a 133 MHz boost). The second kicks in when only one or two cores are active and can benefit from additional clock speed (up to 266 MHz). The Turbo Boost implementation is more aggressive on Intel’s 32 nm processors, whether in dual- or multi-core models, but it's still notable on the 45 nm Core i5 and i7 processors. Note that Turbo Boost accelerates cores by increasing the CPU’s multiplier within a set range, but the feature can't always take advantage of maximum acceleration if the processor is already operating close to its thermal/power limits.
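The two-mode behavior described above can be sketched in a few lines. This is only an illustration of the rule as stated in the text, not Intel's actual control logic: the function name, the `has_headroom` flag, and the decision to treat every configuration with more than two active cores as the first mode are assumptions made for the example.

```python
BIN_MHZ = 133  # size of one Turbo Boost step, per the text


def turbo_clock_mhz(base_mhz: int, active_cores: int, has_headroom: bool = True) -> int:
    """Return the highest clock the two-mode scheme described above would allow.

    Assumption: one or two active cores may gain two steps (266 MHz);
    any larger number of active cores gains one step (133 MHz).
    """
    if active_cores < 1:
        raise ValueError("at least one core must be active")
    if not has_headroom:
        # Near the thermal/power limit, no acceleration is applied.
        return base_mhz
    bins = 2 if active_cores <= 2 else 1
    return base_mhz + bins * BIN_MHZ


# The base clock is a caller-supplied placeholder (3200 MHz here).
for cores in (1, 2, 4, 6):
    print(cores, "active cores ->", turbo_clock_mhz(3200, cores), "MHz")
```

In practice the boost is an upper bound rather than a guarantee, which is why the sketch returns the base clock whenever headroom is missing.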
AMD’s Turbo CORE basically works like a reversed implementation of Cool'n'Quiet, AMD's power-saving feature for CPUs. To make a long story short, Turbo CORE exploits thermal headroom if there's sufficient workload demand, and it does so for exactly three cores (unless you alter that through AMD's OverDrive software). In theory, this favors Intel’s configuration, since higher clock speeds should be available when only a few cores are required. Only limited acceleration is available if all cores are involved, since there would be little headroom left. AMD’s feature, on the other hand, also kicks in when needed, but probably reaches thermal limits quicker because all cores are involved at all times. However, this is just a theory, and we need to put it to the test and directly compare Turbo Boost against Turbo CORE in a different article.
Regarding our test platform, we found that you can't just pick any socket LGA 1366 motherboard and expect to reduce the number of active cores. Fortunately, we found a feature for switching off individual cores on Gigabyte’s EX58-UD4P with the F12 BIOS version. Although this might not be a really important BIOS switch for most users, it's worth exploring, since power consumption does decrease if you switch off Intel cores. This wasn’t the case on our Phenom II X6 1090T test system. Here, idle power remained constant whether one or six cores were used.
3ds Max scales extremely well with each additional core.
Similar conclusions arise with 7-Zip, although you’re not getting a lot more performance beyond four or five cores.
Cinebench in its multi-threaded run shows that real life performance doesn't scale linearly with every processing core added. There is a bit of overhead incurred with each addition. Again, we’re seeing best results from the six-core configuration.
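One way to quantify "doesn't scale linearly" is to compare the measured speedup against the core count. The sketch below does that for a set of runtimes. The function name is hypothetical and the sample runtimes are placeholders chosen for illustration, not results from our Cinebench runs.

```python
def scaling_report(runtimes_s: dict[int, float]) -> dict[int, tuple[float, float]]:
    """Map core count -> (speedup vs. one core, parallel efficiency).

    Parallel efficiency is speedup divided by core count; 1.0 would mean
    perfectly linear scaling.
    """
    t1 = runtimes_s[1]
    return {n: (t1 / t, t1 / t / n) for n, t in sorted(runtimes_s.items())}


# Placeholder render times in seconds, NOT measurements from this article.
example = {1: 600.0, 2: 310.0, 4: 165.0, 6: 120.0}
for cores, (speedup, efficiency) in scaling_report(example).items():
    print(f"{cores} cores: {speedup:.2f}x speedup, {efficiency:.0%} efficiency")
```

An efficiency value that slips further below 100% as cores are added is the overhead the Cinebench result points to.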
Adobe’s Acrobat 9 always takes at least a few seconds to generate a PDF document from a complex Word or PowerPoint file. Our benchmark uses a 115-page presentation, and the time savings on multiple cores versus a single core are embarrassing for Adobe. It should be possible to parallelize this type of workload to a much greater extent. As things stand, all you really need is a fast, dual-core CPU.
Photoshop CS4 is a perfect example of how applications can take maximum advantage of modern multi-core processors.
WinRAR is thread-optimized and benefits from each CPU core enabled during testing, but benchmark variance is about as large as the performance gains witnessed once you exceed three cores.
WinZip needs a serious update. Variance is high and performance only scales if you boost clock speeds. What a disappointment for such a popular tool.
Idle power actually decreases if you decide to switch off CPU cores, but the difference isn’t large, thanks to optimizations in Intel's architecture that already shut down execution resources when they're not in use. We measured savings of 2% between six cores and just one.
Peak power scales very linearly. If you were to switch off five out of the six cores, our Core i7-980X machine would require only 122W at peak load rather than 223W. This would be a single-core 32 nm processor with 12MB L3 cache and a 3.2 GHz clock speed running at only 54% of the peak power of six cores. Keep in mind that performance drops much more, though. This is just a hypothetical example.
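The arithmetic behind that comparison is easy to reproduce from the two peak readings quoted above. The per-core figure in this sketch assumes the increase is spread evenly across the five additional cores, which "very linearly" suggests but which the two end points alone don't prove.

```python
PEAK_ONE_CORE_W = 122   # system peak power with one core active (from the text)
PEAK_SIX_CORES_W = 223  # system peak power with all six cores active (from the text)

# Share of six-core peak power that the single-core configuration draws.
ratio = PEAK_ONE_CORE_W / PEAK_SIX_CORES_W

# Assumption: the difference is divided evenly across the five extra cores.
watts_per_extra_core = (PEAK_SIX_CORES_W - PEAK_ONE_CORE_W) / 5

print(f"one core draws {ratio:.1%} of six-core peak power")
print(f"average increase per additional core: {watts_per_extra_core:.1f} W")
```

The ratio lands just under 55%, in line with the figure quoted above, and each extra core adds roughly 20W at the wall under the even-spread assumption.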
The total runtime tells us how long the systems took to complete our full efficiency workload, including most of the applications mentioned previously. Clearly, the difference between one, two, and four cores is significant, while adding cores five and six doesn't yield the same performance jumps. This applies directly to real-life use and typical applications.
The average power consumption during this efficiency run increases with every core we switch on, but look at how the bars create a curve that flattens out. Average power may increase, but so does performance, and probably to a larger extent.
These results are similar to what we found on our AMD system. The total energy required to complete our efficiency workload is lowest with the maximum number of cores.
These results clearly show how performance increases significantly with additional cores while power consumption increases more moderately. If we relate performance to power used, we get confirmation that 5- and 6-core operation is most efficient.
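The relationship between runtime, average power, and efficiency can be written out as a short sketch. The helper names are hypothetical, and the sample values are placeholders chosen to illustrate the effect, not the measurements behind our charts.

```python
def energy_wh(avg_power_w: float, runtime_s: float) -> float:
    """Energy used over a run: average power times duration, in watt-hours."""
    return avg_power_w * runtime_s / 3600.0


def runs_per_kwh(avg_power_w: float, runtime_s: float) -> float:
    """Efficiency expressed as completed workload runs per kilowatt-hour."""
    return 1000.0 / energy_wh(avg_power_w, runtime_s)


# Placeholder (average watts, runtime in seconds) pairs, NOT measured values.
example = {1: (110.0, 7200.0), 6: (190.0, 2400.0)}
for cores, (watts, seconds) in example.items():
    print(f"{cores} cores: {energy_wh(watts, seconds):.1f} Wh per run, "
          f"{runs_per_kwh(watts, seconds):.2f} runs per kWh")
```

With numbers shaped like these, the configuration that draws more average power still uses less energy per run, because the runtime shrinks faster than the power draw grows.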
Both AMD's Phenom II X6 and Intel’s Core i7-980X prove that a larger CPU core count is by far the most reasonable technique for improving overall performance. This largely depends on software support for threading, as applications need to be able to take advantage of more than one or two processor cores. However, with this support in place, we can see from our results that you’ll not only be getting much faster performance, but also highly improved efficiency (performance per watt).
Since idle power between six, four, or only two active cores doesn't vary much, we can only recommend leaving all cores switched on all the time (as we suspected at the start of this piece). There are other, much more effective ways to reduce power consumption than disabling cores. Likewise, our results show that it makes sense to pick the highest possible core count within a processor generation when you’re looking for maximum performance in threaded environments.
Unfortunately, only professional applications are truly thread-optimized across the board. Lots of popular software, even from large software houses like Adobe, might not always be good at utilizing multi-core resources. Thus, clock speeds remain important, even though they make limited sense from an efficiency standpoint. We’ll soon be looking at the two six-core processors again to compare their Turbo features at stock speeds and at overclocked speeds, since it seems that these dynamic mechanisms are the best way to combine the best of both worlds: high clock speeds and a large core count.