The RAR and 7z archive formats make use of a large 'dictionary': a store of recently seen data patterns that, when used on a solid archive, can help achieve very high compression ratios. If this dictionary can be made to fit in fast memory (i.e. the CPU cache), then comparing its patterns against the data currently being compressed can yield tremendous speed improvements:
- the dictionary doesn't have to be fetched from RAM for every new block of data, which frees up memory bandwidth
- when the dictionary is at most half the size of the cache, the data being compressed can fit in cache alongside it, so the actual compression work doesn't need to 'page' anything in from main memory (a sketch of this matching idea follows the list).
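To make the access pattern concrete, here is a minimal, naive sketch of LZ77-style dictionary matching in Python. This is not RAR or 7z code (their coders are far more elaborate and the function names here are made up for illustration), but it shows why every input position probes the dictionary, and therefore why keeping that dictionary cache-resident matters:

```python
# Naive LZ77-style matcher: the "dictionary" is a sliding window of the last
# dict_size bytes, and repeats are replaced by (offset, length) references.
def find_longest_match(data: bytes, pos: int, dict_size: int) -> tuple[int, int]:
    """Return (offset, length) of the longest match for data[pos:] within
    the last dict_size bytes before pos, or (0, 0) if there is none."""
    start = max(0, pos - dict_size)
    best_off, best_len = 0, 0
    for cand in range(start, pos):
        length = 0
        while (pos + length < len(data)
               and data[cand + length] == data[pos + length]
               and length < 255):
            length += 1
        if length > best_len:
            best_off, best_len = pos - cand, length
    return best_off, best_len

def lz77_tokens(data: bytes, dict_size: int = 32 * 1024):
    """Yield literal bytes or (offset, length) back-references."""
    pos = 0
    while pos < len(data):
        off, length = find_longest_match(data, pos, dict_size)
        if length >= 3:          # only encode matches worth referencing
            yield (off, length)
            pos += length
        else:
            yield data[pos]
            pos += 1
```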
As an example, the DEFLATE algorithm used by PKZIP (and .zip files in general) has a fixed 32 KB dictionary, and .zip can't do solid compression either; the same algorithm is found in gzip, which, when used on top of the tar archiver, in essence achieves solid archiving and can yield non-negligible compression improvements.
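The per-file versus solid difference is easy to see with Python's standard-library zipfile and tarfile modules. A rough sketch, assuming a placeholder directory of many similar small files:

```python
# Compare per-file compression (.zip) against tar+gzip, which compresses the
# whole stream and so benefits from redundancy shared across files
# ("solid"-style archiving). SRC_DIR is a placeholder path.
import os
import tarfile
import zipfile

SRC_DIR = "sample_data"   # assumed: a directory of many similar small files

with zipfile.ZipFile("out.zip", "w", compression=zipfile.ZIP_DEFLATED) as zf:
    for root, _, files in os.walk(SRC_DIR):
        for name in files:
            zf.write(os.path.join(root, name))   # each file deflated separately

with tarfile.open("out.tar.gz", "w:gz") as tf:
    tf.add(SRC_DIR)                              # one gzip stream over the whole tar

print("zip   :", os.path.getsize("out.zip"), "bytes")
print("tar.gz:", os.path.getsize("out.tar.gz"), "bytes")
```

On a set of similar files, the tar.gz output is usually noticeably smaller, because the gzip stream can reuse patterns seen in earlier files.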
In 7-Zip, when creating the archive, try setting the dictionary to a size below half the largest shared cache of the least capable CPU you're comparing, and run the comparison again: performance will in fact be rather close. However, once the dictionary goes over the cache size, performance plummets.
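You can reproduce the trend without 7-Zip itself, using Python's built-in lzma module (LZMA is the algorithm behind .7z) and its per-filter dict_size option. This is only a sketch under assumptions: "sample.bin" is a placeholder for any large compressible file, and real 7-Zip runs multi-threaded native code, so absolute numbers will differ, but the slowdown once the dictionary outgrows your cache should still show up:

```python
# Time LZMA compression with increasing dictionary sizes and watch what
# happens when dict_size crosses the CPU's cache size.
import lzma
import time

data = open("sample.bin", "rb").read()   # placeholder: any large, compressible file

for dict_size in (1 << 20, 4 << 20, 16 << 20, 64 << 20):   # 1 MB .. 64 MB
    filters = [{"id": lzma.FILTER_LZMA2, "dict_size": dict_size}]
    start = time.perf_counter()
    out = lzma.compress(data, format=lzma.FORMAT_XZ, filters=filters)
    elapsed = time.perf_counter() - start
    print(f"dict {dict_size >> 20:3d} MB: {len(out):>10} bytes in {elapsed:.2f} s")
```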
As for AVG rewarding core count more than CPU speed: this could be explained by how I/O-intensive a virus scan is; and since Vista sucks at I/O, what's left to compare is how many file handles can be opened and used simultaneously. A test that could be done (a timing sketch follows the list):
- Install AVG on Vista, XP and Linux
- Run a scan over the same file set (be mindful though that the Linux file set should sit on an ext3 filesystem, NTFS access still being rather CPU-intensive on Linux)
- See if there are differences.
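To isolate just the file-handle/I/O side of that test (without invoking AVG at all), a small cross-platform script can time how fast each OS opens and reads through the same file set with a growing number of concurrent workers. The corpus path is a placeholder, and this only approximates a scanner's read pattern:

```python
# Time raw open+read throughput over the same file set with N concurrent
# workers; run identically on Vista, XP and Linux to compare the OS side.
import os
import time
from concurrent.futures import ThreadPoolExecutor

FILE_SET = "scan_corpus"   # assumed: an identical copy on each OS under test

def read_whole(path: str) -> int:
    with open(path, "rb") as f:
        return len(f.read())

paths = [os.path.join(root, name)
         for root, _, files in os.walk(FILE_SET)
         for name in files]

for workers in (1, 2, 4, 8):
    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=workers) as pool:
        total = sum(pool.map(read_whole, paths))
    elapsed = time.perf_counter() - start
    print(f"{workers} workers: {total / 2**20:.1f} MB in {elapsed:.2f} s")
```

If the scaling with worker count differs a lot between the three systems, that would support the idea that AVG's results say more about the OS's I/O path than about the CPU.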