File compression = faster performance?

jaculum

I've always been told that enabling compression on your hard drive would reduce its performance but would better utilize its space. However, I've also heard that in newer computers one of the biggest bottlenecks is that hard drives have fallen far behind in the category of speed when compared to the relative increases seen in other devices such as processors and RAM.

So if a hard drive is slow enough to make other devices wait for information on the drive, doesn't it make sense to have the drive transmit compressed information to reduce the amount it has to send, utilizing processor cycles that would have otherwise gone to waste?

I'm sure the compression would have to be relatively lightweight in order to avoid simply shifting the bottleneck to the CPU, but does this make sense, or is there something I'm missing?

If this is right though, would enabling compression for the entirety of an NTFS drive be a viable way to speed up startup time, program load times, etc., or is this an example of compression that is too processor-hungry? I'm not sure if NTFS compression is done on the fly or if the files have to be completely decompressed before use.
 
Compressed files are not fast, because the computer has to decompress them in order to access them, and that extra task of decompressing takes time as well. I've never compressed any of my files, especially music and videos, since it would lower their quality. So for me, compressing files for extra storage isn't very practical; I just buy another HD, or switch to a higher-capacity HD, if I ever run out of storage. Then again, that's from what I knew before, and I don't know whether computers now are faster with compressed files.
 
Compression will always slow down your computer because it introduces a certain latency in accessing files: the processor always has to decompress the data before it can do anything with it.
The best thing you can do to optimize performance is to get some automatic defragmentation program such as Diskeeper, which will substantially increase it.
 
Performance decreases due to having to compress/decompress on the fly.

See here for details. Major points:

• You can work with NTFS-compressed files without decompressing them, because they are decompressed and recompressed without user intervention.
• You may notice a decrease in performance when you work with NTFS-compressed files. When you open a compressed file, Windows automatically decompresses it for you, and when you close the file, Windows compresses it again. This process may decrease your computer performance.
• NTFS-compressed files and folders only remain compressed while they are stored on an NTFS volume.
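
To see that transparency in practice, here's a minimal sketch (Windows-only, Python 3.5+). The path is just a placeholder; the script checks the compressed attribute and then reads the file like any other file:

```python
# Minimal sketch: NTFS decompression is transparent to the reader.
# The path below is a placeholder - point it at any file on an NTFS volume.
import os
import stat

path = r"C:\temp\example.log"

st = os.stat(path)
is_compressed = bool(st.st_file_attributes & stat.FILE_ATTRIBUTE_COMPRESSED)
print("NTFS-compressed on disk:", is_compressed)
print("Logical (uncompressed) size:", st.st_size, "bytes")

# No special handling is needed to read it - NTFS decompresses on the fly.
with open(path, "rb") as f:
    data = f.read()
print("Bytes read:", len(data))
```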
 
The biggest delay in reading from the hard disk is typically getting to the start of the file. Waiting for the correct part of the disk to spin under the read head, and getting the read head to the proper track, takes the most time. Once you've found the start of the file you are on the home stretch, and the transfer isn't all that slow, since the rest of the file is usually organised nearby on the disk to prevent even longer access times. Therefore, compressing the drive wouldn't help, since it's not the reading of data off the disk that's slow, it's the getting to the data that's slow.
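
A rough back-of-the-envelope model of that point (every figure below is an assumption: roughly 10 ms average seek plus rotational delay, ~50 MB/s sequential transfer, and decompression time ignored entirely). For small files the seek dominates, so halving the transfer barely matters; only large sequential reads change much:

```python
# Toy model: seek/rotational latency vs. transfer time for a single file read.
# All figures are illustrative assumptions, not measurements.
SEEK_MS = 10.0        # assumed average seek + rotational latency
TRANSFER_MB_S = 50.0  # assumed sequential read rate

def read_time_ms(size_mb, ratio=1.0):
    """Time to read one file, optionally stored compressed at `ratio` (2.0 = 2:1)."""
    return SEEK_MS + (size_mb / ratio) / TRANSFER_MB_S * 1000.0

for size_mb in (0.1, 1.0, 100.0):
    raw = read_time_ms(size_mb)
    packed = read_time_ms(size_mb, ratio=2.0)
    print(f"{size_mb:6.1f} MB: raw {raw:7.1f} ms, 2:1 compressed {packed:7.1f} ms")
```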
 
As everyone has said above, reading and decompressing a file generally takes longer than simply reading it uncompressed. This is not always the case.

Stacker and DoubleSpace advertised increased performance back in about 1994 for systems with enough processor power, because disk speeds were so low. I think this was for external storage; it is easy to imagine a 486 decompressing files faster than a floppy could read them. Likewise, compressing data onto a slow external device might increase read performance. Looking at a CD-R, a 24x drive has a read speed of 3.6 MB/sec, so your processor would have to be able to decompress 2:1 compressed data at greater than 50% of that speed for it to be faster.

If you had very large, very compressible files, say 10:1, and low latency to and from the processor, you might be able to decompress faster than you can read raw data.
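
Here is a hedged restatement of the CD-R arithmetic above, assuming the drive read and the decompression overlap (pipelined), so the decompressor only has to keep up with the incoming compressed stream:

```python
# Break-even for reading 2:1-compressed data from a 24x CD-ROM (3.6 MB/s),
# assuming reading and decompression are overlapped rather than sequential.
read_mb_s = 3.6   # 24x CD-ROM sequential read speed
ratio = 2.0       # assumed compression ratio

compressed_stream = read_mb_s / ratio     # 1.8 MB/s of compressed data arrives
payload_rate = compressed_stream * ratio  # ...which expands back to 3.6 MB/s of output

print(f"Decompressor must consume >= {compressed_stream:.1f} MB/s of input "
      f"({compressed_stream / read_mb_s:.0%} of the drive's read speed) to break even;")
print(f"any faster and the effective throughput exceeds {payload_rate:.1f} MB/s.")
```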
 
Actually guys, there's not that big of a performance hit in compressing an entire volume. It's barely noticeable. Compression isn't used anymore because it really isn't needed; we've got 500 GB drives! Back in the mid-90s it was pretty popular, though.

The major downside is this: higher potential for disaster. If something goes wrong with the compressed volume you could potentially lose a lot (or all) of your data.

-mpjesse
 
I see no reason for file compression.
Most large files (like MP3, JPG, MPEG, etc.) are already compressed, so you will not see any improvement. There is also a big risk of losing everything if there is a problem, because compressed drives aren't easily read by other PCs (it's happened to me before).
 
If you are using NTFS and want to use compression, may I suggest:

- 2 KB clusters for the volume / partition (not to be confused with stripe sizes for RAID)
- Compressing all *.log, *.txt, *.doc, etc files first.

With enough processing power and slow enough hard disk drives, it is actually quite common to see an increase in read performance.

However, you'll notice a decrease in write performance if you flag the volume to compress everything, and everything new added to it.

NTFS Compression can also increase fragmentation significantly, but this is not a problem for most 'typical' computers under 'typical' work loads.

If you just compress the existing files (so read performance may rise), and then compress any additional files added to the volume at night, the impact on write performance will not be as significant (only a slight increase in fragmentation build-up).

You can use the Windows XP 'COMPACT /?' command to do the above trick without flagging the entire volume for compression. I'd recommend weekly defragmentation runs on the drive, though.
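
As a sketch of that nightly trick (the directory and file types below are illustrative, and this assumes the built-in Windows COMPACT tool), something like the following could be scheduled to run out of hours:

```python
# Compress only selected file types in a directory tree using the built-in
# NTFS COMPACT tool, without flagging the whole volume for compression.
# TARGET and PATTERNS are placeholders; schedule this with Task Scheduler.
import subprocess

TARGET = r"D:\data"                     # hypothetical data directory
PATTERNS = ["*.log", "*.txt", "*.doc"]  # compressible types from the list above

for pattern in PATTERNS:
    # /C = compress, /S:<dir> = recurse into subdirectories, /I = continue on errors
    subprocess.run(["compact", "/C", "/I", f"/S:{TARGET}", pattern], check=False)
```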

DriveSpace / DoubleSpace (both MS-DOS) / SuperStor-DS (PC DOS), and Stacker (a third-party DOS product, also on OS/2 and later versions of PC DOS) did some interesting things, and if the host volume had issues, or the hidden system files on the host volume got corrupted, tampered with, or deleted, they might not have been able to mount the compressed volume / drive.

MS-DOS 6.2 fixed most of the issues in DoubleSpace (over MS-DOS 6.0).
MS-DOS 6.21 lacked disk compression (the legal battle with Stacker).
MS-DOS 6.22 included DriveSpace, which fixed other issues, including resolving the legal battle with Stacker.

The main issue was that if the data being written to disk was modified by another app (usually a TSR) before it actually reached the disk, the data would become corrupted.

From Windows 95 (FAT16 only, still using DriveSpace) onwards, disk compression has been safe.

NTFS volume compression is a very different system from all of the above, and actually quite safe to use by comparison. The downside is that it lacks a SmartPack technique, so just use 2 KB clusters to make up for it. (Anything under 2 KB clusters in NTFS, unless your disk is very small, is totally idiotic, as the MFT bloats out heaps... even Microsoft recommends against it... and this caused many issues in Windows NT until Service Pack 3 or 4, if I recall correctly.)
 
The biggest delay in reading from the hard disk is typically getting to the start of the file. Waiting for the correct part of the disk to spin under the read head, and getting the read head to the proper track, takes the most time. Once you've found the start of the file you are on the home stretch, and the transfer isn't all that slow, since the rest of the file is usually organised nearby on the disk to prevent even longer access times. Therefore, compressing the drive wouldn't help, since it's not the reading of data off the disk that's slow, it's the getting to the data that's slow.

The hard drive is still insanely slow compared to memory. The only thing slower is a CD or DVD.
 
Yes, I know. What I was saying, though, is that it's the mechanical aspect of the hard drive that makes you wait so long. If you have to move the read head you can typically expect to wait an average of 10 ms before you can start reading data off the disk. It's that mechanical aspect that really slows down HDs. Even moving to the track beside the one you are currently reading from takes about 1 ms.

 
The thing is, today most files already have some form of lossy or lossless compression applied to them. Examples:

Pictures: Rarely found as .bmp these days; the common formats are a mix of lossy (.jpg) and lossless (.gif, .png).

Music: Almost always compressed (except for .wav). Most forms of .wma, along with .mp3, .mp4, and .ogg, are lossy.

Video: Full-frame uncompressed video is MASSIVE and is only used in professional applications during editing, before final compression. MPEG-1, -2, and -4 are all examples of lossy compression, along with 99% of the other codecs.

Even games come with most of their files packed these days.


As such, you won't see anywhere *near* the 2:1 compression you used to see (or at least have predicted) with MS-DOS and DoubleSpace etc., and for an overall space saving of 10% or so, the processor overhead isn't worth it IMHO.
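
A quick way to sanity-check that (purely illustrative paths, with zlib standing in for whatever compressor the file system would use) is to deflate a sample of each file and see how much it actually shrinks:

```python
# Estimate how compressible a file already is by deflating a sample with zlib.
# Already-compressed media (MP3, JPEG, MPEG) should come out near 1:1.
import zlib

def estimated_ratio(path, sample_bytes=1 << 20):
    """Approximate compression ratio of the first `sample_bytes` of a file."""
    with open(path, "rb") as f:
        data = f.read(sample_bytes)
    if not data:
        return 1.0
    return len(data) / len(zlib.compress(data, 6))

# Placeholder paths - substitute your own files.
for path in (r"C:\samples\report.doc", r"C:\samples\song.mp3"):
    print(path, f"~{estimated_ratio(path):.2f}:1")
```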
 
:arrow: NTFS won't compress files that don't benefit from it.

:arrow: I also doubt it will try to compress files that fit within 1 cluster (usually 4 KB), as there is no gain unless the file shrinks enough to fit within its MFT entry, which is highly unlikely.

:cry: Flagging the whole volume for compression (on write) can increase fragmentation heaps. You can, however, compress each file that isn't already compressed (which now includes 'modified' files, since those are written back uncompressed) outside of core business hours, and only perform non-compressed writes during core business hours.

8) If you have 4 CPU cores and 1 or 2 of them are doing nothing, though, and ample memory for performing write-back caching & write-combining (of compressed data), that can help.

:idea: Most businesses deal with files that compress at around 2.5:1, even today. (Which is more likely: 10,000 JPEGs of non-work-related pr0n, resulting in immediate dismissal and no compression gain, or 10,000 files that are a mix of documents, spreadsheets, databases, etc.?)

:wink: Reads from compressed files will be faster, even when the overheads are factored in, since decompressing the data takes far less time than reading the extra data off the disk would. Once the HDD seek is complete (assuming one contiguous fragment), only half the data needs to be read; the rest of the operation happens in CPU/RAM. Over 10,000 complete file reads / accesses this can start to add up (see the rough model at the end of this post).

:twisted: Microsoft (DoubleSpace) didn't try to steal technology from Stacker (which ran on DOS and OS/2, by the way - OS/2 was big at the time), and claim they invented it, for nothing. They knew exactly where the IT market was going, who was going to be driving it (business), and what files most businesses were going to be using. (The MS Office series is still pretty popular today; they knew the files would compress well, and they still haven't implemented even LZH compression in MSO 2K3/XP.)

8) Being able to flag log files that grow to 100 MB+ but compress at better than 8:1 is very useful... and it typically improves write performance on those files. (CPUs have scaled far better than HDDs have, performance-wise, since the days of MS-DOS 6.2x; even with a 486SX-33 you would typically notice a speed boost - it's just that back then you really noticed any performance improvement 😛).
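
Here is the rough model referred to in the :wink: point above. Every figure (seek time, transfer rate, 2.5:1 ratio, decompression throughput, file count, average file size) is an assumption, so treat it as a sketch rather than a benchmark:

```python
# Toy aggregate model: total time to read 10,000 files, uncompressed vs.
# stored compressed at 2.5:1. All figures are illustrative assumptions.
SEEK_S = 0.010        # assumed average seek + rotational latency
TRANSFER_MB_S = 50.0  # assumed sequential read rate
DECOMP_MB_S = 200.0   # assumed CPU decompression throughput (output side)
RATIO = 2.5           # assumed compression ratio for typical office files

def read_s(size_mb, compressed):
    if not compressed:
        return SEEK_S + size_mb / TRANSFER_MB_S
    # Read less data off the disk, then spend a little CPU time expanding it.
    return SEEK_S + (size_mb / RATIO) / TRANSFER_MB_S + size_mb / DECOMP_MB_S

files, avg_mb = 10_000, 0.5   # assumed file count and average file size

plain = files * read_s(avg_mb, compressed=False)
packed = files * read_s(avg_mb, compressed=True)
print(f"Uncompressed: {plain:.0f} s  Compressed: {packed:.0f} s  "
      f"Saved: {plain - packed:.0f} s over {files} reads")
```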