Sata hard drive life expectancy

douge

Distinguished
Nov 23, 2010
8
0
18,510
Hello,
I operate a radio station with several automation systems that run 24/7. They use SATA hard drives that have been in service for about 4-5 years. How much longer can I expect them to last?
 
Hard drives are mechanical and thus will eventually fail. Drive manufacturers used to list MTBF, but they don't do that as often now. I've had drives arrive DOA, some die after a day, and some that have lasted 10 years. There is just no way to tell how long a drive will live. My oldest drives have been Western Digital.

The secret to helping a drive last longer is to keep it cool. In the end, only good backups will keep you running. If you're running RAID (not RAID 0), then a hot spare can help.
 
It sounds to me like a 24/7 system requires high uptime. What are the consequences if a disk fails on you?

Rather than worrying about the life expectancy of the drives, you may be better off considering RAID 1 so that a drive failure doesn't cause any downtime.
 

douge


The consequence would obviously be downtime, meaning we're off the air with no revenue coming in. RAID 1 might be a great idea if the company will go for it; I'll present it and see what happens. Overall, my takeaway from all the responses is to keep the drives error-free, which is next to impossible, and to keep them cool so they don't overheat.
The discussion came up after another radio station put itself on a rotation schedule, replacing hard drives every other year to protect the motherboards. But if the unit doesn't get hot, the motherboards shouldn't be a main concern, should they?
 
I don't see how putting drives on a rotation schedule will protect the motherboard. You need a case with good ventilation, preferably one that has removable dust screens and two or more case fans (hopefully one blows across the drive cage). If the case doesn't have dust screens, you may want to peek inside every month (more or less often depending on how dusty the environment is). Use a can of compressed air to get rid of any excess dust, since dust buildup traps heat. Removable dust screens are nice - sort of like a lint filter on a dryer.

Also, as others suggested, RAID (other than RAID 0) can be a big help. With RAID 1 or RAID 5, the system can still run with a single drive failure. RAID 6 can be hard to come by, but two drives can fail and the system will still run. Just remember, RAID only protects you from a failed drive, not accidental deletion, viruses, or malware, so you'll still need a good backup policy in place.
 
Solution
Google's study on hard drive failure rates is pretty much the definitive work as far as actual reliability studies go. Their findings are that annual failure rates run at about 2% for drives in their 1st year of life (i.e., 2% of drives 1 year old failed) and then jump to around 6-8% for drives 2-5 years old. See: http://static.googleusercontent.com/external_content/untrusted_dlcp/labs.google.com/en//papers/disk_failures.pdf
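
To put those percentages in perspective for a site with several machines, here is a rough back-of-the-envelope sketch in Python. It assumes drive failures are independent of each other and treats the rates quoted above as annual rates; the fleet size of 6 drives is a made-up example, not the original poster's actual setup.

```python
def prob_at_least_one_failure(annual_failure_rate: float, num_drives: int) -> float:
    """Chance that at least one of num_drives fails within a year.

    Assumes failures are independent, which is a simplification
    and not something the study itself claims.
    """
    # Probability that every drive survives, then take the complement.
    return 1.0 - (1.0 - annual_failure_rate) ** num_drives


# Hypothetical fleet of 6 drives; rates are the figures quoted in this post.
HYPOTHETICAL_FLEET_SIZE = 6
for label, rate in [
    ("1st year (~2%)", 0.02),
    ("2-5 years old, low end (~6%)", 0.06),
    ("2-5 years old, high end (~8%)", 0.08),
]:
    p = prob_at_least_one_failure(rate, HYPOTHETICAL_FLEET_SIZE)
    print(f"{label}: {p:.1%} chance of at least one failure per year")
```

The point is only that a small per-drive rate adds up quickly across several drives that are already 4-5 years old.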

But the reality is that even new drives can and do fail, so a "drive rotation" isn't really protecting you. RAID is the standard industry solution for business-critical systems. RAID-1 mirrors your data on two drives and almost completely eliminates drive failure as a cause for a system outage - the system will continue to run as long as two drives don't fail on the same day (or however long it takes you to replace a failed drive). If you use a third drive as a hot spare then even the failed drive replacement is automatic.
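
As a rough illustration of why the replacement window matters, the sketch below estimates the chance that the surviving drive in a mirror also fails before the replacement is in place. It assumes a constant failure rate spread evenly over the year and independent drives; real drives from the same batch in the same hot case can fail closer together than this suggests, so treat the result as optimistic. The 8% rate is the high end of the range quoted above, and the 1-day and 7-day windows are hypothetical.

```python
def prob_second_failure_in_window(annual_failure_rate: float, window_days: float) -> float:
    """Chance the remaining mirror drive fails within window_days.

    Assumes a constant, independent failure rate (a simplification).
    """
    # Convert the annual survival probability into a per-day survival probability.
    daily_survival = (1.0 - annual_failure_rate) ** (1.0 / 365.0)
    return 1.0 - daily_survival ** window_days


# Hypothetical replacement windows: same-day swap vs. waiting a week for a drive.
for days in (1, 7):
    p = prob_second_failure_in_window(0.08, days)
    print(f"{days}-day window: {p:.4%} chance the second drive also fails")
```

A hot spare shrinks the window to roughly the rebuild time, which is why it helps.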

Even if you go the "deluxe" route and buy a hardware RAID controller, the added hardware cost is just a few hundred bucks for both the controller and one or two extra drives. The bigger cost is probably the labour - the time required to learn, install and test the controller so that you thoroughly understand it. For example, when your system is in production and you experience a drive failure, you need to know how to tell which drive is the one that needs to be replaced.

You need to weigh those costs against the lost revenue that will happen if your system goes down for, say, a day while someone tries to get hold of the support person and then he tries to get a replacement drive, get the OS installed on it, and recover all the data from backups.
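
If it helps when presenting this to the company, the weighing can be sketched as simple arithmetic. Every number below is a placeholder and not real data: the revenue per day, the number of systems, and the per-system RAID cost ("a few hundred bucks") are hypothetical, and labour is left out entirely. It also assumes each system has a single unprotected drive and that a failure costs a full day off the air.

```python
def expected_annual_outage_cost(
    annual_failure_rate: float,
    num_systems: int,
    downtime_days_per_failure: float,
    revenue_per_day: float,
) -> float:
    """Expected yearly revenue lost to drive failures without RAID.

    Expected failures per year = rate * number of single-drive systems.
    """
    expected_failures = annual_failure_rate * num_systems
    return expected_failures * downtime_days_per_failure * revenue_per_day


# All values are hypothetical placeholders; substitute your own figures.
rate = 0.08                   # high end of the 2-5 year range quoted above
systems = 4                   # made-up number of automation systems
downtime_days = 1.0           # "say, a day" of outage per failure
revenue_per_day = 5000.0      # made-up daily revenue
raid_cost_per_system = 400.0  # made-up "few hundred bucks" per system

yearly_loss = expected_annual_outage_cost(rate, systems, downtime_days, revenue_per_day)
raid_hardware = raid_cost_per_system * systems
print(f"Expected yearly loss without RAID: ${yearly_loss:,.0f}")
print(f"One-time RAID hardware cost:       ${raid_hardware:,.0f}")
```

Plug in your station's own figures; the comparison is only as good as the inputs.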