What measures are in place to protect hard drives from Silent Data Corruption (SDC)?

Feb 12, 2022
I've read some articles about Silent Data Corruption across SATA channels.
Apparently there are some mechanisms in SATA HDDs to prevent (or at least reduce) the rate of SDC.
What is this mechanism, and can I rely on it?
Do I need to worry about SDC in a 100TB storage array, for instance?

My intention is to build such an array and write to it only once, although I expect circumstances will arise where I'll need to do additional writing (so to be safe, I'd like to plan for 500TB of writes, and only reads from then on).

If SDC could be an issue for me, how would you suggest I overcome it?
 
Feb 12, 2022
It looks like there are specially-configured hard drives with 520-byte sectors (512 bytes of data plus an 8-byte integrity field) that can detect, and in some cases correct, SDC, but these are quite expensive.
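As a rough illustration of how an appended integrity field catches in-transit corruption, here's a toy Python sketch. The real mechanism on such drives is T10 Protection Information (a CRC-16 guard tag plus application and reference tags), so the CRC-32 layout below is purely hypothetical:

```python
import os
import zlib

SECTOR_DATA = 512   # user data per sector
FIELD_LEN = 8       # integrity field appended per sector (520-byte layout)

def write_sector(data: bytes) -> bytes:
    """Pack 512 bytes of data plus an 8-byte integrity field.

    Hypothetical format: 4-byte CRC32 of the data, zero-padded to 8 bytes.
    Real 520-byte drives use the T10 DIF layout, which this only approximates.
    """
    assert len(data) == SECTOR_DATA
    field = zlib.crc32(data).to_bytes(4, "big") + b"\x00" * 4
    return data + field

def read_sector(sector: bytes) -> bytes:
    """Unpack a 520-byte sector, raising if the stored CRC no longer matches."""
    data, field = sector[:SECTOR_DATA], sector[SECTOR_DATA:]
    if zlib.crc32(data).to_bytes(4, "big") != field[:4]:
        raise IOError("silent data corruption detected in sector")
    return data

sector = write_sector(os.urandom(SECTOR_DATA))
assert read_sector(sector) == sector[:SECTOR_DATA]   # clean read passes

corrupted = bytearray(sector)
corrupted[100] ^= 0x01                               # flip one bit in transit
try:
    read_sector(bytes(corrupted))
except IOError as e:
    print(e)                                         # the bit flip is caught
```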

In terms of whether SDC over SATA is something I need to worry about in a 100TB storage array, here is a table showing the frequency of such errors:

[Attached image: table comparing SAS vs. SATA channel error rates]


Across the SATA channel, the typical undetected-error rate is about 1 in 10^17 bits, or roughly 1 error per 12.5 PB (about 11 PiB) of information transferred.
At a sustained transfer rate of 0.5 GB/s, it's possible to transfer about 15.7 PB of data per year, which works out to roughly 1.3 expected errors.
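To make that arithmetic explicit, here's a small Python sketch; the 1-in-10^17-bits figure and the 0.5 GB/s sustained rate are the assumptions from above:

```python
# Expected silent-error count from a bit-error rate and a transfer volume.
BIT_ERROR_RATE = 1e-17            # ~1 undetected error per 1e17 bits (vendor figure)

def expected_errors(bytes_transferred: float) -> float:
    return bytes_transferred * 8 * BIT_ERROR_RATE

SECONDS_PER_YEAR = 365 * 24 * 3600
rate_bps = 0.5e9                               # 0.5 GB/s sustained
per_year = expected_errors(rate_bps * SECONDS_PER_YEAR)
print(f"{per_year:.2f} expected errors/year")  # ~1.26

lifetime_writes = 500e12                       # the planned 500 TB of writes
print(f"{expected_errors(lifetime_writes):.3f} expected errors")  # ~0.040
```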
Given that I aim to write only about 500TB of new data over the lifetime of the drives (about 4% of the volume at which one error becomes expected, i.e. roughly 0.04 expected errors), it doesn't seem worthwhile protecting against it, especially since ZFS can't protect against SDC that occurs during the initial write to the drives (see here).
It's therefore far more likely that an SDC would occur while reading files back, which isn't a major issue since the on-disk data itself is unaffected; a re-read would return the correct data.
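To illustrate why a read-side SDC is recoverable in a way a write-side one is not, here's a toy sketch; the flaky_read helper and its deliberately deterministic error pattern are invented for the demo:

```python
import hashlib
from itertools import count

# On-disk state: the block and the checksum recorded when it was written.
block = b"payload" * 64
stored_checksum = hashlib.sha256(block).digest()

_reads = count()

def flaky_read() -> bytes:
    """Simulated read path: the first two reads flip a bit in transit."""
    data = bytearray(block)
    if next(_reads) < 2:
        data[0] ^= 0x01
    return bytes(data)

def read_with_verify(max_retries: int = 5) -> bytes:
    """Re-read until the checksum matches. Transit errors on reads are
    transient because the on-disk copy is still good; a mismatch that
    never clears would point to corruption at write time instead."""
    for _ in range(max_retries):
        data = flaky_read()
        if hashlib.sha256(data).digest() == stored_checksum:
            return data
    raise IOError("persistent mismatch: data was likely corrupted when written")

assert read_with_verify() == block   # succeeds on the third attempt
```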

Does this seem reasonable?

I'd like to be clear that my analysis ONLY covers SDC rates over the SATA channel.
It does not account for single-bit errors in RAM, the hard error rates of the drives themselves, potential RAID rebuilds, etc.


Therefore, based on my research over the last few days, a checksumming solution like ZFS is imperative when working with a 100TB array where data integrity is important to you.
It's also clear to me that SATA drives are not reliable enough above 10TB, never mind 100TB; SAS drives are necessary instead, but that's outside the scope of this thread.

Thanks for pointing me to ZFS, @fzabkar.
Also @Ralston18, yep, I'm considering backup/contingency plans.
These are early days for me; it's my first DIY file-server build, so all of this is new to me.
But I'm learning!
 
Deleted member 14196 (Guest)
The only safety is in keeping, and verifying, good backups. You want multiple backups across different types of media, with at least one copy in the cloud.

You can't rely on any mechanism built into any drive.
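For the "verify" part, one minimal approach is a checksum manifest. This Python sketch (the paths are illustrative) records a SHA-256 per file and re-checks the copies later:

```python
import hashlib
from pathlib import Path

def sha256_file(path: Path) -> str:
    """Hash a file in chunks so large archives don't exhaust RAM."""
    h = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            h.update(chunk)
    return h.hexdigest()

def build_manifest(root: Path, manifest: Path) -> None:
    """Record a checksum for every file under the backup root."""
    with manifest.open("w") as out:
        for p in sorted(root.rglob("*")):
            if p.is_file():
                out.write(f"{sha256_file(p)}  {p.relative_to(root)}\n")

def verify_manifest(root: Path, manifest: Path) -> list[str]:
    """Return the paths whose current checksum no longer matches."""
    bad = []
    for line in manifest.open():
        digest, rel = line.rstrip("\n").split("  ", 1)
        if sha256_file(root / rel) != digest:
            bad.append(rel)
    return bad

# Example usage (paths are hypothetical):
# build_manifest(Path("/mnt/backup1"), Path("backup1.sha256"))
# print(verify_manifest(Path("/mnt/backup1"), Path("backup1.sha256")) or "all good")
```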