[SOLVED] How does ZFS restore a corrupt file/sector?

Status
Not open for further replies.
Feb 12, 2022
I have no hands-on experience with ZFS or RAID in general.
I'm starting a project (a DIY file-server) and wanted to weigh up whether to use ZFS for it.

How exactly is a corrupt file restored/fixed/recovered on ZFS's RAID-Z1?

E.g.
Imagine I have 4 drives (10TB each) and one of the drives experiences a read error making the data on one of its sectors corrupt.

Does ZFS need to undergo a complete rebuild of the array?
Or does it have a mechanism to recover that single sector's worth of data?

I'd like to know for two reasons:
  1. The length of time I can expect ZFS to recover from the single faulty sector
  2. The amount of reading/writing that will occur as a result
Help greatly appreciated!
 
For a first-timer I would advise staying away from ZFS until you have acquired a significant amount of knowledge. It's not for beginners. However, to answer the initial questions:

  1. Single read errors are (if possible) corrected on the fly. You'll never see them.
  2. As above.
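
To make "corrected on the fly" a bit more concrete, here is a toy Python sketch of the idea, not ZFS's actual code. It assumes one stripe with three data columns plus a single XOR parity column (RAID-Z1 style) and uses SHA-256 as the block checksum purely for illustration; the function names are made up. The point is that a bad checksum on one block triggers a read of the other columns in that stripe only, followed by a rewrite of the one bad block. No full rebuild of the array is involved.

```python
import hashlib
from functools import reduce


def xor_blocks(blocks):
    """XOR a list of equal-length byte strings together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, col) for col in zip(*blocks))


def write_stripe(data_blocks):
    """Return (columns, checksums): the data columns plus one XOR parity column.

    Checksums are kept separately from the data, so a damaged block
    cannot vouch for itself.
    """
    parity = xor_blocks(data_blocks)
    checksums = [hashlib.sha256(b).digest() for b in data_blocks]
    return list(data_blocks) + [parity], checksums


def read_block(columns, checksums, i):
    """Read data column i, repairing it from the other columns if it is corrupt.

    Returns (data, repaired_flag).
    """
    block = columns[i]
    if hashlib.sha256(block).digest() == checksums[i]:
        return block, False  # checksum matches, nothing to do

    # Checksum mismatch: XOR of all other columns (data + parity) rebuilds it.
    others = [c for j, c in enumerate(columns) if j != i]
    rebuilt = xor_blocks(others)
    if hashlib.sha256(rebuilt).digest() != checksums[i]:
        raise IOError("unrecoverable: more than one column is damaged")

    columns[i] = rebuilt  # write the good copy back over the bad one
    return rebuilt, True


columns, checksums = write_stripe([b"AAAA", b"BBBB", b"CCCC"])
columns[1] = b"BxBB"  # simulate a silently corrupted sector
data, repaired = read_block(columns, checksums, 1)
print(data, repaired)
```

The work done is proportional to the damaged block, not to the size of the drives, which is why a single bad sector is a non-event compared with replacing a whole drive.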
 
RAIDZ1 (if you only have 4 drives in total to work with) is similar to RAID5: data and parity are spread across the four drives, so even the complete failure of one drive does not lose any data, and everything stored is still retrievable. Completely rebuilding/resilvering a drive should only come up when a drive is replaced. (Single-drive redundancy is not really recommended with large drives these days, because a second drive failing during a lengthy rebuild would be catastrophic.)

The principle of ZFS correcting minor data errors (a drive 'forgets'/'drops' a bit, aka 'bit rot') in the background has been proven and tested by Wendell on Level1's YT channel...

View: https://www.youtube.com/watch?v=l55GfAwa8RI
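
For the "how long would a rebuild take" part of the original question, a rough back-of-the-envelope estimate can be sketched like this. The 150 MB/s sustained write speed is purely an assumption for illustration (real drives and pools vary a lot), and the helper name is made up. A ZFS resilver copies only allocated data, so the estimate scales with how full the drive is rather than with its raw size.

```python
def resilver_hours(used_tb, mb_per_s):
    """Best-case hours to rewrite `used_tb` terabytes at `mb_per_s` megabytes/second.

    Ignores fragmentation, parity computation and other pool activity,
    so treat the result as a lower bound.
    """
    bytes_to_copy = used_tb * 1e12
    seconds = bytes_to_copy / (mb_per_s * 1e6)
    return seconds / 3600


# A completely full 10TB drive at an assumed 150 MB/s:
print(round(resilver_hours(10, 150), 1))  # 18.5
```

So a full replacement is a matter of many hours at best, during which a single-parity pool has no redundancy left.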
 
Solution
This is where I was supposed to claim that a btrfs setup that just spans multiple volumes might be a good idea, because I remember reading somewhere that metadata on btrfs may be duplicated on each storage device (HDD/partition).

The stupid thing, however: now I cannot find any source to back this claim up. So either I remember it incorrectly or I'm just unable to locate it on the web.

Anyway, if such a btrfs setup is possible, then losing one HDD would reduce the time needed to restore missing files, because only the files that were on the defective HDD need to be copied back.
 
Feb 12, 2022
Ah, thanks for replies!

So...
  1. ZFS will correct automatically in the event of "bit rot."
  2. RAID-Z1 is a bad idea for large drives due to potential for a 2nd drive to fail during resilvering
  3. I should stay away from ZFS as a beginner... but I should also forget about RAID :LOL:

So a few questions...
  1. Given that I'm a beginner (with a technical but non-networking background) and I'm keen on building my own file-server, how should I best approach this task? I've no problem experimenting/learning on cheap drives initially (say 5x250GB). This project is intended to be a learning experience but also has a concrete use case at the end. The final project was going to have 8x 14TB SAS drives (4 of which I've already bought).
  2. Given that single-drive redundancy is bad for large drives, what level of redundancy is recommended? The most robust solution seems to be 1-to-1 backups that are loosely connected (i.e. physically separate), but I'll be spending a fortune if I need to get 16x 14TB drives.
 
I should stay away from ZFS as a beginner... but I should also forget about RAID :LOL:
"You're a beginner, so stay away from these things until you get more experience." Classic gatekeeping right there 😉

Given that I'm a beginner (with a technical but non-networking background) and I'm keen on building my own file-server, how should I best approach this task? I've no problem experimenting/learning on cheap drives initially (say 5x250GB). This project is intended to be a learning experience but also has a concrete use case at the end. The final project was going to have 8x 14TB SAS drives (4 of which I've already bought).
What I'd suggest doing:
  • Figure out what features you need.
  • Shop around for the file-server OSes people recommend, figure out which one fulfills those needs, and if you can't find something that does everything you want, take whatever you think you can live with
  • Test drive the OSes. If you don't have the computer yet, you can always go the VM route and set up small virtual disks to get a feel for how to work things out
  • Make notes on the tasks you perform if they seem too complicated to commit to memory
Given that single-drive redundancy is bad for large drives, what level of redundancy is recommended? The most robust solution seems to be 1-to-1 backups that are loosely connected (i.e. physically separate), but I'll be spending a fortune if I need to get 16x 14TB drives.
Mirroring. This article gives a few points as to why it's better:

Too many words, mister sysadmin. What’s all this boil down to?
  • don’t be greedy. 50% storage efficiency is plenty.
  • for a given number of disks, a pool of mirrors will significantly outperform a RAIDZ stripe.
  • a degraded pool of mirrors will severely outperform a degraded RAIDZ stripe.
  • a degraded pool of mirrors will rebuild tremendously faster than a degraded RAIDZ stripe.
  • a pool of mirrors is easier to manage, maintain, live with, and upgrade than a RAIDZ stripe.
  • BACK. UP. YOUR POOL. REGULARLY. TAKE THIS SERIOUSLY.
TL;DR to the TL;DR – unless you are really freaking sure you know what you’re doing… use mirrors. (And if you are really, really sure what you’re doing, you’ll probably change your mind after a few years and wish you’d done it this way to begin with.)

EDIT: Another article also seems to agree mirroring is the way to go, though they suggest 3-way mirroring (although they use a smaller number of drives, so pick and choose what you want)
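
To put numbers on the "50% storage efficiency" trade-off for the 8x 14TB build mentioned earlier in the thread, here is a small Python sketch. It only does raw arithmetic (drives minus redundancy, times drive size) and ignores filesystem overhead, padding and TB/TiB differences, so real usable space will be somewhat lower. The function and layout names are made up for the example.

```python
def usable_tb(layout, n_drives, drive_tb):
    """Raw usable capacity in TB for a few simple pool layouts."""
    if layout == "mirror":   # pool of 2-way mirrors: half the drives hold copies
        return (n_drives // 2) * drive_tb
    if layout == "raidz1":   # one drive's worth of parity
        return (n_drives - 1) * drive_tb
    if layout == "raidz2":   # two drives' worth of parity
        return (n_drives - 2) * drive_tb
    raise ValueError(f"unknown layout: {layout}")


for layout in ("mirror", "raidz1", "raidz2"):
    tb = usable_tb(layout, 8, 14)
    print(f"{layout}: {tb} TB usable, {tb / (8 * 14):.0%} efficiency")
```

For 8x 14TB this gives 56 TB for mirrors, 98 TB for RAIDZ1 and 84 TB for RAIDZ2, so mirrors cost 28 to 42 TB of raw space in exchange for the rebuild and performance benefits listed in the quote.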
 