Question Ubuntu 22.04 NAS: ZFS or MD RAID with dm-integrity?

anvoice

I'm repurposing a PC as a NAS.

Specs:
Gigabyte Aorus x570 Master motherboard
Ryzen 2700X CPU
32GB 3200MHz non-ECC RAM
Seasonic Prime 750W PSU
2 x OEM Samsung 512GB Gen 3 NVME SSDs
6 x Seagate Exos X14 12TB HDDs

The GPU will be removed after I've configured everything.

I'm running Ubuntu 22.04 server on the NVMEs (set up in RAID1 via the motherboard: I don't know if it's necessary, but I did it anyway). This is a learning project, so I opted to go the Ubuntu route vs something like TrueNAS. This is meant to eventually become my main NAS, at which point I'll likely add ECC RAM to the system (unbuffered, as buffered/registered ECC RAM is apparently incompatible with x570 motherboards).

Aside from being a learning platform, the NAS will primarily serve for data storage (backups of different devices, photos and video, etc.). I also place a lot of value in data integrity at the expense of some performance and cost. I'll definitely be using encryption and as many data protection features as I can reasonably enable.

I know that many people prefer ZFS to MD RAID for similar scenarios, and that ZFS offers many built-in protections against data corruption. From what I've read, the most compelling argument for MD RAID, aside from niche scenarios, appears to be its slightly better performance (with similar resources). I'm still undecided, especially since a lot of the articles/posts I've read comparing ZFS to MD RAID are now pretty old. Normally I'd be leaning towards ZFS, but some more recent posts mention that dm-integrity offers some of the same benefits as ZFS, although the MD RAID/dm-integrity combo is less intuitive to set up and use.

Question: Can MD RAID/dm-integrity play a similar role to ZFS in terms of avoiding data corruption? I'm willing to put in some time to learn how the first combination works, but I need to strike some balance between performance, safety, and convenience since I don't have unlimited time.
 
Learn ZFS. It's one of the greatest storage systems ever made, if not the greatest. I first used it back on Solaris 10/06 and it radically changed how I approach disk storage. While both can be used, please understand that ZFS acts more like a storage processor (SP), which is common when working with SANs. After creating your root pool (rpool), you can then add other disks and start treating storage as putty instead of having to do silly stuff like partitions. A more comparable matchup would be ZFS vs btrfs, and you can start a religious war between those two.
 
Thanks for the input. I am indeed leaning towards ZFS, although I was hoping someone might weigh in with at least a few strengths of the other solution, assuming there are any.
 

They are two completely different things, so comparing them is very hard. Mdraid is just software RAID: it takes multiple disk partitions (though most people use whole disks) and makes them appear as a single physical disk with striping / parity. You then partition / format it into volumes and voilà. It acts like a RAID card. ZFS, on the other hand, is an object-based storage system. There are no slices; instead it does parity on blocks and stores that parity information as part of the file system. As an object-based file system it can do things like copy-on-write, deduplication, and compression, and there are no partitions or volumes to manage; instead it treats each file system as a container. Very different approaches. Mdraid would feel more "direct" to someone who is used to working with servers but not SANs.
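To make the mdraid half of that comparison concrete, here is a rough sketch of the "build an array, then format it" workflow. The device names, the array name, the mount point, and the choice of RAID 6 with ext4 are all assumptions for illustration, not something from this thread. The script only prints the commands unless APPLY=1 is set.

```shell
#!/bin/sh
# Dry run by default: each command is printed instead of executed.
# Set APPLY=1 (as root) only after replacing the hypothetical names below.
run() {
    if [ "${APPLY:-0}" = "1" ]; then "$@"; else echo "+ $*"; fi
}

DISKS="/dev/sdb /dev/sdc /dev/sdd /dev/sde /dev/sdf /dev/sdg"  # hypothetical disks
ARRAY=/dev/md0                                                 # hypothetical array name
MOUNTPOINT=/srv/nas                                            # hypothetical mount point

build_array() {
    # Combine six disks into one RAID 6 block device
    # (word splitting of $DISKS is intended).
    run mdadm --create "$ARRAY" --level=6 --raid-devices=6 $DISKS
    # The array is only a block device, so it still needs a file system.
    run mkfs.ext4 "$ARRAY"
    run mkdir -p "$MOUNTPOINT"
    run mount "$ARRAY" "$MOUNTPOINT"
}

build_array
```

Note that the file system on top has a fixed size tied to the array, which is the partition / volume management that ZFS does away with.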

Here is an example of something I've actually done on a T5220 (this was a long time ago). That system comes with eight 300GB SAS disks. You take the first two and make them a mirrored root pool (rpool), onto which you then install the Global Zone (Solaris OS). This will have /, /usr, /var, /lib and so forth. The remaining six disks would then be turned into zonepool with either raidz1 (single parity) or raidz2 (double parity). With raidz1 we have 1.5TB of local storage; with raidz2 it's 1.2TB. We then use ZFS to create a zone1 filesystem inside zonepool and assign it a mountpoint of /zones/zone1, then zone2, zone3 and so forth.
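The 1.5TB and 1.2TB figures are just (disks minus parity disks) times disk size. A quick sketch of that arithmetic, ignoring metadata and padding overhead (the function name raidz_usable_gb is made up for this example):

```shell
#!/bin/sh
# Rough usable capacity of a single raidz vdev, in GB.
# Ignores metadata, padding and reserved space, so real numbers come out lower.
# usage: raidz_usable_gb DISKS DISK_SIZE_GB PARITY
raidz_usable_gb() {
    disks=$1
    size_gb=$2
    parity=$3
    echo $(( (disks - parity) * size_gb ))
}

raidz_usable_gb 6 300 1   # raidz1 on six 300GB disks: prints 1500
raidz_usable_gb 6 300 2   # raidz2 on six 300GB disks: prints 1200
```

By the same arithmetic, six 12TB disks in raidz2 would give roughly 48TB before overhead (raidz_usable_gb 6 12000 2).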

There are no size limits, nor partitions; instead think of each file system as a bucket. If we wanted to cap the size of any ZFS file system, we would set its quota property.

zfs set quota=32GB zonepool/zone1
zfs set mountpoint=/zones/zone1 zonepool/zone1
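Putting the pieces together, the zonepool layout could be built roughly like this. The disk names are placeholders (on Linux you would normally use /dev/disk/by-id paths), and raidz2 with a 32G quota on every zone is just one possible choice. As written, the script only prints the commands; set APPLY=1 to actually run them.

```shell
#!/bin/sh
# Dry run by default: each command is printed instead of executed.
# Set APPLY=1 (as root) only after replacing the placeholder disk names.
run() {
    if [ "${APPLY:-0}" = "1" ]; then "$@"; else echo "+ $*"; fi
}

DATA_DISKS="disk2 disk3 disk4 disk5 disk6 disk7"   # hypothetical device names

create_layout() {
    # One six-disk raidz2 vdev forms the pool
    # (word splitting of $DATA_DISKS is intended).
    run zpool create zonepool raidz2 $DATA_DISKS
    # One file system per zone; mountpoint and quota are set at creation time.
    for z in zone1 zone2 zone3; do
        run zfs create -o mountpoint=/zones/$z -o quota=32G zonepool/$z
    done
}

create_layout
```

Setting the properties with -o at creation time has the same effect as running zfs set afterwards.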

I could then use zoneadm to create individual zones and assign each one a ZFS file system as its storage. We would have a dozen zones running on that single server, each with its own persistence; some had Oracle RDBMS installed, others BEA WebLogic or an Apache reverse proxy, OpenLDAP or other such software.

If I wanted to do a backup I could use zfs send to stream the filesystem to a serialized file and pipe it through compression. In practice you first create a snapshot, which is a read-only, point-in-time view of the file system, then send that snapshot to a serialized file somewhere else. If I wanted to restore that file on a different system I would do the reverse and stream it to the zfs receive command. You can also browse the snapshot (if it still exists) as a separate read-only folder under .zfs/snapshot, or clone it. When you are done, use zfs destroy to remove the snapshot. That only frees the blocks the snapshot alone was holding; every change made since then is already in the live filesystem and stays there.
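A rough sketch of that snapshot / send / receive cycle follows. The snapshot name, the backup path, and the receiving pool (otherpool) are made up for the example, and gzip is only one choice of compressor. The script prints the commands unless APPLY=1 is set.

```shell
#!/bin/sh
# Dry run by default: each command is printed instead of executed.
# Set APPLY=1 (as root) only after adjusting the hypothetical names below.
run() {
    if [ "${APPLY:-0}" = "1" ]; then "$@"; else echo "+ $*"; fi
}

SNAP="zonepool/zone1@before-upgrade"   # hypothetical snapshot name
BACKUP_FILE=/backup/zone1.zfs.gz       # hypothetical backup location
TARGET=otherpool/zone1                 # hypothetical dataset on the receiving system

backup_cycle() {
    # Take a read-only, point-in-time snapshot.
    run zfs snapshot "$SNAP"
    # Serialize the snapshot into a compressed file.
    run sh -c "zfs send $SNAP | gzip > $BACKUP_FILE"
    # On the receiving system: unpack the stream into a new dataset.
    run sh -c "gunzip -c $BACKUP_FILE | zfs receive $TARGET"
    # Snapshots are browsable read-only under the hidden .zfs directory.
    run ls /zones/zone1/.zfs/snapshot/before-upgrade
    # Destroying the snapshot frees only the blocks it alone referenced;
    # the live file system is left untouched.
    run zfs destroy "$SNAP"
}

backup_cycle
```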

Sorry for the long post, but ZFS really is a different way to approach managing storage. On a desktop system you probably won't notice much difference, other than never having to worry about fixed partitions / volumes. For enterprise stuff or just shared storage it's a game changer.
 
Nope, completely appreciated. Thank you for the thoughtful post!