the safest file storage setup (using zfs)

I recently set up a file storage machine with the goal of making it as safe as possible.
Here is what one needs to do and why:

1. First one needs a machine which supports ECC RAM.
Why? What matters is not so much that a faulty bit gets corrected, but that no faulty bit goes undetected: a machine with ECC support (RAM, CPU and motherboard all need to support it) corrects single-bit errors and reports, or halts on, errors it cannot correct.
To illustrate why this is important, imagine the second and third bits in your RAM are faulty, meaning they are permanently stuck at ‘1’. Every time a file is read into RAM and written to disk (when editing or copying), the file will be corrupted: the second and third bit will be set to ‘1’ no matter what they really are (simplified example). With ZFS this can be even more disastrous: checksums are calculated from the corrupted data in RAM, and when ZFS tries to repair a block it wrongly presumes to be damaged, using the redundant copy, the block may be written back corrupted. In the worst case all files could be lost! See another explanation in the FreeNAS Forum.
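
The stuck-bit example can be tried out with a few lines of shell arithmetic. This is only a sketch of the simplified example above: the function name is made up, and it assumes that "second and third bit" are counted from the least significant bit of each byte.

```shell
#!/bin/sh
# Simulate RAM in which the second and third bit of every byte are stuck at 1.
# Assumption: bits are counted from the least significant one, so the stuck
# bits correspond to the mask 0x06 (binary 00000110).
STUCK_MASK=$((0x06))

# Print the value a byte (0-255) has after passing through the faulty RAM.
through_faulty_ram() {
    echo $(( $1 | STUCK_MASK ))
}

# 'A' is 65 (binary 01000001). With bits two and three forced to 1 it
# becomes 71 (binary 01000111), which is 'G'.
through_faulty_ram 65    # prints 71
# A byte that already has both bits set passes through unchanged.
through_faulty_ram 6     # prints 6
```

Nothing in this round trip raises an error, which is exactly the problem: without ECC the changed byte looks like perfectly valid data.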

As a side note: an external storage device or NAS, like a Synology, is not sufficient on its own. First, only a few “pro”-grade systems have ECC RAM. Second, even if the NAS has ECC RAM, a computer with faulty non-ECC RAM will corrupt the file before sending it to the NAS, which will then write the corrupted file to disk because it cannot detect the corruption. So when editing or downloading a file, better do it on the storage machine directly and not on a non-ECC machine.

2. Use ZFS.
Why? ZFS maintains data integrity and avoids silent failures. ZFS is not only self-checking and self-healing, it also does not suffer from the write hole that, for example, a hardware RAID 5 has. In a traditional RAID system or filesystem that overwrites data in place, a file can be corrupted when the machine fails while the file is being written. ZFS is copy-on-write: changed data is first written to a new location, and only once that write has completed is the old data released. It has many other cool features too, but the avoidance of silent data corruption is the most important one to me. Read here about ZFS.
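
To get a feeling for what "self-checking" means, here is a small file-level analogy. It is not how ZFS is implemented: ZFS keeps a checksum for every block inside the filesystem and can repair a bad block from the redundant copy, while this sketch only detects a change, using the standard cksum utility and two made-up helper functions.

```shell
#!/bin/sh
# File-level analogy only: the cksum utility stands in for the per-block
# checksums that ZFS maintains internally.

# Print the CRC checksum of a file.
checksum_of() {
    cksum < "$1" | cut -d ' ' -f 1
}

# Succeed if the file still matches the checksum recorded earlier.
verify_file() {
    [ "$(checksum_of "$1")" = "$2" ]
}

demo=$(mktemp)
printf 'important data\n' > "$demo"
recorded=$(checksum_of "$demo")

# Simulate silent corruption: one character changes behind our back.
printf 'important dbta\n' > "$demo"

if verify_file "$demo" "$recorded"; then
    echo "file is intact"
else
    echo "corruption detected"
fi
rm -f "$demo"
```

A filesystem without checksums would hand back the changed file without complaint. With a stored checksum the mismatch is noticed on the next read.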

3. Set up ZFS mirrors instead of a ZFS RAID-Z1
Why? Mirrors are easier to upgrade and there is more data redundancy. If you have four disks, a RAID-Z1 gives you the capacity of three disks for data and one for parity, so if one disk fails your data is recoverable, and if two fail it is lost. Using them as two mirrors, you only have the capacity of two disks for data, but two for protection: the data survives any single failure and also a second one, as long as the two failed disks do not belong to the same mirror. But the main reason for using mirrors is upgrading: let’s assume you have four 3 TB drives and want to upgrade them to 4 TB drives. With a RAID-Z1 you need to replace all four disks before you get more storage capacity (as every disk only counts with the size of the smallest one), while with a two-mirror setup you only need to replace the two disks of one mirror for an increase. In ZFS terminology: each device in a vdev only counts with the size of the vdev’s smallest device. A four-disk RAID-Z1 is one vdev, while each mirror is a vdev of its own.
This article surely is more convincing.
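
The capacity arithmetic from this step can be checked with a small calculation. This is a rough sketch with made-up function names; it counts whole TB and ignores metadata and padding overhead, so real pools will show somewhat less.

```shell
#!/bin/sh
# Rough usable capacity in TB; ignores metadata and padding overhead.

# Print the smallest of the given disk sizes.
smallest() {
    min=$1
    for size in "$@"; do
        if [ "$size" -lt "$min" ]; then
            min=$size
        fi
    done
    echo "$min"
}

# RAID-Z1: one disk's worth of parity, every disk counted at the smallest size.
raidz1_capacity() {
    echo $(( ($# - 1) * $(smallest "$@") ))
}

# Two-way mirrors: pass the disks pairwise; each mirror contributes the size
# of its smaller disk.
mirrors_capacity() {
    total=0
    while [ $# -ge 2 ]; do
        total=$(( total + $(smallest "$1" "$2") ))
        shift 2
    done
    echo "$total"
}

raidz1_capacity 3 3 3 3     # four 3 TB disks: prints 9
mirrors_capacity 3 3 3 3    # two mirrors of 3 TB disks: prints 6
raidz1_capacity 4 4 3 3     # two disks upgraded to 4 TB: still prints 9
mirrors_capacity 4 4 3 3    # one mirror upgraded: prints 7
```

Replacing two disks gains nothing in the RAID-Z1 until all four are swapped, while the upgraded mirror grows right away. On a real pool the extra space usually only shows up once the autoexpand pool property is on.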

4. Create a zpool for each mirror!
Why? If I could convince you to use mirrors, don’t make the mistake of adding them together into one storage pool (called a zpool in ZFS). If you do so, the mirrors will be striped, meaning that all data is spread over the mirrors. If, instead, you have a zpool for each mirror, you have a safer setup:
Let’s again assume one has four drives. If they are added into one zpool and one drive fails, nothing is lost. If two drives fail, it depends on which drives fail. If it is one from each mirror, everything is recoverable, but if the failing drives both belong to one mirror, all data in the whole zpool is lost! Remember, the still healthy mirror only holds some of the data blocks, which were spread over all mirrors. So, in order to have a complete file, the blocks on all mirrors are needed.
If one has two zpools instead, only the data on the failing mirror would be lost; the other zpool would not be affected.
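
The four-drive case is small enough to enumerate. The sketch below uses a hypothetical numbering (disks 1 and 2 form the first mirror, disks 3 and 4 the second) and counts how many of the possible two-disk failures hit both disks of the same mirror.

```shell
#!/bin/sh
# Hypothetical layout: disks 1 and 2 form mirror A, disks 3 and 4 form mirror B.

# Map a disk number (1-4) to its mirror number (1 or 2).
mirror_of() {
    echo $(( ($1 + 1) / 2 ))
}

# Print "<pairs> <same_mirror>": the number of distinct two-disk failures and
# how many of them take out both disks of one mirror.
count_two_disk_failures() {
    pairs=0
    same_mirror=0
    i=1
    while [ "$i" -le 4 ]; do
        j=$((i + 1))
        while [ "$j" -le 4 ]; do
            pairs=$((pairs + 1))
            if [ "$(mirror_of "$i")" -eq "$(mirror_of "$j")" ]; then
                same_mirror=$((same_mirror + 1))
            fi
            j=$((j + 1))
        done
        i=$((i + 1))
    done
    echo "$pairs $same_mirror"
}

count_two_disk_failures    # prints: 6 2
```

So 2 of the 6 possible double failures destroy a mirror. With one striped zpool those two cases cost all data, with a zpool per mirror they cost only the data on the affected mirror.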

However, having multiple zpools means that you have multiple virtual drives, each with its own storage limit. So you might need to shift files around when the space on one of them runs out. But I pay this price for the gained safety.

5. Use ‘OpenZFS on OS X (O3X)’ for ZFS on a Mac.
Why? It currently supports zpool version 5000 and ZFS version 5. zpool version 5000 is version 28 plus support for feature flags; it does not support the closed-source Oracle Solaris ZFS pool versions 29 and up.

There are only two alternatives:
MacZFS currently only supports zpool version 8 and ZFS version 2.
I assume that it might be superseded by O3X.
The other alternative, ZEVO, does not support OS X 10.9 (Mavericks) yet. It is developed by a company which charges for a pro version, and even for the free version one needs to register.

6. Use these commands for setup:

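# Replace -poolname- with the name of the pool and the two /dev/disk
# entries with the two devices that should form the mirror.
# -f forces creation even if the disks appear to be in use, so double-check
# the device names first.
sudo zpool create -f -o ashift=12 \
    -O compression=lz4 \
    -O casesensitivity=insensitive \
    -O atime=off \
    -O normalization=formD \
    -poolname- mirror /dev/disk /dev/disk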

The main options are:
– compression=lz4, which not only saves space, but is faster as well. Loading a file, even from an SSD, is slow compared to decompressing it on the CPU. So the reduced file size helps to load it faster, while the time needed for decompression is smaller than the time saved, resulting in less time used overall. Follow this link for experimental results.
– atime=off switches off the access time file attribute. Otherwise, every time a file is read, its access time would be set to the current time, issuing an unnecessary write (wearing down the hard drive and endangering the file).
– ashift=12 sets the pool’s sector size to 2^12 = 4096 bytes to suit modern hard drives (Advanced Format disks). Read on for a better explanation.
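
The ashift value is simply the base-2 logarithm of the sector size the pool assumes. A minimal sketch of that relation, with a made-up function name:

```shell
#!/bin/sh
# ashift is an exponent: the assumed sector size is 2 to the power of ashift.
sector_size_for_ashift() {
    echo $(( 1 << $1 ))
}

sector_size_for_ashift 9     # prints 512  (classic 512-byte sectors)
sector_size_for_ashift 12    # prints 4096 (4 KiB Advanced Format sectors)
```

The value is chosen when the mirror is created, which is why it is worth setting it explicitly in the create command.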

7. Have an offsite disaster recovery backup.
Well, if something drastic happens to this super safe storage machine, and even the disks are not recoverable, one better has an offsite backup for disaster recovery. I decided not to back up all files like this, but only the important, irreplaceable ones.
To be safe against theft as well, have extra disk(s) onto which you back up your important data:
Either synchronise them using rsync, or use zpool attach + resilver + detach, and bring the disk to a safe location yourself. (Note that a disk removed with a plain detach normally cannot be imported on its own; zpool split is the command meant for turning one side of a mirror into an importable pool.)
Or use a cloud storage service, like Dropbox or Flickr (offers 1 TB for photos!). For the latter I still have to find out if the data is safe from silent data corruption … do they use ECC RAM and ZFS as well? Or a better alternative?
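
For the rsync variant, a minimal sketch could look like this. The function name and the paths in the usage comment are placeholders, not my actual setup; -a and -c are standard rsync options.

```shell
#!/bin/sh
# Copy the contents of a source directory onto a backup disk.
# -a preserves permissions, timestamps and symlinks.
# -c compares files by checksum instead of by size and modification time,
#    which is slower but does not trust the timestamps.
# The trailing slashes make rsync copy the contents of the source directory
# into the destination directory.
sync_to_backup() {
    src=$1
    dest=$2
    rsync -a -c "$src"/ "$dest"/
}

# Usage (placeholder paths):
#   sync_to_backup /Volumes/poolname/important /Volumes/backupdisk/important
```

Files deleted from the source stay on the backup with these options, which is the cautious default for a disaster recovery copy.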

If you read the whole post (wow!) here is a bonus for you: The best ZFS manual and a CheatSheet I found!
