

Christopher Mahoney OS Final Project

Adding 8 new 4 TB drives to prometheus and setting up 4 RAIDZ1s in ZFS.

Briefly, what the heck is ZFS and what am I doing?

My project isn't really about "what is ZFS"; it's more about documenting what it takes to perform upgrades on a system with ZFS. ZFS, the Zettabyte File System, is a combination of a file system and a volume manager (RAID manager). ZFS is also expandable and has a ton of nice features. The theory behind it is, I believe, very complicated, but it is easy to use.

I haven't written 10 pages of a report, but what I have done is many hours of system administration and learning about ZFS.

Starting out

To begin the project I needed to figure out exactly what I was working with.

Using lsblk and zpool status I checked the state of the drives on prometheus. What I found made me very nervous. There are eight 4-terabyte drives in the zpool with zero data redundancy. In RAID terms it's similar to a RAID 0, where data is striped across the drives for very fast reads and writes. But like RAID 0, if we lose a single drive the pool would likely be unrecoverable.
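A quick, hedged way to read zpool status for this: if the config section has no mirror or raidz rows, the pool is built from bare-disk vdevs, i.e. striped with no redundancy. A minimal sketch against made-up sample output (the device names are hypothetical, not prometheus's real ones):

```shell
# Hypothetical zpool status config section -- bare disks directly under
# the pool name, with no mirror/raidz grouping rows.
status='  pool: zdata
 state: ONLINE
config:
        NAME   STATE
        zdata  ONLINE
          sda  ONLINE
          sdb  ONLINE'

# No mirror/raidz lines means no redundancy (RAID 0-like striping).
if echo "$status" | grep -qE 'mirror|raidz'; then
  echo "pool has redundant vdevs"
else
  echo "no redundancy: one failed drive loses the pool"
fi
```

On prometheus the real output showed only bare disks, hence the nerves.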

Next, using zpool list, I checked the storage utilization:

NAME   SIZE  ALLOC   FREE  EXPANDSZ   FRAG    CAP  DEDUP  HEALTH  ALTROOT
zdata   29T  16.8T  12.2T         -     8%    57%  1.00x  ONLINE  -

So a bit over half utilization.

Finally I needed to check whether any snapshots were being kept to use as backups. It would be possible to use such snapshots to rebuild the state of the drives on a new zpool. zpool get listsnapshots

NAME   PROPERTY       VALUE  SOURCE
zdata  listsnapshots  off    default

And... there are no backups being kept. Bummer. This makes me even more nervous, because a mistake on my end could kill the machine without any drive failures. So the first thing I'm going to do is take a recursive snapshot by running sudo zfs snapshot -r zdata@before, in case I mess up the data on this file system.

Now, I ran into a ton of issues adding the new drives. It's mostly just boring server hardware management stuff: I (with the help of other COSInaughts) added a PCIe-to-SAS card and then adapted it to the SATA backplane. They do not make these SAS-to-SATA cables easy for people to use; they are so stiff. Adding the new card also changed the network interface labels, so I had to redo the networking using netplan.

New pool

Earlier I drew up a list of possible ZFS configurations. The choice came down to two things:

  1. Decent read performance
  2. Better than 50% storage efficiency

I felt that adding new drives just wasn't as satisfying if it turned out there was no increase in usable storage. We're also planning on using this server for storing more backups and that would imply it is going to get a bit more use.

With this in mind I went with 4 x 4-drive raidz1s (RAID-5 equivalents): the eight new drives form two raidz1 vdevs now, and the old drives get added as two more once the data is moved, so I never have to stage the data on a separate backup device.
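As a back-of-envelope check (my arithmetic, not ZFS output): each 4-drive raidz1 gives up one drive to parity, so a vdev of four 4 TB drives yields roughly 12 TB usable, or 75% efficiency, comfortably above the 50% target:

```shell
# Rough capacity math for raidz1 vdevs: one drive per vdev goes to parity.
# These are nominal TB, not what zpool list reports after overhead.
drives_per_vdev=4
drive_tb=4
vdevs=4                                  # final layout: 4 raidz1 vdevs
raw=$(( drives_per_vdev * drive_tb * vdevs ))
usable=$(( (drives_per_vdev - 1) * drive_tb * vdevs ))
echo "raw=${raw}TB usable=${usable}TB efficiency=$(( usable * 100 / raw ))%"
```

With all four vdevs in place that works out to 48 TB nominal usable out of 64 TB raw.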

To create this storage layout, I first created the new pool, named zoodle, with the command:

zpool create zoodle raidz1 new0 new1 new2 new3 raidz1 new4 new5 new6 new7

zfs set compression=lz4 zoodle

I also enabled lz4 compression because I've read that IO-bound workloads can actually gain bandwidth by streaming compressed data to the CPU and decompressing it on the fly. Essentially you trade a little CPU, which we have to spare on this machine, and a bit of latency for higher effective bandwidth.
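To illustrate the tradeoff (using gzip only because lz4 tooling may not be installed; ZFS does the equivalent transparently with lz4): highly compressible data shrinks dramatically, so far fewer bytes have to cross the disk for the same logical data:

```shell
# Demonstration of compressibility, not of ZFS itself: repetitive data
# (logs, text, sparse files) compresses well, so compressed IO moves
# far fewer bytes than the uncompressed logical size.
printf 'the same log line over and over\n%.0s' $(seq 1 10000) > /tmp/sample.txt
gzip -kf /tmp/sample.txt
orig=$(wc -c < /tmp/sample.txt)
comp=$(wc -c < /tmp/sample.txt.gz)
echo "original=${orig} bytes, compressed=${comp} bytes"
```

Real workloads compress less than this contrived sample, but even modest ratios help when the disks, not the CPU, are the bottleneck.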

Next I began the week-long process of copying the data from zdata to zoodle using:

zfs send -R zdata@final | pv | zfs recv -f zoodle

Now I ran into an issue with the existing LXC containers. ZFS and LXC together create a sub-filesystem per LXC container, so I had to move these across pools. This wasn't possible with the apt version of LXC installed at the time (2018), so I had to update to the latest version following a Stack Overflow guide. Then I simply moved the LXC containers from the old storage pool to the new one.

This actually took me the longest to figure out; now that I know it was just out of date, I'm not sure why it took me so many hours...


# $1 is the container name: move it onto the zoodle pool under a temporary
# name, then rename it back to its original name.
lxc move "$1" "${1}t" -s zoodle
lxc move "${1}t" "$1"

Now that all of the data had been transferred from the old pool to the new pool, it was time to destroy the old one.

zfs unmount zdata
# reboot
zpool destroy -f zdata

Finally we can add the old drives to the new pool! :D

sudo zpool add zoodle raidz1 old0 old1 old2 old3 raidz1 old4 old5 old6 old7

The final result of zpool status: finalstatus.png


An amazing feature of ZFS is incremental snapshots. Full system snapshots can be created cheaply just by recording differences between files, sort of like git. I thought it would be a good idea to set up automated snapshots going forward, in case someone makes a critical error and we need to restore the system to a previous state.

I didn't feel like writing cron jobs myself, so I found a widely recommended shell script, zfs-auto-snapshot, to wrap ZFS snapshot creation:

# 1.2.4.tar.gz is the release tarball of the zfs-auto-snapshot project
tar -xzf 1.2.4.tar.gz
cd zfs-auto-snapshot-upstream-1.2.4
sudo make install

This automatically adds new cron jobs under /etc/cron.*, and if something bad happens we can just run

zfs rollback zoodle@snapshot
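For reference, the cron entries it installs look roughly like the wrapper below (flags recalled from the zfs-auto-snapshot defaults, so treat the exact options as approximate; '//' means every pool that hasn't opted out):

```shell
# Approximate contents of /etc/cron.hourly/zfs-auto-snapshot --
# exact flags vary by version.
exec zfs-auto-snapshot --quiet --syslog --label=hourly --keep=24 //
```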


Working with ZFS was pretty easy, all in all. There were a few hiccups, but in general it was well documented.

Side Projects

There are 3 things I would like to mention that are OS related but not enough to be their own projects.

I helped Sarah set up a machine for her project. This was pretty quick; it was just adding a new drive to an existing computer.

I built the machine Saied & Yan used for their project. (I think they have since split up and are no longer teammates.) Creating the machine involved taking a server out of the COSI server room and adding 6 hard drives to it. This took about an hour and a half all together, mostly screwing things together and installing a new OS.

Finally, in a different server I replaced its two small 300 GB hard drives with 2x 1 TB SSDs. This was a bit of a nightmare. I had made a backup about 2 weeks prior, but I made a mistake which resulted in the loss of all data since then. Luckily it wasn't the end of the world, because people's scripts were version-controlled on GitHub. All that was lost was my time spent resolving the issue, some CPU hours of calculations, and my ego for making such a simple mistake.


Turns out that when I installed the new network card I broke the LXC containers, because the bridge they used no longer exists. Luckily nobody has asked me to fix it, so I can put that off for a while. I'm nervous about messing with networking right now, because if I break something I will be locked out of the system.