Reconnecting your LXD installation to the ZFS storage pool

You are using LXD and creating many containers. Those containers are stored in a dedicated ZFS pool that LXD manages exclusively. Then disaster strikes: LXD loses its database and forgets about your containers. Your data is still there in the ZFS pool, but LXD no longer knows about it because its configuration (database) has been lost.

In this post we see how to recover our containers when the LXD database is, for some reason, gone.

This post expands on the LXD disaster recovery documentation.

How to lose your LXD configuration database

How could you have lost your LXD database?

Perhaps you had a working installation of LXD and uninstalled it by accident. Normally, there should be some copy of the database lying around, which would make the recovery much easier. In my case, I had been running LXD from the edge channel (snap package), and after some time LXD got stuck: it would not start, and the lxc commands would hang without giving any output. Therefore, I switched to the stable channel (default), and the configuration database was gone. lxc list would work, but show an empty list.

Prerequisites

In this post we cover the case where your storage pool is intact but LXD has forgotten all about your containers, your profiles, your network interfaces, and, of course, your storage pool.

Running zfs list should still show your datasets, like this.

$ sudo zfs list
NAME                         USED  AVAIL  REFER  MOUNTPOINT
lxd                         78,4G   206G    24K  /var/snap/lxd/common/lxd/storage-pools/lxd
lxd/containers              73,1G   206G    24K  /var/snap/lxd/common/lxd/storage-pools/lxd/containers
lxd/containers/mycontainer   486M   206G   816M  /var/snap/lxd/common/lxd/storage-pools/lxd/containers/mycontainer
...

But the lxc commands return empty lists.

$ lxc storage list
+------+-------------+--------+--------+---------+
| NAME | DESCRIPTION | DRIVER | SOURCE | USED BY |
+------+-------------+--------+--------+---------+
+------+-------------+--------+--------+---------+
$ lxc profile list
+-----------+---------+
|   NAME    | USED BY |
+-----------+---------+
+-----------+---------+

What happened?

First, LXD lost the connection to the storage pool. It has no record of where the ZFS pool is, so we need to give that information back to LXD.

Second, although LXD lost all its configuration, each container keeps a backup of its own configuration in a file named backup.yaml, stored in the storage pool. Therefore, you can run sudo lxd import (note: it is lxd import, not lxc import) to add each container back. If a custom profile or network interface is missing, you will get an appropriate error message to act on.
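Since the import relies on backup.yaml, it can help to check that the file is present before importing. The following is a minimal sketch, not part of LXD: missing_backups is a hypothetical helper, and it assumes that backup.yaml sits at the top level of each container's directory and that the container datasets are already mounted (see the recovery steps).

```shell
# Print the names of container directories under $1 that have no backup.yaml.
# Assumption: backup.yaml lives at the top level of each container directory.
missing_backups() {
    base="$1"
    for dir in "$base"/*/; do
        # Skip the unexpanded pattern when the directory is empty.
        [ -d "$dir" ] || continue
        [ -f "${dir}backup.yaml" ] || basename "$dir"
    done
}

# Usage, with the path used in this post (you may need a root shell to read it):
#   missing_backups /var/snap/lxd/common/lxd/storage-pools/lxd/containers
```

Any name it prints is a container that lxd import will probably not be able to restore from its own backup file.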

How do we recover?

First, we make a list of the container names. You can most likely get the list from /var/snap/lxd/common/lxd/storage-pools/lxd/containers/.

$ ls /var/snap/lxd/common/lxd/storage-pools/lxd/containers/
mycontainer
...
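If the directory listing turns out to be empty, the names can also be derived from the ZFS datasets themselves. This is a sketch under assumptions: container_names is a hypothetical helper, and it assumes the layout shown earlier, where the pool is named lxd and every container is a direct child of lxd/containers.

```shell
# Read dataset names on stdin (one per line, as printed by
# `zfs list -H -o name`) and print the container names under <pool>/containers.
container_names() {
    pool="$1"
    # Keep only direct children of <pool>/containers and print the last part.
    awk -F/ -v pool="$pool" \
        'NF == 3 && $1 == pool && $2 == "containers" { print $3 }'
}

# Usage, assuming the pool is named "lxd" as in this post:
#   sudo zfs list -H -o name -r lxd/containers | container_names lxd
```

The -H flag drops the header line and -o name prints only the dataset names, which keeps the parsing simple.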

Second, mount each container. We run zfs mount and specify only the ZFS dataset name. ZFS already knows the mount point, because it is stored in the dataset's mountpoint property.

$ sudo zfs mount lxd/containers/mycontainer
$ zfs mount
lxd/containers/mycontainer   /var/snap/lxd/common/lxd/storage-pools/lxd/containers/mycontainer
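With many containers, the mount step can be repeated in a loop. The sketch below assumes the pool is named lxd and that container names arrive one per line on standard input; mount_containers is a hypothetical helper name, not an LXD or ZFS command.

```shell
# Mount <pool>/containers/<name> for every container name read from stdin.
mount_containers() {
    pool="$1"
    while IFS= read -r name; do
        [ -n "$name" ] || continue
        # </dev/null keeps the command from consuming the list of names.
        sudo zfs mount "$pool/containers/$name" </dev/null ||
            echo "could not mount $name (already mounted?)" >&2
    done
}

# Usage, with the path and pool name used in this post:
#   ls /var/snap/lxd/common/lxd/storage-pools/lxd/containers/ | mount_containers lxd
```

A failed mount is reported and the loop carries on, so one problematic dataset does not stop the rest.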

Finally, run lxd import to import the container. You may get an error; see the troubleshooting section below, and then try to import again.

$ sudo lxd import mycontainer
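To import several containers in one go, the same command can be wrapped in a loop that remembers which imports failed, so that you can fix the cause and retry only those. This is a sketch; import_containers is a hypothetical helper that reads container names from standard input.

```shell
# Run `lxd import` for every container name read from stdin.
# Returns non-zero if at least one import failed.
import_containers() {
    failed=""
    while IFS= read -r name; do
        [ -n "$name" ] || continue
        if sudo lxd import "$name" </dev/null; then
            echo "imported: $name"
        else
            failed="$failed $name"
        fi
    done
    if [ -n "$failed" ]; then
        echo "failed:$failed" >&2
        return 1
    fi
}

# Usage, with the path used in this post:
#   ls /var/snap/lxd/common/lxd/storage-pools/lxd/containers/ | import_containers
```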

Once the import succeeds, we can start the container.

$ lxc start mycontainer

Troubleshooting

Error: Create container: Requested profile 'gui' doesn't exist

You get this error if the profile with name gui does not exist. Create the profile and run lxd import again.
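Creating the profile can be sketched as follows. ensure_profile is a hypothetical helper; it only uses lxc profile show and lxc profile create. The new profile is empty, so any settings the original gui profile had must be added again by hand.

```shell
# Create an LXD profile unless one with that name already exists.
ensure_profile() {
    name="$1"
    if lxc profile show "$name" >/dev/null 2>&1; then
        echo "profile $name already exists"
    else
        lxc profile create "$name"
    fi
}

# Usage, for the profile named in the error message:
#   ensure_profile gui
#   sudo lxd import mycontainer
```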

Error: Create container: Invalid devices: Not an IP address: localhost

This relates to a change in LXD (appearing in LXD 3.13) and proxy devices. See more at this post.

Error: Storage volume for container "mycontainer" already exists in the database. Set "force" to overwrite

There is already a container in LXD with the same name. Most likely you got this error because you had already imported the container. If not, you need to figure out which of the two to keep.

Permanent link to this article: https://blog.simos.info/reconnecting-your-lxd-installation-to-the-zfs-storage-pool/
