Reconnecting your LXD installation to the ZFS storage pool

You are using LXD and you are creating many containers. Those containers are stored in a dedicated ZFS pool, which LXD manages exclusively. But disaster strikes: LXD loses its database and forgets about your containers. Your data is still there in the ZFS pool, but LXD no longer knows about it because its configuration database has been lost.

In this post we see how to recover our containers when the LXD database is, for some reason, gone.

This post expands on the LXD disaster recovery documentation.

How to lose your LXD configuration database

How could you have lost your LXD database?

You have a working installation of LXD and you uninstall LXD by accident. Normally, there should be a copy of the database lying around, which would make the recovery much easier. In my case, I had been running an instance of LXD from the edge channel (snap package), and after some time LXD got stuck and stopped working: it would not start, and the lxc commands would hang without giving any output. Therefore, I switched to the stable channel (the default), and the configuration database was gone. lxc list would work, but show an empty list.

Prerequisites

In this post we cover the case where your storage pool is intact but LXD has forgotten all about your containers, your profiles, your network interfaces, and, of course, your storage pool.

If the pool is intact, zfs list should show output like this.

$ sudo zfs list
NAME                          USED  AVAIL  REFER  MOUNTPOINT
lxd                          78,4G   206G    24K  /var/snap/lxd/common/lxd/storage-pools/lxd
lxd/containers               73,1G   206G    24K  /var/snap/lxd/common/lxd/storage-pools/lxd/containers
lxd/containers/mycontainer    486M   206G   816M  /var/snap/lxd/common/lxd/storage-pools/lxd/containers/mycontainer
...

But the lxc commands return empty.

$ lxc storage list
 +------+-------------+--------+--------+---------+
 | NAME | DESCRIPTION | DRIVER | SOURCE | USED BY |
 +------+-------------+--------+--------+---------+
 +------+-------------+--------+--------+---------+
$ lxc profile list
 +-----------+---------+
 |   NAME    | USED BY |
 +-----------+---------+
 +-----------+---------+

What happened?

First, LXD lost the connection to the storage pool. There is no record of where the ZFS pool is, so we need to give that information back to LXD.

Second, while LXD lost all of its configuration, each container keeps a backup of its own configuration in a file called backup.yaml, stored in the storage pool. Therefore, you can run sudo lxd import (note: it is lxd import, not lxc import) to add back each container. If a custom profile or network interface is missing, you will get an appropriate error message to act on.
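Before importing, you can check which profiles a container expects by reading its backup.yaml, so that you can create any missing profiles in advance. Here is a minimal sketch; the extract_profiles helper and the trimmed YAML sample are illustrations, not the full backup.yaml format:

```shell
# List the profile names that appear under the first "profiles:" key of a
# YAML file. The awk pattern assumes the profiles appear as a simple
# "- name" list, as they do in backup.yaml.
extract_profiles() {
    awk '/^ *profiles:/{f=1; next} f && /^ *- /{print $NF; next} f{exit}' "$1"
}

# Demonstration on a trimmed, hypothetical sample of a backup.yaml:
cat > /tmp/backup-sample.yaml <<'EOF'
name: mycontainer
profiles:
- default
- gui
devices: {}
EOF
extract_profiles /tmp/backup-sample.yaml
# prints "default" and "gui", one per line
```

On a real container you would point it at the backup.yaml inside the container's storage volume (after mounting it, as shown in the recovery steps).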

How do we recover?

First, we make a list of the container names. You can get the list from the containers directory of the storage pool:

$ ls /var/snap/lxd/common/lxd/storage-pools/lxd/containers/
mycontainer
...

Second, mount each container. We run zfs mount and specify only the dataset name; the mount point is already recorded in the dataset's mountpoint property, so ZFS knows where to mount it.

$ sudo zfs mount lxd/containers/mycontainer
$ zfs mount
lxd/containers/mycontainer   /var/snap/lxd/common/lxd/storage-pools/lxd/containers/mycontainer

Finally, run lxd import to import the container. You may get an error; see the troubleshooting section below, and then try to import again.

$ sudo lxd import mycontainer

Once the import succeeds, we can start the container.

$ lxc start mycontainer
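The steps above can be rolled into a single dry-run loop that prints the mount and import commands for every container directory found in the pool. The list_recovery_commands helper is an illustration of mine, not part of LXD; the pool name and path match the snap layout used in this post, so adjust them for your setup:

```shell
# Print the "zfs mount" and "lxd import" commands for every container
# directory found under the storage pool, without running them.
list_recovery_commands() {
    pool=$1
    dir=$2
    for c in "$dir"/*/; do
        [ -d "$c" ] || continue
        name=$(basename "$c")
        echo "sudo zfs mount $pool/containers/$name"
        echo "sudo lxd import $name"
    done
}

# Dry run against the snap LXD layout used in this post:
list_recovery_commands lxd /var/snap/lxd/common/lxd/storage-pools/lxd/containers
```

Review the printed commands first, then run them one by one (or pipe the output to sh once you are happy with it).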

Troubleshooting

Error: Create container: Requested profile ‘gui’ doesn’t exist

You get this error if the profile with name gui does not exist. Create the profile and run lxd import again.

Error: Create container: Invalid devices: Not an IP address: localhost

This relates to a change in LXD (appearing in LXD 3.13) and proxy devices. See more at this post.
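One way to deal with this before importing is to rewrite localhost to 127.0.0.1 in the proxy device addresses of the container's backup.yaml. This is a sketch assuming GNU sed and the usual listen:/connect: tcp:host:port proxy device syntax; the fix_proxy_localhost name is mine, and you should keep a copy of backup.yaml before editing it:

```shell
# Rewrite "tcp:localhost:" to "tcp:127.0.0.1:" in the listen: and connect:
# lines of a proxy device, in place (GNU sed).
fix_proxy_localhost() {
    sed -i 's/\(listen\|connect\): *tcp:localhost:/\1: tcp:127.0.0.1:/g' "$1"
}

# Hypothetical usage, with the path from the snap layout in this post:
# fix_proxy_localhost /var/snap/lxd/common/lxd/storage-pools/lxd/containers/mycontainer/backup.yaml
```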

Error: Storage volume for container “mycontainer” already exists in the database. Set “force” to overwrite

There is already a container with that name in the LXD database. Most likely you got this because you already imported the container. If that is not the case, figure out which of the two you want to keep; if you intend to overwrite the database entry, re-run the import as sudo lxd import mycontainer --force.

Permanent link to this article: https://blog.simos.info/reconnecting-your-lxd-installation-to-the-zfs-storage-pool/

7 comments


    • Stefan on August 1, 2019 at 14:36

    I hope I will never have to use this, but I’m really glad you took the time to write this article. Thanks!

    1. Thank you!

    • Roman on August 15, 2019 at 08:53

    Great post, it helped me to solve the “Not an IP address: localhost” issue on one of my older containers which was stopped for a while. For some reason, the good old ‘lxc config edit’ workaround didn’t work, as the changed device configuration seemed to be ignored when trying to save the changed config. I had to manually mount the container’s storage pool, edit the backup.yaml and import the container using --force.
    Thanks for the useful info.

    1. In LXD 3.16 there has been a change that did not allow you to edit the configuration if it was not valid in the first place. This strict change has been relaxed and it should work now.
      It is good that you got a workaround using this post.

    • 512YiB on March 13, 2021 at 10:42

    Hello, Simos.
    I have cloned the whole rpool with zfs send ... | ssh root@secondserver zfs recv ... from one server to another.
    Everything works fine, except LXD. systemctl status showed that the LXD snap can’t start. I tried to resolve this problem, but did not succeed. So I reinstalled LXD. Now LXD works, but it’s empty.
    So I tried your method, but got:

    lxd import mycontainer
    Error: The instance's directory "/var/snap/lxd/common/lxd/storage-pools/lxd/containers/mycontainer" appears to be empty. Please ensure that the instance's storage volume is mounted

    but it’s not empty:

    ls -A /var/snap/lxd/common/lxd/storage-pools/lxd/containers/mycontainer/
    backup.yaml metadata.yaml metadata.yaml~ rootfs templates

    Is there a solution?

    1. I haven’t tried this and I would need to replicate the whole setup to be able to help you.
      I am not able to do this at the moment.

      If the original server is still working, then you can enable one of the two LXD servers to be accessible remotely. Then, from the other LXD server you

      1. add the first as a remote (lxc remote add),
      2. and use lxc copy (or lxc move) to transfer the container.
    • ad on October 3, 2021 at 15:10

    Hi, really thanks for your work. I got “no pools available” when running zpool list. Do you know how to deal with it? Thanks!
