You have just run
lxc launch ubuntu:18.04 mycontainer and a new container is being created. The command returns very quickly (around 1-2s) and the container starts running. However, the container may need a few more seconds to finish booting, while its
init performs all the required startup tasks.
The question is, how do you know programmatically when a container’s
init has really finished and the startup has been completed?
First, though, why do we need to know when a running container’s startup has really been completed? We need to know when we write automation scripts. Some commands in an automation script will fail if the container has not fully completed its startup. For example, the
ubuntu:18.04 container images create a non-root account (username:
ubuntu). This account is created near the end of the startup process, therefore if we try to execute commands relating to
ubuntu on a container that has not completed the startup, those commands will fail.
Towards a solution
The proper way to solve this issue is to use a feature of the
init subsystem of the container image that can tell us when it has completed the startup.
In the case of the Ubuntu 16.04 and newer container images, they use
systemd. And there is functionality to report back whether a system has completed the startup, by running
systemctl is-system-running. When a container is starting up, the state is
initializing. As soon as it has completed the startup, the state switches to
running:
$ systemctl is-system-running
running
The first issue is that
systemd should have a feature to wait for us, instead of us having to check in a loop (polling) for when the state changes. It actually does: it was added to
systemd in August 2018 as "systemctl: add support for --wait to is-system-running" (#9796). Translating this into Ubuntu versions, it means that in Ubuntu 19.04 or newer, we can use
--wait as in
systemctl is-system-running --wait. Very easy.
The second issue is with Ubuntu versions prior to Ubuntu 19.04, where we need to perform polling. Polling has some complications. Some
systemd units will fail if they are set to run but are not able to complete in a container. As a result, the end state per systemd will be
degraded instead of
running. So, when polling, we need to check for either of these two states.
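The check for the two end states can be pulled into a small shell helper. This is only a sketch: the function name startup_finished is made up for this example, and it only looks at the text that systemctl is-system-running prints.

```shell
#!/bin/sh
# startup_finished STATE
# Succeeds (exit status 0) if STATE is one of the two end states
# discussed above, and fails (exit status 1) for anything else,
# including an empty string.
# The function name is hypothetical, chosen for this sketch.
startup_finished() {
    case "$1" in
        running|degraded) return 0 ;;
        *) return 1 ;;
    esac
}

# Example use with the container from this post (not executed here):
#   state=$(lxc exec mycontainer -- systemctl is-system-running 2>/dev/null)
#   startup_finished "$state" && echo "startup completed"
```

Comparing the printed state against both values in one place keeps the polling condition short and avoids calling systemctl twice per iteration.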
The third issue is that as soon as LXD launches a container, it takes a little bit for systemd to start up and be able to respond to requests for its state. You get the error
Failed to connect to bus: No such file or directory if you ask too fast.
The fourth issue is that in newer versions of systemd that have the --wait parameter, the command will fail with
Failed to connect to bus: No such file or directory
if we run it too soon. Which means that a simple
systemctl is-system-running --wait
is not sufficient. We need a bit of polling here until systemd is ready to report the state.
The fifth issue is that both of the following commands return the same error code, 1:
systemctl is-system-running when it gives the error
Failed to connect to bus: No such file or directory. And
systemctl is-system-running when the result is
degraded. That means that we need to be careful when we interpret the return value, because the same return value covers more than one outcome.
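Since the exit code alone cannot separate these cases, one option is to look at the text that the command prints instead. The sketch below is a hypothetical helper (classify_probe and its three labels are invented here), and it assumes that stderr is discarded with 2>/dev/null, so that the bus error leaves the captured output empty.

```shell
#!/bin/sh
# classify_probe OUTPUT
# Prints a label based on the text captured from
#   systemctl is-system-running 2>/dev/null
# and ignores the exit status entirely.
# Assumption: when systemd cannot be reached yet, the error message goes
# to stderr, so the captured OUTPUT is an empty string.
classify_probe() {
    case "$1" in
        running|degraded) echo "finished" ;;
        "")               echo "bus-not-ready" ;;
        *)                echo "in-progress" ;;
    esac
}

# Example use with the container from this post (not executed here):
#   out=$(lxc exec mycontainer -- systemctl is-system-running 2>/dev/null)
#   classify_probe "$out"
```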
Here is the sequence of states for systemd when it starts in an Ubuntu LXD container. In parentheses is the number of times I got each message on my test system, until systemd completed the startup (reaching state: degraded).
Failed to connect to bus: No such file or directory (185 times)
initializing (189 times)
starting (168 times)
degraded
Now we are ready to put all these together and have a solution for Ubuntu 19.04 (or newer), and a solution for Ubuntu 16.04/18.04.
Solution for Ubuntu 19.04 or newer
The following example script installs a snap package as soon as the container has fully started. The first
lxc exec keeps polling for as long as the
systemctl is-system-running command returns exit code 1. The second
lxc exec command waits until the container has finished the startup.
$ cat myscript-1904newer.sh
lxc stop mycontainer
lxc delete mycontainer
lxc launch ubuntu:19.04 mycontainer
lxc exec mycontainer -- bash -c 'while $(systemctl is-system-running &>/dev/null); (($?==1)); do :; done'
lxc exec mycontainer -- systemctl is-system-running --wait
lxc exec mycontainer -- sudo snap install hello
Note: This script checks the return value of
systemctl is-system-running. When
systemd is not available yet, the return value is 1. When the command returns
degraded, the return value is also 1. Which means, bummer! We can make use of the
--wait parameter but we cannot get a proper foolproof solution without having to resort to some polling of ours. However, in the case of Ubuntu 19.04 or newer, the startup tends to take more time because
snapd has to start as well. Therefore, it is unlikely to hit the case that systemd has completed immediately and reports
degraded (return value 1).
Solution for Ubuntu 16.04 and Ubuntu 18.04 (but also Ubuntu 19.04 and newer)
Use the following example script. You can run it repeatedly in order to verify that it works well. It has been tested with Ubuntu 16.04, Ubuntu 18.04 and Ubuntu 19.04.
$ cat myscript-1804older.sh
lxc stop mycontainer
lxc delete mycontainer
lxc launch ubuntu:18.04 mycontainer
lxc exec mycontainer -- bash -c 'while [ "$(systemctl is-system-running 2>/dev/null)" != "running" ] && [ "$(systemctl is-system-running 2>/dev/null)" != "degraded" ]; do :; done'
lxc exec mycontainer -- sudo snap install hello
As an overall solution, I suggest using the last script, which does polling. It works on Ubuntu 16.04, Ubuntu 18.04 and Ubuntu 19.04. Here is the line again that waits if the container
mycontainer has not completed the startup yet.
lxc exec mycontainer -- bash -c 'while [ "$(systemctl is-system-running 2>/dev/null)" != "running" ] && [ "$(systemctl is-system-running 2>/dev/null)" != "degraded" ]; do :; done'
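The loop above spins without pausing and never gives up. As a possible variation, here is a sketch that sleeps between checks and stops after a timeout. The names STATE_CMD and wait_for_startup, and the default values of 60 seconds and 1 second, are assumptions made for this example, not something LXD or systemd provide.

```shell
#!/bin/sh
# Command used to query the state. It is kept in a variable so that it
# can be replaced, for example to point at a different container.
# STATE_CMD is a hypothetical name used only in this sketch.
STATE_CMD=${STATE_CMD:-"lxc exec mycontainer -- systemctl is-system-running"}

# wait_for_startup [TIMEOUT_SECONDS] [INTERVAL_SECONDS]
# Returns 0 once the state is "running" or "degraded".
# Returns 1 if that has not happened after roughly TIMEOUT_SECONDS.
# Both arguments must be whole numbers.
wait_for_startup() {
    timeout=${1:-60}
    interval=${2:-1}
    elapsed=0
    while :; do
        # The exit status is ignored on purpose; only the printed state
        # is used, because several outcomes share exit status 1.
        state=$($STATE_CMD 2>/dev/null) || true
        case "$state" in
            running|degraded) return 0 ;;
        esac
        if [ "$elapsed" -ge "$timeout" ]; then
            return 1
        fi
        sleep "$interval"
        elapsed=$((elapsed + interval))
    done
}

# Example use (not executed here):
#   wait_for_startup 120 2 && lxc exec mycontainer -- sudo snap install hello
```

Running the query from the host on each iteration costs one lxc exec per check, so a pause of a second or so between checks seems a reasonable trade-off.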