Sysdig (sysdig.org) is an open-source container troubleshooting tool that works by capturing system calls and events directly from the Linux kernel.
When you install Sysdig, it adds a new kernel module that it uses to collect those system calls and events. Unlike lsof and htop, which read /proc, or strace, which attaches to processes with ptrace, Sysdig gets its data directly from the kernel. In terms of functionality, it is a single tool that can do what strace + tcpdump + htop + iftop + lsof + wireshark do together.
An added benefit of Sysdig is that it understands Linux Containers (since 2015). Therefore, it is quite useful when we want to figure out what is going on in our LXD containers.
Once we get used to Sysdig, we can venture to the companion tool called Falco, a tool for container security. Both are GPL v2 licensed, though you need to sign a CLA in order to contribute to the projects (hosted on GitHub).
We are installing Sysdig on the host (where LXD is running) and not in any of our containers. In this way, we have full visibility of the whole system.
You can get the installation instructions at https://www.sysdig.org/install/ which essentially amount to a single curl | sh command:
curl -s https://s3.amazonaws.com/download.draios.com/stable/install-sysdig | sudo bash
Ubuntu and Debian already have a version of Sysdig; however, it is a bit older than the one you get from the command above. Currently, the version in the universe repository is 0.8, while from the command above you get 0.19.
If we run sysdig without any filters, it will show us all system messages and events. It’s a never-ending waterfall, and you need to press Ctrl+C to stop it.
We can instead run the curses version, called csysdig.
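Besides the curses interface, another way to tame the stream is to stop after a fixed number of events and save them to a capture file for later replay. The following is a minimal sketch using sysdig's -n (event count), -w (write capture) and -r (read capture) options. The helper names, the trace.scap filename and the overridable SYSDIG variable are assumptions made for this example, not part of Sysdig.

```shell
#!/usr/bin/env bash
# Sketch: bound a sysdig run instead of watching the endless stream.
# SYSDIG can be overridden to preview the commands without root,
# e.g. SYSDIG="echo sysdig" (an assumption made for this example).
SYSDIG="${SYSDIG:-sudo sysdig}"

# Stop after <count> events (-n) and save them to a capture file (-w).
capture_events() {   # usage: capture_events <count> <file>
  $SYSDIG -n "$1" -w "$2"
}

# Replay a saved capture (-r), optionally narrowed by a filter.
replay_capture() {   # usage: replay_capture <file> [filter...]
  file="$1"
  shift
  $SYSDIG -r "$file" "$@"
}
```

For example, capture_events 1000 trace.scap records a thousand events, and replay_capture trace.scap container.name=guiapps replays only those from one container. Working from a file means the same events can be filtered several ways without capturing again.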
You can go through the examples at https://github.com/draios/sysdig/wiki/Sysdig-Examples. In this post, we look into the section that relates to containers.
View the list of containers running on the machine and their resource usage
sudo csysdig -vcontainers
There are six LXD containers, though we do not see their names at first. The names are shown in a column further to the right, so we need to use the arrow keys to scroll there. Let’s select the container we are interested in, and press Enter.
We selected the container guiapps, and here are the processes inside it.
View the list of processes with container context
sudo csysdig -pc
This command shows the processes of all containers together, each with its container context.
View the CPU usage of the processes running inside the guiapps container
sudo sysdig -pc -c topprocs_cpu container.name=guiapps
Here we switch from csysdig to sysdig. Note that these two tools do not take the same parameters.
We have a container called guiapps and we asked sysdig to show the CPU usage of its processes, sorted. The container is idle, therefore all are at 0%.
View the network bandwidth usage of the processes running inside the guiapps container
sudo sysdig -pc -c topprocs_net container.name=guiapps
Here it shows the current network traffic inside the container, sorted by traffic. If there is no traffic, the list is empty. It is therefore mainly useful as a quick indication of what is happening right now.
View the top files in terms of I/O bytes inside the guiapps container
sudo sysdig -pc -c topfiles_bytes container.name=guiapps
View the top network connections inside the guiapps container
sudo sysdig -pc -c topconns container.name=guiapps
The output is similar to tcpdump, showing the IP addresses of source and destination.
Show all the interactive commands executed inside the guiapps container
sudo sysdig -pc -c spy_users container.name=guiapps
The output looks like this:
29756 17:10:57 root@guiapps) groups
29756 17:10:57 root@guiapps) /bin/sh /usr/bin/lesspipe
29756 17:10:57 root@guiapps) basename /usr/bin/lesspipe
29756 17:10:57 root@guiapps) dirname /usr/bin/lesspipe
29756 17:10:57 root@guiapps) dircolors -b
29756 17:11:07 root@guiapps) ls --color=auto
29756 17:11:24 root@guiapps) ping 220.127.116.11
29756 17:11:38 root@guiapps) ifconfig
The first five commands (groups through dircolors -b) were recorded when running lxc exec. The rest are the commands I typed in the container (ls, ping, ifconfig and finally exit, which does not get shown). Commands handled by the shell itself (like pwd and exit) are not visible, since the shell does not exec a separate program for them.
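If you want to post-process this output, for instance to load it into a spreadsheet, the lines can be split into fields. Below is a minimal sketch that assumes the layout shown above (an id, a time, user@container followed by a closing parenthesis, then the command). The spy_to_tsv helper is a hypothetical name for this example and is not part of Sysdig.

```shell
#!/usr/bin/env bash
# Hypothetical helper: reformat spy_users lines read on stdin as
# "time<TAB>user<TAB>container<TAB>command".
# Assumes the layout: "<id> <HH:MM:SS> <user>@<container>) <command...>".
spy_to_tsv() {
  awk '{
    who = $3
    sub(/\)$/, "", who)            # drop the trailing ")"
    n = split(who, parts, "@")     # parts[1] = user, parts[2] = container
    if (n != 2) next               # skip lines that do not match the layout
    cmd = $4
    for (i = 5; i <= NF; i++) cmd = cmd " " $i
    printf "%s\t%s\t%s\t%s\n", $2, parts[1], parts[2], cmd
  }'
}

# Usage (requires sysdig on the host):
#   sudo sysdig -pc -c spy_users container.name=guiapps | spy_to_tsv
```

Lines that do not match the assumed layout are skipped rather than guessed at, so stray messages in the stream do not end up as bogus rows.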
We have already installed Sysdig using the curl | sh method, which added their repository. Therefore, to install Falco, we just need to run
sudo apt-get install falco
Falco needs its own kernel module:
$ sudo dkms status
falco, 0.8.1, 4.10.0-38-generic, x86_64: installed
sysdig, 0.19.1, 4.10.0-38-generic, x86_64: installed
Upon installation, it adds some default rules in /etc/falco. These rules describe application behaviour that Falco will inspect and report to us, so Falco is ready to go out of the box. If we need something specific, we add our own rules in /etc/falco/falco_rules.local.yaml.
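To give an idea of what a local rule looks like, here is a minimal sketch that writes one illustrative rule to a scratch file for review. The rule name, condition and output text are assumptions made for this example; only the overall shape (rule, desc, condition, output, priority) follows the Falco rule format, and the RULES_FILE variable is a hypothetical convenience.

```shell
#!/usr/bin/env bash
# Sketch: write an illustrative local rule to a scratch file for review.
# The rule name, condition and output text are assumptions for this example.
rules_file="${RULES_FILE:-$(mktemp)}"

cat > "$rules_file" <<'EOF'
- rule: Ping run in guiapps
  desc: Report ping being started inside the guiapps container
  condition: evt.type = execve and evt.dir = < and container.name = guiapps and proc.name = ping
  output: "ping started in container (user=%user.name container=%container.name command=%proc.cmdline)"
  priority: NOTICE
EOF

echo "Rule written to $rules_file"
# After reviewing it, append it to the local rules file, for example:
#   sudo tee -a /etc/falco/falco_rules.local.yaml < "$rules_file"
```

Writing to a scratch file first keeps an untested rule out of /etc/falco until you have looked it over.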
$ sudo falco container.name = guiapps
Tue Nov 7 17:41:41 2017: Falco initialized with configuration file /etc/falco/falco.yaml
Tue Nov 7 17:41:41 2017: Parsed rules from file /etc/falco/falco_rules.yaml
Tue Nov 7 17:41:41 2017: Parsed rules from file /etc/falco/falco_rules.local.yaml
17:41:52.933145895: Notice Unexpected setuid call by non-sudo, non-root program (user=nobody parent=<NA> command=lxd forkexec guiapps /var/lib/lxd/containers /var/log/lxd/guiapps/lxc.conf -- env USER=root HOME=/root TERM=xterm-256color PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/snap/bin LANG=C.UTF-8 -- cmd sudo --user ubuntu --login uid=root)
17:41:52.938956110: Notice A shell was spawned in a container with an attached terminal (user=user guiapps (id=guiapps) shell=bash parent=sudo cmdline=bash terminal=34842)
17:41:58.583366422: Notice A shell was spawned in a container with an attached terminal (user=root guiapps (id=guiapps) shell=bash parent=su cmdline=bash terminal=34842)
We specified that we want to focus only on the container with the name guiapps.
The three lines stamped Tue Nov 7 17:41:41 2017 are Falco's startup messages. The two lines at 17:41:52 are the result of lxc exec guiapps -- sudo --user ubuntu --login. The next line, at 17:41:58, reports a root bash shell spawned with su as its parent.
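When Falco runs for a while, a quick tally of alerts by priority can be easier to read than the raw stream. Here is a minimal sketch that assumes the alert layout shown above, where each alert starts with an HH:MM:SS.nanoseconds timestamp followed by the priority. The falco_priority_counts helper is a hypothetical name for this example and is not part of Falco.

```shell
#!/usr/bin/env bash
# Hypothetical helper: count Falco alerts per priority from text on stdin.
# Assumes alert lines look like "HH:MM:SS.nnnnnnnnn: <Priority> <message>";
# startup lines (which begin with a weekday) do not match and are ignored.
falco_priority_counts() {
  awk '$1 ~ /^[0-9][0-9]:[0-9][0-9]:[0-9][0-9]\.[0-9]+:$/ { count[$2]++ }
       END { for (p in count) printf "%s %d\n", p, count[p] }'
}

# Usage (requires Falco on the host; falco.log is a placeholder filename):
#   sudo falco container.name = guiapps > falco.log
#   falco_priority_counts < falco.log
```

The order of the output lines is not guaranteed when several priorities are present, so pipe the result through sort if you need a stable listing.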
Both Sysdig (troubleshooting) and Falco (monitoring) are useful tools that are aware of containers. Their default use is quite handy to troubleshoot and monitor containers. These tools have many more features, as well as the ability to add scripts to them (called chisels) to do even more advanced things.
For more resources, check their respective home pages.