Update #1: I posted at the Scaleway Linux kernel discussion thread to request support for the Ubuntu Linux kernel, and filed Add new bootscript with stock Ubuntu Linux kernel #349.
Scaleway has been offering ARM (armv7) baremetal cloud servers since 2015, and now they also have ARM64 (armv8, from Cavium) cloud servers (through KVM, not baremetal).
But can you run LXD on them? Let’s see.
Launching a new server
We go through the management panel and select to create a new server. At the moment, only the Paris datacenter has ARM64 servers available, and we select the ARM64-2GB offering.
They use Cavium ThunderX hardware, and those boards have up to 48 cores. You can allocate either 2, 4, or 8 cores, with 2GB, 4GB, or 8GB RAM respectively. KVM is the virtualization platform.
There is an option of either Ubuntu 16.04 or Debian Jessie. We try Ubuntu.
It takes under a minute to provision and boot the server.
Connecting to the server
It runs Linux 4.9.23. Also, the disk appears as /dev/vda; there is no partitioning, and the filesystem takes up the whole device.
Here are /proc/cpuinfo and uname -a. These are the two cores (out of 48) as provided through KVM. The BogoMIPS are really bogus on these platforms, so do not take them at face value.
Currently, Scaleway does not run their own mirror of the distribution packages but uses ports.ubuntu.com, which is 16ms away (ping time).
Depending on where you are, the ping times for google.com and www.google.com tend to differ. google.com redirects to www.google.com, so it somewhat makes sense that google.com responds faster. At other locations (in a different country), it could be the other way round.
This is /var/log/auth.log, and already some attackers are trying to brute-force SSH. They have been trying with the username ubnt. Note to self: do not use ubnt as the username for the non-root account.
The default configuration for the SSH server on Scaleway allows password authentication. You need to change this in /etc/ssh/sshd_config to look like

# Change to no to disable tunnelled clear text passwords
PasswordAuthentication no

Originally, the line was commented out and defaulted to yes.
Finally, run
sudo systemctl reload sshd
This will not break your existing SSH session (even a restart will not break your existing SSH session; how cool is that?). Now you can create your non-root account. To allow that user to sudo as root, run usermod -a -G sudo myusername.
There is a recovery console, accessible through the Web management screen. To use it, the panel says: You must first login and set a password via SSH to use this serial console. In reality, the root account already has a password set, and this password is stored in /root/.pw. It is not known how strong this password is. Therefore, when you boot a cloud server on Scaleway:
- Disable PasswordAuthentication for SSH as shown above and reload the sshd configuration. You are supposed to have already added your SSH public key in the Scaleway Web management screen BEFORE starting the cloud server.
- Change the root password so that it is not the one found in /root/.pw. Store the new password somewhere safe, because it is needed if you want to connect through the recovery console.
- Create a non-root user that can sudo and authenticates with PubkeyAuthentication, preferably with a username other than ubnt.
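The PasswordAuthentication change from the list above can also be scripted. Here is a minimal sketch, demonstrated on a local sample file (the filename sshd_config.sample is just for illustration; on the server you would run the sed as root against /etc/ssh/sshd_config and then reload sshd):

```shell
# Sketch: flip PasswordAuthentication to "no" (run against a local copy here;
# on the server, target /etc/ssh/sshd_config as root).
cfg="sshd_config.sample"
printf '#PasswordAuthentication yes\n' > "$cfg"

# Handle both the commented-out default and an explicit setting.
sed -i 's/^#\?PasswordAuthentication.*/PasswordAuthentication no/' "$cfg"

grep '^PasswordAuthentication' "$cfg"    # prints: PasswordAuthentication no
# Then, on the server: sudo systemctl reload sshd
```

The `^#\?` pattern matches both the commented-out default line and an explicit setting, so running the script a second time is harmless.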
Setting up ZFS support
The Ubuntu Linux kernels at Scaleway do not have ZFS support, so you need to compile it as a kernel module according to the instructions at https://github.com/scaleway/kernel-tools.
Actually, those instructions are now obsolete for newer versions of the Linux kernel: you need to compile both spl and zfs manually, and install them.
Naturally, when you compile spl and zfs, you can create .deb packages that can be installed in a nice and clean way. However, the spl and zfs builds first create .rpm packages and then call alien to convert them to .deb packages. Here we hit an alien bug (no pun intended) which gives the error zfs-0.6.5.9-1.aarch64.rpm is for architecture aarch64 ; the package cannot be built on this system, which is weird since we are working on aarch64.
The running Linux kernel on Scaleway for these ARM64 servers has its important support files at http://mirror.scaleway.com/kernel/aarch64/4.9.23-std-1/
Therefore, run as root the following:
# Determine versions
arch="$(uname -m)"
release="$(uname -r)"
upstream="${release%%-*}"
local="${release#*-}"

# Get kernel sources
mkdir -p /usr/src
wget -O "/usr/src/linux-${upstream}.tar.xz" "https://cdn.kernel.org/pub/linux/kernel/v4.x/linux-${upstream}.tar.xz"
tar xf "/usr/src/linux-${upstream}.tar.xz" -C /usr/src/
ln -fns "/usr/src/linux-${upstream}" /usr/src/linux
ln -fns "/usr/src/linux-${upstream}" "/lib/modules/${release}/build"

# Get the kernel's .config and Module.symvers files
wget -O "/usr/src/linux/.config" "http://mirror.scaleway.com/kernel/${arch}/${release}/config"
wget -O /usr/src/linux/Module.symvers "http://mirror.scaleway.com/kernel/${arch}/${release}/Module.symvers"

# Set the LOCALVERSION to the locally running local version (or edit the file manually)
printf 'CONFIG_LOCALVERSION="%s"\n' "${local:+-$local}" >> /usr/src/linux/.config

# Let's get ready to compile. The following are essential for the kernel module compilation.
apt install -y build-essential
apt install -y libssl-dev
make -C /usr/src/linux prepare modules_prepare

# Now, let's grab the latest spl and zfs (see http://zfsonlinux.org/).
cd /usr/src/
wget https://github.com/zfsonlinux/zfs/releases/download/zfs-0.6.5.9/spl-0.6.5.9.tar.gz
wget https://github.com/zfsonlinux/zfs/releases/download/zfs-0.6.5.9/zfs-0.6.5.9.tar.gz

# Install some dev packages that are needed for spl and zfs,
apt install -y uuid-dev
apt install -y dh-autoreconf
# Let's do spl first
tar xvfa spl-0.6.5.9.tar.gz
cd spl-0.6.5.9/
./autogen.sh
./configure      # Takes about 2 minutes
make             # Takes about 1:10 minutes
make install
cd ..

# Let's do zfs next
tar xvfa zfs-0.6.5.9.tar.gz
cd zfs-0.6.5.9/
./autogen.sh
./configure      # Takes about 6:10 minutes
make             # Takes about 13:20 minutes
make install

# Let's get ZFS loaded
depmod -a
ldconfig
modprobe zfs
zfs list
zpool list
And that’s it! The last two commands show that there are no datasets or pools available (yet), meaning that it all works.
Setting up LXD
We are going to use a file (with ZFS on a loop device) as the storage backend. Let’s check what space is left for this (out of the 50GB disk):
root@scw-ubuntu-arm64:~# df -h /
Filesystem      Size  Used Avail Use% Mounted on
/dev/vda         46G  2.0G   42G   5% /
Initially, only 800MB was used; now it is 2GB. Let’s allocate 30GB for LXD.
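Before lxd init asks for the loop device size, the available space can be compared with what we plan to allocate. A small sketch, with the 42GB figure taken from the df output above (check_space is a hypothetical helper for this article, not an LXD command):

```shell
# Sketch: compare free space against the planned size of the ZFS loop file.
# Usage: check_space AVAIL_GB NEED_GB
check_space() {
    if [ "$1" -ge "$2" ]; then
        echo "ok: ${1}GB free, ${2}GB needed"
    else
        echo "not enough space: ${1}GB free, ${2}GB needed"
    fi
}

# On the server you would extract the figure with:
#   avail="$(df -BG --output=avail / | tail -1 | tr -dc '0-9')"
avail=42    # value from the df output above
check_space "$avail" 30    # prints: ok: 42GB free, 30GB needed
```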
LXD is not preinstalled on the Scaleway image (other VPS providers have LXD already installed). Therefore,
apt install lxd
Then, we can run lxd init. There is a weird situation the first time you run lxd init: it takes quite some time before the first questions appear (choose storage backend, etc.). In fact, it takes 1:42 minutes before you are prompted with the first question. When you subsequently run lxd init, the first question appears at once. lxd init does quite some work on the first run, and I did not look into what it is.
root@scw-ubuntu-arm64:~# lxd init
Name of the storage backend to use (dir or zfs) [default=zfs]:
Create a new ZFS pool (yes/no) [default=yes]?
Name of the new ZFS pool [default=lxd]:
Would you like to use an existing block device (yes/no) [default=no]?
Size in GB of the new loop device (1GB minimum) [default=15]: 30
Would you like LXD to be available over the network (yes/no) [default=no]?
Do you want to configure the LXD bridge (yes/no) [default=yes]?
Warning: Stopping lxd.service, but it can still be activated by:
  lxd.socket
LXD has been successfully configured.
root@scw-ubuntu-arm64:~#
Now, let’s run lxc list. The first run also generates the client certificate. There is quite a bit of cryptography going on, and it takes a lot of time.
ubuntu@scw-ubuntu-arm64:~$ time lxc list
Generating a client certificate. This may take a minute...
If this is your first time using LXD, you should also run: sudo lxd init
To start your first container, try: lxc launch ubuntu:16.04

+------+-------+------+------+------+-----------+
| NAME | STATE | IPV4 | IPV6 | TYPE | SNAPSHOTS |
+------+-------+------+------+------+-----------+

real	5m25.717s
user	5m25.460s
sys	0m0.372s
ubuntu@scw-ubuntu-arm64:~$
This is weird and warrants closer examination. In any case, the available entropy does not look like the bottleneck:
ubuntu@scw-ubuntu-arm64:~$ cat /proc/sys/kernel/random/entropy_avail
2446
ubuntu@scw-ubuntu-arm64:~$
Creating containers
Let’s create a container. We will do each step separately, in order to measure the time each takes to complete.
ubuntu@scw-ubuntu-arm64:~$ time lxc image copy ubuntu:x local:
Image copied successfully!

real	1m5.151s
user	0m1.244s
sys	0m0.200s
ubuntu@scw-ubuntu-arm64:~$
Out of the 65 seconds, 25 seconds were spent downloading the image and the remaining 40 seconds on initialization before the prompt was returned.
Let’s see how long it takes to launch a container.
ubuntu@scw-ubuntu-arm64:~$ time lxc launch ubuntu:x c1
Creating c1
Starting c1
error: Error calling 'lxd forkstart c1 /var/lib/lxd/containers /var/log/lxd/c1/lxc.conf': err='exit status 1'
  lxc 20170428125239.730 ERROR lxc_apparmor - lsm/apparmor.c:apparmor_process_label_set:220 - If you really want to start this container, set
  lxc 20170428125239.730 ERROR lxc_apparmor - lsm/apparmor.c:apparmor_process_label_set:221 - lxc.aa_allow_incomplete = 1
  lxc 20170428125239.730 ERROR lxc_apparmor - lsm/apparmor.c:apparmor_process_label_set:222 - in your container configuration file
  lxc 20170428125239.730 ERROR lxc_sync - sync.c:__sync_wait:57 - An error occurred in another process (expected sequence number 5)
  lxc 20170428125239.730 ERROR lxc_start - start.c:__lxc_start:1346 - Failed to spawn container "c1".
  lxc 20170428125240.408 ERROR lxc_conf - conf.c:run_buffer:405 - Script exited with status 1.
  lxc 20170428125240.408 ERROR lxc_start - start.c:lxc_fini:546 - Failed to run lxc.hook.post-stop for container "c1".
Try `lxc info --show-log local:c1` for more info

real	0m21.347s
user	0m0.040s
sys	0m0.048s
ubuntu@scw-ubuntu-arm64:~$
What this means is that the Scaleway Linux kernel does not have all the AppArmor (“aa”) features that LXD requires. If we want to continue, we must declare that we are OK with this situation.
What features are missing?
ubuntu@scw-ubuntu-arm64:~$ lxc info --show-log local:c1
Name: c1
Remote: unix:/var/lib/lxd/unix.socket
Architecture: aarch64
Created: 2017/04/28 12:52 UTC
Status: Stopped
Type: persistent
Profiles: default

Log:

lxc 20170428125239.730 WARN lxc_apparmor - lsm/apparmor.c:apparmor_process_label_set:218 - Incomplete AppArmor support in your kernel
lxc 20170428125239.730 ERROR lxc_apparmor - lsm/apparmor.c:apparmor_process_label_set:220 - If you really want to start this container, set
lxc 20170428125239.730 ERROR lxc_apparmor - lsm/apparmor.c:apparmor_process_label_set:221 - lxc.aa_allow_incomplete = 1
lxc 20170428125239.730 ERROR lxc_apparmor - lsm/apparmor.c:apparmor_process_label_set:222 - in your container configuration file
lxc 20170428125239.730 ERROR lxc_sync - sync.c:__sync_wait:57 - An error occurred in another process (expected sequence number 5)
lxc 20170428125239.730 ERROR lxc_start - start.c:__lxc_start:1346 - Failed to spawn container "c1".
lxc 20170428125240.408 ERROR lxc_conf - conf.c:run_buffer:405 - Script exited with status 1.
lxc 20170428125240.408 ERROR lxc_start - start.c:lxc_fini:546 - Failed to run lxc.hook.post-stop for container "c1".
lxc 20170428125240.409 WARN lxc_commands - commands.c:lxc_cmd_rsp_recv:172 - Command get_cgroup failed to receive response: Connection reset by peer.
lxc 20170428125240.409 WARN lxc_commands - commands.c:lxc_cmd_rsp_recv:172 - Command get_cgroup failed to receive response: Connection reset by peer.
ubuntu@scw-ubuntu-arm64:~$
There are two hints here: some issue with process_label_set, and another with get_cgroup.
Let’s allow incomplete AppArmor support for now, and start the container:
ubuntu@scw-ubuntu-arm64:~$ lxc config set c1 raw.lxc 'lxc.aa_allow_incomplete=1'
ubuntu@scw-ubuntu-arm64:~$ time lxc start c1

real	0m0.577s
user	0m0.016s
sys	0m0.012s
ubuntu@scw-ubuntu-arm64:~$ lxc list
+------+---------+------+------+------------+-----------+
| NAME | STATE   | IPV4 | IPV6 | TYPE       | SNAPSHOTS |
+------+---------+------+------+------------+-----------+
| c1   | RUNNING |      |      | PERSISTENT | 0         |
+------+---------+------+------+------------+-----------+
ubuntu@scw-ubuntu-arm64:~$ lxc list
+------+---------+-----------------------+------+------------+-----------+
| NAME | STATE   | IPV4                  | IPV6 | TYPE       | SNAPSHOTS |
+------+---------+-----------------------+------+------------+-----------+
| c1   | RUNNING | 10.237.125.217 (eth0) |      | PERSISTENT | 0         |
+------+---------+-----------------------+------+------------+-----------+
ubuntu@scw-ubuntu-arm64:~$
Let’s run nginx in the container.
ubuntu@scw-ubuntu-arm64:~$ lxc exec c1 -- sudo --login --user ubuntu
To run a command as administrator (user "root"), use "sudo <command>".
See "man sudo_root" for details.

ubuntu@c1:~$ sudo apt update
Hit:1 http://ports.ubuntu.com/ubuntu-ports xenial InRelease
...
37 packages can be upgraded. Run 'apt list --upgradable' to see them.
ubuntu@c1:~$ sudo apt install nginx
...
ubuntu@c1:~$ exit
ubuntu@scw-ubuntu-arm64:~$ curl http://10.237.125.217/
<!DOCTYPE html>
<html>
<head>
<title>Welcome to nginx!</title>
<style>
...
ubuntu@scw-ubuntu-arm64:~$
That’s it! We are running LXD on Scaleway’s new ARM64 servers. These issues should be fixed in order to offer a nicer user experience.
11 comments
Ahmad
Do they support FUSE?
Simos
Great work documenting this!
Have you filed any bugs with Scaleway to get some of the issues fixed?
Brian
Author
@Ahmad:
According to http://mirror.scaleway.com/kernel/aarch64/4.9.23-std-1/config-4.9.23-std-1
the FUSE filesystem is compiled in the kernel and available to use.
@bmullan:
I asked at the LinuxContainers mailing list (and discussion forum) for a repository that has a suitable Linux kernel.
We can propose to them (I’ll do so and report here).
I do not know if Canonical will try to approach them as well.
Author
@bmullan: I reported about the Linux kernel issue at https://community.online.net/t/official-linux-kernel-new-modules-optimizations-hacks/226/75
Hello, great work. Consider that this address http://mirror.scaleway.com/kernel is empty now though
Author
The mirror.scaleway.com web server does not allow indexing. The ARM64 files can be found at http://mirror.scaleway.com/kernel/aarch64/4.9.23-std-1/
The URL has the form of http://mirror.scaleway.com/kernel/${arch}/${release}/
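For illustration, here is how that URL is assembled from the kernel identifiers used in this post (the two values are hard-coded here; on a running server, uname -m and uname -r would supply them):

```shell
# Sketch: build the kernel-files mirror URL from the running kernel's identifiers.
arch="aarch64"           # on the server: arch="$(uname -m)"
release="4.9.23-std-1"   # on the server: release="$(uname -r)"
url="http://mirror.scaleway.com/kernel/${arch}/${release}/"
echo "$url"              # prints: http://mirror.scaleway.com/kernel/aarch64/4.9.23-std-1/
```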
Installing ZFS on the Start range from Scaleway seems to be much easier as they use a conventional kernel, so there’s no need to build the modules…
Author
Are you referring to the ARM64 servers? There is difficulty with the ARM64 servers, and I suppose there has not been a recent change in that regard.
For the x86 servers, it is much easier,
https://blog.simos.info/how-to-install-lxd-containers-on-ubuntu-on-scaleway/
https://blog.simos.info/how-to-run-the-stock-ubuntu-linux-kernel-on-scaleway-using-kexec-and-server-tags/
Hi Simos, I’m talking about the new “Start” range of KVM based machines on Scaleway…
https://www.scaleway.com/virtual-cloud-servers/#anchor_starter
These machines don’t have bootscript options, you just use the classic LXD and ZFS install options for Debian which is much easier than needing to manually edit things as was the case before. The Start range has only been around since May this year…
https://blog.online.net/2018/05/03/introducing-scaleway-nextgen-nvme-cloud-servers-with-hot-snapshots/
Author
Hi Bruce,
I just had a look at the new “Start” range. Scaleway has eventually started using the stock Ubuntu Linux kernel; therefore, everything should work fine when you want to use LXD (which needs the AppArmor kernel support), even with ZFS or btrfs. Previously, Scaleway had their own Linux kernels, which were shared among the Linux distributions.
This warrants an updated post on running LXD on Scaleway servers!
Great… yeah, I was pleasantly surprised when I saw that their latest incarnation was using the stock kernel… I suspect they had a bunch of issues with their boot scripts or people just kept asking them for stock…