Connecting to systemd-nspawn SSH containers in Ansible
August 3, 2018
I’ve recently been working on using Ansible to deploy some test services, one of which is an open source IAM server called Gluu. Gluu is unique in that it runs in a systemd-nspawn container. Management and installation of Gluu requires dropping into the container namespace using /sbin/gluu-serverd-3.1.3 login.
While this is all well and good for manual configuration, it makes it a bit tricky to deploy using automation. There’s no real official support in Ansible for systemd containers, although there was some discussion on this pull request. Additionally, there is a third-party connection driver that uses the machinectl command for systemd-managed virtual machines and containers, but I haven’t tested it out.
At any rate, I wanted to find a way to do this using a vanilla Ansible installation for this specific use case.
Figuring out what login does
The first step along this journey involves figuring out what, exactly, /sbin/gluu-serverd-3.1.3 login does. The relevant portion of the script is fairly straightforward:
# Output omitted
case "$1" in
    # Output omitted
    login)
        PID=$(machinectl status gluu_server_$GLUU_VERSION 2>/dev/null | grep Leader | awk -F ' ' '{ print $2 };')
        if [[ ${PID} =~ ^-?[0-9]+$ ]] ; then
            ssh -o IdentityFile=/etc/gluu/keys/gluu-console -o Port=60022 -o LogLevel=QUIET \
                -o StrictHostKeyChecking=no -o UserKnownHostsFile=/dev/null \
                -o PubkeyAuthentication=yes root@localhost
        else
            echo "Gluu server is not started."
        fi
        ;;
# Output omitted
We can see that login simply executes SSH against a specific port on localhost. This indicates that Gluu is running an SSH daemon inside the container namespace. Sure enough, taking a look at the listening ports tells us exactly what is going on:
[root@gluuHost ~]# ss -tlp | grep 60022
LISTEN 0 128 *:60022 *:* users:(("sshd",pid=1390,fd=3))
LISTEN 0 128 :::60022 :::* users:(("sshd",pid=1390,fd=4))
It’s interesting to note that Gluu isn’t just listening on localhost. It’s listening on all interfaces. Gluu is, in my opinion, unnecessarily exposing SSH publicly. I can’t really imagine a situation where you would want to SSH directly into that container from the outside, as SSH isn’t a core service provided by Gluu.
The documentation is also opaque about this container behavior: it doesn’t mention anywhere in the setup guide that Gluu starts listening on SSH, or even that Gluu is running in a container (the docs erroneously say that it’s chrooted). At any rate, knowing that Gluu is listening on SSH provides a starting point for managing it via Ansible.
Options for Ansible management
If we take a look in the /opt/gluu-server-3.1.3 directory, we can see the container’s filesystem. Since we’re logging in as root, we can place a public key into /opt/gluu-server-3.1.3/root/.ssh/authorized_keys and then log in with whatever key we want.
The first approach that we could take toward managing Gluu would be to simply add it as a host in Ansible and manage it like any other Ansible host. Within a Gluu playbook, we can execute tasks on the host (such as installing the Gluu package) and then use delegate_to to execute tasks inside the container namespace.
Our inventory file would look something like this:
gluuContainer ansible_host=<FQDN or IP of the Gluu host> ansible_port=60022
And then a sample task might look something like this:
- name: Check to see if install has already been run
  stat:
    path: /install/community-edition-setup/setup.log
  register: gluuInstallLog
  delegate_to: gluuContainer
This isn’t an ideal approach, though. It would be better to firewall off that SSH port entirely, since there’s no real reason for Gluu to be listening on public interfaces. That leads to the next approach: treating the container host as a jump host, and then “jumping” into the container. This is the approach that I opted to take. Let’s take a look at the implementation.
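For what it’s worth, blocking external access to that port is a one-task job. Here’s a minimal sketch using Ansible’s iptables module, assuming plain iptables is managing the host’s firewall (adjust accordingly if firewalld or nftables is in charge). Loopback traffic is left alone, which is all the jump-host approach needs:

- name: Drop external traffic to the Gluu container's SSH port
  iptables:
    chain: INPUT
    protocol: tcp
    destination_port: "60022"
    in_interface: "!lo"   # only drop traffic arriving on non-loopback interfaces
    jump: DROP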
The inventory file
First, we need to create a host entry for the Gluu container and set the parameters for using a jump host:
[gluuContainer]
gluuContainer ansible_host=127.0.0.1 ansible_port=60022 ansible_ssh_transfer_method=scp ansible_ssh_common_args='-o ProxyJump=root@gluuHost'
Let’s break this down:
- ansible_host defines the host that Ansible will connect to after it’s connected to the jump host. We set this to 127.0.0.1 because Ansible will first SSH to the Gluu host and then jump into the container on localhost.
- ansible_port defines the port that Ansible will connect to. Once Ansible has established the first SSH connection to gluuHost, it will SSH to 127.0.0.1:60022, which is the Gluu container.
- ansible_ssh_transfer_method=scp tells Ansible to use SCP to transfer files into the Gluu container. This is useful because Ansible defaults to SFTP which, at the time of writing, wasn’t supported in the Gluu container.
- ansible_ssh_common_args allows us to pass raw arguments into the SSH connection. -o ProxyJump=root@gluuHost tells SSH to first connect to gluuHost and then jump into the host defined by ansible_host. (ProxyJump requires OpenSSH 7.3 or newer.)
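As a side note, if that inline inventory entry feels crowded, the same variables can live in a group_vars file instead. A quick sketch (the file path just follows Ansible’s standard group_vars convention):

# group_vars/gluuContainer.yml
ansible_host: 127.0.0.1
ansible_port: 60022
ansible_ssh_transfer_method: scp
ansible_ssh_common_args: '-o ProxyJump=root@gluuHost'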
Of course, we also need an inventory entry for the Gluu host, but that’s fairly simple:
[gluuHost]
10.0.1.105
The playbook
We can now write one unified playbook for Gluu that includes both the host and the container. First, we can define tasks on the host:
- name: Start Gluu
  command: /sbin/gluu-serverd-3.1.3 start

- name: Wait for Gluu container to launch
  wait_for:
    port: 60022

- name: Make sure Ansible can log into the Gluu container via localhost
  shell: cat /root/.ssh/authorized_keys >> /opt/gluu-server-3.1.3/root/.ssh/authorized_keys
At the host level, the tasks above will:
- Ensure that the Gluu container is started
- Wait for the Gluu container to launch by polling port 60022
- Copy the same keys that grant Ansible login on the host into the container’s filesystem:
  - /root/.ssh/authorized_keys is the authorized keys file on the Gluu host
  - /opt/gluu-server-3.1.3/root/.ssh/authorized_keys is the authorized keys file within the Gluu container’s namespace (the container mounts /opt/gluu-server-3.1.3 as its root filesystem when it starts)
Since this lab server is only managed by Ansible, copying the host’s authorized_keys file works just fine. Other use cases might prefer to transfer the public keys from somewhere else, but the concept is the same.
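One caveat: the shell task above appends to the container’s authorized_keys on every run, so it isn’t idempotent. If that matters to you, a sketch of an alternative is to slurp the host’s key file and hand it to the authorized_key module pointed at the container’s filesystem (the task names here are mine):

- name: Read the Ansible keys from the Gluu host
  slurp:
    src: /root/.ssh/authorized_keys
  register: hostAuthorizedKeys

- name: Ensure the same keys exist in the container's filesystem
  authorized_key:
    user: root
    path: /opt/gluu-server-3.1.3/root/.ssh/authorized_keys
    manage_dir: no
    key: "{{ hostAuthorizedKeys.content | b64decode }}"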
Now that Ansible can log into the container, we can define some tasks that should take place within it:
- name: Check to see if install has already been run
  stat:
    path: /install/community-edition-setup/setup.log
  register: gluuInstallLog
  delegate_to: gluuContainer

- name: Set installed flag
  set_fact:
    gluuInstalled: "{{ gluuInstallLog.stat.exists }}"

- name: Send Gluu setup properties
  copy:
    src: setup.properties
    dest: /install/community-edition-setup/setup.properties
  delegate_to: gluuContainer
  when: not gluuInstalled
The exact tasks run above aren’t important. The key is the use of delegate_to, which tells Ansible to run these tasks on the gluuContainer host. Ansible will look for gluuContainer in the inventory file that we defined previously and see that it needs to use a jump host. Ansible will then establish an SSH session to the Gluu host and “jump” into the Gluu container to execute these tasks.
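To make the overall shape concrete, here’s a sketch of how the pieces might sit in a single play. The play header and ordering are my own assumptions; the task bodies are the ones shown above:

- hosts: gluuHost
  remote_user: root
  tasks:
    # Host-side tasks run directly on the Gluu host
    - name: Start Gluu
      command: /sbin/gluu-serverd-3.1.3 start

    # ... remaining host tasks from above ...

    # Container-side tasks are delegated through the jump host
    - name: Check to see if install has already been run
      stat:
        path: /install/community-edition-setup/setup.log
      register: gluuInstallLog
      delegate_to: gluuContainer

    # ... remaining container tasks from above ...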
Wrapping up
It’s worth pointing out that the method above will only work for systemd-nspawn containers (or really any container) that are running SSH. In general, it’s a best practice to avoid running SSH in a container unless the purpose of the container is to provide SSH functionality (which is rarely the case). A better approach to managing a systemd container would be to use machinectl.
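For the curious, a machinectl-based task might look something like the sketch below. I haven’t tested this against Gluu; machinectl shell allocates a pseudo-terminal and needs systemd 225 or newer, so treat it as illustrative only. The machine name comes from the gluu-serverd script earlier:

- name: Run a command inside the container without SSH (untested sketch)
  command: machinectl shell root@gluu_server_3.1.3 /usr/bin/id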
Hopefully Ansible will support such interaction in the future, but the method described above should work for any container that exposes SSH in some fashion, such as the Gluu server that I was configuring.