Carvel (formerly k14s), combined with Ansible to orchestrate a GitOps workflow, has after a lot of testing become my preferred Kubernetes deployment technique. This post covers how to use the ansible_k14 role as part of a configuration management and control process. I’ll detail how to configure and customize applications as well as how I create a role for a new application. This will serve as the basis for future posts, which will focus on deploying specific applications to work together and will refer back to this post for details on how my Carvel workflow works.

Workflow

The workflow is designed to generate configurations to be checked into revision control. This makes reviewing changes easy, as both the input and resulting output can be seen in a diff. Additionally, everything needed to deploy or update an application is in the repository. Each commit is a fully deployable configuration, enabling a rollback by reverting a bad change in git and reapplying.
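For example, undoing a bad deployment is just standard git plus re-running the deploy script covered below; a minimal sketch, where the commit hash and application path are placeholders:

# revert the bad change (inputs and generated manifests alike)
git revert <bad-commit>

# reapply the previous, known-good configuration
cd sites/site1/application
./deploy.sh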

To automate this I’m using an Ansible role, ansible_k14. This role is designed to be imported by application-specific roles to provide a consistent way of generating deployment artifacts in a DRY fashion. The ansible_k14 role currently does the following:

  • Retrieve the templates for a service via:
    • a Helm chart in a git repo
    • a Helm chart downloaded from a chart server
    • Docker Compose files in a git repo
  • Generate Kubernetes objects from the templates
  • Generate any additional supporting objects from the application role
  • Apply overlay modifications to resources if they are defined
  • Resolve image references to their digest form (immutable)
  • Set up files in the application folder

The process allows for multiple sites with site-specific configuration, as well as application-specific overlays if customization is needed on a per-site basis.

Directory Structure

The final directory structure for a site and application follows this scheme:

sites/
└── site1
    ├── application
    │   ├── deploy.sh
    │   ├── diff.sh
    │   ├── manifest
    │   │   ├── ConfigMap.yaml
    │   │   ├── daemonset.yaml
    │   │   ├── deployment.yaml
    │   │   ├── namespace.yaml
    │   │   ├── rbac.yaml
    │   │   └── service-accounts.yaml
    │   ├── secrets
    │   │   └── secrets.yaml
    │   └── overlays
    │       └── manifest.yaml
    └── site.yaml

The role generates:

  • application folder
  • deploy and diff scripts
  • manifest folder populated with K8s objects
  • secrets folder with a secrets file (if used by the application role)

The site.yaml file is used for site-specific settings and is not generated by the role. The ansible_k14 role detects sops, allowing these settings to be encrypted.

The overlays folder and its contents are not generated; they are used to apply site-specific customizations to an application if they exist. In most instances this shouldn’t be necessary, but it’s included as a convenience for edge cases.
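As a purely illustrative sketch (the Deployment name and field are made up, and this assumes the role feeds the file through ytt), an overlays/manifest.yaml that bumps a replica count could look like:

#@ load("@ytt:overlay", "overlay")

#@overlay/match by=overlay.subset({"kind": "Deployment", "metadata": {"name": "application"}})
---
spec:
  #@overlay/match missing_ok=True
  replicas: 2

The overlay matches a resource by kind and name and only touches the fields it declares, so the rest of the generated manifest is left untouched.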

Software requirements

All of the deployment objects are fully defined in an application’s folder for a given site, which means you can deploy them directly with kubectl if needed. However, generating the templates requires supporting software, and the deployment scripts are set up to deploy via kapp. The full list of currently required software is:

  • ansible - required to run the roles providing the workflow automation
  • kapp - to run the deploy scripts
  • ytt - for template processing
  • kbld - for image dereferencing during template processing
  • kompose - to generate templates from docker-compose files
  • helm3 - to generate templates from helm charts
  • git - needs no explanation

On Ubuntu you can install Carvel, Kompose, Helm, and Ansible to meet these requirements with:

# Carvel (run as root)
wget -O- https://k14s.io/install.sh | bash

# Kompose for updating config
curl -L https://github.com/kubernetes/kompose/releases/download/v1.21.0/kompose-linux-amd64 -o kompose
chmod +x kompose
sudo mv ./kompose /usr/local/bin/kompose

# Helm3 for updating config
sudo snap install helm --classic

# Ansible for updating config
sudo apt install software-properties-common
sudo apt-add-repository --yes --update ppa:ansible/ansible
sudo apt install ansible

Several alternate install methods are available for other platforms or if you want to avoid curl | bash installs. See the links for specific applications for details on install options.

Putting it to use

Clone the example repository for this post and we’ll work through some hands-on examples.

git clone https://github.com/chris-sanders/ansible_k14_deployment.git
cd ansible_k14_deployment/
git submodule update --init

At the top level of the repository we have a few files that configure ansible_k14 and control which applications and sites we want to process. This is done via an Ansible inventory file and a playbook. Let’s look at the inventory file, clusters.yaml.

Inventory file

all:
    hosts:
        site1:
            root_folder: ./sites/site1 
        site2:
            root_folder: ./sites/site2 
    vars:
        ansible_connection: local
        ansible_python_interpreter: ""
        site_file: site.yaml

Even if you aren’t familiar with Ansible and inventory files, this is a fairly short and straightforward file. Ansible is configured to run locally; because we’re generating all the files in this directory, there’s no need to run on remote machines.

Ansible hosts are used to define the sites we want. Here you can see that I have defined two sites (site1 and site2) and where I want their root folders to be. I’m keeping all of the sites in a ‘sites’ subfolder. This lets you name or organize your sites however you prefer.

The vars section applies to all hosts. The Ansible parameters set up the local connection, and site_file tells ansible_k14 what filename to use for the site-specific configuration. I’m using site.yaml for all sites; you could define it per site if you wanted different file names for some reason.
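For example, since Ansible host variables override group-level vars, a hypothetical third site could use its own file name (site3 is not in the example repository):

all:
    hosts:
        site3:
            root_folder: ./sites/site3
            site_file: production.yaml  # hypothetical per-site override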

Playbook

In addition to the inventory file, a playbook is used to specify which roles and sites we want to process for a given run. The example folder has the playbooks all.yaml and dev.yaml. Let’s look at the all.yaml playbook.

- hosts: site1
  gather_facts: no
  tasks:
    - name: Include site1 roles
      include_role:
          name: "{{ item }}"
          public: no
      loop:
          - metallb
          - traefik
          - ceph-csi-rbd
          - acme-dns
          - cert-manager
          - bitwarden
- hosts: site2
  gather_facts: no
  tasks:
    - name: Include site2 roles
      include_role:
          name: "{{ item }}"
          public: no
      loop:
          - metallb
          - traefik
          #- ceph-csi-rbd
          - acme-dns
          - cert-manager
          - bitwarden

If you aren’t familiar with Ansible playbooks, this file defines two plays, one targeting the host site1 and the other targeting the host site2. These names are the hosts I defined in the inventory above. For each host, a single task runs, looping over a list of Ansible roles and including each one.

All of the roles live in the roles folder, including the ansible_k14 role itself. Details on how the roles are written will be covered later. Including a role processes the configuration for that application and generates the output in the application folder for the site it was included on.
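As a rough, hypothetical sketch, an application role is essentially a thin wrapper that sets some variables and pulls in the shared role; the variable names below are made up for illustration, so check the roles in the repository for the real interface:

# roles/myapp/tasks/main.yaml -- hypothetical sketch, not the real interface
- name: Generate myapp deployment artifacts
  include_role:
      name: ansible_k14
  vars:
      app_name: myapp                          # hypothetical variable
      chart_repo: https://charts.example.com   # hypothetical variable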

Running the play

Let’s run a playbook to generate an application on one site. If you look at the file dev.yaml, it’s a playbook that just runs the metallb role on site2. To run it, call ansible-playbook, specifying the inventory file and the playbook to run.

ansible-playbook -i clusters.yaml dev.yaml

If you have the dependencies installed, Ansible will run through the ansible_k14 process and regenerate the manifest files. The role deletes the current manifest folder and completely regenerates its contents. If you don’t see any changes in git, that just means nothing changed in the input, so the output files are exactly the same and there’s nothing to review. At the time of this writing, a git diff after this run produces the following diff.

diff --git a/sites/site2/metallb/manifest/daemonset.yaml b/sites/site2/metallb/manifest/daemonset.yaml
index 5d24219..dd2ffc0 100644
--- a/sites/site2/metallb/manifest/daemonset.yaml
+++ b/sites/site2/metallb/manifest/daemonset.yaml
@@ -14,7 +14,7 @@ metadata:
     app.kubernetes.io/instance: metallb
     app.kubernetes.io/managed-by: Helm
     app.kubernetes.io/name: metallb
-    helm.sh/chart: metallb-0.1.22
+    helm.sh/chart: metallb-0.1.23
   name: metallb-speaker
 spec:
   selector:
@@ -29,7 +29,7 @@ spec:
         app.kubernetes.io/instance: metallb
         app.kubernetes.io/managed-by: Helm
         app.kubernetes.io/name: metallb
-        helm.sh/chart: metallb-0.1.22
+        helm.sh/chart: metallb-0.1.23
     spec:
       containers:
       - args:

This demonstrates one of the benefits of checking everything into git. I can see that none of the input configuration changed; however, re-processing the upstream Helm chart shows a minor version bump in the chart revision. That change doesn’t affect my configuration, as none of the actual Kubernetes resources changed. Roles currently pull the latest upstream on every run, so your results at a later time may show more significant changes.

Site config

All of the roles include default values; you can include whatever baseline configuration you want in the roles to represent the most common configuration across your sites. Where you need differences between sites, values can be configured for each site in the site file. The site files in each of the example sites show configuration examples for the currently deployed roles.

Edit the site file for site2, located at ./sites/site2/site.yaml, and update the metallb addresses value. This value tells MetalLB what IP range it can use for ExternalIP addresses. Changing the value to 10.0.10.50-10.0.10.55 and re-running the dev.yaml play as before regenerates the configuration with this new range.

diff --git a/sites/site2/metallb/manifest/ConfigMap.yaml b/sites/site2/metallb/manifest/ConfigMap.yaml
index d0a9b06..8d7a527 100644
--- a/sites/site2/metallb/manifest/ConfigMap.yaml
+++ b/sites/site2/metallb/manifest/ConfigMap.yaml
@@ -11,4 +11,4 @@ data:
     - name: default
       protocol: layer2
       addresses:
-      - 10.0.9.1-10.0.9.19
+      - 10.0.10.50-10.0.10.55
diff --git a/sites/site2/site.yaml b/sites/site2/site.yaml
index 5e7366c..f9c3b67 100644
--- a/sites/site2/site.yaml
+++ b/sites/site2/site.yaml
@@ -2,7 +2,7 @@ kapp:
     namespace: kapp
 metallb:
     addresses:
-    - 10.0.9.1-10.0.9.19
+    - 10.0.10.50-10.0.10.55

After the change, the git diff shows that the site configuration was changed and that a ConfigMap for metallb had a value changed as well. Again, by revision controlling the full configuration, git is able to show us both the configuration change and the actual result that will be applied to the cluster.

Deploying updates

With the configuration change to metallb described above, kapp can be used to diff the configuration on disk against the one in the cluster. This is the first operation that will actually talk to the cluster; generating and reviewing changes has thus far been done completely offline.

From the folder ./sites/site2/metallb, running the diff.sh script produces the following diff.

$ ./diff.sh
Target cluster 'https://192.168.1.108:16443' (nodes: x1c)

@@ update configmap/metallb-config (v1) namespace: metallb @@
  ...
  6,  6         addresses:
  7     -       - 10.0.9.1-10.0.9.19
      7 +       - 10.0.10.50-10.0.10.55
  8,  8   kind: ConfigMap
  9,  9   metadata:

Changes

Namespace  Name            Kind       Conds.  Age  Op      Op st.  Wait to    Rs  Ri
metallb    metallb-config  ConfigMap  -       4h   update  -       reconcile  ok  -

Op:      0 create, 0 delete, 1 update, 0 noop
Wait to: 1 reconcile, 0 delete, 0 noop

You can see the full command line in the script. By default it prints a diff of the changes as well as a table of which objects are going to change. In this case the ConfigMap in the metallb namespace is going to be updated. Kapp tracks and notices changes, additions, and even removal of resources. Similarly, the deploy.sh script deploys the changes with kapp, only altering the items that changed.
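The generated scripts handle details such as sops decryption, but at their core they boil down to kapp invocations roughly like the ones below; the app name and namespace mirror this example, and the exact flags may differ from what the role generates:

# roughly what diff.sh does: show the diff and exit without applying
kapp -n kapp deploy -a metallb -f manifest/ --diff-run --diff-changes

# roughly what deploy.sh does: apply the manifests as the app "metallb"
kapp -n kapp deploy -a metallb -f manifest/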

Kapp provides a few benefits over standard kubectl for deployment. One of the most useful is tracking what was deployed with each deployment: kapp will remove resources that were dropped from the configuration, where kubectl would simply add or update resources. Another nice benefit is that kapp is resource aware and applies resources in an order that tries to avoid failures, for example creating a namespace before the resources that go in it, which would fail if applied out of order with kubectl. If necessary, kapp can be customized for specific deployment ordering, as shown below.
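Ordering customizations are expressed with kapp’s change-group and change-rule annotations on the resources themselves; a minimal illustration (metadata excerpts only, names are placeholders):

apiVersion: v1
kind: ConfigMap
metadata:
  name: app-config
  annotations:
    kapp.k14s.io/change-group: "example.com/app-config"
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: app
  annotations:
    # ask kapp to create/update this Deployment only after the config group
    kapp.k14s.io/change-rule: "upsert after upserting example.com/app-config"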

Encrypting configuration

The roles have built-in support to autodetect whether the site_file is using sops encryption. Sops provides several encryption options, and the configuration in site1 demonstrates using PGP to encrypt files locally. Here is an excerpt of the site1 site.yaml file:

kapp:
    namespace: ENC[AES256_GCM,data:7Nz1Sw==,iv:qMIoQpMPKPkyZSh2gkkULlLJLNrWOAdr4m+uOXFteh4=,tag:k0GPNj5sU5pnZLgD1A4Bvg==,type:str]
metallb:
    addresses:
    - ENC[AES256_GCM,data:BHLA+TrW3vwLgG+3gCirSuYf,iv:w4gkAG/3TqMfPrK5SW6QqmSzlXM5vydE2K8rpKg08lU=,tag:8IbOtRUWTZBlgDYygbkJww==,type:str]
    secretkey: ENC[AES256_GCM,data:bjjqG6FZY7a+9wLPYXRzhLvxo4IOQij3VsS1fvE88LLrb3c/TbQ9vO+Uyc7mvkgC47WFA9BO9P9JeXFXAZgTGrxbPCE51zK7BjbjduWfEVm6KBt5L7Le5G10dWp4tl5TErK+6CMgP7tI5H1tXZCxr6Uh0fY=,iv:GcilrbW/ImQIDPxB7jrgZEidz5puY2vqETH1Buyk1rA=,tag:7DyE2lHYq0p/T0uhdMNyrg==,type:str]
sops:
    kms: []
    gcp_kms: []
    azure_kv: []
    lastmodified: '2020-09-08T21:27:51Z'
    mac: ENC[AES256_GCM,data:LQeSU2pGh8JSgxNL4LqfbEqLO4mAdkB5Q7EQgWK0Fu8Aki0DUWrd8CorzRlwAj3Zux6rbMe/uCOUEA/evYWkkYf9LjG7po1fivLAab8LxrdEnvSB5HO+HBT7CAJHg/2AXN9TqBRSOpHRaTUJdo02eXzH1WcT2meuAaZj1SNEOls=,iv:AQ+4y+WYTFZfYZ3Enk+vel5JaevThBnuLxzJKM6ihqk=,tag:hcGOsQzEvmhIalsU0i5fGQ==,type:str]
    pgp:
    -   created_at: '2020-09-08T21:27:09Z'
        enc: |
            -----BEGIN PGP MESSAGE-----

            hQIMA8+hman75/nUAQ//dmVZtmz6pAK1bglOkaa06BYQAJq1B+RvPn3k8AInv1jw
            LLZOJkRGBY41+C4HHuStQt/TtZVwRJIOTNA9AwpLUMV7c60AO6K+aMueUrITegvy
            idHhVFu1nMVIELnYBLLx7fuxlbJqRjjbc4WtU7XZd4cCS8B7ZE1LzoB3fPdK+TEn
            3TxsWIHr/GJ8319pri5Ql/G7Bc/+n6F+8HT0jc5JKg9M2IglSZ4M9mMV+hj1GYpR
            8jWlM4AIeiWi1nvBCkgISeun0A/OWYg5FZayufMnBdqoYBu//jtgUSamlp26WV8t
            yC3iugTT9uOOyW4TdgdHPjXb+UT9Zu4Ly7IgjXh/ymQ+JJwKnKBcyh74jsVs6oTR
            sRiWkhAPRWFNvAbE+6PSZbIbuQcideNLc1OKLU20FLcBvleCg+l4LuK4C/BtegyQ
            1PauOY/nzjqPfvA1oYFuzfVTptibhwnR2eVFRUmMsOVeTPeiuoath61XLLnOP3GN
            LhnylDw8YWAwus+HqLR9cViVsz/2/uxGLjKXF+HIUlWBgtbhPaq6II7us0CheuhM
            KubhHyJ/VeOKkll0Dzb3YERX2tk4XbjNzGRVaXQetg0lEZR/ezW/yklm97RUUgG/
            f64dW2JwNltNPmi1VpN/eY5UIroJyfz0fiGruP3qm2JjO4tr0qAs9grF9fIFI6fS
            XgE1aNuqeaPCc46/G54sbDI/TlUhcvnlQ5qU6I93W8ZW47CyYO2wb3TY0xvRbt+m
            cRax0LFGI6Yf4Ya8fGtvsk1L58jBWFYfHjQdFX+Tij8m8SWGPaZmMHDGKn8yevM=
            =V09O
            -----END PGP MESSAGE-----
        fp: 211FA1AE58F975F266A3E42E3CF239734CDBFD58
    unencrypted_suffix: _unencrypted
    version: 3.5.0

Sops is YAML aware and, as you can see here, leaves the keys as-is but encrypts the values. Which values are encrypted is configurable, but encrypting all values in the site_file is the best (and default) choice. All of the information sops needs to decrypt the file is included in the sops key that it adds when encrypting a file. This means you can decrypt the configuration on any machine that has the appropriate keys. Multiple keys can be used, making it very straightforward to add keys for CI systems or other users that need access to decrypt the secrets.

Sops configuration can be seen in the .sops.yaml file in the root of the repository; for full configuration options see the upstream sops documentation. If using sops for the site.yaml, be sure to open and edit the file with sops, not with a standard text editor. Sops checks a MAC of the file (the mac field in the excerpt above), and if it has been modified outside of sops it will refuse to process it.
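The actual file in the repository may differ, but a minimal .sops.yaml follows this shape; the path regex is illustrative and the fingerprint is the one from the excerpt above:

creation_rules:
  - path_regex: sites/.*/site\.yaml$
    pgp: '211FA1AE58F975F266A3E42E3CF239734CDBFD58'

With a rule like this in place, running sops sites/site1/site.yaml opens the file decrypted in your editor and re-encrypts it on save.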

The ansible_k14 role will decrypt the site file and, if sops was used on the site file, will also use sops to encrypt resources in the secrets folder. The deploy and diff scripts will include sops decryption as part of the script if sops was found in the site file. To verify this is working, check for a call to sops in the *.sh files and visually inspect the secrets.yaml file in the secrets folder to confirm the values are encrypted.
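For example, a quick check could look like this (paths assume the example repository layout):

# confirm the generated scripts reference sops
grep -l sops sites/site1/*/deploy.sh sites/site1/*/diff.sh

# confirm this machine can still decrypt the site file
sops --decrypt sites/site1/site.yaml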

Updating images

Many Helm charts reference images via tags, which do not uniquely identify the image. Updates to the image pushed under the same tag are not noticed by Kubernetes. Whether the new image is pulled depends on the imagePullPolicy, which is generally IfNotPresent. The result is that a pod’s image is only pulled when the pod is created on a node that doesn’t already have the image; updates under the same tag name will not trigger an update to the pod.

The ansible_k14 role addresses this by using kbld; I showed an example of this in the previous post. Each time a role is run, the tag is resolved against the upstream registry and the digest, which identifies the exact image, is used instead. Even if nothing else changes in the configuration, a new image published for the tag will update the image value of the resource, and applying the resource will redeploy the pod with the new image. Regularly processing roles with a CI/CD pipeline can therefore be used to guarantee update intervals for your pods.
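The effect on the generated manifests looks roughly like this; the image name is a placeholder and the digest is whatever kbld resolves at generation time:

# tag reference (mutable), as found in the upstream chart
image: example/app:v1.2.3

# digest reference (immutable), as written into the manifest by kbld
image: index.docker.io/example/app@sha256:<digest-resolved-by-kbld>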

Wrapping up

This post provides the core concepts needed to understand how the Ansible roles I’ve written on top of the common ansible_k14 role can be configured and generated. I intend to write a later post reviewing how to write roles for new applications, but if you are familiar with Ansible you’ll probably be able to get started without a walkthrough.

The intention of this post was to show how these roles work so that future posts can focus on configuring specific applications for homelab/edge use. Future posts using these roles will assume you already know these instructions, or can refer back to them, and will just provide information about the specific applications being deployed.