Bootstrapping Argo CD

This article follows on from an introductory article that discussed the chicken-and-egg paradox of bootstrapping GitOps agents into Kubernetes clusters. It looks at how that paradox applies to Argo CD, one of the popular GitOps solutions used to automate application deployments to Kubernetes. Argo CD is one of the pioneers of this evolving approach to application delivery, and is part of a family of tools that co-exist under the Argo umbrella. Collectively, the Argo toolset is a recently graduated project of the Cloud Native Computing Foundation (CNCF).

Installing Argo CD

When it comes to bootstrapping Argo CD into a cluster to act as a GitOps agent, there are detailed installation instructions for each preferred setup (e.g. multi-tenant, high availability and so on). The Kubernetes manifests for Argo CD (including its custom resource definitions) are maintained in Argo CD’s GitHub repo, and these can be applied directly with kubectl, or through a Kustomization definition. There’s a Helm chart, too, for anyone who prefers to use the chart packaging metaphor for applications. Using any one of these techniques gets Argo CD running in a target cluster.

So, we know how to install Argo CD. What’s less clear is how, or even if, Argo CD can manage itself according to GitOps principles. The project’s documentation is a little opaque in this regard but, if you look hard enough, you’ll find a reference to managing Argo CD with Argo CD. It suggests using a Kustomization to define how Argo CD is configured to run in a Kubernetes cluster, with the config stored in a Git repo that is monitored by the installed instance of Argo CD. There’s even a live online example of Argo CD managing itself, alongside a bunch of other applications.
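
As a minimal sketch of what such a Kustomization might look like, the following pulls in the upstream installation manifest alongside a local namespace definition. The pinned version tag and the namespace.yaml file name are assumptions made for illustration, rather than anything prescribed by the Argo CD documentation.

```yaml
# kustomization.yaml (illustrative sketch)
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
# Place every resource in the namespace Argo CD expects to run in
namespace: argocd
resources:
  # Local file defining the 'argocd' Namespace (assumed file name)
  - namespace.yaml
  # Upstream install manifest; the version tag is an assumption,
  # so pin whichever release you actually intend to run
  - https://raw.githubusercontent.com/argoproj/argo-cd/v2.9.3/manifests/install.yaml
```

With a definition like this held in Git, changing the pinned tag in the repo is what would drive an upgrade of Argo CD.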

A deployed agent, with its configuration held in versioned storage, configured to fetch and reconcile that same configuration, fulfils the GitOps principles. What’s missing from this tantalising glimpse of self-management, however, is how this works in practice. Let’s see if we can unpick this.

Applications in Argo CD

GitOps agents work with instances of their own custom resources, which enable them to manage applications according to the GitOps principles. For Argo CD, the main custom resource with which it extends the Kubernetes API is the ‘Application’. By defining an Application object and applying it to the cluster, we provide Argo CD with the information it needs to manage that application. Some of the information defined in the object is mandatory, and some is optional. For example, the location of the remote Git or Helm repository that contains the application’s configuration, and the coordinates of the target cluster where the application is to be deployed, are both mandatory. Here’s an example Application definition that could be used by Argo CD for managing itself:

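apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: argocd
  namespace: argocd
spec:
  # Deploy into the same cluster that Argo CD itself runs in
  destination:
    namespace: argocd
    server: https://kubernetes.default.svc
  project: default
  # Git location of the configuration that defines Argo CD
  source:
    path: argocd-bootstrap/argocd
    repoURL: https://github.com/nbrownuk/gitops-bootstrap.git
    targetRevision: HEAD
  syncPolicy:
    automated:
      allowEmpty: true  # permit a sync that leaves the app with no resources
      prune: true       # delete resources that have been removed from Git
      selfHeal: true    # revert changes made directly in the cluster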

This object definition would need to be applied to a cluster running Argo CD, where it would be acted upon by Argo CD’s application controller, resulting in periodic fetches of the configuration from the source repo. Any changes made to Argo CD’s configuration in the repo (for example, a change in app version) would then be applied to the cluster, resulting in a rolling update of Argo CD. Hence, Argo CD is managing itself.

But, let’s stop to think for a moment; isn’t the definition of the Application part of the overall configuration? We’ve manually applied the Application object definition to the cluster, but it’s not stored in our ‘source of truth’ in the Git repo. If we had to re-create the cluster, would we remember the Git reference to use for the targetRevision; was it the HEAD, a branch, a specific commit? What was the sync policy for imperative changes made inside the cluster? This is one of the typical scenarios that inspired the GitOps philosophy - to take away the ambiguity of configuration, by using declarative configuration residing in versioned storage. This includes the configuration that drives the actions of the GitOps agent itself.

App of Apps Pattern

Argo CD can manage itself declaratively using a pattern called ‘app of apps’. The pattern is general purpose: a single ‘root’ Application definition serves as a pointer to a Git directory path containing one or more further Application definitions. Applying the ‘root’ definition results in the creation of all the subordinate Application objects in the cluster, which are then acted upon by Argo CD’s application controller. One of the subordinate Application objects might be one that defines Argo CD itself.

.
├── managedapps
│  ├── argocd
│  │  ├── kustomization.yaml
│  │  └── namespace.yaml
│  └── podinfo
│     └── kustomization.yaml
├── rootapp
│  ├── kustomization.yaml
│  └── rootapp.yaml
└── subapps
   ├── argocd.yaml
   ├── kustomization.yaml
   └── podinfo.yaml

Here, the ‘rootapp’ Application definition (contained in rootapp.yaml) references the path ‘subapps’, and each of the Application definitions at this location (argocd.yaml, podinfo.yaml) references the corresponding sub-directory under ‘managedapps’.
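
To make the chain of references concrete, here is a rough sketch of what rootapp.yaml and subapps/argocd.yaml might contain. The object names, the paths relative to the repo root, and the sync policy settings are all assumptions for illustration; the repo URL is borrowed from the earlier example.

```yaml
# rootapp/rootapp.yaml: the 'root' Application (illustrative sketch)
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: rootapp
  namespace: argocd
spec:
  project: default
  source:
    repoURL: https://github.com/nbrownuk/gitops-bootstrap.git
    targetRevision: HEAD
    path: subapps              # assumed path; holds the child Application definitions
  destination:
    server: https://kubernetes.default.svc
    namespace: argocd          # child Application objects are created next to Argo CD
  syncPolicy:
    automated:
      prune: true
      selfHeal: true
---
# subapps/argocd.yaml: child Application pointing at Argo CD's own configuration
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: argocd
  namespace: argocd
spec:
  project: default
  source:
    repoURL: https://github.com/nbrownuk/gitops-bootstrap.git
    targetRevision: HEAD
    path: managedapps/argocd   # assumed path; the Kustomization that installs Argo CD
  destination:
    server: https://kubernetes.default.svc
    namespace: argocd
  syncPolicy:
    automated:
      prune: true
      selfHeal: true
```

A podinfo.yaml definition would follow the same shape as the second document, with its source path pointing at the podinfo sub-directory instead.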

If there are just a few applications, this pattern works well. But, if there are a multitude of applications, then maintenance of the plethora of Application definitions can become burdensome. As an enhancement to this pattern, the Argo CD project introduced the ApplicationSet custom resource and controller. An ApplicationSet object defines one or more ‘generators’, which generate key/value pairs called parameters. The definition will also contain a ‘template’, which the ApplicationSet controller will render with the corresponding parameters. In this way, a single ApplicationSet object can automatically spawn numerous Application resources, saving on the administrative overhead of maintaining lots of Application definitions by hand.

An ApplicationSet definition can work just as well as the app of apps pattern for managing Argo CD itself. You may even want to lean towards the ApplicationSet solution, as it’s seen as an evolution of the app of apps pattern.
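
As a hedged sketch of the generator and template mechanism, the ApplicationSet below uses the Git directory generator to produce one Application for each sub-directory found under ‘managedapps’. The object name, the directory glob, and the choice to name each target namespace after its directory are assumptions for illustration.

```yaml
# Illustrative ApplicationSet using the Git directory generator
apiVersion: argoproj.io/v1alpha1
kind: ApplicationSet
metadata:
  name: managed-apps           # hypothetical name
  namespace: argocd
spec:
  generators:
    - git:
        repoURL: https://github.com/nbrownuk/gitops-bootstrap.git
        revision: HEAD
        directories:
          - path: managedapps/*   # assumed layout; one directory per application
  template:
    metadata:
      # 'path.basename' is the final directory name, e.g. 'argocd' or 'podinfo'
      name: '{{path.basename}}'
    spec:
      project: default
      source:
        repoURL: https://github.com/nbrownuk/gitops-bootstrap.git
        targetRevision: HEAD
        path: '{{path}}'          # full directory path matched by the generator
      destination:
        server: https://kubernetes.default.svc
        namespace: '{{path.basename}}'
      syncPolicy:
        automated:
          prune: true
          selfHeal: true
```

With a layout like this, adding a new directory under ‘managedapps’ would be enough for a further Application to be generated, with no new Application definition written by hand.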

Argo CD Autopilot

One of the problems with setting up the ‘app of apps’ or ApplicationSet configuration for Argo CD self-management is that there are a number of manual steps involved:

Establish Git repo with app configuration
Install Argo CD
Create secret for trusted access to Git repo
Apply ‘root’ Application or ApplicationSet object definition
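
The third of these steps can itself be written as a Kubernetes object. As a rough sketch, Argo CD treats a Secret carrying the repository label as credentials for a Git repo; the Secret name, the username value, and the token placeholder below are assumptions for illustration, and a real token should not be committed to Git in plain text.

```yaml
# Illustrative repository credential for Argo CD
apiVersion: v1
kind: Secret
metadata:
  name: gitops-bootstrap-repo    # hypothetical name
  namespace: argocd
  labels:
    # This label is how Argo CD identifies the Secret as repository credentials
    argocd.argoproj.io/secret-type: repository
stringData:
  type: git
  url: https://github.com/nbrownuk/gitops-bootstrap.git
  username: git                  # assumed value; used together with the token
  password: <personal-access-token>   # placeholder, supply your own token
```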

None of these steps is particularly onerous, but where there are manual steps there is always the opportunity to introduce error and ambiguity. The more that is automated, the better the chance of a successful outcome. This is where Argo CD Autopilot adds some value.

Argo CD Autopilot is a command line tool for bootstrapping Argo CD and managed applications into Kubernetes clusters. It originated from engineers at Codefresh as an ancillary project to Argo CD, and seeks to take away some of the complexity of bootstrapping an Argo CD GitOps implementation, as well as to provide a sane repo structure for the configuration of the managed applications. It still uses ApplicationSet and Application objects under the covers, but hides the complexity behind its CLI. The price you pay for this simplified user experience is an opinionated approach to the repo structure, and to the way you might approach your deployment workflows.

In terms of bootstrapping Argo CD itself, however, it’s as simple as issuing a command:

$ argocd-autopilot repo bootstrap \
    --repo https://github.com/nbrownuk/argocd-bootstrap \
    --git-token "$(< ./github-token)"

The net result is the creation of a Git repo (hence, the need to supply a personal access token with ‘repo’ privileges) containing a prescribed directory structure, with all of the necessary configuration elements to enable self-management of Argo CD. It also results in the deployment of Argo CD to the cluster addressed by your current context. From here you can use the CLI to create projects and applications for management by Argo CD, with the necessary configuration automatically created in the Git repo.

.
├── apps
│  ├── podinfo
│  │  ├── base
│  │  │  └── kustomization.yaml
│  │  └── overlays
│  │     └── prod
│  │        ├── config.json
│  │        └── kustomization.yaml
│  └── README.md
├── bootstrap
│  ├── argo-cd
│  │  └── kustomization.yaml
│  ├── argo-cd.yaml
│  ├── cluster-resources
│  │  ├── in-cluster
│  │  │  ├── argocd-ns.yaml
│  │  │  └── README.md
│  │  └── in-cluster.json
│  ├── cluster-resources.yaml
│  └── root.yaml
└── projects
   ├── prod.yaml
   └── README.md

The directory structure above is an example of a configured Git repo when using Argo CD Autopilot.

It’s early days for the Argo CD Autopilot project, with some features still missing. For example, its CLI doesn’t yet support the use of Helm charts as the embodiment of application configuration, although it’s possible to work around this using the limited support for Helm charts that Kustomize provides. Argo CD Autopilot isn’t necessarily for everybody, especially given its opinionated approach. But, if you’re looking for a convenient method for bootstrapping Argo CD, it does the job perfectly. It even allows you to recover from the loss of a cluster using the CLI, by bootstrapping with reference to the original Git repo as the source of truth.