# ADR-003 — Toolchain decisions
## Status

Accepted (2026-05-30)

## Context

boma needs a defined, reproducible toolchain for running and testing its Ansible
monorepo: an execution engine, a Python environment, secrets handling, a testing
framework, linting, CI/CD, developer-ergonomics conventions, and a collections/roles
policy. This ADR records the choice made for each, together with the alternatives
weighed and why they were not adopted.

## Decision

### Execution engine

**Choice**: `ansible-core` (pip-installed, pinned version) + explicit `requirements.yml`

**Not chosen**: `ansible` full package (bundles ~85 collections at a frozen version)

**Rationale**: Explicit collection pinning allows independent upgrades, smaller installs,
and fully reproducible environments. The full package trades these away for convenience
that isn't needed in a maintained monorepo.

---

### Python environment

**Choice**: `python3-venv` (system Python on Debian 13) + pinned `requirements.txt`

**Not chosen**: `pyenv` (solves multi-version problems on developer laptops, not needed
on a dedicated Debian control node with a controlled Python version)

**Rationale**: The control node runs one Python version. A plain venv is sufficient,
reproducible, and has no extra dependencies.

---

### Secrets

**Choice**: Ansible Vault (file-based, built-in)

**Not chosen**:
- SOPS + age: better git-diff ergonomics, but adds external tooling and key management
- HashiCorp Vault: powerful, but significant operational overhead for this scale

**Rationale**: Ansible Vault is built-in, requires no extra services, and works well at
this scale. Whole-file encryption makes diffs unreadable regardless of layout, so rather
than flattening keys for the sake of diffs, we organise secrets for human lookup and
clean extraction: a nested `vault.<service>.<key>` map inside each `vault.yml`, scoped
to actual secrets (see CLAUDE.md → Secrets).

---

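As a minimal sketch of the nested layout described under Secrets, a decrypted `vault.yml` could look like the following. The service names (`forgejo`, `backup`), the key names, and the placeholder values are hypothetical; only the `vault.<service>.<key>` shape comes from this ADR.

```yaml
# vault.yml, shown decrypted; on disk the whole file is encrypted with ansible-vault.
# Service and key names are illustrative assumptions, not the real secret set.
vault:
  forgejo:
    admin_password: "CHANGE_ME"
    runner_token: "CHANGE_ME"
  backup:
    repo_passphrase: "CHANGE_ME"
```

Under this layout a role would reference a value as `{{ vault.forgejo.admin_password }}`, which keeps lookups readable even though the encrypted file itself cannot be diffed.
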
### Testing

**Choice**: Molecule with Docker driver (`molecule-plugins[docker]`)

**Not chosen**:
- Molecule + Podman: rootless is appealing, but Docker is simpler on a Debian control node
- Molecule + Vagrant: full VMs are slower and require a hypervisor on the control node
- No testing: unacceptable for a shared, maintained project

**Test image**: a self-built, project-owned Debian 13 image with systemd support
(`.docker/molecule-debian13/`), hosted in the Forgejo registry. ADR-008 is canonical
for the image and the rationale for not using an external image such as
`geerlingguy/docker-debian13-ansible`.

**Verifier**: Built-in Ansible verifier. Testinfra can be added later if deeper
assertions are needed.

---

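For the Testing decision, a scenario's `molecule.yml` might be sketched as below. The image reference is a placeholder (ADR-008 is canonical for the real one), the platform name is hypothetical, and the systemd-related options are assumptions that depend on how the project image is built.

```yaml
# molecule/default/molecule.yml (sketch; image path and options are assumptions)
driver:
  name: docker
platforms:
  - name: debian13
    # Placeholder: substitute the project image published to the Forgejo registry.
    image: "<forgejo-registry>/molecule-debian13:latest"
    pre_build_image: true          # use the image as published, do not build one
    privileged: true               # commonly required to run systemd in a container
    command: /lib/systemd/systemd  # start systemd as PID 1 (assumed entry point)
provisioner:
  name: ansible
verifier:
  name: ansible                    # the built-in Ansible verifier chosen above
```

With the verifier set to `ansible`, assertions live in the scenario's `verify.yml` playbook, so no extra Python test dependency is needed until Testinfra is adopted.
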
### Linting

**Choice**: `ansible-lint` + `yamllint` + `pre-commit`

- `yamllint`: catches formatting issues before Ansible sees the file
- `ansible-lint`: enforces correctness and idiomatic style
- `pre-commit`: runs both locally on every commit, so lint failures are caught before
  they reach CI

Config files: `.ansible-lint`, `.yamllint` in repo root.

---

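The Linting setup could be wired together with a `.pre-commit-config.yaml` along these lines. This is a sketch only: the `rev` values are placeholder pins to be replaced with whatever versions the project actually locks, and the ADR does not state the file's real contents.

```yaml
# .pre-commit-config.yaml (sketch; rev values are placeholder pins)
repos:
  - repo: https://github.com/adrienverge/yamllint
    rev: v1.35.1
    hooks:
      - id: yamllint      # picks up .yamllint from the repo root
  - repo: https://github.com/ansible/ansible-lint
    rev: v24.9.2
    hooks:
      - id: ansible-lint  # picks up .ansible-lint from the repo root
```

Running `pre-commit install` once per clone registers the Git hook, after which both linters run on every commit.
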
### CI/CD

**Choice**: Forgejo Actions (self-hosted at forgejo.nyumbani.baobab.band) + `act_runner`

**Not chosen**: GitHub Actions (external), Jenkins (heavy)

**Pipeline** (trunk-based, no pull requests; see CLAUDE.md git conventions):
1. Push to `main` → lint + Molecule tests
2. On green → deploy to staging
3. [manual promote gate] → deploy to production

`act_runner` runs as a Docker container on `ubongo` (the control node; see ADR-015), or
on a dedicated runner VM later if CI load warrants a separate host.

---

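The first two pipeline steps of the CI/CD decision could be expressed as a Forgejo Actions workflow roughly like the one below. The file name, the `docker` runner label, and the Make target names (`lint`, `test`, `deploy-staging`) are all assumptions for illustration; the ADR only states that CI calls the same Make targets as local development.

```yaml
# .forgejo/workflows/ci.yml (sketch; runner label and Make targets are hypothetical)
on:
  push:
    branches:
      - main

jobs:
  test:
    runs-on: docker            # assumed label registered by the runner
    steps:
      - uses: actions/checkout@v4
      - run: make lint         # hypothetical target: yamllint + ansible-lint
      - run: make test         # hypothetical target: Molecule scenarios

  deploy-staging:
    needs: test                # runs only when lint and tests are green
    runs-on: docker
    steps:
      - uses: actions/checkout@v4
      - run: make deploy-staging   # hypothetical target
```

The manual promote gate to production is left out of the sketch because this ADR does not specify its mechanism.
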
### Developer ergonomics

**Choice**: `Makefile` as the single interface for all operations

**Rationale**: All `ansible-playbook`, `molecule`, and `ansible-lint` invocations go
through Make targets. This means:
- Claude Code always calls `make <target>` and never constructs raw commands
- Collaborators don't need to know the underlying flags
- CI uses the same targets as local development (no drift)

**direnv**: Not used, because the control node is a dedicated host, not a shared
workstation. The venv is activated in the user's shell profile.

---

### Collections and roles policy

**No Galaxy roles.** All roles are written and maintained locally in `roles/`.
Galaxy roles introduce external state, versioning surprises, and implicit
conventions that conflict with this repo's style.

**Collections on demand.** A collection is added to `requirements.yml` only when
a task in a committed role actively uses a module from it. Pre-emptive inclusions
are removed. Each entry in `requirements.yml` must justify its presence.

**Starting collection set** (rationale for each):

| Collection          | Decision | Reason                                                                                          |
|---------------------|----------|-------------------------------------------------------------------------------------------------|
| `ansible.posix`     | Kept     | Ansible-team maintained; fills real `ansible.builtin` gaps (`authorized_key`, `sysctl`, `acl`)  |
| `community.docker`  | Dropped  | ADR-004 uses `ansible.builtin.command` + `docker compose`; no Docker API modules needed         |
| `community.proxmox` | Dropped  | Proxmox configuration is out of scope (ADR-001)                                                 |
| `community.crypto`  | Deferred | Add when a role needs cert automation; use `openssl` CLI until then                             |
| `community.general` | Deferred | 1,500+ modules; add only the specific sub-module needed, with a comment                         |

---

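Given the starting collection set in the table above, the initial `requirements.yml` would contain a single entry. The sketch below assumes a placeholder version pin; the ADR does not name the actual version.

```yaml
# requirements.yml (sketch; the version is a placeholder pin)
collections:
  # Kept: fills ansible.builtin gaps (authorized_key, sysctl, acl).
  - name: ansible.posix
    version: "1.5.4"
```

Collections listed this way are installed with `ansible-galaxy collection install -r requirements.yml`, which in this repo would presumably sit behind a Make target. The inline comment is one way to satisfy the rule that every entry must justify its presence.
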
## What was explicitly ruled out

| Tool             | Reason not adopted                                          |
|------------------|-------------------------------------------------------------|
| AWX / AAP        | Significant operational overhead, not needed at this scale  |
| Semaphore        | Revisit if non-SSH operators need to trigger runs           |
| ansible-runner   | Only needed when AWX/Semaphore orchestrates runs            |
| ansible-builder  | Only needed when packaging Execution Environments for AWX   |
| Kubernetes/Swarm | Out of scope; Docker Compose is the right complexity level  |
| NixOS targets    | Poor Ansible fit; all hosts standardised on Debian 13       |

Terraform is **adopted** for VM provisioning only (no DNS); see `docs/decisions/006-terraform.md`.

## Consequences

Drawn from the rationale and trade-offs this ADR already states:

- Pinning `ansible-core` + an explicit `requirements.yml` and a plain pinned venv keeps
  the control-node environment small and fully reproducible, at the cost of maintaining
  the pins (per Execution engine / Python environment).
- Ansible Vault's whole-file encryption makes diffs unreadable regardless of layout, so
  secrets are organised for human lookup (`vault.<service>.<key>`) rather than diff
  ergonomics, which is the trade accepted against SOPS/age (per Secrets).
- The `Makefile` is the single interface: Claude Code and CI invoke the same targets, so
  local and CI behaviour can't drift and collaborators need not know raw flags (per
  Developer ergonomics).
- Collections are added only on demand, so `requirements.yml` stays minimal; this defers
  `community.crypto` (use `openssl` CLI until a role needs certs) and `community.general`
  (add only the specific sub-module needed) until a real need appears (per Collections
  and roles policy).
- The heavier orchestration tools were declined for this scale, each with a named
  revisit trigger: for example, Semaphore if non-SSH operators must trigger runs, and
  AWX-adjacent tooling only if AWX/AAP is ever adopted (per "What was explicitly ruled out").