
sjat 175777e36a docs: reconcile 2026-06-14 review findings (O1-O7,O18,O22)

- STATUS: docker_host is built+applied, not scaffold-only (O1)
- ADR-004: backup points to ADR-022, not "out of scope"; service-role file
  table gains ACCESS.md + BACKUP.md rows (O2, O5)
- Finish Traefik->Caddy: ADR-008/011/017/019, CAPABILITIES, TODO (O3); scope
  ADR-024's custom-image/NetBird claims to the deferred DNS-01/M4b paths (O22)
- ADR-016/017/018 now lead with ## Status per ADR-023 (O4)
- ADR-002: caveat `PLAYBOOK=upgrade` as planned/unbuilt (O6)
- CAPABILITIES: carve out ubongo's dev_env from the nvim/tmux exclusion (O7)
- ADR-007: one authoritative boma.baobab.band -> boma.wingu.me transition note (O18)
- new-host Part E: note ubongo is managed as sjat, ansible-user bootstrap pending (O15)

O9 (hosts.yml header) left open: the file is generator-owned (hook-protected);
fixing it needs a tf_to_inventory.py change or a tf-inventory run, not a hand-edit.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

2026-06-14 19:06:33 +02:00


ADR-004 — Docker and Compose service model

Status

Accepted (2026-05-30)

Context

All services run as Docker containers managed via Docker Compose. This document defines how services are structured, deployed, and maintained.

Core principles

No hand-edited files on hosts: all Compose files are rendered by Ansible from Jinja2 templates. If a file exists on a host, it was put there by Ansible.
Compose per service: each service (or tightly coupled service group) gets its own Compose file and directory under a standard path.
Variables drive differences: the same template renders differently per host via group_vars and host_vars. No host-specific templates.

Directory layout on hosts

/opt/services/
├── servicename/
│   ├── docker-compose.yml    # rendered by Ansible, never edited manually
│   ├── .env                  # rendered by Ansible from vault variables
│   └── data/                 # persistent volumes (bind mounts)
│       └── ...

All services live under /opt/services/. The path is defined in group_vars/all/vars.yml as services__base_dir.
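A minimal sketch of how these variables might be laid out, assuming the base path is declared once for all hosts. The `host_vars` file and the `servicename__http_port` variable are hypothetical and only illustrate how one template can render differently per host:

```yaml
---
# group_vars/all/vars.yml
# Base path named in this ADR; applies to every host.
services__base_dir: /opt/services

---
# host_vars/<host>/vars.yml (hypothetical file and variable)
# A per-host value consumed by the shared template, so no host-specific template is needed.
servicename__http_port: 8080
```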

Service-role standard

Every service has its own self-contained role — one service, one role. Shared roles serving multiple services are no longer used (see "Why not a shared engine" below). Each service role contains a standard set of files:

| File | Purpose |
| --- | --- |
| `tasks/main.yml` | The standard deploy mechanics (below) |
| `templates/docker-compose.yml.j2` | The Compose definition |
| `templates/env.j2` | `.env` rendered from vault variables |
| `defaults/main.yml` | Tuneables, `rolename__` namespace |
| `README.md` | Purpose, variables, usage (role convention) |
| `SECURITY.md` | Per-service security record; see ADR-002 and `docs/security/service-security-template.md` |
| `VERIFY.md` | Per-service UI acceptance spec; see ADR-008 Level 4 / ADR-017 and `docs/testing/service-verify-template.md` |
| `ACCESS.md` | Per-service operational-access record; see ADR-021 and `docs/access/service-access-template.md` |
| `BACKUP.md` | Per-service backup record; see ADR-022 and `docs/backup/service-backup-template.md` (a stateless service declares `backup__state: false` with a reason) |
| `meta/main.yml`, `molecule/default/` | Metadata + Debian 13 test scenario |
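
To illustrate the `rolename__` namespace, a `defaults/main.yml` for a role called `servicename` could look roughly like this. All three variable names and their values are hypothetical, chosen only to show the prefix convention:

```yaml
---
# roles/servicename/defaults/main.yml (illustrative sketch, not a file from the repository)
# Every tuneable carries the role name as a prefix, so variables from different roles cannot collide.
servicename__image: "example/servicename:stable"  # hypothetical image reference
servicename__pull: false                          # hypothetical toggle: pull images before up
servicename__http_port: 8080                      # hypothetical published port
```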

Standard deploy mechanics

Every service role's tasks/main.yml follows the same sequence, so all roles are uniform and predictable:

1. Create the /opt/services/<service>/ directory
2. Render docker-compose.yml from templates/docker-compose.yml.j2
3. Render .env from templates/env.j2 (secrets from vault variables)
4. Optionally run docker compose pull (controlled by a variable)
5. Run docker compose up -d --remove-orphans via ansible.builtin.command
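
The sequence above could be sketched as the following `tasks/main.yml`. This is an illustration under stated assumptions, not the repository's actual role: the service name `servicename`, the toggle `servicename__pull`, and the file modes are hypothetical, while `services__base_dir` is the variable named earlier in this ADR. The optional pull is placed before `up`, as the text describes.

```yaml
---
# roles/servicename/tasks/main.yml (illustrative sketch)
- name: Create the service directory
  ansible.builtin.file:
    path: "{{ services__base_dir }}/servicename"
    state: directory
    mode: "0750"   # assumed mode

- name: Render docker-compose.yml
  ansible.builtin.template:
    src: docker-compose.yml.j2
    dest: "{{ services__base_dir }}/servicename/docker-compose.yml"
    mode: "0640"   # assumed mode

- name: Render .env from vault variables
  ansible.builtin.template:
    src: env.j2
    dest: "{{ services__base_dir }}/servicename/.env"
    mode: "0600"   # assumed mode; the file holds secrets
  no_log: true     # keep secret values out of task output

- name: Pull images before up (optional)
  ansible.builtin.command:
    cmd: docker compose pull
    chdir: "{{ services__base_dir }}/servicename"
  when: servicename__pull | bool   # hypothetical toggle variable
  changed_when: true               # the command module cannot detect change itself

- name: Start or update the service
  ansible.builtin.command:
    cmd: docker compose up -d --remove-orphans
    chdir: "{{ services__base_dir }}/servicename"
  changed_when: true               # reported as changed on every run in this sketch
```

Because `ansible.builtin.command` has no notion of idempotence, this sketch simply reports both Compose steps as changed. A real role might instead parse the command output to set `changed_when` more precisely.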

Why not a shared engine

A shared compose_service engine role — service roles delegating the mechanics to one place — is intentionally not built. Duplicating the ~5 standard tasks per role is accepted in favour of legible, self-contained roles a reader can understand without indirection, and AI authorship makes the duplication cheap to generate uniformly from this standard.

Revisit trigger: extract a shared engine role if maintaining the duplicated mechanics across service roles becomes painful — a pattern change that means editing many roles, or drift between them that this standard alone isn't preventing.

Docker daemon configuration

Managed by the docker_host role. Key settings:

"log-driver": "json-file" with size limits (prevents disk exhaustion)
"iptables": false — firewall managed entirely by nftables (see ADR-002)
TCP socket disabled — Unix socket only (/var/run/docker.sock)
User namespace remapping: evaluated per use case, not enabled by default
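
As a rough sketch, these settings could be held as a variable and rendered to `/etc/docker/daemon.json`. The variable name `docker_host__daemon_config` and the concrete log limits (`10m`, `3`) are assumptions for illustration, not values taken from the `docker_host` role:

```yaml
---
# roles/docker_host/defaults/main.yml (sketch; variable name and limits are assumptions)
docker_host__daemon_config:
  log-driver: json-file
  log-opts:
    max-size: "10m"   # assumed size limit per log file
    max-file: "3"     # assumed number of rotated files
  iptables: false     # nftables owns the firewall (ADR-002)
  # No "hosts" entry is set, so no TCP listener is configured here.

---
# roles/docker_host/tasks/main.yml (sketch)
- name: Render /etc/docker/daemon.json
  ansible.builtin.copy:
    content: "{{ docker_host__daemon_config | to_nice_json }}\n"
    dest: /etc/docker/daemon.json
    owner: root
    group: root
    mode: "0644"
```

The Docker daemon has to be restarted for changes to this file to take effect. A handler for that is left out of the sketch.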

Networking

Each service Compose file defines its own named network(s)
Services that need to communicate join a shared named network, defined in a dedicated docker-compose.networks.yml on hosts where cross-service networking is needed
External port publishing is explicit and matches nftables rules
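
A Compose template following these rules might look like the sketch below. The service name `app`, the network names, the container port, and the variables `servicename__image` and `servicename__http_port` are all hypothetical:

```yaml
# roles/servicename/templates/docker-compose.yml.j2 (illustrative sketch)
services:
  app:
    image: "{{ servicename__image }}"
    networks:
      - servicename   # the service's own network
      - shared        # joined only if it must talk to other services
    ports:
      # Published explicitly; a matching nftables rule is expected for this port.
      - "{{ servicename__http_port }}:8080"

networks:
  servicename:
    name: servicename
  shared:
    external: true          # defined outside this file, not created by this project
    name: shared_services   # hypothetical name of the shared network
```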

Image management

Image pinning follows the tiered model in ADR-011: stateful services pin tag@digest (readable tag + integrity digest); stateless services use rolling tags (latest/stable), refreshed deliberately and watched by DIUN
Bare latest is therefore acceptable only on the stateless tier; the stateful tier is always pinned
Image updates are a deliberate operation: update the tag/digest variable, run deploy
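
The two tiers could be expressed as image variables along these lines. The variable names and the `example/dashboard` image are hypothetical, and the digest is deliberately left as a placeholder rather than a real value:

```yaml
---
# Stateful tier: readable tag plus integrity digest.
# <digest> is a placeholder for the image's real sha256 digest.
database__image: "postgres:17@sha256:<digest>"

# Stateless tier: rolling tag, refreshed deliberately and watched by DIUN.
dashboard__image: "example/dashboard:latest"
```

Under this layout, an update on the stateful tier is a one-line change to the tag and digest followed by a deploy run.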

Persistent data

Bind mounts preferred over named volumes for data that must be backed up
All bind mount paths are under /opt/services/<name>/data/
Backup strategy is defined in ADR-022 — the bind mounts under /opt/services/<name>/data/ are exactly the unit ADR-022's per-service backup__* contract (and BACKUP.md) captures
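
In a Compose template, a bind mount under the service's data directory could be written as in this sketch. The service name `app` and the container path `/var/lib/app` are hypothetical:

```yaml
# roles/servicename/templates/docker-compose.yml.j2 (illustrative sketch)
services:
  app:
    volumes:
      # Host path under data/, so the state to back up is a plain directory on the host.
      - "{{ services__base_dir }}/servicename/data/app:/var/lib/app"
```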

Decision

Docker Compose was chosen over Kubernetes/Swarm because:

Appropriate complexity level for 2–5 hosts with independent service sets
Compose files are human-readable and easily auditable
No distributed state to manage
Straightforward to back up and restore

Consequences

Drawn from the trade-offs and deferred items this ADR already states:

A shared compose_service engine role is intentionally not built: the ~5 standard tasks are duplicated per role in favour of legible, self-contained roles, with a stated revisit trigger — extract a shared engine if maintaining the duplicated mechanics becomes painful (a pattern change touching many roles, or drift this standard alone isn't preventing) (per "Why not a shared engine").
Forgoing Kubernetes/Swarm is the deliberate cost of matching complexity to a 2–5 host fleet with no distributed state to manage (per Decision).
User-namespace remapping is not enabled by default — evaluated per use case (per Docker daemon configuration).
Bare latest is acceptable only on the stateless tier; the stateful tier is always pinned tag@digest, and image updates are a deliberate operation (per Image management; ADR-011).
Backup strategy is defined in ADR-022 (not in this ADR); the persistent bind mounts under /opt/services/<name>/data/ are the unit ADR-022's per-service backup__* contract captures (per Persistent data).
