Dogfood of the new /kaizen command. 11 consumed, 1 kept open.
- SYSTEMATIZE → docs/testing/gotchas.md (apply:{tags} propagation, Molecule
tag-isolation testing, API/templating render-only gap); CLAUDE.md
(item['key'] loop convention, TF module required_providers); public_dns
README (Gandi null-MX workaround).
- CHANGE → extend the Stop hook to also guard the brainstorming spec-review gate
(verified: blocks the gate, passes meta-discussion).
- SYSTEMATIZE → make new-role scaffolds the access__/backup__ noqa reminder;
ADR-004 documents the cross-role-naming convention.
- ALREADY-BUILT/ACCEPTED → exec-menu guard verified firing; ADR-023; ADR-024;
subagent-faithfulness now embodied in the two-stage subagent review.
- KEEP-OPEN → a repo-scan.py check for ADRs that over-claim reconciliation.
Nudge: OVERDUE (13 signals) → ok (1). make lint + 16 friction-scan tests green.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
ADR-004 — Docker and Compose service model
Status
Accepted (2026-05-30)
Context
All services run as Docker containers managed via Docker Compose. This document defines how services are structured, deployed, and maintained.
Core principles
- No hand-edited files on hosts: all Compose files are rendered by Ansible from Jinja2 templates. If a file exists on a host, it was put there by Ansible.
- Compose per service: each service (or tightly coupled service group) gets its own Compose file and directory under a standard path.
- Variables drive differences: the same template renders differently per host via group_vars and host_vars. No host-specific templates.
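A minimal sketch of the "variables drive differences" principle, written as two YAML documents standing in for two variable files. The role name example_service, its port variable, and the host name host1 are hypothetical; only services__base_dir and its location come from this ADR.

```yaml
---
# group_vars/all/vars.yml: fleet-wide defaults
services__base_dir: /opt/services
example_service__http_port: 8080   # hypothetical role variable
---
# host_vars/host1/vars.yml: override for a single (hypothetical) host
example_service__http_port: 9090
```

The role's single docker-compose.yml.j2 would reference the variable, so host1 renders with 9090 and every other host with 8080, without a host-specific template.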
Directory layout on hosts
/opt/services/
├── servicename/
│ ├── docker-compose.yml # rendered by Ansible, never edited manually
│ ├── .env # rendered by Ansible from vault variables
│ └── data/ # persistent volumes (bind mounts)
│ └── ...
All services live under /opt/services/. The path is defined in
group_vars/all/vars.yml as services__base_dir.
Service-role standard
Every service has its own self-contained role — one service, one role. Shared roles serving multiple services are no longer used (see "Why not a shared engine" below). Each service role contains a standard set of files:
| File | Purpose |
|---|---|
| tasks/main.yml | The standard deploy mechanics (below) |
| templates/docker-compose.yml.j2 | The Compose definition |
| templates/env.j2 | .env rendered from vault variables |
| defaults/main.yml | Tuneables, rolename__ namespace |
| README.md | Purpose, variables, usage (role convention) |
| SECURITY.md | Per-service security record; see ADR-002 and docs/security/service-security-template.md |
| VERIFY.md | Per-service UI acceptance spec; see ADR-008 Level 4 / ADR-017 and docs/testing/service-verify-template.md |
| ACCESS.md | Per-service operational-access record; see ADR-021 and docs/access/service-access-template.md |
| BACKUP.md | Per-service backup record; see ADR-022 and docs/backup/service-backup-template.md (a stateless service declares backup__state: false with a reason) |
| meta/main.yml, molecule/default/ | Metadata + Debian 13 test scenario |
The access__* (ADR-021) and backup__* (ADR-022) data in defaults/main.yml are
cross-role conventions — shared field names that deliberately do not carry the
<rolename>__ prefix. ansible-lint's var-naming[no-role-prefix] has no per-prefix
allowlist, so each such line carries a trailing # noqa: var-naming[no-role-prefix] (the
rule stays enforced for genuinely role-scoped vars). make new-role scaffolds a reminder;
roles/reverse_proxy/defaults/main.yml is the reference.
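As a sketch of how the two kinds of variable might sit side by side in a role's defaults/main.yml: the role name example_service and its image-tag variable are hypothetical, and the field that carries the "reason" for a stateless service is omitted because its name is not given in this ADR.

```yaml
# roles/example_service/defaults/main.yml (hypothetical role)

# Role-scoped tuneable: carries the <rolename>__ prefix, so the rule applies and passes.
example_service__image_tag: "1.0"   # hypothetical variable

# Cross-role convention (ADR-022): deliberately unprefixed, exempted line by line.
backup__state: false  # noqa: var-naming[no-role-prefix]
```

Keeping the exemption on the individual line, rather than disabling the rule for the file, leaves any other unprefixed variable in the same file flagged by ansible-lint.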
Standard deploy mechanics
Every service role's tasks/main.yml follows the same sequence, so all roles are
uniform and predictable:
- Create /opt/services/<service>/ directory
- Render docker-compose.yml from templates/docker-compose.yml.j2
- Render .env from templates/env.j2 (secrets from vault variables)
- Run docker compose up -d --remove-orphans via ansible.builtin.command
- Optionally run docker compose pull before up (controlled by a variable)
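The sequence above could look roughly like the following tasks/main.yml. This is a sketch under stated assumptions, not the repository's actual role: the role name example_service, the toggle example_service__pull, the file modes, and the use of no_log are all illustrative choices.

```yaml
---
# roles/example_service/tasks/main.yml (sketch; example_service__* names are hypothetical)
- name: Create service directory
  ansible.builtin.file:
    path: "{{ services__base_dir }}/example_service"
    state: directory
    mode: "0750"   # assumed mode

- name: Render docker-compose.yml
  ansible.builtin.template:
    src: docker-compose.yml.j2
    dest: "{{ services__base_dir }}/example_service/docker-compose.yml"
    mode: "0640"   # assumed mode

- name: Render .env from vault variables
  ansible.builtin.template:
    src: env.j2
    dest: "{{ services__base_dir }}/example_service/.env"
    mode: "0600"   # assumed mode
  no_log: true     # keep rendered secrets out of task output

- name: Pull images before up (optional)
  ansible.builtin.command:
    cmd: docker compose pull
    chdir: "{{ services__base_dir }}/example_service"
  when: example_service__pull | bool
  changed_when: true   # simplification: always reported as changed

- name: Start the service
  ansible.builtin.command:
    cmd: docker compose up -d --remove-orphans
    chdir: "{{ services__base_dir }}/example_service"
  changed_when: true   # simplification: always reported as changed
```

The pull task is placed before the up task to match "pull before up"; a real role would likely derive changed_when from the command output instead of hard-coding it.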
Why not a shared engine
A shared compose_service engine role — service roles delegating the mechanics to
one place — is intentionally not built. Duplicating the ~5 standard tasks per
role is accepted in favour of legible, self-contained roles a reader can understand
without indirection, and AI authorship makes the duplication cheap to generate
uniformly from this standard.
Revisit trigger: extract a shared engine role if maintaining the duplicated mechanics across service roles becomes painful — a pattern change that means editing many roles, or drift between them that this standard alone isn't preventing.
Docker daemon configuration
Managed by the docker_host role. Key settings:
- "log-driver": "json-file" with size limits (prevents disk exhaustion)
- "iptables": false (firewall managed entirely by nftables; see ADR-002)
- TCP socket disabled: Unix socket only (/var/run/docker.sock)
- User namespace remapping: evaluated per use case, not enabled by default
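One way these settings might be expressed is as a role variable that a template turns into the daemon's JSON configuration. The variable name docker_host__daemon_config and the size values are assumptions; the keys themselves (log-driver, log-opts with max-size and max-file, iptables) are standard Docker daemon options.

```yaml
# roles/docker_host/defaults/main.yml (sketch; variable name and limits are assumptions)
docker_host__daemon_config:
  log-driver: json-file
  log-opts:
    max-size: "10m"   # illustrative limit; daemon expects string values here
    max-file: "3"     # illustrative limit
  iptables: false     # nftables owns the firewall (ADR-002)
  # No tcp:// listener is configured, so the daemon is reachable only
  # through its Unix socket. userns-remap is left unset by default.
```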
Networking
- Each service Compose file defines its own named network(s)
- Services that need to communicate are placed on a shared named network defined in a dedicated docker-compose.networks.yml (if cross-service networking is needed on a host)
- External port publishing is explicit and matches nftables rules
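A sketch of how a service template might declare these networks. The service name, the network names, the published port, and the choice to reference the shared network as external are all assumptions; this ADR does not state how service files attach to the network defined in docker-compose.networks.yml.

```yaml
# templates/docker-compose.yml.j2 (fragment; all names are hypothetical)
services:
  app:
    image: "{{ example_service__image }}"
    networks:
      - example_service   # the service's own named network
      - shared            # joined only when cross-service traffic is needed
    ports:
      - "8080:8080"       # explicit publish; needs a matching nftables rule

networks:
  example_service:
    name: example_service
  shared:
    external: true        # assumed to be created outside this file
```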
Image management
- Image pinning follows the tiered model in ADR-011: stateful services pin tag@digest (readable tag + integrity digest); stateless services use rolling tags (latest/stable), refreshed deliberately and watched by DIUN
- Bare latest is therefore acceptable only on the stateless tier; the stateful tier is always pinned
- Image updates are a deliberate operation: update the tag/digest variable, run deploy
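To illustrate the two tiers, the variables below show what a pinned and a rolling image reference might look like. The role names, variable names, and registry are hypothetical, and the digest is a placeholder, not a real value.

```yaml
# defaults/main.yml fragments for two hypothetical roles

# Stateful tier: readable tag plus integrity digest (placeholder, not a real digest)
example_db__image: "registry.example.org/db:1.2.3@sha256:<digest>"

# Stateless tier: rolling tag, refreshed deliberately and watched by DIUN
example_web__image: "registry.example.org/web:stable"
```

Updating a stateful service then means changing both the tag and the digest in the variable and running the deploy, so the change is visible in version control.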
Persistent data
- Bind mounts preferred over named volumes for data that must be backed up
- All bind mount paths are under /opt/services/<name>/data/
- Backup strategy is defined in ADR-022: the bind mounts under /opt/services/<name>/data/ are exactly the unit ADR-022's per-service backup__* contract (and BACKUP.md) captures
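A short sketch of a bind mount that follows this layout. The service name, subdirectory, and container path are hypothetical.

```yaml
# templates/docker-compose.yml.j2 (fragment; names and container path are hypothetical)
services:
  app:
    volumes:
      # Host path sits under the standard data directory that backups target
      - "{{ services__base_dir }}/example_service/data/app:/var/lib/app"
```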
Decision
Docker Compose was chosen over Kubernetes/Swarm because:
- Appropriate complexity level for 2–5 hosts with independent service sets
- Compose files are human-readable and easily auditable
- No distributed state to manage
- Straightforward to back up and restore
Consequences
Drawn from the trade-offs and deferred items this ADR already states:
- A shared compose_service engine role is intentionally not built: the ~5 standard tasks are duplicated per role in favour of legible, self-contained roles, with a stated revisit trigger: extract a shared engine if maintaining the duplicated mechanics becomes painful (a pattern change touching many roles, or drift this standard alone isn't preventing) (per "Why not a shared engine").
- Forgoing Kubernetes/Swarm is the deliberate cost of matching complexity to a 2–5 host fleet with no distributed state to manage (per Decision).
- User-namespace remapping is not enabled by default; it is evaluated per use case (per Docker daemon configuration).
- Bare latest is acceptable only on the stateless tier; the stateful tier is always pinned tag@digest, and image updates are a deliberate operation (per Image management; ADR-011).
- Backup strategy is defined in ADR-022 (not in this ADR); the persistent bind mounts under /opt/services/<name>/data/ are the unit ADR-022's per-service backup__* contract captures (per Persistent data).