From d10f6de84b6c9dfc90bc479ba81044898688e1fb Mon Sep 17 00:00:00 2001
From: sjat
Date: Sun, 14 Jun 2026 17:28:42 +0200
Subject: [PATCH] =?UTF-8?q?docs(adr):=20ADR-024=20=E2=80=94=20Caddy=20is?=
 =?UTF-8?q?=20boma's=20reverse=20proxy?=
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Adds ADR-024 pinning Caddy (xcaddy + caddy-dns/gandi) as boma's reverse proxy,
superseding the soft Traefik assumption in the roadmap and ADR-017 prose.
Updates CLAUDE.md Further reading table and ROADMAP.md Phase-2 step 5.

Co-Authored-By: Claude Opus 4.8 (1M context)
---
 CLAUDE.md                           |   1 +
 docs/ROADMAP.md                     |   4 +-
 docs/decisions/024-reverse-proxy.md | 107 ++++++++++++++++++++++++++++
 3 files changed, 110 insertions(+), 2 deletions(-)
 create mode 100644 docs/decisions/024-reverse-proxy.md

diff --git a/CLAUDE.md b/CLAUDE.md
index b8e457f..5616225 100644
--- a/CLAUDE.md
+++ b/CLAUDE.md
@@ -249,6 +249,7 @@ Single-contributor, trunk-based (no merge requests / approval gates):
 | Operational access | `docs/decisions/021-operational-access.md` |
 | Backup & disaster recovery | `docs/decisions/022-backup.md` |
 | ADR structure & lifecycle | `docs/decisions/023-adr-structure.md` |
+| Reverse proxy (Caddy) | `docs/decisions/024-reverse-proxy.md` |
 | Adding a new role | `docs/runbooks/new-role.md` |
 | Adding a new host | `docs/runbooks/new-host.md` |
 | Rotating vault secrets | `docs/runbooks/rotate-secrets.md` |
diff --git a/docs/ROADMAP.md b/docs/ROADMAP.md
index d7cb860..e2afb0b 100644
--- a/docs/ROADMAP.md
+++ b/docs/ROADMAP.md
@@ -156,8 +156,8 @@ Canonical dependency order:
 3. **`docker_host`** — real Docker engine + Compose, daemon hardening, `nftables.d`
    container rules (currently a scaffold; ADR-004, ADR-020).
 4. **`dns` role** — render the internal zone from inventory (ADR-007).
-5. **Auth + reverse proxy** — Authentik + Traefik: the foundation every service sits
-   behind with authentication (ADR-002).
+5. **Auth + reverse proxy** — Authentik + **Caddy** (ADR-024): the foundation every
+   service sits behind with authentication (ADR-002).
 6. **Monitoring** — Loki + Grafana Alloy (logging, ADR-018) + Prometheus/exporters +
    Uptime Kuma; decide which alerts live where (TODO 3.6).
 7. **Service roles** — PhotoPrism, email, indexers, … (`docs/CAPABILITIES.md`); each
diff --git a/docs/decisions/024-reverse-proxy.md b/docs/decisions/024-reverse-proxy.md
new file mode 100644
index 0000000..28379d2
--- /dev/null
+++ b/docs/decisions/024-reverse-proxy.md
@@ -0,0 +1,107 @@
+# ADR-024 — Reverse proxy: Caddy with ACME DNS-01 (Gandi)
+
+## Status
+
+Accepted (2026-06-14). Amends the soft Traefik assumption carried by the roadmap
+(Phase-2 step 5) and ADR-017 prose; those are updated to read "Caddy (ADR-024)".
+
+## Context
+
+boma needs a reverse proxy to front its services with TLS. ADR-002 requires every
+service to sit behind a proxy with authentication before it is reachable; ADR-007/M1
+delivers a `*.boma.` wildcard cert via ACME DNS-01 against Gandi — the only
+viable cert path for mesh/LAN-only services that cannot satisfy HTTP-01 (no public
+A-record to point at).
+
+The roadmap (Phase-2, step 5) and ADR-017 prose assumed **Traefik + Authentik** as the
+auth-and-proxy pair without an ADR ever pinning Traefik. On closer inspection:
+
+- Traefik's headline feature is **dynamic Docker-label discovery** — it discovers and
+  routes services automatically from container labels without any static config.
+- boma already renders *all* config from Ansible templates and the `group_vars` catalog
+  (ADR-004). That makes dynamic label discovery a disadvantage: a service that is not in
+  the catalog does not exist (CLAUDE.md), so any route that Traefik auto-discovers
+  outside the catalog would be unaudited.
+- The first reverse-proxy instance is needed on `askari` for M4 (NetBird), a host where
+  `docker_hosts` patterns are being established under off-site/VPS constraints, not a
+  full Proxmox cluster with many services.
+
+No production investment in Traefik config has been made; the decision can be made
+cleanly here.
+
+## Decision
+
+boma's reverse proxy is **Caddy**.
+
+### 1. Rationale for Caddy over Traefik
+
+1. Traefik's dynamic label discovery is wasted — boma renders config from the catalog;
+   Caddy's static Caddyfile maps naturally to "render from templates" (ADR-004).
+2. Caddy's Caddyfile is simple to template with `ansible.builtin.template`; one file,
+   one `ansible_managed` header, no side-channel label state.
+3. **Automatic HTTPS** via ACME DNS-01: the `caddy-dns/gandi` plugin satisfies the
+   Gandi DNS-01 challenge, which is the only cert path for services with no public
+   A-record (ADR-007/M1 wildcard strategy).
+4. Far simpler for a solo operator: no dashboard-as-a-service, no routing-rule DSL,
+   no dynamic config files to reconcile.
+5. `forward_auth` to Authentik is a first-class Caddy directive — the planned
+   Authentik auth story (ADR-002) is preserved without Traefik as the middleman.
+
+### 2. Custom image
+
+Caddy's official Docker image does not include third-party DNS plugins. The `caddy-dns/gandi`
+plugin must be compiled in via `xcaddy`. boma builds a custom image:
+
+```
+FROM caddy:builder AS builder
+RUN xcaddy build --with github.com/caddy-dns/gandi
+
+FROM caddy:latest
+COPY --from=builder /usr/bin/caddy /usr/bin/caddy
+```
+
+This image is maintained as a boma artifact (Forgejo registry, pinned digest in the
+Compose template). It is the cost of the Gandi DNS-01 path — unavoidable regardless of
+proxy choice.
+
+### 3. Deployment scope
+
+The first Caddy instance fronts the NetBird stack on `askari` (M4). The pattern
+generalises to the Proxmox cluster in Phase 2 when services multiply.
+
+### 4. Authentik integration (deferred)
+
+`forward_auth` to Authentik is deferred to Phase 2 (when Authentik is deployed on the
+cluster). The Caddyfile template will carry a placeholder comment. No Traefik-Authentik
+middleware migration is required.
+
+## Consequences
+
+- **Roadmap Phase-2 step 5** is updated from "Authentik + Traefik" to "Authentik +
+  Caddy (ADR-024)".
+- **ADR-017 prose** that mentioned Traefik is updated to read "Caddy (ADR-024)".
+- A custom Caddy image (`xcaddy` + `caddy-dns/gandi`) must be built, pushed to the
+  Forgejo registry, and kept current (plugin + base image updates).
+- Caddyfile config is rendered by Ansible from `group_vars` — consistent with ADR-004
+  and easier to review than distributed container labels.
+- `forward_auth` to Authentik is available when Authentik is deployed; no extra
+  middleware layer required.
+- The `proxy` concern tag (already in `tests/tags.yml`) covers Caddy config tasks.
+
+## What was ruled out
+
+- **Traefik** — dynamic label discovery is a mismatch for boma's catalog-rendered
+  config model (ADR-004); more complex for a solo operator; no prior investment to
+  protect.
+- **nginx / HAProxy** — no built-in ACME; require a separate ACME client (certbot,
+  acme.sh) adding operational surface; Caddy's integrated ACME is simpler.
+- **NetBird's bundled TLS** — NetBird's management UI can serve its own TLS, but that
+  doesn't generalise; a real proxy separates concerns and applies to every service.
+
+## Related
+
+- ADR-002 — services behind a proxy with authentication (the requirement this satisfies).
+- ADR-004 — Docker & Compose model (template-rendered config, catalog-driven).
+- ADR-007 / M1 — Gandi DNS-01 ACME path (the TLS strategy Caddy implements).
+- ADR-016 — NetBird (M4 is the first deployment of this proxy).
+- ADR-017 — service-UI verification; forward_auth to Authentik is the future auth story.
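
Note (not part of the patch): the "render from the catalog" model in Decision §1 can be illustrated with a small sketch. The ADR says the real rendering is done by `ansible.builtin.template` from `group_vars`; the Python below is only a stand-in to show the shape of the output, not boma's template. The zone `example.internal`, the `CATALOG` structure, the upstream `netbird-dashboard:80`, and the `GANDI_BEARER_TOKEN` environment variable name are all assumptions made for illustration.

```python
# Hypothetical catalog standing in for the group_vars service catalog.
CATALOG = {
    "netbird": {"upstream": "netbird-dashboard:80"},
}


def render_caddyfile(zone: str, catalog: dict[str, dict[str, str]]) -> str:
    """Render one wildcard site block with one route per catalogued service."""
    lines = [
        f"*.{zone} {{",
        "\ttls {",
        "\t\t# DNS-01 via the caddy-dns/gandi plugin; the env var name is an assumption.",
        "\t\tdns gandi {env.GANDI_BEARER_TOKEN}",
        "\t}",
    ]
    for name in sorted(catalog):
        upstream = catalog[name]["upstream"]
        lines += [
            f"\t@{name} host {name}.{zone}",
            f"\thandle @{name} {{",
            "\t\t# Placeholder: forward_auth to Authentik once it is deployed (Phase 2).",
            f"\t\treverse_proxy {upstream}",
            "\t}",
        ]
    # Hosts that are not in the catalog are refused rather than routed.
    lines += ["\thandle {", "\t\tabort", "\t}", "}"]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(render_caddyfile("example.internal", CATALOG))
```

Because routes are generated only from catalog entries, a service missing from the catalog gets no `reverse_proxy` line at all, which is the audit property the ADR contrasts with label discovery.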