feat(reverse_proxy): optional ACME DNS-01 via Gandi (wildcard / LAN-only)

Adds a per-instance DNS-01 mode to the Caddy role for mesh/LAN-only hosts that
cannot satisfy HTTP-01. Default behaviour (vanilla caddy:2 + HTTP-01, what askari
runs) is unchanged.

  - reverse_proxy__acme_dns_provider: "" (HTTP-01) | "gandi" (DNS-01)
  - reverse_proxy__image: override to the custom caddy-gandi image for DNS-01
  - Caddyfile gains a global `acme_dns gandi {env.GANDI_BEARER_TOKEN}` block
  - the PAT (vault.gandi.pat) renders into a host-only 0600 env file (no_log),
    loaded by compose only when DNS-01 is enabled

Verified: the custom image issues a real wildcard cert (*.dns01test.wingu.me)
end-to-end against LE staging via Gandi DNS-01; `caddy validate` accepts
`acme_dns gandi` on the custom image and rejects it on vanilla caddy:2. Molecule
(HTTP-01 default path) green.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
This commit is contained in:
sjat 2026-06-15 06:57:47 +02:00
parent d407aeabb2
commit 6e38693499
6 changed files with 67 additions and 12 deletions

View file

@ -1,18 +1,34 @@
# reverse_proxy # reverse_proxy
Boma's standard Caddy reverse proxy (ADR-024). Runs on `askari` (the off-site Boma's standard Caddy reverse proxy (ADR-024). Runs on `askari` (the off-site
Hetzner host) and terminates TLS for all public-facing services via ACME HTTP-01. Hetzner host) and terminates TLS for services. It supports **two ACME challenge
Uses the official `caddy:2` image — no custom build, no DNS plugin, no token required. types**, chosen per proxy instance by exposure:
- **HTTP-01 (default)** — public hosts with an A-record (askari). Official `caddy:2`
image; no plugin, no token.
- **DNS-01 via Gandi** — mesh/LAN-only hosts with **no** public A-record (the cluster),
where HTTP-01 is impossible. Needs the custom `caddy-gandi` image and the Gandi PAT.
## How TLS works ## How TLS works
Caddy obtains per-hostname certificates using the ACME HTTP-01 challenge. Port 80 **HTTP-01 (default).** Caddy obtains per-hostname certificates using the ACME HTTP-01
must be reachable from the internet for the challenge to succeed. Each `host` in challenge. Port 80 must be reachable from the internet. Each `host` in
`reverse_proxy__routes` gets its own certificate automatically. `reverse_proxy__routes` gets its own certificate automatically.
> **DNS-01 (for mesh/LAN-only cluster services) is deferred to Phase 2.** The **DNS-01 (Gandi).** Set `reverse_proxy__acme_dns_provider: gandi` and point
> `caddy-dns/gandi` plugin failed to issue certificates during M4a and needs `reverse_proxy__image` at the custom Caddy image (`make caddy-image`, built on ubongo
> investigation before it can be used. and pushed to the Forgejo registry — see `.docker/caddy-gandi/`). Caddy then proves
domain control by writing ACME TXT records through the Gandi LiveDNS API, so it can
issue certs — **including wildcards** — for hosts that are never publicly reachable.
The token (`vault.gandi.pat`) is injected as `GANDI_BEARER_TOKEN` via a host-only
`env` file (mode 0600) and sent as a **Bearer PAT** (the legacy Apikey scheme is gone).
> **Verified (2026-06-15):** the custom image issues a real wildcard cert
> (`*.dns01test.wingu.me`) end-to-end against Let's Encrypt staging via Gandi DNS-01;
> `caddy validate` accepts the `acme_dns gandi` directive on the custom image and
> rejects it on vanilla `caddy:2` (`module not registered: dns.providers.gandi`). The
> original M4a failure was version skew (a pre-Bearer `libdns/gandi` that sent the
> deprecated Apikey header) plus building the image on a Hetzner IP (Go proxy 403).
## Route catalog — `reverse_proxy__routes` ## Route catalog — `reverse_proxy__routes`
@ -46,6 +62,8 @@ Use `upstream` to proxy to a Docker service, or `respond` to return a static str
| `reverse_proxy__base_dir` | `/opt/services/reverse_proxy` | Working directory for Compose project | | `reverse_proxy__base_dir` | `/opt/services/reverse_proxy` | Working directory for Compose project |
| `reverse_proxy__acme_email` | `admin@example.test` | ACME registration email | | `reverse_proxy__acme_email` | `admin@example.test` | ACME registration email |
| `reverse_proxy__routes` | `[]` | List of `{host, upstream}` or `{host, respond}` entries | | `reverse_proxy__routes` | `[]` | List of `{host, upstream}` or `{host, respond}` entries |
| `reverse_proxy__image` | `caddy:2` | Container image. DNS-01 hosts override to the custom `caddy-gandi` image |
| `reverse_proxy__acme_dns_provider` | `""` | `""` = HTTP-01; `"gandi"` = ACME DNS-01 via the Gandi PAT |
| `reverse_proxy__manage` | `true` | Set `false` in Molecule to skip Docker tasks | | `reverse_proxy__manage` | `true` | Set `false` in Molecule to skip Docker tasks |
Production overrides live in Production overrides live in
@ -59,4 +77,7 @@ creation) without a Docker daemon.
## Secrets ## Secrets
None. HTTP-01 requires no credentials. - **HTTP-01 (default):** none — the challenge requires no credentials.
- **DNS-01 (`reverse_proxy__acme_dns_provider: gandi`):** the Gandi PAT
(`vault.gandi.pat`, the same token `public_dns` uses). Rendered host-side into
`{{ reverse_proxy__base_dir }}/env` (mode 0600, `no_log`); never committed.

View file

@ -1,10 +1,19 @@
--- ---
# Caddy reverse proxy (ADR-024). Vanilla Caddy; TLS via ACME HTTP-01 (public hosts). # Caddy reverse proxy (ADR-024).
reverse_proxy__base_dir: /opt/services/reverse_proxy reverse_proxy__base_dir: /opt/services/reverse_proxy
reverse_proxy__acme_email: admin@example.test reverse_proxy__acme_email: admin@example.test
reverse_proxy__routes: [] # each: {host: x, upstream: "svc:port"} OR {host: x, respond: "text"} reverse_proxy__routes: [] # each: {host: x, upstream: "svc:port"} OR {host: x, respond: "text"}
reverse_proxy__manage: true # set false in Molecule to render without Docker reverse_proxy__manage: true # set false in Molecule to render without Docker
# ACME challenge type (ADR-024). Default is HTTP-01 with the vanilla upstream image —
# correct for PUBLIC hosts with an A-record (askari). For mesh/LAN-only hosts with NO
# public A-record (the cluster), HTTP-01 is impossible: set reverse_proxy__acme_dns_provider
# to "gandi" AND point reverse_proxy__image at the custom Caddy+Gandi image to issue certs
# (incl. wildcards) via Gandi DNS-01. The token is vault.gandi.pat (sent as a Bearer PAT;
# the legacy Apikey scheme is gone). Build the image with `make caddy-image` (on ubongo).
reverse_proxy__image: "caddy:2" # DNS-01 hosts override -> the caddy-gandi registry image
reverse_proxy__acme_dns_provider: "" # "" = HTTP-01; "gandi" = ACME DNS-01 via Gandi PAT
# access__*/backup__* are the ADR-021/022 CROSS-ROLE conventions — shared field names that # access__*/backup__* are the ADR-021/022 CROSS-ROLE conventions — shared field names that
# render ACCESS.md/BACKUP.md and drive /check-access · /check-backup. They intentionally do # render ACCESS.md/BACKUP.md and drive /check-access · /check-backup. They intentionally do
# NOT carry the reverse_proxy__ prefix, so each is marked `# noqa: var-naming[no-role-prefix]` # NOT carry the reverse_proxy__ prefix, so each is marked `# noqa: var-naming[no-role-prefix]`
@ -22,7 +31,7 @@ access__api: # noqa: var-naming[no-role-prefix]
reason: "Caddy admin API bound to container localhost :2019; never exposed (ADR-020 catalog owns ports)" reason: "Caddy admin API bound to container localhost :2019; never exposed (ADR-020 catalog owns ports)"
# Backup contract (ADR-022). Stateless: Caddy's /data holds only ACME account keys + # Backup contract (ADR-022). Stateless: Caddy's /data holds only ACME account keys +
# issued certs, which are re-requested automatically on restart via HTTP-01 (no manual # issued certs, which are re-requested automatically on restart via ACME (HTTP-01 or
# steps). Residual risk: Let's Encrypt rate limits on rapid repeated re-issuance. # DNS-01; no manual steps). Residual risk: Let's Encrypt rate limits on rapid re-issuance.
backup__service: reverse_proxy # noqa: var-naming[no-role-prefix] backup__service: reverse_proxy # noqa: var-naming[no-role-prefix]
backup__state: false # noqa: var-naming[no-role-prefix] backup__state: false # noqa: var-naming[no-role-prefix]

View file

@ -14,6 +14,16 @@
notify: reload caddy notify: reload caddy
tags: [config] tags: [config]
- name: Render the Gandi DNS-01 token env file
ansible.builtin.template:
src: env.j2
dest: "{{ reverse_proxy__base_dir }}/env"
mode: "0600"
no_log: true # contains the Gandi PAT
when: reverse_proxy__acme_dns_provider == 'gandi'
notify: reload caddy
tags: [config]
- name: Render the compose file - name: Render the compose file
ansible.builtin.template: ansible.builtin.template:
src: docker-compose.yml.j2 src: docker-compose.yml.j2

View file

@ -1,6 +1,12 @@
# {{ ansible_managed }} # {{ ansible_managed }}
{ {
email {{ reverse_proxy__acme_email }} email {{ reverse_proxy__acme_email }}
{% if reverse_proxy__acme_dns_provider == 'gandi' %}
# ACME DNS-01 via Gandi (mesh/LAN-only hosts, incl. wildcard certs). Token is the
# Gandi PAT, injected from the env file as a Bearer token (ADR-024). Needs the custom
# caddy-gandi image — the upstream caddy:2 has no DNS provider modules.
acme_dns gandi {env.GANDI_BEARER_TOKEN}
{% endif %}
} }
{% for r in reverse_proxy__routes %} {% for r in reverse_proxy__routes %}
{{ r.host }} { {{ r.host }} {

View file

@ -1,12 +1,16 @@
# {{ ansible_managed }} # {{ ansible_managed }}
services: services:
caddy: caddy:
image: caddy:2 image: {{ reverse_proxy__image }}
container_name: caddy container_name: caddy
restart: unless-stopped restart: unless-stopped
ports: ports:
- "80:80" - "80:80"
- "443:443" - "443:443"
{% if reverse_proxy__acme_dns_provider == 'gandi' %}
env_file:
- ./env
{% endif %}
volumes: volumes:
- ./Caddyfile:/etc/caddy/Caddyfile:ro - ./Caddyfile:/etc/caddy/Caddyfile:ro
- caddy_data:/data - caddy_data:/data

View file

@ -0,0 +1,5 @@
# {{ ansible_managed }}
# Gandi Personal Access Token for ACME DNS-01 (rendered only when
# reverse_proxy__acme_dns_provider == 'gandi'). Sent by caddy-dns/gandi as a Bearer
# token to https://api.gandi.net/v5/livedns. Host-only, mode 0600 (ADR-024).
GANDI_BEARER_TOKEN={{ vault.gandi.pat }}