boma/docs/superpowers/specs/2026-06-11-public-dns-gandi-migration-design.md
sjat f7fac5f5e3 docs(spec): M1 — finalize for wingu.me (greenfield), record Gandi-defaults purge
boma's domain is wingu.me (registered at Gandi; 'wingu' = Swahili for cloud).
Replace the parametric <boma-domain> placeholder with wingu.me throughout. The
zone was NOT empty — Gandi auto-seeded 13 default records (parking A, www redirect,
a full Gandi mailbox set), so M1 includes a one-time purge to a clean baseline plus
an anti-spoof null-mail set (null MX, SPF -all, DMARC reject) since wingu.me sends
no mail. Domain-pick open item closed.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-14 09:14:10 +02:00

Design — boma's DNS home: a new domain at Gandi (DNS-as-code)

  • Date: 2026-06-11 · Revised: 2026-06-12 (Option B — boma gets its own new domain; supersedes this spec's original "migrate baobab.band off Cloudflare" framing)
  • Status: Draft for review — design settled in brainstorming; pending user review, then implementation plan
  • Roadmap milestone: M1 (docs/ROADMAP.md)
  • Resolves: TODO 4 (split-horizon FQDN — with/without nyumbani); review finding O12
  • Amends: ADR-007 — boma's public zone is a new domain at Gandi LiveDNS, managed as code; the three-tier naming scheme; nyumbani removed; mesh/LAN-only default
  • Becomes: an ADR-007 amendment (no new ADR unless public_dns grows its own concerns)

Problem

boma needs a DNS home. Investigating the obvious candidates ruled them out as boma's home:

  • baobab.band is the live legacy homelab (on Cloudflare): vaultwarden, nextcloud, matrix/element, collabora, ntfy, radio, … in daily use, much of it riding *.baobab.band / *.nyumbani.baobab.band wildcards. Moving its authoritative DNS risks breaking production.
  • ziethen.dk is the family's primary email (Fastmail). Moving a live email domain's DNS is the highest-stakes DNS operation there is, which makes it a worse candidate than baobab.band, not a better one.

Decision: register a NEW Swahili-themed domain at Gandi for boma. Greenfield, zero-risk, born at Gandi — so it satisfies the DNS-as-code + sovereignty goal natively with no migration at all. The existing domains are decoupled: baobab.band's Cloudflare exit / V4 decommission is a separate, later track (handled when boma replaces what it hosts), and ziethen.dk is untouched.

boma's domain is wingu.me (registered at Gandi 2026-06-14; wingu = Swahili for cloud). The public_dns role keeps it as a variable (public_dns__domain) so it stays swappable.

Starting state (verified 2026-06-14): Gandi auto-seeded the zone with 13 default records — apex parking A, www web-redirect, and a full Gandi mailbox set (MX, SPF, three *._domainkey DKIM CNAMEs, webmail, IMAP/POP/submission SRV). None are boma's; wingu.me sends no mail (email stays at ziethen.dk). See the setup sequence for the one-time purge + anti-spoof baseline.

Decisions (as settled)

  1. New domain, registered at Gandi. No transfer, no migration, no Cloudflare/Fastmail entanglement. (Human registers + pays — see division of labour.)
  2. Three-tier naming scheme (re-homed to wingu.me) — see table. nyumbani dropped.
  3. Mesh/LAN-only by default. Home/cluster services have no public record; reached over LAN or the NetBird mesh. Public Gandi records only for deliberate exceptions.
  4. DNS-as-code via a control-node public_dns role driven by record data in group_vars (same pattern as the firewall catalog). Name is provider-agnostic.
  5. Tooling: community.general.gandi_livedns with personal_access_token (PAT). Re-adds community.general to requirements.yml (collections-on-demand; a committed role uses gandi_livedns), pinned >=9.0.0, with the naming comment.
  6. Cert scope: DNS + PAT only. M1 ends at the zone + PAT in vault, which enables ACME DNS-01 later. No cert issuance in M1 (reverse proxy → askari M4 / home Phase 2).
  7. Human/agent division of labour (see table) — register + pay + PAT are human; all record/IaC work is the agent's, from ubongo.
  8. Explicitly out of scope: baobab.band (and its Cloudflare exit / V4 decommission) and ziethen.dk — separate later tracks.
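
Decision 5 could translate into a requirements.yml entry along these lines. This is a minimal sketch: the comment wording is an assumption (the spec only says a comment accompanies the pin), and any other entries already in the file are omitted.

```yaml
# requirements.yml (sketch; other entries omitted)
collections:
  # community.general: required by the public_dns role (gandi_livedns).
  # 9.0.0 is the release that added personal_access_token.
  - name: community.general
    version: ">=9.0.0"
```

The entry would be installed with `ansible-galaxy collection install -r requirements.yml`.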

Verified facts (ADR-014)

verified: community.general.gandi_livedns requires personal_access_token (PAT); api_key is deprecated and rejected by Gandi (Bearer auth replaced Apikey) · WebFetch docs.ansible.com + WebSearch (Gandi PAT announcement 2023-09; community.general issue #7926) · PAT param added in community.general 9.0.0, 13.0.1 current · 2026-06-11

  • Module params: domain, record, type, values (list), ttl, state (present/absent). Supports check mode + diff.
  • Auth is per-task: personal_access_token: "{{ vault.gandi.pat }}".
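
To make the parameter list concrete, a single record could be declared as below. This is an illustrative sketch only, using the null MX from the baseline; the task name and the use of delegate_to are assumptions about how the role will be written.

```yaml
- name: Publish the null MX for wingu.me (illustrative single record)
  community.general.gandi_livedns:
    domain: wingu.me
    record: "@"
    type: MX
    values:
      - "0 ."
    ttl: 3600
    state: present
    # Auth is passed on every task; there is no session or provider block.
    personal_access_token: "{{ vault.gandi.pat }}"
  delegate_to: localhost
```

Run with `--check --diff` first, since the module supports both.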

Naming scheme (the convention)

| Tier | Pattern | Authoritative source | Public? |
|---|---|---|---|
| Infrastructure / hosts | <host>.boma.wingu.me | internal zone (dns1/dns2, Phase 2) | never |
| Home / cluster services | <service>.wingu.me | internal zone (split-horizon) | only deliberate exceptions |
| Off-site / VPS services | <service>.askari.wingu.me | Gandi LiveDNS | yes (askari has a stable public IP) |

  • nyumbani removed — home is the default; only the exception (askari) needs naming.
  • The mesh carries "internal" to road-warriors. NetBird pushes dns1/dns2 (over wt0) as the resolver for the wingu.me match-domain, so a client on the LAN or on the mesh resolves names from the internal zone, while everyone else sees only what is published at Gandi (ties M1 ↔ ADR-016 / M5).
  • Wildcard TLS later. *.wingu.me ACME DNS-01 (Gandi PAT) gives even unexposed services real TLS without a public A record. Enabled by M1, issued in M4/Phase 2.

Architecture — two deliverables

(A) One-time setup — a short runbook (docs/runbooks/)

Greenfield, so this is small and low-risk (contrast the abandoned migration framing): register the domain, create the LiveDNS zone, issue the PAT. No transfer, no live-zone cutover.

(B) public_dns — the reusable IaC role

  • Runs from the control node (delegate_to: localhost, or a dns.yml play targeting control) against the Gandi LiveDNS API — no managed host, only API calls.
  • Reconciles records from group_vars data via community.general.gandi_livedns, PAT from vault.gandi.pat. Check-mode/diff first, always.

Data model (sketch)

```yaml
# inventories/production/group_vars/all/public_dns.yml
public_dns__domain: "wingu.me"
public_dns__records:
  # Anti-spoof baseline for a no-mail domain (replaces Gandi's seeded mail set):
  - { record: "@",     type: MX,  values: ["0 ."],                   ttl: 3600 }
  - { record: "@",     type: TXT, values: ['"v=spf1 -all"'],         ttl: 3600 }
  - { record: _dmarc,  type: TXT, values: ['"v=DMARC1; p=reject;"'], ttl: 3600 }
  # Service records appear as public-tier needs arise; near-empty at M1.
  # askari / NetBird records land in M4, e.g.:
  # - { record: askari, type: A, values: ["<hetzner-ip>"], ttl: 1800 }
  # mesh/LAN-only services are intentionally ABSENT: internal zone only.
# PAT referenced as {{ vault.gandi.pat }} (nested vault.<service>.<key>, CLAUDE.md).
```
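
A role task consuming this data might look like the following. It is a sketch under stated assumptions, not the final role: the file path follows the standard role layout, and the optional per-record `state` key, the default TTL, and the loop label are hypothetical additions that the data model above does not define.

```yaml
# roles/public_dns/tasks/main.yml (sketch)
- name: Reconcile declared records at Gandi LiveDNS
  community.general.gandi_livedns:
    domain: "{{ public_dns__domain }}"
    record: "{{ item.record }}"
    type: "{{ item.type }}"
    # Bracket access on purpose: item.values would resolve to the dict's
    # built-in values() method in Jinja2, not to the "values" key.
    values: "{{ item['values'] | default(omit) }}"
    ttl: "{{ item.ttl | default(3600) }}"
    # Hypothetical optional key; records default to present.
    state: "{{ item.state | default('present') }}"
    personal_access_token: "{{ vault.gandi.pat }}"
  loop: "{{ public_dns__records }}"
  loop_control:
    label: "{{ item.record }} {{ item.type }}"
  delegate_to: localhost
  run_once: true
```

Because the module works per record, this loop only touches what is declared; anything else in the zone is left alone.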

Open design nuance — additive vs authoritative

gandi_livedns is per-record (present/absent), not whole-zone sync. Gandi seeded wingu.me with 13 default records (above), so M1 needs a one-time purge of those to a clean baseline (declare them state: absent, or a one-shot scripted delete), then manage additively. Full-zone authoritative sync (GET existing → remove undeclared — the proper end-state, and TODO 8.3's prune question) is flagged as a later enhancement.
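
The `state: absent` route for the one-time purge could be sketched as below. The record names and types are assumptions, not values taken from the live zone: the apex A, www and webmail entries are guesses at how Gandi names its defaults, and the DKIM and SRV entries are left as placeholders. List the zone first and fill in the real names before running. The apex MX and SPF TXT are deliberately not listed, on the understanding that declaring the baseline records as present overwrites the existing values for the same name and type; that behaviour should be confirmed with a check-mode diff.

```yaml
- name: Remove Gandi's seeded defaults (one-time purge, sketch)
  community.general.gandi_livedns:
    domain: "{{ public_dns__domain }}"
    record: "{{ item.record }}"
    type: "{{ item.type }}"
    state: absent            # values are only needed for state: present
    personal_access_token: "{{ vault.gandi.pat }}"
  loop: "{{ seeded_defaults }}"
  delegate_to: localhost
  vars:
    # ASSUMED names/types; verify every entry against the live zone.
    seeded_defaults:
      - { record: "@",      type: A }       # parking page
      - { record: www,      type: CNAME }   # web redirect (type assumed)
      - { record: webmail,  type: CNAME }   # type assumed
      - { record: "<selector>._domainkey", type: CNAME }  # placeholder, x3
      - { record: "<_service._tcp>",       type: SRV }    # placeholder, IMAP/POP/submission
```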

Setup sequence (the runbook)

Legend: [H] human · [A] agent (from ubongo, committed code + check-mode).

  1. [H] Register wingu.me at Gandi; pay. [H] Issue a LiveDNS-scoped PAT for it; store in vault (vault.gandi.pat) via rbw.
  2. [A] Author the public_dns role + public_dns__records data (incl. the anti-spoof baseline); add community.general to requirements.yml (≥9.0.0, with comment); commit.
  3. [A] One-time: purge Gandi's 13 seeded defaults (parking A, www redirect, Gandi mail MX/SPF/DKIM/webmail/SRV) down to the boma baseline.
  4. [A] make check (diff vs live Gandi) → make deploy to load records → dig verify. Re-run make deploy to confirm idempotence.
  5. Thereafter the zone is reconciled as code; M4 adds the askari/NetBird records.

No registrar transfer, no nameserver flip of a live zone, no service-preservation, no Forgejo rename — all of that belonged to the abandoned baobab.band framing.
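
The dig verification in step 4 could be scripted rather than run by hand. The sketch below assumes the `dig` binary is installed on ubongo and that substring matching on the `+short` output is an adequate check; both are assumptions, and resolver caching may delay the expected answers right after the first deploy.

```yaml
- name: Verify the anti-spoof baseline resolves publicly (sketch)
  ansible.builtin.command:
    argv:
      - dig
      - "+short"
      - "{{ item.name }}"
      - "{{ item.type }}"
  register: dig_result
  changed_when: false        # read-only query
  check_mode: false          # safe to run even under --check
  failed_when: item.expect not in dig_result.stdout
  loop:
    - { name: wingu.me,        type: MX,  expect: "0 ." }
    - { name: wingu.me,        type: TXT, expect: "v=spf1 -all" }
    - { name: _dmarc.wingu.me, type: TXT, expect: "v=DMARC1; p=reject" }
  delegate_to: localhost
```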

Division of labour & access (security posture)

| Task | Who | How |
|---|---|---|
| Register domain + pay | Human | Identity/billing/ToS; not automatable. |
| Issue + store the PAT | Human | LiveDNS-scoped, single-domain; into vault via rbw. |
| public_dns role + record data | Agent | Committed IaC; make check diff. |
| Create zone + load records + reconcile | Agent | public_dns on ubongo, PAT from vault, check-mode first. |

  • Minimal token scope. Gandi PAT: LiveDNS-only, restricted to wingu.me.
  • Token in vault (vault.gandi.pat) via rbw — never pasted in chat.
  • Execution on ubongo, committed role + make check → make deploy. No agent sandbox holds production credentials.

Testing & verification

External-API reconciliation does not fit container Molecule cleanly (a nuance against ADR-008). Instead: make check (check-mode + diff), idempotence (second deploy = no changes), dig assertions post-load, and optionally a small pytest over the public_dns__records data shape (mirrors test_firewall_rules.py).
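
The data-shape check can also run inside the role itself, before any API call. The sketch below expresses that idea as an Ansible assert rather than the pytest the paragraph mentions. The allowed type list and the minimum TTL of 300 seconds are assumptions to be adjusted against Gandi's actual limits.

```yaml
- name: Validate the shape of each public_dns__records entry (sketch)
  ansible.builtin.assert:
    that:
      - "item.record is string and item.record | length > 0"
      # Assumed allow-list; extend as new record types are needed.
      - "item.type in ['A', 'AAAA', 'CNAME', 'MX', 'TXT', 'SRV', 'CAA']"
      # values must be a non-empty list, not a bare string.
      - "item['values'] is sequence and item['values'] is not string"
      - "item['values'] | length > 0"
      # Assumed lower bound for TTL.
      - "item.ttl | int >= 300"
    fail_msg: "Malformed record entry: {{ item }}"
    quiet: true
  loop: "{{ public_dns__records }}"
  delegate_to: localhost
```

Placed as the first task of the role, this fails fast on a malformed entry during make check, before anything reaches the Gandi API.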

Scope boundaries — what M1 is NOT

  • Not a migration of baobab.band or ziethen.dk — and not the Cloudflare exit / V4 decommission. Those are separate, later tracks.
  • Not the internal split-horizon dns role, which renders <service>.wingu.me privately. That role only makes sense once actual home services exist, so it lands in Phase 2.
  • Not certificate issuance or the reverse proxy — M4 (askari) / Phase 2 (home).
  • Not authoritative whole-zone pruning — additive for now.

ADR work

Amend ADR-007: boma's public zone is wingu.me at Gandi LiveDNS, managed as code (replaces "Cloudflare or equivalent"); record the three-tier naming scheme; remove the nyumbani example; state the mesh/LAN-only default; note public_dns as the control-node role rendering the public zone (sibling to the internal dns role). Note that baobab.band (legacy, Cloudflare) is not boma's zone and is out of ADR-007's scope going forward.

Open items (resolve during the plan / implementation)

  • Pick the domain. DONE: wingu.me registered at Gandi; LiveDNS PAT verified (2026-06-14) and stored in vault as vault.gandi.pat.
  • Pin the community.general version in requirements.yml (≥9.0.0).
  • Play wiring: a dedicated dns.yml play (control-targeted) vs folding into an existing play — decide in the plan.