boma/docs/superpowers/plans/2026-06-14-public-dns-m1.md
sjat b131ee317e docs(plan): M1 — public_dns implementation plan
Bite-sized TDD plan: add community.general; scaffold public_dns; wingu.me record
data + pytest; role tasks (gandi_livedns present/absent loops, apply toggle);
Molecule (apply=false, no live API); dns.yml play; gated live run on ubongo
(purge Gandi defaults + anti-spoof baseline + dig verify); ADR-007 amendment +
TODO 4 resolution + STATUS/CAPABILITIES.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-14 10:23:26 +02:00


Public DNS (M1) Implementation Plan

For agentic workers: REQUIRED SUB-SKILL: Use superpowers:subagent-driven-development (recommended) or superpowers:executing-plans to implement this plan task-by-task. Steps use checkbox (- [ ]) syntax for tracking.

Goal: Build the public_dns role that manages wingu.me's records at Gandi LiveDNS as code, purging Gandi's seeded defaults and applying boma's anti-spoof baseline.

Architecture: A control-node role drives community.general.gandi_livedns over declarative record lists in group_vars/all/public_dns.yml (mirroring the firewall-catalog pattern). Records to keep are state: present; Gandi's auto-seeded defaults are state: absent. A public_dns__apply toggle lets Molecule converge without calling the API; a pytest validates the data shape; the live run happens via make check/deploy PLAYBOOK=dns on ubongo.
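
As a rough model of that flow (not the role itself), the sketch below turns the two record lists and the apply toggle into an ordered list of intended actions. The function name plan_actions and the tuple layout are assumptions made purely for illustration; in the real role the work is done by community.general.gandi_livedns.

```python
def plan_actions(records, absent, apply=True, default_ttl=1800):
    """Return the actions the role would attempt, in order (illustrative only)."""
    if not apply:
        # Mirrors public_dns__apply: false -> no Gandi API calls at all.
        return []
    actions = []
    for r in records:
        # Fall back to the default TTL when a record omits one.
        ttl = r.get("ttl", default_ttl)
        actions.append(("present", r["record"], r["type"], tuple(r["values"]), ttl))
    for r in absent:
        actions.append(("absent", r["record"], r["type"]))
    return actions


records = [{"record": "@", "type": "MX", "values": ["0 ."]}]
absent = [{"record": "www", "type": "CNAME"}]

print(plan_actions(records, absent))
print(plan_actions(records, absent, apply=False))  # []
```

With apply=False the plan is empty, which is the property the Molecule scenario relies on.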

Tech Stack: Ansible (community.general.gandi_livedns, PAT auth), pytest, Gandi LiveDNS API. Secrets from vault.gandi.pat.

Spec: docs/superpowers/specs/2026-06-11-public-dns-gandi-migration-design.md

Execution context: Tasks 1-6 and 8 are authoring tasks (any machine with the venv). Task 7 runs on ubongo (which has the vault and Gandi egress) and is the only task that touches live Gandi.


File Structure

  • requirements.yml (modify) — add community.general (≥9.0.0) for gandi_livedns.
  • roles/public_dns/ (create) — defaults/main.yml, tasks/main.yml, meta/main.yml, README.md, molecule/default/.
  • inventories/production/group_vars/all/public_dns.yml (create) — public_dns__domain + public_dns__records (present) + public_dns__absent (Gandi defaults).
  • playbooks/dns.yml (create) — control-node play running the role.
  • tests/test_public_dns.py (create) — pytest over the record data.
  • docs/decisions/007-network.md, STATUS.md, docs/TODO.md, docs/CAPABILITIES.md (modify) — doc reconciliation.

Task 1: Add the community.general collection

Files:

  • Modify: requirements.yml

  • Step 1: Add the collection with the on-demand comment

In requirements.yml, under collections:, append:

  # community.general — gandi_livedns (public_dns role manages wingu.me at Gandi
  # LiveDNS). PAT auth requires >= 9.0.0.
  - name: community.general
    version: ">=9.0.0"
  • Step 2: Install it

Run: make collections
Expected: installs community.general (≥9.0.0) with no errors.

  • Step 3: Verify the module is available

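Run: .venv/bin/ansible-doc community.general.gandi_livedns | head -5
Expected: prints the module doc header (confirms the module resolves). To confirm PAT support as well, search the full ansible-doc output for personal_access_token; the first five lines alone are unlikely to show it.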

  • Step 4: Commit
git add requirements.yml
git commit -m "deps: add community.general for gandi_livedns (public_dns)"

Task 2: Scaffold the role

Files:

  • Create: roles/public_dns/ (via the scaffolder)

  • Step 1: Scaffold

Run: make new-role NAME=public_dns
Expected: Role public_dns scaffolded at roles/public_dns/ (creates tasks/, handlers/, defaults/, meta/, templates/, files/, molecule/default/, README.md).

  • Step 2: Commit the scaffold
git add roles/public_dns
git commit -m "scaffold(public_dns): empty role structure"

Task 3: Record data + validation test (TDD)

Files:

  • Test: tests/test_public_dns.py

  • Create: inventories/production/group_vars/all/public_dns.yml

  • Step 1: Write the failing test

Create tests/test_public_dns.py:

import pathlib

import yaml

_DATA = (
    pathlib.Path(__file__).resolve().parent.parent
    / "inventories" / "production" / "group_vars" / "all" / "public_dns.yml"
)

# Gandi auto-seeds these on a fresh .me zone; boma purges them (verified 2026-06-14).
GANDI_DEFAULTS_ABSENT = {
    ("@", "A"), ("www", "CNAME"), ("webmail", "CNAME"),
    ("gm1._domainkey", "CNAME"), ("gm2._domainkey", "CNAME"), ("gm3._domainkey", "CNAME"),
    ("_imap._tcp", "SRV"), ("_imaps._tcp", "SRV"), ("_pop3._tcp", "SRV"),
    ("_pop3s._tcp", "SRV"), ("_submission._tcp", "SRV"),
}


def _load():
    return yaml.safe_load(_DATA.read_text())


def test_domain_is_wingu():
    assert _load()["public_dns__domain"] == "wingu.me"


def test_present_records_well_formed():
    for r in _load()["public_dns__records"]:
        assert r["record"] and r["type"]
        assert isinstance(r["values"], list) and r["values"]


def test_anti_spoof_baseline_present():
    recs = {(r["record"], r["type"]): r["values"] for r in _load()["public_dns__records"]}
    assert recs[("@", "MX")] == ["0 ."]                       # null MX
    assert recs[("@", "TXT")] == ['"v=spf1 -all"']            # SPF deny-all
    assert recs[("_dmarc", "TXT")] == ['"v=DMARC1; p=reject;"']


def test_gandi_defaults_marked_absent():
    absent = {(r["record"], r["type"]) for r in _load()["public_dns__absent"]}
    assert GANDI_DEFAULTS_ABSENT <= absent


def test_no_record_both_present_and_absent():
    present = {(r["record"], r["type"]) for r in _load()["public_dns__records"]}
    absent = {(r["record"], r["type"]) for r in _load()["public_dns__absent"]}
    assert present.isdisjoint(absent)


def test_no_duplicate_present_records():
    keys = [(r["record"], r["type"]) for r in _load()["public_dns__records"]]
    assert len(keys) == len(set(keys))
  • Step 2: Run it to verify it fails

Run: .venv/bin/python -m pytest tests/test_public_dns.py -v
Expected: FAIL (the data file does not exist yet, so every test raises FileNotFoundError).

  • Step 3: Create the record data

Create inventories/production/group_vars/all/public_dns.yml:

---
# Public DNS — wingu.me at Gandi LiveDNS, managed by the public_dns role (M1).
# Mesh/LAN-only by default: only deliberate public records live here. PAT in
# vault.gandi.pat. See docs/decisions/007-network.md and the M1 spec.
public_dns__domain: wingu.me

# Present — anti-spoof baseline for a no-mail domain (overwrites Gandi's seeded mail set).
public_dns__records:
  - { record: "@",     type: MX,  values: ["0 ."],                  ttl: 3600 }
  - { record: "@",     type: TXT, values: ['"v=spf1 -all"'],        ttl: 3600 }
  - { record: _dmarc,  type: TXT, values: ['"v=DMARC1; p=reject;"'], ttl: 3600 }
  # Service records appear as public-tier needs arise (askari A in M4).
  # Mesh/LAN-only services never appear here.

# Absent — Gandi's auto-seeded defaults we don't want (purged once, idempotent thereafter).
public_dns__absent:
  - { record: "@",              type: A }      # Gandi parking IP
  - { record: www,             type: CNAME }   # Gandi web-redirect
  - { record: webmail,         type: CNAME }   # Gandi webmail
  - { record: gm1._domainkey,  type: CNAME }   # Gandi DKIM
  - { record: gm2._domainkey,  type: CNAME }
  - { record: gm3._domainkey,  type: CNAME }
  - { record: _imap._tcp,      type: SRV }     # Gandi mail autodiscovery
  - { record: _imaps._tcp,     type: SRV }
  - { record: _pop3._tcp,      type: SRV }
  - { record: _pop3s._tcp,     type: SRV }
  - { record: _submission._tcp, type: SRV }
  • Step 4: Run the test to verify it passes

Run: .venv/bin/python -m pytest tests/test_public_dns.py -v
Expected: PASS (6 passed).

  • Step 5: Commit
git add tests/test_public_dns.py inventories/production/group_vars/all/public_dns.yml
git commit -m "feat(public_dns): wingu.me record data + validation test"
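
The last two tests guard against a record key being listed twice, or being listed as both present and absent. A minimal in-memory illustration of what they would catch follows; the helper find_conflicts and the sample data are made up for this sketch and are not part of the repo.

```python
def key(r):
    """A record is identified by its (name, type) pair, as in the tests."""
    return (r["record"], r["type"])


def find_conflicts(records, absent):
    """Return (duplicates, overlaps) among record keys."""
    seen, duplicates = set(), set()
    for r in records:
        k = key(r)
        if k in seen:
            duplicates.add(k)
        seen.add(k)
    overlaps = seen & {key(r) for r in absent}
    return duplicates, overlaps


sample_records = [
    {"record": "@", "type": "TXT", "values": ['"v=spf1 -all"']},
    {"record": "@", "type": "TXT", "values": ['"v=spf1 ~all"']},  # duplicate key
    {"record": "www", "type": "CNAME", "values": ["example.test."]},
]
sample_absent = [{"record": "www", "type": "CNAME"}]  # also listed as present

duplicates, overlaps = find_conflicts(sample_records, sample_absent)
print(duplicates)  # {('@', 'TXT')}
print(overlaps)    # {('www', 'CNAME')}
```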

Task 4: Role implementation (defaults, tasks, meta, README)

Files:

  • Modify: roles/public_dns/defaults/main.yml

  • Modify: roles/public_dns/tasks/main.yml

  • Modify: roles/public_dns/meta/main.yml

  • Modify: roles/public_dns/README.md

  • Step 1: Write defaults/main.yml

---
# public_dns — manage the public zone at Gandi LiveDNS as code (M1).
# Record data (public_dns__domain / __records / __absent) lives in group_vars/all.
# See docs/decisions/007-network.md.
public_dns__apply: true        # set false to validate without calling the Gandi API (Molecule)
public_dns__default_ttl: 1800  # TTL when a record omits one
public_dns__domain: ""         # overridden in group_vars/all
public_dns__records: []        # present records
public_dns__absent: []         # records to remove
  • Step 2: Write tasks/main.yml
---
- name: Assert public DNS data is sane
  ansible.builtin.assert:
    that:
      - public_dns__domain | length > 0
      - public_dns__records | selectattr('type', 'equalto', 'MX') | list | length > 0
    fail_msg: >-
      public_dns__domain must be set and a null-MX anti-spoof record declared in
      public_dns__records (group_vars/all/public_dns.yml).
  run_once: true

- name: Ensure desired records are present (Gandi LiveDNS)
  community.general.gandi_livedns:
    domain: "{{ public_dns__domain }}"
    record: "{{ item.record }}"
    type: "{{ item.type }}"
    values: "{{ item['values'] }}"  # brackets: item.values would resolve to dict.values()
    ttl: "{{ item.ttl | default(public_dns__default_ttl) }}"
    state: present
    personal_access_token: "{{ vault.gandi.pat }}"
  loop: "{{ public_dns__records }}"
  loop_control:
    label: "{{ item.record }} {{ item.type }}"
  run_once: true
  when: public_dns__apply | bool

- name: Ensure unwanted records are absent (Gandi LiveDNS)
  community.general.gandi_livedns:
    domain: "{{ public_dns__domain }}"
    record: "{{ item.record }}"
    type: "{{ item.type }}"
    state: absent
    personal_access_token: "{{ vault.gandi.pat }}"
  loop: "{{ public_dns__absent }}"
  loop_control:
    label: "{{ item.record }} {{ item.type }}"
  run_once: true
  when: public_dns__apply | bool
  • Step 3: Write meta/main.yml
---
galaxy_info:
  author: sjat
  description: Manage boma's public DNS zone (wingu.me) at Gandi LiveDNS as code.
  license: MIT
  min_ansible_version: "2.17"
  platforms:
    - name: Debian
      versions:
        - trixie
dependencies: []
  • Step 4: Write README.md
# public_dns

Manages boma's public DNS zone (**wingu.me**) at **Gandi LiveDNS** as code, via
`community.general.gandi_livedns` (PAT auth from `vault.gandi.pat`). Provider-agnostic
name on purpose. Run from the control node: `make check/deploy PLAYBOOK=dns`.

Mesh/LAN-only by default — only deliberate public records live in the zone (the
anti-spoof baseline now; `askari` in M4). Everything else is reached over LAN/mesh and
never appears here.

## Data (in `group_vars/all/public_dns.yml`)

| Var | Meaning |
|---|---|
| `public_dns__domain` | the zone (`wingu.me`) |
| `public_dns__records` | records to ensure **present** (`record`, `type`, `values`, optional `ttl`) |
| `public_dns__absent`  | records to ensure **absent** (Gandi's auto-seeded defaults) |

## Behaviour knobs (`defaults/main.yml`)

| Var | Default | Meaning |
|---|---|---|
| `public_dns__apply` | `true` | set `false` to validate without calling the Gandi API (Molecule) |
| `public_dns__default_ttl` | `1800` | TTL when a record omits one |

## Notes

The zone is reconciled **additively** plus an explicit `absent` list (Gandi seeds 13
default records on a new `.me`; we purge the unwanted 11 and overwrite MX/SPF with the
anti-spoof baseline). Full-zone authoritative pruning is a future enhancement (TODO 8.3).
  • Step 5: Lint

Run: make lint
Expected: Passed: 0 failure(s) and check-tags: OK.

  • Step 6: Commit
git add roles/public_dns
git commit -m "feat(public_dns): role tasks, defaults, meta, README"
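
One detail worth checking in the present loop: each record carries a key named values. Jinja2's dot notation tries attribute lookup before subscript lookup, and Python dicts have a values() method, so a dotted reference to that key can resolve to the method rather than the list. Ansible's documentation recommends bracket notation for keys that collide with dict methods. The plain-Python sketch below illustrates the collision; it approximates the lookup order with getattr and is not the template engine itself.

```python
item = {"record": "@", "type": "MX", "values": ["0 ."], "ttl": 3600}

# Attribute-style access finds the built-in dict method, not the key.
attr = getattr(item, "values")
print(callable(attr))  # True: this is the bound method dict.values

# Subscript access returns the data stored under the "values" key.
print(item["values"])  # ['0 .']

# Keys that do not shadow a dict method have no attribute at all, so a
# lookup that falls back to subscripting still finds them.
print(hasattr(item, "record"))  # False
```

The same caution applies to any key named items or keys.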

Task 5: Molecule scenario (no live API)

Files:

  • Modify: roles/public_dns/molecule/default/converge.yml

  • Modify: roles/public_dns/molecule/default/verify.yml

  • Step 1: Write converge.yml (apply disabled, sample data)

---
- name: Converge
  hosts: all
  gather_facts: true
  vars:
    public_dns__apply: false          # never call the Gandi API from a container
    public_dns__domain: example.test
    public_dns__records:
      - { record: "@",    type: MX,  values: ["0 ."],           ttl: 3600 }
      - { record: "@",    type: TXT, values: ['"v=spf1 -all"'], ttl: 3600 }
    public_dns__absent:
      - { record: www, type: CNAME }
  roles:
    - role: public_dns
  • Step 2: Write verify.yml
---
- name: Verify
  hosts: all
  gather_facts: false
  tasks:
    - name: Role variables resolved
      ansible.builtin.assert:
        that:
          - public_dns__domain == "example.test"
          - not (public_dns__apply | bool)
        msg: "public_dns defaults/vars did not resolve as expected"
      tags: [verify]
  • Step 3: Run Molecule

Run: make test ROLE=public_dns
Expected: PASS. Converge applies the role (the assert passes; the gandi_livedns tasks are skipped because public_dns__apply is false), verify passes, and the idempotence check is clean.

  • Step 4: Commit
git add roles/public_dns/molecule
git commit -m "test(public_dns): Molecule scenario (apply disabled, no live API)"

Task 6: The dns.yml playbook

Files:

  • Create: playbooks/dns.yml

  • Step 1: Write the play

---
# dns.yml — manage the public DNS zone (wingu.me) at Gandi LiveDNS as code.
# Runs on the control node (ubongo) against the Gandi API — no host config.
# Run: make check PLAYBOOK=dns  then  make deploy PLAYBOOK=dns
- name: Manage public DNS (Gandi LiveDNS)
  hosts: control
  connection: local
  gather_facts: false
  become: false
  roles:
    - role: public_dns
      tags: [public_dns]
  • Step 2: Lint (verifies the role-name tag on the import)

Run: make lint
Expected: Passed: 0 failure(s) and check-tags: OK (... role imports verified).

  • Step 3: Commit
git add playbooks/dns.yml
git commit -m "feat(public_dns): dns.yml play (control-node, Gandi LiveDNS)"

Task 7: Live run on ubongo (purge + baseline) — gated

Runs on ubongo only (vault + Gandi egress). Run rbw unlock first. This is the only task that mutates live Gandi; review the check-mode diff before deploying.

  • Step 1: Dry-run (check mode + diff)

Run: make check PLAYBOOK=dns
Expected: the diff shows the 3 present records being set (null MX, SPF -all, DMARC reject) and the 11 Gandi defaults being removed. Review it before applying.

  • Step 2: Apply

Run: make deploy PLAYBOOK=dns
Expected: changed for the present + absent records; no errors.

  • Step 3: Verify idempotence

Run: make deploy PLAYBOOK=dns
Expected: ok=... changed=0 (a second run makes no changes).

  • Step 4: Verify with dig
dig +short MX wingu.me            # expect: 0 .
dig +short TXT wingu.me           # expect: "v=spf1 -all"
dig +short TXT _dmarc.wingu.me    # expect: "v=DMARC1; p=reject;"
dig +short www.wingu.me           # expect: empty (CNAME removed)

Expected: as annotated (allow for TTL/propagation).
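
If the manual dig checks become repetitive, they could be scripted along these lines. This is a sketch under assumptions: the helper names dig_short and mismatches are invented here, dig must be on the PATH, and the comparison ignores answer order. It only compares `dig +short` output against the annotations above.

```python
import subprocess

# Expected answers, taken from the annotations in Step 4.
EXPECTED = {
    ("wingu.me", "MX"): ["0 ."],
    ("wingu.me", "TXT"): ['"v=spf1 -all"'],
    ("_dmarc.wingu.me", "TXT"): ['"v=DMARC1; p=reject;"'],
    ("www.wingu.me", "A"): [],  # CNAME removed, so no answer is expected
}


def dig_short(name, rtype):
    """Run `dig +short` and return the non-empty answer lines."""
    out = subprocess.run(
        ["dig", "+short", rtype, name],
        capture_output=True, text=True, check=True,
    ).stdout
    return [line.strip() for line in out.splitlines() if line.strip()]


def mismatches(expected, lookup=dig_short):
    """Return {(name, type): (wanted, got)} for every answer that differs."""
    bad = {}
    for (name, rtype), wanted in expected.items():
        got = lookup(name, rtype)
        if sorted(got) != sorted(wanted):
            bad[(name, rtype)] = (wanted, got)
    return bad
```

Calling mismatches(EXPECTED) after the deploy should return an empty dict once TTLs have expired and the change has propagated; the lookup parameter exists so the comparison can be exercised without network access.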

  • Step 5: No commit — this task changes live Gandi, not the repo.

Task 8: Documentation reconciliation

Files:

  • Modify: docs/decisions/007-network.md

  • Modify: STATUS.md

  • Modify: docs/TODO.md

  • Modify: docs/CAPABILITIES.md

  • Step 1: Amend ADR-007 — naming scheme row

Replace the Public service FQDN row of the naming-scheme table:

| Public service FQDN | `<service>.baobab.band` | `forgejo.nyumbani.baobab.band` |

with:

| Public service FQDN | `<service>.wingu.me` | `vaultwarden.wingu.me` |
| Off-site (VPS) FQDN | `<service>.askari.wingu.me` | `netbird.askari.wingu.me` |
  • Step 2: Amend ADR-007 — public zone + scheme

Replace the Public zone paragraph:

**Public zone**: `baobab.band` — served by external DNS (Cloudflare or equivalent).
Public-facing services resolve to the public IP or Cloudflare proxy.

with:

**Public zone**: `wingu.me` — Gandi LiveDNS, **managed as code** by the `public_dns`
role (`vault.gandi.pat`). Three-tier naming: infra `<host>.boma.wingu.me` (internal),
services `<service>.wingu.me` (split-horizon), off-site `<service>.askari.wingu.me`.
`nyumbani` is retired. **Mesh/LAN-only by default**: home services have no public record
(reached over LAN or the NetBird mesh); only deliberate exceptions are published.
The project is `boma`; the domain is `wingu.me` (see the M1 spec). The legacy
`baobab.band` zone (Cloudflare) is out of scope here.
  • Step 3: Update the split-horizon example

In the Split-horizon paragraph, replace the example forgejo.nyumbani.baobab.band with vaultwarden.wingu.me (internal → private proxy IP; public → only if a deliberate exception). Leave the internal-zone wording (boma.baobab.band) unchanged for now, and add this parenthetical after it: (internal zone is renamed to boma.wingu.me when the dns role is built — Phase 2).
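
The three-tier naming scheme from the amended ADR text can be illustrated with a small classifier. This is only a sketch: the function classify is hypothetical, it looks purely at label shape, and the host name ubongo.boma.wingu.me is an example built from the `<host>.boma.wingu.me` pattern rather than a record the plan creates.

```python
ZONE = "wingu.me"


def classify(fqdn):
    """Map a name to its tier under the three-tier scheme (label shape only)."""
    name = fqdn.rstrip(".").lower()
    if name == ZONE:
        return "apex"
    suffix = "." + ZONE
    if not name.endswith(suffix):
        return "not in zone"
    labels = name[: -len(suffix)].split(".")
    if len(labels) == 2 and labels[1] == "boma":
        return "infra"      # <host>.boma.wingu.me (internal)
    if len(labels) == 2 and labels[1] == "askari":
        return "off-site"   # <service>.askari.wingu.me
    if len(labels) == 1 and labels[0] not in ("boma", "askari"):
        return "service"    # <service>.wingu.me (split-horizon)
    return "unclassified"


for n in ("vaultwarden.wingu.me", "netbird.askari.wingu.me",
          "ubongo.boma.wingu.me", "forgejo.nyumbani.baobab.band"):
    print(n, "->", classify(n))
```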

  • Step 4: Mark STATUS — public_dns built

In STATUS.md, under "Real and working today", add a row:

| `roles/public_dns/` + `playbooks/dns.yml` | **Built + applied.** Manages wingu.me at Gandi LiveDNS as code (`community.general.gandi_livedns`, PAT from `vault.gandi.pat`); purged Gandi's seeded defaults, applied the anti-spoof baseline (null MX, SPF `-all`, DMARC reject). Mesh/LAN-only default. M1 of the roadmap. |
  • Step 5: Resolve TODO 4

In docs/TODO.md, change item 4 to struck-through/decided:

4. ~~**Split-horizon FQDN** — adopt split-horizon FQDN with or without nyumbani?~~
   DECIDED (M1): three-tier scheme on `wingu.me`; `nyumbani` dropped; mesh/LAN-only
   default. See `docs/decisions/007-network.md` + the M1 spec.
  • Step 6: Add a CAPABILITIES row

In docs/CAPABILITIES.md, near the Internal DNS row, add:

| Public DNS | `public_dns` role → Gandi LiveDNS | P | core | wingu.me zone as code (ADR-007) | anti-spoof baseline; mesh/LAN-only |

(Match the surrounding table's column shape; adjust the status letter to the table's convention.)

  • Step 7: Lint + commit

Run: make lint
Expected: clean.

git add docs/decisions/007-network.md STATUS.md docs/TODO.md docs/CAPABILITIES.md
git commit -m "docs(public_dns): amend ADR-007 to wingu.me/Gandi; resolve TODO 4; STATUS + CAPABILITIES"

Self-Review (completed)

  • Spec coverage: role + group_vars data (Decisions 4,5) → Tasks 3,4; gandi_livedns + PAT (Decision 5, Verified facts) → Task 4; collections-on-demand (Decision 5) → Task 1; anti-spoof baseline + Gandi-defaults purge (Problem, Data model) → Tasks 3,7; cert scope (Decision 6) → out of scope (no cert tasks, correct); testing (check-mode/idempotence/dig + pytest) → Tasks 5,7,3; ADR-007 amendment + TODO 4/O12 → Task 8. All covered.
  • Placeholder scan: none — every code/content step is concrete.
  • Type/name consistency: public_dns__domain/__records/__absent/__apply/__default_ttl and vault.gandi.pat used identically across data, role, play, and tests. gandi_livedns params match the verified module signature.
  • Note for the implementer: Task 7 assumes ubongo. If the gandi_livedns absent call needs values for some record types, add them from public_dns__absent (verify against the pinned community.general version per ADR-014).