Compare commits

3 commits

Author SHA1 Message Date
993d7885e4 docs: mark M1 applied (STATUS); log item.values + Gandi null-MX gotchas
M1 public_dns applied to wingu.me (purge + SPF/DMARC, idempotent). Friction:
item.values dict-method collision, Gandi null-MX rejection, and the apply=false-
Molecule/data-only-pytest gap that let both bugs reach a live apply.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-14 10:58:03 +02:00
76bd1d63fc fix(public_dns): index loop keys with item['key'] not item.key
item.values resolved to the dict's built-in .values() METHOD, not the 'values'
key, so gandi_livedns received '<built-in method values of dict object at 0x..>'
as the TXT value — garbage AND non-idempotent (the address changes each run).
Bracket-index all loop fields. Caught only by the live apply (apply=false Molecule
+ data-only pytest both missed it).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-14 10:57:23 +02:00
078d1ad9d9 fix(public_dns): drop null-MX (Gandi rejects '0 .'); remove MX instead
Gandi LiveDNS rejects the RFC-7505 null-MX value '0 .' ('invalid format for MX
record'), which failed the live apply. No MX + no apex A = no mail delivery, and
SPF -all + DMARC reject still prevent spoofing — so remove Gandi's seeded MX (add
@/MX to absent) rather than declare a null-MX present. Assert now requires an SPF
@/TXT record; tests + Molecule sample updated.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-14 10:53:54 +02:00
6 changed files with 43 additions and 16 deletions

@@ -28,7 +28,7 @@ _Last reviewed: 2026-06-11._
| Tag standard + enforcement (ADR-019) | Works — `tests/tags.yml` (closed vocabulary) + `scripts/check-tags.py` (run by `make lint`, unit-tested): enforces the tag vocabulary and that each role import in a play's `roles:` block carries its role-name tag. Governs mostly-unbuilt roles, but the linter is live now. Proxmox VM tag convention (`<env>`, group, `managed-by=terraform`) is in the Terraform HCL but unprovisioned. |
| `roles/dev_env/` — interactive developer environment | **Built + applied.** zsh + oh-my-zsh + oh-my-posh, tmux + TPM plugins, neovim; dotfiles deployed via GNU stow (re-derived from V4/fisi per ADR-013). Node.js from a pinned upstream tarball (not Debian's npm). Lint + Molecule (idempotent) green. **Applied to `ubongo`** for users `sjat` + `claude` (verified: zsh login shells, stow-symlinked `.zshrc`/`.tmux.conf` + nvim config, oh-my-zsh, tmux plugins; nvim v0.12.2, oh-my-posh 29.0.1). Run via `playbooks/workstation.yml` against the `control` group (no dedicated `workstations` group yet). |
| `make check` / `make deploy PLAYBOOK=<name>` | **Works.** First end-to-end run (applying `dev_env`) surfaced + fixed latent bugs: Makefile `PLAYBOOK` var collision (binary path vs playbook-name arg) meant the targets never ran; `ansible.cfg` referenced uninstalled community.general callbacks (now built-in `default` + `ansible.posix.profile_tasks`); `acl` package added so Ansible can `become_user` an unprivileged user. The make targets now function — though `site`/`base`/`docker_host` content is still incomplete (see below). |
-| `roles/public_dns/` + `playbooks/dns.yml` | **Built — not yet applied.** Manages wingu.me at Gandi LiveDNS as code (`community.general.gandi_livedns`, PAT from `vault.gandi.pat`); record data, anti-spoof baseline (null MX, SPF `-all`, DMARC reject), and the Gandi-defaults purge list are defined + unit-tested (`tests/test_public_dns.py`). The live `make deploy PLAYBOOK=dns` (purge + baseline) is **pending — run on ubongo**. M1 of the roadmap. |
+| `roles/public_dns/` + `playbooks/dns.yml` | **Built + applied.** Manages wingu.me at Gandi LiveDNS as code (`community.general.gandi_livedns`, PAT from `vault.gandi.pat`); record data, anti-spoof baseline (SPF `-all` + DMARC reject), and the Gandi-defaults purge are defined + unit-tested (`tests/test_public_dns.py`). **Applied to wingu.me (2026-06-14):** purged Gandi's 13 seeded defaults; zone now holds only the SPF + DMARC TXT records; idempotent re-run clean. No null-MX (Gandi rejects `0 .`) — the MX is removed, so no MX + no apex A = no mail. M1 of the roadmap. |
| `ubongo` — physical control / AI-worker host (ADR-015) | **Built (partial).** Debian 13.5 on a Lenovo M70q (i3-10100T, 16 GB, 256 GB SSD; no disk encryption — accepted risk). Full toolchain installed + pinned to `fisi` (Docker 29.5.3, rbw 1.15.0, Claude Code 2.1.173, ansible-core 2.17.14 + molecule via `make setup`/`make collections`). Repo cloned under a dedicated `claude` user (docker group, no sudo). Vault works via rbw (offline-cache decryption verified). SSH key-only (password + root login disabled). In the production inventory `control` group at 10.20.10.151. **`dev_env` now applied here** (zsh/tmux/nvim for `sjat` + `claude`, via `playbooks/workstation.yml`). Managed as the operator account `sjat` (`group_vars/control` sets `ansible_user: sjat`), not the `ansible` service user `group_vars/all` assumes — ubongo has no bootstrapped `ansible` user. **Pending:** NetBird mesh enrollment (so SSH is LAN-only); full `base` hardening (only the `firewall` concern exists, and it is NOT applied here — applying default-deny with no mesh would lock out inbound SSH on the physical NIC); proper `ansible`-user bootstrap (currently managed as `sjat`); OPNsense DHCP reservation for 10.20.10.151 (MAC `88:a4:c2:e0:ee:da`); Terraform state backup (no TF state yet). |
## Scaffolded but empty — NOT implemented

@@ -21,6 +21,28 @@ earning its keep.
 _(append new raw signals here; the next kaizen review consumes them)_
+- `[gotcha]` **`item.values` in a loop sends the dict's `.values()` METHOD, not the
+key** (2026-06-14): the `public_dns` role looped over records that have a `values:`
+key and used `{{ item.values }}` in the `gandi_livedns` task. Jinja attribute access
+resolved `item.values` to the built-in dict method, so Gandi received
+`"<built-in method values of dict object at 0x...>"` as the live TXT value — corrupt
+**and** non-idempotent (the address changes each run → always "changed"). The fix is
+bracket-indexing: `item['values']` (same risk for any key named `keys`/`items`/`get`/
+`update`/...). → convention: in loops, index loop-var keys with `item['key']`, never
+`item.key`; consider an ansible-lint guard.
+- `[gotcha]` **Gandi LiveDNS rejects RFC-7505 null-MX `0 .`** (2026-06-14): "invalid
+format for MX record." Used "no MX + no apex A" + SPF `-all` + DMARC reject instead.
+Minor, but worth a note for any future no-mail domain on Gandi.
+- `[recurring]` **apply=false Molecule + data-only pytest leave a real gap for
+API/templating roles** (2026-06-14): both the null-MX and the `item.values` bugs sailed
+through the spec, BOTH review subagents, the pytest (validates the data file, not the
+rendered template), and the Molecule scenario (`apply=false`, so the API tasks never
+run) — only the **live `make check`/`deploy`** against the real Gandi API surfaced them.
+For roles whose payload is "render data → external API call", the rendered template is
+the thing that breaks, and nothing short of a real (or check-mode) API call exercises it.
+→ for such roles, treat a check-mode run against the real API as a required gate, not an
+optional final step; or build a render-only assertion that materializes the module args.
 - `[recurring]` **Execution-mode menu asked AGAIN despite the 2026-06-10 "mechanical
 fix"** (2026-06-14): at the M1 (`public_dns`) plan handoff I presented the "1.
 Subagent-Driven / 2. Inline Execution — which approach?" menu and asked the user to
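
The `[recurring]` note above suggests a render-only assertion over the materialized module args. A minimal sketch of what such a check could look like, assuming some render step (not shown here) has already produced each task's arguments as a plain dict. The function name `find_problems` and the specific rules are hypothetical illustrations, not part of the repository:

```python
import re

# Matches the repr of a Python method object, such as the string Gandi received
# when `item.values` resolved to the dict method instead of the 'values' key.
_METHOD_REPR = re.compile(r"<(built-in method|bound method) .* at 0x[0-9a-fA-F.]*>")


def find_problems(args):
    """Return a list of problems found in one rendered gandi_livedns argument dict.

    Hypothetical helper: `args` is assumed to be the already-rendered module
    arguments (record, type, values, ttl) for a single loop item.
    """
    problems = []
    values = args.get("values")

    if not isinstance(values, list) or not values:
        problems.append("values is not a non-empty list: %r" % (values,))
        candidates = [values]
    else:
        candidates = values

    for value in candidates:
        if not isinstance(value, str):
            problems.append("value is not a string: %r" % (value,))
        elif _METHOD_REPR.search(value):
            problems.append("value looks like a Python object repr: %r" % value)

    # Per the gotcha above, Gandi LiveDNS rejects the RFC-7505 null-MX form.
    if args.get("type") == "MX" and isinstance(values, list) and "0 ." in values:
        problems.append("null-MX '0 .' is rejected by Gandi LiveDNS")

    return problems


good = {"record": "@", "type": "TXT", "values": ['"v=spf1 -all"'], "ttl": 3600}
bad = dict(good, values=str({}.values))  # reproduces the corrupted rendering
null_mx = {"record": "@", "type": "MX", "values": ["0 ."], "ttl": 3600}

for name, args in (("good", good), ("bad", bad), ("null_mx", null_mx)):
    print(name, find_problems(args))
```

A check like this only helps if it runs on the rendered arguments rather than on the raw data file, since the data file was valid in both incidents.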

@@ -4,9 +4,10 @@
 # vault.gandi.pat. See docs/decisions/007-network.md and the M1 spec.
 public_dns__domain: wingu.me
-# Present — anti-spoof baseline for a no-mail domain (overwrites Gandi's seeded mail set).
+# Present — anti-spoof baseline for a no-mail domain. No null-MX: Gandi LiveDNS rejects
+# the RFC-7505 "0 ." form, so the MX is simply REMOVED (below) — no MX + no apex A means
+# no mail delivery, and SPF -all + DMARC reject prevent spoofing.
 public_dns__records:
-  - {record: "@", type: MX, values: ["0 ."], ttl: 3600}
   - {record: "@", type: TXT, values: ['"v=spf1 -all"'], ttl: 3600}
   - {record: _dmarc, type: TXT, values: ['"v=DMARC1; p=reject;"'], ttl: 3600}
 # Service records appear as public-tier needs arise (askari A in M4).
@@ -15,6 +16,7 @@ public_dns__records:
 # Absent — Gandi's auto-seeded defaults we don't want (purged once, idempotent thereafter).
 public_dns__absent:
   - {record: "@", type: A} # Gandi parking IP
+  - {record: "@", type: MX} # Gandi mail MX (no mail on wingu.me; null-MX unsupported)
   - {record: www, type: CNAME} # Gandi web-redirect
   - {record: webmail, type: CNAME} # Gandi webmail
   - {record: gm1._domainkey, type: CNAME} # Gandi DKIM

@@ -6,9 +6,9 @@
     public_dns__apply: false # never call the Gandi API from a container
     public_dns__domain: example.test
     public_dns__records:
-      - {record: "@", type: MX, values: ["0 ."], ttl: 3600}
       - {record: "@", type: TXT, values: ['"v=spf1 -all"'], ttl: 3600}
     public_dns__absent:
       - {record: www, type: CNAME}
+      - {record: "@", type: MX}
   roles:
     - role: public_dns

@@ -3,36 +3,37 @@
   ansible.builtin.assert:
     that:
       - public_dns__domain | length > 0
-      - public_dns__records | selectattr('type', 'equalto', 'MX') | list | length > 0
+      - public_dns__records | selectattr('record', 'equalto', '@')
+        | selectattr('type', 'equalto', 'TXT') | list | length > 0
     fail_msg: >-
-      public_dns__domain must be set and a null-MX anti-spoof record declared in
+      public_dns__domain must be set and an SPF record (@/TXT) declared in
       public_dns__records (group_vars/all/public_dns.yml).
   run_once: true
 - name: Ensure desired records are present (Gandi LiveDNS)
   community.general.gandi_livedns:
     domain: "{{ public_dns__domain }}"
-    record: "{{ item.record }}"
-    type: "{{ item.type }}"
-    values: "{{ item.values }}"
-    ttl: "{{ item.ttl | default(public_dns__default_ttl) }}"
+    record: "{{ item['record'] }}"
+    type: "{{ item['type'] }}"
+    values: "{{ item['values'] }}"
+    ttl: "{{ item['ttl'] | default(public_dns__default_ttl) }}"
     state: present
     personal_access_token: "{{ vault.gandi.pat }}"
   loop: "{{ public_dns__records }}"
   loop_control:
-    label: "{{ item.record }} {{ item.type }}"
+    label: "{{ item['record'] }} {{ item['type'] }}"
   run_once: true
   when: public_dns__apply | bool
 - name: Ensure unwanted records are absent (Gandi LiveDNS)
   community.general.gandi_livedns:
     domain: "{{ public_dns__domain }}"
-    record: "{{ item.record }}"
-    type: "{{ item.type }}"
+    record: "{{ item['record'] }}"
+    type: "{{ item['type'] }}"
     state: absent
     personal_access_token: "{{ vault.gandi.pat }}"
   loop: "{{ public_dns__absent }}"
   loop_control:
-    label: "{{ item.record }} {{ item.type }}"
+    label: "{{ item['record'] }} {{ item['type'] }}"
   run_once: true
   when: public_dns__apply | bool
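
The switch from `item.values` to `item['values']` in the tasks above can be illustrated with a simplified pure-Python model of the two lookup orders. This is a sketch of the behaviour as the commit message describes it (dot access prefers attributes, bracket access prefers keys), not Jinja's actual implementation. The helper names `dot_lookup` and `bracket_lookup` are hypothetical:

```python
_MISSING = object()


def dot_lookup(obj, name):
    """Model of `obj.name` in a template: try the attribute first, then the key."""
    value = getattr(obj, name, _MISSING)
    if value is not _MISSING:
        return value
    try:
        return obj[name]
    except (TypeError, LookupError):
        return None  # a real template engine would yield an "undefined" value


def bracket_lookup(obj, key):
    """Model of `obj['key']` in a template: try the key first, then the attribute."""
    try:
        return obj[key]
    except (TypeError, LookupError):
        return getattr(obj, key, None) if isinstance(key, str) else None


record = {"record": "@", "type": "TXT", "values": ['"v=spf1 -all"'], "ttl": 3600}

by_dot = dot_lookup(record, "values")          # the dict's built-in .values method
by_bracket = bracket_lookup(record, "values")  # the list stored under 'values'

print(callable(by_dot))
print(by_bracket)
print(dot_lookup(record, "record"))  # no attribute collision, so the key wins
```

Only keys that shadow a dict method (`values`, `keys`, `items`, `get`, `update`, and so on) collide, which is why `item.record` and `item.type` happened to work while `item.values` did not. Bracket-indexing every field removes the need to remember which names are safe.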

@@ -8,8 +8,10 @@ _DATA = (
 )
 # Gandi auto-seeds these on a fresh .me zone; boma purges them (verified 2026-06-14).
+# Includes the @ MX: Gandi rejects the RFC-7505 null-MX "0 .", so we remove the MX
+# entirely (no MX + no apex A = no mail) rather than declare a null-MX present.
 GANDI_DEFAULTS_ABSENT = {
-    ("@", "A"), ("www", "CNAME"), ("webmail", "CNAME"),
+    ("@", "A"), ("@", "MX"), ("www", "CNAME"), ("webmail", "CNAME"),
     ("gm1._domainkey", "CNAME"), ("gm2._domainkey", "CNAME"), ("gm3._domainkey", "CNAME"),
     ("_imap._tcp", "SRV"), ("_imaps._tcp", "SRV"), ("_pop3._tcp", "SRV"),
     ("_pop3s._tcp", "SRV"), ("_submission._tcp", "SRV"),
@@ -32,9 +34,9 @@ def test_present_records_well_formed():
 def test_anti_spoof_baseline_present():
     recs = {(r["record"], r["type"]): r["values"] for r in _load()["public_dns__records"]}
-    assert recs[("@", "MX")] == ["0 ."] # null MX
     assert recs[("@", "TXT")] == ['"v=spf1 -all"'] # SPF deny-all
     assert recs[("_dmarc", "TXT")] == ['"v=DMARC1; p=reject;"']
+    assert ("@", "MX") not in recs # no MX (Gandi rejects null-MX; removed instead)
 def test_gandi_defaults_marked_absent():