chore(kaizen): first /kaizen run — curate 12 friction signals

Dogfood of the new /kaizen command. 11 consumed, 1 kept open.
- SYSTEMATIZE → docs/testing/gotchas.md (apply:{tags} propagation, Molecule
  tag-isolation testing, API/templating render-only gap); CLAUDE.md
  (item['key'] loop convention, TF module required_providers); public_dns
  README (Gandi null-MX workaround).
- CHANGE → extend the Stop hook to also guard the brainstorming spec-review gate
  (verified: blocks the gate, passes meta-discussion).
- SYSTEMATIZE → make new-role scaffolds the access__/backup__ noqa reminder;
  ADR-004 documents the cross-role-naming convention.
- ALREADY-BUILT/ACCEPTED → exec-menu guard verified firing; ADR-023; ADR-024;
  subagent-faithfulness now embodied in the two-stage subagent review.
- KEEP-OPEN → a repo-scan.py check for ADRs that over-claim reconciliation.

Nudge: OVERDUE (13 signals) → ok (1). make lint + 16 friction-scan tests green.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
This commit is contained in:
sjat 2026-06-14 21:46:23 +02:00
parent d1e1e38879
commit 13ae674cc9
7 changed files with 120 additions and 141 deletions

View file

@ -1,17 +1,20 @@
#!/usr/bin/env bash
#
# Stop guard: block ending the turn when the assistant's final message presents the
# execution-mode menu. The writing-plans / subagent-driven-development skills script a
# "Subagent-Driven vs Inline Execution — which approach?" menu at the plan→execution
# handoff. boma's standing preference (docs/FRICTION.md + the
# always-subagent-driven-execution memory) is to NEVER present it and proceed
# subagent-driven. Prose reminders failed four times (06-05/06/09/10); this is the
# mechanical guard recorded by the 2026-06-10 kaizen review.
# Stop guard for two external-skill gates that conflict with boma conventions, where
# prose reminders repeatedly failed to hold (docs/FRICTION.md):
#
# 1. The execution-mode menu — writing-plans / subagent-driven-development script a
# "Subagent-Driven vs Inline Execution — which approach?" menu at the plan→execution
# handoff. boma's standing preference is to NEVER present it and proceed
# subagent-driven. (Recorded by the 2026-06-10 kaizen review.)
# 2. The brainstorming spec-review gate — the brainstorming skill scripts "Spec written
# and committed … please review it before … the implementation plan." The standing
# agreement is to move directly from the committed spec to writing-plans. (Recorded
# by the 2026-06-14 kaizen review; 06-10/06-14 recurrences.)
#
# Fails OPEN: any parse/read problem → allow the stop. Respects stop_hook_active so a
# block can never loop. The match signature is deliberately tight ("inline execution"
# AND "which approach"/"two execution options") so it fires on the actual menu, not on
# meta-discussion of it.
# block can never loop. Match signatures are deliberately tight so they fire on the
# actual gate text, not on meta-discussion of it.
#
set -uo pipefail
@ -43,4 +46,16 @@ JSON
exit 0
fi
# Brainstorming spec-review gate: asking the user to review the committed spec before
# the implementation plan. Tight signature: "implementation plan" present, plus either the
# skill's literal "spec written and committed" line, or the review+spec+before combination.
if [[ "$low" == *"implementation plan"* \
&& ( "$low" == *"spec written and committed"* \
|| ( "$low" == *"review"* && "$low" == *"the spec"* && "$low" == *"before"* ) ) ]]; then
cat <<'JSON'
{"decision":"block","reason":"Brainstorming spec-review gate detected in your final message. boma standing agreement (docs/FRICTION.md): once the spec is written and committed, move directly to the implementation plan (superpowers:writing-plans) — do not stop to ask the user to review the spec first. Drop the gate and proceed."}
JSON
exit 0
fi
exit 0

View file

@ -62,6 +62,9 @@ Full design rationale: `docs/decisions/`
- **Variables**: `rolename__varname` double-underscore namespace for role defaults
- **No inline vars in playbooks**: use `group_vars/` or `host_vars/` only
- **Loops**: prefer `loop:` over `with_items:`
- **Loop var keys**: index with `item['key']`, never `item.key` — a key named
`values`/`keys`/`items`/`get`/… resolves to the dict *method* (silently corrupt +
non-idempotent), not the value
- **Conditionals**: prefer `true`/`false` over `yes`/`no`
---
@ -178,6 +181,9 @@ Single-contributor, trunk-based (no merge requests / approval gates):
- Secrets via `TF_VAR_*` env vars only — never in `.tfvars` files
- `terraform.tfvars.example` is tracked; `terraform.tfvars` is gitignored
- `.terraform.lock.hcl` is tracked (pins provider versions)
- Every module declares its own `required_providers` (in `versions.tf`) for any
non-hashicorp provider — otherwise TF infers `hashicorp/<name>` and `init` fails
(caught only by a live `tf-init`, not by static review)
- Full rationale: `docs/decisions/006-terraform.md`
---

View file

@ -181,7 +181,14 @@ endif
roles/$(NAME)/molecule/default
echo "---" > roles/$(NAME)/tasks/main.yml
echo "---" > roles/$(NAME)/handlers/main.yml
echo "---" > roles/$(NAME)/defaults/main.yml
printf '%s\n' '---' \
'# Role defaults use the <rolename>__var double-underscore namespace.' \
'#' \
'# Service roles (ADR-004) also declare access__*/backup__* data here. Those are' \
'# cross-role conventions (not rolename-prefixed), so EACH such line needs a trailing' \
'# noqa: var-naming[no-role-prefix] (ansible-lint 24.x has no per-prefix allowlist).' \
'# Reference: roles/reverse_proxy/defaults/main.yml' \
> roles/$(NAME)/defaults/main.yml
echo "---" > roles/$(NAME)/meta/main.yml
printf '# %s\n\nRole description here.\n' "$(NAME)" > roles/$(NAME)/README.md
cp .scaffold/molecule.yml roles/$(NAME)/molecule/default/molecule.yml

View file

@ -1,14 +1,15 @@
# FRICTION.md — kaizen friction log
Raw signals for the periodic **kaizen review** (the methodology retrospective; see
`docs/TODO.md`). This is the input that keeps our tooling and conventions sharpening
over time instead of only accreting.
Raw signals for the periodic **kaizen review** (`/kaizen`; see `docs/TODO.md` 11). This is
the input that keeps our tooling and conventions sharpening over time instead of only
accreting.
**How to use:** append freely _during_ work under **Open signals** — don't curate,
don't fix there. Capture friction, surprises, fixes that keep recurring, and tooling
that isn't earning its keep. The kaizen review reads this, then proposes
**add / change / remove** (biased toward _remove_), migrates durable knowledge into the
right docs, and moves consumed signals into the **decisions ledger** below.
that isn't earning its keep. `/kaizen` reads this, then proposes a verdict per signal
(SYSTEMATIZE / CHANGE / PARK / REMOVE / ALREADY-BUILT / ACCEPTED / KEEP-OPEN; biased
toward _remove/park_ for unused tooling), migrates durable knowledge into the right docs,
and moves consumed signals into the **decisions ledger** below.
**Entry format:** `date — [tag] observation — (optional) → systematization idea`
Tags: `[friction]` recurring annoyance · `[gotcha]` surprising behaviour ·
@ -21,127 +22,6 @@ earning its keep.
_(append new raw signals here; the next kaizen review consumes them)_
- `[gotcha]` **Hetzner IPs are 403'd by Google's Go module infra; caddy-dns/gandi DNS-01
didn't issue** (2026-06-14, M4a): building the custom Caddy image *on askari* failed —
`proxy.golang.org` and `golang.org` both return **403 Forbidden** to the Hetzner IP
(worked on ubongo). Reworked the role to build on the control node + `docker save`/`load`
to the target. *Then* the `caddy-dns/gandi` DNS-01 plugin would not create the
`_acme-challenge` TXT despite a token verified to (a) be in Caddy's env and (b) create
TXT records via the Gandi API directly — no plugin error, just "propagation timeout,
last error <nil>"; resolvers/timeout tuning didn't help. **Resolution:** askari is a
*public* host, so switched it to **HTTP-01 + vanilla Caddy** (works, drops the custom
image entirely). DNS-01 deferred to Phase 2 (cluster's mesh/LAN-only services) — the
plugin + the Hetzner-build-block to be solved then. → lesson: prefer HTTP-01 wherever a
host is publicly reachable; reserve DNS-01 (and its plugin/build complexity) for hosts
that genuinely can't do HTTP-01. Both bugs surfaced only on the live host.
- `[gotcha]` **A tag on `include_tasks` does NOT reach the included tasks — need
`apply: {tags:}`** (2026-06-14): M3's `base/tasks/main.yml` tagged the ssh/fail2ban
`include_tasks` with `hardening`, but `make deploy … TAGS=hardening` ran *nothing*
(`ok=3 changed=0`) — a tag on a dynamic include selects the include, not its contents.
Fix: `include_tasks: {file: x.yml, apply: {tags: [hardening]}}`. The same latent bug sat
in the firewall include (never hit — firewall was only ever run untagged). Also the
check-mode artifact: a `service`/handler for a not-yet-installed package fails in a
first-run `--check` → guard with `when: not ansible_check_mode`. Both caught only by the
**live `make check`/`deploy` on askari** — Molecule converges *untagged*, so it can't
catch tag-propagation. 3rd reinforcement (after M1 `item.values`, M2 TF
`required_providers`) that live execution catches what review + container tests miss.
→ when a role uses tags to apply concern-subsets, `apply:` is mandatory on its includes;
consider an ansible-lint/CI check that `make deploy … TAGS=<concern>` actually changes things.
- `[gotcha]` **Terraform child modules need their own `required_providers` for
non-hashicorp providers** (2026-06-14): `terraform init` for the `offsite` env failed —
the `hetzner_vm` module used `hcloud_*` resources with no `required_providers` block, so
TF inferred `hashicorp/hcloud` (nonexistent). The `proxmox_vm` module had the **identical
latent bug**, never caught because Proxmox TF was never `init`ed. Both the terraform-MCP
schema check and the final review subagent missed it; only `make tf-init/plan` on ubongo
caught it. Reinforces the M1 signal that **live/real execution catches what static review
can't** — now for Terraform. → always give a TF module its own `versions.tf` with
`required_providers`; treat "reviewed but never run" as a structural blind spot.
- `[gotcha]` **`item.values` in a loop sends the dict's `.values()` METHOD, not the
key** (2026-06-14): the `public_dns` role looped over records that have a `values:`
key and used `{{ item.values }}` in the `gandi_livedns` task. Jinja attribute access
resolved `item.values` to the built-in dict method, so Gandi received
`"<built-in method values of dict object at 0x...>"` as the live TXT value — corrupt
**and** non-idempotent (the address changes each run → always "changed"). The fix is
bracket-indexing: `item['values']` (same risk for any key named `keys`/`items`/`get`/
`update`/...). → convention: in loops, index loop-var keys with `item['key']`, never
`item.key`; consider an ansible-lint guard.
- `[gotcha]` **Gandi LiveDNS rejects RFC-7505 null-MX `0 .`** (2026-06-14): "invalid
format for MX record." Used "no MX + no apex A" + SPF `-all` + DMARC reject instead.
Minor, but worth a note for any future no-mail domain on Gandi.
- `[recurring]` **apply=false Molecule + data-only pytest leave a real gap for
API/templating roles** (2026-06-14): both the null-MX and the `item.values` bugs sailed
through the spec, BOTH review subagents, the pytest (validates the data file, not the
rendered template), and the Molecule scenario (`apply=false`, so the API tasks never
run) — only the **live `make check`/`deploy`** against the real Gandi API surfaced them.
For roles whose payload is "render data → external API call", the rendered template is
the thing that breaks, and nothing short of a real (or check-mode) API call exercises it.
→ for such roles, treat a check-mode run against the real API as a required gate, not an
optional final step; or build a render-only assertion that materializes the module args.
- `[recurring]` **Execution-mode menu asked AGAIN despite the 2026-06-10 "mechanical
fix"** (2026-06-14): at the M1 (`public_dns`) plan handoff I presented the "1.
Subagent-Driven / 2. Inline Execution — which approach?" menu and asked the user to
pick. The decisions ledger (2026-06-10) records this exact behaviour as CHANGE →
mechanical: *"Stop hook in `.claude/settings.json` blocks the turn if the menu appears
and tells me to proceed subagent-driven."* It did not fire — either the hook is absent
in this clone, its matcher doesn't match the wording the `writing-plans` skill actually
produces, or it isn't installed/active. The standing agreement is to **default straight
to subagent-driven without asking**. → verify the Stop hook exists and that its pattern
matches the real menu text (the skill scripts "Two execution options" / "Which
approach?"); if it relies on `.claude/settings.json` hooks that aren't active here,
that's the gap. 5th occurrence (06-05/06/09/10/14).
- `[friction]` **ADR-writing policy is unsettled** (2026-05-31): drafting an ADR, I
invented a Status header ("Proposed") on the fly because there's no documented
convention for how we write ADRs (status lifecycle, required sections). → TODO 10.2 —
decide a minimal ADR template / status convention.
- `[recurring]` **Brainstorming's "user reviews spec" gate fires despite a standing
agreement to skip it** (2026-06-10): writing the ADR-structure spec, I stopped to ask
the user to review the finished spec before writing the plan — the
`superpowers:brainstorming` skill scripts that gate. We had previously agreed I should
move directly from the Q/A to the implementation plan once the spec is written. Same
shape as the execution-mode-menu signal: an external skill's script conflicting with a
boma convention, where prose reminders don't hold. → consider a mechanical guard
(Stop-hook family) or a CLAUDE.md/skill-override note that suppresses the spec-review
gate.
- `[recurring]` **Subagent faithfulness self-reports can be wrong — controller must
diff** (2026-06-10): during the ADR-023 retroactive restructure, an implementer
subagent reported "0 substantive deletions, the See-also lines reappear verbatim" for
ADR-014, but it had actually dropped the cross-reference lines. Caught only by the
controller independently running `git show <sha> | grep '^-[^-]'`. For
faithfulness-critical edits delegated to subagents, the agent's own audit is not
sufficient evidence. → systematize a controller-side deletion-audit step (every `-`
line must be a classified, expected change) before accepting any "presentational-only"
restructure; consider a helper script.
- `[friction]` **ansible-lint `var-naming[no-role-prefix]` rejects the ADR-021/022
`access__*`/`backup__*` cross-role field names** (2026-06-14): building the first
service role's records (`reverse_proxy`), adding the ADR-mandated `access__*` /
`backup__*` data to `defaults/main.yml` failed lint — the rule requires every role var
to start with `<rolename>_`, and ansible-lint 24.x has **no per-prefix allowlist**. The
double-underscore `reverse_proxy__*` namespace passes (starts with `reverse_proxy_`),
but the deliberately shared `access__`/`backup__` names don't. Resolved with inline
`# noqa: var-naming[no-role-prefix]` per var (keeps the rule enforced elsewhere). This
**will recur in every service role**. → decide a project-wide policy before the next
service role: a documented `.ansible-lint` stance, a sanctioned noqa snippet baked into
the `make new-role` scaffold, or reconcile the convention. First collision because
`reverse_proxy` is the first built service role.
- `[gotcha]` **Molecule CAN exercise tag-propagation, but only with a tagged converge +
full-then-partial sequencing** (2026-06-14): closing part of the 2026-06-14 `apply:
{tags:}` signal ("Molecule converges untagged, so it can't catch tag-propagation"). Added
a second converge play (`include_role` with `apply: {tags: [config]}` + a fresh user)
and an assertion, then proved the fix with `molecule converge -- --tags config`. Caveat
learned the hard way: a partial-tag run on a **fresh** instance fails on cross-concern
deps (a `config` task needs `git`, installed by the `packages` concern), and untagged
pre_tasks (test-user creation) get filtered out — so the realistic test is **full
converge → partial re-run** (idempotent), and harness pre_tasks need `tags: [always]`.
→ adopt the tagged-converge-play pattern for any role with concern subsets; this is the
CI check the prior signal asked for, in Molecule rather than `make deploy`.
- `[recurring]` **ADRs claim cross-doc reconciliation they didn't actually perform**
(2026-06-14): ADR-024's Status + Consequences asserted "ADR-017 prose that mentioned
Traefik is updated to read Caddy" — but ADR-008/017/019 + CAPABILITIES still said
@ -151,6 +31,7 @@ _(append new raw signals here; the next kaizen review consumes them)_
its promised ripple edits don't). → candidate `repo-scan.py` check: when an ADR's text
asserts "X is updated to Y" / supersedes a named tool, flag remaining occurrences of the
old name (or verify the claimed edit landed) — the structural cousin of `stale-deferred`.
(KEEP-OPEN per the 2026-06-14 `/kaizen` run — it's its own build task.)
---
@ -158,6 +39,28 @@ _(append new raw signals here; the next kaizen review consumes them)_
Consumed signals and where their resolution now lives. Newest first.
### 2026-06-14
First `/kaizen` run (dogfood). 12 signals triaged; 11 consumed, 1 kept open (#13 above —
a `repo-scan.py` check is its own build). **Bias-to-remove note:** zero PARK/REMOVE — none
of the open signals were `[unused]` *tooling*; they were all knowledge/gotchas/process,
which migrate or archive (knowledge is never deleted).
| Signal (first seen) | Verdict | Resolution / where it lives now |
|---|---|---|
| Execution-mode menu asked AGAIN — 5× (06-05→06-14) | ALREADY-BUILT | The 06-10 mechanical guard (`.claude/hooks/guard-execution-mode-menu.sh`, wired in `.claude/settings.json`) is **verified firing** on the real writing-plans menu text (tested 06-14). The 06-14 miss was hook-activation timing (the known "hooks-need-restart" gotcha), not a matcher defect. |
| Brainstorming spec-review gate fires despite the standing agreement (06-10) | CHANGE → mechanical | Extended the same Stop hook with a tight second matcher (review + "the spec" + "before" + "implementation plan", or the literal "spec written and committed"); tested to block the gate and pass meta-discussion. Same external-skill-script-vs-convention family as the execution menu. |
| Subagent faithfulness self-reports can be wrong (06-10) | ACCEPTED | The mitigation — independent two-stage review where the reviewer is told "do not trust the report" and reads the actual diff — is now embodied in `superpowers:subagent-driven-development`, used for the `/kaizen` build itself. Revisit if it recurs. |
| ADR-writing policy unsettled (05-31) | ALREADY-BUILT | ADR-023 (ADR structure & lifecycle) + `docs/decisions/adr-template.md` settle status/sections — both postdate this signal. |
| Hetzner 403 / caddy-dns DNS-01 didn't issue (06-14) | ALREADY-BUILT | ADR-024's revised Status records the HTTP-01 decision, the DNS-01 deferral to Phase 2, and the Hetzner-build + plugin blocks. |
| `apply:{tags}` not propagated by dynamic `include_tasks` (06-14) | SYSTEMATIZE | → `docs/testing/gotchas.md` — "Tags on dynamic `include_tasks` need `apply:`". |
| Molecule CAN test tag-propagation, via a tagged converge (06-14) | SYSTEMATIZE | → `docs/testing/gotchas.md` — "Testing concern-tag isolation in Molecule". |
| apply=false Molecule + data-pytest gap for API/templating roles (06-14) | SYSTEMATIZE | → `docs/testing/gotchas.md` — "API / templating roles: render-only tests miss the real call". |
| `item.values` in a loop sends the dict method, not the key (06-14) | SYSTEMATIZE | → CLAUDE.md Ansible conventions ("index loop-var keys with `item['key']`, never `item.key`"). |
| TF child modules need their own `required_providers` (06-14) | SYSTEMATIZE | → CLAUDE.md Terraform conventions ("every module declares its own `required_providers` in `versions.tf`"). |
| ansible-lint `var-naming` rejects `access__`/`backup__` cross-role names (06-14) | SYSTEMATIZE | → `make new-role` scaffolds a noqa reminder in `defaults/main.yml`; ADR-004's service-role section documents the convention; `roles/reverse_proxy/defaults/main.yml` is the reference. |
| Gandi rejects RFC-7505 null-MX `0 .` (06-14) | MIGRATE | → `roles/public_dns/README.md` Notes (no MX + SPF `-all` + DMARC reject for a no-mail domain). |
### 2026-06-10
| Signal (first seen) | Verdict | Resolution / where it lives now |
@ -172,6 +75,7 @@ Consumed signals and where their resolution now lives. Newest first.
| hooks-need-restart, pre-commit stashes unstaged, `rbw sync` stale cache, zsh word-split (05-30) | MIGRATE | → `docs/runbooks/claude-code-setup.md` "Environment gotchas". |
| `finishing-a-development-branch` offers open-a-PR vs our trunk-based merge (06-01) | accepted | Same root cause as the menu ask (external skill script vs boma convention). CLAUDE.md already mandates trunk-based merge-to-main; covered by the Stop-hook family + awareness. Revisit if it recurs. |
**Process note:** the `/retro` tool (TODO 11) still isn't built, so this review was
manual. Curating by hand (migrate durable knowledge → docs, archive consumed signals →
this ledger) worked well; fold that curation step into `/retro` when it's built.
**Process note:** the 2026-06-10 review was manual (the `/retro`/`/kaizen` tool wasn't
built). The 2026-06-14 block was the **first run of `/kaizen`** itself
(`scripts/friction-scan.py` Phase 0 + `.claude/commands/kaizen.md`); the dogfood both
cleared the backlog and validated the command.

View file

@ -51,6 +51,13 @@ below). Each service role contains a standard set of files:
| `BACKUP.md` | Per-service backup record — see ADR-022 and `docs/backup/service-backup-template.md` (a stateless service declares `backup__state: false` with a reason) |
| `meta/main.yml`, `molecule/default/` | Metadata + Debian 13 test scenario |
The `access__*` (ADR-021) and `backup__*` (ADR-022) data in `defaults/main.yml` are
**cross-role conventions** — shared field names that deliberately do *not* carry the
`<rolename>__` prefix. ansible-lint's `var-naming[no-role-prefix]` has no per-prefix
allowlist, so each such line carries a trailing `# noqa: var-naming[no-role-prefix]` (the
rule stays enforced for genuinely role-scoped vars). `make new-role` scaffolds a reminder;
`roles/reverse_proxy/defaults/main.yml` is the reference.
### Standard deploy mechanics
Every service role's `tasks/main.yml` follows the same sequence, so all roles are

View file

@ -34,3 +34,39 @@ testing surprise is worth remembering past the session that hit it.
apply/safety paths Molecule can't exercise, validate out-of-band (a throwaway
`--privileged` container with its own netns) and treat a final adversarial review as
**mandatory, not optional**.
## Tags on dynamic `include_tasks` need `apply:` to reach the included tasks
- **A tag on a dynamic `include_tasks` selects the include statement, not its contents.**
Tagging `include_tasks: x.yml` with `concern` and running `--tags concern` runs
*nothing* (`ok=N changed=0`) unless the included tasks are independently tagged. Use
`include_tasks: {file: x.yml, apply: {tags: [concern]}}` to propagate the tag onto the
included tasks — **mandatory** whenever a role uses tags to apply concern-subsets
(`roles/base/tasks/main.yml` and `roles/dev_env/tasks/main.yml` are the references).
- **Molecule converges *untagged*, so it cannot catch this by default** — the bug only
shows under `make deploy … TAGS=<concern>` on a real host (first hit live on askari, M3).
See the tag-isolation pattern below to catch it in Molecule instead.
- **Check-mode artifact:** a `service`/handler for a not-yet-installed package fails in a
first-run `--check`; guard with `when: not ansible_check_mode`.
## Testing concern-tag isolation in Molecule
- To catch the tag-propagation bug above *in Molecule*, add a **second converge play**
that applies one concern to a fresh target — `include_role` with `apply: {tags: [config]}`
— plus a `verify` assertion that the concern's effect landed. Drive the real partial
path with `molecule converge -- --tags config`.
- **Sequence matters:** a partial-tag run on a *fresh* instance fails on cross-concern
deps (a `config` task may need a binary the `packages` concern installs). The realistic
test is **full converge → partial `--tags` re-run** (idempotent). Harness `pre_tasks`
(e.g. test-user creation) must be tagged `always`, or `--tags` filters them out.
(Pattern proven on `dev_env`, 2026-06-14.)
## API / templating roles: render-only tests miss the real call
- For a role whose payload is "render data → external API call" (e.g. `public_dns`
Gandi LiveDNS), `apply=false` Molecule + data-only pytest exercise the *data file*, not
the *rendered module args* — so corrupt-template and API-rejection bugs (`item.values`
resolving to a dict method; Gandi rejecting RFC-7505 null-MX `0 .`) sail through both,
plus review. Only a real (or `--check`) call against the API surfaces them.
- → Treat a **check-mode run against the real API as a required gate** for such roles, or
build a render-only assertion that materializes and inspects the rendered module args.

View file

@ -28,3 +28,7 @@ Everything else is reached over LAN/mesh and never appears here.
The zone is reconciled **additively** plus an explicit `absent` list (Gandi seeds 13
default records on a new `.me`; we purge the unwanted 11 and overwrite MX/SPF with the
anti-spoof baseline). Full-zone authoritative pruning is a future enhancement (TODO 8.3).
**Gandi rejects RFC-7505 null-MX (`0 .`)** with "invalid format for MX record" — so a
no-mail domain can't use the standard null-MX. We instead **remove the MX entirely** (no
MX + no apex A = no mail) and rely on SPF `-all` + DMARC `reject` to prevent spoofing.