docs(friction): record host-nftables build gotchas (iif/iifname, molecule ansible_host, venv PATH, apply-path coverage)

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
sjat 2026-06-06 19:16:21 +02:00
parent 402913efb3
commit fcfb056591

@@ -94,3 +94,33 @@ earning its keep.
is redundant friction. → After `writing-plans` finishes, begin subagent-driven
implementation directly. The only reason to stop is a genuine blocker or ambiguity, not
a routine checkpoint.
### Host nftables firewall build (`base` role)
- `[gotcha]` **`nft -c` rejects `iif "<name>"` when the interface is absent** (it resolves
to an interface *index* at load time). The render+syntax-check Molecule step caught
`iif "wt0"` failing in the container — and it would fail identically on any real host
before NetBird brings up `wt0`. Use **`iifname "<name>"`** (string match, no existence
requirement, survives the interface coming/going) for any interface that may be absent.
- `[gotcha]` **Molecule's `community.docker` connection uses `ansible_host` as the
container name** (`remote_addr`). Setting `ansible_host` as *data* in a scenario's
`host_vars` (e.g. to give a resolver a fake IP) breaks the connection → `UNREACHABLE`,
"Failed to create temporary directory". Don't override `ansible_host` in molecule; feed
fixture IPs another way (or keep fixtures to zone sources and unit-test IP resolution).
- `[recurring]` **`make test ROLE=<r>` needs the venv on PATH.** When run without
activating the venv (as agents do), molecule dies with
`FileNotFoundError: 'ansible-config'` because it shells out to
`ansible-config`/`ansible-playbook` by bare name. Workaround: `PATH="$PWD/.venv/bin:$PATH"
.venv/bin/molecule test`. Also the molecule image wasn't in the Forgejo registry (pull →
"not found"); had to `make molecule-image` to build it locally. → Consider (a) the
Makefile `test` target prepending `.venv/bin` to PATH, and (b) `make molecule-image-push`
so a fresh checkout can pull it.
- `[gotcha]` **Apply-only task paths have no Level-1 coverage**, so safety bugs hide there.
The `nft` auto-rollback snapshot used a bare `nft list ruleset` (no leading `flush
ruleset`) → the revert was a silent no-op on first apply and errored on later ones; the
whole safety net was dead. Molecule never runs the apply (gated off), so only adversarial
review + an isolated-netns round-trip test caught it. → For apply/safety paths molecule
can't exercise, validate out-of-band (a throwaway `--privileged` container with its own
netns) and treat a final adversarial review as mandatory, not optional.
- `[note]` The render-and-`nft -c` (no-apply) Molecule approach **earned its keep**: it
caught the `iif`/`iifname` bug deterministically without touching the host kernel. Good
pattern to reuse for other config-rendering roles.
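
As a rough illustration of the `iif`/`iifname` gotcha above, the sketch below renders a tiny ruleset that matches by interface name and pipes it through `nft -c` (check only, nothing is applied). The table name `example_filter`, both helper names, and the ruleset itself are hypothetical and not taken from the `base` role; `wt0` is the NetBird interface mentioned above.

```sh
# Print a minimal ruleset that matches the input interface by NAME.
# Hypothetical helper; $1 is the interface name (e.g. wt0).
render_ruleset() {
    iface=$1
    cat <<EOF
table inet example_filter {
    chain input {
        type filter hook input priority 0; policy accept;
        iifname "$iface" accept
    }
}
EOF
}

# Syntax-check only: -c validates without applying, "-f -" reads from stdin.
check_ruleset() {
    render_ruleset "$1" | nft -c -f -
}
```

Swapping `iifname` for `iif` in the heredoc should reproduce the failure on a machine without `wt0`, which is a cheap way to confirm the check really exercises the rule.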
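
For the rollback bug above, a minimal sketch of a snapshot that actually reverts: prefix the dump with `flush ruleset` so that loading it replaces the live ruleset instead of adding to it. The function names are hypothetical, and the real role's task layout may differ.

```sh
# Write a snapshot that, when loaded, restores exactly the current ruleset.
# Hypothetical helper; $1 is the snapshot file to create.
write_rollback_snapshot() {
    snapshot=$1
    {
        # Without this line, loading the snapshot only adds to whatever is live.
        echo 'flush ruleset'
        nft list ruleset
    } > "$snapshot"
}

# Revert by loading the snapshot; nft applies a file as one transaction.
restore_rollback_snapshot() {
    nft -f "$1"
}
```

The intended call pattern would be `write_rollback_snapshot <file>` before the apply and `restore_rollback_snapshot <file>` on failure. Both need root, so they are best tried in the throwaway netns container first.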