boma/docs/testing/gotchas.md
sjat 91713127cb docs(kaizen): migrate gotchas to docs; curate FRICTION log (2026-06-10 review)
- New docs/testing/gotchas.md (nft iif/iifname, Molecule ansible_host,
  apply-path coverage blind spot, render-nft-c pattern); pointer from ADR-008.
- claude-code-setup.md gains "Environment gotchas" (hooks-need-restart,
  pre-commit stashes unstaged, rbw sync cache, zsh word-split).
- FRICTION.md restructured into Open signals + a decisions ledger; consumed
  signals archived with where their resolution now lives.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-10 12:51:39 +02:00

# Testing & Molecule gotchas
Durable, point-of-use knowledge for writing and running role tests (ADR-008).
Migrated from `docs/FRICTION.md` by the 2026-06-10 kaizen review. Append here when a
testing surprise is worth remembering past the session that hit it.
## nftables / `nft -c` render checks
- **`nft -c` rejects `iif "<name>"` when the interface is absent** — `iif` resolves to
an interface *index* at load time, so it fails in the Molecule container and would
fail identically on any real host before the interface exists (e.g. `wt0` before
NetBird is up). Use **`iifname "<name>"`** (string match, no existence requirement,
survives the interface coming and going) for any interface that may be absent.
- **The render-and-`nft -c` (no-apply) Molecule approach earns its keep** — it caught
the `iif`/`iifname` bug deterministically without touching the host kernel. Reuse
this pattern (render template → static-check, never apply) for other config-rendering
roles.
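
As a rough illustration of the two points above, the sketch below pairs a cheap text pre-check with the real `nft -c` call. The function names are hypothetical and not part of any role here. The grep is only a heuristic: it does not replace `nft -c`, and it also flags `iif "lo"`, which is harmless because `lo` always exists.

```shell
# Hypothetical helpers; names are illustrative only.

# Heuristic pre-check: list lines that match an interface by index
# (iif/oif "<name>") instead of by name (iifname/oifname "<name>").
# Prints the offending lines; returns 0 if any were found, 1 if none.
find_index_matches() {
    grep -nE '(^|[^[:alnum:]_])(iif|oif)[[:space:]]+"' "$1"
}

# Static check of a rendered ruleset file: -c (--check) validates the
# commands without applying them. Requires nft to be installed.
check_rendered() {
    nft -c -f "$1"
}
```

A Molecule verify step could call `check_rendered` on the rendered template and fail the scenario on a non-zero exit status.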
## Molecule (`community.docker`)
- **Molecule's `community.docker` connection uses `ansible_host` as the container name**
(`remote_addr`). Setting `ansible_host` as *data* in a scenario's `host_vars` (e.g. to
give a resolver a fake IP) points the connection at a container that does not exist, and
the play fails with `UNREACHABLE` / "Failed to create temporary directory". Don't
override `ansible_host` in Molecule; feed fixture IPs another way (keep fixtures to zone
sources and unit-test IP resolution).
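
One way to keep this from creeping back in is a small guard that searches the scenario files for `ansible_host` assignments. This is a sketch under assumptions: the function name is hypothetical, and it assumes scenarios live under a `molecule/` directory.

```shell
# Hypothetical guard: list every line under a Molecule scenario tree that
# assigns ansible_host. Assumes scenarios live under molecule/ by default.
# Prints file:line:content for each hit; returns 0 if any were found.
find_ansible_host_overrides() {
    grep -rnE '^[[:space:]]*ansible_host[[:space:]]*:' "${1:-molecule}"
}
```

In CI this could be wrapped as `if find_ansible_host_overrides molecule; then exit 1; fi`, so that any override fails the job and the offending lines appear in the log.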
## Coverage blind spot: apply-only task paths
- **Apply-only task paths have no Level-1 coverage**, so safety bugs hide there. Example:
an `nft` auto-rollback snapshot used a bare `nft list ruleset` (no leading
`flush ruleset`). Because `nft -f` adds to the live ruleset unless the file flushes it
first, the revert was a silent no-op on first apply and errored on later ones: the whole
safety net was dead. Molecule never runs the apply (gated off), so only adversarial
review plus an isolated-netns round-trip test caught it. For apply/safety paths Molecule
can't exercise, validate out-of-band (a throwaway `--privileged` container with its own
netns) and treat a final adversarial review as **mandatory, not optional**.
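
The snapshot fix described above can be sketched as a filter that prepends `flush ruleset` to a ruleset dump, so that loading the file replaces the live ruleset instead of adding to it. The function name and the `/run/nft-rollback.nft` path are hypothetical; this is not the role's actual task, only a minimal illustration.

```shell
# Hypothetical snapshot helper: turns a ruleset dump (read from stdin) into
# a file that replaces the live ruleset when loaded with `nft -f`.
# Reading stdin keeps the helper testable without nft installed.
make_rollback_file() {
    printf 'flush ruleset\n'
    cat
}

# Assumed usage on a host (needs root; the path is illustrative):
#   nft list ruleset | make_rollback_file > /run/nft-rollback.nft
#   ... apply the new ruleset ...
#   nft -f /run/nft-rollback.nft    # revert to the snapshot
```

With an empty dump (the first-apply case) the file still contains `flush ruleset`, so the revert clears whatever the failed apply loaded. A round-trip of snapshot, apply, and revert can be run in a fresh network namespace, for example under `unshare -n` as root or inside the throwaway `--privileged` container mentioned above.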