boma/docs/testing/gotchas.md
sjat 91713127cb docs(kaizen): migrate gotchas to docs; curate FRICTION log (2026-06-10 review)
- New docs/testing/gotchas.md (nft iif/iifname, Molecule ansible_host,
  apply-path coverage blind spot, render-nft-c pattern); pointer from ADR-008.
- claude-code-setup.md gains "Environment gotchas" (hooks-need-restart,
  pre-commit stashes unstaged, rbw sync cache, zsh word-split).
- FRICTION.md restructured into Open signals + a decisions ledger; consumed
  signals archived with where their resolution now lives.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-10 12:51:39 +02:00


Testing & Molecule gotchas

Durable, point-of-use knowledge for writing and running role tests (ADR-008). Migrated from docs/FRICTION.md by the 2026-06-10 kaizen review. Append here when a testing surprise is worth remembering past the session that hit it.

nftables / nft -c render checks

  • nft -c rejects iif "<name>" when the interface is absent. iif resolves to an interface index at load time, so it fails in the Molecule container and would fail identically on any real host before the interface exists (e.g. wt0 before NetBird is up). Use iifname "<name>" (string match, no existence requirement, survives the interface coming and going) for any interface that may be absent.
  • The render-and-nft -c (no-apply) Molecule approach earns its keep — it caught the iif/iifname bug deterministically without touching the host kernel. Reuse this pattern (render template → static-check, never apply) for other config-rendering roles.
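The render-then-static-check pattern above can be sketched as a small helper. This is an illustrative sketch, not the role's actual test code: the function names `find_iif_names` and `nft_check` are hypothetical, and the regex is a heuristic that only looks for the plain `iif <name>` form (it does not parse sets or negations).

```python
import re
import shutil
import subprocess

# Heuristic: matches `iif "name"` or `iif name`, but not `iifname ...`,
# because `iif` must be followed directly by whitespace.
IIF_BY_INDEX = re.compile(r'\biif\s+"?([A-Za-z][\w.-]*)"?')


def find_iif_names(ruleset_text: str) -> list[tuple[int, str]]:
    """Return (line_number, interface_name) for each index-based iif match."""
    hits = []
    for lineno, line in enumerate(ruleset_text.splitlines(), start=1):
        code = line.split("#", 1)[0]  # crude: drop everything after a '#'
        for match in IIF_BY_INDEX.finditer(code):
            hits.append((lineno, match.group(1)))
    return hits


def nft_check(path: str) -> bool:
    """Run `nft -c -f <path>`: parse and validate only, never apply."""
    if shutil.which("nft") is None:
        raise RuntimeError("nft binary not found")
    result = subprocess.run(
        ["nft", "-c", "-f", path], capture_output=True, text=True
    )
    return result.returncode == 0
```

`find_iif_names` gives a fast pre-check that needs no nft binary at all; `nft_check` is the actual render check and still depends on which interfaces exist where it runs, which is exactly why `iifname` is the safer choice for interfaces like wt0.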

Molecule (community.docker)

  • Molecule's community.docker connection uses ansible_host as the container name (remote_addr). Setting ansible_host as data in a scenario's host_vars (e.g. to give a resolver a fake IP) breaks the connection → UNREACHABLE / "Failed to create temporary directory". Don't override ansible_host in Molecule; feed fixture IPs another way (keep fixtures to zone sources and unit-test IP resolution).
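One way to keep this from regressing is a cheap guard that fails when any YAML file under a molecule directory sets ansible_host. The sketch below is an assumption-laden illustration: the helper name `find_ansible_host_overrides` is hypothetical, it assumes scenarios live under a directory literally named `molecule`, and it uses a line-based regex rather than a YAML parser.

```python
import re
from pathlib import Path

# Matches a YAML key named ansible_host at any indentation level.
ANSIBLE_HOST_KEY = re.compile(r"^\s*ansible_host\s*:", re.MULTILINE)


def find_ansible_host_overrides(root: Path) -> list[Path]:
    """Return YAML files below a `molecule` directory that set ansible_host."""
    offenders = []
    for path in sorted(root.rglob("*")):
        if not path.is_file() or path.suffix not in {".yml", ".yaml"}:
            continue
        # Only look at files inside a directory named `molecule`
        # (assumed layout; adjust to the repository's real structure).
        if "molecule" not in path.relative_to(root).parts:
            continue
        if ANSIBLE_HOST_KEY.search(path.read_text(encoding="utf-8")):
            offenders.append(path)
    return offenders
```

Running it from the repository root, e.g. `find_ansible_host_overrides(Path("."))`, and asserting the result is empty would flag the override before Molecule reports UNREACHABLE.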

Coverage blind spot: apply-only task paths

  • Apply-only task paths have no Level-1 coverage, so safety bugs hide there. Example: an nft auto-rollback snapshot used a bare nft list ruleset (no leading flush ruleset), so the revert was a silent no-op on first apply and errored on later ones — the whole safety net was dead. Molecule never runs the apply (gated off), so only adversarial review + an isolated-netns round-trip test caught it. → For apply/safety paths Molecule can't exercise, validate out-of-band (a throwaway --privileged container with its own netns) and treat a final adversarial review as mandatory, not optional.
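The snapshot fix described above amounts to prefixing the dump with `flush ruleset`, so that loading the file with `nft -f` replaces the live state instead of adding to it. A minimal sketch, assuming hypothetical helper names (`make_restorable`, `take_snapshot`) rather than the role's real tasks:

```python
import shutil
import subprocess


def make_restorable(list_ruleset_output: str) -> str:
    """Prefix a `nft list ruleset` dump with `flush ruleset`."""
    body = list_ruleset_output.lstrip()
    if body.startswith("flush ruleset"):
        return body  # already restorable; do not double-prefix
    return "flush ruleset\n" + body


def take_snapshot(path: str) -> None:
    """Write a restorable snapshot of the live ruleset (needs root and nft)."""
    if shutil.which("nft") is None:
        raise RuntimeError("nft binary not found")
    dump = subprocess.run(
        ["nft", "list", "ruleset"], check=True, capture_output=True, text=True
    ).stdout
    with open(path, "w", encoding="utf-8") as handle:
        handle.write(make_restorable(dump))
```

With the prefix in place, even an empty dump yields a file that flushes back to an empty ruleset, so the revert does something on first apply. `take_snapshot` touches the live kernel ruleset, so it belongs in the out-of-band netns test rather than in a Molecule run.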