Dogfood of the new /kaizen command. 11 consumed, 1 kept open.
- SYSTEMATIZE → docs/testing/gotchas.md (apply:{tags} propagation, Molecule
tag-isolation testing, API/templating render-only gap); CLAUDE.md
(item['key'] loop convention, TF module required_providers); public_dns
README (Gandi null-MX workaround).
- CHANGE → extend the Stop hook to also guard the brainstorming spec-review gate
(verified: blocks the gate, passes meta-discussion).
- SYSTEMATIZE → make new-role scaffolds the access__/backup__ noqa reminder;
ADR-004 documents the cross-role-naming convention.
- ALREADY-BUILT/ACCEPTED → exec-menu guard verified firing; ADR-023; ADR-024;
subagent-faithfulness now embodied in the two-stage subagent review.
- KEEP-OPEN → a repo-scan.py check for ADRs that over-claim reconciliation.
Nudge: OVERDUE (13 signals) → ok (1). make lint + 16 friction-scan tests green.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Testing & Molecule gotchas
Durable, point-of-use knowledge for writing and running role tests (ADR-008).
Migrated from docs/FRICTION.md by the 2026-06-10 kaizen review. Append here when a
testing surprise is worth remembering past the session that hit it.
nftables / nft -c render checks
- `nft -c` rejects `iif "<name>"` when the interface is absent: `iif` resolves to an interface index at load time, so it fails in the Molecule container and would fail identically on any real host before the interface exists (e.g. `wt0` before NetBird is up). Use `iifname "<name>"` (string match, no existence requirement, survives the interface coming and going) for any interface that may be absent.
- The render-and-`nft -c` (no-apply) Molecule approach earns its keep: it caught the `iif`/`iifname` bug deterministically without touching the host kernel. Reuse this pattern (render template → static-check, never apply) for other config-rendering roles.
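The render-then-check pattern above could look roughly like the two tasks below. This is a minimal sketch, not the role's actual tasks: the template name `nftables.conf.j2` and the scratch path under `/tmp` are hypothetical, and it assumes the `nft` binary is present in the test container.

```yaml
# Sketch only: the template name and scratch path are hypothetical.
- name: Render the ruleset to a scratch path, never the live config
  ansible.builtin.template:
    src: nftables.conf.j2
    dest: /tmp/nftables.rendered.conf
    mode: "0600"

- name: Static-check the rendered ruleset without loading it into the kernel
  ansible.builtin.command:
    cmd: nft -c -f /tmp/nftables.rendered.conf
  changed_when: false  # a check never changes the host
```

With a rule written as `iifname "wt0" accept` the check should pass whether or not `wt0` exists; the same rule written with `iif` is the case the first bullet describes as failing.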
Molecule (community.docker)
- Molecule's `community.docker` connection uses `ansible_host` as the container name (`remote_addr`). Setting `ansible_host` as data in a scenario's `host_vars` (e.g. to give a resolver a fake IP) breaks the connection → `UNREACHABLE` / "Failed to create temporary directory". Don't override `ansible_host` in Molecule; feed fixture IPs another way (keep fixtures to zone sources and unit-test IP resolution).
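As an illustration of keeping fixture data out of `ansible_host`, a scenario's `molecule.yml` might carry the fake IP under a differently named variable. The platform name, image, and the variable `resolver_fixture_ipv4` are all hypothetical; the doc's own advice (zone-source fixtures plus unit-tested IP resolution) remains the preferred route.

```yaml
# molecule.yml fragment; platform name, image, and fixture variable are hypothetical.
platforms:
  - name: instance
    image: debian:bookworm
provisioner:
  name: ansible
  inventory:
    host_vars:
      instance:
        # Avoid: ansible_host: 192.0.2.10
        # The docker connection would then look for a container named "192.0.2.10".
        resolver_fixture_ipv4: 192.0.2.10
```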
Coverage blind spot: apply-only task paths
- Apply-only task paths have no Level-1 coverage, so safety bugs hide there. Example: an `nft` auto-rollback snapshot used a bare `nft list ruleset` (no leading `flush ruleset`), so the revert was a silent no-op on first apply and errored on later ones; the whole safety net was dead. Molecule never runs the apply (gated off), so only adversarial review + an isolated-netns round-trip test caught it. → For apply/safety paths Molecule can't exercise, validate out-of-band (a throwaway `--privileged` container with its own netns) and treat a final adversarial review as mandatory, not optional.
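A snapshot that actually restores needs the `flush ruleset` line in front of the dump, so that loading it replaces the live ruleset instead of layering on top of it. The tasks below are a sketch under assumptions: the snapshot path is hypothetical, and the role's real rollback mechanism (timer, trigger, cleanup) is not shown.

```yaml
# Sketch only: the snapshot path is hypothetical.
- name: Snapshot the live ruleset in a form that restores cleanly
  ansible.builtin.shell:
    cmd: |
      {
        echo 'flush ruleset'   # without this line the revert cannot remove new rules
        nft list ruleset
      } > /run/nftables.rollback.nft
  changed_when: false

- name: Revert to the snapshot (the path only an apply run exercises)
  ansible.builtin.command:
    cmd: nft -f /run/nftables.rollback.nft
```

Because Molecule never reaches the second task, a round trip like this is the sort of thing to run in the throwaway `--privileged` container mentioned above.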
Tags on dynamic include_tasks need apply: to reach the included tasks
- A tag on a dynamic `include_tasks` selects the include statement, not its contents. Tagging `include_tasks: x.yml` with `concern` and running `--tags concern` runs nothing (`ok=N changed=0`) unless the included tasks are independently tagged. Use `include_tasks: {file: x.yml, apply: {tags: [concern]}}` to propagate the tag onto the included tasks; this is mandatory whenever a role uses tags to apply concern-subsets (`roles/base/tasks/main.yml` and `roles/dev_env/tasks/main.yml` are the references).
- Molecule converges untagged, so it cannot catch this by default; the bug only shows under `make deploy … TAGS=<concern>` on a real host (first hit live on askari, M3). See the tag-isolation pattern below to catch it in Molecule instead.
- Check-mode artifact: a `service`/handler for a not-yet-installed package fails in a first-run `--check`; guard with `when: not ansible_check_mode`.
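Written out in block form, the two fixes from this section could look like the tasks below. The file name `x.yml` and the tag `concern` come from the text above; the service name `example` is a placeholder.

```yaml
- name: Include the concern's tasks and push the tag onto them
  ansible.builtin.include_tasks:
    file: x.yml
    apply:
      tags: [concern]   # propagated onto every task inside x.yml
  tags: [concern]       # selects the include statement itself

- name: Start the service (skipped in check mode, where the package may not exist yet)
  ansible.builtin.service:
    name: example       # placeholder service name
    state: started
  when: not ansible_check_mode
```

The tag appears twice on purpose: the outer `tags` gets the include selected under `--tags concern`, and `apply.tags` is what reaches the included tasks.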
Testing concern-tag isolation in Molecule
- To catch the tag-propagation bug above in Molecule, add a second converge play that applies one concern to a fresh target (`include_role` with `apply: {tags: [config]}`) plus a `verify` assertion that the concern's effect landed. Drive the real partial path with `molecule converge -- --tags config`.
- Sequence matters: a partial-tag run on a fresh instance fails on cross-concern deps (a `config` task may need a binary the `packages` concern installs). The realistic test is full converge → partial `--tags` re-run (idempotent). Harness `pre_tasks` (e.g. test-user creation) must be tagged `always`, or `--tags` filters them out. (Pattern proven on `dev_env`, 2026-06-14.)
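One possible layout for such a converge play, following the wording above, is sketched here. The play name and the test user are hypothetical, and the real `dev_env` scenario may be structured differently.

```yaml
# converge.yml fragment; play name and user name are hypothetical.
- name: Converge a single concern
  hosts: all
  pre_tasks:
    - name: Create the test user the role expects
      ansible.builtin.user:
        name: testuser
      tags: [always]      # otherwise --tags config filters the harness out
  tasks:
    - name: Apply only the config concern
      ansible.builtin.include_role:
        name: dev_env
        apply:
          tags: [config]
      tags: [config]      # lets --tags config select the include itself
```

Run a full `molecule converge` first, then `molecule converge -- --tags config`, so the partial run finds whatever the `packages` concern installs.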
API / templating roles: render-only tests miss the real call
- For a role whose payload is "render data → external API call" (e.g. `public_dns` → Gandi LiveDNS), `apply=false` Molecule + data-only pytest exercise the data file, not the rendered module args, so corrupt-template and API-rejection bugs (`item.values` resolving to a dict method; Gandi rejecting RFC-7505 null-MX `0 .`) sail through both, plus review. Only a real (or `--check`) call against the API surfaces them.
- → Treat a check-mode run against the real API as a required gate for such roles, or build a render-only assertion that materializes and inspects the rendered module args.
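A render-only assertion of the kind suggested above might be sketched as follows. The variable `public_dns_records` and its record shape (`name`, `type`, `values`) are assumptions for illustration, not the role's actual data model; the check is only meaningful if the real API task templates its arguments with the same expressions.

```yaml
# Sketch only: public_dns_records and its keys are hypothetical.
- name: Materialize the module arguments into a fact instead of calling the API
  ansible.builtin.set_fact:
    rendered_args: >-
      {{ rendered_args | default([]) + [{
           'record': item['name'],
           'type': item['type'],
           'values': item['values']
         }] }}
  loop: "{{ public_dns_records }}"

- name: Assert every rendered value list is a plain list the API can accept
  ansible.builtin.assert:
    that:
      - item['values'] is not callable   # catches item.values resolving to the dict method
      - item['values'] is sequence
      - item['values'] is not string
      - item['values'] is not mapping
      - not (item['type'] == 'MX' and '0 .' in item['values'])   # null MX, rejected by Gandi per the note above
  loop: "{{ rendered_args }}"
```

Subscript access (`item['values']`) matters here: attribute access on a dict finds the built-in `values` method before the key of the same name.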