boma/docs/reviews/2026-06-05-review.md
sjat 3dd03d4198 review-repo: 2026-06-05 report (4 auto-fixed, 12 open)
Stale-deferred check exercised: 6 open-deferred-items all confirmed genuinely
open, 0 stale-deferred. Top open: thread ADR-017 VERIFY.md convention through
ADR-004/new-role/README; name the askari inventory group (ADR-016).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-05 18:24:39 +02:00

7.6 KiB
Raw Permalink Blame History

Repo review — 2026-06-05

  • Reviewed commit: f566fd1 (scan); auto-fixes landed in 666ad42
  • Mode: on-demand (interactive)
  • Scope: whole repo — 2 roles, 17 ADRs, 4 runbooks, 7 scripts; doc-heavy
  • Prior run: 2026-05-30 (de38d1c) — 7 auto-fixed, 17 open

Summary

high medium low total
Auto-fixed 2 0 2 4
Open (report-only) 0 5 7 12

This review followed a session of heavy documentation work (ADR-015 ubongo, ADR-016 NetBird mesh, ADR-017 Level-4 verification). Most findings are propagation gaps — a new decision landed but an older doc still reflects the prior design.

New deferral check exercised. repo-scan.py now enumerates open ADR "Deferred/ Open" items and flags any another file calls resolved-but-unmarked. This run: 6 open-deferred-items surfaced, all confirmed genuinely open by the cross-cutting reviewer (ADR-011 #15, ADR-015 #3), 0 stale-deferred. The check produced no false resolutions and the judgement layer agreed — working as designed.

Auto-fixes applied (666ad42)

id dim sev location fix
AF1 consistency high docs/decisions/005-bootstrapping.md:36, docs/runbooks/new-host.md:62,71 Removed "Terraform writes the host's DNS A record" — contradicts ADR-009 (the dns role owns the zone). Recurring: the 2026-05-30 run fixed the same contradiction in README/ADR-003; it reappeared in two more files.
AF2 consistency high docs/decisions/005-bootstrapping.md:8 Control node described as cloned from the cloud-init template; ADR-015 makes ubongo a physical box installed directly. Corrected.
AF3 consistency low CLAUDE.md:197 Added the missing docs/testing/service-verify-template.md row to Further reading (parallels the security-template row).
AF4 cruft low docs/TODO.md:79 Typos: "we we" → "we"; "seperate" → "separate".

Open findings (report-only)

VERIFY.md propagation cluster (ADR-017 not fully threaded through)

id sev location finding suggested fix
O1 medium docs/decisions/004-docker-model.md (file table) The service-role standard lists SECURITY.md but not VERIFY.md, though ADR-017 + CLAUDE.md:85 now mandate it. Add a VERIFY.md row to ADR-004's file table.
O2 medium docs/runbooks/new-role.md (step 9 → Commit) No step to write VERIFY.md for service roles (only SECURITY.md). Makes STATUS.md:17 ("runbooks current and mutually reconciled") slightly overstated. Add a "write the per-service verification spec" step mirroring the SECURITY.md step.
O3 low README.md:58-60, 94 ADR list stops at 001009 (010017 absent); the docs/ tree omits security/, testing/, hardware/. Extend the ADR list (or point to docs/decisions/ + CLAUDE.md's table); expand the docs/ subtree.

Design gaps from the recent ADRs

id sev location finding suggested fix
O4 medium CLAUDE.md:106, docs/decisions/009-provisioning-handoff.md:78, scripts/tf_to_inventory.py:24 ADR-016 says "askari is Ansible-managed — its own inventory group", but no group is named anywhere; host-groups list + valid-groups set don't include it. Decide the group name (e.g. edge_hosts/hetzner_hosts), add to CLAUDE.md host groups + ADR-009 valid groups. (askari is manual like the control node, so tf_to_inventory.py need not generate it, but the group must be valid.)
O5 medium docs/decisions/006-terraform.md:78 backend.tf labelled "Forgejo state backend", contradicting ADR-006's own State-backend section (local state on ubongo; Forgejo's API is read-only). Relabel to "local state backend (no remote backend)".
O6 medium docs/decisions/014-knowledge-sourcing.md:88 Plugin-reproducibility described as open ("tracked in docs/TODO.md"), but TODO 10.7 is marked DONE (settings.json declares the plugin set; claude-code-setup.md covers bootstrap). Update to reflect the resolved state; drop the forward-pointer.

Clarity / lower-priority consistency

id sev location finding suggested fix
O7 low docs/decisions/011-update-management.md:128 "Digest-pinning the stateful tier" sits in the ruled-out table, but Decision #2 adopts tag@digest for stateful (TODO 16 confirms). ADR-011 is still Proposed/draft. Remove/replace the ruled-out row when accepting ADR-011 (TODO 16).
O8 low docs/decisions/003-toolchain.md:85, docs/decisions/010-forgejo-ci.md:66 "act_runner on the control node or a dedicated runner VM" reads ambiguously against ADR-015 (no cluster control VM). Not wrong (a runner VM is a separate option) but worth disambiguating. Name ubongo as the runner host; cross-ref ADR-015; keep "dedicated runner VM" as an explicit future option.
O9 low docs/decisions/008-testing.md:148 The "WireGuard tunnel establishment" Molecule-exclusion row is framed for the retired OPNsense VLAN-99 WireGuard; NetBird still uses WireGuard (wt0) as its data plane. Reframe the row to the NetBird wt0 data-plane (ADR-016).
O10 low docs/decisions/011-update-management.md:67 Cross-references "the scheduled_jobs plan and ADR-010"; ADR-010 is Forgejo CI, not scheduled jobs (that's TODO 8.3, unbuilt). Point to TODO 8.3 instead.
O11 low docs/CAPABILITIES.md §10 No row for the /verify-service (Level 4) capability though ADR-017 decided it. Add an Operations row for /verify-service.
O12 low docs/TODO.md:30 (item 3.10) Garbled text ("maybe something in the improvements of the methods in boma moods the point?") — unfollowable. Rewrite the question clearly or strike it.

Deterministic-scan noise (not fixed — known limitations)

  • broken-path-ref ×14 — all illustrative/future paths: report-name templates (docs/testing/reviews/YYYY-MM-DD-<service>.md) and latest.md files not yet created. The path-ref check stops at the <placeholder> boundary, so a templated path registers as a partial broken ref. Potential scanner improvement: skip a path ref immediately followed by a placeholder char or a YYYY-MM-DD token.
  • marker ×35 — mostly prose references to TODO.md items, not code markers. Known noise; the regex already excludes TODO.md/alternations but not "TODO 8.2" prose.
  • open-deferred-item ×6 — all confirmed genuinely open (see above). 0 stale-deferred. New check healthy.

Diff vs prior run (2026-05-30)

  • Recurring: the Terraform-writes-DNS contradiction (AF1) — fixed in README/ADR-003 last run, reappeared in ADR-005/new-host.md. Signal that this phrasing keeps being copied; worth a /review-repo-time grep for "writes … DNS A record".
  • New: everything else — the repo gained ADR-010…017 and the ubongo/NetBird/ Level-4 work since the prior run, so most findings are fresh propagation gaps.
  • Resolved: prior-run open items were largely addressed during the intervening doc work (control-node-as-VM, WireGuard framing, etc., now mostly reconciled).

Follow-up prompt

Thread the ADR-017 VERIFY.md convention through the remaining docs (O1O3): add a VERIFY.md row to ADR-004's service-role file table, a VERIFY.md step to new-role.md (and reconcile STATUS.md:17), and refresh README.md's ADR list + docs/ tree. Then settle the askari inventory group name (O4) and propagate it to CLAUDE.md host-groups + ADR-009 valid-groups. Finally clear the stale labels O5 (ADR-006 backend.tf) and O6 (ADR-014 plugin reproducibility = DONE).