boma/docs/reviews/2026-06-05-review.md
sjat 3dd03d4198 review-repo: 2026-06-05 report (4 auto-fixed, 12 open)
Stale-deferred check exercised: 6 open-deferred-items all confirmed genuinely
open, 0 stale-deferred. Top open: thread ADR-017 VERIFY.md convention through
ADR-004/new-role/README; name the askari inventory group (ADR-016).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-05 18:24:39 +02:00

93 lines
7.6 KiB
Markdown
Raw Blame History

This file contains ambiguous Unicode characters

This file contains Unicode characters that might be confused with other characters. If you think that this is intentional, you can safely ignore this warning. Use the Escape button to reveal them.

# Repo review — 2026-06-05
- **Reviewed commit:** `f566fd1` (scan); auto-fixes landed in `666ad42`
- **Mode:** on-demand (interactive)
- **Scope:** whole repo — 2 roles, 17 ADRs, 4 runbooks, 7 scripts; doc-heavy
- **Prior run:** 2026-05-30 (`de38d1c`) — 7 auto-fixed, 17 open
## Summary
| | high | medium | low | total |
|---|---|---|---|---|
| Auto-fixed | 2 | 0 | 2 | 4 |
| Open (report-only) | 0 | 5 | 7 | 12 |
This review followed a session of heavy documentation work (ADR-015 `ubongo`,
ADR-016 NetBird mesh, ADR-017 Level-4 verification). Most findings are **propagation
gaps** — a new decision landed but an older doc still reflects the prior design.
**New deferral check exercised.** `repo-scan.py` now enumerates open ADR "Deferred/
Open" items and flags any another file calls resolved-but-unmarked. This run: 6
open-deferred-items surfaced, **all confirmed genuinely open** by the cross-cutting
reviewer (ADR-011 #15, ADR-015 #3), **0 stale-deferred**. The check produced no false
resolutions and the judgement layer agreed — working as designed.
## Auto-fixes applied (`666ad42`)
| id | dim | sev | location | fix |
|---|---|---|---|---|
| AF1 | consistency | high | `docs/decisions/005-bootstrapping.md:36`, `docs/runbooks/new-host.md:62,71` | Removed "Terraform writes the host's DNS A record" — contradicts ADR-009 (the `dns` role owns the zone). **Recurring**: the 2026-05-30 run fixed the same contradiction in README/ADR-003; it reappeared in two more files. |
| AF2 | consistency | high | `docs/decisions/005-bootstrapping.md:8` | Control node described as cloned from the cloud-init template; ADR-015 makes `ubongo` a physical box installed directly. Corrected. |
| AF3 | consistency | low | `CLAUDE.md:197` | Added the missing `docs/testing/service-verify-template.md` row to Further reading (parallels the security-template row). |
| AF4 | cruft | low | `docs/TODO.md:79` | Typos: "we we" → "we"; "seperate" → "separate". |
## Open findings (report-only)
### VERIFY.md propagation cluster (ADR-017 not fully threaded through)
| id | sev | location | finding | suggested fix |
|---|---|---|---|---|
| O1 | medium | `docs/decisions/004-docker-model.md` (file table) | The service-role standard lists `SECURITY.md` but not `VERIFY.md`, though ADR-017 + CLAUDE.md:85 now mandate it. | Add a `VERIFY.md` row to ADR-004's file table. |
| O2 | medium | `docs/runbooks/new-role.md` (step 9 → Commit) | No step to write `VERIFY.md` for service roles (only `SECURITY.md`). Makes `STATUS.md:17` ("runbooks current and mutually reconciled") slightly overstated. | Add a "write the per-service verification spec" step mirroring the SECURITY.md step. |
| O3 | low | `README.md:58-60, 94` | ADR list stops at 001009 (010017 absent); the `docs/` tree omits `security/`, `testing/`, `hardware/`. | Extend the ADR list (or point to `docs/decisions/` + CLAUDE.md's table); expand the `docs/` subtree. |
### Design gaps from the recent ADRs
| id | sev | location | finding | suggested fix |
|---|---|---|---|---|
| O4 | medium | `CLAUDE.md:106`, `docs/decisions/009-provisioning-handoff.md:78`, `scripts/tf_to_inventory.py:24` | ADR-016 says "`askari` is Ansible-managed — its own inventory group", but no group is named anywhere; host-groups list + valid-groups set don't include it. | Decide the group name (e.g. `edge_hosts`/`hetzner_hosts`), add to CLAUDE.md host groups + ADR-009 valid groups. (`askari` is manual like the control node, so `tf_to_inventory.py` need not generate it, but the group must be valid.) |
| O5 | medium | `docs/decisions/006-terraform.md:78` | `backend.tf` labelled "Forgejo state backend", contradicting ADR-006's own State-backend section (local state on `ubongo`; Forgejo's API is read-only). | Relabel to "local state backend (no remote backend)". |
| O6 | medium | `docs/decisions/014-knowledge-sourcing.md:88` | Plugin-reproducibility described as open ("tracked in `docs/TODO.md`"), but TODO 10.7 is marked DONE (settings.json declares the plugin set; claude-code-setup.md covers bootstrap). | Update to reflect the resolved state; drop the forward-pointer. |
### Clarity / lower-priority consistency
| id | sev | location | finding | suggested fix |
|---|---|---|---|---|
| O7 | low | `docs/decisions/011-update-management.md:128` | "Digest-pinning the stateful tier" sits in the ruled-out table, but Decision #2 *adopts* `tag@digest` for stateful (TODO 16 confirms). ADR-011 is still **Proposed/draft**. | Remove/replace the ruled-out row when accepting ADR-011 (TODO 16). |
| O8 | low | `docs/decisions/003-toolchain.md:85`, `docs/decisions/010-forgejo-ci.md:66` | "act_runner on the control node **or a dedicated runner VM**" reads ambiguously against ADR-015 (no cluster control VM). Not wrong (a runner VM is a separate option) but worth disambiguating. | Name `ubongo` as the runner host; cross-ref ADR-015; keep "dedicated runner VM" as an explicit future option. |
| O9 | low | `docs/decisions/008-testing.md:148` | The "WireGuard tunnel establishment" Molecule-exclusion row is framed for the retired OPNsense VLAN-99 WireGuard; NetBird still uses WireGuard (`wt0`) as its data plane. | Reframe the row to the NetBird `wt0` data-plane (ADR-016). |
| O10 | low | `docs/decisions/011-update-management.md:67` | Cross-references "the `scheduled_jobs` plan and ADR-010"; ADR-010 is Forgejo CI, not scheduled jobs (that's TODO 8.3, unbuilt). | Point to TODO 8.3 instead. |
| O11 | low | `docs/CAPABILITIES.md` §10 | No row for the `/verify-service` (Level 4) capability though ADR-017 decided it. | Add an Operations row for `/verify-service`. |
| O12 | low | `docs/TODO.md:30` (item 3.10) | Garbled text ("maybe something in the improvements of the methods in boma moods the point?") — unfollowable. | Rewrite the question clearly or strike it. |
### Deterministic-scan noise (not fixed — known limitations)
- **`broken-path-ref` ×14** — all illustrative/future paths: report-name templates
(`docs/testing/reviews/YYYY-MM-DD-<service>.md`) and `latest.md` files not yet
created. The path-ref check stops at the `<placeholder>` boundary, so a templated
path registers as a partial broken ref. *Potential scanner improvement: skip a path
ref immediately followed by a placeholder char or a `YYYY-MM-DD` token.*
- **`marker` ×35** — mostly prose references to `TODO.md` items, not code markers.
Known noise; the regex already excludes `TODO.md`/alternations but not "TODO 8.2"
prose.
- **`open-deferred-item` ×6** — all confirmed genuinely open (see above). `0`
stale-deferred. New check healthy.
## Diff vs prior run (2026-05-30)
- **Recurring:** the Terraform-writes-DNS contradiction (AF1) — fixed in README/ADR-003
last run, reappeared in ADR-005/new-host.md. Signal that this phrasing keeps being
copied; worth a `/review-repo`-time grep for "writes … DNS A record".
- **New:** everything else — the repo gained ADR-010…017 and the `ubongo`/NetBird/
Level-4 work since the prior run, so most findings are fresh propagation gaps.
- **Resolved:** prior-run open items were largely addressed during the intervening
doc work (control-node-as-VM, WireGuard framing, etc., now mostly reconciled).
## Follow-up prompt
> Thread the ADR-017 `VERIFY.md` convention through the remaining docs (O1O3): add a
> `VERIFY.md` row to ADR-004's service-role file table, a VERIFY.md step to
> `new-role.md` (and reconcile STATUS.md:17), and refresh `README.md`'s ADR list +
> `docs/` tree. Then settle the `askari` inventory group name (O4) and propagate it to
> CLAUDE.md host-groups + ADR-009 valid-groups. Finally clear the stale labels O5
> (ADR-006 backend.tf) and O6 (ADR-014 plugin reproducibility = DONE).