# Repo review — 2026-06-14 - **Reviewed commit:** `e346137` (docs(plan): M4b — NetBird coordinator service role) - **Mode:** on-demand (interactive — auto-fixes applied + committed) - **Previous run:** 2026-06-11 (`67f2aba`) - **`make lint`:** green before and after fixes (260 files, profile production; check-tags OK). ## Summary A lot shipped since the last review (M4a: `docker_host` Docker engine, `reverse_proxy` Caddy applied to askari; offsite Terraform env live; ADR-024). Most findings this run are the predictable **docs-lagging-the-build** kind — stale "not built yet" notes, a reverse-proxy that switched from DNS-01/custom-image to vanilla HTTP-01 leaving stale descriptions behind, and the **Traefik→Caddy** rename only half-propagated through the ADR set. The previous run's blocker (O1, `make lint` RED) is **resolved**. ### Counts | Dimension | High | Medium | Low | Total | |---|---|---|---|---| | Cruft / staleness | 0 | 0 | 0 | 0 | | Design conformance | 1 | 2 | 2 | 5 | | Consistency & intent | 2 | 2 | 9 | 13 | | Docs-vs-reality drift | 1 | 4 | 5 | 10 | | **Open total** | **4** | **8** | **16** | **29** | Plus **11 auto-fixes applied** (3 high, 5 medium, 3 low). ### Phase-0 scan `repo-scan.py`: 5 roles, 25 ADRs · broken-adr-ref=4, broken-path-ref=2, marker=14, open-deferred-item=5, **stale-deferred=0**. Every scan finding is a known false-positive (test fixtures ADR-099/100; the `roles/netbird/` references in the M4b *plan* for unbuilt work; superpowers planning artifacts; `019-tagging.md:14` is prose about "over-tagging", not a TODO). Details in the findings JSON. ### Deferral checklist All 5 ADR-011 "Open questions" (Proxmox snapshot driver, exact cadences, health-check harness home, classification home, staging-first) confirmed **genuinely still open** — ADR-011 is still Proposed/unbuilt, the same questions sit open in `docs/TODO.md` item 16, and no later ADR or STATUS decides any of them. **No stale-deferred** (same as last run). ## Auto-fixes applied All safe/obvious (stale text contradicting code/reality, partial enumerations, broken descriptions) — no logic, variable, secret, or task-order changes. | ID | Sev | File | What | |---|---|---|---| | AF1 | high | `roles/reverse_proxy/meta/main.yml` | description still said DNS-01 + custom on-host image → rewrote to vanilla Caddy + HTTP-01 (matches the role since b7e919d) | | AF2 | med | `roles/README.md` | base hardening + docker_host/reverse_proxy/public_dns build-state was stale → reconciled with STATUS | | AF3 | med | `playbooks/README.md` | stale "docker_host has no tasks" note; added missing `dns.yml` + `offsite.yml` bullets | | AF4 | low | `roles/public_dns/README.md` | "askari in M4" → askari + `*.askari` records applied in M4a | | AF5 | low | `scripts/README.md` | added the missing `check-tags.py` entry (run by `make lint`) | | AF6 | med | `terraform/README.md` | added `modules/hetzner_vm` + `environments/offsite` (the one applied env) | | AF7 | low | `terraform/environments/offsite/providers.tf` | verified-stamp `cax11@hel1` → `cx23@hel1` (actual server) | | AF8 | low | `terraform/modules/hetzner_vm/variables.tf` | `server_type` example `cax11 (ARM)` → `cx23 (x86) or cax11 (ARM)` | | AF9 | med | `inventories/production/group_vars/all/public_dns.yml` | wildcard comment "cert via DNS-01" → ACME HTTP-01 (M4a) | | AF10 | high | `docs/CAPABILITIES.md` | reverse-proxy candidate `Traefik` → `Caddy (ADR-024)`; public DNS "apply pending" → "applied (M1)" | | AF11 | low | `README.md` | Documentation ADR list extended ADR-017 → ADR-024 | ## Open findings (prioritised) ### High - **O1 — drift — STATUS.md:41 (+45-48) ↔ 33-34** *(new)*: docker_host still appears in the "Scaffolded but empty — NOT implemented" table as a no-op, contradicting its own "Built + applied" rows and the real tasks file. Reword the scaffold row + closing paragraph (left for the operator — STATUS is the ground-truth doc). - **O2 — consistency — ADR-004:105,131 ↔ ADR-022** *(recurring)*: ADR-004 says backup is "not in scope of this repo"; ADR-022 defines a full in-repo backup doctrine. Repoint ADR-004 at ADR-022 (ADR↔ADR design decision — report). - **O3 — consistency — ADR-024 Consequences ↔ ADR-008:70/017:27,88/019:52** *(new)*: ADR-024 claims it updated ADR-017's Traefik prose to Caddy; it didn't, and ADR-008/019 still say Traefik too. Either finish the rename or soften ADR-024's claim. - **O4 — conformance — ADR-023:7-8,77-80 ↔ ADR-016/017/018** *(recurring)*: ADR-023 claims ADRs 001–018 were restructured to lead with `## Status`, but 016/017/018 still open with `## Context` and bury Status. Fix the three ADRs or correct ADR-023 §6. ### Medium - **O5 — ADR-004:48-50** *(recurring)*: service-role file table omits ACCESS.md + BACKUP.md rows (now mandated by CLAUDE.md/ADR-021/022). - **O6 — ADR-002:82** *(recurring)*: `make deploy PLAYBOOK=upgrade` cited as real, but no `upgrade.yml` exists and ADR-011 is unbuilt — needs a `(planned)` caveat. - **O7 — CAPABILITIES:150-155 ↔ STATUS:29** *(recurring)*: nvim/tmux listed as a "confirmed exclusion" while `dev_env` installs them on ubongo; needs a control-host carve-out (not a token swap, so left from AF10). - **O8 — dev_env tasks (include_tasks + per_user.yml:4-9)** *(recurring)*: untagged `set_fact dev_env__home` preflight + include without `apply: tags:`; a partial `--tags users|config` run breaks (base guards this; dev_env doesn't). - **O9 — inventories/production/hosts.yml** *(recurring)*: header claims TF-generated but it's hand-maintained (carries ubongo, omits offsite_hosts); `tf-inventory` would drop ubongo. Make the header honest. - **O10 — group_vars/all/vars.yml:42 ↔ ADR-007** *(recurring)*: ubongo `10.20.10.151` is in no ADR-007 subnet and undocumented; `base__firewall_control_addr` depends on it. - **O11 — terraform tfvars.example (both envs)** *(recurring)*: `pve01` vs ADR-007's `pve0`; verify the real node name before changing. - **O12 — roles/reverse_proxy/** *(new)*: first built+applied service role, but missing SECURITY/VERIFY/ACCESS/BACKUP.md. (Recorded judgement: public_dns is exempt — control- node external-API role, not a host service.) - **O15 — runbooks/new-host.md Part E** *(recurring)*: still describes an `ansible` user on ubongo; STATUS says ubongo is managed as `sjat` (ansible-user bootstrap pending). - **O18 — ADR-007/009/016 internal-zone name** *(new)*: `boma.baobab.band` vs target `boma.wingu.me` used inconsistently across the doc set after M1; state the transition in one place. ### Low O13 (See-also vs `## Related` in ADR-012/013/015/016/017/018 — recurring), O14 (docs/README + inventories/README narrow enumerations — recurring), O16 (.zshrc rclone alias + unguarded direnv hook — recurring), O17 (oh_my_posh zen.toml tasks missing `config` tag — recurring), O19 (ADR-009:122 `nyumbani` example after retirement — recurring), O20 (ROADMAP M2 CAX11/ARM vs cx23/x86 — new), O21 (ADR-020 "ports will be added in M4" stale; already opened in M4a — new), O22 (ADR-024 body still asserts custom- image obligation contradicting its revised Status — new), O23 (`netbird_coordinator` vs `netbird` role name across ADRs/ROADMAP/plan — new), O24 (`*.boma.` vs `*.` wildcard scope ADR-024 vs ROADMAP — new), O25 (`tags: [verify]` out of the ADR-019 vocabulary in molecule verify — new), O26 (reverse_proxy templates lack `ansible_managed` header — new), O27 (reverse_proxy vars in `group_vars/all/` not `offsite_hosts/` — new), O28 (capacity-scan.py ignores `offsite.yml` — new), O29 (offsite.yml duplicates empty groups from hosts.yml, undocumented merge — new). Full detail + suggested fixes in `2026-06-14-findings.json`. ## Themes worth a deliberate pass 1. **Finish the Traefik→Caddy rename** (O3, and ADR-024 over-claimed it was done). One sweep across ADR-008/017/019 closes it. 2. **STATUS docker_host self-contradiction** (O1) — quick, but it's the ground-truth doc. 3. **ADR-024 internal consistency** (O22) — the role went vanilla/HTTP-01 but the ADR body still mandates the custom image; reconcile §2/§3/Consequences with its own Status. 4. **dev_env tag-isolation** (O8) — the one real conformance bug with runtime impact; mirror base's `apply: tags:` guard. 5. **First service-role doc quartet** (O12) — reverse_proxy is the template for every future service role; getting SECURITY/VERIFY/ACCESS/BACKUP.md right now pays forward. ## Follow-up prompt > Work the open findings from `docs/reviews/2026-06-14-review.md`. Priority order: > (1) **O1** — fix the STATUS.md docker_host contradiction (it's built+applied, not a > no-op; reword the "Scaffolded but empty" row + the 45-48 paragraph). > (2) **O3 + O22** — finish the Traefik→Caddy rename in ADR-008:70, ADR-017:27,88, > ADR-019:52, and reconcile ADR-024's body (§2 custom image, §3 NetBird, Consequences) > with its own revised HTTP-01 Status note. > (3) **O2 + O5** — repoint ADR-004's "backup not in scope" line at ADR-022 and add > ACCESS.md + BACKUP.md rows to its service-role file table. > (4) **O8** — add `apply: tags: [users, config]` to dev_env's per_user.yml include and > tag the `dev_env__home` set_fact `always`; add a Molecule assertion that a partial > `--tags config` run still resolves the home dir. > (5) **O12** — author the four service-role doc files for `roles/reverse_proxy/` from the > templates (BACKUP.md = `backup__state: false`, re-issuable certs). > (6) **O4** — restructure ADR-016/017/018 to lead with `## Status`, or correct ADR-023 §6. > Then the medium drift items (O6 upgrade caveat, O7 nvim/tmux carve-out, O9 hosts.yml > header, O15 new-host Part E, O18 internal-zone naming). Run `make lint` after each > batch; commit per CLAUDE.md git conventions.