boma/docs/reviews/2026-06-11-findings.json
sjat 1da117d65b docs(review): 2026-06-11 repo audit — fix build-wave doc drift
/review-repo run at 67f2aba. Auto-fixed 5 safe doc-drift items left by the
base(firewall)+dev_env build wave: README/playbook/role notes that still called
the roles "empty/not built", plus README tree gaps and the reciprocal ADR-021
cross-links in ADR-016/020.

18 open findings reported (not fixed). Headlines: `make lint` is red on `main`
(site.yml imports the non-existent docker_host role), and ADR-004 and ADR-022
contradict each other on backup scope. Deferral checklist clean (0
stale-deferred); 7 of 12 prior findings confirmed resolved. See
docs/reviews/2026-06-11-review.md.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-11 14:48:00 +02:00

{
"date": "2026-06-11",
"reviewed_commit": "67f2aba",
"fixes_commit": null,
"mode": "on-demand",
"counts": {
"auto_fixed": 5,
"open": 18,
"scan": {
"broken-adr-ref": 4,
"broken-path-ref": 1,
"marker": 14,
"open-deferred-item": 5,
"stale-deferred": 0
}
},
"deferral_checklist": {
"adr-011-open-items": "all 5 (snapshot driver, cadences, health-check harness home, classification home, staging-first) confirmed genuinely still open; cross-checked against later ADRs + TODO 16. No stale-deferred.",
"adr-015-deferred": "deferred #1 (mesh VPN) #2 (service-UI) #3 (build) all confirmed marked RESOLVED in place. No stale-deferred.",
"stale_deferred_found": 0
},
"scan_false_positives": [
{"check": "broken-path-ref", "location": "STATUS.md:38", "why": "STATUS legitimately documents roles/docker_host/ as 'Not in git.' — intentional reference to an unbuilt role."},
{"check": "broken-adr-ref", "location": "tests/test_repo_scan.py:10,43; docs/superpowers/plans/2026-06-10-adr-structure.md:50,83", "why": "ADR-099/ADR-100 are intentional test fixtures exercising the scanner's bad-ref detection."},
{"check": "marker", "location": "docs/superpowers/plans/*, docs/superpowers/specs/*, docs/decisions/019-tagging.md:14", "why": "All 14 markers are in historical planning artifacts (commit-message TODOs, plan steps) or prose discussing 'over-tagging' as a concept — not actionable cruft."}
],
"auto_fixed": [
{"id": "AF1", "dimension": "drift", "severity": "high", "location": "roles/README.md:11-13", "description": "'base and docker_host not built yet — empty, untracked dirs, so site.yml would fail on a clean clone' contradicts STATUS.md: base is partially built (firewall concern, tracked), docker_host does not exist, dev_env is built+applied.", "fix": "rewrote Current-state paragraph: base partially built (firewall), docker_host not yet created, dev_env built+applied.", "tag": "new"},
{"id": "AF2", "dimension": "drift", "severity": "medium", "location": "playbooks/site.yml:4-5", "description": "NOTE claimed base + docker_host 'not built yet ... fails on a clean clone'; base's firewall concern is built+applied per STATUS.md.", "fix": "NOTE now states base is partially built (firewall) and only docker_host is missing.", "tag": "new"},
{"id": "AF3", "dimension": "drift", "severity": "medium", "location": "playbooks/README.md:6-8", "description": "site.yml described as 'currently a no-op' (roles empty); base's firewall now applies real nftables state. workstation.yml (applies dev_env) was unlisted.", "fix": "reworded the no-op claim and added a workstation.yml bullet.", "tag": "new"},
{"id": "AF4", "dimension": "drift", "severity": "low", "location": "README.md:58-76", "description": "project-structure tree omitted docs/access/, docs/backup/, roles/dev_env/, and playbooks/workstation.yml — all present on disk.", "fix": "added the four missing tree entries.", "tag": "recurring"},
{"id": "AF5", "dimension": "consistency", "severity": "low", "location": "docs/decisions/016-mesh-vpn.md:110; docs/decisions/020-firewall.md:135", "description": "ADR-021 states it amends ADR-016 and ADR-020 to cross-reference the SSH ladder, but neither listed ADR-021 back in its See-also/Related section.", "fix": "added the reciprocal ADR-021 cross-reference to both.", "tag": "new"}
],
"open": [
{"id": "O1", "dimension": "conformance", "severity": "high", "location": "playbooks/site.yml:18", "description": "`make lint` is RED on `main`: site.yml imports the `docker_host` role which does not exist, so ansible-lint syntax-check fails on a clean checkout. Violates CLAUDE.md 'main must always work' and 'Never skip lint' (pre-commit would block every commit unless bypassed).", "suggested_fix": "Decide an interim posture: guard the docker_host play (e.g. skip until the role exists), stub the role via `make new-role NAME=docker_host`, or exclude site.yml from syntax-check until built — and record it. Judgement call.", "tag": "new", "auto_fixable": false},
{"id": "O2", "dimension": "consistency", "severity": "high", "location": "docs/decisions/004-docker-model.md:105 ↔ docs/decisions/022-backup.md", "description": "ADR-004 'Persistent data' says 'Backup strategy is defined separately (not in scope of this repo).' ADR-022 defines a full in-repo backup strategy (backup role, fisi pull node, per-service backup__* + BACKUP.md). Direct ADR↔ADR contradiction on scope.", "suggested_fix": "Update ADR-004's line to point at ADR-022 (backup is now in-repo scope) and cross-link, per ADR-023's no-silent-reversal rule. Design decision — report only.", "tag": "new", "auto_fixable": false},
{"id": "O3", "dimension": "consistency", "severity": "medium", "location": "docs/decisions/004-docker-model.md:48-49", "description": "ADR-004's service-role file table (the canonical standard) lists only SECURITY.md + VERIFY.md, but CLAUDE.md + ADR-021/ADR-022 now mandate ACCESS.md (every service role) and BACKUP.md (stateful service roles).", "suggested_fix": "Add ACCESS.md (ADR-021) and BACKUP.md (ADR-022) rows to ADR-004's service-role file table. (Prior O1 'missing VERIFY.md' is now resolved — this is the next evolution.)", "tag": "new", "auto_fixable": false},
{"id": "O4", "dimension": "consistency", "severity": "medium", "location": "docs/CAPABILITIES.md:149-154 ↔ STATUS.md:29", "description": "CAPABILITIES lists nvim/tmux/shell config as a CONFIRMED EXCLUSION ('boma is server-only, so these are correctly absent'), but the dev_env role (built+applied to ubongo) installs exactly zsh+oh-my-zsh+tmux+neovim.", "suggested_fix": "Carve out an exception for the control-node developer/AI-worker environment (ubongo, ADR-015) rather than flatly excluding nvim/tmux; distinguish infra worker-host config from personal desktops.", "tag": "new", "auto_fixable": false},
{"id": "O5", "dimension": "drift", "severity": "medium", "location": "docs/decisions/002-security.md:82", "description": "References `make deploy PLAYBOOK=upgrade` as the deliberate full-upgrade mechanism, but no upgrade.yml playbook exists (only bootstrap/site/workstation) and ADR-011 update-management is still Proposed/unbuilt — stated without the '(planned)' caveat ADR-002 uses for its other unbuilt controls.", "suggested_fix": "Add a '(planned — ADR-011, not yet built)' caveat to the upgrade line.", "tag": "new", "auto_fixable": false},
{"id": "O6", "dimension": "drift", "severity": "medium", "location": "inventories/production/hosts.yml:7-16; inventories/staging/hosts.yml:7-14", "description": "Committed hosts.yml stubs omit the offsite_hosts group, but it is one of the four VALID_GROUPS in tf_to_inventory.py and in ADR-009/ADR-016/CLAUDE.md; the next `make tf-inventory` would add it, so the hand-stubs have drifted. (Prior O4 'askari group unnamed' is resolved — naming is now consistent; this is the residual stub gap.)", "suggested_fix": "Regenerate via `make tf-inventory TF_ENV=production` and `TF_ENV=staging` (do NOT hand-edit hosts.yml — CLAUDE.md), or accept the stubs lag until TF runs.", "tag": "new", "auto_fixable": false},
{"id": "O7", "dimension": "drift", "severity": "medium", "location": "docs/runbooks/new-host.md:81-130", "description": "Part E (control node ubongo) instructs creating an 'ansible' user and 'ssh ansible@<IP>', but STATUS.md records ubongo is deliberately managed as the operator account sjat (group_vars/control ansible_user: sjat) with the ansible-user bootstrap listed as Pending.", "suggested_fix": "Update Part E to reflect ubongo managed as sjat (no ansible user yet), ansible-user bootstrap a pending item per STATUS.md.", "tag": "new", "auto_fixable": false},
{"id": "O8", "dimension": "conformance", "severity": "medium", "location": "roles/dev_env/tasks/per_user.yml:2-9", "description": "The getent + `set_fact: dev_env__home` preflight is untagged, but downstream tasks that consume dev_env__home carry concern tags (users, config). A partial `--tags users` or `--tags config` run skips the set_fact, leaving dev_env__home undefined and failing the tagged tasks — against ADR-019's concern-runnable-in-isolation intent.", "suggested_fix": "Tag the preflight with the union of dependent concerns ([users, config]) or `always`.", "tag": "new", "auto_fixable": false},
{"id": "O9", "dimension": "consistency", "severity": "medium", "location": "STATUS.md:31 ↔ docs/decisions/007-network.md", "description": "STATUS places ubongo at 10.20.10.151; ADR-007 defines srv as 10.20.0.0/24 and mgmt as 10.10.0.0/24 — 10.20.10.151 is in neither. base__firewall_control_addr (ADR-021 recovery path) depends on this address being correct. Already a tracked follow-up in the ubongo-build plan (line 147).", "suggested_fix": "Either correct ubongo's recorded address to a valid ADR-007 subnet, or amend ADR-007 to document the actual VLAN/subnet ubongo's physical port lives on, before base__firewall_control_addr is populated.", "tag": "new", "auto_fixable": false},
{"id": "O10", "dimension": "drift", "severity": "low", "location": "README.md:104-106", "description": "README's Documentation ADR list stops at 017; ADRs 018 (logging), 019 (tagging), 020 (firewall), 021 (access), 022 (backup), 023 (ADR structure) exist and are in CLAUDE.md's full table. Partial enumeration is now stale. (Evolved from prior O3, which is otherwise resolved — the docs/ tree omissions were fixed in AF4.)", "suggested_fix": "Extend the list through 023, or trim it to a pointer at CLAUDE.md's full table to avoid a stale partial list.", "tag": "recurring", "auto_fixable": false},
{"id": "O11", "dimension": "conformance", "severity": "low", "location": "docs/decisions/008-testing.md:3; 014-knowledge-sourcing.md:98; 016-mesh-vpn.md:91; 017-service-ui-verification.md:66; 018-logging.md:73", "description": "ADR-023 §2 mandates section order Status→Context→Decision→Consequences. ADR-008 injects a gotchas blockquote before ## Status; ADR-014's ## Decision is a late summary after six topical sections; ADR-016/017/018 place ## Status mid-document. The scan checks presence, not order, so all pass lint — but they don't match the stated standard.", "suggested_fix": "Presentational restructure per ADR-023 §6 (move Status first; pull Decision up). No decision substance changes. Judgement call — report.", "tag": "new", "auto_fixable": false},
{"id": "O12", "dimension": "consistency", "severity": "low", "location": "docs/decisions/007-network.md:160", "description": "The naming-scheme table states the public FQDN convention is `<service>.baobab.band`, but its own example is `forgejo.nyumbani.baobab.band` (extra nyumbani label). The nyumbani split-horizon sub-label is still OPEN (TODO 4); convention and example disagree.", "suggested_fix": "Change the example to forgejo.baobab.band, or note nyumbani is an unresolved split-horizon sub-label (TODO 4). Ties to an open decision — report.", "tag": "new", "auto_fixable": false},
{"id": "O13", "dimension": "consistency", "severity": "low", "location": "roles/dev_env/files/dotfiles/zsh/.zshrc:28,55", "description": "Shipped .zshrc hard-codes `alias rclone=\"/usr/bin/rclone\"` (rclone is not installed by dev_env) and `eval \"$(direnv hook zsh)\"` unguarded (unlike the guarded oh-my-posh block) — heritage fisi/V4 carryovers. If direnv is dropped from dev_env__packages every shell startup errors.", "suggested_fix": "Drop the rclone alias (role doesn't install it) and guard the direnv hook with `command -v direnv`, or document direnv as a hard dependency of the shipped .zshrc.", "tag": "new", "auto_fixable": false},
{"id": "O14", "dimension": "consistency", "severity": "low", "location": "roles/dev_env/tasks/oh_my_posh.yml:15-26", "description": "The zen.toml theme-directory + deploy tasks render config to disk but carry no `config` tag, while analogous dotfile tasks in per_user.yml are tagged `config` — inconsistent concern tagging within the role.", "suggested_fix": "Add tags: [config] to the zen.toml directory + deploy tasks.", "tag": "new", "auto_fixable": false},
{"id": "O15", "dimension": "consistency", "severity": "low", "location": "terraform/environments/production/terraform.tfvars.example:9-11; staging/terraform.tfvars.example", "description": "proxmox_node/endpoint examples use pve01 / pve01.baobab.band, but ADR-007 defines Proxmox node names as pve0/pve1/pve2 (single digit, no leading zero). Example contradicts the naming convention.", "suggested_fix": "Change example values to pve0 / pve0.baobab.band (both envs). Verify the actual node name first — report rather than auto-fix.", "tag": "new", "auto_fixable": false},
{"id": "O16", "dimension": "consistency", "severity": "low", "location": "docs/decisions/013-heritage-v4.md:77; docs/decisions/015-control-host.md", "description": "ADR-013 and ADR-015 close with an inline 'See also:' prose line, whereas ADRs 014/019/020/021/022 and the adr-template use a dedicated `## Related` section. Stylistic inconsistency (## Related is optional per ADR-023 §3).", "suggested_fix": "Convert the 'See also:' prose in ADR-013/015 into ## Related sections for uniformity. Cosmetic.", "tag": "new", "auto_fixable": false},
{"id": "O17", "dimension": "cruft", "severity": "low", "location": "roles/dev_env/handlers/main.yml; roles/base/handlers/main.yml", "description": "Both roles ship an empty handlers/main.yml (only `---`); neither defines or uses handlers (base's firewall apply/rollback is deliberately in tasks). Scaffold artifacts from make new-role.", "suggested_fix": "Confirm whether empty scaffold files are an intentional convention; if not, delete. Low priority.", "tag": "new", "auto_fixable": false},
{"id": "O18", "dimension": "consistency", "severity": "low", "location": "docs/README.md:5-8; inventories/README.md:1-12", "description": "docs/README.md lists only decisions/ + runbooks/ (omits security/testing/access/backup/hardware/reviews/superpowers); inventories/README.md omits the offsite_hosts group documented in CLAUDE.md. Both are narrower than current reality.", "suggested_fix": "Add the missing subdir rows / note offsite_hosts, or explicitly defer to the canonical list. Low priority.", "tag": "new", "auto_fixable": false}
],
"prior_resolved": [
{"id": "O1@2026-06-05", "description": "ADR-004 service-role table missing VERIFY.md row", "status": "resolved — table now lists SECURITY.md + VERIFY.md (next gap ACCESS/BACKUP tracked as O3)"},
{"id": "O2@2026-06-05", "description": "new-role runbook missing VERIFY.md step", "status": "resolved — step 10 present"},
{"id": "O3@2026-06-05", "description": "README ADR list + docs/ tree omissions", "status": "partial — docs tree security/testing/hardware now present; access/backup fixed in AF4; ADR-list staleness carried as O10"},
{"id": "O4@2026-06-05", "description": "askari inventory group unnamed", "status": "resolved — offsite_hosts named consistently (residual stub gap = O6)"},
{"id": "O5@2026-06-05", "description": "backend.tf mislabelled Forgejo state backend", "status": "resolved — now labelled local state"},
{"id": "O6@2026-06-05", "description": "ADR-014 plugin reproducibility described open but TODO done", "status": "resolved"},
{"id": "O11@2026-06-05", "description": "CAPABILITIES missing /verify-service Level-4 row", "status": "resolved — present (§10)"},
{"id": "O12@2026-06-05", "description": "TODO 3.10 garbled", "status": "resolved — readable"},
{"id": "O7-O10@2026-06-05", "description": "ADR-011 digest-pinning row; act_runner ambiguity; WireGuard Molecule row; ADR-011 scheduled_jobs cross-ref", "status": "not re-detected this run (ADR-011 still Proposed) — verify on next run"}
]
}
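
The file above records its totals twice: as numbers under `counts` and as the lengths of the `auto_fixed` and `open` arrays, with the stale-deferred total also repeated in `deferral_checklist`. A minimal sketch of a consistency check follows, assuming only the key names visible in this file. The function name `check_counts` and the inline sample are hypothetical and are not part of the repository's review tooling.

```python
import json


def check_counts(findings: dict) -> list[str]:
    """Return human-readable mismatches between declared counts and listed entries.

    Hypothetical helper for illustration; key names are taken from the findings file.
    """
    problems = []
    counts = findings.get("counts", {})

    # "auto_fixed" and "open" each appear as a number under "counts"
    # and as a top-level array; the number should equal the array length.
    for key in ("auto_fixed", "open"):
        declared = counts.get(key)
        listed = len(findings.get(key, []))
        if declared != listed:
            problems.append(f"counts.{key} is {declared} but {listed} entries are listed")

    # The stale-deferred total is recorded in two places and should agree.
    scan_stale = counts.get("scan", {}).get("stale-deferred")
    checklist_stale = findings.get("deferral_checklist", {}).get("stale_deferred_found")
    if scan_stale != checklist_stale:
        problems.append(
            f"counts.scan.stale-deferred is {scan_stale} but "
            f"deferral_checklist.stale_deferred_found is {checklist_stale}"
        )
    return problems


# Made-up sample that deliberately declares two open findings but lists one.
sample = json.loads("""
{
  "counts": {"auto_fixed": 1, "open": 2, "scan": {"stale-deferred": 0}},
  "deferral_checklist": {"stale_deferred_found": 0},
  "auto_fixed": [{"id": "AF1"}],
  "open": [{"id": "O1"}]
}
""")
print(check_counts(sample))
```

For the sample, the function reports a single mismatch on `counts.open`. Applied to the findings file itself (loaded with `json.load`), an empty list would indicate that the declared totals match the listed entries.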