Add /review-repo command with deterministic pre-scan and reviews store
New on-demand repo audit: scripts/repo-scan.py does the cheap deterministic checks (markers, broken refs, unencrypted vaults) and inventory; the command fans out judgement reviewers across four dimensions, applies only safe/obvious fixes, and writes a tracked report to docs/reviews/. Cron + email deferred. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
This commit is contained in:
parent
5c087b413b
commit
b33130eea9
5 changed files with 262 additions and 0 deletions
89
.claude/commands/review-repo.md
Normal file
89
.claude/commands/review-repo.md
Normal file
|
|
@ -0,0 +1,89 @@
|
||||||
|
# Review the repo for staleness, drift, and contradictions
|
||||||
|
|
||||||
|
Audit the whole repo across four dimensions, apply only the safe/obvious fixes,
|
||||||
|
report the rest, and write a tracked report to `docs/reviews/`.
|
||||||
|
|
||||||
|
## Dimensions (equal weight)
|
||||||
|
|
||||||
|
1. **Cruft / staleness** — code, comments, or notes left behind after the thing they
|
||||||
|
addressed was resolved; commented-out blocks; done-but-undeleted TODOs; orphans.
|
||||||
|
2. **Design conformance** — does the repo obey the ADRs and CLAUDE.md conventions
|
||||||
|
(FQCN, tags, `rolename__var`, nested vault, Makefile-as-interface, etc.)?
|
||||||
|
3. **Consistency & intent** — contradictions (doc↔doc and doc↔code), overlaps,
|
||||||
|
duplication, and whether intent is actually documented.
|
||||||
|
4. **Docs-vs-reality drift** — do `STATUS.md` and the ADRs match what is actually
|
||||||
|
built?
|
||||||
|
|
||||||
|
## Rubric (the source of truth to measure against)
|
||||||
|
|
||||||
|
- `docs/decisions/` (ADRs) — the design decisions.
|
||||||
|
- `CLAUDE.md` — the conventions.
|
||||||
|
- `STATUS.md` — what is actually built vs only planned.
|
||||||
|
|
||||||
|
## Process
|
||||||
|
|
||||||
|
### Phase 0 — deterministic pre-scan
|
||||||
|
Run `python3 scripts/repo-scan.py > /tmp/repo-scan.json`. It returns the **inventory**
|
||||||
|
(roles, ADRs, runbooks, playbooks, scripts — your shard list) and **exact findings**
|
||||||
|
(markers, broken refs, unencrypted vaults). Fold these into the report verbatim.
|
||||||
|
|
||||||
|
### Phase 1 — fan-out judgement review
|
||||||
|
Scale to repo size:
|
||||||
|
- **Small** (≤ ~10 roles, like boma today): a few sub-agents, or one pass per area.
|
||||||
|
- **Large** (many roles/docs — heading toward AnsibleBaobabV4's 100+ roles): fan out
|
||||||
|
**one reviewer per shard** (role / doc-cluster / ADR group) in parallel via the
|
||||||
|
Agent tool, or the Workflow orchestrator for the biggest sweeps.
|
||||||
|
|
||||||
|
Give each reviewer: its slice **plus the rubric** and the four dimensions. Each
|
||||||
|
returns structured findings — `{dimension, severity (high|medium|low),
|
||||||
|
location (file:line), description, suggested_fix, auto_fixable (bool)}`.
|
||||||
|
|
||||||
|
### Phase 2 — synthesis
|
||||||
|
- Merge and dedupe all findings (deterministic + reviewer).
|
||||||
|
- Run **one cross-cutting reviewer** over the full ADR set + `STATUS.md` + `CLAUDE.md`
|
||||||
|
to catch contradictions that span files (per-shard agents can't see these).
|
||||||
|
- Diff against the previous run's `docs/reviews/<prev>-findings.json` and tag each
|
||||||
|
finding **new / recurring / resolved**.
|
||||||
|
- Prioritise by severity; split into auto-fixable vs report-only.
|
||||||
|
|
||||||
|
### Phase 3 — act
|
||||||
|
- Apply **only** the safe/obvious fixes (allow-list below). Run `make lint` after;
|
||||||
|
revert any fix that breaks it.
|
||||||
|
- Write the report and machine findings (see Output) and generate a follow-up prompt.
|
||||||
|
- Commit per CLAUDE.md git conventions (small/safe → `main`; a larger batch → a
|
||||||
|
branch, diff shown first). Summarise to the user: counts, what was auto-fixed, the
|
||||||
|
top open findings, and the follow-up prompt.
|
||||||
|
|
||||||
|
## Auto-fix allow-list — safe and obvious ONLY
|
||||||
|
|
||||||
|
**Allowed:** typos in prose/comments; broken cross-reference paths (repoint to the
|
||||||
|
correct existing path); dead commented-out blocks with no explanatory value;
|
||||||
|
obviously-stale comments that contradict the code they annotate; trivial formatting;
|
||||||
|
updating a doc reference to a renamed file.
|
||||||
|
|
||||||
|
**Never:** anything that changes runtime behaviour, logic, variables, or task order;
|
||||||
|
anything touching secrets, vault files, or credentials; anything where intent is
|
||||||
|
ambiguous or the answer is a judgement call; deleting files; resolving a
|
||||||
|
contradiction between two ADRs (that is a design decision — report it).
|
||||||
|
|
||||||
|
**When unsure, REPORT — do not fix.**
|
||||||
|
|
||||||
|
## Output (written to `docs/reviews/`)
|
||||||
|
|
||||||
|
- `docs/reviews/<YYYY-MM-DD>-review.md` — human report.
|
||||||
|
- `docs/reviews/<YYYY-MM-DD>-findings.json` — machine-readable (next run's diff + the
|
||||||
|
cron email payload).
|
||||||
|
- `docs/reviews/latest.md` — overwrite with a copy of the newest report.
|
||||||
|
|
||||||
|
Report contents: run metadata (date, reviewed commit SHA); summary counts by
|
||||||
|
dimension/severity; auto-fixes applied (with file list); open findings (prioritised,
|
||||||
|
grouped by dimension, each with location + suggested fix + new/recurring tag); and a
|
||||||
|
generated, copy-pasteable **follow-up prompt**. See `docs/reviews/README.md`.
|
||||||
|
|
||||||
|
## Headless / cron (future — `docs/todo.md` "Scheduled work")
|
||||||
|
|
||||||
|
When run non-interactively (`claude -p`, e.g. a fortnightly cron): **report only** —
|
||||||
|
do not apply fixes or push to `main`. Write the report artifact and email its summary
|
||||||
|
+ the follow-up prompt to the maintainer via the machine's email skill (other
|
||||||
|
projects on this host already use it). The report file is the email payload. Validate
|
||||||
|
the headless email path when wiring the cron job.
|
||||||
|
|
@ -30,6 +30,7 @@ Full design rationale: `docs/decisions/`
|
||||||
| Check mode (dry run) | `make check PLAYBOOK=<name>` |
|
| Check mode (dry run) | `make check PLAYBOOK=<name>` |
|
||||||
| Deploy a playbook | `make deploy PLAYBOOK=<name>` |
|
| Deploy a playbook | `make deploy PLAYBOOK=<name>` |
|
||||||
| Scaffold a new role | `make new-role NAME=<name>` |
|
| Scaffold a new role | `make new-role NAME=<name>` |
|
||||||
|
| Review repo for drift/cruft | `/review-repo` (Claude command) |
|
||||||
| Encrypt a vault file | `make encrypt FILE=<path>` |
|
| Encrypt a vault file | `make encrypt FILE=<path>` |
|
||||||
| Decrypt a vault file | `make decrypt FILE=<path>` |
|
| Decrypt a vault file | `make decrypt FILE=<path>` |
|
||||||
| Install Python deps | `make setup` |
|
| Install Python deps | `make setup` |
|
||||||
|
|
|
||||||
|
|
@ -19,6 +19,7 @@ _Last reviewed: 2026-05-30._
|
||||||
| `git` | Initialized, trunk-based on `main`, pushed to `origin` (`forgejo.nyumbani.baobab.band:7577`). |
|
| `git` | Initialized, trunk-based on `main`, pushed to `origin` (`forgejo.nyumbani.baobab.band:7577`). |
|
||||||
| Pre-commit hooks | Configured: lint, gitleaks, vault-encryption guard. Activate with `pre-commit install` after `make setup`. |
|
| Pre-commit hooks | Configured: lint, gitleaks, vault-encryption guard. Activate with `pre-commit install` after `make setup`. |
|
||||||
| Vault password client | `scripts/vault-pass-client.sh` fetches the master password from Vaultwarden via `rbw` (wired as `vault_password_file`). Requires `rbw` installed + `rbw unlock`. |
|
| Vault password client | `scripts/vault-pass-client.sh` fetches the master password from Vaultwarden via `rbw` (wired as `vault_password_file`). Requires `rbw` installed + `rbw unlock`. |
|
||||||
|
| `/review-repo` | Repo audit: `scripts/repo-scan.py` (Phase 0) + `.claude/commands/review-repo.md`, reports to `docs/reviews/`. On-demand only; cron + email deferred (`docs/todo.md`). |
|
||||||
| Terraform HCL (`terraform/`) | Written (proxmox VM module + envs) — but never run; see below |
|
| Terraform HCL (`terraform/`) | Written (proxmox VM module + envs) — but never run; see below |
|
||||||
|
|
||||||
## Scaffolded but empty — NOT implemented
|
## Scaffolded but empty — NOT implemented
|
||||||
|
|
|
||||||
25
docs/reviews/README.md
Normal file
25
docs/reviews/README.md
Normal file
|
|
@ -0,0 +1,25 @@
|
||||||
|
# docs/reviews/
|
||||||
|
|
||||||
|
Tracked output of the `/review-repo` command (one set of files per run). This is an
|
||||||
|
**audit trail** — committed, not hand-edited. The command writes these files; don't
|
||||||
|
edit them yourself.
|
||||||
|
|
||||||
|
## Files per run
|
||||||
|
|
||||||
|
| File | Purpose |
|
||||||
|
|---|---|
|
||||||
|
| `<YYYY-MM-DD>-review.md` | Human-readable report |
|
||||||
|
| `<YYYY-MM-DD>-findings.json` | Machine-readable findings — used to diff new/recurring/resolved on the next run, and as the cron email payload |
|
||||||
|
| `latest.md` | A copy of the most recent report (stable path for quick reference / email) |
|
||||||
|
|
||||||
|
## What a report contains
|
||||||
|
|
||||||
|
- **Run metadata** — date and the commit SHA reviewed.
|
||||||
|
- **Summary** — finding counts by dimension and severity.
|
||||||
|
- **Auto-fixes applied** — what the run fixed (safe/obvious only), with a file list.
|
||||||
|
- **Open findings** — prioritised, grouped by dimension; each with a location, a
|
||||||
|
suggested fix, and a `new` / `recurring` / `resolved` tag (vs the previous run).
|
||||||
|
- **Follow-up prompt** — a copy-pasteable prompt to act on the open findings.
|
||||||
|
|
||||||
|
The four review dimensions and the auto-fix safety rules live in
|
||||||
|
`.claude/commands/review-repo.md`.
|
||||||
146
scripts/repo-scan.py
Normal file
146
scripts/repo-scan.py
Normal file
|
|
@ -0,0 +1,146 @@
|
||||||
|
#!/usr/bin/env python3
|
||||||
|
"""
|
||||||
|
Phase-0 deterministic pre-scan for the /review-repo command.
|
||||||
|
|
||||||
|
Python standard library only. Emits a JSON object to stdout:
|
||||||
|
- inventory: roles, adrs, runbooks, playbooks, scripts (the shard list)
|
||||||
|
- findings: exact, no-judgement issues (markers, broken refs, unencrypted vaults)
|
||||||
|
|
||||||
|
The *judgement* review — contradictions, design-conformance, stale intent — is done
|
||||||
|
by the /review-repo fan-out reviewers, NOT here. This script only catches the cheap,
|
||||||
|
exact things so the reviewers can focus on what needs reasoning.
|
||||||
|
|
||||||
|
Usage: python3 scripts/repo-scan.py [repo_root] > scan.json
|
||||||
|
"""
|
||||||
|
import json
|
||||||
|
import os
|
||||||
|
import re
|
||||||
|
import sys
|
||||||
|
|
||||||
|
ROOT = os.path.abspath(sys.argv[1] if len(sys.argv) > 1 else ".")
|
||||||
|
|
||||||
|
PRUNE = {".git", ".venv", ".collections", ".ansible", ".worktrees",
|
||||||
|
".pytest_cache", "node_modules", "__pycache__"}
|
||||||
|
SKIP_PREFIX = os.path.join("docs", "reviews") # don't scan our own reports
|
||||||
|
SOURCE_EXTS = {".yml", ".yaml", ".j2", ".py", ".sh", ".md", ".tf", ".cfg", ".ini"}
|
||||||
|
|
||||||
|
MARKER_RE = re.compile(r"(?<![(|])\b(TODO|FIXME|XXX|HACK)\b(?![|)])")
|
||||||
|
ADR_REF_RE = re.compile(r"\bADR-(\d{3})\b")
|
||||||
|
PATH_REF_RE = re.compile(r"(?:docs|scripts|roles|inventories|terraform|playbooks)/[\w./-]+")
|
||||||
|
PLACEHOLDER = set("<>*${}")
|
||||||
|
|
||||||
|
|
||||||
|
def walk_files():
|
||||||
|
for dirpath, dirnames, filenames in os.walk(ROOT):
|
||||||
|
dirnames[:] = [d for d in dirnames if d not in PRUNE]
|
||||||
|
for f in filenames:
|
||||||
|
yield os.path.join(dirpath, f)
|
||||||
|
|
||||||
|
|
||||||
|
def rel(path):
|
||||||
|
return os.path.relpath(path, ROOT)
|
||||||
|
|
||||||
|
|
||||||
|
def inventory():
|
||||||
|
def listdir(*parts, want_dirs=False, suffixes=None):
|
||||||
|
d = os.path.join(ROOT, *parts)
|
||||||
|
if not os.path.isdir(d):
|
||||||
|
return []
|
||||||
|
out = []
|
||||||
|
for e in sorted(os.listdir(d)):
|
||||||
|
full = os.path.join(d, e)
|
||||||
|
if want_dirs and not os.path.isdir(full):
|
||||||
|
continue
|
||||||
|
if suffixes and not e.endswith(suffixes):
|
||||||
|
continue
|
||||||
|
out.append(e)
|
||||||
|
return out
|
||||||
|
|
||||||
|
return {
|
||||||
|
"roles": listdir("roles", want_dirs=True),
|
||||||
|
"adrs": listdir("docs", "decisions", suffixes=(".md",)),
|
||||||
|
"runbooks": listdir("docs", "runbooks", suffixes=(".md",)),
|
||||||
|
"playbooks": listdir("playbooks", suffixes=(".yml", ".yaml")),
|
||||||
|
"scripts": listdir("scripts"),
|
||||||
|
}
|
||||||
|
|
||||||
|
|
||||||
|
def adr_numbers():
|
||||||
|
dec = os.path.join(ROOT, "docs", "decisions")
|
||||||
|
nums = set()
|
||||||
|
if os.path.isdir(dec):
|
||||||
|
for f in os.listdir(dec):
|
||||||
|
m = re.match(r"(\d{3})-", f)
|
||||||
|
if m:
|
||||||
|
nums.add(m.group(1))
|
||||||
|
return nums
|
||||||
|
|
||||||
|
|
||||||
|
def scan():
|
||||||
|
findings = []
|
||||||
|
adrs = adr_numbers()
|
||||||
|
for path in walk_files():
|
||||||
|
rpath = rel(path)
|
||||||
|
if rpath.startswith(SKIP_PREFIX):
|
||||||
|
continue
|
||||||
|
name = os.path.basename(path)
|
||||||
|
|
||||||
|
if name == "vault.yml":
|
||||||
|
try:
|
||||||
|
text = open(path, encoding="utf-8", errors="replace").read()
|
||||||
|
except OSError:
|
||||||
|
continue
|
||||||
|
if not text.startswith("$ANSIBLE_VAULT"):
|
||||||
|
real = [ln for ln in text.splitlines()
|
||||||
|
if ln.strip() and not ln.lstrip().startswith("#") and ln.strip() != "---"]
|
||||||
|
if real:
|
||||||
|
findings.append({"check": "vault-unencrypted", "severity": "high",
|
||||||
|
"path": rpath, "line": 1,
|
||||||
|
"detail": "vault.yml is not ansible-vault encrypted but has content"})
|
||||||
|
continue
|
||||||
|
|
||||||
|
if os.path.splitext(path)[1] not in SOURCE_EXTS:
|
||||||
|
continue
|
||||||
|
try:
|
||||||
|
lines = open(path, encoding="utf-8", errors="replace").readlines()
|
||||||
|
except OSError:
|
||||||
|
continue
|
||||||
|
|
||||||
|
for i, line in enumerate(lines, 1):
|
||||||
|
markers = sorted(set(m.group(1) for m in MARKER_RE.finditer(line)))
|
||||||
|
if markers:
|
||||||
|
findings.append({"check": "marker", "severity": "low", "path": rpath,
|
||||||
|
"line": i, "detail": f"{'/'.join(markers)}: {line.strip()[:120]}"})
|
||||||
|
for m in ADR_REF_RE.finditer(line):
|
||||||
|
if m.group(1) not in adrs:
|
||||||
|
findings.append({"check": "broken-adr-ref", "severity": "medium", "path": rpath,
|
||||||
|
"line": i, "detail": f"references ADR-{m.group(1)} (no such file)"})
|
||||||
|
for m in PATH_REF_RE.finditer(line):
|
||||||
|
# Skip paths that are part of a URL (e.g. a backend API endpoint).
|
||||||
|
token_prefix = line[max(line.rfind(" ", 0, m.start()),
|
||||||
|
line.rfind("\t", 0, m.start())) + 1:m.start()]
|
||||||
|
if "://" in token_prefix:
|
||||||
|
continue
|
||||||
|
ref = m.group(0).rstrip(".,);:`'\"")
|
||||||
|
if any(c in ref for c in PLACEHOLDER):
|
||||||
|
continue
|
||||||
|
if not os.path.exists(os.path.join(ROOT, ref)):
|
||||||
|
findings.append({"check": "broken-path-ref", "severity": "medium", "path": rpath,
|
||||||
|
"line": i, "detail": f"references '{ref}' which does not exist"})
|
||||||
|
return findings
|
||||||
|
|
||||||
|
|
||||||
|
def main():
|
||||||
|
result = {"root": ROOT, "inventory": inventory(), "findings": scan()}
|
||||||
|
json.dump(result, sys.stdout, indent=2)
|
||||||
|
sys.stdout.write("\n")
|
||||||
|
counts = {}
|
||||||
|
for f in result["findings"]:
|
||||||
|
counts[f["check"]] = counts.get(f["check"], 0) + 1
|
||||||
|
summary = ", ".join(f"{k}={v}" for k, v in sorted(counts.items())) or "no deterministic findings"
|
||||||
|
print(f"repo-scan: {len(result['inventory']['roles'])} roles, "
|
||||||
|
f"{len(result['inventory']['adrs'])} ADRs; {summary}", file=sys.stderr)
|
||||||
|
|
||||||
|
|
||||||
|
if __name__ == "__main__":
|
||||||
|
main()
|
||||||
Loading…
Add table
Reference in a new issue