boma/docs/superpowers/plans/2026-06-14-kaizen-command.md
sjat d14639e80a docs(plan): /kaizen command — implementation plan (TODO 11)
7 tasks: friction-scan.py (TDD, --json/--nudge) + tests; kaizen.md command;
/review-repo nudge hookup + STATUS/TODO; dogfood run. Mirrors /review-repo.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-14 21:09:29 +02:00

641 lines
22 KiB
Markdown
Raw Blame History

This file contains ambiguous Unicode characters

This file contains Unicode characters that might be confused with other characters. If you think that this is intentional, you can safely ignore this warning. Use the Escape button to reveal them.

# `/kaizen` Command Implementation Plan
> **For agentic workers:** REQUIRED SUB-SKILL: Use superpowers:subagent-driven-development (recommended) or superpowers:executing-plans to implement this plan task-by-task. Steps use checkbox (`- [ ]`) syntax for tracking.
**Goal:** Build the `/kaizen` kaizen-loop command — a stdlib scanner that parses `docs/FRICTION.md` *Open signals* plus an interactive command that curates them (add/change/park/remove) into the decisions ledger.
**Architecture:** Mirrors `/review-repo` exactly: a deterministic stdlib Phase-0 scanner (`scripts/friction-scan.py`, unit-tested) feeds a markdown command (`.claude/commands/kaizen.md`) that drives the interactive curation. The same scanner powers a stage-2 nudge surfaced in `/review-repo`.
**Tech Stack:** Python 3 standard library only (matches `scripts/repo-scan.py`); pytest; markdown command docs.
**Spec:** `docs/superpowers/specs/2026-06-14-kaizen-command-design.md`
---
## File structure
- Create: `scripts/friction-scan.py` — stdlib parser of `FRICTION.md` *Open signals*; `--json` (default) and `--nudge` modes. One responsibility: turn the prose signal log into structured data + the nudge line.
- Create: `tests/test_friction_scan.py` — unit tests for the parser (string-based, deterministic via `--today`), matching `tests/test_repo_scan.py`.
- Create: `.claude/commands/kaizen.md` — the interactive curation process.
- Modify: `.claude/commands/review-repo.md` — add the stage-2 nudge line to its report.
- Modify: `STATUS.md` — add a `/kaizen` row.
- Modify: `docs/TODO.md` — mark item 11.1 in progress / built.
All scanner logic lives in functions that take strings/data (not files) so tests need no fixtures on disk; only `load_signals(path, today)` and `main()` touch the filesystem.
---
## Task 1: Scanner scaffold — section extraction + signal splitting
**Files:**
- Create: `scripts/friction-scan.py`
- Test: `tests/test_friction_scan.py`
- [ ] **Step 1: Write the failing test**
```python
# tests/test_friction_scan.py
import importlib.util
import os
_SPEC = importlib.util.spec_from_file_location(
"friction_scan",
os.path.join(os.path.dirname(__file__), "..", "scripts", "friction-scan.py"),
)
fs = importlib.util.module_from_spec(_SPEC)
_SPEC.loader.exec_module(fs)
SAMPLE = """# FRICTION.md
## Open signals
_(append new raw signals here)_
- `[gotcha]` **First thing** (2026-06-01): body line one.
continuation line two.
- `[friction]` **Second thing** (2026-06-10): only one line.
---
## Kaizen reviews — decisions ledger
- `[gotcha]` **Should not be parsed** (2026-01-01): in the ledger.
"""
def test_extract_open_section_stops_at_next_heading():
section = fs.extract_open_section(SAMPLE)
assert "First thing" in section
assert "Second thing" in section
assert "Should not be parsed" not in section
def test_split_signals_finds_two_items_and_joins_continuations():
signals = fs.split_signals(fs.extract_open_section(SAMPLE))
assert len(signals) == 2
assert "continuation line two" in signals[0]
assert signals[1].startswith("`[friction]`")
```
- [ ] **Step 2: Run test to verify it fails**
Run: `.venv/bin/python -m pytest tests/test_friction_scan.py -v`
Expected: FAIL — `friction-scan.py` does not exist / `extract_open_section` undefined.
- [ ] **Step 3: Write minimal implementation**
```python
#!/usr/bin/env python3
"""Parse docs/FRICTION.md 'Open signals' into structured data for /kaizen.
Stdlib only. Modes:
--json (default): emit the open signals as JSON (Phase-0 input for /kaizen)
--nudge : print a one-line 'loop overdue?' summary
Authoritative design: docs/superpowers/specs/2026-06-14-kaizen-command-design.md
"""
import argparse
import datetime
import json
import os
import re
REPO_ROOT = os.path.dirname(os.path.dirname(os.path.abspath(__file__)))
FRICTION = os.path.join(REPO_ROOT, "docs", "FRICTION.md")
def extract_open_section(text):
"""Return the body between '## Open signals' and the next '## ' heading."""
lines = text.splitlines()
start = None
for i, line in enumerate(lines):
if line.strip().lower() == "## open signals":
start = i + 1
break
if start is None:
return ""
end = len(lines)
for j in range(start, len(lines)):
if lines[j].startswith("## "):
end = j
break
return "\n".join(lines[start:end])
def split_signals(section):
"""Split the Open-signals body into raw per-signal blocks.
A signal starts with a top-level '- ' bullet; indented or blank lines are
continuations. Returns a list of multi-line strings with the leading '- '
stripped from the first line."""
signals = []
current = None
for line in section.splitlines():
if line.startswith("- "):
if current is not None:
signals.append("\n".join(current).strip())
current = [line[2:]]
elif current is not None:
if line.strip() == "" or line.startswith(" "):
current.append(line.strip())
else:
signals.append("\n".join(current).strip())
current = None
if current is not None:
signals.append("\n".join(current).strip())
return [s for s in signals if s]
if __name__ == "__main__": # pragma: no cover (filled in Task 4)
pass
```
- [ ] **Step 4: Run test to verify it passes**
Run: `.venv/bin/python -m pytest tests/test_friction_scan.py -v`
Expected: PASS (2 tests).
- [ ] **Step 5: Commit**
```bash
git add scripts/friction-scan.py tests/test_friction_scan.py
git commit -m "feat(kaizen): friction-scan section extraction + signal split"
```
---
## Task 2: Per-signal fields — tag, first_seen, age_days
**Files:**
- Modify: `scripts/friction-scan.py`
- Test: `tests/test_friction_scan.py`
- [ ] **Step 1: Write the failing test**
```python
import datetime
TODAY = datetime.date(2026, 6, 15)
def test_parse_signal_extracts_tag_and_date_and_age():
raw = fs.split_signals(fs.extract_open_section(SAMPLE))[0]
sig = fs.parse_signal(raw, TODAY)
assert sig["tag"] == "gotcha"
assert sig["first_seen"] == "2026-06-01"
assert sig["age_days"] == 14
assert "First thing" in sig["text"]
def test_parse_signal_handles_missing_date():
sig = fs.parse_signal("`[unused]` **No date here** something", TODAY)
assert sig["tag"] == "unused"
assert sig["first_seen"] is None
assert sig["age_days"] is None
```
- [ ] **Step 2: Run test to verify it fails**
Run: `.venv/bin/python -m pytest tests/test_friction_scan.py::test_parse_signal_extracts_tag_and_date_and_age -v`
Expected: FAIL — `parse_signal` undefined.
- [ ] **Step 3: Write minimal implementation**
Add near the top (after imports):
```python
TAG_RE = re.compile(r"`\[(friction|gotcha|recurring|unused)\]`")
DATE_RE = re.compile(r"(\d{4})-(\d{2})-(\d{2})")
```
Add the function (above the `__main__` block):
```python
def parse_signal(raw, today):
"""Turn one raw signal block into a structured dict."""
tag_m = TAG_RE.search(raw)
date_m = DATE_RE.search(raw)
if date_m:
first_seen = date_m.group(0)
seen = datetime.date(int(date_m.group(1)), int(date_m.group(2)), int(date_m.group(3)))
age_days = (today - seen).days
else:
first_seen = None
age_days = None
return {
"tag": tag_m.group(1) if tag_m else None,
"first_seen": first_seen,
"age_days": age_days,
"recurrence_count": 1, # refined in Task 3
"referenced_paths": [], # filled in Task 3
"still_exists": True, # filled in Task 3
"text": " ".join(raw.split()),
}
```
- [ ] **Step 4: Run test to verify it passes**
Run: `.venv/bin/python -m pytest tests/test_friction_scan.py -v`
Expected: PASS (4 tests).
- [ ] **Step 5: Commit**
```bash
git add scripts/friction-scan.py tests/test_friction_scan.py
git commit -m "feat(kaizen): parse tag/first_seen/age per signal"
```
---
## Task 3: Recurrence count + referenced paths + still_exists
**Files:**
- Modify: `scripts/friction-scan.py`
- Test: `tests/test_friction_scan.py`
- [ ] **Step 1: Write the failing test**
```python
def test_recurrence_from_ordinal():
assert fs.parse_recurrence("blah 5th occurrence (06-05/06/06) blah") == 5
def test_recurrence_from_datelist_when_no_ordinal():
# three slash-separated date fragments → recurrence 3
assert fs.parse_recurrence("recurred (06-05/06-09/06-10) again") == 3
def test_recurrence_defaults_to_one():
assert fs.parse_recurrence("a one-off gotcha") == 1
def test_parse_paths_picks_repo_paths_only():
paths = fs.parse_paths("see `scripts/repo-scan.py` and `latest` and `foo.yml`")
assert "scripts/repo-scan.py" in paths
assert "foo.yml" in paths
assert "latest" not in paths
def test_still_exists_false_for_missing_path():
sig = fs.parse_signal("`[unused]` **x** (2026-06-01): `scripts/nope-not-real.py`", TODAY)
assert sig["still_exists"] is False
def test_still_exists_true_for_real_path():
sig = fs.parse_signal("`[gotcha]` **x** (2026-06-01): `scripts/repo-scan.py`", TODAY)
assert sig["still_exists"] is True
```
- [ ] **Step 2: Run test to verify it fails**
Run: `.venv/bin/python -m pytest tests/test_friction_scan.py -k "recurrence or paths or still_exists" -v`
Expected: FAIL — `parse_recurrence` / `parse_paths` undefined.
- [ ] **Step 3: Write minimal implementation**
Add the regexes near the others:
```python
ORDINAL_RE = re.compile(r"(\d+)(?:st|nd|rd|th)\s+(?:occurrence|reinforcement|time)", re.I)
DATELIST_RE = re.compile(r"\((\d{2}-\d{2}(?:/[\d/-]+)+)\)")
BACKTICK_RE = re.compile(r"`([^`]+)`")
PATH_EXTS = (".py", ".yml", ".yaml", ".md", ".sh", ".tf", ".j2", ".toml", ".cfg", ".hcl")
```
Add the helpers (above `parse_signal`):
```python
def parse_recurrence(text):
"""Best-effort recurrence count from explicit markers; default 1."""
counts = [1]
m = ORDINAL_RE.search(text)
if m:
counts.append(int(m.group(1)))
dl = DATELIST_RE.search(text)
if dl:
counts.append(dl.group(1).count("/") + 1)
return max(counts)
def parse_paths(text):
"""Backtick tokens that look like repo paths (contain '/' or a known ext)."""
out, seen = [], set()
for m in BACKTICK_RE.finditer(text):
tok = m.group(1).strip()
if ("/" in tok or tok.endswith(PATH_EXTS)) and tok not in seen:
seen.add(tok)
out.append(tok)
return out
```
Then update `parse_signal` — replace the three placeholder fields:
```python
paths = parse_paths(raw)
still_exists = all(os.path.exists(os.path.join(REPO_ROOT, p)) for p in paths) if paths else True
return {
"tag": tag_m.group(1) if tag_m else None,
"first_seen": first_seen,
"age_days": age_days,
"recurrence_count": parse_recurrence(raw),
"referenced_paths": paths,
"still_exists": still_exists,
"text": " ".join(raw.split()),
}
```
- [ ] **Step 4: Run test to verify it passes**
Run: `.venv/bin/python -m pytest tests/test_friction_scan.py -v`
Expected: PASS (10 tests).
- [ ] **Step 5: Commit**
```bash
git add scripts/friction-scan.py tests/test_friction_scan.py
git commit -m "feat(kaizen): recurrence count + referenced-path existence"
```
---
## Task 4: CLI — `load_signals`, `--json`, `--nudge`
**Files:**
- Modify: `scripts/friction-scan.py`
- Test: `tests/test_friction_scan.py`
- [ ] **Step 1: Write the failing test**
```python
def test_nudge_line_overdue_on_recurrence():
sigs = [{"age_days": 2, "recurrence_count": 5}]
line = fs.nudge_line(sigs)
assert "OVERDUE" in line
assert "max recurrence 5x" in line
def test_nudge_line_ok_when_quiet():
sigs = [{"age_days": 3, "recurrence_count": 1}, {"age_days": 1, "recurrence_count": 1}]
line = fs.nudge_line(sigs)
assert "ok" in line
assert "OVERDUE" not in line
def test_nudge_line_overdue_on_count():
sigs = [{"age_days": 1, "recurrence_count": 1} for _ in range(8)]
assert "OVERDUE" in fs.nudge_line(sigs)
def test_load_signals_reads_real_friction_file():
path = os.path.join(os.path.dirname(__file__), "..", "docs", "FRICTION.md")
sigs = fs.load_signals(path, TODAY)
assert len(sigs) >= 1
assert all(s["tag"] in {"friction", "gotcha", "recurring", "unused"} for s in sigs)
```
- [ ] **Step 2: Run test to verify it fails**
Run: `.venv/bin/python -m pytest tests/test_friction_scan.py -k "nudge or load_signals" -v`
Expected: FAIL — `nudge_line` / `load_signals` undefined.
- [ ] **Step 3: Write minimal implementation**
Add thresholds near the top (after `FRICTION = ...`):
```python
# Nudge thresholds (tunable; the /kaizen self-eval phase revisits these).
NUDGE_MIN_OPEN = 8
NUDGE_MAX_AGE_DAYS = 21
NUDGE_MIN_RECURRENCE = 3
```
Add the functions and replace the `__main__` block:
```python
def load_signals(path, today):
with open(path, encoding="utf-8") as fh:
text = fh.read()
return [parse_signal(s, today) for s in split_signals(extract_open_section(text))]
def nudge_line(signals):
n = len(signals)
ages = [s["age_days"] for s in signals if s.get("age_days") is not None]
oldest = max(ages) if ages else 0
max_rec = max((s["recurrence_count"] for s in signals), default=0)
overdue = n >= NUDGE_MIN_OPEN or oldest >= NUDGE_MAX_AGE_DAYS or max_rec >= NUDGE_MIN_RECURRENCE
status = "OVERDUE — run /kaizen" if overdue else "ok"
return f"kaizen: {n} open signals, oldest {oldest}d, max recurrence {max_rec}x — {status}"
def main():
parser = argparse.ArgumentParser(description="Parse FRICTION.md Open signals for /kaizen.")
parser.add_argument("--nudge", action="store_true", help="print a one-line overdue summary")
parser.add_argument("--today", help="override today's date (YYYY-MM-DD) for testing")
parser.add_argument("--file", default=FRICTION, help="path to FRICTION.md")
args = parser.parse_args()
if args.today:
y, m, d = args.today.split("-")
today = datetime.date(int(y), int(m), int(d))
else:
today = datetime.date.today()
signals = load_signals(args.file, today)
if args.nudge:
print(nudge_line(signals))
else:
print(json.dumps(signals, indent=2))
if __name__ == "__main__":
main()
```
- [ ] **Step 4: Run tests + smoke-test the CLI**
Run: `.venv/bin/python -m pytest tests/test_friction_scan.py -v`
Expected: PASS (14 tests).
Run: `python3 scripts/friction-scan.py --nudge`
Expected: one line like `kaizen: 13 open signals, oldest 14d, max recurrence 5x — OVERDUE — run /kaizen`.
Run: `python3 scripts/friction-scan.py | head -20`
Expected: a JSON array of signal objects.
- [ ] **Step 5: Commit**
```bash
git add scripts/friction-scan.py tests/test_friction_scan.py
git commit -m "feat(kaizen): friction-scan CLI (--json default, --nudge)"
```
---
## Task 5: The `/kaizen` command document
**Files:**
- Create: `.claude/commands/kaizen.md`
- [ ] **Step 1: Create the command file**
Write `.claude/commands/kaizen.md` with exactly this content:
````markdown
# Kaizen — curate the friction log into improvements
Consume the **Open signals** in `docs/FRICTION.md`: decide a verdict for each, migrate
durable knowledge into the right docs, and archive consumed signals into the decisions
ledger. **Curate-only** — do not hunt for new signals; capture stays manual. This is an
interactive, judgment-dense pass: propose, the operator decides, you apply on approval.
Design: `docs/superpowers/specs/2026-06-14-kaizen-command-design.md`.
## Phase 0 — scan
Run `python3 scripts/friction-scan.py > /tmp/kaizen.json`. It returns each Open signal as
`{tag, first_seen, age_days, recurrence_count, referenced_paths, still_exists, text}`.
Treat `still_exists: false` as a hint the signal may already be resolved.
## Phase 1 — triage
Order signals by `recurrence_count` desc, then `age_days` desc, then tag. **Group signals
that share a root cause** and curate them together. Present the agenda before editing
anything: total open, how many recurring (≥3), how many look already-resolved.
## Phase 2 — per-signal curation (interactive)
For each signal/group, present: a one-line restatement, the evidence (age, recurrence,
still-real), and a proposed **verdict**. Verdicts:
- **SYSTEMATIZE** — migrate the durable lesson into its right home (a runbook, an ADR,
`CLAUDE.md`, a new `scripts/repo-scan.py` check, or a hook).
- **CHANGE** — adjust an existing tool/convention/config rather than document it.
- **PARK** — *out-of-phase but not obsolete*. Remove from the active tree, but write a
ledger row recording **where it now lives (git SHA/branch/doc) and a resurrection
trigger**. The default for "not touched lately but not wrong."
- **REMOVE** — *obsolete*: superseded, wrong, never worked, duplicated. Ledger row states
why.
- **ALREADY-BUILT** — the systematization already exists / the fix landed; archive.
- **ACCEPTED** — conscious no-op (revisit-if-recurs); archive.
- **KEEP-OPEN** — still accruing, not ripe; leave it in *Open signals* (no ledger row).
Rules:
- **Knowledge is never removed** — SYSTEMATIZE/migrate it; only *active surface* (scripts,
checks, conventions, plugins) is parked/removed.
- Every reductive verdict must classify *why unused*: **obsolete → REMOVE**,
**out-of-phase → PARK**.
- The operator approves / modifies / rejects each verdict. On approval: do the mechanical
edit (migrate text into the target doc; **move the signal from *Open signals* into the
ledger table**; delete the parked/removed file) and show the diff.
- PARK and REMOVE both delete from the active tree — the difference is the ledger row.
Git history + the ledger row are the park mechanism; never create a `parked/` directory.
## Phase 3 — close-out
- Add a new dated block under `## Kaizen reviews — decisions ledger` (newest first), same
shape as the existing block: a table with columns **Signal (first seen) | Verdict |
Resolution / where it lives now**.
- **Bias-to-remove discipline check:** if every verdict this pass was SYSTEMATIZE/CHANGE
(only accreting), say so explicitly.
- **Self-eval (light):** is `/kaizen` being run often enough (oldest consumed age)? Should
the nudge thresholds in `scripts/friction-scan.py` change? Note it.
- Run `make lint` if any code/docs changed; revert anything that breaks it.
- Commit per `CLAUDE.md` git conventions (one logical unit — straight to `main` if
small/safe, a branch if sweeping; show the diff first for a branch).
- Print a one-line summary: `consumed X · parked Y · removed Z · kept-open W · migrated → <docs>`.
## Headless / cron (future)
Deferred until the notify + cron stack exists (`docs/TODO.md` 11.3). When run
non-interactively, **report only**: print the proposed verdicts and the nudge, do not edit
or commit.
````
- [ ] **Step 2: Verify it parses against the real log**
Run: `python3 scripts/friction-scan.py --today 2026-06-15 | python3 -c "import sys,json; print(len(json.load(sys.stdin)), 'signals')"`
Expected: prints a non-zero signal count with no traceback.
- [ ] **Step 3: Lint**
Run: `make lint`
Expected: passes (markdown isn't linted by yamllint/ansible-lint, but this confirms nothing else broke).
- [ ] **Step 4: Commit**
```bash
git add .claude/commands/kaizen.md
git commit -m "feat(kaizen): /kaizen command — interactive friction curation"
```
---
## Task 6: Stage-2 nudge in `/review-repo` + STATUS/TODO
**Files:**
- Modify: `.claude/commands/review-repo.md`
- Modify: `STATUS.md`
- Modify: `docs/TODO.md`
- [ ] **Step 1: Add the nudge to the review-repo command**
In `.claude/commands/review-repo.md`, find the "Phase 0 — deterministic pre-scan" section
(it runs `scripts/repo-scan.py`). Immediately after that paragraph, add:
```markdown
Also run `python3 scripts/friction-scan.py --nudge` and include its one-line output in the
report's summary — it flags when the kaizen loop (`/kaizen`) is overdue (recurring signals,
backlog size, or age). This is a reminder only; do not act on `FRICTION.md` from here.
```
- [ ] **Step 2: Add a STATUS row**
In `STATUS.md`, under "Real and working today", add a row:
```markdown
| `/kaizen` | Curate `docs/FRICTION.md` Open signals → decisions ledger (`scripts/friction-scan.py` Phase 0 + `.claude/commands/kaizen.md`). On-demand; `--nudge` surfaces in `/review-repo`. Headless/cron deferred (TODO 11.3). |
```
- [ ] **Step 3: Update TODO 11**
In `docs/TODO.md` item 11, mark sub-item 1 built:
Change `1. Build `/retro`: ...` to begin with `1. ~~Build `/retro``... ` — i.e. strike it
through and append: `DONE — built as `/kaizen` (scope narrowed to curate-only per the
2026-06-14 spec; `/retro` name dropped). `scripts/friction-scan.py` + `.claude/commands/kaizen.md`.`
- [ ] **Step 4: Lint**
Run: `make lint`
Expected: passes.
- [ ] **Step 5: Commit**
```bash
git add .claude/commands/review-repo.md STATUS.md docs/TODO.md
git commit -m "feat(kaizen): nudge in /review-repo; STATUS + TODO"
```
---
## Task 7: Dogfood — first real `/kaizen` run
This task is **not** automated; it is the first real use, done interactively with the operator.
- [ ] **Step 1:** Run `/kaizen` against the current Open signals (there are several,
including the 3 added 2026-06-14 and the 5× execution-mode-menu signal).
- [ ] **Step 2:** Work the interactive curation (Phase 2) with the operator, applying
verdicts on approval.
- [ ] **Step 3:** Confirm the close-out: ledger updated, `make lint` green, summary printed.
This both processes the backlog and validates the command end-to-end.
---
## Self-review notes (author)
- **Spec coverage:** scope-A curate-only → Task 5 Phase 02; verdict model incl. PARK →
Task 5 Phase 2 + ledger; single source FRICTION.md → Task 4 `load_signals`; interactive
apply (B) → Task 5; ledger format → Task 5 Phase 3; scanner schema → Tasks 24; nudge +
thresholds → Task 4 + Task 6; out-of-scope items → not built (correct); `/review-repo`
relationship → Task 6 nudge. All covered.
- **No placeholders:** every code step shows complete code; the command doc is written in
full.
- **Type consistency:** the signal dict keys (`tag, first_seen, age_days,
recurrence_count, referenced_paths, still_exists, text`) are identical across Tasks 24
and the command doc; `nudge_line` reads `age_days`/`recurrence_count` only.