docs(spec): /kaizen — kaizen-loop command (TODO 11)

Curate-only consume pass over FRICTION.md Open signals: interactive guided
session, add/change/park/remove verdicts (park-with-resurrection-trigger to
protect out-of-phase tooling on a solo project), single source = FRICTION.md,
ledger is the durable record. Mirrors /review-repo (command md + stdlib scanner).
Stage 1 on-demand + stage-2 nudge; headless/cron deferred (TODO 11.3).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
This commit is contained in:
sjat 2026-06-14 21:05:09 +02:00
parent e5867422d0
commit 1a0e30e278

View file

@ -0,0 +1,171 @@
# `/kaizen` — kaizen-loop command (design)
**Status:** Designed, not built. Resolves `docs/TODO.md` item 11 (Kaizen loop).
**Date:** 2026-06-14.
## Context
boma runs a kaizen (continuous-improvement) loop on its own methodology and tooling.
The *capture* half already works: raw signals are appended to `docs/FRICTION.md`
*Open signals* during work (tags `[friction]`/`[gotcha]`/`[recurring]`/`[unused]`). The
*consume* half — periodically reading those signals, deciding **add / change / park /
remove**, migrating durable knowledge into the right docs, and archiving consumed signals
into the decisions ledger — is **manual** and therefore easy to skip. The one kaizen
review on record (`FRICTION.md`, 2026-06-10 ledger block) was done by hand; its own
process note says "the `/retro` tool (TODO 11) still isn't built, so this review was
manual."
This spec defines the command that makes the consume pass a repeatable, low-friction
ritual. The name **`/kaizen`** is chosen over the placeholder `/retro`: a "retro" is a
backward-looking, time-boxed ritual, whereas this is continuous improvement — and
`FRICTION.md` already speaks this language ("kaizen friction log", "Kaizen reviews —
decisions ledger"), so the command name reinforces the artifact names.
## Decisions (from the 2026-06-14 brainstorm)
1. **Scope: curate-only.** `/kaizen` consumes the `FRICTION.md` *Open signals*; it does
**not** auto-harvest new signals. Capture stays manual and continuous. `FRICTION.md`
is the **single input source** (single source of truth).
2. **Verdict model: add / change / park / remove**, with a critical split on the
reductive side to protect a solo, phase-shifting project:
- **Knowledge is never removed** — it is *migrated* to the right doc or *archived* to
the ledger. The reductive verdicts act only on *active surface* (scripts, checks,
conventions, plugins), never on understanding.
- **park***out-of-phase but not obsolete*; plausibly valuable in a later focus.
Moved out of the active surface but recorded in the ledger with **(a)** where it now
lives (git SHA/branch/doc) and **(b)** an explicit **resurrection trigger**.
- **remove** — reserved for the *obsolete*: superseded, wrong, never worked, duplicated.
- Every reductive verdict must classify *why unused*: **obsolete → remove**,
**out-of-phase → park**. The default for "not touched lately but not wrong" is **park**.
- Reversibility safety net: single operator + everything in git, so even a wrong
`remove` is `git revert`-able with a ledger breadcrumb; `park` lowers the cost further.
- Precedent in-repo: `docs/runbooks/claude-code-setup.md` already lists "Deferred
plugins … with triggers" — park-with-a-trigger, made a first-class kaizen outcome.
3. **Trigger model: on-demand command + a light nudge, staged.**
- Stage 1 (this spec): the **on-demand** `/kaizen` command.
- Stage 2 (this spec, small follow-on): a **nudge**`friction-scan.py --nudge`
prints a one-line "loop overdue" reminder, surfaced inside `/review-repo`'s report.
- Deferred (TODO 11.3): a **scheduled headless** run (report-only) once the
notification (ntfy) + scheduled-job/cron stack exists.
4. **Apply model: interactive guided session.** `/kaizen` proposes verdicts (one or
grouped); the operator approves / modifies / rejects each; on approval the command
performs the mechanical edit and shows the diff, then commits at close-out. There is
no auto-applied "safe class" (unlike `/review-repo`): every kaizen verdict is a
judgment call, so the human is in the loop for each. Report-only behaviour is reserved
for the future headless path.
## Components
Mirrors the `/review-repo` shape (`.claude/commands/review-repo.md` + `scripts/repo-scan.py`):
1. **`scripts/friction-scan.py`** — stdlib only; parses `FRICTION.md` *Open signals* and
emits structured data. Two modes:
- `--json` (default): the Phase-0 input for `/kaizen`.
- `--nudge`: prints one line and signals "overdue" per the thresholds below.
2. **`.claude/commands/kaizen.md`** — the interactive curation process (session flow below).
3. **`tests/test_friction_scan.py`** — unit tests for the parser (matches the
`tests/test_repo_scan.py` convention).
4. **`/review-repo` hook-up (stage 2)** — `review-repo.md` calls `friction-scan.py
--nudge` and includes the line in its report.
5. **Deferred:** the headless/cron path (TODO 11.3).
### `friction-scan.py` output schema
Per *Open signal*: `{tag, first_seen, age_days, recurrence_count, referenced_paths,
still_exists, text}`.
- `tag` — one of `friction` / `gotcha` / `recurring` / `unused`.
- `first_seen` / `age_days` — parsed from the leading `date — [tag]` marker.
- `recurrence_count` — best-effort from explicit markers in the entry (entries already
write "5th occurrence (06-05/06/06/…)"); refined by the human during triage.
- `referenced_paths` / `still_exists` — paths the signal names and whether they still
exist on disk (a missing target hints the signal may be already-resolved).
`--nudge` reports **overdue** when **any** holds: `recurrence_count >= 3` for any signal,
open count `>= 8`, or oldest `age_days >= 21`. Thresholds are constants, tunable, and the
self-eval phase revisits them.
## The `/kaizen` session flow
**Phase 0 — scan (deterministic).** Run `friction-scan.py --json`. Produces the agenda
and the cheap "is this still real?" check (`still_exists`).
**Phase 1 — triage.** Order signals by recurrence, then age, then tag. Group signals
sharing a root cause (e.g. the execution-mode-menu and brainstorming-gate signals are both
"external skill script vs boma convention" — curated together). Present the agenda before
editing anything: counts of open / recurring / likely-already-resolved.
**Phase 2 — per-signal curation (interactive).** For each signal/group, present: a
one-line restatement, the evidence (age/recurrence, still-real), and a proposed
**verdict** —
- **systematize** → migrate the durable lesson into its right home (a runbook, an ADR,
`CLAUDE.md`, a new `repo-scan.py` check, or a hook),
- **change** → adjust an existing tool/convention/config rather than document it,
- **park** → ledger row with git location + resurrection trigger,
- **remove** → obsolete; ledger row with the reason,
- **already-built** → the systematization already exists / the fix landed elsewhere; archive,
- **accepted** → conscious no-op (revisit-if-recurs); archive,
- **keep-open** → still accruing; leave in *Open signals* (the only verdict with no ledger row).
The ledger verdict vocabulary is therefore `SYSTEMATIZE · CHANGE · PARK · REMOVE ·
ALREADY-BUILT · ACCEPTED` (keep-open produces no row). These extend the verdicts the
2026-06-10 ledger block already used (CHANGE, MIGRATE, already built, accepted).
The operator approves / modifies / rejects each. On approval, the command performs the
mechanical edit (migrate the text into the target doc; **move the signal from *Open
signals* into the ledger table**; delete/park the file) and shows the diff. **park and
remove both delete from the active tree** — the difference is the ledger row (park records
a resurrection trigger). Git history + the ledger row *are* the park mechanism; there is
no `parked/` graveyard directory.
**Phase 3 — close-out.**
- Write a new dated review block in the ledger (newest-first, same shape as the 2026-06-10
block).
- **Bias-to-remove discipline check** — if every verdict this pass was "add", flag that
the loop is only accreting.
- **Self-eval** (light) — is `/kaizen` being run often enough (oldest-consumed age); should
the nudge thresholds change.
- `make lint` if code/docs changed; commit per `CLAUDE.md` git conventions (the curation
is one logical unit — straight to `main` if small/safe, a branch if sweeping).
- Print a one-line summary: consumed X · parked Y · removed Z · kept-open W · migrated →
\<docs>.
### Ledger row format
A new dated block extends the existing `## Kaizen reviews — decisions ledger` table:
| column | content |
|---|---|
| Signal (first seen) | the signal + first-seen date + recurrence (e.g. "5× 06-05…06-14") |
| Verdict | `SYSTEMATIZE` · `CHANGE` · `PARK` · `REMOVE` · `ALREADY-BUILT` · `ACCEPTED` |
| Resolution / where it lives now | systematize → the doc/guard it migrated to; **park** → git location + resurrection trigger; remove → why obsolete |
Parked rows stay permanently visible in the ledger with their trigger, so a future phase
can `grep PARK` and revive — the explicit answer to "don't drop something we'll come back to."
## Out of scope (YAGNI)
- **Headless/cron run** — deferred to the notify + cron stack (TODO 11.3).
- **Auto-harvesting new signals** — rejected; capture stays manual, and the `[unused]` tag
is how dormant tooling enters the loop.
- **Decision/ADR re-challenge** — that is TODO 13 ("Intentions"), a separate future
command; `/kaizen` curates methodology/tooling signals, not service/architecture decisions.
- **Auto tooling-usage inventory** — rejected for the same reason as auto-harvest.
- **A separate report artifact** (à la `docs/reviews/`) — the ledger *is* the durable
record; the interactive session is the "report".
## Relationship to `/review-repo`
`/review-repo` audits **repo drift** (code/doc staleness, conformance). `/kaizen` curates
**methodology/tooling** friction. They stay distinct. Future integration (not in this
spec): when `/review-repo` sees a finding recur across runs, it could append a
`[recurring]` signal to `FRICTION.md`, making `/review-repo` a *producer* into the single
input source that `/kaizen` consumes.
## Build order
1. `scripts/friction-scan.py` (`--json`) + `tests/test_friction_scan.py`.
2. `.claude/commands/kaizen.md` (the session flow).
3. First real `/kaizen` run against the current Open signals (dogfood).
4. Stage 2: `--nudge` + `/review-repo` hook-up.
5. (Deferred) headless/cron — TODO 11.3.