boma/docs/superpowers/specs/2026-06-14-kaizen-command-design.md
sjat 1a0e30e278 docs(spec): /kaizen — kaizen-loop command (TODO 11)
Curate-only consume pass over FRICTION.md Open signals: interactive guided
session, add/change/park/remove verdicts (park-with-resurrection-trigger to
protect out-of-phase tooling on a solo project), single source = FRICTION.md,
ledger is the durable record. Mirrors /review-repo (command md + stdlib scanner).
Stage 1 on-demand + stage-2 nudge; headless/cron deferred (TODO 11.3).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-14 21:05:09 +02:00

171 lines
9.9 KiB
Markdown
Raw Permalink Blame History

This file contains ambiguous Unicode characters

This file contains Unicode characters that might be confused with other characters. If you think that this is intentional, you can safely ignore this warning. Use the Escape button to reveal them.

# `/kaizen` — kaizen-loop command (design)
**Status:** Designed, not built. Resolves `docs/TODO.md` item 11 (Kaizen loop).
**Date:** 2026-06-14.
## Context
boma runs a kaizen (continuous-improvement) loop on its own methodology and tooling.
The *capture* half already works: raw signals are appended to `docs/FRICTION.md`
*Open signals* during work (tags `[friction]`/`[gotcha]`/`[recurring]`/`[unused]`). The
*consume* half — periodically reading those signals, deciding **add / change / park /
remove**, migrating durable knowledge into the right docs, and archiving consumed signals
into the decisions ledger — is **manual** and therefore easy to skip. The one kaizen
review on record (`FRICTION.md`, 2026-06-10 ledger block) was done by hand; its own
process note says "the `/retro` tool (TODO 11) still isn't built, so this review was
manual."
This spec defines the command that makes the consume pass a repeatable, low-friction
ritual. The name **`/kaizen`** is chosen over the placeholder `/retro`: a "retro" is a
backward-looking, time-boxed ritual, whereas this is continuous improvement — and
`FRICTION.md` already speaks this language ("kaizen friction log", "Kaizen reviews —
decisions ledger"), so the command name reinforces the artifact names.
## Decisions (from the 2026-06-14 brainstorm)
1. **Scope: curate-only.** `/kaizen` consumes the `FRICTION.md` *Open signals*; it does
**not** auto-harvest new signals. Capture stays manual and continuous. `FRICTION.md`
is the **single input source** (single source of truth).
2. **Verdict model: add / change / park / remove**, with a critical split on the
reductive side to protect a solo, phase-shifting project:
- **Knowledge is never removed** — it is *migrated* to the right doc or *archived* to
the ledger. The reductive verdicts act only on *active surface* (scripts, checks,
conventions, plugins), never on understanding.
- **park** — *out-of-phase but not obsolete*; plausibly valuable in a later focus.
Moved out of the active surface but recorded in the ledger with **(a)** where it now
lives (git SHA/branch/doc) and **(b)** an explicit **resurrection trigger**.
- **remove** — reserved for the *obsolete*: superseded, wrong, never worked, duplicated.
- Every reductive verdict must classify *why unused*: **obsolete → remove**,
**out-of-phase → park**. The default for "not touched lately but not wrong" is **park**.
- Reversibility safety net: single operator + everything in git, so even a wrong
`remove` is `git revert`-able with a ledger breadcrumb; `park` lowers the cost further.
- Precedent in-repo: `docs/runbooks/claude-code-setup.md` already lists "Deferred
plugins … with triggers" — park-with-a-trigger, made a first-class kaizen outcome.
3. **Trigger model: on-demand command + a light nudge, staged.**
- Stage 1 (this spec): the **on-demand** `/kaizen` command.
- Stage 2 (this spec, small follow-on): a **nudge**`friction-scan.py --nudge`
prints a one-line "loop overdue" reminder, surfaced inside `/review-repo`'s report.
- Deferred (TODO 11.3): a **scheduled headless** run (report-only) once the
notification (ntfy) + scheduled-job/cron stack exists.
4. **Apply model: interactive guided session.** `/kaizen` proposes verdicts (one or
grouped); the operator approves / modifies / rejects each; on approval the command
performs the mechanical edit and shows the diff, then commits at close-out. There is
no auto-applied "safe class" (unlike `/review-repo`): every kaizen verdict is a
judgment call, so the human is in the loop for each. Report-only behaviour is reserved
for the future headless path.
## Components
Mirrors the `/review-repo` shape (`.claude/commands/review-repo.md` + `scripts/repo-scan.py`):
1. **`scripts/friction-scan.py`** — stdlib only; parses `FRICTION.md` *Open signals* and
emits structured data. Two modes:
- `--json` (default): the Phase-0 input for `/kaizen`.
- `--nudge`: prints one line and signals "overdue" per the thresholds below.
2. **`.claude/commands/kaizen.md`** — the interactive curation process (session flow below).
3. **`tests/test_friction_scan.py`** — unit tests for the parser (matches the
`tests/test_repo_scan.py` convention).
4. **`/review-repo` hook-up (stage 2)** — `review-repo.md` calls `friction-scan.py
--nudge` and includes the line in its report.
5. **Deferred:** the headless/cron path (TODO 11.3).
### `friction-scan.py` output schema
Per *Open signal*: `{tag, first_seen, age_days, recurrence_count, referenced_paths,
still_exists, text}`.
- `tag` — one of `friction` / `gotcha` / `recurring` / `unused`.
- `first_seen` / `age_days` — parsed from the leading `date — [tag]` marker.
- `recurrence_count` — best-effort from explicit markers in the entry (entries already
write "5th occurrence (06-05/06/06/…)"); refined by the human during triage.
- `referenced_paths` / `still_exists` — paths the signal names and whether they still
exist on disk (a missing target hints the signal may be already-resolved).
`--nudge` reports **overdue** when **any** holds: `recurrence_count >= 3` for any signal,
open count `>= 8`, or oldest `age_days >= 21`. Thresholds are constants, tunable, and the
self-eval phase revisits them.
## The `/kaizen` session flow
**Phase 0 — scan (deterministic).** Run `friction-scan.py --json`. Produces the agenda
and the cheap "is this still real?" check (`still_exists`).
**Phase 1 — triage.** Order signals by recurrence, then age, then tag. Group signals
sharing a root cause (e.g. the execution-mode-menu and brainstorming-gate signals are both
"external skill script vs boma convention" — curated together). Present the agenda before
editing anything: counts of open / recurring / likely-already-resolved.
**Phase 2 — per-signal curation (interactive).** For each signal/group, present: a
one-line restatement, the evidence (age/recurrence, still-real), and a proposed
**verdict** —
- **systematize** → migrate the durable lesson into its right home (a runbook, an ADR,
`CLAUDE.md`, a new `repo-scan.py` check, or a hook),
- **change** → adjust an existing tool/convention/config rather than document it,
- **park** → ledger row with git location + resurrection trigger,
- **remove** → obsolete; ledger row with the reason,
- **already-built** → the systematization already exists / the fix landed elsewhere; archive,
- **accepted** → conscious no-op (revisit-if-recurs); archive,
- **keep-open** → still accruing; leave in *Open signals* (the only verdict with no ledger row).
The ledger verdict vocabulary is therefore `SYSTEMATIZE · CHANGE · PARK · REMOVE ·
ALREADY-BUILT · ACCEPTED` (keep-open produces no row). These extend the verdicts the
2026-06-10 ledger block already used (CHANGE, MIGRATE, already built, accepted).
The operator approves / modifies / rejects each. On approval, the command performs the
mechanical edit (migrate the text into the target doc; **move the signal from *Open
signals* into the ledger table**; delete/park the file) and shows the diff. **park and
remove both delete from the active tree** — the difference is the ledger row (park records
a resurrection trigger). Git history + the ledger row *are* the park mechanism; there is
no `parked/` graveyard directory.
**Phase 3 — close-out.**
- Write a new dated review block in the ledger (newest-first, same shape as the 2026-06-10
block).
- **Bias-to-remove discipline check** — if every verdict this pass was "add", flag that
the loop is only accreting.
- **Self-eval** (light) — is `/kaizen` being run often enough (oldest-consumed age); should
the nudge thresholds change.
- `make lint` if code/docs changed; commit per `CLAUDE.md` git conventions (the curation
is one logical unit — straight to `main` if small/safe, a branch if sweeping).
- Print a one-line summary: consumed X · parked Y · removed Z · kept-open W · migrated →
\<docs>.
### Ledger row format
A new dated block extends the existing `## Kaizen reviews — decisions ledger` table:
| column | content |
|---|---|
| Signal (first seen) | the signal + first-seen date + recurrence (e.g. "5× 06-05…06-14") |
| Verdict | `SYSTEMATIZE` · `CHANGE` · `PARK` · `REMOVE` · `ALREADY-BUILT` · `ACCEPTED` |
| Resolution / where it lives now | systematize → the doc/guard it migrated to; **park** → git location + resurrection trigger; remove → why obsolete |
Parked rows stay permanently visible in the ledger with their trigger, so a future phase
can `grep PARK` and revive — the explicit answer to "don't drop something we'll come back to."
## Out of scope (YAGNI)
- **Headless/cron run** — deferred to the notify + cron stack (TODO 11.3).
- **Auto-harvesting new signals** — rejected; capture stays manual, and the `[unused]` tag
is how dormant tooling enters the loop.
- **Decision/ADR re-challenge** — that is TODO 13 ("Intentions"), a separate future
command; `/kaizen` curates methodology/tooling signals, not service/architecture decisions.
- **Auto tooling-usage inventory** — rejected for the same reason as auto-harvest.
- **A separate report artifact** (à la `docs/reviews/`) — the ledger *is* the durable
record; the interactive session is the "report".
## Relationship to `/review-repo`
`/review-repo` audits **repo drift** (code/doc staleness, conformance). `/kaizen` curates
**methodology/tooling** friction. They stay distinct. Future integration (not in this
spec): when `/review-repo` sees a finding recur across runs, it could append a
`[recurring]` signal to `FRICTION.md`, making `/review-repo` a *producer* into the single
input source that `/kaizen` consumes.
## Build order
1. `scripts/friction-scan.py` (`--json`) + `tests/test_friction_scan.py`.
2. `.claude/commands/kaizen.md` (the session flow).
3. First real `/kaizen` run against the current Open signals (dogfood).
4. Stage 2: `--nudge` + `/review-repo` hook-up.
5. (Deferred) headless/cron — TODO 11.3.