boma/docs/FRICTION.md
sjat 8d1d8a88ea docs(friction): escalate execution-mode prompt; no plan→impl approval gate
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-06 15:57:40 +02:00

96 lines
6 KiB
Markdown

# FRICTION.md — kaizen friction log
Raw signals for the periodic **kaizen review** (the methodology retrospective; see
`docs/TODO.md`). This is the input that keeps our tooling and conventions sharpening
over time instead of only accreting.
**How to use:** append freely _during_ work — don't curate, don't fix here. Capture
friction, surprises, fixes that keep recurring, and tooling that isn't earning its
keep. The kaizen review reads this, then proposes **add / change / remove** (biased
toward _remove_) and records the decisions as ADRs.
**Entry format:** `date — [tag] observation — (optional) → systematization idea`
Tags: `[friction]` recurring annoyance · `[gotcha]` surprising behaviour ·
`[recurring]` keeps coming back, should be systematized · `[unused]` tooling not
earning its keep.
---
## 2026-05-30 — initial seed (from the Claude-Code setup session)
- `[recurring]` Every `git commit` needs `rbw` unlocked (the pre-commit ansible-lint
hook decrypts `vault.yml` for its syntax-check). Mitigated with a 5h lock timeout
and an `rbw unlocked` pre-flight convention. → _Open:_ could ansible-lint skip vault
decryption for syntax-check, so committing doesn't need the vault at all?
- `[gotcha]` pre-commit stashes _unstaged_ changes before running hooks, so a partial
commit reverted an interdependent file (`ansible.cfg`) and failed. → Commit
interdependent changes together, or stage the config change first.
- `[gotcha]` `make new-role` had never worked on this host: `mkdir {a,b,c}` brace
expansion fails under `/bin/sh` (dash). Fixed with explicit paths. → A real run
catches what static review can't; consider smoke-testing scaffold commands.
- `[gotcha]` `rbw sync` is required after adding a Vaultwarden item before `rbw get`
finds it (stale local cache).
- `[gotcha]` This shell is zsh — unquoted `$VAR` does not word-split, so a variable
holding a file list was passed as a single argument. → Use explicit args/arrays.
- `[friction]` Long sessions: I make a batch of edits but can't commit until you
`rbw unlock`. The 5h timeout + pre-flight check address the symptom; watch whether
it still bites.
- `[gotcha]` Hooks (or any new `.claude/settings.json`) added mid-session don't
activate until a Claude Code **restart** — the settings watcher only tracks settings
files that existed at session start. Opening `/hooks` and dismissing did _not_ load
them. → Fresh sessions load them normally; restart after adding hooks.
## 2026-05-31
- I asked to draft an ADR and got: No formal status-header convention, but since this is a draft for discussion I'll mark it Proposed so it isn't mistaken for an
accepted decision. Here's the draft.
## 2026-06-01
- `[friction]` The `finishing-a-development-branch` flow (and generic AI/dev tooling)
offers "push and open a Pull Request," but our Forgejo `origin` is trunk-based with
no merge-request / approval gate (CLAUDE.md git conventions). That option doesn't
apply — the real path is local fast-forward merge to `main`, then push. → Skills and
conventions that assume a GitHub-style PR workflow need a homelab-aware variant;
encode that here "finishing a branch" means merge-locally-then-push, not open-a-PR.
## 2026-06-05
- `[recurring]` The `writing-plans` skill ends by asking "subagent-driven vs inline
execution?" — always answer subagent-driven here. Don't ask; default straight to
subagent-driven (fresh subagent per task + review between tasks). → Standing
preference; skip the execution-mode prompt.
- `[recurring]` When a **deferred** decision later resolves, docs that referenced the
deferral go stale and a plan's file-map can miss them (e.g. resolving the mesh-VPN
choice left `new-host.md` still saying "mesh VPN (choice deferred)"; the ubongo work
similarly left a contradiction in CLAUDE.md). A *broadened* final grep sweep caught
both. → On resolving a deferred decision, grep all canonical docs for the deferral
language ("choice deferred", "pending", "TBD", the placeholder's name) and reconcile
every hit — don't rely on the plan's file-map alone. Worth a `/review-repo` check for
lingering "deferred/pending/TBD" references whose ADR has since resolved.
- **Recurred a 3rd time (same day):** ADR-017 resolved the browser-E2E harness but
left ADR-015's own "Deferred" list item #2 still reading as open — not caught by the
ADR-017 plan's sweep (which only checked for *its own* placeholder language), only
by a later STATUS pass. Lesson sharpened: the stale reference often lives in the
**originating ADR's Deferred section**, which the resolving ADR's plan won't think
to grep. → When an ADR resolves another ADR's deferred item, edit that **source
ADR's Deferred list** in the same change. Three hits now — promote from "worth a
check" to **build it**: a `/review-repo` rule flagging any ADR "Deferred/Open" entry
whose subject is named as RESOLVED/DECIDED elsewhere.
## 2026-06-06
- `[recurring]` **Asked the execution-mode question AGAIN** ("subagent-driven vs inline —
which approach?") at the end of `writing-plans`, despite the 2026-06-05 standing
preference *and* the `always-subagent-driven-execution` memory both saying don't ask.
Root cause: the `writing-plans` skill's "Execution Handoff" step scripts the menu, and
I followed the skill text over the user's standing override. Second occurrence →
escalate from "skip the prompt" to a **hard rule**: never present the execution-mode
menu; finishing a plan means defaulting straight to subagent-driven.
- `[friction]` **Don't pause for approval between writing a plan and implementing it.**
The user has standing pre-approval to carry straight through plan → implementation. The
brainstorming/plan flow already has explicit approval gates (design approval, spec
review); adding another "shall I proceed to implement?" gate after the plan is written
is redundant friction. → After `writing-plans` finishes, begin subagent-driven
implementation directly. The only reason to stop is a genuine blocker or ambiguity, not
a routine checkpoint.