diff --git a/docs/FRICTION.md b/docs/FRICTION.md
new file mode 100644
index 0000000..b8d58ca
--- /dev/null
+++ b/docs/FRICTION.md
@@ -0,0 +1,37 @@
+# FRICTION.md — kaizen friction log
+
+Raw signals for the periodic **kaizen review** (the methodology retrospective; see
+`docs/TODO.md`). This is the input that keeps our tooling and conventions sharpening
+over time instead of only accreting.
+
+**How to use:** append freely *during* work — don't curate, don't fix here. Capture
+friction, surprises, fixes that keep recurring, and tooling that isn't earning its
+keep. The kaizen review reads this, then proposes **add / change / remove** (biased
+toward *remove*) and records the decisions as ADRs.
+
+**Entry format:** `date — [tag] observation — (optional) → systematization idea`
+Tags: `[friction]` recurring annoyance · `[gotcha]` surprising behaviour ·
+`[recurring]` keeps coming back, should be systematized · `[unused]` tooling not
+earning its keep.
+
+---
+
+## 2026-05-30 — initial seed (from the Claude-Code setup session)
+
+- `[recurring]` Every `git commit` needs `rbw` unlocked (the pre-commit ansible-lint
+  hook decrypts `vault.yml` for its syntax-check). Mitigated with a 5h lock timeout
+  and an `rbw unlocked` pre-flight convention. → *Open:* could ansible-lint skip vault
+  decryption for syntax-check, so committing doesn't need the vault at all?
+- `[gotcha]` pre-commit stashes *unstaged* changes before running hooks, so a partial
+  commit reverted an interdependent file (`ansible.cfg`) and failed. → Commit
+  interdependent changes together, or stage the config change first.
+- `[gotcha]` `make new-role` had never worked on this host: `mkdir {a,b,c}` brace
+  expansion fails under `/bin/sh` (dash). Fixed with explicit paths. → A real run
+  catches what static review can't; consider smoke-testing scaffold commands.
+- `[gotcha]` `rbw sync` is required after adding a Vaultwarden item before `rbw get`
+  finds it (stale local cache).
+- `[gotcha]` This shell is zsh — unquoted `$VAR` does not word-split, so a variable
+  holding a file list was passed as a single argument. → Use explicit args/arrays.
+- `[friction]` Long sessions: I make a batch of edits but can't commit until you
+  `rbw unlock`. The 5h timeout + pre-flight check address the symptom; watch whether
+  it still bites.
diff --git a/docs/TODO.md b/docs/TODO.md
index 8782b51..c402ea5 100644
--- a/docs/TODO.md
+++ b/docs/TODO.md
@@ -3,7 +3,7 @@
 - [x] Main readme only says ansible, not terraform. Should properbly be included.
 - [x] Main readme does not include a description of the name boma, nor the scope (i.e. infrastructure - not laptops)
-- [ ] Method to review repo to ensure
+- [x] Method to review repo to ensure
   - We dont carry around code, comments, notes, etc. that is no longer needed but was perhaps added to fix an issue that has been resolved.
   - That all code, structure, comments, notes etc. follow our design decisions.
   - That clear intent is documented throughout - and that there are not any overlaps, contradictions etc.
@@ -21,6 +21,8 @@
   - What to install on nodes?
     - firewalls?
     - apps?
+  - wiring up Loki, Prometheus, Grafana dashboards, Grafana alerts, Uptime Kuma alerts on askari
+  - tagging strategy - we need a specific standard so that we can target runs, but don't over-tag.
 
 - [ ] Split horizon FQDN
   - with or without nyumbani
@@ -48,3 +50,15 @@
   managed /etc/cron.d file. Open Qs: general role vs control-node-only; prune undeclared
   jobs (repo authoritative) vs additive; validate headless email + that cron's env has
   the `claude` CLI. The /review-repo fortnightly job is the first entry.
+
+- [ ] Claude setup
+- superpowers or other methodologies? → decided: brainstorm for intent, capture as
+  ADRs (skip plan files); hooks + slash commands + /review-repo for enforcement at scale.
+
+- [ ] Kaizen loop — set up ~2026-06-06 (one week from now)
+  - Build `/retro`: reads `docs/FRICTION.md` + `/review-repo` recurring findings + a
+    tooling-usage inventory; proposes add / change / **remove** (biased to remove);
+    records decisions as ADRs; evaluates itself. Recurrence-triggered + light periodic sweep.
+  - `docs/FRICTION.md` is live now — keep appending raw signals until the retro consumes them.
+
+- [ ] What is the right order of operations when spinning up from scratch? (OS, DNS, authentik, traefik...?)
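
Review note on the `make new-role` entry in `docs/FRICTION.md`: a minimal sketch of what "explicit paths" plus a smoke test could look like in POSIX `sh`. The function name `new_role` and the subdirectory names (`tasks`, `handlers`, `defaults`) are hypothetical and not taken from the repo's Makefile; the entry itself only shows the placeholder `mkdir {a,b,c}`.

```shell
#!/bin/sh
# Sketch only: new_role and the subdirectory names are hypothetical.
set -eu

# Create a role skeleton with one mkdir per path. A loop avoids
# `mkdir -p "$1"/{tasks,handlers,defaults}`, which dash does not expand.
new_role() {
    role_dir=$1
    for sub in tasks handlers defaults; do
        mkdir -p "$role_dir/$sub"
    done
}

# Smoke test: run the scaffold in a throwaway directory and check the result.
tmp=$(mktemp -d)
new_role "$tmp/roles/example"
for sub in tasks handlers defaults; do
    [ -d "$tmp/roles/example/$sub" ] || { echo "missing: $sub" >&2; exit 1; }
done
echo "scaffold ok"
rm -rf "$tmp"
```

Because the loop relies on no brace expansion, it should behave the same under dash, bash, and zsh. A check like the one at the end could run in CI or as a `make` target, so a scaffold command that has never actually been executed on the host gets caught early.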