Compare commits
17 commits
4116286ed0
...
7ebbc113ab
| Author | SHA1 | Date | |
|---|---|---|---|
| 7ebbc113ab | |||
| fa3db421dc | |||
| d0a3307822 | |||
| 0df24909e3 | |||
| 40a428975a | |||
| 6d7d27b03b | |||
| b3ca510380 | |||
| 44dbd4628f | |||
| 188882449d | |||
| 9b1502cf7d | |||
| a9aab9d040 | |||
| 3c920ae630 | |||
| ab14d65aa1 | |||
| 89179dd7c9 | |||
| a3ea0f7d80 | |||
| ce3319cbed | |||
| dfbe37916f |
27 changed files with 1434 additions and 60 deletions
|
|
@ -25,7 +25,8 @@ report the rest, and write a tracked report to `docs/reviews/`.
|
|||
### Phase 0 — deterministic pre-scan
|
||||
Run `python3 scripts/repo-scan.py > /tmp/repo-scan.json`. It returns the **inventory**
|
||||
(roles, ADRs, runbooks, playbooks, scripts — your shard list) and **exact findings**
|
||||
(markers, broken refs, unencrypted vaults). Fold these into the report verbatim.
|
||||
(markers, broken refs, unencrypted vaults, ADR-structure violations). Fold these into
|
||||
the report verbatim.
|
||||
|
||||
It also emits two deferral checks (see Phase 2): `open-deferred-item` (every still-open
|
||||
ADR "Deferred/Open" entry — a checklist to confirm) and `stale-deferred` (an entry
|
||||
|
|
|
|||
|
|
@ -231,6 +231,7 @@ Single-contributor, trunk-based (no merge requests / approval gates):
|
|||
| Firewall strategy | `docs/decisions/020-firewall.md` |
|
||||
| Operational access | `docs/decisions/021-operational-access.md` |
|
||||
| Backup & disaster recovery | `docs/decisions/022-backup.md` |
|
||||
| ADR structure & lifecycle | `docs/decisions/023-adr-structure.md` |
|
||||
| Adding a new role | `docs/runbooks/new-role.md` |
|
||||
| Adding a new host | `docs/runbooks/new-host.md` |
|
||||
| Rotating vault secrets | `docs/runbooks/rotate-secrets.md` |
|
||||
|
|
|
|||
|
|
@ -25,6 +25,24 @@ _(append new raw signals here; the next kaizen review consumes them)_
|
|||
invented a Status header ("Proposed") on the fly because there's no documented
|
||||
convention for how we write ADRs (status lifecycle, required sections). → TODO 10.2 —
|
||||
decide a minimal ADR template / status convention.
|
||||
- `[recurring]` **Brainstorming's "user reviews spec" gate fires despite a standing
|
||||
agreement to skip it** (2026-06-10): writing the ADR-structure spec, I stopped to ask
|
||||
the user to review the finished spec before writing the plan — the
|
||||
`superpowers:brainstorming` skill scripts that gate. We had previously agreed I should
|
||||
move directly from the Q/A to the implementation plan once the spec is written. Same
|
||||
shape as the execution-mode-menu signal: an external skill's script conflicting with a
|
||||
boma convention, where prose reminders don't hold. → consider a mechanical guard
|
||||
(Stop-hook family) or a CLAUDE.md/skill-override note that suppresses the spec-review
|
||||
gate.
|
||||
- `[recurring]` **Subagent faithfulness self-reports can be wrong — controller must
|
||||
diff** (2026-06-10): during the ADR-023 retroactive restructure, an implementer
|
||||
subagent reported "0 substantive deletions, the See-also lines reappear verbatim" for
|
||||
ADR-014, but it had actually dropped the cross-reference lines. Caught only by the
|
||||
controller independently running `git show <sha> | grep '^-[^-]'`. For
|
||||
faithfulness-critical edits delegated to subagents, the agent's own audit is not
|
||||
sufficient evidence. → systematize a controller-side deletion-audit step (every `-`
|
||||
line must be a classified, expected change) before accepting any "presentational-only"
|
||||
restructure; consider a helper script.
|
||||
|
||||
---
|
||||
|
||||
|
|
|
|||
|
|
@ -1,5 +1,9 @@
|
|||
# ADR-001 — Architecture overview
|
||||
|
||||
## Status
|
||||
|
||||
Accepted (2026-05-30)
|
||||
|
||||
## Context
|
||||
|
||||
This document describes the overall architecture of the homelab infrastructure
|
||||
|
|
@ -65,3 +69,21 @@ This architecture prioritises:
|
|||
- **Simplicity**: few moving parts, no orchestration layer (no Kubernetes, no Swarm)
|
||||
- **Reproducibility**: any host can be rebuilt from scratch via Ansible
|
||||
- **Legibility**: a human reading the repo can understand what runs where
|
||||
|
||||
## Consequences
|
||||
|
||||
Drawn from the boundaries this ADR already states:
|
||||
|
||||
- The small fleet (2–5 VMs) is treated as individuals, not cattle (per Infrastructure),
|
||||
and forgoing an orchestration layer is the cost of the simplicity priority (per
|
||||
Decision).
|
||||
- The control node `ubongo` cannot be created by the Terraform it hosts, so it is
|
||||
provisioned manually — the one documented exception to Terraform-owned VM existence
|
||||
(per Infrastructure / Host groups; ADR-009, ADR-015).
|
||||
- Management scope is deliberately bounded: Proxmox configuration itself (storage,
|
||||
clustering, networking) is out of scope, and the `control` group never runs the
|
||||
`docker_host` role (per Host groups).
|
||||
- Compose files are always regenerated by Ansible on deploy; no hand-edited Compose
|
||||
files exist on hosts (per Service interaction model).
|
||||
- The "What this repo manages" table describes the *intended* design — STATUS.md
|
||||
records what is actually built (per that section).
|
||||
|
|
|
|||
|
|
@ -1,5 +1,9 @@
|
|||
# ADR-002 — Security baseline and strategy
|
||||
|
||||
## Status
|
||||
|
||||
Accepted (2026-05-30)
|
||||
|
||||
## Context
|
||||
|
||||
Security here is not a single control but the sum of several combined efforts —
|
||||
|
|
@ -183,3 +187,27 @@ This posture was chosen to be:
|
|||
Out-of-scope items and conscious trade-offs are recorded in
|
||||
`docs/security/accepted-risks.md` rather than here, so this decision record stays
|
||||
stable while the risk posture evolves.
|
||||
|
||||
## Consequences
|
||||
|
||||
Drawn from the trade-offs, scoping, and follow-on work this ADR already states:
|
||||
|
||||
- Targeted/physical adversaries are out of scope at this scale, and supply chain is
|
||||
consciously deprioritized — active vuln scanning is deferred as an accepted risk
|
||||
(per Threat model; `docs/security/accepted-risks.md`).
|
||||
- SELinux is not used (non-native to Debian, redundant with AppArmor), recorded as an
|
||||
accepted risk (per Mandatory access control).
|
||||
- Some CIS L2 items require separate partitions with restrictive mount options, which
|
||||
reaches into VM disk layout — a provisioning concern (Terraform / cloud-init, ADR-006),
|
||||
not just the `base` role (per Hardening standard). Any impractical CIS item is exempted
|
||||
into the accepted-risk register with rationale, recording named exceptions rather than a
|
||||
blanket opt-out.
|
||||
- Several controls and governance mechanisms are stated as planned, not yet built:
|
||||
Suricata network IDS, active alerting wiring AIDE/`auditd`/`fail2ban`/Suricata plus
|
||||
log-source-silence into Grafana, the `/security-review` skill and its aggregation of
|
||||
every `roles/*/SECURITY.md`, and the periodic security review (per File integrity /
|
||||
Governance; STATUS.md / `docs/TODO.md`).
|
||||
- The per-service security bar is enforced manually in review today, pending the planned
|
||||
`/security-review` automation (per Governance).
|
||||
- The accepted-risk register is kept out of this ADR so the record stays stable while the
|
||||
risk posture evolves (per Decision; `docs/security/accepted-risks.md`).
|
||||
|
|
|
|||
|
|
@ -1,6 +1,20 @@
|
|||
# ADR-003 — Toolchain decisions
|
||||
|
||||
## Execution engine
|
||||
## Status
|
||||
|
||||
Accepted (2026-05-30)
|
||||
|
||||
## Context
|
||||
|
||||
boma needs a defined, reproducible toolchain for running and testing its Ansible
|
||||
monorepo: an execution engine, a Python environment, secrets handling, a testing
|
||||
framework, linting, CI/CD, developer-ergonomics conventions, and a collections/roles
|
||||
policy. This ADR records the choice made for each, together with the alternatives
|
||||
weighed and why they were not adopted.
|
||||
|
||||
## Decision
|
||||
|
||||
### Execution engine
|
||||
|
||||
**Choice**: `ansible-core` (pip-installed, pinned version) + explicit `requirements.yml`
|
||||
|
||||
|
|
@ -12,7 +26,7 @@ that isn't needed in a maintained monorepo.
|
|||
|
||||
---
|
||||
|
||||
## Python environment
|
||||
### Python environment
|
||||
|
||||
**Choice**: `python3-venv` (system Python on Debian 13) + pinned `requirements.txt`
|
||||
|
||||
|
|
@ -24,7 +38,7 @@ reproducible, and has no extra dependencies.
|
|||
|
||||
---
|
||||
|
||||
## Secrets
|
||||
### Secrets
|
||||
|
||||
**Choice**: Ansible Vault (file-based, built-in)
|
||||
|
||||
|
|
@ -40,7 +54,7 @@ CLAUDE.md → Secrets).
|
|||
|
||||
---
|
||||
|
||||
## Testing
|
||||
### Testing
|
||||
|
||||
**Choice**: Molecule with Docker driver (`molecule-plugins[docker]`)
|
||||
|
||||
|
|
@ -59,7 +73,7 @@ are needed.
|
|||
|
||||
---
|
||||
|
||||
## Linting
|
||||
### Linting
|
||||
|
||||
**Choice**: `ansible-lint` + `yamllint` + `pre-commit`
|
||||
|
||||
|
|
@ -71,7 +85,7 @@ Config files: `.ansible-lint`, `.yamllint` in repo root.
|
|||
|
||||
---
|
||||
|
||||
## CI/CD
|
||||
### CI/CD
|
||||
|
||||
**Choice**: Forgejo Actions (self-hosted at forgejo.nyumbani.baobab.band) + `act_runner`
|
||||
|
||||
|
|
@ -87,7 +101,7 @@ a dedicated runner VM later if CI load warrants a separate host.
|
|||
|
||||
---
|
||||
|
||||
## Developer ergonomics
|
||||
### Developer ergonomics
|
||||
|
||||
**Choice**: `Makefile` as the single interface for all operations
|
||||
|
||||
|
|
@ -102,7 +116,7 @@ The venv is activated in the user's shell profile.
|
|||
|
||||
---
|
||||
|
||||
## Collections and roles policy
|
||||
### Collections and roles policy
|
||||
|
||||
**No Galaxy roles.** All roles are written and maintained locally in `roles/`.
|
||||
Galaxy roles introduce external state, versioning surprises, and implicit
|
||||
|
|
@ -136,3 +150,24 @@ are removed. Each entry in `requirements.yml` must justify its presence.
|
|||
| NixOS targets | Poor Ansible fit; all hosts standardised on Debian 13 |
|
||||
|
||||
Terraform is **adopted** for VM provisioning only (no DNS) — see `docs/decisions/006-terraform.md`.
|
||||
|
||||
## Consequences
|
||||
|
||||
Drawn from the rationale and trade-offs this ADR already states:
|
||||
|
||||
- Pinning `ansible-core` + an explicit `requirements.yml` and a plain pinned venv keeps
|
||||
the control-node environment small and fully reproducible, at the cost of maintaining
|
||||
the pins (per Execution engine / Python environment).
|
||||
- Ansible Vault's whole-file encryption makes diffs unreadable regardless of layout, so
|
||||
secrets are organised for human lookup (`vault.<service>.<key>`) rather than diff
|
||||
ergonomics — the trade accepted against SOPS/age (per Secrets).
|
||||
- The `Makefile` is the single interface: Claude Code and CI invoke the same targets, so
|
||||
local and CI behaviour can't drift and collaborators need not know raw flags (per
|
||||
Developer ergonomics).
|
||||
- Collections are added only on demand, so `requirements.yml` stays minimal; this defers
|
||||
`community.crypto` (use `openssl` CLI until a role needs certs) and `community.general`
|
||||
(add only the specific sub-module needed) until a real need appears (per Collections
|
||||
and roles policy).
|
||||
- The heavier orchestration tools were declined for this scale, each with a named
|
||||
revisit trigger — e.g. Semaphore if non-SSH operators must trigger runs, AWX-adjacent
|
||||
tooling only if AWX/AAP is ever adopted (per "What was explicitly ruled out").
|
||||
|
|
|
|||
|
|
@ -1,5 +1,9 @@
|
|||
# ADR-004 — Docker and Compose service model
|
||||
|
||||
## Status
|
||||
|
||||
Accepted (2026-05-30)
|
||||
|
||||
## Context
|
||||
|
||||
All services run as Docker containers managed via Docker Compose. This document
|
||||
|
|
@ -107,3 +111,22 @@ Docker Compose was chosen over Kubernetes/Swarm because:
|
|||
- Compose files are human-readable and easily auditable
|
||||
- No distributed state to manage
|
||||
- Straightforward to back up and restore
|
||||
|
||||
## Consequences
|
||||
|
||||
Drawn from the trade-offs and deferred items this ADR already states:
|
||||
|
||||
- A shared `compose_service` engine role is intentionally not built: the ~5 standard
|
||||
tasks are duplicated per role in favour of legible, self-contained roles, with a stated
|
||||
revisit trigger — extract a shared engine if maintaining the duplicated mechanics
|
||||
becomes painful (a pattern change touching many roles, or drift this standard alone
|
||||
isn't preventing) (per "Why not a shared engine").
|
||||
- Forgoing Kubernetes/Swarm is the deliberate cost of matching complexity to a 2–5 host
|
||||
fleet with no distributed state to manage (per Decision).
|
||||
- User-namespace remapping is not enabled by default — evaluated per use case (per Docker
|
||||
daemon configuration).
|
||||
- Bare `latest` is acceptable only on the stateless tier; the stateful tier is always
|
||||
pinned `tag@digest`, and image updates are a deliberate operation (per Image management;
|
||||
ADR-011).
|
||||
- Backup strategy is stated as defined separately, not in scope of this ADR (per Persistent
|
||||
data).
|
||||
|
|
|
|||
|
|
@ -1,5 +1,9 @@
|
|||
# ADR-005 — Host bootstrapping
|
||||
|
||||
## Status
|
||||
|
||||
Accepted (2026-05-30)
|
||||
|
||||
## Context
|
||||
|
||||
This document defines the **cloud-init template** that managed VMs are cloned
|
||||
|
|
@ -81,3 +85,19 @@ Cloud-init with Proxmox templates provides:
|
|||
- No manual installer interaction
|
||||
- A clean handoff point to Ansible
|
||||
- Easy rebuilds — destroy VM, clone template, run Ansible
|
||||
|
||||
## Consequences
|
||||
|
||||
Drawn from the trade-offs and special cases this ADR already states:
|
||||
|
||||
- The cloud-init image was chosen over a manual Debian installer (slow, error-prone,
|
||||
not reproducible) and over preseed/netboot (powerful but complex to maintain) (per
|
||||
Approach).
|
||||
- Template creation is a one-time manual procedure per Proxmox cluster, and the template
|
||||
is never booted directly (per Template creation).
|
||||
- There is no manual `qm clone` path for managed hosts; the full create → inventory →
|
||||
configure pipeline and the Terraform↔Ansible contract live in ADR-009 (per VM
|
||||
provisioning / Ansible handoff).
|
||||
- The control node is the sole documented exception — `ubongo`, a physical machine
|
||||
installed by hand because it cannot be created by the Terraform it hosts (chicken-and-egg);
|
||||
its hardware target and recovery model live in ADR-015 (per Control node bootstrapping).
|
||||
|
|
|
|||
|
|
@ -1,5 +1,9 @@
|
|||
# ADR-006 — Terraform for infrastructure provisioning
|
||||
|
||||
## Status
|
||||
|
||||
Accepted (2026-05-30)
|
||||
|
||||
## Context
|
||||
|
||||
Ansible manages host configuration well but has no state model for infrastructure
|
||||
|
|
@ -13,7 +17,9 @@ exact boundary, handoff pipeline, and data contract between them live in **ADR-0
|
|||
|
||||
---
|
||||
|
||||
## Responsibility split
|
||||
## Decision
|
||||
|
||||
### Responsibility split
|
||||
|
||||
The canonical responsibility-split table lives in **ADR-009**. In short: Terraform
|
||||
owns VM existence only; Ansible owns everything inside a VM, including all internal
|
||||
|
|
@ -26,7 +32,7 @@ cadence, making them a poor fit for Terraform state.
|
|||
|
||||
---
|
||||
|
||||
## Providers
|
||||
### Providers
|
||||
|
||||
**`bpg/proxmox` (`~> 0.70`)**: Chosen over `telmate/proxmox` for active maintenance,
|
||||
full Proxmox 8 API support, and better cloud-init integration. This is the only
|
||||
|
|
@ -42,7 +48,7 @@ Terraform manages its own provider dependencies via `required_providers` and
|
|||
|
||||
---
|
||||
|
||||
## State backend
|
||||
### State backend
|
||||
|
||||
**Choice**: Local state on the control node.
|
||||
|
||||
|
|
@ -59,7 +65,7 @@ integration boundary.
|
|||
|
||||
---
|
||||
|
||||
## Structure
|
||||
### Structure
|
||||
|
||||
```
|
||||
terraform/
|
||||
|
|
@ -83,7 +89,7 @@ Each environment directory contains:
|
|||
|
||||
---
|
||||
|
||||
## Secrets handling
|
||||
### Secrets handling
|
||||
|
||||
The only secret input (the Proxmox API token) is passed via a `TF_VAR_*`
|
||||
environment variable and declared `sensitive = true` in `variables.tf`. It never
|
||||
|
|
@ -92,7 +98,7 @@ appears in `.tfvars` files. Non-secret configuration lives in tracked
|
|||
|
||||
---
|
||||
|
||||
## Ansible integration
|
||||
### Ansible integration
|
||||
|
||||
After `terraform apply`, run `make tf-inventory TF_ENV=<env>` to regenerate
|
||||
`inventories/<env>/hosts.yml` from the `vms` output. The full handoff pipeline,
|
||||
|
|
@ -102,7 +108,7 @@ handoff)**.
|
|||
|
||||
---
|
||||
|
||||
## What was ruled out
|
||||
### What was ruled out
|
||||
|
||||
| Option | Reason |
|
||||
|---|---|
|
||||
|
|
@ -110,3 +116,24 @@ handoff)**.
|
|||
| OPNsense Terraform provider | Community-maintained; provider rot risk across OPNsense releases |
|
||||
| Terraform workspaces | Single state file with workspace prefix; accidental cross-env apply possible |
|
||||
| Separate Terraform repo | Cross-referencing between infra and config adds friction; monorepo keeps the full picture together |
|
||||
|
||||
## Consequences
|
||||
|
||||
Drawn from the "What was ruled out" section and the decisions stated above:
|
||||
|
||||
- `bpg/proxmox` is the only provider; `telmate/proxmox` was ruled out for weaker
|
||||
maintenance and Proxmox 8 / cloud-init support (Providers; What was ruled out).
|
||||
- OPNsense stays entirely in Ansible — no Terraform OPNsense provider — to avoid
|
||||
community-provider rot across OPNsense releases (Responsibility split; What was
|
||||
ruled out).
|
||||
- Terraform writes no DNS records; Ansible's `dns` role owns the entire internal
|
||||
zone, avoiding the bootstrap cycle and split DNS ownership the earlier
|
||||
`hashicorp/dns` design created (Providers).
|
||||
- State is local on the control node because Forgejo offers no usable HTTP state
|
||||
backend; this is sufficient at solo-operator scale (no concurrent applies, no
|
||||
remote locking), with a real backend such as MinIO/S3 to be added later if
|
||||
warranted (State backend).
|
||||
- Separate environment directories are used instead of Terraform workspaces to
|
||||
remove the risk of applying the wrong state (Structure; What was ruled out).
|
||||
- Terraform and Ansible internals are kept in one monorepo rather than a separate
|
||||
Terraform repo to avoid cross-referencing friction (What was ruled out).
|
||||
|
|
|
|||
|
|
@ -1,5 +1,9 @@
|
|||
# ADR-007 — Network topology and addressing
|
||||
|
||||
## Status
|
||||
|
||||
Accepted (2026-05-30)
|
||||
|
||||
## Context
|
||||
|
||||
The boma homelab is a Proxmox cluster on a dedicated private network behind an
|
||||
|
|
@ -10,7 +14,9 @@ and OPNsense configuration.
|
|||
|
||||
---
|
||||
|
||||
## Physical topology
|
||||
## Decision
|
||||
|
||||
### Physical topology
|
||||
|
||||
```
|
||||
ISP
|
||||
|
|
@ -38,7 +44,7 @@ ISP
|
|||
|
||||
---
|
||||
|
||||
## VLAN design
|
||||
### VLAN design
|
||||
|
||||
| VLAN | Name | Subnet | Purpose |
|
||||
|---|---|---|---|
|
||||
|
|
@ -51,9 +57,9 @@ ISP
|
|||
|
||||
---
|
||||
|
||||
## IP addressing
|
||||
### IP addressing
|
||||
|
||||
### VLAN 10 — mgmt (10.10.0.0/24) — no DHCP
|
||||
#### VLAN 10 — mgmt (10.10.0.0/24) — no DHCP
|
||||
|
||||
| Address | Host |
|
||||
|---|---|
|
||||
|
|
@ -63,7 +69,7 @@ ISP
|
|||
| `10.10.0.201` | `pve1` |
|
||||
| `10.10.0.202` | `pve2` |
|
||||
|
||||
### VLAN 20 — srv (10.20.0.0/24) — no DHCP, all static
|
||||
#### VLAN 20 — srv (10.20.0.0/24) — no DHCP, all static
|
||||
|
||||
| Range | Purpose |
|
||||
|---|---|
|
||||
|
|
@ -81,28 +87,28 @@ Assigned infrastructure addresses:
|
|||
| `10.20.0.12` | `proxy` | Reverse proxy |
|
||||
| `10.20.0.13` | `homeassistant` | Home Assistant (IoT controller) |
|
||||
|
||||
### VLAN 30 — lan (10.30.0.0/24)
|
||||
#### VLAN 30 — lan (10.30.0.0/24)
|
||||
|
||||
| Range | Purpose |
|
||||
|---|---|
|
||||
| `10.30.0.1` | OPNsense gateway |
|
||||
| `10.30.0.100`–`.249` | DHCP pool |
|
||||
|
||||
### VLAN 40 — iot (10.40.0.0/24)
|
||||
#### VLAN 40 — iot (10.40.0.0/24)
|
||||
|
||||
| Range | Purpose |
|
||||
|---|---|
|
||||
| `10.40.0.1` | OPNsense gateway |
|
||||
| `10.40.0.100`–`.249` | DHCP pool |
|
||||
|
||||
### VLAN 50 — guest (10.50.0.0/24)
|
||||
#### VLAN 50 — guest (10.50.0.0/24)
|
||||
|
||||
| Range | Purpose |
|
||||
|---|---|
|
||||
| `10.50.0.1` | OPNsense gateway |
|
||||
| `10.50.0.100`–`.249` | DHCP pool |
|
||||
|
||||
### VLAN 99 — vpn — retired
|
||||
#### VLAN 99 — vpn — retired
|
||||
|
||||
The OPNsense WireGuard VPN (`10.99.0.0/24`) is **replaced by the NetBird mesh**
|
||||
(ADR-016). Remote access for `ubongo`, `askari`, and road-warrior clients rides a
|
||||
|
|
@ -111,7 +117,7 @@ NetBird self-hosted on `askari`. NetBird manages its own overlay addressing
|
|||
(default `100.64.0.0/10`); no boma VLAN/subnet is allocated for it, and
|
||||
`10.99.0.0/24` is freed.
|
||||
|
||||
### Corosync ring (172.16.0.0/24) — not on managed switch
|
||||
#### Corosync ring (172.16.0.0/24) — not on managed switch
|
||||
|
||||
| Address | Host |
|
||||
|---|---|
|
||||
|
|
@ -121,7 +127,7 @@ NetBird self-hosted on `askari`. NetBird manages its own overlay addressing
|
|||
|
||||
---
|
||||
|
||||
## OPNsense firewall rules (intent)
|
||||
### OPNsense firewall rules (intent)
|
||||
|
||||
| Source | Destination | Policy |
|
||||
|---|---|---|
|
||||
|
|
@ -142,7 +148,7 @@ IoT devices cannot initiate connections to `srv`.
|
|||
|
||||
---
|
||||
|
||||
## Naming scheme
|
||||
### Naming scheme
|
||||
|
||||
| Layer | Convention | Examples |
|
||||
|---|---|---|
|
||||
|
|
@ -155,7 +161,7 @@ IoT devices cannot initiate connections to `srv`.
|
|||
|
||||
---
|
||||
|
||||
## DNS zones and split-horizon
|
||||
### DNS zones and split-horizon
|
||||
|
||||
**Internal zone**: `boma.baobab.band` — served by `dns1` and `dns2`.
|
||||
The zone is rendered by the Ansible `dns` role: host A records come from the
|
||||
|
|
@ -175,7 +181,7 @@ All other queries go upstream (e.g., `1.1.1.1`, `9.9.9.9`).
|
|||
|
||||
---
|
||||
|
||||
## External monitoring — askari
|
||||
### External monitoring — askari
|
||||
|
||||
`askari` (Hetzner VPS) is a peer on the **NetBird mesh** (ADR-016) and also **hosts
|
||||
the self-hosted NetBird coordinator** (management/signal/relay). It reaches `srv`
|
||||
|
|
@ -186,3 +192,24 @@ ACLs — no OPNsense WireGuard tunnel and no `10.99.0.0/24` routing.
|
|||
be reachable even when the homelab is down (its entire purpose), which is also why
|
||||
the mesh coordinator lives here: an off-site control plane survives a homelab outage.
|
||||
FQDN: `askari.baobab.band`.
|
||||
|
||||
---
|
||||
|
||||
## Consequences
|
||||
|
||||
Drawn from the implications already stated above:
|
||||
|
||||
- VLAN 99 (`vpn`, `10.99.0.0/24`) is retired and the subnet freed; remote access is
|
||||
carried by the self-hosted NetBird mesh instead of an OPNsense WireGuard subnet
|
||||
(VLAN design; IP addressing — VLAN 99 retired).
|
||||
- Mesh-peer firewall allowances (to `srv` metrics ports and `mgmt`) are enforced by
|
||||
NetBird ACLs, not OPNsense rules (OPNsense firewall rules (intent)).
|
||||
- IoT devices cannot initiate connections to `srv`; only Home Assistant at
|
||||
`10.20.0.13` may reach the IoT VLAN, with OPNsense Avahi bridging `srv` ↔ `iot`
|
||||
for discovery (OPNsense firewall rules (intent)).
|
||||
- Terraform writes no DNS records; the Ansible `dns` role renders the internal zone
|
||||
from inventory plus `group_vars`, with `dns1`/`dns2` serving split-horizon answers
|
||||
(DNS zones and split-horizon).
|
||||
- `askari` runs independently of the cluster so it survives a homelab outage, which
|
||||
is why the off-site NetBird control plane lives there (External monitoring —
|
||||
askari).
|
||||
|
|
|
|||
|
|
@ -3,6 +3,10 @@
|
|||
> Practical point-of-use pitfalls (nft render checks, Molecule `community.docker`,
|
||||
> apply-path coverage blind spots) live in `docs/testing/gotchas.md`.
|
||||
|
||||
## Status
|
||||
|
||||
Accepted (2026-05-30)
|
||||
|
||||
## Context
|
||||
|
||||
Ansible roles must be idempotent and correct before they touch production hosts.
|
||||
|
|
@ -11,9 +15,11 @@ This document records the testing strategy, what each level covers, and — crit
|
|||
|
||||
---
|
||||
|
||||
## Three testing levels
|
||||
## Decision
|
||||
|
||||
### Level 1 — Molecule (per role, always required)
|
||||
### Three testing levels
|
||||
|
||||
#### Level 1 — Molecule (per role, always required)
|
||||
|
||||
Runs in Docker on the control node (`ubongo`) or in CI. Fast (~5 min per role).
|
||||
|
||||
|
|
@ -41,7 +47,7 @@ The idempotency step is non-negotiable. Every role must pass it cleanly.
|
|||
that: svc.stdout == "active"
|
||||
```
|
||||
|
||||
### Level 2 — Staging playbook (full stack, real VMs)
|
||||
#### Level 2 — Staging playbook (full stack, real VMs)
|
||||
|
||||
`make check PLAYBOOK=site` followed by `make deploy PLAYBOOK=site` on
|
||||
Terraform-provisioned staging VMs. Catches inter-role dependencies and ordering
|
||||
|
|
@ -50,13 +56,13 @@ have already run and configured the firewall).
|
|||
|
||||
Run before every merge to `main`.
|
||||
|
||||
### Level 3 — External smoke test from askari
|
||||
#### Level 3 — External smoke test from askari
|
||||
|
||||
Once `askari` is operational: scripted checks from outside the network confirming
|
||||
that public-facing services respond correctly. Catches firewall and reverse proxy
|
||||
configuration issues invisible to Ansible check mode.
|
||||
|
||||
### Level 4 — Service-UI acceptance (Claude-driven exploratory)
|
||||
#### Level 4 — Service-UI acceptance (Claude-driven exploratory)
|
||||
|
||||
A Claude-driven exploratory check of a service's **application UI**, run as
|
||||
`/verify-service <name>` on `ubongo` (ADR-017). Claude drives Chromium via the
|
||||
|
|
@ -78,7 +84,7 @@ deploy (STATUS.md). Full design: ADR-017.
|
|||
|
||||
---
|
||||
|
||||
## Molecule test image
|
||||
### Molecule test image
|
||||
|
||||
**No external images.** The project builds and hosts its own test image.
|
||||
|
||||
|
|
@ -103,7 +109,7 @@ functionally equivalent and fully owned.
|
|||
|
||||
---
|
||||
|
||||
## Idempotency requirements
|
||||
### Idempotency requirements
|
||||
|
||||
Every role task must satisfy one of these:
|
||||
|
||||
|
|
@ -121,9 +127,9 @@ catches anything lint misses.
|
|||
|
||||
---
|
||||
|
||||
## What Molecule tests — and what it does not
|
||||
### What Molecule tests — and what it does not
|
||||
|
||||
### Tested in Molecule
|
||||
#### Tested in Molecule
|
||||
|
||||
| Capability | Notes |
|
||||
|---|---|
|
||||
|
|
@ -139,7 +145,7 @@ catches anything lint misses.
|
|||
| auditd installation and configuration | Install and config file |
|
||||
| Idempotency of all of the above | Enforced by Molecule's idempotency step |
|
||||
|
||||
### Not tested in Molecule — explicit exceptions
|
||||
#### Not tested in Molecule — explicit exceptions
|
||||
|
||||
The following require a real kernel or real hardware and are validated only at
|
||||
Level 2 (staging) or Level 3 (external). This is a conscious, documented decision
|
||||
|
|
@ -161,7 +167,7 @@ Behavioural correctness is confirmed on staging.
|
|||
|
||||
---
|
||||
|
||||
## CI pipeline
|
||||
### CI pipeline
|
||||
|
||||
```
|
||||
push to main
|
||||
|
|
@ -178,3 +184,27 @@ promote to production
|
|||
|
||||
Manual gates are intentional. Automated tests prove correctness in isolation;
|
||||
a human confirms the change is safe to promote.
|
||||
|
||||
---
|
||||
|
||||
## Consequences
|
||||
|
||||
Drawn from the limitations and trade-offs already stated above:
|
||||
|
||||
- The Molecule idempotency step is non-negotiable; every role must pass it cleanly
|
||||
(Three testing levels — Level 1).
|
||||
- A class of capabilities (nftables rule loading, NetBird mesh data plane,
|
||||
unattended-upgrades behaviour, OPNsense DHCP, Avahi mDNS reflection, hardware
|
||||
passthrough, corosync cluster formation) cannot be verified in Molecule and is
|
||||
validated only at Level 2 (staging) or Level 3 (external) — a conscious,
|
||||
documented decision, not a gap (What Molecule tests — and what it does not).
|
||||
- The project builds and hosts its own `molecule-debian13` image rather than relying
|
||||
on an external Docker Hub image (e.g. geerlingguy), accepting the maintenance of a
|
||||
custom image to avoid drift, disappearance, or unexpected changes outside project
|
||||
control (Molecule test image).
|
||||
- Level 4 service-UI acceptance is authorable now but its execution is deferred,
|
||||
pending `ubongo`, the `playwright` plugin, Authentik, and a staging deploy (Three
|
||||
testing levels — Level 4).
|
||||
- Promotion to staging and to production stays behind intentional manual approval
|
||||
gates; automation proves isolated correctness, a human confirms promotion safety
|
||||
(CI pipeline).
|
||||
|
|
|
|||
|
|
@ -1,5 +1,9 @@
|
|||
# ADR-009 — Terraform ↔ Ansible provisioning handoff
|
||||
|
||||
## Status
|
||||
|
||||
Accepted (2026-05-30)
|
||||
|
||||
## Context
|
||||
|
||||
Two tools touch every managed host. Terraform owns **what exists** — VMs on
|
||||
|
|
@ -14,7 +18,9 @@ the cloud-init template that VMs are cloned from. This ADR covers how they conne
|
|||
|
||||
---
|
||||
|
||||
## The boundary
|
||||
## Decision
|
||||
|
||||
### The boundary
|
||||
|
||||
| Layer | Tool | Notes |
|
||||
|---|---|---|
|
||||
|
|
@ -31,7 +37,7 @@ below).
|
|||
|
||||
---
|
||||
|
||||
## The handoff pipeline
|
||||
### The handoff pipeline
|
||||
|
||||
There is one path by which a managed host comes into existence and reaches its
|
||||
configured state:
|
||||
|
|
@ -55,7 +61,7 @@ this pipeline — **never** by hand-editing the inventory.
|
|||
|
||||
---
|
||||
|
||||
## The data contract
|
||||
### The data contract
|
||||
|
||||
The seam's interface is a single Terraform output consumed by a single script.
|
||||
|
||||
|
|
@ -88,7 +94,7 @@ Terraform, and the inventory is regenerated, never edited.
|
|||
|
||||
---
|
||||
|
||||
## Cloud-init's role
|
||||
### Cloud-init's role
|
||||
|
||||
Cloud-init is the thin first-boot layer between Terraform and Ansible:
|
||||
|
||||
|
|
@ -103,7 +109,7 @@ The line is sharp: cloud-init buys *reachability*, Ansible owns *configuration*.
|
|||
|
||||
---
|
||||
|
||||
## Internal DNS — owned by Ansible, no chicken-and-egg
|
||||
### Internal DNS — owned by Ansible, no chicken-and-egg
|
||||
|
||||
Terraform writes **no** DNS records. The internal zone (`boma.baobab.band`) is
|
||||
rendered entirely by the Ansible `dns` role:
|
||||
|
|
@ -129,7 +135,7 @@ convention only — it no longer implies any difference in how records are writt
|
|||
|
||||
---
|
||||
|
||||
## The control-node exception
|
||||
### The control-node exception
|
||||
|
||||
The control node — the host that runs Terraform and Ansible — is `ubongo`, a
|
||||
dedicated **physical** machine outside the cluster. It is not a VM at all, so
|
||||
|
|
@ -146,7 +152,7 @@ Every other host is Terraform-managed.
|
|||
|
||||
---
|
||||
|
||||
## What was ruled out
|
||||
### What was ruled out
|
||||
|
||||
| Option | Reason |
|
||||
|---|---|
|
||||
|
|
@ -154,3 +160,25 @@ Every other host is Terraform-managed.
|
|||
| Hand-editing the generated inventory | `hosts.yml` is a build artifact of `tf_to_inventory.py`; edits are overwritten on the next `make tf-inventory`. Edit `local.vms` instead. |
|
||||
| Documenting the seam in both ADR-005 and ADR-006 | The boundary belongs in exactly one place. Those ADRs link here. |
|
||||
| Terraform-managed DNS records (`hashicorp/dns` + RFC 2136) | Created a bootstrap cycle (the first DNS server can't register itself) and split DNS ownership across two tools. Ansible owns the whole internal zone instead — one owner, no cycle. |
|
||||
|
||||
## Consequences
|
||||
|
||||
Drawn from the boundary, the data contract, and the "What was ruled out" section above:
|
||||
|
||||
- Adding a host means editing `local.vms` and running the handoff pipeline; the
|
||||
generated `hosts.yml` is a build artifact and must never be hand-edited — manual
|
||||
edits are overwritten on the next `make tf-inventory` (The handoff pipeline; The
|
||||
data contract; What was ruled out).
|
||||
- Manual `qm clone` is rejected as a general provisioning path so the inventory and
|
||||
real infrastructure cannot drift; Terraform is the single way VMs come into
|
||||
existence (What was ruled out).
|
||||
- Terraform writes no DNS records: the Ansible `dns` role renders the whole internal
|
||||
zone from inventory plus `group_vars`, dissolving the bootstrap cycle a
|
||||
Terraform-managed zone (`hashicorp/dns` + RFC 2136) would create (Internal DNS —
|
||||
owned by Ansible, no chicken-and-egg; What was ruled out).
|
||||
- The control node (`ubongo`) is the single documented exception to "Terraform owns
|
||||
VM existence" — a physical machine provisioned manually and managed by Ansible for
|
||||
baseline config only; every other host is Terraform-managed (The control-node
|
||||
exception).
|
||||
- The seam is documented in exactly one place (this ADR); ADR-005 and ADR-006 link
|
||||
here rather than restating it (What was ruled out).
|
||||
|
|
|
|||
|
|
@ -1,5 +1,9 @@
|
|||
# ADR-010 — Forgejo integration and CI
|
||||
|
||||
## Status
|
||||
|
||||
Accepted (2026-05-30)
|
||||
|
||||
## Context
|
||||
|
||||
boma's git host, container registry, and (planned) CI all run on a self-hosted
|
||||
|
|
@ -20,7 +24,7 @@ held to the same standard as the rest of the repo's secrets.
|
|||
|
||||
---
|
||||
|
||||
## Decisions
|
||||
## Decision
|
||||
|
||||
### 1. API tokens are managed secrets, least-privilege
|
||||
|
||||
|
|
@ -75,3 +79,21 @@ later if CI load warrants a separate host. Actions is not yet enabled — see ST
|
|||
| Terraform Forgejo HTTP state backend | Forgejo's `/raw/` API is read-only; state can't be written there. Local state instead (ADR-006). |
|
||||
| Admin-scoped automation tokens | Unnecessary privilege; scope to `read:repository` + `read`/`write:package`. |
|
||||
| Ad-hoc UI/API configuration as the norm | Becomes undocumented drift; codify or document instead. |
|
||||
|
||||
---
|
||||
|
||||
## Consequences
|
||||
|
||||
- The planned CI pipeline (see "CI pipeline (planned)") is trunk-based per ADR-003 /
|
||||
ADR-008 — `push to main → lint + Molecule → deploy staging → [manual gate] → deploy
|
||||
production` — running `act_runner` on `ubongo` (or a dedicated runner VM later if CI
|
||||
load warrants); Actions is not yet enabled, so this remains future work tracked in
|
||||
STATUS.md.
|
||||
- Terraform state is not held in Forgejo: its `/raw/` API is read-only and cannot be
|
||||
written, so local state is used instead (ADR-006) (see "What was ruled out").
|
||||
- Automation tokens are scoped to `read:repository` + `read`/`write:package` rather
|
||||
than admin, accepting the limits that least-privilege imposes on what automation can
|
||||
do (see "What was ruled out").
|
||||
- Instance/repo configuration must be codified or documented rather than changed
|
||||
ad-hoc, to avoid the undocumented drift `/review-repo` exists to catch (see "What was
|
||||
ruled out").
|
||||
|
|
|
|||
|
|
@ -1,6 +1,9 @@
|
|||
# ADR-011 — Update and upgrade management
|
||||
|
||||
**Status: Proposed — draft for discussion (not yet accepted).**
|
||||
## Status
|
||||
|
||||
Proposed (2026-06-04) — draft for discussion; not yet accepted. The core decisions
|
||||
below are settled in intent, but several specifics remain open (see "Open questions").
|
||||
|
||||
## Context
|
||||
|
||||
|
|
@ -10,7 +13,7 @@ drift over time and must be kept current without breaking the homelab: the **hos
|
|||
|
||||
---
|
||||
|
||||
## Decisions
|
||||
## Decision
|
||||
|
||||
### 1. Every service is classified stateful or stateless
|
||||
|
||||
|
|
@ -132,3 +135,19 @@ alert-driven.
|
|||
| 8-weekly as the only stateful path | Too slow for urgent CVEs — hence the DIUN security fast-path. |
|
||||
|
||||
---
|
||||
|
||||
## Consequences
|
||||
|
||||
- A single uniform update policy is rejected: the stateful/stateless split is
|
||||
load-bearing, so stateless services roll on rolling tags while stateful services are
|
||||
pinned `tag@digest`, human-gated, and backup-first (see "What was ruled out").
|
||||
- The weekly run never touches stateful services and the whole fleet is never updated
|
||||
at once, accepting the added orchestration of host ordering and an 8-weekly +
|
||||
fast-path cadence in exchange for bounded blast radius (see "What was ruled out").
|
||||
- No update automation ships until the health-check verification gate is in order; the
|
||||
pipeline is deliberately sequenced behind that harness (see Decision 6).
|
||||
- Several points remain open for discussion (see "Open questions"): where the Proxmox
|
||||
snapshot is driven from across the TF/Ansible boundary; the exact cadences; where the
|
||||
health-check harness lives and the minimum bar that counts as "in order"; whether
|
||||
classification is a per-role `__stateful` flag or a group_vars list; whether the
|
||||
weekly run hits staging first; and the notification + "skip/pause" control channel.
|
||||
|
|
|
|||
|
|
@ -1,5 +1,9 @@
|
|||
# ADR-012 — Hardware reference & capacity evaluation
|
||||
|
||||
## Status
|
||||
|
||||
Accepted (2026-06-01)
|
||||
|
||||
## Context
|
||||
|
||||
The repo modelled the logical/network layer (Terraform VM specs, ADR-007
|
||||
|
|
|
|||
|
|
@ -1,5 +1,9 @@
|
|||
# ADR-013 — Heritage: learning from AnsibleBaobabV4 without inheriting it
|
||||
|
||||
## Status
|
||||
|
||||
Accepted (2026-06-04)
|
||||
|
||||
## Context
|
||||
|
||||
boma is the methodology successor to AnsibleBaobabV4 (and V3 before it) — not a new
|
||||
|
|
@ -10,7 +14,9 @@ structure and assumptions creep back in under the guise of "inspiration." This A
|
|||
sets the policy for drawing on V4 without inheriting it. (Resolves the questions
|
||||
previously parked in TODO 3.3 and 10.1.)
|
||||
|
||||
## Principle — translate, don't transplant
|
||||
## Decision
|
||||
|
||||
### Principle — translate, don't transplant
|
||||
|
||||
V4 is **evidence, never authority.** It can show what was needed or what went wrong;
|
||||
it can never be the reason boma does something a certain way.
|
||||
|
|
@ -21,7 +27,7 @@ it can never be the reason boma does something a certain way.
|
|||
- **Acceptance test** for anything V4-derived: *can it be justified purely from
|
||||
boma's principles, with zero reference to V4?* If not, it does not land.
|
||||
|
||||
## What V4 is — and is not — a source of
|
||||
### What V4 is — and is not — a source of
|
||||
|
||||
| Legitimate source of | Never a source of |
|
||||
|---|---|
|
||||
|
|
@ -33,7 +39,7 @@ it can never be the reason boma does something a certain way.
|
|||
Only concrete, verifiable, low-level knowledge crosses over — precisely because it is
|
||||
safe to re-derive, whereas structure and requirements drag assumptions along.
|
||||
|
||||
## Provenance — transient only
|
||||
### Provenance — transient only
|
||||
|
||||
When a boma decision was prompted by a V4 lesson, or a config adapted from V4, the
|
||||
lineage is recorded only in **transient** places: the commit message, the working
|
||||
|
|
@ -42,7 +48,7 @@ extraction warrants one. **Durable artifacts (ADRs, role READMEs, `SECURITY.md`)
|
|||
stand on boma's own terms with no V4 reference.** Honest about lineage in history;
|
||||
clean in the living repo.
|
||||
|
||||
## AI consultation guardrails
|
||||
### AI consultation guardrails
|
||||
|
||||
The AI is the main consumer of V4 — it is on disk and readable. When consulting it:
|
||||
|
||||
|
|
|
|||
|
|
@ -1,5 +1,9 @@
|
|||
# ADR-014 — Sourcing technical knowledge (docs and best practices)
|
||||
|
||||
## Status
|
||||
|
||||
Accepted (2026-06-04)
|
||||
|
||||
## Context
|
||||
|
||||
Most work in boma is done by AI agents drawing on training memory, which is stale
|
||||
|
|
@ -100,5 +104,27 @@ above keeps the policy working.
|
|||
- Commit to the principle, not a tool — degrade to `WebFetch`/`WebSearch` when plugins
|
||||
are absent.
|
||||
|
||||
See also: ADR-013 (heritage / translate-don't-transplant), ADR-011 (version pinning),
|
||||
ADR-008 (testing/verification).
|
||||
## Consequences
|
||||
|
||||
Drawn from the follow-on work and limitations this ADR already states:
|
||||
|
||||
- Verified facts carry a durable, greppable stamp; a stamp binds a fact to a pinned
|
||||
version, so a `requirements` change or image upgrade marks exactly what to re-check
|
||||
(per Capture / Re-verification).
|
||||
- Stale-stamp detection — a `/review-repo` or `/security-review` check flagging stamps
|
||||
whose recorded version no longer matches what is pinned — is a noted enhancement, not
|
||||
built yet (per Re-verification).
|
||||
- Any version-specific claim given from memory must be marked "from memory, unverified"
|
||||
as a transparency backstop, since agent self-assessed certainty is unreliable (per
|
||||
When consulting is required).
|
||||
- The policy commits to the principle rather than a specific plugin, so it degrades to
|
||||
`WebFetch`/`WebSearch` on a bare install; reproducing the plugin toolchain from the
|
||||
repo is done via `.claude/settings.json` and `docs/runbooks/claude-code-setup.md`,
|
||||
with the graceful-degradation fallback covering a fresh clone until bootstrap runs
|
||||
(per Source hierarchy / Reproducibility of the toolchain).
|
||||
|
||||
## Related
|
||||
|
||||
- ADR-013 — heritage / translate-don't-transplant.
|
||||
- ADR-011 — version pinning.
|
||||
- ADR-008 — testing / verification.
|
||||
|
|
|
|||
|
|
@ -1,5 +1,9 @@
|
|||
# ADR-015 — Control / development / AI-worker host (`ubongo`)
|
||||
|
||||
## Status
|
||||
|
||||
Accepted (2026-06-05)
|
||||
|
||||
## Context
|
||||
|
||||
Earlier ADRs framed the control node — the host that runs Terraform and Ansible —
|
||||
|
|
|
|||
|
|
@ -90,7 +90,7 @@ allocated for it.
|
|||
|
||||
## Status
|
||||
|
||||
Designed, not built — depends on the unbuilt `base` role and service-role machinery
|
||||
Accepted (2026-06-05). Designed, not built — depends on the unbuilt `base` role and service-role machinery
|
||||
(STATUS.md). This ADR records the decision and doc reconciliation; role tasks land when
|
||||
`base` exists.
|
||||
|
||||
|
|
@ -108,3 +108,22 @@ Designed, not built — depends on the unbuilt `base` role and service-role mach
|
|||
See also: ADR-007 (network — amended), ADR-015 (control host), ADR-002 (security),
|
||||
ADR-011 (version pinning), ADR-004 (one service = one role), ADR-009 (TF↔Ansible
|
||||
handoff), ADR-013 (heritage — V4 ran WireGuard; NetBird is translated, not transplanted).
|
||||
|
||||
## Consequences
|
||||
|
||||
- A new public surface appears on `askari` — management API + dashboard (80/443) +
|
||||
Coturn (3478) — mitigated by TLS, embedded-IdP login, source-IP limits where
|
||||
practical, `base` hardening and version-pinned NetBird, and recorded as accepted-risk
|
||||
R3 (Security).
|
||||
- On-LAN SSH never depends on the mesh: `base` allows inbound SSH from `ubongo`'s LAN
|
||||
address as a mesh-independent secondary path, so a mesh/coordinator outage never
|
||||
blocks on-LAN SSH and Ansible stays off the mesh (Security; Recovery & operations).
|
||||
- The mesh survives a homelab outage because the coordinator is off-site on `askari`,
|
||||
with its management datastore backed up encrypted off `askari` and peers keeping
|
||||
last-known config through a brief coordinator outage (Recovery & operations).
|
||||
- Choosing NetBird over plain OPNsense WireGuard, Tailscale, Tailscale+Headscale, an
|
||||
on-cluster coordinator, a `ubongo` subnet router, and a standalone IdP gains
|
||||
identity/ACL policy, self-hosted sovereignty, no routing SPOF, and a light single
|
||||
operator footprint (What was ruled out).
|
||||
- Implementation is pending: the role tasks land only once the unbuilt `base` role and
|
||||
service-role machinery exist (Status).
|
||||
|
|
|
|||
|
|
@ -65,7 +65,7 @@ them.
|
|||
|
||||
## Status
|
||||
|
||||
Designed. **Authorable now:** this ADR, the ADR-008 Level 4 expansion, the `VERIFY.md`
|
||||
Accepted (2026-06-05). Designed. **Authorable now:** this ADR, the ADR-008 Level 4 expansion, the `VERIFY.md`
|
||||
template, the `/verify-service` skill, the convention/checklist/Further-reading edits,
|
||||
`.gitignore`/dir, STATUS/TODO. **Running is deferred** on its dependencies.
|
||||
|
||||
|
|
@ -90,3 +90,21 @@ template, the `/verify-service` skill, the convention/checklist/Further-reading
|
|||
|
||||
See also: ADR-008 (testing — expanded), ADR-015 (control host), ADR-002 (security),
|
||||
ADR-004 (`VERIFY.md` parallels `SECURITY.md`), ADR-013/014 (heritage / knowledge sourcing).
|
||||
|
||||
## Consequences
|
||||
|
||||
- The harness is confined to staging by a hard stop: it refuses to run against
|
||||
production because exploratory clicking is destructive, the blast radius is bounded to
|
||||
the target service, and test users live only in the staging `test` group (Safety).
|
||||
- No secrets leak: the git-ignored screenshot dir is the safety boundary and credential
|
||||
screens are avoided (Safety; Reporting & manual handoff).
|
||||
- Test identities are ephemeral per-run credentials in the staging Authentik only —
|
||||
never production, none persisted in `vault.yml` — created reuse-or-create and torn
|
||||
down via staging rebuild or `test`-group cleanup (Test-user standard).
|
||||
- Anything Claude cannot exercise (physical device, paid/external flow, subjective
|
||||
judgment) is handed off via a structured manual-test checklist in the run report
|
||||
(Reporting & manual handoff).
|
||||
- Authoring is possible now (this ADR, the `VERIFY.md` template, the `/verify-service`
|
||||
skill, conventions/checklist edits), but running is deferred on its dependencies:
|
||||
`ubongo`, the `playwright` plugin, Authentik, a staging deploy, and `make new-role`
|
||||
scaffolding `VERIFY.md` (Status; Dependencies).
|
||||
|
|
|
|||
|
|
@ -72,7 +72,7 @@ tracked allocation in `docs/hardware/reference.md` (ADR-012).
|
|||
|
||||
## Status
|
||||
|
||||
Designed. **Authorable now:** this ADR + the ADR-002/CAPABILITIES/ADR-012/
|
||||
Accepted (2026-06-06). Designed. **Authorable now:** this ADR + the ADR-002/CAPABILITIES/ADR-012/
|
||||
accepted-risks/STATUS/TODO reconciliations. **Deferred on the stack:** Alloy-in-`base`,
|
||||
the `loki`/`grafana` service roles, OPNsense syslog config, the push-only credential,
|
||||
and the live pipeline.
|
||||
|
|
@ -97,3 +97,26 @@ the metrics stack (Prometheus / `node_exporter`) for SSD-wearout + log-silence a
|
|||
See also: ADR-002 (security baseline — realised here), ADR-016 (mesh / `askari`),
|
||||
ADR-007 (OPNsense / `askari`), ADR-012 (hardware/capacity), ADR-004 (service-role
|
||||
standard), ADR-011 (health checks — distinct from this).
|
||||
|
||||
## Consequences
|
||||
|
||||
- Opportunistic track-covering and host-pivot-to-store are defeated because logs leave
|
||||
the host in near-real-time and the off-cluster security trail is append-only, so it
|
||||
survives full-cluster compromise (Security, integrity & residual risks).
|
||||
- Conscious residuals remain: append-only is not cryptographic WORM (root-on-`askari`
|
||||
could edit chunks — R4); there is a few-seconds un-shipped window; agent compromise
|
||||
can stop future shipping but not alter shipped history; a stolen push credential
|
||||
appends noise but cannot delete; and an `askari` outage buffers then flushes on
|
||||
reconnect (Security, integrity & residual risks).
|
||||
- A host going silent is itself an alert (Security, integrity & residual risks).
|
||||
- Only a bounded security subset ships off-site — `auditd`, `authpriv`, `fail2ban`,
|
||||
AIDE, Suricata and key container security events tagged `security="true"` — while the
|
||||
cluster Loki holds everything, keeping off-site volume small (Data flow & the security
|
||||
subset).
|
||||
- Disk-wear is a managed parameter: log storage on NVMe/SSD or HDD never SD/USB flash,
|
||||
bounded verbosity at source, tuned Loki retention/compaction, and monitored SSD
|
||||
wearout/TBW with an alert; log storage is a tracked allocation in
|
||||
`docs/hardware/reference.md` (Retention & disk-wear).
|
||||
- The decision is authorable now but the live pipeline is deferred on the stack:
|
||||
Alloy-in-`base`, the `loki`/`grafana` service roles, OPNsense syslog config, and the
|
||||
push-only credential (Status; Dependencies).
|
||||
|
|
|
|||
106
docs/decisions/023-adr-structure.md
Normal file
106
docs/decisions/023-adr-structure.md
Normal file
|
|
@ -0,0 +1,106 @@
|
|||
# ADR-023 — ADR structure & lifecycle
|
||||
|
||||
## Status
|
||||
|
||||
Accepted (2026-06-10). Meta/doctrine ADR — pins how ADRs are written; the
|
||||
`adr-structure` check (`scripts/repo-scan.py`) and `docs/decisions/adr-template.md`
|
||||
ship with it, and ADRs 001–018 were retroactively restructured to conform. Resolves
|
||||
the FRICTION signal (2026-05-31) about ADR-writing policy being unsettled.
|
||||
|
||||
## Context
|
||||
|
||||
boma records architectural decisions as numbered ADRs in `docs/decisions/`, and
|
||||
CLAUDE.md treats them as load-bearing. Yet no ADR said how an ADR is written. The
|
||||
newest ADRs (019–022) converged on a clean shape — Status → Context → Decision →
|
||||
Consequences → Related — but only by imitation. ADRs 001–018 predate it and drifted
|
||||
widely: most lacked a `## Status` section entirely (016–018 carried only a trailing
|
||||
build-state note), and many lacked an explicit `## Decision` or `## Consequences`
|
||||
heading, their decisions spread across ad-hoc topical sections. The result was
|
||||
structural drift and no uniform way to tell an active decision from a superseded or
|
||||
deprecated one.
|
||||
|
||||
## Decision
|
||||
|
||||
### 1. Title & filename
|
||||
|
||||
Title line: `# ADR-NNN — <Title>: <optional clarifying subtitle>` (em-dash). Filename:
|
||||
`NNN-kebab-title.md`, zero-padded 3-digit, monotonic, never reused — a superseded ADR
|
||||
keeps its number and file. A new ADR is registered as a row in the CLAUDE.md
|
||||
"Further reading" table.
|
||||
|
||||
### 2. Mandatory sections, in this order
|
||||
|
||||
- `## Status` — a lifecycle line, usually `Accepted (YYYY-MM-DD)` (see §4), plus an
|
||||
optional one-line note.
|
||||
- `## Context` — the forces, the problem, what exists today, why now.
|
||||
- `## Decision` — what we are doing; numbered sub-decisions for multi-part ADRs.
|
||||
- `## Consequences` — results, trade-offs explicitly accepted, follow-on work.
|
||||
|
||||
### 3. Optional sections (use only where they genuinely apply)
|
||||
|
||||
`## Related`, `## Scope`, `## Guardrails` / `## Enforcement`, `## What was ruled out`,
|
||||
`## Verified facts (ADR-014)`.
|
||||
|
||||
### 4. Status lifecycle
|
||||
|
||||
Four states. Because boma is single-contributor and trunk-based with no review gate,
|
||||
most ADRs are **born `Accepted (YYYY-MM-DD)`** — committed-to on writing. A
|
||||
**`Proposed`** state exists for a genuine draft whose core direction is recorded but
|
||||
whose specifics are still open for discussion (e.g. ADR-011); it is promoted to
|
||||
`Accepted` once settled.
|
||||
|
||||
- **`Proposed (YYYY-MM-DD)`** — drafted, under discussion, not yet committed-to. May
|
||||
carry open questions. Promoted to `Accepted (YYYY-MM-DD)` when decided.
|
||||
- **`Accepted (YYYY-MM-DD)`** — committed-to. The common starting state.
|
||||
- Replaced → old ADR's Status becomes **`Superseded by ADR-NNN (YYYY-MM-DD)`**; the new
|
||||
ADR records `Supersedes ADR-MMM` in its Status and `## Related`. The link is
|
||||
**bidirectional**.
|
||||
- Retired with no replacement → **`Deprecated (YYYY-MM-DD)`** + a one-line reason.
|
||||
|
||||
**No silent rewrites.** An Accepted ADR is not edited to reverse its decision. Typo and
|
||||
clarity fixes are fine; a material reversal requires a new ADR and a `Superseded by`
|
||||
marker on the old one.
|
||||
|
||||
### 5. Template & enforcement
|
||||
|
||||
`docs/decisions/adr-template.md` is the scaffold for new ADRs. The `/review-repo`
|
||||
command's pre-scan (`scripts/repo-scan.py`) emits an `adr-structure` finding for any
|
||||
numbered ADR missing a mandatory section or with an unparseable Status line. It checks
|
||||
**presence and Status, not section order** — order is a convention the template carries,
|
||||
deliberately not gated, to keep enforcement lightweight (consistent with boma's other
|
||||
doctrine ADRs adding no CI gate).
|
||||
|
||||
### 6. Retroactive conformance of the back-catalogue
|
||||
|
||||
ADRs 001–018 are restructured to satisfy this standard rather than grandfathered. The
|
||||
restructure is **presentational** — existing headings are relabelled, regrouped, or
|
||||
demoted under a `## Decision` umbrella; a dated `## Status` is added; a `## Consequences`
|
||||
section is assembled from implications the ADR already states. **The substance of no
|
||||
decision is changed.** This keeps the check uniform (no number threshold) and the corpus
|
||||
a consistent, legible decision history.
|
||||
|
||||
## Consequences
|
||||
|
||||
- New ADRs have one obvious shape and a scaffold; structural drift stops.
|
||||
- Every ADR declares its lifecycle state uniformly, and reversals are traceable.
|
||||
- The whole corpus conforms; the check needs no grandfathering and stays simple.
|
||||
- One-time restructure churn across ADRs 001–018 (heading reorganization + a Status and
|
||||
a Consequences section per file; no decision substance changed).
|
||||
- `/review-repo` grows one deterministic check; no new CI machinery.
|
||||
- This ADR is the first conformant example and is held to its own check.
|
||||
|
||||
## What was ruled out
|
||||
|
||||
- **A `make lint` / CI gate for ADR structure** — heavier than the risk warrants;
|
||||
the `/review-repo` check and the template suffice.
|
||||
- **Machine-enforcing section order** — brittle for marginal value; left as a
|
||||
template-demonstrated convention.
|
||||
- **Grandfathering 001–018 from the check** — rejected in favour of restructuring the
|
||||
whole corpus to conform, so the standard applies uniformly with no exceptions.
|
||||
|
||||
## Related
|
||||
|
||||
- ADR-014 — knowledge sourcing (the `Verified facts` optional section).
|
||||
- ADR-019/020/021/022 — the emergent structure this ADR codifies.
|
||||
- `docs/decisions/adr-template.md` — the scaffold.
|
||||
- `scripts/repo-scan.py` — the `adr-structure` enforcement check.
|
||||
40
docs/decisions/adr-template.md
Normal file
40
docs/decisions/adr-template.md
Normal file
|
|
@ -0,0 +1,40 @@
|
|||
# ADR-NNN — <Title>: <optional clarifying subtitle>
|
||||
|
||||
<!-- Filename: NNN-kebab-title.md (zero-padded, monotonic, never reused).
|
||||
Register a row in CLAUDE.md "Further reading" when this ADR is created.
|
||||
Sections below in order. Mandatory: Status, Context, Decision, Consequences.
|
||||
Delete this comment and any optional section you don't use. -->
|
||||
|
||||
## Status
|
||||
|
||||
Accepted (YYYY-MM-DD)
|
||||
<!-- Lifecycle: usually born "Accepted (YYYY-MM-DD)"; use "Proposed (YYYY-MM-DD)" for a
|
||||
genuine draft (open questions), promoted to Accepted once settled. Later:
|
||||
"Superseded by ADR-NNN (YYYY-MM-DD)" or "Deprecated (YYYY-MM-DD)" + one-line why.
|
||||
Optional trailing note OK, e.g.
|
||||
"Accepted (2026-06-10). Doctrine ADR — pins policy, builds nothing yet." -->
|
||||
|
||||
## Context
|
||||
|
||||
<!-- The forces, the problem, what exists today, why now. -->
|
||||
|
||||
## Decision
|
||||
|
||||
<!-- What we are doing. Use numbered sub-decisions (### 1. ...) for multi-part ADRs. -->
|
||||
|
||||
## Consequences
|
||||
|
||||
<!-- Results, trade-offs explicitly accepted, follow-on work. -->
|
||||
|
||||
<!-- Optional sections — uncomment any that genuinely apply; never pad:
|
||||
|
||||
## Scope — explicit in / out-of-scope boundaries.
|
||||
|
||||
## Guardrails — how the decision is mechanically enforced (lint, CI, hooks).
|
||||
|
||||
## What was ruled out — rejected alternatives, each with its reason.
|
||||
|
||||
## Verified facts (ADR-014) — verified: <subject> · <tool> <version> · <source> · <YYYY-MM-DD>
|
||||
|
||||
## Related — links to other ADRs by number; bidirectional for Supersedes/Superseded-by.
|
||||
-->
|
||||
556
docs/superpowers/plans/2026-06-10-adr-structure.md
Normal file
556
docs/superpowers/plans/2026-06-10-adr-structure.md
Normal file
|
|
@ -0,0 +1,556 @@
|
|||
# ADR Structure & Lifecycle Implementation Plan
|
||||
|
||||
> **For agentic workers:** REQUIRED SUB-SKILL: Use superpowers:subagent-driven-development (recommended) or superpowers:executing-plans to implement this plan task-by-task. Steps use checkbox (`- [ ]`) syntax for tracking.
|
||||
|
||||
**Goal:** Codify how boma's ADRs are structured — a canonical section set, an Accepted/Superseded/Deprecated lifecycle, a template, a lightweight enforcement check, and a one-time Status backfill of the back-catalogue.
|
||||
|
||||
**Architecture:** Five independent units. (1) A pure-function `adr-structure` check added to the existing `scripts/repo-scan.py` (stdlib only, pytest-tested like its siblings), verifying every numbered ADR has the four mandatory sections and a parseable Status line — presence only, not order. (2) An `adr-template.md` scaffold. (3) ADR-023 itself, written to pass its own check. (4) Wiring into CLAUDE.md and the `/review-repo` command doc. (5) A mechanical backfill adding `## Status` to ADRs 001–018, dated from each file's first git-commit.
|
||||
|
||||
**Tech Stack:** Python 3 stdlib (`scripts/repo-scan.py`), pytest (`.venv/bin/pytest`), Markdown, git.
|
||||
|
||||
**Spec:** `docs/superpowers/specs/2026-06-10-adr-structure-design.md`
|
||||
|
||||
**Branch:** `feat/adr-structure` (already created; the design spec is the first commit).
|
||||
|
||||
**Convention reminders (from CLAUDE.md):** docs-/script-only commits skip the ansible-lint pre-commit hook and need no `rbw` unlock. Imperative subject ≤72 chars. `Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>` trailer on every commit.
|
||||
|
||||
---
|
||||
|
||||
## Decisions locked by the spec (do not re-litigate)
|
||||
|
||||
- **Mandatory sections, in this order:** `## Status`, `## Context`, `## Decision`, `## Consequences`.
|
||||
- **Optional sections:** `## Related`, `## Scope`, `## Guardrails` / `## Enforcement`, `## What was ruled out`, `## Verified facts (ADR-014)`.
|
||||
- **Status lifecycle (4 states):** `Proposed (YYYY-MM-DD)` (genuine drafts, e.g. ADR-011) → `Accepted (YYYY-MM-DD)` (the common starting state) → optionally `Superseded by ADR-NNN (YYYY-MM-DD)` or `Deprecated (YYYY-MM-DD)`. (`Proposed` was added on the evidence of ADR-011, which is a real draft with open questions.)
|
||||
- **No silent rewrites:** material reversal = new ADR + `Superseded by` marker; bidirectional link.
|
||||
- **Enforcement checks presence + parseable Status line, NOT section order.** Order is demonstrated by the template, not machine-enforced.
|
||||
- **Back-catalogue is fully restructured (no grandfathering)** — ADRs 001–018 are brought to all-four-section conformance. The restructure is **presentational**: relabel/regroup/demote existing headings, add a dated Status, assemble a Consequences section from implications the ADR already states. **The substance of no decision is changed.** If a faithful Consequences cannot be drawn from existing content, escalate that file rather than inventing one.
|
||||
|
||||
---
|
||||
|
||||
## Task 1: `adr-structure` check in repo-scan.py
|
||||
|
||||
**Files:**
|
||||
- Modify: `scripts/repo-scan.py` (add module-level regexes near the other `_RE` definitions ~line 38–44; add `adr_structure_findings()` next to `deferred_findings()` ~line 96; wire it into `scan()` at the `findings.extend(...)` site ~line 215)
|
||||
- Test: `tests/test_repo_scan.py` (new)
|
||||
|
||||
- [ ] **Step 1: Write the failing test**
|
||||
|
||||
Create `tests/test_repo_scan.py`:
|
||||
|
||||
```python
|
||||
import importlib.util
|
||||
import pathlib
|
||||
|
||||
_PATH = pathlib.Path(__file__).resolve().parent.parent / "scripts" / "repo-scan.py"
|
||||
_spec = importlib.util.spec_from_file_location("repo_scan", _PATH)
|
||||
rs = importlib.util.module_from_spec(_spec)
|
||||
_spec.loader.exec_module(rs)
|
||||
|
||||
GOOD = [
|
||||
"# ADR-099 — Example\n", "\n",
|
||||
"## Status\n", "\n", "Accepted (2026-06-10)\n", "\n",
|
||||
"## Context\n", "\n", "Why.\n", "\n",
|
||||
"## Decision\n", "\n", "What.\n", "\n",
|
||||
"## Consequences\n", "\n", "So what.\n",
|
||||
]
|
||||
|
||||
|
||||
def _checks(findings):
|
||||
return [f for f in findings if f["check"] == "adr-structure"]
|
||||
|
||||
|
||||
def test_good_adr_has_no_findings():
|
||||
out = rs.adr_structure_findings({"docs/decisions/099-example.md": GOOD})
|
||||
assert _checks(out) == []
|
||||
|
||||
|
||||
def test_missing_mandatory_section_is_flagged():
|
||||
lines = [ln for ln in GOOD if not ln.startswith("## Consequences")]
|
||||
out = _checks(rs.adr_structure_findings({"docs/decisions/099-example.md": lines}))
|
||||
assert len(out) == 1
|
||||
assert "Consequences" in out[0]["detail"]
|
||||
|
||||
|
||||
def test_unparseable_status_is_flagged():
|
||||
lines = [("Designed, not built.\n" if ln == "Accepted (2026-06-10)\n" else ln)
|
||||
for ln in GOOD]
|
||||
out = _checks(rs.adr_structure_findings({"docs/decisions/099-example.md": lines}))
|
||||
assert len(out) == 1
|
||||
assert "Status not parseable" in out[0]["detail"]
|
||||
|
||||
|
||||
def test_superseded_status_is_accepted():
|
||||
lines = [("Superseded by ADR-100 (2026-06-11)\n" if ln == "Accepted (2026-06-10)\n"
|
||||
else ln) for ln in GOOD]
|
||||
out = _checks(rs.adr_structure_findings({"docs/decisions/099-example.md": lines}))
|
||||
assert out == []
|
||||
|
||||
|
||||
def test_non_numbered_file_is_skipped():
|
||||
bare = ["# ADR template\n", "\n", "## Status\n", "\n", "<!-- hint -->\n"]
|
||||
out = _checks(rs.adr_structure_findings({"docs/decisions/adr-template.md": bare}))
|
||||
assert out == []
|
||||
```
|
||||
|
||||
- [ ] **Step 2: Run the test to verify it fails**
|
||||
|
||||
Run: `.venv/bin/pytest tests/test_repo_scan.py -q`
|
||||
Expected: FAIL — `AttributeError: module 'repo_scan' has no attribute 'adr_structure_findings'`.
|
||||
|
||||
- [ ] **Step 3: Add the regexes**
|
||||
|
||||
In `scripts/repo-scan.py`, after the `RESOLVE_WORD_RE = ...` line (~line 44), add:
|
||||
|
||||
```python
|
||||
# ADR-structure check (ADR-023): numbered ADRs must carry the four mandatory
|
||||
# sections and a parseable Status line. Presence only — section ORDER is a
|
||||
# template-demonstrated convention, not machine-enforced.
|
||||
ADR_FILE_RE = re.compile(r"^\d{3}-.*\.md$")
|
||||
ADR_REQUIRED_SECTIONS = ("Status", "Context", "Decision", "Consequences")
|
||||
ADR_STATUS_LINE_RE = re.compile(
|
||||
r"^(Accepted \(\d{4}-\d{2}-\d{2}\)"
|
||||
r"|Superseded by ADR-\d{3}"
|
||||
r"|Deprecated \(\d{4}-\d{2}-\d{2}\))")
|
||||
```
|
||||
|
||||
- [ ] **Step 4: Add the check function**
|
||||
|
||||
In `scripts/repo-scan.py`, immediately after the `deferred_findings(...)` function (it ends ~line 96, just before `def walk_files():`), add:
|
||||
|
||||
```python
|
||||
def adr_structure_findings(adr_files):
|
||||
"""adr_files: {rel_path: [lines]} for docs/decisions/*.md.
|
||||
Flags numbered ADRs (NNN-*.md) missing a mandatory section or whose Status
|
||||
section has no parseable lifecycle line. Non-numbered files (e.g.
|
||||
adr-template.md) are skipped. Section order is NOT checked (ADR-023)."""
|
||||
out = []
|
||||
for rpath, lines in sorted(adr_files.items()):
|
||||
if not ADR_FILE_RE.match(os.path.basename(rpath)):
|
||||
continue
|
||||
headings = {}
|
||||
for i, line in enumerate(lines):
|
||||
m = re.match(r"^##\s+(\w+)", line)
|
||||
if m:
|
||||
headings.setdefault(m.group(1), i)
|
||||
missing = [s for s in ADR_REQUIRED_SECTIONS if s not in headings]
|
||||
if missing:
|
||||
out.append({"check": "adr-structure", "severity": "medium",
|
||||
"path": rpath, "line": 1,
|
||||
"detail": f"missing mandatory section(s): {', '.join(missing)}"})
|
||||
if "Status" in headings:
|
||||
body = []
|
||||
for line in lines[headings["Status"] + 1:]:
|
||||
if line.startswith("## "):
|
||||
break
|
||||
body.append(line)
|
||||
status_text = next((ln.strip() for ln in body if ln.strip()), "")
|
||||
if not ADR_STATUS_LINE_RE.match(status_text):
|
||||
out.append({"check": "adr-structure", "severity": "medium",
|
||||
"path": rpath, "line": headings["Status"] + 1,
|
||||
"detail": "Status not parseable (want 'Accepted (YYYY-MM-DD)', "
|
||||
"'Superseded by ADR-NNN', or 'Deprecated (YYYY-MM-DD)'); "
|
||||
f"got: {status_text[:60]!r}"})
|
||||
return out
|
||||
```
|
||||
|
||||
- [ ] **Step 5: Run the test to verify it passes**
|
||||
|
||||
Run: `.venv/bin/pytest tests/test_repo_scan.py -q`
|
||||
Expected: PASS — 5 passed.
|
||||
|
||||
- [ ] **Step 6: Wire the check into `scan()`**
|
||||
|
||||
In `scripts/repo-scan.py`, find (~line 215):
|
||||
|
||||
```python
|
||||
findings.extend(deferred_findings(adr_files, defer_refs))
|
||||
return findings
|
||||
```
|
||||
|
||||
Replace with:
|
||||
|
||||
```python
|
||||
findings.extend(deferred_findings(adr_files, defer_refs))
|
||||
findings.extend(adr_structure_findings(adr_files))
|
||||
return findings
|
||||
```
|
||||
|
||||
- [ ] **Step 7: Confirm the check fires on the real (not-yet-backfilled) repo**
|
||||
|
||||
Run: `python3 scripts/repo-scan.py 2>/dev/null | python3 -c "import json,sys; print(sorted({f['path'] for f in json.load(sys.stdin)['findings'] if f['check']=='adr-structure'}))"`
|
||||
Expected: a list including `docs/decisions/001-architecture.md` … through `018-logging.md` (001–015 missing Status; 016–018 unparseable Status). 019–022 and 023 must NOT appear. This proves the check works and previews Task 5's worklist.
|
||||
|
||||
- [ ] **Step 8: Commit**
|
||||
|
||||
```bash
|
||||
git add scripts/repo-scan.py tests/test_repo_scan.py
|
||||
git commit -m "feat(review): add adr-structure check to repo-scan
|
||||
|
||||
Flags numbered ADRs missing a mandatory section (Status/Context/Decision/
|
||||
Consequences) or with an unparseable Status line. Presence only, not order.
|
||||
|
||||
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>"
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Task 2: ADR template
|
||||
|
||||
**Files:**
|
||||
- Create: `docs/decisions/adr-template.md`
|
||||
|
||||
- [ ] **Step 1: Write the template**
|
||||
|
||||
Create `docs/decisions/adr-template.md` with exactly:
|
||||
|
||||
```markdown
|
||||
# ADR-NNN — <Title>: <optional clarifying subtitle>
|
||||
|
||||
<!-- Filename: NNN-kebab-title.md (zero-padded, monotonic, never reused).
|
||||
Register a row in CLAUDE.md "Further reading" when this ADR is created.
|
||||
Sections below in order. Mandatory: Status, Context, Decision, Consequences.
|
||||
Delete this comment and any optional section you don't use. -->
|
||||
|
||||
## Status
|
||||
|
||||
Accepted (YYYY-MM-DD)
|
||||
<!-- Lifecycle: "Accepted (YYYY-MM-DD)" → later "Superseded by ADR-NNN (YYYY-MM-DD)"
|
||||
or "Deprecated (YYYY-MM-DD)" + one-line why. Optional trailing note OK, e.g.
|
||||
"Accepted (2026-06-10). Doctrine ADR — pins policy, builds nothing yet." -->
|
||||
|
||||
## Context
|
||||
|
||||
<!-- The forces, the problem, what exists today, why now. -->
|
||||
|
||||
## Decision
|
||||
|
||||
<!-- What we are doing. Use numbered sub-decisions (### 1. ...) for multi-part ADRs. -->
|
||||
|
||||
## Consequences
|
||||
|
||||
<!-- Results, trade-offs explicitly accepted, follow-on work. -->
|
||||
|
||||
<!-- Optional sections — uncomment any that genuinely apply; never pad:
|
||||
|
||||
## Scope — explicit in / out-of-scope boundaries.
|
||||
|
||||
## Guardrails — how the decision is mechanically enforced (lint, CI, hooks).
|
||||
|
||||
## What was ruled out — rejected alternatives, each with its reason.
|
||||
|
||||
## Verified facts (ADR-014) — verified: <subject> · <tool> <version> · <source> · <YYYY-MM-DD>
|
||||
|
||||
## Related — links to other ADRs by number; bidirectional for Supersedes/Superseded-by.
|
||||
-->
|
||||
```
|
||||
|
||||
(HTML comments do not nest — optional sections use one flat comment block with inline
|
||||
em-dash descriptions, not commented sub-hints inside an outer comment.)
|
||||
|
||||
- [ ] **Step 2: Confirm the template is skipped by the check**
|
||||
|
||||
Run: `python3 scripts/repo-scan.py 2>/dev/null | python3 -c "import json,sys; print([f for f in json.load(sys.stdin)['findings'] if f['check']=='adr-structure' and 'adr-template' in f['path']])"`
|
||||
Expected: `[]` (non-numbered filename → skipped).
|
||||
|
||||
- [ ] **Step 3: Commit**
|
||||
|
||||
```bash
|
||||
git add docs/decisions/adr-template.md
|
||||
git commit -m "docs(adr): add adr-template.md scaffold (ADR-023)
|
||||
|
||||
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>"
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Task 3: ADR-023 itself
|
||||
|
||||
**Files:**
|
||||
- Create: `docs/decisions/023-adr-structure.md`
|
||||
|
||||
- [ ] **Step 1: Write ADR-023**
|
||||
|
||||
Create `docs/decisions/023-adr-structure.md`. It must pass its own check (Status/Context/Decision/Consequences present; parseable Status line). Use this content:
|
||||
|
||||
```markdown
|
||||
# ADR-023 — ADR structure & lifecycle
|
||||
|
||||
## Status
|
||||
|
||||
Accepted (2026-06-10). Meta/doctrine ADR — pins how ADRs are written; the
|
||||
`adr-structure` check (`scripts/repo-scan.py`) and `docs/decisions/adr-template.md`
|
||||
ship with it, and ADRs 001–018 were retroactively restructured to conform. Resolves
|
||||
the FRICTION signal (2026-05-31) about ADR-writing policy being unsettled.
|
||||
|
||||
## Context
|
||||
|
||||
boma records architectural decisions as numbered ADRs in `docs/decisions/`, and
|
||||
CLAUDE.md treats them as load-bearing. Yet no ADR said how an ADR is written. The
|
||||
newest ADRs (019–022) converged on a clean shape — Status → Context → Decision →
|
||||
Consequences → Related — but only by imitation. ADRs 001–018 predate it and drifted
|
||||
widely: most lacked a `## Status` section entirely (016–018 carried only a trailing
|
||||
build-state note), and many lacked an explicit `## Decision` or `## Consequences`
|
||||
heading, their decisions spread across ad-hoc topical sections. The result was
|
||||
structural drift and no uniform way to tell an active decision from a superseded or
|
||||
deprecated one.
|
||||
|
||||
## Decision
|
||||
|
||||
### 1. Title & filename
|
||||
|
||||
Title line: `# ADR-NNN — <Title>: <optional clarifying subtitle>` (em-dash). Filename:
|
||||
`NNN-kebab-title.md`, zero-padded 3-digit, monotonic, never reused — a superseded ADR
|
||||
keeps its number and file. A new ADR is registered as a row in the CLAUDE.md
|
||||
"Further reading" table.
|
||||
|
||||
### 2. Mandatory sections, in this order
|
||||
|
||||
- `## Status` — a lifecycle line, usually `Accepted (YYYY-MM-DD)` (see §4), plus an
|
||||
optional one-line note.
|
||||
- `## Context` — the forces, the problem, what exists today, why now.
|
||||
- `## Decision` — what we are doing; numbered sub-decisions for multi-part ADRs.
|
||||
- `## Consequences` — results, trade-offs explicitly accepted, follow-on work.
|
||||
|
||||
### 3. Optional sections (use only where they genuinely apply)
|
||||
|
||||
`## Related`, `## Scope`, `## Guardrails` / `## Enforcement`, `## What was ruled out`,
|
||||
`## Verified facts (ADR-014)`.
|
||||
|
||||
### 4. Status lifecycle
|
||||
|
||||
Four states. Because boma is single-contributor and trunk-based with no review gate,
|
||||
most ADRs are **born `Accepted (YYYY-MM-DD)`** — committed-to on writing. A
|
||||
**`Proposed`** state exists for a genuine draft whose core direction is recorded but
|
||||
whose specifics are still open for discussion (e.g. ADR-011); it is promoted to
|
||||
`Accepted` once settled.
|
||||
|
||||
- **`Proposed (YYYY-MM-DD)`** — drafted, under discussion, not yet committed-to. May
|
||||
carry open questions. Promoted to `Accepted (YYYY-MM-DD)` when decided.
|
||||
- **`Accepted (YYYY-MM-DD)`** — committed-to. The common starting state.
|
||||
- Replaced → old ADR's Status becomes **`Superseded by ADR-NNN (YYYY-MM-DD)`**; the new
|
||||
ADR records `Supersedes ADR-MMM` in its Status and `## Related`. The link is
|
||||
**bidirectional**.
|
||||
- Retired with no replacement → **`Deprecated (YYYY-MM-DD)`** + a one-line reason.
|
||||
|
||||
**No silent rewrites.** An Accepted ADR is not edited to reverse its decision. Typo and
|
||||
clarity fixes are fine; a material reversal requires a new ADR and a `Superseded by`
|
||||
marker on the old one.
|
||||
|
||||
### 5. Template & enforcement
|
||||
|
||||
`docs/decisions/adr-template.md` is the scaffold for new ADRs. The `/review-repo`
|
||||
command's pre-scan (`scripts/repo-scan.py`) emits an `adr-structure` finding for any
|
||||
numbered ADR missing a mandatory section or with an unparseable Status line. It checks
|
||||
**presence and Status, not section order** — order is a convention the template carries,
|
||||
deliberately not gated, to keep enforcement lightweight (consistent with boma's other
|
||||
doctrine ADRs adding no CI gate).
|
||||
|
||||
### 6. Retroactive conformance of the back-catalogue
|
||||
|
||||
ADRs 001–018 are restructured to satisfy this standard rather than grandfathered. The
|
||||
restructure is **presentational** — existing headings are relabelled, regrouped, or
|
||||
demoted under a `## Decision` umbrella; a dated `## Status` is added; a `## Consequences`
|
||||
section is assembled from implications the ADR already states. **The substance of no
|
||||
decision is changed.** This keeps the check uniform (no number threshold) and the corpus
|
||||
a consistent, legible decision history.
|
||||
|
||||
## Consequences
|
||||
|
||||
- New ADRs have one obvious shape and a scaffold; structural drift stops.
|
||||
- Every ADR declares its lifecycle state uniformly, and reversals are traceable.
|
||||
- The whole corpus conforms; the check needs no grandfathering and stays simple.
|
||||
- One-time restructure churn across ADRs 001–018 (heading reorganization + a Status and
|
||||
a Consequences section per file; no decision substance changed).
|
||||
- `/review-repo` grows one deterministic check; no new CI machinery.
|
||||
- This ADR is the first conformant example and is held to its own check.
|
||||
|
||||
## What was ruled out
|
||||
|
||||
- **A `make lint` / CI gate for ADR structure** — heavier than the risk warrants;
|
||||
the `/review-repo` check and the template suffice.
|
||||
- **Machine-enforcing section order** — brittle for marginal value; left as a
|
||||
template-demonstrated convention.
|
||||
- **Grandfathering 001–018 from the check** — rejected in favour of restructuring the
|
||||
whole corpus to conform, so the standard applies uniformly with no exceptions.
|
||||
|
||||
## Related
|
||||
|
||||
- ADR-014 — knowledge sourcing (the `Verified facts` optional section).
|
||||
- ADR-019/020/021/022 — the emergent structure this ADR codifies.
|
||||
- `docs/decisions/adr-template.md` — the scaffold.
|
||||
- `scripts/repo-scan.py` — the `adr-structure` enforcement check.
|
||||
```
|
||||
|
||||
- [ ] **Step 2: Confirm ADR-023 passes its own check**
|
||||
|
||||
Run: `python3 scripts/repo-scan.py 2>/dev/null | python3 -c "import json,sys; print([f for f in json.load(sys.stdin)['findings'] if f['check']=='adr-structure' and '023-' in f['path']])"`
|
||||
Expected: `[]`.
|
||||
|
||||
- [ ] **Step 3: Commit**
|
||||
|
||||
```bash
|
||||
git add docs/decisions/023-adr-structure.md
|
||||
git commit -m "docs(adr): ADR-023 — ADR structure & lifecycle
|
||||
|
||||
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>"
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Task 4: Wire into CLAUDE.md and the review-repo command doc
|
||||
|
||||
**Files:**
|
||||
- Modify: `CLAUDE.md` ("Further reading" table)
|
||||
- Modify: `.claude/commands/review-repo.md` (the deterministic-findings description, ~line 26–28)
|
||||
|
||||
- [ ] **Step 1: Add the CLAUDE.md "Further reading" row**
|
||||
|
||||
In `CLAUDE.md`, in the "Further reading" table, after the `Backup & disaster recovery` row, add:
|
||||
|
||||
```markdown
|
||||
| ADR structure & lifecycle | `docs/decisions/023-adr-structure.md` |
|
||||
```
|
||||
|
||||
- [ ] **Step 2: Mention the new check in review-repo.md**
|
||||
|
||||
In `.claude/commands/review-repo.md`, find (~line 27–28):
|
||||
|
||||
```markdown
|
||||
(roles, ADRs, runbooks, playbooks, scripts — your shard list) and **exact findings**
|
||||
(markers, broken refs, unencrypted vaults). Fold these into the report verbatim.
|
||||
```
|
||||
|
||||
Replace the parenthetical with:
|
||||
|
||||
```markdown
|
||||
(roles, ADRs, runbooks, playbooks, scripts — your shard list) and **exact findings**
|
||||
(markers, broken refs, unencrypted vaults, ADR-structure violations). Fold these into
|
||||
the report verbatim.
|
||||
```
|
||||
|
||||
- [ ] **Step 3: Verify the CLAUDE.md link resolves**
|
||||
|
||||
Run: `test -f docs/decisions/023-adr-structure.md && echo OK`
|
||||
Expected: `OK`.
|
||||
|
||||
- [ ] **Step 4: Commit**
|
||||
|
||||
```bash
|
||||
git add CLAUDE.md .claude/commands/review-repo.md
|
||||
git commit -m "docs(adr): register ADR-023 and note adr-structure check
|
||||
|
||||
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>"
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Task 5: Retroactively restructure ADRs 001–018 to full conformance
|
||||
|
||||
**Goal:** every ADR in 001–018 ends with all four mandatory sections present and a
|
||||
parseable Status line, so the `adr-structure` check reports zero findings — **without
|
||||
changing the substance of any decision.**
|
||||
|
||||
**Files (current findings — the exact worklist):**
|
||||
- Missing `Status` + `Consequences`: `001-architecture.md`, `002-security.md`, `004-docker-model.md`, `005-bootstrapping.md`, `014-knowledge-sourcing.md`
|
||||
- Missing `Status` + `Decision` + `Consequences`: `006-terraform.md`, `007-network.md`, `008-testing.md`, `009-provisioning-handoff.md`, `010-forgejo-ci.md`, `011-update-management.md`
|
||||
- Missing all four: `003-toolchain.md`
|
||||
- Missing `Status` + `Decision`: `013-heritage-v4.md`
|
||||
- Missing `Status` only: `012-hardware-capacity.md`, `015-control-host.md`
|
||||
- Have unparseable `Status` + missing `Consequences`: `016-mesh-vpn.md`, `017-service-ui-verification.md`, `018-logging.md`
|
||||
|
||||
(`010`/`011` use `## Decisions` (plural) → relabel to `## Decision`. The "missing
|
||||
Decision" cases generally have the decision spread across topical `##` headings.)
|
||||
|
||||
**THE FAITHFULNESS RULE (non-negotiable):** This is a *presentational* restructure.
|
||||
You MAY: add a `## Status` section; relabel a heading (`## Decisions` → `## Decision`);
|
||||
introduce a `## Decision` umbrella heading and **demote** existing topical `##` headings
|
||||
to `###` beneath it; add a `## Consequences` section. You MUST NOT alter any existing
|
||||
sentence of decision prose, reword arguments, or add new policy. A `## Consequences`
|
||||
section is assembled **only** from implications the ADR already states (its trade-offs,
|
||||
"what was ruled out", "open questions", named follow-on work). **If an ADR states
|
||||
nothing that can be faithfully cast as a consequence, STOP and report it as
|
||||
DONE_WITH_CONCERNS / escalate — do not invent consequences.**
|
||||
|
||||
**Per-file date source:** the file's first git-commit (add) date —
|
||||
`git log --diff-filter=A --format=%as -- <path> | tail -1` (yields `YYYY-MM-DD`).
|
||||
|
||||
- [ ] **Step 1: Add a dated `## Status` section to each ADR**
|
||||
|
||||
For 001–015 (no Status today): insert, between the title line and the first `##`
|
||||
heading, a Status section:
|
||||
|
||||
```markdown
|
||||
## Status
|
||||
|
||||
Accepted (<d>)
|
||||
```
|
||||
|
||||
where `<d>` is the file's first-git-commit date. For 016/017/018 (unparseable Status
|
||||
today): prepend a parseable `Accepted (<d>). ` clause to the first line of their
|
||||
existing `## Status` section so the build-state note becomes its tail, e.g.
|
||||
`Accepted (2026-06-05). Designed. **Authorable now:** ...`.
|
||||
|
||||
- [ ] **Step 2: Ensure a `## Decision` section exists**
|
||||
|
||||
For ADRs flagged "missing Decision" (003, 006, 007, 008, 009, 010, 011, 013): relabel a
|
||||
plural/synonym heading where one exists (`## Decisions` → `## Decision` in 010/011), or
|
||||
introduce a `## Decision` umbrella immediately after `## Context` and demote the existing
|
||||
topical `##` body headings (e.g. in 003: "Execution engine", "Python environment", …) to
|
||||
`###`. Do not move or rewrite the prose under them.
|
||||
|
||||
- [ ] **Step 3: Ensure a `## Consequences` section exists**
|
||||
|
||||
For every ADR flagged "missing Consequences" (001, 002, 003, 004, 005, 006, 007, 008,
|
||||
009, 010, 011, 014, 016, 017, 018): add a `## Consequences` section near the end,
|
||||
assembled strictly from implications the ADR already states. Where an ADR has a trailing
|
||||
section that *is* consequences under another name (e.g. "What was ruled out", "Open
|
||||
questions", "Trade-offs"), you may keep that section and add a short `## Consequences`
|
||||
that references/summarizes the already-stated trade-offs — without introducing new
|
||||
claims. **Honour the faithfulness rule; escalate any ADR where no faithful Consequences
|
||||
can be drawn.**
|
||||
|
||||
- [ ] **Step 4: Verify the whole corpus passes the check**
|
||||
|
||||
Run: `python3 scripts/repo-scan.py 2>/dev/null | python3 -c "import json,sys; v=[f for f in json.load(sys.stdin)['findings'] if f['check']=='adr-structure']; print('adr-structure findings:', len(v)); [print(' ', f['path'], '—', f['detail']) for f in v]"`
|
||||
Expected: `adr-structure findings: 0`.
|
||||
|
||||
- [ ] **Step 5: Verify faithfulness via diff**
|
||||
|
||||
Run: `git diff --stat` and spot-check `git diff docs/decisions/003-toolchain.md`.
|
||||
Expected: changes are heading additions/relabels/level-demotions, a new Status section,
|
||||
and a new Consequences section — **no edits to existing decision sentences.**
|
||||
|
||||
- [ ] **Step 6: Run the repo-scan test suite**
|
||||
|
||||
Run: `.venv/bin/pytest tests/test_repo_scan.py -q`
|
||||
Expected: PASS — 5 passed.
|
||||
|
||||
- [ ] **Step 7: Commit**
|
||||
|
||||
```bash
|
||||
git add docs/decisions/0*.md docs/decisions/1*.md
|
||||
git commit -m "docs(adr): restructure ADRs 001-018 to ADR-023 conformance
|
||||
|
||||
Presentational only: add a dated Status section, relabel/regroup headings
|
||||
under Decision, and add a Consequences section assembled from each ADR's
|
||||
already-stated implications. No decision substance changed.
|
||||
|
||||
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>"
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Final verification (after all tasks)
|
||||
|
||||
- [ ] **Lint:** `make lint` — Expected: passes (docs + a stdlib script touched; ansible content unchanged).
|
||||
- [ ] **Full deterministic scan clean for our check:** `python3 scripts/repo-scan.py 2>/dev/null | python3 -c "import json,sys; print('adr-structure:', sum(1 for f in json.load(sys.stdin)['findings'] if f['check']=='adr-structure'))"` → `adr-structure: 0`.
|
||||
- [ ] **Tests green:** `.venv/bin/pytest tests/ -q` → all pass.
|
||||
- [ ] **Branch ready:** invoke `superpowers:finishing-a-development-branch` to merge `feat/adr-structure` to `main` (trunk-based, no PR) and delete the branch.
|
||||
|
||||
---
|
||||
|
||||
## Self-review notes
|
||||
|
||||
- **Spec coverage:** §1 title/filename → Task 3 + template; §2 sections → Tasks 2/3 + check; §3 lifecycle → Task 3; §4 cross-refs → Task 3 `## Related`; §5 template → Task 2; §6 retroactive restructure → Task 5; §7 enforcement → Task 1 + Task 4. All covered.
|
||||
- **Order nuance:** spec says sections come "in this order"; the check enforces presence + Status only. This is intentional and stated in both the spec's enforcement wording ("the four mandatory sections and a parseable Status line") and ADR-023's Decision §5 / "What was ruled out". Not a gap.
|
||||
- **Type/name consistency:** `adr_structure_findings` and the `"adr-structure"` check key are used identically in the function, the `scan()` wiring, the tests, and both verification one-liners.
|
||||
164
docs/superpowers/specs/2026-06-10-adr-structure-design.md
Normal file
164
docs/superpowers/specs/2026-06-10-adr-structure-design.md
Normal file
|
|
@ -0,0 +1,164 @@
|
|||
# Design — ADR structure & lifecycle
|
||||
|
||||
- **Date:** 2026-06-10
|
||||
- **Status:** Approved design — implementation plan to follow
|
||||
- **Resolves:** the absence of a written standard for how ADRs in
|
||||
`docs/decisions/` are structured. The newest ADRs (019–022) have converged on a
|
||||
clean pattern (`Status` → `Context` → `Decision` → `Consequences` → `Related`),
|
||||
but it lives only as imitation; ADRs 001–018 predate it and most lack a `Status`
|
||||
section.
|
||||
- **Becomes:** ADR-023 (this design is the basis for that ADR).
|
||||
- **Reuses:** boma's existing `*-template.md` convention (`service-security-template.md`,
|
||||
`service-verify-template.md`, `service-access-template.md`, `service-backup-template.md`);
|
||||
ADR-014 (knowledge-sourcing → the optional `Verified facts` section); ADR-019/020/021/022
|
||||
(the emergent structure being codified); the `/review-repo` command (enforcement home).
|
||||
|
||||
---
|
||||
|
||||
## Problem
|
||||
|
||||
boma documents architectural decisions as numbered ADRs in `docs/decisions/`, and
|
||||
CLAUDE.md treats them as load-bearing ("Before assuming a role, provider, or pipeline
|
||||
exists, check STATUS.md"; the entire "Further reading" table points into them). Yet
|
||||
there is no ADR that says how an ADR is written. The result:
|
||||
|
||||
- **Structural drift.** ADRs 001–018 are freeform; 019–022 converged on a consistent
|
||||
shape but only by imitation. A new ADR's structure depends on which existing one the
|
||||
author happened to copy.
|
||||
- **No status discipline.** Most early ADRs have no `## Status` section, so there is no
|
||||
uniform way to tell an active decision from a superseded or deprecated one — and no
|
||||
written rule for how a decision gets reversed without silently rewriting history.
|
||||
- **No scaffold.** Every other recurring document type in boma has a template
|
||||
(`service-security-template.md`, etc.). ADRs do not.
|
||||
|
||||
This design codifies the structure 019–022 already demonstrate, pins a status
|
||||
lifecycle, ships a template, and reconciles the back-catalogue.
|
||||
|
||||
## Scope
|
||||
|
||||
- **In:** the canonical section set (mandatory + optional); title and filename
|
||||
convention; the `Accepted / Superseded / Deprecated` status lifecycle and the
|
||||
no-silent-rewrite rule; cross-reference convention; an ADR template file; a
|
||||
lightweight `/review-repo` structure check; a **one-time retroactive restructure of
|
||||
ADRs 001–018** to full conformance (all four mandatory sections + a parseable Status
|
||||
line), reorganizing existing content under canonical headings.
|
||||
- **Out (for now):** *changing the substance of* any existing decision (the restructure
|
||||
is presentational — relabel/regroup/demote existing content, add a dated Status, never
|
||||
alter what was decided); a `make lint` / CI gate for ADR structure (explicitly
|
||||
rejected in favour of the `/review-repo` check — consistent with boma's other doctrine
|
||||
ADRs, which add no CI gate); grandfathering pre-convention ADRs from the check
|
||||
(rejected — the whole corpus is brought to conformance instead).
|
||||
|
||||
The lifecycle uses four states — `Proposed / Accepted / Superseded / Deprecated`. An
|
||||
earlier draft of this design omitted `Proposed`, but ADR-011 (a real draft with open
|
||||
questions) is evidence boma occasionally needs it, so it was kept.
|
||||
|
||||
## Decision
|
||||
|
||||
### 1. Title & filename
|
||||
- Title line: `# ADR-NNN — <Title>: <optional clarifying subtitle>` (em-dash `—`,
|
||||
matching every existing ADR).
|
||||
- Filename: `NNN-kebab-title.md`, zero-padded 3-digit, monotonic, **never reused**
|
||||
(a superseded ADR keeps its number and file).
|
||||
- A new ADR is registered as a row in the CLAUDE.md "Further reading" table.
|
||||
|
||||
### 2. Canonical sections
|
||||
|
||||
**Mandatory — every ADR, in this order:**
|
||||
|
||||
| Section | Holds |
|
||||
|---|---|
|
||||
| `## Status` | `Accepted (YYYY-MM-DD)`, plus an optional one-line note (what it resolves/supersedes, or a doctrine-not-yet-built caveat as ADR-022 uses) |
|
||||
| `## Context` | the forces, the problem, what exists today, why now |
|
||||
| `## Decision` | what we are doing — numbered sub-decisions for multi-part ADRs, as 020/021/022 do |
|
||||
| `## Consequences` | results, trade-offs *explicitly accepted*, follow-on work |
|
||||
|
||||
**Optional — use only where genuinely applicable, never as padding:**
|
||||
|
||||
- `## Related` — links to other ADRs by number.
|
||||
- `## Scope` — explicit in/out-of-scope boundaries.
|
||||
- `## Guardrails` / `## Enforcement` — how the decision is mechanically enforced
|
||||
(lint, CI, hooks).
|
||||
- `## What was ruled out` — rejected alternatives, each with its reason.
|
||||
- `## Verified facts (ADR-014)` — version-stamped facts per the knowledge-sourcing rule.
|
||||
|
||||
### 3. Status lifecycle
|
||||
|
||||
Four states. Most ADRs are **born `Accepted (YYYY-MM-DD)`** — the sole author commits
|
||||
to it on writing (boma is single-contributor and trunk-based with no review gate).
|
||||
|
||||
- **`Proposed (YYYY-MM-DD)`** — a genuine draft whose core direction is recorded but
|
||||
whose specifics are still open (e.g. ADR-011, which carries open questions). Promoted
|
||||
to `Accepted (YYYY-MM-DD)` once settled.
|
||||
- **`Accepted (YYYY-MM-DD)`** — committed-to; the common starting state.
|
||||
- Replaced by a later decision → the old ADR's Status becomes
|
||||
**`Superseded by ADR-NNN (YYYY-MM-DD)`**; the superseding ADR records
|
||||
`Supersedes ADR-MMM` in its own `## Status` and `## Related`. The link is
|
||||
**bidirectional** — both files must point at each other.
|
||||
- Retired with no replacement → **`Deprecated (YYYY-MM-DD)`** plus a one-line reason.
|
||||
|
||||
**Load-bearing rule — no silent rewrites.** An `Accepted` ADR is not edited to reverse
|
||||
its decision. Typo and clarity fixes are fine; a *material reversal* requires a new ADR
|
||||
and a `Superseded by` marker on the old one. The history of decisions stays legible.
|
||||
|
||||
### 4. Cross-references
|
||||
Reference other ADRs by number inline (`ADR-019`), and collect the relationships in a
|
||||
`## Related` section.
|
||||
|
||||
### 5. Template file
|
||||
Ship `docs/decisions/adr-template.md` — consistent with boma's existing
|
||||
`*-template.md` convention. It contains the mandatory section headers pre-filled with
|
||||
short HTML-comment hints, and the optional sections listed as commented stubs to
|
||||
uncomment when relevant. It is a skeleton, not a numbered decision, so it does not take
|
||||
an ADR number.
|
||||
|
||||
### 6. Retroactive restructure (001–018)
|
||||
A **separate step** after the ADR and template land: bring every pre-convention ADR to
|
||||
full conformance — all four mandatory sections present and a parseable Status line. This
|
||||
is a **presentational** restructure, governed by a strict faithfulness rule:
|
||||
|
||||
- **Add** a `## Status` section valued `Accepted (YYYY-MM-DD)`, the date reconstructed
|
||||
from the file's **first git-commit date**. For 016–018, whose existing trailing
|
||||
build-state note is unparseable, prepend the dated `Accepted (...)` clause so the note
|
||||
becomes a parseable Status line's tail.
|
||||
- **Reorganize** existing content under the canonical headings: relabel a synonym
|
||||
(`## Decisions` → `## Decision`), or introduce a `## Decision` umbrella and **demote**
|
||||
the existing topical `##` headings to `###` beneath it. No sentence of existing prose
|
||||
is altered.
|
||||
- **Add** a `## Consequences` section built **only** from implications the ADR already
|
||||
states (trade-offs, "what was ruled out", "open questions", follow-on work already
|
||||
named). If an ADR genuinely states nothing that can be faithfully cast as a
|
||||
consequence, that file is escalated for a human decision rather than inventing one.
|
||||
- **Never** change the substance of a decision. A `git diff` of the restructure should
|
||||
show heading-level changes, a new Status section, and a Consequences section assembled
|
||||
from existing material — not edits to existing argument.
|
||||
|
||||
ADRs already conformant (019–022) are left alone. End state: the `adr-structure` check
|
||||
reports zero findings across the whole corpus, with no grandfathering.
|
||||
|
||||
### 7. Enforcement
|
||||
Lightweight, no CI gate. The `/review-repo` command gains an ADR-structure check:
|
||||
every file in `docs/decisions/` matching `NNN-*.md` has the four mandatory sections and
|
||||
a parseable `## Status` line. The template carries the convention forward for new ADRs.
|
||||
|
||||
## Consequences
|
||||
|
||||
- New ADRs have one obvious shape and a scaffold to start from; structural drift stops.
|
||||
- Every ADR declares its lifecycle state uniformly, and reversals are traceable rather
|
||||
than silent — the back-catalogue becomes a legible decision history.
|
||||
- One-time churn: a restructure touching ~18 files (heading reorganization + a Status
|
||||
section + a Consequences section per file). Larger and more judgment-heavy than a
|
||||
Status-only backfill, hence the faithfulness rule and per-file review.
|
||||
- The whole corpus conforms — the check needs no grandfathering or number threshold, and
|
||||
stays simple (presence + parseable Status, applied uniformly).
|
||||
- `/review-repo` grows a new check; no new CI machinery, matching boma's habit of not
|
||||
gating doctrine in CI.
|
||||
- This ADR is itself the first conformant example — it must follow its own structure.
|
||||
|
||||
## Open questions
|
||||
|
||||
None outstanding — title/filename, the **4-state lifecycle** (`Proposed / Accepted /
|
||||
Superseded / Deprecated`; `Proposed` adopted on the evidence of ADR-011), template name
|
||||
(`adr-template.md`), enforcement (`/review-repo`, no CI gate), and the **full
|
||||
retroactive restructure** of 001–018 (no grandfathering) were all confirmed during
|
||||
brainstorming and execution.
|
||||
|
|
@ -41,6 +41,17 @@ LIST_ITEM_RE = re.compile(r"^\s*(\d+\.|[-*+])\s+(.*)")
|
|||
DEFER_REF_RE = re.compile(r"ADR-(\d{3})\D{0,40}?deferred\D{0,12}?(\d+)", re.I)
|
||||
RESOLVE_WORD_RE = re.compile(r"\b(?:resolv\w*|decid\w*|address\w*|complet\w*|done)\b", re.I)
|
||||
|
||||
# ADR-structure check (ADR-023): numbered ADRs must carry the four mandatory
|
||||
# sections and a parseable Status line. Presence only — section ORDER is a
|
||||
# template-demonstrated convention, not machine-enforced.
|
||||
ADR_FILE_RE = re.compile(r"^\d{3}-.*\.md$")
|
||||
ADR_REQUIRED_SECTIONS = ("Status", "Context", "Decision", "Consequences")
|
||||
ADR_STATUS_LINE_RE = re.compile(
|
||||
r"^(Proposed \(\d{4}-\d{2}-\d{2}\)"
|
||||
r"|Accepted \(\d{4}-\d{2}-\d{2}\)"
|
||||
r"|Superseded by ADR-\d{3} \(\d{4}-\d{2}-\d{2}\)"
|
||||
r"|Deprecated \(\d{4}-\d{2}-\d{2}\))")
|
||||
|
||||
|
||||
def _is_defer_heading(text):
|
||||
t = text.strip().lower()
|
||||
|
|
@ -95,6 +106,42 @@ def deferred_findings(adr_files, defer_refs):
|
|||
return out
|
||||
|
||||
|
||||
def adr_structure_findings(adr_files):
|
||||
"""adr_files: {rel_path: [lines]} for docs/decisions/*.md.
|
||||
Flags numbered ADRs (NNN-*.md) missing a mandatory section or whose Status
|
||||
section has no parseable lifecycle line. Non-numbered files (e.g.
|
||||
adr-template.md) are skipped. Section order is NOT checked (ADR-023)."""
|
||||
out = []
|
||||
for rpath, lines in sorted(adr_files.items()):
|
||||
if not ADR_FILE_RE.match(os.path.basename(rpath)):
|
||||
continue
|
||||
headings = {}
|
||||
for i, line in enumerate(lines):
|
||||
m = re.match(r"^##\s+(\w+)", line)
|
||||
if m:
|
||||
headings.setdefault(m.group(1), i)
|
||||
missing = [s for s in ADR_REQUIRED_SECTIONS if s not in headings]
|
||||
if missing:
|
||||
out.append({"check": "adr-structure", "severity": "medium",
|
||||
"path": rpath, "line": 1,
|
||||
"detail": f"missing mandatory section(s): {', '.join(missing)}"})
|
||||
if "Status" in headings:
|
||||
body = []
|
||||
for line in lines[headings["Status"] + 1:]:
|
||||
if line.startswith("## "):
|
||||
break
|
||||
body.append(line)
|
||||
status_text = next((ln.strip() for ln in body if ln.strip()), "")
|
||||
if not ADR_STATUS_LINE_RE.match(status_text):
|
||||
out.append({"check": "adr-structure", "severity": "medium",
|
||||
"path": rpath, "line": headings["Status"] + 1,
|
||||
"detail": "Status not parseable (want 'Proposed (YYYY-MM-DD)', "
|
||||
"'Accepted (YYYY-MM-DD)', 'Superseded by ADR-NNN "
|
||||
"(YYYY-MM-DD)', or 'Deprecated (YYYY-MM-DD)'); "
|
||||
f"got: {status_text[:60]!r}"})
|
||||
return out
|
||||
|
||||
|
||||
def walk_files():
|
||||
for dirpath, dirnames, filenames in os.walk(ROOT):
|
||||
dirnames[:] = [d for d in dirnames if d not in PRUNE]
|
||||
|
|
@ -213,6 +260,7 @@ def scan():
|
|||
findings.append({"check": "broken-path-ref", "severity": "medium", "path": rpath,
|
||||
"line": i, "detail": f"references '{ref}' which does not exist"})
|
||||
findings.extend(deferred_findings(adr_files, defer_refs))
|
||||
findings.extend(adr_structure_findings(adr_files))
|
||||
return findings
|
||||
|
||||
|
||||
|
|
|
|||
59
tests/test_repo_scan.py
Normal file
59
tests/test_repo_scan.py
Normal file
|
|
@ -0,0 +1,59 @@
|
|||
import importlib.util
|
||||
import pathlib
|
||||
|
||||
_PATH = pathlib.Path(__file__).resolve().parent.parent / "scripts" / "repo-scan.py"
|
||||
_spec = importlib.util.spec_from_file_location("repo_scan", _PATH)
|
||||
rs = importlib.util.module_from_spec(_spec)
|
||||
_spec.loader.exec_module(rs)
|
||||
|
||||
GOOD = [
|
||||
"# ADR-099 — Example\n", "\n",
|
||||
"## Status\n", "\n", "Accepted (2026-06-10)\n", "\n",
|
||||
"## Context\n", "\n", "Why.\n", "\n",
|
||||
"## Decision\n", "\n", "What.\n", "\n",
|
||||
"## Consequences\n", "\n", "So what.\n",
|
||||
]
|
||||
|
||||
|
||||
def _checks(findings):
|
||||
return [f for f in findings if f["check"] == "adr-structure"]
|
||||
|
||||
|
||||
def test_good_adr_has_no_findings():
|
||||
out = rs.adr_structure_findings({"docs/decisions/099-example.md": GOOD})
|
||||
assert _checks(out) == []
|
||||
|
||||
|
||||
def test_missing_mandatory_section_is_flagged():
|
||||
lines = [ln for ln in GOOD if not ln.startswith("## Consequences")]
|
||||
out = _checks(rs.adr_structure_findings({"docs/decisions/099-example.md": lines}))
|
||||
assert len(out) == 1
|
||||
assert "Consequences" in out[0]["detail"]
|
||||
|
||||
|
||||
def test_unparseable_status_is_flagged():
|
||||
lines = [("Designed, not built.\n" if ln == "Accepted (2026-06-10)\n" else ln)
|
||||
for ln in GOOD]
|
||||
out = _checks(rs.adr_structure_findings({"docs/decisions/099-example.md": lines}))
|
||||
assert len(out) == 1
|
||||
assert "Status not parseable" in out[0]["detail"]
|
||||
|
||||
|
||||
def test_superseded_status_is_accepted():
|
||||
lines = [("Superseded by ADR-100 (2026-06-11)\n" if ln == "Accepted (2026-06-10)\n"
|
||||
else ln) for ln in GOOD]
|
||||
out = _checks(rs.adr_structure_findings({"docs/decisions/099-example.md": lines}))
|
||||
assert out == []
|
||||
|
||||
|
||||
def test_proposed_status_is_accepted():
|
||||
lines = [("Proposed (2026-06-04)\n" if ln == "Accepted (2026-06-10)\n"
|
||||
else ln) for ln in GOOD]
|
||||
out = _checks(rs.adr_structure_findings({"docs/decisions/099-example.md": lines}))
|
||||
assert out == []
|
||||
|
||||
|
||||
def test_non_numbered_file_is_skipped():
|
||||
bare = ["# ADR template\n", "\n", "## Status\n", "\n", "<!-- hint -->\n"]
|
||||
out = _checks(rs.adr_structure_findings({"docs/decisions/adr-template.md": bare}))
|
||||
assert out == []
|
||||
Loading…
Add table
Reference in a new issue