Compare commits

17 commits

Author SHA1 Message Date
7ebbc113ab Merge feat/adr-structure: ADR-023 structure & lifecycle + back-catalogue conformance
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-10 15:18:48 +02:00
fa3db421dc docs(kaizen): FRICTION signal — controller must diff-audit subagent restructures
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-10 15:01:21 +02:00
d0a3307822 docs(adr): fix 007/008 heading nesting; require date in Superseded status
Final-review polish: demote the sub-headings under the demoted 'IP addressing'
(007) and 'Three testing levels'/'What Molecule tests' (008) to #### so they
nest correctly instead of flattening to siblings. Tighten the adr-structure
Superseded pattern to require '(YYYY-MM-DD)' per ADR-023.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-10 15:00:58 +02:00
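The nesting fix in d0a3307822 follows from a simple rule: when a section heading is demoted, every heading beneath it has to drop one level as well, otherwise the children flatten into siblings. Below is a minimal Python sketch of that rule, not the tooling used in the repository. `demote_headings` and its `min_level` parameter are hypothetical names, and the fence handling is a simplifying assumption.

```python
import re

# ATX headings of level 1-5; level 6 cannot be demoted any further.
HEADING = re.compile(r"^(#{1,5})(\s+\S.*)$")
FENCE = re.compile(r"^\s*(```|~~~)")


def demote_headings(markdown: str, min_level: int = 2) -> str:
    """Demote every ATX heading at min_level or deeper by one level.

    Lines inside fenced code blocks are left alone, so a shell comment
    such as '# install' is not mistaken for a heading.
    """
    out = []
    in_fence = False
    for line in markdown.splitlines():
        if FENCE.match(line):
            in_fence = not in_fence  # toggle on every fence line
        elif not in_fence:
            m = HEADING.match(line)
            if m and len(m.group(1)) >= min_level:
                line = "#" + line  # one extra '#' is one level deeper
        out.append(line)
    return "\n".join(out)
```

Applied to a whole former top-level section, parent and children move together, so `## IP addressing` becomes `###` and its `### VLAN 10` child becomes `####`.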
0df24909e3 docs(adr): restructure ADRs 016-018 to ADR-023 conformance
Make the existing Status sections parseable (Accepted (date) + the existing
designed-not-built note) and add Consequences sections assembled from each
ADR's already-stated residual risks, trade-offs and build status. No
decision substance changed.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-10 14:51:51 +02:00
40a428975a docs(adr): restructure ADR-003 to ADR-023 conformance
Add Status, a descriptive Context, a Decision umbrella over the existing
topical sections (demoted to ###), and a Consequences section assembled
from the ADR's already-stated rationale. No decision substance changed.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-10 14:50:03 +02:00
6d7d27b03b docs(adr): add Proposed lifecycle state; mark ADR-011 Proposed
Revisits the lifecycle decision on the evidence of ADR-011 (a real draft
with open questions). Adds a fourth state, Proposed (YYYY-MM-DD), to ADR-023,
the template, the adr-structure check (+test), spec and plan. Sets ADR-011's
Status to Proposed and removes its now-redundant inline 'Proposed' line.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-10 14:48:55 +02:00
b3ca510380 docs(adr): restructure ADRs 010,011,013 to ADR-023 conformance
010/011: relabel Decisions->Decision + add Status/Consequences.
013: add Status + Decision umbrella (existing Consequences untouched).
No decision substance changed.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-10 14:43:41 +02:00
44dbd4628f docs(adr): restructure ADRs 006-009 to ADR-023 conformance
Add dated Status sections, a Decision umbrella over the existing topical
sections (demoted to ###), and Consequences assembled from each ADR's
already-stated implications. No decision substance changed.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-10 14:41:24 +02:00
188882449d docs(adr): restructure ADRs 001,002,004,005,012,014,015 to ADR-023 conformance
Add dated Status sections and (where missing) Consequences sections assembled
from each ADR's already-stated implications. No decision substance changed.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-10 14:39:00 +02:00
9b1502cf7d docs(adr): register ADR-023 and note adr-structure check
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-10 14:33:55 +02:00
a9aab9d040 docs(adr): ADR-023 — ADR structure & lifecycle
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-10 14:32:40 +02:00
3c920ae630 docs(adr): sync plan Task 2 with flat-comment template fix
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-10 14:31:23 +02:00
ab14d65aa1 docs(adr): add adr-template.md scaffold (ADR-023)
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-10 14:30:52 +02:00
89179dd7c9 docs(adr): revise spec+plan — full retroactive restructure of 001-018
Replaces the Status-only backfill with a faithful presentational
restructure bringing the whole back-catalogue to 4-section conformance
(no grandfathering). Adds the faithfulness rule and per-file worklist.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-10 14:28:20 +02:00
a3ea0f7d80 feat(review): add adr-structure check to repo-scan
Flags numbered ADRs missing a mandatory section (Status/Context/Decision/
Consequences) or with an unparseable Status line. Presence only, not order.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-10 13:57:42 +02:00
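The check described in a3ea0f7d80 could look roughly like the sketch below. It is an illustration under assumptions, not the code in `scripts/repo-scan.py`: `check_adr`, `REQUIRED_SECTIONS`, `STATUS_LINE` and the finding strings are hypothetical. The Status grammar (a lifecycle state followed on the same line by a `(YYYY-MM-DD)` date) is inferred from the later commit messages rather than taken from ADR-023 itself.

```python
import re

REQUIRED_SECTIONS = ("Status", "Context", "Decision", "Consequences")

# Assumed grammar: lifecycle state, optional free text, then an ISO date in parentheses.
STATUS_LINE = re.compile(
    r"^(Proposed|Accepted|Superseded|Deprecated)\b.*\(\d{4}-\d{2}-\d{2}\)"
)


def check_adr(text: str) -> list[str]:
    """Return findings for one ADR; presence of sections only, not their order."""
    sections: dict[str, list[str]] = {}
    current = None
    for line in text.splitlines():
        m = re.match(r"^## +(.+?)\s*$", line)  # level-2 headings only
        if m:
            current = m.group(1)
            sections.setdefault(current, [])
        elif current is not None:
            sections[current].append(line)

    findings = [
        f"missing section: {name}"
        for name in REQUIRED_SECTIONS
        if name not in sections
    ]
    if "Status" in sections:
        body = [ln.strip() for ln in sections["Status"] if ln.strip()]
        if not body or not STATUS_LINE.match(body[0]):
            findings.append("unparseable Status line")
    return findings
```

The sketch ignores fenced code blocks, so a `## ` line inside a fence would be counted as a section. A real check would likely need to skip those.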
ce3319cbed docs(adr): implementation plan + FRICTION signal for ADR structure
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-10 13:55:16 +02:00
dfbe37916f docs(adr): design spec for ADR structure & lifecycle (ADR-023)
Codifies the structure ADRs 019-022 converged on, pins an
Accepted/Superseded/Deprecated lifecycle with a no-silent-rewrite rule,
adds an adr-template.md scaffold, and plans a Status-header backfill of
ADRs 001-018. Basis for ADR-023.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-10 13:45:21 +02:00
27 changed files with 1434 additions and 60 deletions

@@ -25,7 +25,8 @@ report the rest, and write a tracked report to `docs/reviews/`.
 ### Phase 0 — deterministic pre-scan
 Run `python3 scripts/repo-scan.py > /tmp/repo-scan.json`. It returns the **inventory**
 (roles, ADRs, runbooks, playbooks, scripts — your shard list) and **exact findings**
-(markers, broken refs, unencrypted vaults). Fold these into the report verbatim.
+(markers, broken refs, unencrypted vaults, ADR-structure violations). Fold these into
+the report verbatim.
 It also emits two deferral checks (see Phase 2): `open-deferred-item` (every still-open
 ADR "Deferred/Open" entry — a checklist to confirm) and `stale-deferred` (an entry

@@ -231,6 +231,7 @@ Single-contributor, trunk-based (no merge requests / approval gates):
 | Firewall strategy | `docs/decisions/020-firewall.md` |
 | Operational access | `docs/decisions/021-operational-access.md` |
 | Backup & disaster recovery | `docs/decisions/022-backup.md` |
+| ADR structure & lifecycle | `docs/decisions/023-adr-structure.md` |
 | Adding a new role | `docs/runbooks/new-role.md` |
 | Adding a new host | `docs/runbooks/new-host.md` |
 | Rotating vault secrets | `docs/runbooks/rotate-secrets.md` |

@@ -25,6 +25,24 @@ _(append new raw signals here; the next kaizen review consumes them)_
 invented a Status header ("Proposed") on the fly because there's no documented
 convention for how we write ADRs (status lifecycle, required sections). → TODO 10.2 —
 decide a minimal ADR template / status convention.
+- `[recurring]` **Brainstorming's "user reviews spec" gate fires despite a standing
+agreement to skip it** (2026-06-10): writing the ADR-structure spec, I stopped to ask
+the user to review the finished spec before writing the plan — the
+`superpowers:brainstorming` skill scripts that gate. We had previously agreed I should
+move directly from the Q/A to the implementation plan once the spec is written. Same
+shape as the execution-mode-menu signal: an external skill's script conflicting with a
+boma convention, where prose reminders don't hold. → consider a mechanical guard
+(Stop-hook family) or a CLAUDE.md/skill-override note that suppresses the spec-review
+gate.
+- `[recurring]` **Subagent faithfulness self-reports can be wrong — controller must
+diff** (2026-06-10): during the ADR-023 retroactive restructure, an implementer
+subagent reported "0 substantive deletions, the See-also lines reappear verbatim" for
+ADR-014, but it had actually dropped the cross-reference lines. Caught only by the
+controller independently running `git show <sha> | grep '^-[^-]'`. For
+faithfulness-critical edits delegated to subagents, the agent's own audit is not
+sufficient evidence. → systematize a controller-side deletion-audit step (every `-`
+line must be a classified, expected change) before accepting any "presentational-only"
+restructure; consider a helper script.
 ---
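The controller-side deletion audit proposed in the signal above (every `-` line must be an expected change) might be sketched as follows. This is a heuristic illustration, not an existing helper in the repository: `unaccounted_deletions` and `_normalise` are hypothetical names. The rule that a removed line counts as accounted for when the same text reappears among the added lines, ignoring heading level and surrounding whitespace, is an assumption about what "presentational-only" means.

```python
def _normalise(line: str) -> str:
    """Strip whitespace and leading '#' so a re-levelled heading still matches."""
    return line.strip().lstrip("#").strip()


def unaccounted_deletions(diff_text: str) -> list[str]:
    """Return removed lines from a unified diff that never reappear as added lines."""
    removed: list[str] = []
    added: set[str] = set()
    for line in diff_text.splitlines():
        if line.startswith(("---", "+++")):
            # File headers. A removed markdown rule ('----' in the diff) is also
            # skipped here, much like the grep '^-[^-]' filter quoted above.
            continue
        if line.startswith("-"):
            removed.append(line[1:])
        elif line.startswith("+"):
            added.add(_normalise(line[1:]))
    return [r for r in removed if r.strip() and _normalise(r) not in added]
```

The input would be the text of `git show <sha>`, for example captured with `subprocess.run(["git", "show", sha], capture_output=True, text=True, check=True).stdout`. Anything the function returns still needs a human or controller decision; an empty list only means no deletion went unmatched under this heuristic.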

@@ -1,5 +1,9 @@
 # ADR-001 — Architecture overview
+## Status
+Accepted (2026-05-30)
 ## Context
 This document describes the overall architecture of the homelab infrastructure
@@ -65,3 +69,21 @@ This architecture prioritises:
 - **Simplicity**: few moving parts, no orchestration layer (no Kubernetes, no Swarm)
 - **Reproducibility**: any host can be rebuilt from scratch via Ansible
 - **Legibility**: a human reading the repo can understand what runs where
+## Consequences
+Drawn from the boundaries this ADR already states:
+- The small fleet (25 VMs) is treated as individuals, not cattle (per Infrastructure),
+and forgoing an orchestration layer is the cost of the simplicity priority (per
+Decision).
+- The control node `ubongo` cannot be created by the Terraform it hosts, so it is
+provisioned manually — the one documented exception to Terraform-owned VM existence
+(per Infrastructure / Host groups; ADR-009, ADR-015).
+- Management scope is deliberately bounded: Proxmox configuration itself (storage,
+clustering, networking) is out of scope, and the `control` group never runs the
+`docker_host` role (per Host groups).
+- Compose files are always regenerated by Ansible on deploy; no hand-edited Compose
+files exist on hosts (per Service interaction model).
+- The "What this repo manages" table describes the *intended* design — STATUS.md
+records what is actually built (per that section).

@@ -1,5 +1,9 @@
 # ADR-002 — Security baseline and strategy
+## Status
+Accepted (2026-05-30)
 ## Context
 Security here is not a single control but the sum of several combined efforts —
@@ -183,3 +187,27 @@ This posture was chosen to be:
 Out-of-scope items and conscious trade-offs are recorded in
 `docs/security/accepted-risks.md` rather than here, so this decision record stays
 stable while the risk posture evolves.
+## Consequences
+Drawn from the trade-offs, scoping, and follow-on work this ADR already states:
+- Targeted/physical adversaries are out of scope at this scale, and supply chain is
+consciously deprioritized — active vuln scanning is deferred as an accepted risk
+(per Threat model; `docs/security/accepted-risks.md`).
+- SELinux is not used (non-native to Debian, redundant with AppArmor), recorded as an
+accepted risk (per Mandatory access control).
+- Some CIS L2 items require separate partitions with restrictive mount options, which
+reaches into VM disk layout — a provisioning concern (Terraform / cloud-init, ADR-006),
+not just the `base` role (per Hardening standard). Any impractical CIS item is exempted
+into the accepted-risk register with rationale, recording named exceptions rather than a
+blanket opt-out.
+- Several controls and governance mechanisms are stated as planned, not yet built:
+Suricata network IDS, active alerting wiring AIDE/`auditd`/`fail2ban`/Suricata plus
+log-source-silence into Grafana, the `/security-review` skill and its aggregation of
+every `roles/*/SECURITY.md`, and the periodic security review (per File integrity /
+Governance; STATUS.md / `docs/TODO.md`).
+- The per-service security bar is enforced manually in review today, pending the planned
+`/security-review` automation (per Governance).
+- The accepted-risk register is kept out of this ADR so the record stays stable while the
+risk posture evolves (per Decision; `docs/security/accepted-risks.md`).

@@ -1,6 +1,20 @@
 # ADR-003 — Toolchain decisions
-## Execution engine
+## Status
+Accepted (2026-05-30)
+## Context
+boma needs a defined, reproducible toolchain for running and testing its Ansible
+monorepo: an execution engine, a Python environment, secrets handling, a testing
+framework, linting, CI/CD, developer-ergonomics conventions, and a collections/roles
+policy. This ADR records the choice made for each, together with the alternatives
+weighed and why they were not adopted.
+## Decision
+### Execution engine
 **Choice**: `ansible-core` (pip-installed, pinned version) + explicit `requirements.yml`
@@ -12,7 +26,7 @@ that isn't needed in a maintained monorepo.
 ---
-## Python environment
+### Python environment
 **Choice**: `python3-venv` (system Python on Debian 13) + pinned `requirements.txt`
@@ -24,7 +38,7 @@ reproducible, and has no extra dependencies.
 ---
-## Secrets
+### Secrets
 **Choice**: Ansible Vault (file-based, built-in)
@@ -40,7 +54,7 @@ CLAUDE.md → Secrets).
 ---
-## Testing
+### Testing
 **Choice**: Molecule with Docker driver (`molecule-plugins[docker]`)
@@ -59,7 +73,7 @@ are needed.
 ---
-## Linting
+### Linting
 **Choice**: `ansible-lint` + `yamllint` + `pre-commit`
@@ -71,7 +85,7 @@ Config files: `.ansible-lint`, `.yamllint` in repo root.
 ---
-## CI/CD
+### CI/CD
 **Choice**: Forgejo Actions (self-hosted at forgejo.nyumbani.baobab.band) + `act_runner`
@@ -87,7 +101,7 @@ a dedicated runner VM later if CI load warrants a separate host.
 ---
-## Developer ergonomics
+### Developer ergonomics
 **Choice**: `Makefile` as the single interface for all operations
@@ -102,7 +116,7 @@ The venv is activated in the user's shell profile.
 ---
-## Collections and roles policy
+### Collections and roles policy
 **No Galaxy roles.** All roles are written and maintained locally in `roles/`.
 Galaxy roles introduce external state, versioning surprises, and implicit
@@ -136,3 +150,24 @@ are removed. Each entry in `requirements.yml` must justify its presence.
 | NixOS targets | Poor Ansible fit; all hosts standardised on Debian 13 |
 Terraform is **adopted** for VM provisioning only (no DNS) — see `docs/decisions/006-terraform.md`.
+## Consequences
+Drawn from the rationale and trade-offs this ADR already states:
+- Pinning `ansible-core` + an explicit `requirements.yml` and a plain pinned venv keeps
+the control-node environment small and fully reproducible, at the cost of maintaining
+the pins (per Execution engine / Python environment).
+- Ansible Vault's whole-file encryption makes diffs unreadable regardless of layout, so
+secrets are organised for human lookup (`vault.<service>.<key>`) rather than diff
+ergonomics — the trade accepted against SOPS/age (per Secrets).
+- The `Makefile` is the single interface: Claude Code and CI invoke the same targets, so
+local and CI behaviour can't drift and collaborators need not know raw flags (per
+Developer ergonomics).
+- Collections are added only on demand, so `requirements.yml` stays minimal; this defers
+`community.crypto` (use `openssl` CLI until a role needs certs) and `community.general`
+(add only the specific sub-module needed) until a real need appears (per Collections
+and roles policy).
+- The heavier orchestration tools were declined for this scale, each with a named
+revisit trigger — e.g. Semaphore if non-SSH operators must trigger runs, AWX-adjacent
+tooling only if AWX/AAP is ever adopted (per "What was explicitly ruled out").

@@ -1,5 +1,9 @@
 # ADR-004 — Docker and Compose service model
+## Status
+Accepted (2026-05-30)
 ## Context
 All services run as Docker containers managed via Docker Compose. This document
@@ -107,3 +111,22 @@ Docker Compose was chosen over Kubernetes/Swarm because:
 - Compose files are human-readable and easily auditable
 - No distributed state to manage
 - Straightforward to back up and restore
+## Consequences
+Drawn from the trade-offs and deferred items this ADR already states:
+- A shared `compose_service` engine role is intentionally not built: the ~5 standard
+tasks are duplicated per role in favour of legible, self-contained roles, with a stated
+revisit trigger — extract a shared engine if maintaining the duplicated mechanics
+becomes painful (a pattern change touching many roles, or drift this standard alone
+isn't preventing) (per "Why not a shared engine").
+- Forgoing Kubernetes/Swarm is the deliberate cost of matching complexity to a 25 host
+fleet with no distributed state to manage (per Decision).
+- User-namespace remapping is not enabled by default — evaluated per use case (per Docker
+daemon configuration).
+- Bare `latest` is acceptable only on the stateless tier; the stateful tier is always
+pinned `tag@digest`, and image updates are a deliberate operation (per Image management;
+ADR-011).
+- Backup strategy is stated as defined separately, not in scope of this ADR (per Persistent
+data).

@@ -1,5 +1,9 @@
 # ADR-005 — Host bootstrapping
+## Status
+Accepted (2026-05-30)
 ## Context
 This document defines the **cloud-init template** that managed VMs are cloned
@@ -81,3 +85,19 @@ Cloud-init with Proxmox templates provides:
 - No manual installer interaction
 - A clean handoff point to Ansible
 - Easy rebuilds — destroy VM, clone template, run Ansible
+## Consequences
+Drawn from the trade-offs and special cases this ADR already states:
+- The cloud-init image was chosen over a manual Debian installer (slow, error-prone,
+not reproducible) and over preseed/netboot (powerful but complex to maintain) (per
+Approach).
+- Template creation is a one-time manual procedure per Proxmox cluster, and the template
+is never booted directly (per Template creation).
+- There is no manual `qm clone` path for managed hosts; the full create → inventory →
+configure pipeline and the Terraform↔Ansible contract live in ADR-009 (per VM
+provisioning / Ansible handoff).
+- The control node is the sole documented exception — `ubongo`, a physical machine
+installed by hand because it cannot be created by the Terraform it hosts (chicken-and-egg);
+its hardware target and recovery model live in ADR-015 (per Control node bootstrapping).

@@ -1,5 +1,9 @@
 # ADR-006 — Terraform for infrastructure provisioning
+## Status
+Accepted (2026-05-30)
 ## Context
 Ansible manages host configuration well but has no state model for infrastructure
@@ -13,7 +17,9 @@ exact boundary, handoff pipeline, and data contract between them live in **ADR-0
 ---
-## Responsibility split
+## Decision
+### Responsibility split
 The canonical responsibility-split table lives in **ADR-009**. In short: Terraform
 owns VM existence only; Ansible owns everything inside a VM, including all internal
@@ -26,7 +32,7 @@ cadence, making them a poor fit for Terraform state.
 ---
-## Providers
+### Providers
 **`bpg/proxmox` (`~> 0.70`)**: Chosen over `telmate/proxmox` for active maintenance,
 full Proxmox 8 API support, and better cloud-init integration. This is the only
@@ -42,7 +48,7 @@ Terraform manages its own provider dependencies via `required_providers` and
 ---
-## State backend
+### State backend
 **Choice**: Local state on the control node.
@@ -59,7 +65,7 @@ integration boundary.
 ---
-## Structure
+### Structure
 ```
 terraform/
@@ -83,7 +89,7 @@ Each environment directory contains:
 ---
-## Secrets handling
+### Secrets handling
 The only secret input (the Proxmox API token) is passed via a `TF_VAR_*`
 environment variable and declared `sensitive = true` in `variables.tf`. It never
@@ -92,7 +98,7 @@ appears in `.tfvars` files. Non-secret configuration lives in tracked
 ---
-## Ansible integration
+### Ansible integration
 After `terraform apply`, run `make tf-inventory TF_ENV=<env>` to regenerate
 `inventories/<env>/hosts.yml` from the `vms` output. The full handoff pipeline,
@@ -102,7 +108,7 @@ handoff)**.
 ---
-## What was ruled out
+### What was ruled out
 | Option | Reason |
 |---|---|
@@ -110,3 +116,24 @@ handoff)**.
 | OPNsense Terraform provider | Community-maintained; provider rot risk across OPNsense releases |
 | Terraform workspaces | Single state file with workspace prefix; accidental cross-env apply possible |
 | Separate Terraform repo | Cross-referencing between infra and config adds friction; monorepo keeps the full picture together |
+## Consequences
+Drawn from the "What was ruled out" section and the decisions stated above:
+- `bpg/proxmox` is the only provider; `telmate/proxmox` was ruled out for weaker
+maintenance and Proxmox 8 / cloud-init support (Providers; What was ruled out).
+- OPNsense stays entirely in Ansible — no Terraform OPNsense provider — to avoid
+community-provider rot across OPNsense releases (Responsibility split; What was
+ruled out).
+- Terraform writes no DNS records; Ansible's `dns` role owns the entire internal
+zone, avoiding the bootstrap cycle and split DNS ownership the earlier
+`hashicorp/dns` design created (Providers).
+- State is local on the control node because Forgejo offers no usable HTTP state
+backend; this is sufficient at solo-operator scale (no concurrent applies, no
+remote locking), with a real backend such as MinIO/S3 to be added later if
+warranted (State backend).
+- Separate environment directories are used instead of Terraform workspaces to
+remove the risk of applying the wrong state (Structure; What was ruled out).
+- Terraform and Ansible internals are kept in one monorepo rather than a separate
+Terraform repo to avoid cross-referencing friction (What was ruled out).

@@ -1,5 +1,9 @@
 # ADR-007 — Network topology and addressing
+## Status
+Accepted (2026-05-30)
 ## Context
 The boma homelab is a Proxmox cluster on a dedicated private network behind an
@@ -10,7 +14,9 @@ and OPNsense configuration.
 ---
-## Physical topology
+## Decision
+### Physical topology
 ```
 ISP
@@ -38,7 +44,7 @@ ISP
 ---
-## VLAN design
+### VLAN design
 | VLAN | Name | Subnet | Purpose |
 |---|---|---|---|
@@ -51,9 +57,9 @@ ISP
 ---
-## IP addressing
+### IP addressing
-### VLAN 10 — mgmt (10.10.0.0/24) — no DHCP
+#### VLAN 10 — mgmt (10.10.0.0/24) — no DHCP
 | Address | Host |
 |---|---|
@@ -63,7 +69,7 @@ ISP
 | `10.10.0.201` | `pve1` |
 | `10.10.0.202` | `pve2` |
-### VLAN 20 — srv (10.20.0.0/24) — no DHCP, all static
+#### VLAN 20 — srv (10.20.0.0/24) — no DHCP, all static
 | Range | Purpose |
 |---|---|
@@ -81,28 +87,28 @@ Assigned infrastructure addresses:
 | `10.20.0.12` | `proxy` | Reverse proxy |
 | `10.20.0.13` | `homeassistant` | Home Assistant (IoT controller) |
-### VLAN 30 — lan (10.30.0.0/24)
+#### VLAN 30 — lan (10.30.0.0/24)
 | Range | Purpose |
 |---|---|
 | `10.30.0.1` | OPNsense gateway |
 | `10.30.0.100`–`.249` | DHCP pool |
-### VLAN 40 — iot (10.40.0.0/24)
+#### VLAN 40 — iot (10.40.0.0/24)
 | Range | Purpose |
 |---|---|
 | `10.40.0.1` | OPNsense gateway |
 | `10.40.0.100`–`.249` | DHCP pool |
-### VLAN 50 — guest (10.50.0.0/24)
+#### VLAN 50 — guest (10.50.0.0/24)
 | Range | Purpose |
 |---|---|
 | `10.50.0.1` | OPNsense gateway |
 | `10.50.0.100`–`.249` | DHCP pool |
-### VLAN 99 — vpn — retired
+#### VLAN 99 — vpn — retired
 The OPNsense WireGuard VPN (`10.99.0.0/24`) is **replaced by the NetBird mesh**
 (ADR-016). Remote access for `ubongo`, `askari`, and road-warrior clients rides a
@@ -111,7 +117,7 @@ NetBird self-hosted on `askari`. NetBird manages its own overlay addressing
 (default `100.64.0.0/10`); no boma VLAN/subnet is allocated for it, and
 `10.99.0.0/24` is freed.
-### Corosync ring (172.16.0.0/24) — not on managed switch
+#### Corosync ring (172.16.0.0/24) — not on managed switch
 | Address | Host |
 |---|---|
@@ -121,7 +127,7 @@ NetBird self-hosted on `askari`. NetBird manages its own overlay addressing
 ---
-## OPNsense firewall rules (intent)
+### OPNsense firewall rules (intent)
 | Source | Destination | Policy |
 |---|---|---|
@@ -142,7 +148,7 @@ IoT devices cannot initiate connections to `srv`.
 ---
-## Naming scheme
+### Naming scheme
 | Layer | Convention | Examples |
 |---|---|---|
@@ -155,7 +161,7 @@ IoT devices cannot initiate connections to `srv`.
 ---
-## DNS zones and split-horizon
+### DNS zones and split-horizon
 **Internal zone**: `boma.baobab.band` — served by `dns1` and `dns2`.
 The zone is rendered by the Ansible `dns` role: host A records come from the
@@ -175,7 +181,7 @@ All other queries go upstream (e.g., `1.1.1.1`, `9.9.9.9`).
 ---
-## External monitoring — askari
+### External monitoring — askari
 `askari` (Hetzner VPS) is a peer on the **NetBird mesh** (ADR-016) and also **hosts
 the self-hosted NetBird coordinator** (management/signal/relay). It reaches `srv`
@@ -186,3 +192,24 @@ ACLs — no OPNsense WireGuard tunnel and no `10.99.0.0/24` routing.
 be reachable even when the homelab is down (its entire purpose), which is also why
 the mesh coordinator lives here: an off-site control plane survives a homelab outage.
 FQDN: `askari.baobab.band`.
+---
+## Consequences
+Drawn from the implications already stated above:
+- VLAN 99 (`vpn`, `10.99.0.0/24`) is retired and the subnet freed; remote access is
+carried by the self-hosted NetBird mesh instead of an OPNsense WireGuard subnet
+(VLAN design; IP addressing — VLAN 99 retired).
+- Mesh-peer firewall allowances (to `srv` metrics ports and `mgmt`) are enforced by
+NetBird ACLs, not OPNsense rules (OPNsense firewall rules (intent)).
+- IoT devices cannot initiate connections to `srv`; only Home Assistant at
+`10.20.0.13` may reach the IoT VLAN, with OPNsense Avahi bridging `srv`↔`iot`
+for discovery (OPNsense firewall rules (intent)).
+- Terraform writes no DNS records; the Ansible `dns` role renders the internal zone
+from inventory plus `group_vars`, with `dns1`/`dns2` serving split-horizon answers
+(DNS zones and split-horizon).
+- `askari` runs independently of the cluster so it survives a homelab outage, which
+is why the off-site NetBird control plane lives there (External monitoring —
+askari).

@@ -3,6 +3,10 @@
 > Practical point-of-use pitfalls (nft render checks, Molecule `community.docker`,
 > apply-path coverage blind spots) live in `docs/testing/gotchas.md`.
+## Status
+Accepted (2026-05-30)
 ## Context
 Ansible roles must be idempotent and correct before they touch production hosts.
@@ -11,9 +15,11 @@ This document records the testing strategy, what each level covers, and — crit
 ---
-## Three testing levels
+## Decision
-### Level 1 — Molecule (per role, always required)
+### Three testing levels
+#### Level 1 — Molecule (per role, always required)
 Runs in Docker on the control node (`ubongo`) or in CI. Fast (~5 min per role).
@@ -41,7 +47,7 @@ The idempotency step is non-negotiable. Every role must pass it cleanly.
 that: svc.stdout == "active"
 ```
-### Level 2 — Staging playbook (full stack, real VMs)
+#### Level 2 — Staging playbook (full stack, real VMs)
 `make check PLAYBOOK=site` followed by `make deploy PLAYBOOK=site` on
 Terraform-provisioned staging VMs. Catches inter-role dependencies and ordering
@@ -50,13 +56,13 @@ have already run and configured the firewall).
 Run before every merge to `main`.
-### Level 3 — External smoke test from askari
+#### Level 3 — External smoke test from askari
 Once `askari` is operational: scripted checks from outside the network confirming
 that public-facing services respond correctly. Catches firewall and reverse proxy
 configuration issues invisible to Ansible check mode.
-### Level 4 — Service-UI acceptance (Claude-driven exploratory)
+#### Level 4 — Service-UI acceptance (Claude-driven exploratory)
 A Claude-driven exploratory check of a service's **application UI**, run as
 `/verify-service <name>` on `ubongo` (ADR-017). Claude drives Chromium via the
@@ -78,7 +84,7 @@ deploy (STATUS.md). Full design: ADR-017.
 ---
-## Molecule test image
+### Molecule test image
 **No external images.** The project builds and hosts its own test image.
@@ -103,7 +109,7 @@ functionally equivalent and fully owned.
 ---
-## Idempotency requirements
+### Idempotency requirements
 Every role task must satisfy one of these:
@@ -121,9 +127,9 @@ catches anything lint misses.
--- ---
## What Molecule tests — and what it does not ### What Molecule tests — and what it does not
### Tested in Molecule #### Tested in Molecule
| Capability | Notes | | Capability | Notes |
|---|---| |---|---|
@ -139,7 +145,7 @@ catches anything lint misses.
| auditd installation and configuration | Install and config file | | auditd installation and configuration | Install and config file |
| Idempotency of all of the above | Enforced by Molecule's idempotency step | | Idempotency of all of the above | Enforced by Molecule's idempotency step |
### Not tested in Molecule — explicit exceptions #### Not tested in Molecule — explicit exceptions
The following require a real kernel or real hardware and are validated only at The following require a real kernel or real hardware and are validated only at
Level 2 (staging) or Level 3 (external). This is a conscious, documented decision Level 2 (staging) or Level 3 (external). This is a conscious, documented decision
@ -161,7 +167,7 @@ Behavioural correctness is confirmed on staging.
--- ---
## CI pipeline ### CI pipeline
``` ```
push to main push to main
@ -178,3 +184,27 @@ promote to production
Manual gates are intentional. Automated tests prove correctness in isolation; Manual gates are intentional. Automated tests prove correctness in isolation;
a human confirms the change is safe to promote. a human confirms the change is safe to promote.
---
## Consequences
Drawn from the limitations and trade-offs already stated above:
- The Molecule idempotency step is non-negotiable; every role must pass it cleanly
(Three testing levels — Level 1).
- A class of capabilities (nftables rule loading, NetBird mesh data plane,
unattended-upgrades behaviour, OPNsense DHCP, Avahi mDNS reflection, hardware
passthrough, corosync cluster formation) cannot be verified in Molecule and is
validated only at Level 2 (staging) or Level 3 (external) — a conscious,
documented decision, not a gap (What Molecule tests — and what it does not).
- The project builds and hosts its own `molecule-debian13` image rather than relying
on an external Docker Hub image (e.g. geerlingguy), accepting the maintenance of a
custom image to avoid drift, disappearance, or unexpected changes outside project
control (Molecule test image).
- Level 4 service-UI acceptance is authorable now but its execution is deferred,
pending `ubongo`, the `playwright` plugin, Authentik, and a staging deploy (Three
testing levels — Level 4).
- Promotion to staging and to production stays behind intentional manual approval
gates; automation proves isolated correctness, a human confirms promotion safety
(CI pipeline).


@ -1,5 +1,9 @@
# ADR-009 — Terraform ↔ Ansible provisioning handoff # ADR-009 — Terraform ↔ Ansible provisioning handoff
## Status
Accepted (2026-05-30)
## Context ## Context
Two tools touch every managed host. Terraform owns **what exists** — VMs on Two tools touch every managed host. Terraform owns **what exists** — VMs on
@ -14,7 +18,9 @@ the cloud-init template that VMs are cloned from. This ADR covers how they conne
--- ---
## The boundary ## Decision
### The boundary
| Layer | Tool | Notes | | Layer | Tool | Notes |
|---|---|---| |---|---|---|
@ -31,7 +37,7 @@ below).
--- ---
## The handoff pipeline ### The handoff pipeline
There is one path by which a managed host comes into existence and reaches its There is one path by which a managed host comes into existence and reaches its
configured state: configured state:
@ -55,7 +61,7 @@ this pipeline — **never** by hand-editing the inventory.
--- ---
## The data contract ### The data contract
The seam's interface is a single Terraform output consumed by a single script. The seam's interface is a single Terraform output consumed by a single script.
@ -88,7 +94,7 @@ Terraform, and the inventory is regenerated, never edited.
--- ---
## Cloud-init's role ### Cloud-init's role
Cloud-init is the thin first-boot layer between Terraform and Ansible: Cloud-init is the thin first-boot layer between Terraform and Ansible:
@ -103,7 +109,7 @@ The line is sharp: cloud-init buys *reachability*, Ansible owns *configuration*.
--- ---
## Internal DNS — owned by Ansible, no chicken-and-egg ### Internal DNS — owned by Ansible, no chicken-and-egg
Terraform writes **no** DNS records. The internal zone (`boma.baobab.band`) is Terraform writes **no** DNS records. The internal zone (`boma.baobab.band`) is
rendered entirely by the Ansible `dns` role: rendered entirely by the Ansible `dns` role:
@ -129,7 +135,7 @@ convention only — it no longer implies any difference in how records are writt
--- ---
## The control-node exception ### The control-node exception
The control node — the host that runs Terraform and Ansible — is `ubongo`, a The control node — the host that runs Terraform and Ansible — is `ubongo`, a
dedicated **physical** machine outside the cluster. It is not a VM at all, so dedicated **physical** machine outside the cluster. It is not a VM at all, so
@ -146,7 +152,7 @@ Every other host is Terraform-managed.
--- ---
## What was ruled out ### What was ruled out
| Option | Reason | | Option | Reason |
|---|---| |---|---|
@ -154,3 +160,25 @@ Every other host is Terraform-managed.
| Hand-editing the generated inventory | `hosts.yml` is a build artifact of `tf_to_inventory.py`; edits are overwritten on the next `make tf-inventory`. Edit `local.vms` instead. | | Hand-editing the generated inventory | `hosts.yml` is a build artifact of `tf_to_inventory.py`; edits are overwritten on the next `make tf-inventory`. Edit `local.vms` instead. |
| Documenting the seam in both ADR-005 and ADR-006 | The boundary belongs in exactly one place. Those ADRs link here. | | Documenting the seam in both ADR-005 and ADR-006 | The boundary belongs in exactly one place. Those ADRs link here. |
| Terraform-managed DNS records (`hashicorp/dns` + RFC 2136) | Created a bootstrap cycle (the first DNS server can't register itself) and split DNS ownership across two tools. Ansible owns the whole internal zone instead — one owner, no cycle. | | Terraform-managed DNS records (`hashicorp/dns` + RFC 2136) | Created a bootstrap cycle (the first DNS server can't register itself) and split DNS ownership across two tools. Ansible owns the whole internal zone instead — one owner, no cycle. |
## Consequences
Drawn from the boundary, the data contract, and the "What was ruled out" section above:
- Adding a host means editing `local.vms` and running the handoff pipeline; the
generated `hosts.yml` is a build artifact and must never be hand-edited — manual
edits are overwritten on the next `make tf-inventory` (The handoff pipeline; The
data contract; What was ruled out).
- Manual `qm clone` is rejected as a general provisioning path so the inventory and
real infrastructure cannot drift; Terraform is the single way VMs come into
existence (What was ruled out).
- Terraform writes no DNS records: the Ansible `dns` role renders the whole internal
zone from inventory plus `group_vars`, dissolving the bootstrap cycle a
Terraform-managed zone (`hashicorp/dns` + RFC 2136) would create (Internal DNS —
owned by Ansible, no chicken-and-egg; What was ruled out).
- The control node (`ubongo`) is the single documented exception to "Terraform owns
VM existence" — a physical machine provisioned manually and managed by Ansible for
baseline config only; every other host is Terraform-managed (The control-node
exception).
- The seam is documented in exactly one place (this ADR); ADR-005 and ADR-006 link
here rather than restating it (What was ruled out).


@ -1,5 +1,9 @@
# ADR-010 — Forgejo integration and CI # ADR-010 — Forgejo integration and CI
## Status
Accepted (2026-05-30)
## Context ## Context
boma's git host, container registry, and (planned) CI all run on a self-hosted boma's git host, container registry, and (planned) CI all run on a self-hosted
@ -20,7 +24,7 @@ held to the same standard as the rest of the repo's secrets.
--- ---
## Decisions ## Decision
### 1. API tokens are managed secrets, least-privilege ### 1. API tokens are managed secrets, least-privilege
@ -75,3 +79,21 @@ later if CI load warrants a separate host. Actions is not yet enabled — see ST
| Terraform Forgejo HTTP state backend | Forgejo's `/raw/` API is read-only; state can't be written there. Local state instead (ADR-006). | | Terraform Forgejo HTTP state backend | Forgejo's `/raw/` API is read-only; state can't be written there. Local state instead (ADR-006). |
| Admin-scoped automation tokens | Unnecessary privilege; scope to `read:repository` + `read`/`write:package`. | | Admin-scoped automation tokens | Unnecessary privilege; scope to `read:repository` + `read`/`write:package`. |
| Ad-hoc UI/API configuration as the norm | Becomes undocumented drift; codify or document instead. | | Ad-hoc UI/API configuration as the norm | Becomes undocumented drift; codify or document instead. |
---
## Consequences
- The planned CI pipeline (see "CI pipeline (planned)") is trunk-based per ADR-003 /
ADR-008 — `push to main → lint + Molecule → deploy staging → [manual gate] → deploy
production` — running `act_runner` on `ubongo` (or a dedicated runner VM later if CI
load warrants); Actions is not yet enabled, so this remains future work tracked in
STATUS.md.
- Terraform state is not held in Forgejo: its `/raw/` API is read-only and cannot be
written, so local state is used instead (ADR-006) (see "What was ruled out").
- Automation tokens are scoped to `read:repository` + `read`/`write:package` rather
than admin, accepting the limits that least-privilege imposes on what automation can
do (see "What was ruled out").
- Instance/repo configuration must be codified or documented rather than changed
ad-hoc, to avoid the undocumented drift `/review-repo` exists to catch (see "What was
ruled out").


@ -1,6 +1,9 @@
# ADR-011 — Update and upgrade management # ADR-011 — Update and upgrade management
**Status: Proposed — draft for discussion (not yet accepted).** ## Status
Proposed (2026-06-04) — draft for discussion; not yet accepted. The core decisions
below are settled in intent, but several specifics remain open (see "Open questions").
## Context ## Context
@ -10,7 +13,7 @@ drift over time and must be kept current without breaking the homelab: the **hos
--- ---
## Decisions ## Decision
### 1. Every service is classified stateful or stateless ### 1. Every service is classified stateful or stateless
@ -132,3 +135,19 @@ alert-driven.
| 8-weekly as the only stateful path | Too slow for urgent CVEs — hence the DIUN security fast-path. | | 8-weekly as the only stateful path | Too slow for urgent CVEs — hence the DIUN security fast-path. |
--- ---
## Consequences
- A single uniform update policy is rejected: the stateful/stateless split is
load-bearing, so stateless services roll on rolling tags while stateful services are
pinned `tag@digest`, human-gated, and backup-first (see "What was ruled out").
- The weekly run never touches stateful services and the whole fleet is never updated
at once, accepting the added orchestration of host ordering and an 8-weekly +
fast-path cadence in exchange for bounded blast radius (see "What was ruled out").
- No update automation ships until the health-check verification gate is in order; the
pipeline is deliberately sequenced behind that harness (see Decision 6).
- Several points remain open for discussion (see "Open questions"): where the Proxmox
snapshot is driven from across the TF/Ansible boundary; the exact cadences; where the
health-check harness lives and the minimum bar that counts as "in order"; whether
classification is a per-role `__stateful` flag or a group_vars list; whether the
weekly run hits staging first; and the notification + "skip/pause" control channel.


@ -1,5 +1,9 @@
# ADR-012 — Hardware reference & capacity evaluation # ADR-012 — Hardware reference & capacity evaluation
## Status
Accepted (2026-06-01)
## Context ## Context
The repo modelled the logical/network layer (Terraform VM specs, ADR-007 The repo modelled the logical/network layer (Terraform VM specs, ADR-007


@ -1,5 +1,9 @@
# ADR-013 — Heritage: learning from AnsibleBaobabV4 without inheriting it # ADR-013 — Heritage: learning from AnsibleBaobabV4 without inheriting it
## Status
Accepted (2026-06-04)
## Context ## Context
boma is the methodology successor to AnsibleBaobabV4 (and V3 before it) — not a new boma is the methodology successor to AnsibleBaobabV4 (and V3 before it) — not a new
@ -10,7 +14,9 @@ structure and assumptions creep back in under the guise of "inspiration." This A
sets the policy for drawing on V4 without inheriting it. (Resolves the questions sets the policy for drawing on V4 without inheriting it. (Resolves the questions
previously parked in TODO 3.3 and 10.1.) previously parked in TODO 3.3 and 10.1.)
## Principle — translate, don't transplant ## Decision
### Principle — translate, don't transplant
V4 is **evidence, never authority.** It can show what was needed or what went wrong; V4 is **evidence, never authority.** It can show what was needed or what went wrong;
it can never be the reason boma does something a certain way. it can never be the reason boma does something a certain way.
@ -21,7 +27,7 @@ it can never be the reason boma does something a certain way.
- **Acceptance test** for anything V4-derived: *can it be justified purely from - **Acceptance test** for anything V4-derived: *can it be justified purely from
boma's principles, with zero reference to V4?* If not, it does not land. boma's principles, with zero reference to V4?* If not, it does not land.
## What V4 is — and is not — a source of ### What V4 is — and is not — a source of
| Legitimate source of | Never a source of | | Legitimate source of | Never a source of |
|---|---| |---|---|
@ -33,7 +39,7 @@ it can never be the reason boma does something a certain way.
Only concrete, verifiable, low-level knowledge crosses over — precisely because it is Only concrete, verifiable, low-level knowledge crosses over — precisely because it is
safe to re-derive, whereas structure and requirements drag assumptions along. safe to re-derive, whereas structure and requirements drag assumptions along.
## Provenance — transient only ### Provenance — transient only
When a boma decision was prompted by a V4 lesson, or a config adapted from V4, the When a boma decision was prompted by a V4 lesson, or a config adapted from V4, the
lineage is recorded only in **transient** places: the commit message, the working lineage is recorded only in **transient** places: the commit message, the working
@ -42,7 +48,7 @@ extraction warrants one. **Durable artifacts (ADRs, role READMEs, `SECURITY.md`)
stand on boma's own terms with no V4 reference.** Honest about lineage in history; stand on boma's own terms with no V4 reference.** Honest about lineage in history;
clean in the living repo. clean in the living repo.
## AI consultation guardrails ### AI consultation guardrails
The AI is the main consumer of V4 — it is on disk and readable. When consulting it: The AI is the main consumer of V4 — it is on disk and readable. When consulting it:


@ -1,5 +1,9 @@
# ADR-014 — Sourcing technical knowledge (docs and best practices) # ADR-014 — Sourcing technical knowledge (docs and best practices)
## Status
Accepted (2026-06-04)
## Context ## Context
Most work in boma is done by AI agents drawing on training memory, which is stale Most work in boma is done by AI agents drawing on training memory, which is stale
@ -100,5 +104,27 @@ above keeps the policy working.
- Commit to the principle, not a tool — degrade to `WebFetch`/`WebSearch` when plugins - Commit to the principle, not a tool — degrade to `WebFetch`/`WebSearch` when plugins
are absent. are absent.
See also: ADR-013 (heritage / translate-don't-transplant), ADR-011 (version pinning), ## Consequences
ADR-008 (testing/verification).
Drawn from the follow-on work and limitations this ADR already states:
- Verified facts carry a durable, greppable stamp; a stamp binds a fact to a pinned
version, so a `requirements` change or image upgrade marks exactly what to re-check
(per Capture / Re-verification).
- Stale-stamp detection — a `/review-repo` or `/security-review` check flagging stamps
whose recorded version no longer matches what is pinned — is a noted enhancement, not
built yet (per Re-verification).
- Any version-specific claim given from memory must be marked "from memory, unverified"
as a transparency backstop, since agent self-assessed certainty is unreliable (per
When consulting is required).
- The policy commits to the principle rather than a specific plugin, so it degrades to
`WebFetch`/`WebSearch` on a bare install; reproducing the plugin toolchain from the
repo is done via `.claude/settings.json` and `docs/runbooks/claude-code-setup.md`,
with the graceful-degradation fallback covering a fresh clone until bootstrap runs
(per Source hierarchy / Reproducibility of the toolchain).
## Related
- ADR-013 — heritage / translate-don't-transplant.
- ADR-011 — version pinning.
- ADR-008 — testing / verification.


@ -1,5 +1,9 @@
# ADR-015 — Control / development / AI-worker host (`ubongo`) # ADR-015 — Control / development / AI-worker host (`ubongo`)
## Status
Accepted (2026-06-05)
## Context ## Context
Earlier ADRs framed the control node — the host that runs Terraform and Ansible — Earlier ADRs framed the control node — the host that runs Terraform and Ansible —


@ -90,7 +90,7 @@ allocated for it.
## Status ## Status
Designed, not built — depends on the unbuilt `base` role and service-role machinery Accepted (2026-06-05). Designed, not built — depends on the unbuilt `base` role and service-role machinery
(STATUS.md). This ADR records the decision and doc reconciliation; role tasks land when (STATUS.md). This ADR records the decision and doc reconciliation; role tasks land when
`base` exists. `base` exists.
@ -108,3 +108,22 @@ Designed, not built — depends on the unbuilt `base` role and service-role mach
See also: ADR-007 (network — amended), ADR-015 (control host), ADR-002 (security), See also: ADR-007 (network — amended), ADR-015 (control host), ADR-002 (security),
ADR-011 (version pinning), ADR-004 (one service = one role), ADR-009 (TF↔Ansible ADR-011 (version pinning), ADR-004 (one service = one role), ADR-009 (TF↔Ansible
handoff), ADR-013 (heritage — V4 ran WireGuard; NetBird is translated, not transplanted). handoff), ADR-013 (heritage — V4 ran WireGuard; NetBird is translated, not transplanted).
## Consequences
- A new public surface appears on `askari` — management API + dashboard (80/443) +
Coturn (3478) — mitigated by TLS, embedded-IdP login, source-IP limits where
practical, `base` hardening and version-pinned NetBird, and recorded as accepted-risk
R3 (Security).
- On-LAN SSH never depends on the mesh: `base` allows inbound SSH from `ubongo`'s LAN
address as a mesh-independent secondary path, so a mesh/coordinator outage never
blocks on-LAN SSH and Ansible stays off the mesh (Security; Recovery & operations).
- The mesh survives a homelab outage because the coordinator is off-site on `askari`,
with its management datastore backed up encrypted off `askari` and peers keeping
last-known config through a brief coordinator outage (Recovery & operations).
- Choosing NetBird over plain OPNsense WireGuard, Tailscale, Tailscale+Headscale, an
on-cluster coordinator, a `ubongo` subnet router, and a standalone IdP gains
identity/ACL policy, self-hosted sovereignty, no routing SPOF, and a light
single-operator footprint (What was ruled out).
- Implementation is pending: the role tasks land only once the unbuilt `base` role and
service-role machinery exist (Status).


@ -65,7 +65,7 @@ them.
## Status ## Status
Designed. **Authorable now:** this ADR, the ADR-008 Level 4 expansion, the `VERIFY.md` Accepted (2026-06-05). Designed. **Authorable now:** this ADR, the ADR-008 Level 4 expansion, the `VERIFY.md`
template, the `/verify-service` skill, the convention/checklist/Further-reading edits, template, the `/verify-service` skill, the convention/checklist/Further-reading edits,
`.gitignore`/dir, STATUS/TODO. **Running is deferred** on its dependencies. `.gitignore`/dir, STATUS/TODO. **Running is deferred** on its dependencies.
@ -90,3 +90,21 @@ template, the `/verify-service` skill, the convention/checklist/Further-reading
See also: ADR-008 (testing — expanded), ADR-015 (control host), ADR-002 (security), See also: ADR-008 (testing — expanded), ADR-015 (control host), ADR-002 (security),
ADR-004 (`VERIFY.md` parallels `SECURITY.md`), ADR-013/014 (heritage / knowledge sourcing). ADR-004 (`VERIFY.md` parallels `SECURITY.md`), ADR-013/014 (heritage / knowledge sourcing).
## Consequences
- The harness is confined to staging by a hard stop: it refuses to run against
production because exploratory clicking is destructive, the blast radius is bounded to
the target service, and test users live only in the staging `test` group (Safety).
- No secrets leak: the git-ignored screenshot dir is the safety boundary and credential
screens are avoided (Safety; Reporting & manual handoff).
- Test identities are ephemeral per-run credentials in the staging Authentik only —
never production, none persisted in `vault.yml` — created reuse-or-create and torn
down via staging rebuild or `test`-group cleanup (Test-user standard).
- Anything Claude cannot exercise (physical device, paid/external flow, subjective
judgment) is handed off via a structured manual-test checklist in the run report
(Reporting & manual handoff).
- Authoring is possible now (this ADR, the `VERIFY.md` template, the `/verify-service`
skill, conventions/checklist edits), but running is deferred on its dependencies:
`ubongo`, the `playwright` plugin, Authentik, a staging deploy, and `make new-role`
scaffolding `VERIFY.md` (Status; Dependencies).


@ -72,7 +72,7 @@ tracked allocation in `docs/hardware/reference.md` (ADR-012).
## Status ## Status
Designed. **Authorable now:** this ADR + the ADR-002/CAPABILITIES/ADR-012/ Accepted (2026-06-06). Designed. **Authorable now:** this ADR + the ADR-002/CAPABILITIES/ADR-012/
accepted-risks/STATUS/TODO reconciliations. **Deferred on the stack:** Alloy-in-`base`, accepted-risks/STATUS/TODO reconciliations. **Deferred on the stack:** Alloy-in-`base`,
the `loki`/`grafana` service roles, OPNsense syslog config, the push-only credential, the `loki`/`grafana` service roles, OPNsense syslog config, the push-only credential,
and the live pipeline. and the live pipeline.
@ -97,3 +97,26 @@ the metrics stack (Prometheus / `node_exporter`) for SSD-wearout + log-silence a
See also: ADR-002 (security baseline — realised here), ADR-016 (mesh / `askari`), See also: ADR-002 (security baseline — realised here), ADR-016 (mesh / `askari`),
ADR-007 (OPNsense / `askari`), ADR-012 (hardware/capacity), ADR-004 (service-role ADR-007 (OPNsense / `askari`), ADR-012 (hardware/capacity), ADR-004 (service-role
standard), ADR-011 (health checks — distinct from this). standard), ADR-011 (health checks — distinct from this).
## Consequences
- Opportunistic track-covering and host-pivot-to-store are defeated because logs leave
the host in near-real-time and the off-cluster security trail is append-only, so it
survives full-cluster compromise (Security, integrity & residual risks).
- Conscious residuals remain: append-only is not cryptographic WORM (root-on-`askari`
could edit chunks — R4); there is a few-seconds un-shipped window; agent compromise
can stop future shipping but not alter shipped history; a stolen push credential
appends noise but cannot delete; and an `askari` outage buffers then flushes on
reconnect (Security, integrity & residual risks).
- A host going silent is itself an alert (Security, integrity & residual risks).
- Only a bounded security subset ships off-site — `auditd`, `authpriv`, `fail2ban`,
AIDE, Suricata and key container security events tagged `security="true"` — while the
cluster Loki holds everything, keeping off-site volume small (Data flow & the security
subset).
- Disk-wear is a managed parameter: log storage on NVMe/SSD or HDD, never SD/USB flash,
bounded verbosity at source, tuned Loki retention/compaction, and monitored SSD
wearout/TBW with an alert; log storage is a tracked allocation in
`docs/hardware/reference.md` (Retention & disk-wear).
- The decision is authorable now but the live pipeline is deferred on the stack:
Alloy-in-`base`, the `loki`/`grafana` service roles, OPNsense syslog config, and the
push-only credential (Status; Dependencies).


@ -0,0 +1,106 @@
# ADR-023 — ADR structure & lifecycle
## Status
Accepted (2026-06-10). Meta/doctrine ADR — pins how ADRs are written; the
`adr-structure` check (`scripts/repo-scan.py`) and `docs/decisions/adr-template.md`
ship with it, and ADRs 001–018 were retroactively restructured to conform. Resolves
the FRICTION signal (2026-05-31) about ADR-writing policy being unsettled.
## Context
boma records architectural decisions as numbered ADRs in `docs/decisions/`, and
CLAUDE.md treats them as load-bearing. Yet no ADR said how an ADR is written. The
newest ADRs (019–022) converged on a clean shape — Status → Context → Decision →
Consequences → Related — but only by imitation. ADRs 001–018 predate it and drifted
widely: most lacked a `## Status` section entirely (016–018 carried only a trailing
build-state note), and many lacked an explicit `## Decision` or `## Consequences`
heading, their decisions spread across ad-hoc topical sections. The result was
structural drift and no uniform way to tell an active decision from a superseded or
deprecated one.
## Decision
### 1. Title & filename
Title line: `# ADR-NNN — <Title>: <optional clarifying subtitle>` (em-dash). Filename:
`NNN-kebab-title.md`, zero-padded 3-digit, monotonic, never reused — a superseded ADR
keeps its number and file. A new ADR is registered as a row in the CLAUDE.md
"Further reading" table.
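As a rough illustration of the naming rule above, the Python sketch below checks a filename and a title line against it. The two regexes, the function name `check_name_and_title`, and the example filenames in the usage note are assumptions made for illustration; none of them is taken from `scripts/repo-scan.py`.

```python
import re

# Assumed pattern: three zero-padded digits, then lowercase kebab-case words.
FILENAME_RE = re.compile(r"^(?P<num>\d{3})-[a-z0-9]+(?:-[a-z0-9]+)*\.md$")
# Assumed pattern: "# ADR-NNN", an em-dash between single spaces, then a title.
TITLE_RE = re.compile(r"^# ADR-(?P<num>\d{3}) — \S.*$")


def check_name_and_title(filename: str, title_line: str) -> list[str]:
    """Return human-readable problems; an empty list means both look conformant."""
    problems = []
    name = FILENAME_RE.match(filename)
    title = TITLE_RE.match(title_line)
    if name is None:
        problems.append(f"filename {filename!r} is not NNN-kebab-title.md")
    if title is None:
        problems.append("title line is not '# ADR-NNN — <Title>'")
    # Only compare the numbers when both parts parsed.
    if name and title and name["num"] != title["num"]:
        problems.append("filename number and title number differ")
    return problems
```

Calling `check_name_and_title("023-adr-structure-lifecycle.md", "# ADR-023 — ADR structure & lifecycle")` (a hypothetical filename) returns an empty list. Monotonicity and non-reuse of numbers would need the whole directory listing and are not covered by this sketch.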
### 2. Mandatory sections, in this order
- `## Status` — a lifecycle line, usually `Accepted (YYYY-MM-DD)` (see §4), plus an
optional one-line note.
- `## Context` — the forces, the problem, what exists today, why now.
- `## Decision` — what we are doing; numbered sub-decisions for multi-part ADRs.
- `## Consequences` — results, trade-offs explicitly accepted, follow-on work.
### 3. Optional sections (use only where they genuinely apply)
`## Related`, `## Scope`, `## Guardrails` / `## Enforcement`, `## What was ruled out`,
`## Verified facts (ADR-014)`.
### 4. Status lifecycle
Four states. Because boma is single-contributor and trunk-based with no review gate,
most ADRs are **born `Accepted (YYYY-MM-DD)`** — committed-to on writing. A
**`Proposed`** state exists for a genuine draft whose core direction is recorded but
whose specifics are still open for discussion (e.g. ADR-011); it is promoted to
`Accepted` once settled.
- **`Proposed (YYYY-MM-DD)`** — drafted, under discussion, not yet committed-to. May
carry open questions. Promoted to `Accepted (YYYY-MM-DD)` when decided.
- **`Accepted (YYYY-MM-DD)`** — committed-to. The common starting state.
- Replaced → old ADR's Status becomes **`Superseded by ADR-NNN (YYYY-MM-DD)`**; the new
ADR records `Supersedes ADR-MMM` in its Status and `## Related`. The link is
**bidirectional**.
- Retired with no replacement → **`Deprecated (YYYY-MM-DD)`** + a one-line reason.
**No silent rewrites.** An Accepted ADR is not edited to reverse its decision. Typo and
clarity fixes are fine; a material reversal requires a new ADR and a `Superseded by`
marker on the old one.
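The four lifecycle states can be read mechanically from the first line under `## Status`. The sketch below is one possible parser, assuming a Status line starts with the state, followed by a parenthesised ISO date, with any trailing note ignored. `STATUS_RE` and `parse_status` are hypothetical names, and the real `adr-structure` check may parse differently.

```python
import re
from datetime import date

# Assumed grammar: "<State> (YYYY-MM-DD)" or "Superseded by ADR-NNN (YYYY-MM-DD)",
# optionally followed by a free-text note, which is ignored here.
STATUS_RE = re.compile(
    r"^(?:(?P<state>Proposed|Accepted|Deprecated)"
    r"|Superseded by ADR-(?P<adr>\d{3}))"
    r" \((?P<day>\d{4}-\d{2}-\d{2})\)"
)


def parse_status(line: str):
    """Return (state, superseding_adr_or_None, date), or None if unparseable."""
    m = STATUS_RE.match(line.strip())
    if m is None:
        return None
    try:
        day = date.fromisoformat(m["day"])  # rejects impossible dates
    except ValueError:
        return None
    return (m["state"] or "Superseded", m["adr"], day)
```

Under these assumptions a line such as `Superseded by ADR-023` with no date is rejected, which matches the rule that the Superseded marker carries a date. Checking that the supersession link is bidirectional would need both files and is left out.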
### 5. Template & enforcement
`docs/decisions/adr-template.md` is the scaffold for new ADRs. The `/review-repo`
command's pre-scan (`scripts/repo-scan.py`) emits an `adr-structure` finding for any
numbered ADR missing a mandatory section or with an unparseable Status line. It checks
**presence and Status, not section order** — order is a convention the template carries,
deliberately not gated, to keep enforcement lightweight (consistent with boma's other
doctrine ADRs adding no CI gate).
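A presence-only check of this kind could look roughly like the sketch below. It is a stdlib-only pure function, in the spirit of the description above, but `MANDATORY`, `H2_RE`, and `missing_sections` are illustrative names and not the actual contents of `scripts/repo-scan.py`.

```python
import re

# The four mandatory level-2 headings named in section 2 of this ADR.
MANDATORY = ("Status", "Context", "Decision", "Consequences")
# Matches "## Heading" lines only; "### ..." sub-headings do not match.
H2_RE = re.compile(r"^## +(.+?)\s*$", re.MULTILINE)


def missing_sections(markdown: str) -> list[str]:
    """Return the mandatory headings absent from the text, ignoring their order."""
    found = {m.group(1) for m in H2_RE.finditer(markdown)}
    return [name for name in MANDATORY if name not in found]
```

Because the headings are collected into a set, order is deliberately not checked. A pre-restructure heading such as `## Decisions` would be reported as a missing `Decision`. One known limitation of this sketch is that a `## ` line inside a fenced code block would be counted as a heading.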
### 6. Retroactive conformance of the back-catalogue
ADRs 001–018 are restructured to satisfy this standard rather than grandfathered. The
restructure is **presentational** — existing headings are relabelled, regrouped, or
demoted under a `## Decision` umbrella; a dated `## Status` is added; a `## Consequences`
section is assembled from implications the ADR already states. **The substance of no
decision is changed.** This keeps the check uniform (no number threshold) and the corpus
a consistent, legible decision history.
## Consequences
- New ADRs have one obvious shape and a scaffold; structural drift stops.
- Every ADR declares its lifecycle state uniformly, and reversals are traceable.
- The whole corpus conforms; the check needs no grandfathering and stays simple.
- One-time restructure churn across ADRs 001–018 (heading reorganization + a Status and
a Consequences section per file; no decision substance changed).
- `/review-repo` grows one deterministic check; no new CI machinery.
- This ADR is the first conformant example and is held to its own check.
## What was ruled out
- **A `make lint` / CI gate for ADR structure** — heavier than the risk warrants;
the `/review-repo` check and the template suffice.
- **Machine-enforcing section order** — brittle for marginal value; left as a
template-demonstrated convention.
- **Grandfathering 001–018 from the check** — rejected in favour of restructuring the
whole corpus to conform, so the standard applies uniformly with no exceptions.
## Related
- ADR-014 — knowledge sourcing (the `Verified facts` optional section).
- ADR-019/020/021/022 — the emergent structure this ADR codifies.
- `docs/decisions/adr-template.md` — the scaffold.
- `scripts/repo-scan.py` — the `adr-structure` enforcement check.


@ -0,0 +1,40 @@
# ADR-NNN — <Title>: <optional clarifying subtitle>
<!-- Filename: NNN-kebab-title.md (zero-padded, monotonic, never reused).
Register a row in CLAUDE.md "Further reading" when this ADR is created.
Sections below in order. Mandatory: Status, Context, Decision, Consequences.
Delete this comment and any optional section you don't use. -->
## Status
Accepted (YYYY-MM-DD)
<!-- Lifecycle: usually born "Accepted (YYYY-MM-DD)"; use "Proposed (YYYY-MM-DD)" for a
genuine draft (open questions), promoted to Accepted once settled. Later:
"Superseded by ADR-NNN (YYYY-MM-DD)" or "Deprecated (YYYY-MM-DD)" + one-line why.
Optional trailing note OK, e.g.
"Accepted (2026-06-10). Doctrine ADR — pins policy, builds nothing yet." -->
## Context
<!-- The forces, the problem, what exists today, why now. -->
## Decision
<!-- What we are doing. Use numbered sub-decisions (### 1. ...) for multi-part ADRs. -->
## Consequences
<!-- Results, trade-offs explicitly accepted, follow-on work. -->
<!-- Optional sections — uncomment any that genuinely apply; never pad:
## Scope — explicit in / out-of-scope boundaries.
## Guardrails — how the decision is mechanically enforced (lint, CI, hooks).
## What was ruled out — rejected alternatives, each with its reason.
## Verified facts (ADR-014) — verified: <subject> · <tool> <version> · <source> · <YYYY-MM-DD>
## Related — links to other ADRs by number; bidirectional for Supersedes/Superseded-by.
-->


@ -0,0 +1,556 @@
# ADR Structure & Lifecycle Implementation Plan
> **For agentic workers:** REQUIRED SUB-SKILL: Use superpowers:subagent-driven-development (recommended) or superpowers:executing-plans to implement this plan task-by-task. Steps use checkbox (`- [ ]`) syntax for tracking.
**Goal:** Codify how boma's ADRs are structured — a canonical section set, a Proposed/Accepted/Superseded/Deprecated lifecycle, a template, a lightweight enforcement check, and a one-time presentational restructure of the back-catalogue.
**Architecture:** Five independent units. (1) A pure-function `adr-structure` check added to the existing `scripts/repo-scan.py` (stdlib only, pytest-tested like its siblings), verifying every numbered ADR has the four mandatory sections and a parseable Status line — presence only, not order. (2) An `adr-template.md` scaffold. (3) ADR-023 itself, written to pass its own check. (4) Wiring into CLAUDE.md and the `/review-repo` command doc. (5) A mechanical backfill adding `## Status` to ADRs 001–018, dated from each file's first git-commit.
**Tech Stack:** Python 3 stdlib (`scripts/repo-scan.py`), pytest (`.venv/bin/pytest`), Markdown, git.
**Spec:** `docs/superpowers/specs/2026-06-10-adr-structure-design.md`
**Branch:** `feat/adr-structure` (already created; the design spec is the first commit).
**Convention reminders (from CLAUDE.md):** docs-/script-only commits skip the ansible-lint pre-commit hook and need no `rbw` unlock. Imperative subject ≤72 chars. `Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>` trailer on every commit.
---
## Decisions locked by the spec (do not re-litigate)
- **Mandatory sections, in this order:** `## Status`, `## Context`, `## Decision`, `## Consequences`.
- **Optional sections:** `## Related`, `## Scope`, `## Guardrails` / `## Enforcement`, `## What was ruled out`, `## Verified facts (ADR-014)`.
- **Status lifecycle (4 states):** `Proposed (YYYY-MM-DD)` (genuine drafts, e.g. ADR-011) → `Accepted (YYYY-MM-DD)` (the common starting state) → optionally `Superseded by ADR-NNN (YYYY-MM-DD)` or `Deprecated (YYYY-MM-DD)`. (`Proposed` was added on the evidence of ADR-011, which is a real draft with open questions.)
- **No silent rewrites:** material reversal = new ADR + `Superseded by` marker; bidirectional link.
- **Enforcement checks presence + parseable Status line, NOT section order.** Order is demonstrated by the template, not machine-enforced.
- **Back-catalogue is fully restructured (no grandfathering)** — ADRs 001–018 are brought to all-four-section conformance. The restructure is **presentational**: relabel/regroup/demote existing headings, add a dated Status, assemble a Consequences section from implications the ADR already states. **The substance of no decision is changed.** If a faithful Consequences cannot be drawn from existing content, escalate that file rather than inventing one.
---
## Task 1: `adr-structure` check in repo-scan.py
**Files:**
- Modify: `scripts/repo-scan.py` (add module-level regexes near the other `_RE` definitions ~lines 38–44; add `adr_structure_findings()` next to `deferred_findings()` ~line 96; wire it into `scan()` at the `findings.extend(...)` site ~line 215)
- Test: `tests/test_repo_scan.py` (new)
- [ ] **Step 1: Write the failing test**
Create `tests/test_repo_scan.py`:
```python
import importlib.util
import pathlib

# repo-scan.py has a hyphen in its name, so load it by path rather than `import`.
_PATH = pathlib.Path(__file__).resolve().parent.parent / "scripts" / "repo-scan.py"
_spec = importlib.util.spec_from_file_location("repo_scan", _PATH)
rs = importlib.util.module_from_spec(_spec)
_spec.loader.exec_module(rs)

# A minimal conformant ADR, as a list of lines.
GOOD = [
    "# ADR-099 — Example\n", "\n",
    "## Status\n", "\n", "Accepted (2026-06-10)\n", "\n",
    "## Context\n", "\n", "Why.\n", "\n",
    "## Decision\n", "\n", "What.\n", "\n",
    "## Consequences\n", "\n", "So what.\n",
]


def _checks(findings):
    return [f for f in findings if f["check"] == "adr-structure"]


def test_good_adr_has_no_findings():
    out = rs.adr_structure_findings({"docs/decisions/099-example.md": GOOD})
    assert _checks(out) == []


def test_missing_mandatory_section_is_flagged():
    lines = [ln for ln in GOOD if not ln.startswith("## Consequences")]
    out = _checks(rs.adr_structure_findings({"docs/decisions/099-example.md": lines}))
    assert len(out) == 1
    assert "Consequences" in out[0]["detail"]


def test_unparseable_status_is_flagged():
    lines = [("Designed, not built.\n" if ln == "Accepted (2026-06-10)\n" else ln)
             for ln in GOOD]
    out = _checks(rs.adr_structure_findings({"docs/decisions/099-example.md": lines}))
    assert len(out) == 1
    assert "Status not parseable" in out[0]["detail"]


def test_superseded_status_is_accepted():
    lines = [("Superseded by ADR-100 (2026-06-11)\n" if ln == "Accepted (2026-06-10)\n"
              else ln) for ln in GOOD]
    out = _checks(rs.adr_structure_findings({"docs/decisions/099-example.md": lines}))
    assert out == []


def test_non_numbered_file_is_skipped():
    bare = ["# ADR template\n", "\n", "## Status\n", "\n", "<!-- hint -->\n"]
    out = _checks(rs.adr_structure_findings({"docs/decisions/adr-template.md": bare}))
    assert out == []
```
- [ ] **Step 2: Run the test to verify it fails**
Run: `.venv/bin/pytest tests/test_repo_scan.py -q`
Expected: FAIL — `AttributeError: module 'repo_scan' has no attribute 'adr_structure_findings'`.
- [ ] **Step 3: Add the regexes**
In `scripts/repo-scan.py`, after the `RESOLVE_WORD_RE = ...` line (~line 44), add:
```python
# ADR-structure check (ADR-023): numbered ADRs must carry the four mandatory
# sections and a parseable Status line. Presence only — section ORDER is a
# template-demonstrated convention, not machine-enforced.
# The Status pattern accepts all four lifecycle states locked by the spec
# (Proposed, Accepted, Superseded, Deprecated).
ADR_FILE_RE = re.compile(r"^\d{3}-.*\.md$")
ADR_REQUIRED_SECTIONS = ("Status", "Context", "Decision", "Consequences")
ADR_STATUS_LINE_RE = re.compile(
    r"^(Proposed \(\d{4}-\d{2}-\d{2}\)"
    r"|Accepted \(\d{4}-\d{2}-\d{2}\)"
    r"|Superseded by ADR-\d{3}"
    r"|Deprecated \(\d{4}-\d{2}-\d{2}\))")
```
- [ ] **Step 4: Add the check function**
In `scripts/repo-scan.py`, immediately after the `deferred_findings(...)` function (it ends ~line 96, just before `def walk_files():`), add:
```python
def adr_structure_findings(adr_files):
    """adr_files: {rel_path: [lines]} for docs/decisions/*.md.

    Flags numbered ADRs (NNN-*.md) missing a mandatory section or whose Status
    section has no parseable lifecycle line. Non-numbered files (e.g.
    adr-template.md) are skipped. Section order is NOT checked (ADR-023)."""
    out = []
    for rpath, lines in sorted(adr_files.items()):
        if not ADR_FILE_RE.match(os.path.basename(rpath)):
            continue
        # Map the first word of each "## " heading to the index of its first occurrence.
        headings = {}
        for i, line in enumerate(lines):
            m = re.match(r"^##\s+(\w+)", line)
            if m:
                headings.setdefault(m.group(1), i)
        missing = [s for s in ADR_REQUIRED_SECTIONS if s not in headings]
        if missing:
            out.append({"check": "adr-structure", "severity": "medium",
                        "path": rpath, "line": 1,
                        "detail": f"missing mandatory section(s): {', '.join(missing)}"})
        if "Status" in headings:
            # The Status body runs up to the next "## " heading.
            body = []
            for line in lines[headings["Status"] + 1:]:
                if line.startswith("## "):
                    break
                body.append(line)
            # The lifecycle line is the first non-blank line of the body.
            status_text = next((ln.strip() for ln in body if ln.strip()), "")
            if not ADR_STATUS_LINE_RE.match(status_text):
                out.append({"check": "adr-structure", "severity": "medium",
                            "path": rpath, "line": headings["Status"] + 1,
                            "detail": "Status not parseable (want 'Accepted (YYYY-MM-DD)', "
                                      "'Superseded by ADR-NNN', or 'Deprecated (YYYY-MM-DD)'); "
                                      f"got: {status_text[:60]!r}"})
    return out
```
- [ ] **Step 5: Run the test to verify it passes**
Run: `.venv/bin/pytest tests/test_repo_scan.py -q`
Expected: PASS — 5 passed.
- [ ] **Step 6: Wire the check into `scan()`**
In `scripts/repo-scan.py`, find (~line 215):
```python
    findings.extend(deferred_findings(adr_files, defer_refs))
    return findings
```
Replace with:
```python
    findings.extend(deferred_findings(adr_files, defer_refs))
    findings.extend(adr_structure_findings(adr_files))
    return findings
```
- [ ] **Step 7: Confirm the check fires on the real (not-yet-backfilled) repo**
Run: `python3 scripts/repo-scan.py 2>/dev/null | python3 -c "import json,sys; print(sorted({f['path'] for f in json.load(sys.stdin)['findings'] if f['check']=='adr-structure'}))"`
Expected: a list including `docs/decisions/001-architecture.md` … through `018-logging.md` (001–015 missing Status; 016–018 unparseable Status). 019–022 and 023 must NOT appear. This proves the check works and previews Task 5's worklist.
- [ ] **Step 8: Commit**
```bash
git add scripts/repo-scan.py tests/test_repo_scan.py
git commit -m "feat(review): add adr-structure check to repo-scan

Flags numbered ADRs missing a mandatory section (Status/Context/Decision/
Consequences) or with an unparseable Status line. Presence only, not order.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>"
```
---
## Task 2: ADR template
**Files:**
- Create: `docs/decisions/adr-template.md`
- [ ] **Step 1: Write the template**
Create `docs/decisions/adr-template.md` with exactly:
```markdown
# ADR-NNN — <Title>: <optional clarifying subtitle>
<!-- Filename: NNN-kebab-title.md (zero-padded, monotonic, never reused).
Register a row in CLAUDE.md "Further reading" when this ADR is created.
Sections below in order. Mandatory: Status, Context, Decision, Consequences.
Delete this comment and any optional section you don't use. -->
## Status
Accepted (YYYY-MM-DD)
<!-- Lifecycle: "Accepted (YYYY-MM-DD)" → later "Superseded by ADR-NNN (YYYY-MM-DD)"
or "Deprecated (YYYY-MM-DD)" + one-line why. Optional trailing note OK, e.g.
"Accepted (2026-06-10). Doctrine ADR — pins policy, builds nothing yet." -->
## Context
<!-- The forces, the problem, what exists today, why now. -->
## Decision
<!-- What we are doing. Use numbered sub-decisions (### 1. ...) for multi-part ADRs. -->
## Consequences
<!-- Results, trade-offs explicitly accepted, follow-on work. -->
<!-- Optional sections — uncomment any that genuinely apply; never pad:
## Scope — explicit in / out-of-scope boundaries.
## Guardrails — how the decision is mechanically enforced (lint, CI, hooks).
## What was ruled out — rejected alternatives, each with its reason.
## Verified facts (ADR-014) — verified: <subject> · <tool> <version> · <source> · <YYYY-MM-DD>
## Related — links to other ADRs by number; bidirectional for Supersedes/Superseded-by.
-->
```
(HTML comments do not nest — optional sections use one flat comment block with inline
em-dash descriptions, not commented sub-hints inside an outer comment.)
- [ ] **Step 2: Confirm the template is skipped by the check**
Run: `python3 scripts/repo-scan.py 2>/dev/null | python3 -c "import json,sys; print([f for f in json.load(sys.stdin)['findings'] if f['check']=='adr-structure' and 'adr-template' in f['path']])"`
Expected: `[]` (non-numbered filename → skipped).
- [ ] **Step 3: Commit**
```bash
git add docs/decisions/adr-template.md
git commit -m "docs(adr): add adr-template.md scaffold (ADR-023)

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>"
```
---
## Task 3: ADR-023 itself
**Files:**
- Create: `docs/decisions/023-adr-structure.md`
- [ ] **Step 1: Write ADR-023**
Create `docs/decisions/023-adr-structure.md`. It must pass its own check (Status/Context/Decision/Consequences present; parseable Status line). Use this content:
```markdown
# ADR-023 — ADR structure & lifecycle
## Status
Accepted (2026-06-10). Meta/doctrine ADR — pins how ADRs are written; the
`adr-structure` check (`scripts/repo-scan.py`) and `docs/decisions/adr-template.md`
ship with it, and ADRs 001–018 were retroactively restructured to conform. Resolves
the FRICTION signal (2026-05-31) about ADR-writing policy being unsettled.
## Context
boma records architectural decisions as numbered ADRs in `docs/decisions/`, and
CLAUDE.md treats them as load-bearing. Yet no ADR said how an ADR is written. The
newest ADRs (019–022) converged on a clean shape — Status → Context → Decision →
Consequences → Related — but only by imitation. ADRs 001–018 predate it and drifted
widely: most lacked a `## Status` section entirely (016–018 carried only a trailing
build-state note), and many lacked an explicit `## Decision` or `## Consequences`
heading, their decisions spread across ad-hoc topical sections. The result was
structural drift and no uniform way to tell an active decision from a superseded or
deprecated one.
## Decision
### 1. Title & filename
Title line: `# ADR-NNN — <Title>: <optional clarifying subtitle>` (em-dash). Filename:
`NNN-kebab-title.md`, zero-padded 3-digit, monotonic, never reused — a superseded ADR
keeps its number and file. A new ADR is registered as a row in the CLAUDE.md
"Further reading" table.
### 2. Mandatory sections, in this order
- `## Status` — a lifecycle line, usually `Accepted (YYYY-MM-DD)` (see §4), plus an
optional one-line note.
- `## Context` — the forces, the problem, what exists today, why now.
- `## Decision` — what we are doing; numbered sub-decisions for multi-part ADRs.
- `## Consequences` — results, trade-offs explicitly accepted, follow-on work.
### 3. Optional sections (use only where they genuinely apply)
`## Related`, `## Scope`, `## Guardrails` / `## Enforcement`, `## What was ruled out`,
`## Verified facts (ADR-014)`.
### 4. Status lifecycle
Four states. Because boma is single-contributor and trunk-based with no review gate,
most ADRs are **born `Accepted (YYYY-MM-DD)`** — committed-to on writing. A
**`Proposed`** state exists for a genuine draft whose core direction is recorded but
whose specifics are still open for discussion (e.g. ADR-011); it is promoted to
`Accepted` once settled.
- **`Proposed (YYYY-MM-DD)`** — drafted, under discussion, not yet committed-to. May
carry open questions. Promoted to `Accepted (YYYY-MM-DD)` when decided.
- **`Accepted (YYYY-MM-DD)`** — committed-to. The common starting state.
- Replaced → old ADR's Status becomes **`Superseded by ADR-NNN (YYYY-MM-DD)`**; the new
ADR records `Supersedes ADR-MMM` in its Status and `## Related`. The link is
**bidirectional**.
- Retired with no replacement → **`Deprecated (YYYY-MM-DD)`** + a one-line reason.
**No silent rewrites.** An Accepted ADR is not edited to reverse its decision. Typo and
clarity fixes are fine; a material reversal requires a new ADR and a `Superseded by`
marker on the old one.
### 5. Template & enforcement
`docs/decisions/adr-template.md` is the scaffold for new ADRs. The `/review-repo`
command's pre-scan (`scripts/repo-scan.py`) emits an `adr-structure` finding for any
numbered ADR missing a mandatory section or with an unparseable Status line. It checks
**presence and Status, not section order** — order is a convention the template carries,
deliberately not gated, to keep enforcement lightweight (consistent with boma's other
doctrine ADRs adding no CI gate).
### 6. Retroactive conformance of the back-catalogue
ADRs 001–018 are restructured to satisfy this standard rather than grandfathered. The
restructure is **presentational** — existing headings are relabelled, regrouped, or
demoted under a `## Decision` umbrella; a dated `## Status` is added; a `## Consequences`
section is assembled from implications the ADR already states. **The substance of no
decision is changed.** This keeps the check uniform (no number threshold) and the corpus
a consistent, legible decision history.
## Consequences
- New ADRs have one obvious shape and a scaffold; structural drift stops.
- Every ADR declares its lifecycle state uniformly, and reversals are traceable.
- The whole corpus conforms; the check needs no grandfathering and stays simple.
- One-time restructure churn across ADRs 001–018 (heading reorganization + a Status and
a Consequences section per file; no decision substance changed).
- `/review-repo` grows one deterministic check; no new CI machinery.
- This ADR is the first conformant example and is held to its own check.
## What was ruled out
- **A `make lint` / CI gate for ADR structure** — heavier than the risk warrants;
the `/review-repo` check and the template suffice.
- **Machine-enforcing section order** — brittle for marginal value; left as a
template-demonstrated convention.
- **Grandfathering 001–018 from the check** — rejected in favour of restructuring the
whole corpus to conform, so the standard applies uniformly with no exceptions.
## Related
- ADR-014 — knowledge sourcing (the `Verified facts` optional section).
- ADR-019/020/021/022 — the emergent structure this ADR codifies.
- `docs/decisions/adr-template.md` — the scaffold.
- `scripts/repo-scan.py` — the `adr-structure` enforcement check.
```
- [ ] **Step 2: Confirm ADR-023 passes its own check**
Run: `python3 scripts/repo-scan.py 2>/dev/null | python3 -c "import json,sys; print([f for f in json.load(sys.stdin)['findings'] if f['check']=='adr-structure' and '023-' in f['path']])"`
Expected: `[]`.
- [ ] **Step 3: Commit**
```bash
git add docs/decisions/023-adr-structure.md
git commit -m "docs(adr): ADR-023 — ADR structure & lifecycle

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>"
```
---
## Task 4: Wire into CLAUDE.md and the review-repo command doc
**Files:**
- Modify: `CLAUDE.md` ("Further reading" table)
- Modify: `.claude/commands/review-repo.md` (the deterministic-findings description, ~lines 26–28)
- [ ] **Step 1: Add the CLAUDE.md "Further reading" row**
In `CLAUDE.md`, in the "Further reading" table, after the `Backup & disaster recovery` row, add:
```markdown
| ADR structure & lifecycle | `docs/decisions/023-adr-structure.md` |
```
- [ ] **Step 2: Mention the new check in review-repo.md**
In `.claude/commands/review-repo.md`, find (~lines 27–28):
```markdown
(roles, ADRs, runbooks, playbooks, scripts — your shard list) and **exact findings**
(markers, broken refs, unencrypted vaults). Fold these into the report verbatim.
```
Replace the parenthetical with:
```markdown
(roles, ADRs, runbooks, playbooks, scripts — your shard list) and **exact findings**
(markers, broken refs, unencrypted vaults, ADR-structure violations). Fold these into
the report verbatim.
```
- [ ] **Step 3: Verify the CLAUDE.md link resolves**
Run: `test -f docs/decisions/023-adr-structure.md && echo OK`
Expected: `OK`.
- [ ] **Step 4: Commit**
```bash
git add CLAUDE.md .claude/commands/review-repo.md
git commit -m "docs(adr): register ADR-023 and note adr-structure check

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>"
```
---
## Task 5: Retroactively restructure ADRs 001–018 to full conformance
**Goal:** every ADR in 001–018 ends with all four mandatory sections present and a
parseable Status line, so the `adr-structure` check reports zero findings — **without
changing the substance of any decision.**
**Files (current findings — the exact worklist):**
- Missing `Status` + `Consequences`: `001-architecture.md`, `002-security.md`, `004-docker-model.md`, `005-bootstrapping.md`, `014-knowledge-sourcing.md`
- Missing `Status` + `Decision` + `Consequences`: `006-terraform.md`, `007-network.md`, `008-testing.md`, `009-provisioning-handoff.md`, `010-forgejo-ci.md`, `011-update-management.md`
- Missing all four: `003-toolchain.md`
- Missing `Status` + `Decision`: `013-heritage-v4.md`
- Missing `Status` only: `012-hardware-capacity.md`, `015-control-host.md`
- Have unparseable `Status` + missing `Consequences`: `016-mesh-vpn.md`, `017-service-ui-verification.md`, `018-logging.md`
(`010`/`011` use `## Decisions` (plural) → relabel to `## Decision`. The "missing
Decision" cases generally have the decision spread across topical `##` headings.)
**THE FAITHFULNESS RULE (non-negotiable):** This is a *presentational* restructure.
You MAY: add a `## Status` section; relabel a heading (`## Decisions` → `## Decision`);
introduce a `## Decision` umbrella heading and **demote** existing topical `##` headings
to `###` beneath it; add a `## Consequences` section. You MUST NOT alter any existing
sentence of decision prose, reword arguments, or add new policy. A `## Consequences`
section is assembled **only** from implications the ADR already states (its trade-offs,
"what was ruled out", "open questions", named follow-on work). **If an ADR states
nothing that can be faithfully cast as a consequence, STOP and report it as
DONE_WITH_CONCERNS / escalate — do not invent consequences.**
**Per-file date source:** the file's first git-commit (add) date —
`git log --diff-filter=A --format=%as -- <path> | tail -1` (yields `YYYY-MM-DD`).
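The date lookup above could be scripted roughly as follows. This is a sketch, assuming it runs from the repository root with `git` on `PATH`; `oldest_date` and `first_commit_date` are hypothetical helper names, not functions in `scripts/repo-scan.py`.

```python
import subprocess


def oldest_date(git_log_output):
    """Return the last non-blank line of `git log` output (the oldest commit),
    or None when the output is empty."""
    dates = [ln.strip() for ln in git_log_output.splitlines() if ln.strip()]
    return dates[-1] if dates else None


def first_commit_date(path):
    """Hypothetical helper: YYYY-MM-DD of the commit that first added `path`."""
    out = subprocess.run(
        ["git", "log", "--diff-filter=A", "--format=%as", "--", path],
        check=True, capture_output=True, text=True,
    ).stdout
    # git log prints newest first, so the add commit is the last line (the `tail -1`).
    return oldest_date(out)
```

Returning `None` for a file with no add commit makes an untracked or renamed file visible to the caller instead of silently producing an empty date.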
- [ ] **Step 1: Add a dated `## Status` section to each ADR**
For 001–015 (no Status today): insert, between the title line and the first `##`
heading, a Status section:
```markdown
## Status
Accepted (<d>)
```
where `<d>` is the file's first-git-commit date. For 016/017/018 (unparseable Status
today): prepend a parseable `Accepted (<d>). ` clause to the first line of their
existing `## Status` section so the build-state note becomes its tail, e.g.
`Accepted (2026-06-05). Designed. **Authorable now:** ...`.
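For the 001–015 case, the insertion can be sketched as a pure function over a file's lines. `insert_status` is a hypothetical name, and the sketch assumes no `## ` line appears inside a fenced code block before the first real heading; it is an illustration, not the tool the restructure actually used.

```python
import re

STATUS_HEADING_RE = re.compile(r"^##\s+Status\b")


def insert_status(lines, date):
    """Return a copy of `lines` with a '## Status' section placed before the
    first '## ' heading. Files that already have '## Status' come back unchanged."""
    if any(STATUS_HEADING_RE.match(ln) for ln in lines):
        return list(lines)
    block = ["## Status\n", "\n", f"Accepted ({date})\n", "\n"]
    for i, ln in enumerate(lines):
        if ln.startswith("## "):
            return lines[:i] + block + lines[i:]
    # No second-level heading at all: append the section at the end.
    return list(lines) + ["\n"] + block
```

Because existing lines are only moved around the inserted block and never rewritten, the output differs from the input by added lines alone, which is what the faithfulness rule asks for.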
- [ ] **Step 2: Ensure a `## Decision` section exists**
For ADRs flagged "missing Decision" (003, 006, 007, 008, 009, 010, 011, 013): relabel a
plural/synonym heading where one exists (`## Decisions` → `## Decision` in 010/011), or
introduce a `## Decision` umbrella immediately after `## Context` and demote the existing
topical `##` body headings (e.g. in 003: "Execution engine", "Python environment", …) to
`###`. Do not move or rewrite the prose under them.
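The umbrella-and-demote move can be sketched the same way. The `KEEP` tuple below is an assumption drawn from the section names listed in this plan, and `add_decision_umbrella` is a hypothetical name; a heading that begins with one of those names stays at `##`, every other `##` heading after `## Context` is treated as topical and demoted.

```python
import re

# Section names that stay at the ## level (mandatory and optional sections).
KEEP = ("Status", "Context", "Decision", "Consequences", "Related", "Scope",
        "Guardrails", "Enforcement", "What was ruled out", "Verified facts")


def _title(line):
    """Heading text of a second-level heading, or None for any other line."""
    m = re.match(r"^##\s+(.*\S)", line)
    return m.group(1) if m else None


def add_decision_umbrella(lines):
    """Insert '## Decision' before the first topical ## heading after Context
    and demote topical ## headings to ###. Prose lines are copied untouched."""
    if any(_title(ln) == "Decision" for ln in lines):
        return list(lines)
    out, seen_context, inserted = [], False, False
    for ln in lines:
        title = _title(ln)
        if title is None or title.startswith(KEEP):
            seen_context = seen_context or (title or "").startswith("Context")
            out.append(ln)
            continue
        if seen_context and not inserted:
            out += ["## Decision\n", "\n"]
            inserted = True
        out.append("#" + ln if inserted else ln)
    return out
```

A plural `## Decisions` heading is left alone here because it starts with a kept name; the relabel for 010/011 is a separate one-line change.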
- [ ] **Step 3: Ensure a `## Consequences` section exists**
For every ADR flagged "missing Consequences" (001, 002, 003, 004, 005, 006, 007, 008,
009, 010, 011, 014, 016, 017, 018): add a `## Consequences` section near the end,
assembled strictly from implications the ADR already states. Where an ADR has a trailing
section that *is* consequences under another name (e.g. "What was ruled out", "Open
questions", "Trade-offs"), you may keep that section and add a short `## Consequences`
that references/summarizes the already-stated trade-offs — without introducing new
claims. **Honour the faithfulness rule; escalate any ADR where no faithful Consequences
can be drawn.**
- [ ] **Step 4: Verify the whole corpus passes the check**
Run: `python3 scripts/repo-scan.py 2>/dev/null | python3 -c "import json,sys; v=[f for f in json.load(sys.stdin)['findings'] if f['check']=='adr-structure']; print('adr-structure findings:', len(v)); [print(' ', f['path'], '—', f['detail']) for f in v]"`
Expected: `adr-structure findings: 0`.
- [ ] **Step 5: Verify faithfulness via diff**
Run: `git diff --stat` and spot-check `git diff docs/decisions/003-toolchain.md`.
Expected: changes are heading additions/relabels/level-demotions, a new Status section,
and a new Consequences section — **no edits to existing decision sentences.**
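The spot-check can be backed by a rough mechanical audit of the diff text. The sketch below, with the hypothetical name `unfaithful_removals`, reports every removed line that is not a Markdown heading; it assumes unified-diff input such as the output of `git diff`, and each hit is a prompt for manual review rather than proof of a violation (the 016–018 Status-line prepend, for example, is an expected hit).

```python
def unfaithful_removals(diff_text):
    """Return the removed lines of a unified diff that are not Markdown headings.

    A purely presentational restructure should remove only heading lines
    (relabelled or demoted), so any prose line listed here needs a human look.
    """
    suspicious = []
    for ln in diff_text.splitlines():
        # Skip file headers ("--- a/path") and everything that is not a removal.
        if ln.startswith("--- ") or not ln.startswith("-"):
            continue
        body = ln[1:]
        if body.strip() and not body.lstrip().startswith("#"):
            suspicious.append(body)
    return suspicious
```

It would be fed per file, for example with the text of `git diff docs/decisions/003-toolchain.md`.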
- [ ] **Step 6: Run the repo-scan test suite**
Run: `.venv/bin/pytest tests/test_repo_scan.py -q`
Expected: PASS — 5 passed.
- [ ] **Step 7: Commit**
```bash
# Every numbered ADR filename is zero-padded and starts with "0", so one glob covers 001-018;
# a second "1*.md" pathspec would match nothing and make `git add` fail.
git add docs/decisions/0*.md
git commit -m "docs(adr): restructure ADRs 001-018 to ADR-023 conformance

Presentational only: add a dated Status section, relabel/regroup headings
under Decision, and add a Consequences section assembled from each ADR's
already-stated implications. No decision substance changed.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>"
```
---
## Final verification (after all tasks)
- [ ] **Lint:** `make lint` — Expected: passes (docs + a stdlib script touched; ansible content unchanged).
- [ ] **Full deterministic scan clean for our check:** `python3 scripts/repo-scan.py 2>/dev/null | python3 -c "import json,sys; print('adr-structure:', sum(1 for f in json.load(sys.stdin)['findings'] if f['check']=='adr-structure'))"` → `adr-structure: 0`.
- [ ] **Tests green:** `.venv/bin/pytest tests/ -q` → all pass.
- [ ] **Branch ready:** invoke `superpowers:finishing-a-development-branch` to merge `feat/adr-structure` to `main` (trunk-based, no PR) and delete the branch.
---
## Self-review notes
- **Spec coverage:** §1 title/filename → Task 3 + template; §2 sections → Tasks 2/3 + check; §3 lifecycle → Task 3; §4 cross-refs → Task 3 `## Related`; §5 template → Task 2; §6 retroactive restructure → Task 5; §7 enforcement → Task 1 + Task 4. All covered.
- **Order nuance:** spec says sections come "in this order"; the check enforces presence + Status only. This is intentional and stated in both the spec's enforcement wording ("the four mandatory sections and a parseable Status line") and ADR-023's Decision §5 / "What was ruled out". Not a gap.
- **Type/name consistency:** `adr_structure_findings` and the `"adr-structure"` check key are used identically in the function, the `scan()` wiring, the tests, and both verification one-liners.


@ -0,0 +1,164 @@
# Design — ADR structure & lifecycle
- **Date:** 2026-06-10
- **Status:** Approved design — implementation plan to follow
- **Resolves:** the absence of a written standard for how ADRs in
`docs/decisions/` are structured. The newest ADRs (019–022) have converged on a
clean pattern (`Status` → `Context` → `Decision` → `Consequences` → `Related`),
but it lives only as imitation; ADRs 001–018 predate it and most lack a `Status`
section.
- **Becomes:** ADR-023 (this design is the basis for that ADR).
- **Reuses:** boma's existing `*-template.md` convention (`service-security-template.md`,
`service-verify-template.md`, `service-access-template.md`, `service-backup-template.md`);
ADR-014 (knowledge-sourcing → the optional `Verified facts` section); ADR-019/020/021/022
(the emergent structure being codified); the `/review-repo` command (enforcement home).
---
## Problem
boma documents architectural decisions as numbered ADRs in `docs/decisions/`, and
CLAUDE.md treats them as load-bearing ("Before assuming a role, provider, or pipeline
exists, check STATUS.md"; the entire "Further reading" table points into them). Yet
there is no ADR that says how an ADR is written. The result:
- **Structural drift.** ADRs 001–018 are freeform; 019–022 converged on a consistent
shape but only by imitation. A new ADR's structure depends on which existing one the
author happened to copy.
- **No status discipline.** Most early ADRs have no `## Status` section, so there is no
uniform way to tell an active decision from a superseded or deprecated one — and no
written rule for how a decision gets reversed without silently rewriting history.
- **No scaffold.** Every other recurring document type in boma has a template
(`service-security-template.md`, etc.). ADRs do not.
This design codifies the structure 019–022 already demonstrate, pins a status
lifecycle, ships a template, and reconciles the back-catalogue.
## Scope
- **In:** the canonical section set (mandatory + optional); title and filename
convention; the `Proposed / Accepted / Superseded / Deprecated` status lifecycle and the
no-silent-rewrite rule; cross-reference convention; an ADR template file; a
lightweight `/review-repo` structure check; a **one-time retroactive restructure of
ADRs 001–018** to full conformance (all four mandatory sections + a parseable Status
line), reorganizing existing content under canonical headings.
- **Out (for now):** *changing the substance of* any existing decision (the restructure
is presentational — relabel/regroup/demote existing content, add a dated Status, never
alter what was decided); a `make lint` / CI gate for ADR structure (explicitly
rejected in favour of the `/review-repo` check — consistent with boma's other doctrine
ADRs, which add no CI gate); grandfathering pre-convention ADRs from the check
(rejected — the whole corpus is brought to conformance instead).
The lifecycle uses four states — `Proposed / Accepted / Superseded / Deprecated`. An
earlier draft of this design omitted `Proposed`, but ADR-011 (a real draft with open
questions) is evidence boma occasionally needs it, so it was kept.
## Decision
### 1. Title & filename
- Title line: `# ADR-NNN — <Title>: <optional clarifying subtitle>` (em-dash `—`,
matching every existing ADR).
- Filename: `NNN-kebab-title.md`, zero-padded 3-digit, monotonic, **never reused**
(a superseded ADR keeps its number and file).
- A new ADR is registered as a row in the CLAUDE.md "Further reading" table.
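The filename rules above lend themselves to a small validation sketch. The strict kebab-case pattern and the name `filename_problems` are assumptions made for illustration (the pattern is tighter than "starts with three digits"); the function sees only filenames, so it can catch a malformed name or a reused number but cannot prove numbers were assigned monotonically over time.

```python
import re

# Assumed pattern: three digits, then lowercase kebab-case words, then ".md".
ADR_NAME_RE = re.compile(r"^(\d{3})-[a-z0-9]+(?:-[a-z0-9]+)*\.md$")


def filename_problems(names):
    """Return a list of problems for numbered ADR filenames.

    Names that do not start with a digit (e.g. adr-template.md) are skipped,
    since skeletons do not take an ADR number.
    """
    problems, seen = [], {}
    for name in sorted(names):
        if not name[:1].isdigit():
            continue
        m = ADR_NAME_RE.match(name)
        if not m:
            problems.append(f"{name}: not NNN-kebab-title.md")
            continue
        num = m.group(1)
        if num in seen:
            problems.append(f"{name}: number {num} already used by {seen[num]}")
        seen.setdefault(num, name)
    return problems
```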
### 2. Canonical sections
**Mandatory — every ADR, in this order:**
| Section | Holds |
|---|---|
| `## Status` | `Accepted (YYYY-MM-DD)`, plus an optional one-line note (what it resolves/supersedes, or a doctrine-not-yet-built caveat as ADR-022 uses) |
| `## Context` | the forces, the problem, what exists today, why now |
| `## Decision` | what we are doing — numbered sub-decisions for multi-part ADRs, as 020/021/022 do |
| `## Consequences` | results, trade-offs *explicitly accepted*, follow-on work |
**Optional — use only where genuinely applicable, never as padding:**
- `## Related` — links to other ADRs by number.
- `## Scope` — explicit in/out-of-scope boundaries.
- `## Guardrails` / `## Enforcement` — how the decision is mechanically enforced
(lint, CI, hooks).
- `## What was ruled out` — rejected alternatives, each with its reason.
- `## Verified facts (ADR-014)` — version-stamped facts per the knowledge-sourcing rule.
### 3. Status lifecycle
Four states. Most ADRs are **born `Accepted (YYYY-MM-DD)`** — the sole author commits
to it on writing (boma is single-contributor and trunk-based with no review gate).
- **`Proposed (YYYY-MM-DD)`** — a genuine draft whose core direction is recorded but
whose specifics are still open (e.g. ADR-011, which carries open questions). Promoted
to `Accepted (YYYY-MM-DD)` once settled.
- **`Accepted (YYYY-MM-DD)`** — committed-to; the common starting state.
- Replaced by a later decision → the old ADR's Status becomes
**`Superseded by ADR-NNN (YYYY-MM-DD)`**; the superseding ADR records
`Supersedes ADR-MMM` in its own `## Status` and `## Related`. The link is
**bidirectional** — both files must point at each other.
- Retired with no replacement → **`Deprecated (YYYY-MM-DD)`** plus a one-line reason.
**Load-bearing rule — no silent rewrites.** An `Accepted` ADR is not edited to reverse
its decision. Typo and clarity fixes are fine; a *material reversal* requires a new ADR
and a `Superseded by` marker on the old one. The history of decisions stays legible.
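The bidirectional-link requirement can be illustrated with a short sketch. It is not part of the `adr-structure` check described in this design; `one_way_links` is a hypothetical name, and the input is assumed to be a mapping from three-digit ADR number to the file's full text.

```python
import re

SUPERSEDED_BY_RE = re.compile(r"Superseded by ADR-(\d{3})")
SUPERSEDES_RE = re.compile(r"Supersedes ADR-(\d{3})")


def one_way_links(adrs):
    """adrs: {"NNN": text}. Return (old, new, reason) tuples for supersession
    links that are recorded in only one of the two files."""
    problems = []
    for num, text in sorted(adrs.items()):
        # The old ADR points forward, so the new one must point back.
        for new in SUPERSEDED_BY_RE.findall(text):
            if num not in SUPERSEDES_RE.findall(adrs.get(new, "")):
                problems.append((num, new, "missing 'Supersedes' in newer ADR"))
        # The new ADR points back, so the old one must point forward.
        for old in SUPERSEDES_RE.findall(text):
            if num not in SUPERSEDED_BY_RE.findall(adrs.get(old, "")):
                problems.append((old, num, "missing 'Superseded by' in older ADR"))
    return problems
```

Placeholder text such as `Superseded by ADR-NNN` is ignored because the patterns require three digits, so an ADR that merely describes the convention does not trip the check.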
### 4. Cross-references
Reference other ADRs by number inline (`ADR-019`), and collect the relationships in a
`## Related` section.
### 5. Template file
Ship `docs/decisions/adr-template.md` — consistent with boma's existing
`*-template.md` convention. It contains the mandatory section headers pre-filled with
short HTML-comment hints, and the optional sections listed as commented stubs to
uncomment when relevant. It is a skeleton, not a numbered decision, so it does not take
an ADR number.
### 6. Retroactive restructure (001–018)
A **separate step** after the ADR and template land: bring every pre-convention ADR to
full conformance — all four mandatory sections present and a parseable Status line. This
is a **presentational** restructure, governed by a strict faithfulness rule:
- **Add** a `## Status` section valued `Accepted (YYYY-MM-DD)`, the date reconstructed
from the file's **first git-commit date**. For 016–018, whose existing trailing
build-state note is unparseable, prepend the dated `Accepted (...)` clause so the note
becomes a parseable Status line's tail.
- **Reorganize** existing content under the canonical headings: relabel a synonym
(`## Decisions` → `## Decision`), or introduce a `## Decision` umbrella and **demote**
the existing topical `##` headings to `###` beneath it. No sentence of existing prose
is altered.
- **Add** a `## Consequences` section built **only** from implications the ADR already
states (trade-offs, "what was ruled out", "open questions", follow-on work already
named). If an ADR genuinely states nothing that can be faithfully cast as a
consequence, that file is escalated for a human decision rather than inventing one.
- **Never** change the substance of a decision. A `git diff` of the restructure should
show heading-level changes, a new Status section, and a Consequences section assembled
from existing material, not edits to existing argument.

ADRs already conformant (019–022) are left alone. End state: the `adr-structure` check
reports zero findings across the whole corpus, with no grandfathering.
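
The faithfulness rule can be spot-checked before a human reads the diff. The sketch below is an illustration under stated assumptions, not tooling from this repository: `altered_prose` and `HEADING_RE` are hypothetical names, and the comparison is a plain line-by-line match that exempts headings because they may be relabelled or demoted.

```python
import re

# Markdown ATX headings (# through ######); these may legitimately change level.
HEADING_RE = re.compile(r"^#{1,6}\s+")


def altered_prose(before, after):
    """Return non-blank, non-heading lines of `before` that no longer appear in `after`."""
    kept = {line.strip() for line in after}
    return [line.strip() for line in before
            if line.strip()
            and not HEADING_RE.match(line)
            and line.strip() not in kept]
```

Any line it returns was edited, rewrapped, or dropped, so it deserves a look. A deliberate change, such as prepending `Accepted (...)` to an existing build-state note, will also show up and is expected to be confirmed by hand.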
### 7. Enforcement
Lightweight, no CI gate. The `/review-repo` command gains an ADR-structure check:
every file in `docs/decisions/` matching `NNN-*.md` has the four mandatory sections and
a parseable `## Status` line. The template carries the convention forward for new ADRs.
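
To make "parseable" concrete, here is a small sketch built around the Status pattern that this change adds to the scan script. The helper `is_parseable_status` is a hypothetical wrapper added for illustration; the regex itself is copied from the check.

```python
import re

# Pattern as defined for the adr-structure check (ADR-023).
ADR_STATUS_LINE_RE = re.compile(
    r"^(Proposed \(\d{4}-\d{2}-\d{2}\)"
    r"|Accepted \(\d{4}-\d{2}-\d{2}\)"
    r"|Superseded by ADR-\d{3} \(\d{4}-\d{2}-\d{2}\)"
    r"|Deprecated \(\d{4}-\d{2}-\d{2}\))")


def is_parseable_status(line):
    """True if the line starts with one of the four dated lifecycle states."""
    return bool(ADR_STATUS_LINE_RE.match(line.strip()))
```

Because the pattern is anchored only at the start, a line such as `Accepted (2026-06-10). Designed, not built.` still parses, which is what lets a trailing build-state note remain as the Status line's tail. A `Superseded by ADR-100` line without a date does not parse.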
## Consequences
- New ADRs have one obvious shape and a scaffold to start from; structural drift stops.
- Every ADR declares its lifecycle state uniformly, and reversals are traceable rather
than silent — the back-catalogue becomes a legible decision history.
- One-time churn: a restructure touching ~18 files (heading reorganization + a Status
section + a Consequences section per file). Larger and more judgment-heavy than a
Status-only backfill, hence the faithfulness rule and per-file review.
- The whole corpus conforms — the check needs no grandfathering or number threshold, and
stays simple (presence + parseable Status, applied uniformly).
- `/review-repo` grows a new check; no new CI machinery, matching boma's habit of not
gating doctrine in CI.
- This ADR is itself the first conformant example — it must follow its own structure.
## Open questions
None outstanding — title/filename, the **4-state lifecycle** (`Proposed / Accepted /
Superseded / Deprecated`; `Proposed` adopted on the evidence of ADR-011), template name
(`adr-template.md`), enforcement (`/review-repo`, no CI gate), and the **full
retroactive restructure** of 001–018 (no grandfathering) were all confirmed during
brainstorming and execution.


@@ -41,6 +41,17 @@ LIST_ITEM_RE = re.compile(r"^\s*(\d+\.|[-*+])\s+(.*)")
DEFER_REF_RE = re.compile(r"ADR-(\d{3})\D{0,40}?deferred\D{0,12}?(\d+)", re.I)
RESOLVE_WORD_RE = re.compile(r"\b(?:resolv\w*|decid\w*|address\w*|complet\w*|done)\b", re.I)
# ADR-structure check (ADR-023): numbered ADRs must carry the four mandatory
# sections and a parseable Status line. Presence only — section ORDER is a
# template-demonstrated convention, not machine-enforced.
ADR_FILE_RE = re.compile(r"^\d{3}-.*\.md$")
ADR_REQUIRED_SECTIONS = ("Status", "Context", "Decision", "Consequences")
ADR_STATUS_LINE_RE = re.compile(
    r"^(Proposed \(\d{4}-\d{2}-\d{2}\)"
    r"|Accepted \(\d{4}-\d{2}-\d{2}\)"
    r"|Superseded by ADR-\d{3} \(\d{4}-\d{2}-\d{2}\)"
    r"|Deprecated \(\d{4}-\d{2}-\d{2}\))")
def _is_defer_heading(text):
    t = text.strip().lower()
@@ -95,6 +106,42 @@ def deferred_findings(adr_files, defer_refs):
    return out
def adr_structure_findings(adr_files):
    """adr_files: {rel_path: [lines]} for docs/decisions/*.md.
    Flags numbered ADRs (NNN-*.md) missing a mandatory section or whose Status
    section has no parseable lifecycle line. Non-numbered files (e.g.
    adr-template.md) are skipped. Section order is NOT checked (ADR-023)."""
    out = []
    for rpath, lines in sorted(adr_files.items()):
        if not ADR_FILE_RE.match(os.path.basename(rpath)):
            continue
        headings = {}
        for i, line in enumerate(lines):
            m = re.match(r"^##\s+(\w+)", line)
            if m:
                headings.setdefault(m.group(1), i)
        missing = [s for s in ADR_REQUIRED_SECTIONS if s not in headings]
        if missing:
            out.append({"check": "adr-structure", "severity": "medium",
                        "path": rpath, "line": 1,
                        "detail": f"missing mandatory section(s): {', '.join(missing)}"})
        if "Status" in headings:
            body = []
            for line in lines[headings["Status"] + 1:]:
                if line.startswith("## "):
                    break
                body.append(line)
            status_text = next((ln.strip() for ln in body if ln.strip()), "")
            if not ADR_STATUS_LINE_RE.match(status_text):
                out.append({"check": "adr-structure", "severity": "medium",
                            "path": rpath, "line": headings["Status"] + 1,
                            "detail": "Status not parseable (want 'Proposed (YYYY-MM-DD)', "
                                      "'Accepted (YYYY-MM-DD)', 'Superseded by ADR-NNN "
                                      "(YYYY-MM-DD)', or 'Deprecated (YYYY-MM-DD)'); "
                                      f"got: {status_text[:60]!r}"})
    return out
def walk_files():
    for dirpath, dirnames, filenames in os.walk(ROOT):
        dirnames[:] = [d for d in dirnames if d not in PRUNE]
@@ -213,6 +260,7 @@ def scan():
                findings.append({"check": "broken-path-ref", "severity": "medium", "path": rpath,
                                 "line": i, "detail": f"references '{ref}' which does not exist"})
    findings.extend(deferred_findings(adr_files, defer_refs))
    findings.extend(adr_structure_findings(adr_files))
    return findings

tests/test_repo_scan.py

@@ -0,0 +1,59 @@
import importlib.util
import pathlib

_PATH = pathlib.Path(__file__).resolve().parent.parent / "scripts" / "repo-scan.py"
_spec = importlib.util.spec_from_file_location("repo_scan", _PATH)
rs = importlib.util.module_from_spec(_spec)
_spec.loader.exec_module(rs)

GOOD = [
    "# ADR-099 — Example\n", "\n",
    "## Status\n", "\n", "Accepted (2026-06-10)\n", "\n",
    "## Context\n", "\n", "Why.\n", "\n",
    "## Decision\n", "\n", "What.\n", "\n",
    "## Consequences\n", "\n", "So what.\n",
]


def _checks(findings):
    return [f for f in findings if f["check"] == "adr-structure"]


def test_good_adr_has_no_findings():
    out = rs.adr_structure_findings({"docs/decisions/099-example.md": GOOD})
    assert _checks(out) == []


def test_missing_mandatory_section_is_flagged():
    lines = [ln for ln in GOOD if not ln.startswith("## Consequences")]
    out = _checks(rs.adr_structure_findings({"docs/decisions/099-example.md": lines}))
    assert len(out) == 1
    assert "Consequences" in out[0]["detail"]


def test_unparseable_status_is_flagged():
    lines = [("Designed, not built.\n" if ln == "Accepted (2026-06-10)\n" else ln)
             for ln in GOOD]
    out = _checks(rs.adr_structure_findings({"docs/decisions/099-example.md": lines}))
    assert len(out) == 1
    assert "Status not parseable" in out[0]["detail"]


def test_superseded_status_is_accepted():
    lines = [("Superseded by ADR-100 (2026-06-11)\n" if ln == "Accepted (2026-06-10)\n"
              else ln) for ln in GOOD]
    out = _checks(rs.adr_structure_findings({"docs/decisions/099-example.md": lines}))
    assert out == []


def test_proposed_status_is_accepted():
    lines = [("Proposed (2026-06-04)\n" if ln == "Accepted (2026-06-10)\n"
              else ln) for ln in GOOD]
    out = _checks(rs.adr_structure_findings({"docs/decisions/099-example.md": lines}))
    assert out == []


def test_non_numbered_file_is_skipped():
    bare = ["# ADR template\n", "\n", "## Status\n", "\n", "<!-- hint -->\n"]
    out = _checks(rs.adr_structure_findings({"docs/decisions/adr-template.md": bare}))
    assert out == []