review-repo: clear O7-O12 clarity items

- ADR-011: ruled-out row was "digest-pinning stateful" (contradicted Decision 2);
  now "digest-only (no readable tag)" — tag@digest is adopted (O7)
- ADR-003/010: act_runner names ubongo as the runner host, runner VM as a future
  option (O8)
- ADR-008: WireGuard Molecule-exclusion row reframed to NetBird wt0 data plane (O9)
- ADR-011: scheduled_jobs xref points to TODO 8.3, not ADR-010 (O10)
- CAPABILITIES: add /verify-service Level 4 capability row (O11)
- TODO 3.10: rewrite the garbled base-container question (O12)

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
sjat 2026-06-05 19:28:07 +02:00
parent 8e4bf3dd88
commit db76be2a63
6 changed files with 10 additions and 8 deletions


@@ -109,6 +109,7 @@ _(DHCP, firewall, mDNS reflection live on OPNsense — Ansible-managed, not cont
 | Update watcher | DIUN | S | planned | New-image alerts driving the update process | ADR-011 |
 | Scheduled jobs | `scheduled_jobs` role + `claude -p` jobs | S | planned | Declarative cron: `/review-repo`, security/capacity reviews, sanity checks | TODO 8 |
 | Sanity / smoke | whoami + health checks | S | planned | Verification endpoints + "is it actually working" checks | ADR-011 / TODO 8.2 |
+| Service-UI verification | `/verify-service` skill | S | planned | Claude-driven exploratory Level 4 acceptance check of a deployed service's UI | Decided (ADR-017); running deferred on ubongo + playwright + Authentik |
 ---


@@ -27,7 +27,7 @@
 7. Define a tagging standard that lets us target runs without over-tagging.
 8. Ensure the right things are backed up (incl. database dumps if we land on PBS).
 9. Decide: a central database server, or individual database services per app?
-10. Should we continue to use the base-container method, or maybe something in the improvements of the methods in boma moods the point?
+10. Should we keep the custom base-container (Molecule test image) method for role testing, or revisit it as boma's testing approach matures (ADR-008)?
 11. Deliberate tagging strategy.
 4. **Split-horizon FQDN** — adopt split-horizon FQDN with or without nyumbani?


@@ -82,7 +82,8 @@ Config files: `.ansible-lint`, `.yamllint` in repo root.
 2. On green → deploy to staging
 3. [manual promote gate] → deploy to production
-`act_runner` runs as a Docker container on the control node or a dedicated runner VM.
+`act_runner` runs as a Docker container on `ubongo` (the control node — ADR-015), or on
+a dedicated runner VM later if CI load warrants a separate host.
 ---


@@ -145,7 +145,7 @@ Level 2 (staging) or Level 3 (external). This is a conscious, documented decisio
 | Capability | Reason not testable in Molecule |
 |---|---|
 | `nftables` rule loading | Requires `nf_tables` kernel module; not available in Docker |
-| WireGuard tunnel establishment | Requires `wireguard` kernel module |
+| NetBird mesh data plane (`wt0` WireGuard interface) | Requires the `wireguard` kernel module; Molecule checks only that the agent is installed/configured (ADR-016) |
 | `unattended-upgrades` behaviour | Installs correctly; actual upgrade behaviour requires a real apt environment |
 | DHCP behaviour (OPNsense) | OPNsense is managed by Ansible but not testable in a container |
 | mDNS reflector (Avahi cross-VLAN) | Requires real network interfaces and VLANs |


@@ -63,8 +63,8 @@ Trunk-based, matching ADR-003 / ADR-008:
 push to main → lint + Molecule → deploy staging → [manual gate] → deploy production
 ```
-Runner: `act_runner` on the control node or a dedicated runner VM. Actions is not
-yet enabled — see STATUS.md.
+Runner: `act_runner` on `ubongo` (the control node — ADR-015), or a dedicated runner VM
+later if CI load warrants a separate host. Actions is not yet enabled — see STATUS.md.
 ---


@@ -64,8 +64,8 @@ Because these are primarily Proxmox VMs, take a **VM snapshot before the Friday
 ### 5. Stateful upgrades — 8-weekly analysis, human-gated, backup-first
 Stateful services are **never** touched by the weekly run. Instead, **every 8 weeks**
-an automated analysis job (a scheduled `claude -p`, per the `scheduled_jobs` plan and
-ADR-010) does:
+an automated analysis job (a scheduled `claude -p`, per the `scheduled_jobs` design in
+`docs/TODO.md` 8.3, not yet built) does:
 1. Read changelogs / breaking-change notes for each pinned stateful image; diff the
 pinned tag against what's available.
@@ -125,7 +125,7 @@ alert-driven.
 | -------------------------------------- | ----------------------------------------------------------------------------- |
 | One uniform policy for all services | Ignores blast radius; stateful data loss ≠ stateless re-pull. |
 | Rolling `latest` for stateful services | Unattended schema/migration changes are how you lose data. |
-| Digest-pinning the stateful tier | Unreadable in diffs; snapshot-before + backups give the immutability instead. |
+| Digest-_only_ pin (no readable tag) for stateful | Unreadable in diffs — the tiered rule pins `tag@digest` (readable tag *and* digest) instead (Decision 2). |
 | Pinning the stateless tier | No durable data to protect; pins just add churn DIUN already covers. |
 | Auto-updating stateful on a timer | Must be human-gated and backup-first; only the _analysis_ is automated. |
 | Updating the whole fleet at once | Simultaneous reboots hide which host/phase actually broke. |
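
The ruled-out row above separates a digest-only pin from the adopted `tag@digest` form. As a minimal sketch of that distinction (not part of this commit), the Python below classifies an image reference by pin style. The function names, the style labels, and the example image names are hypothetical, and the parser only handles the common `name[:tag][@sha256:...]` shape.

```python
import re
from typing import NamedTuple, Optional

# Assumption: only sha256 digests are accepted in this sketch.
DIGEST_RE = re.compile(r"^sha256:[0-9a-f]{64}$")


class ImageRef(NamedTuple):
    name: str
    tag: Optional[str]
    digest: Optional[str]


def parse_image_ref(ref: str) -> ImageRef:
    """Split 'name[:tag][@digest]' into its parts (hypothetical helper)."""
    name_tag, _, digest = ref.partition("@")
    if digest and not DIGEST_RE.match(digest):
        raise ValueError(f"not a sha256 digest: {digest!r}")
    # Only a ':' after the last '/' separates the tag; an earlier colon
    # belongs to a registry port such as 'registry.local:5000'.
    last_segment_start = name_tag.rfind("/") + 1
    colon = name_tag.find(":", last_segment_start)
    if colon == -1:
        name, tag = name_tag, None
    else:
        name, tag = name_tag[:colon], name_tag[colon + 1:]
    return ImageRef(name, tag or None, digest or None)


def pin_style(ref: str) -> str:
    """Label a reference with one of four illustrative pin styles."""
    image = parse_image_ref(ref)
    if image.digest and image.tag:
        return "tag@digest"    # readable tag and immutable digest
    if image.digest:
        return "digest-only"   # immutable, but unreadable in diffs
    if image.tag and image.tag != "latest":
        return "tag-only"
    return "rolling"           # no tag, or 'latest'
```

A check for the stateful tier could then reject any reference whose `pin_style` is not `"tag@digest"`, while leaving stateless services free to use rolling tags.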