review-repo: clear O7-O12 clarity items

- ADR-011: ruled-out row was "digest-pinning stateful" (contradicted Decision 2); now "digest-only (no readable tag)", since tag@digest is adopted (O7)
- ADR-003/010: act_runner names ubongo as the runner host, runner VM as a future option (O8)
- ADR-008: WireGuard Molecule-exclusion row reframed to NetBird wt0 data plane (O9)
- ADR-011: scheduled_jobs xref points to TODO 8.3, not ADR-010 (O10)
- CAPABILITIES: add /verify-service Level 4 capability row (O11)
- TODO 3.10: rewrite the garbled base-container question (O12)

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
parent 8e4bf3dd88
commit db76be2a63
6 changed files with 10 additions and 8 deletions
@@ -109,6 +109,7 @@ _(DHCP, firewall, mDNS reflection live on OPNsense — Ansible-managed, not cont
 | Update watcher | DIUN | S | planned | New-image alerts driving the update process | ADR-011 |
 | Scheduled jobs | `scheduled_jobs` role + `claude -p` jobs | S | planned | Declarative cron: `/review-repo`, security/capacity reviews, sanity checks | TODO 8 |
 | Sanity / smoke | whoami + health checks | S | planned | Verification endpoints + "is it actually working" checks | ADR-011 / TODO 8.2 |
+| Service-UI verification | `/verify-service` skill | S | planned | Claude-driven exploratory Level 4 acceptance check of a deployed service's UI | Decided (ADR-017); running deferred on ubongo + playwright + Authentik |

 ---

@@ -27,7 +27,7 @@
 7. Define a tagging standard that lets us target runs without over-tagging.
 8. Ensure the right things are backed up (incl. database dumps if we land on PBS).
 9. Decide: a central database server, or individual database services per app?
-10. Should we continue to use the base-container method, or maybe something in the improvements of the methods in boma moods the point?
+10. Should we keep the custom base-container (Molecule test image) method for role testing, or revisit it as boma's testing approach matures (ADR-008)?
 11. Deliberate tagging strategy.

 4. **Split-horizon FQDN** — adopt split-horizon FQDN with or without nyumbani?

@@ -82,7 +82,8 @@ Config files: `.ansible-lint`, `.yamllint` in repo root.
 2. On green → deploy to staging
 3. [manual promote gate] → deploy to production

-`act_runner` runs as a Docker container on the control node or a dedicated runner VM.
+`act_runner` runs as a Docker container on `ubongo` (the control node — ADR-015), or on
+a dedicated runner VM later if CI load warrants a separate host.

 ---

@@ -145,7 +145,7 @@ Level 2 (staging) or Level 3 (external). This is a conscious, documented decisio
 | Capability | Reason not testable in Molecule |
 |---|---|
 | `nftables` rule loading | Requires `nf_tables` kernel module; not available in Docker |
-| WireGuard tunnel establishment | Requires `wireguard` kernel module |
+| NetBird mesh data plane (`wt0` WireGuard interface) | Requires the `wireguard` kernel module; Molecule checks only that the agent is installed/configured (ADR-016) |
 | `unattended-upgrades` behaviour | Installs correctly; actual upgrade behaviour requires a real apt environment |
 | DHCP behaviour (OPNsense) | OPNsense is managed by Ansible but not testable in a container |
 | mDNS reflector (Avahi cross-VLAN) | Requires real network interfaces and VLANs |

@@ -63,8 +63,8 @@ Trunk-based, matching ADR-003 / ADR-008:
 push to main → lint + Molecule → deploy staging → [manual gate] → deploy production
 ```

-Runner: `act_runner` on the control node or a dedicated runner VM. Actions is not
-yet enabled — see STATUS.md.
+Runner: `act_runner` on `ubongo` (the control node — ADR-015), or a dedicated runner VM
+later if CI load warrants a separate host. Actions is not yet enabled — see STATUS.md.

 ---

@@ -64,8 +64,8 @@ Because these are primarily Proxmox VMs, take a **VM snapshot before the Friday
 ### 5. Stateful upgrades — 8-weekly analysis, human-gated, backup-first

 Stateful services are **never** touched by the weekly run. Instead, **every 8 weeks**
-an automated analysis job (a scheduled `claude -p`, per the `scheduled_jobs` plan and
-ADR-010) does:
+an automated analysis job (a scheduled `claude -p`, per the `scheduled_jobs` design in
+`docs/TODO.md` 8.3, not yet built) does:

 1. Read changelogs / breaking-change notes for each pinned stateful image; diff the
 pinned tag against what's available.

@@ -125,7 +125,7 @@ alert-driven.
 | -------------------------------------- | ----------------------------------------------------------------------------- |
 | One uniform policy for all services | Ignores blast radius; stateful data loss ≠ stateless re-pull. |
 | Rolling `latest` for stateful services | Unattended schema/migration changes are how you lose data. |
-| Digest-pinning the stateful tier | Unreadable in diffs; snapshot-before + backups give the immutability instead. |
+| Digest-_only_ pin (no readable tag) for stateful | Unreadable in diffs — the tiered rule pins `tag@digest` (readable tag *and* digest) instead (Decision 2). |
 | Pinning the stateless tier | No durable data to protect; pins just add churn DIUN already covers. |
 | Auto-updating stateful on a timer | Must be human-gated and backup-first; only the _analysis_ is automated. |
 | Updating the whole fleet at once | Simultaneous reboots hide which host/phase actually broke. |
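
Review note on the ADR-011 ruled-out row: the distinction between a digest-only pin and the adopted `tag@digest` form can be checked mechanically. The sketch below is illustrative only and is not part of this commit. The function names (`classify_pin`, `allowed_for_stateful`) and the example image references are hypothetical, and it assumes digests are always `sha256`.

```python
import re

# Assumption: only sha256 digests are in use (64 lowercase hex characters).
DIGEST_RE = re.compile(r"sha256:[0-9a-f]{64}")


def classify_pin(ref: str) -> str:
    """Classify an image reference by how it is pinned.

    Returns one of: 'tag@digest', 'digest-only', 'tag-only', 'untagged'.
    """
    name_part, sep, digest = ref.partition("@")
    if sep and not DIGEST_RE.fullmatch(digest):
        raise ValueError(f"malformed digest in {ref!r}")
    # Look for the tag only in the last path segment, so a registry
    # host:port prefix is not mistaken for a tag.
    last_segment = name_part.rsplit("/", 1)[-1]
    has_tag = ":" in last_segment
    if sep and has_tag:
        return "tag@digest"
    if sep:
        return "digest-only"
    if has_tag:
        return "tag-only"
    return "untagged"


def allowed_for_stateful(ref: str) -> bool:
    """Hypothetical policy check: the stateful tier accepts only tag@digest."""
    return classify_pin(ref) == "tag@digest"
```

Under these assumptions, a reference such as `postgres:16.3@sha256:<64 hex>` passes the stateful check, while the same image pinned by digest alone or by tag alone does not.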