fix: address final whole-branch review findings

- ADR-023 §4: ADR-015 no-sudo sub-decision now Superseded-by ADR-025 (bidirectional), not just an in-place amendment.
- STATUS: drop the deferred `reset` verb; honest integration_test (molecule not run in this env; applied to ubongo) + verify (forward/DNAT, not wt0); RED->GREEN validated.
- driver: remove unused `import shutil`.
- README: fix the ADR-025 link filename.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
This commit is contained in:
sjat 2026-06-18 21:52:28 +02:00
parent d7bd31babb
commit bc8592616b
5 changed files with 11 additions and 7 deletions

View file

@ -85,9 +85,9 @@ askari.)
| Thing | State | | Thing | State |
|---|---| |---|---|
| `roles/integration_test/` | **Built** — installs/enables libvirt+QEMU+virtinst on `control` group hosts; adds `sjat`/`claude` to `libvirt` group; creates image-cache dir; drops the driver. Molecule + pytest clean. | | `roles/integration_test/` | **Built** — installs/enables libvirt+QEMU+virtinst on `control` group hosts; adds `sjat`/`claude` to `libvirt` group; creates image-cache dir. Lint clean; applied live to ubongo (substrate installed); molecule scenario present, not run in the build env. |
| `scripts/integration-vm.py` | **Built** — stdlib-only lifecycle driver over `virsh`/`virt-install`/`cloud-localds`: `up / apply / reboot / assert / cycle / reset / down / prune / console`. Lazily ensures the golden Debian-13 genericcloud image. pytest clean (transient-inventory generation, var/overlay merge, `--certs` mapping, DHCP-lease parsing, resource-guard math). | | `scripts/integration-vm.py` | **Built** — stdlib-only lifecycle driver over `virsh`/`virt-install`/`cloud-localds`: `up / apply / reboot / assert / cycle / down / prune / console`. Lazily ensures the golden Debian-13 genericcloud image. pytest clean (transient-inventory generation, var/overlay merge, `--certs` mapping, DHCP-lease parsing, resource-guard math). |
| `tests/integration/` (profile, verify, overrides) | **Built** — "be askari" profile + var overlay + `verify.yml` outcome assertions (Docker up, published-port DNAT, nft sane, `wt0` up). pytest clean. | | `tests/integration/` (profile, verify, overrides) | **Built** — "be askari" profile + var overlay + `verify.yml` outcome assertions (Docker active, forward-chain accepts present, published-port DNAT alive). Validated end-to-end by the RED→GREEN acceptance run. |
| `make test-integration` / `make test-integration-clean` | **Built** — wired into `Makefile`. | | `make test-integration` / `make test-integration-clean` | **Built** — wired into `Makefile`. |
| ADR-025 | **Accepted (2026-06-18)** — decision recorded, approach A, cert tiers, safety invariants, UEFI boot requirement, and claude-sudo dependency documented. | | ADR-025 | **Accepted (2026-06-18)** — decision recorded, approach A, cert tiers, safety invariants, UEFI boot requirement, and claude-sudo dependency documented. |
| **RED/GREEN acceptance (ubongo live pass)** | **PASSED (2026-06-18).** A throwaway KVM VM on ubongo reproduced the 2026-06-17 incident (base nftables forward default-deny kills Docker forwarding on reboot) = RED. Applying the `docker_host` container-forward drop-in and rebooting survived = GREEN. Nine shakedown findings captured in `docs/FRICTION.md`; key learnings (UEFI boot, claude sudo) recorded in ADR-025. `docs/TODO.md` item 2.4 closed. | | **RED/GREEN acceptance (ubongo live pass)** | **PASSED (2026-06-18).** A throwaway KVM VM on ubongo reproduced the 2026-06-17 incident (base nftables forward default-deny kills Docker forwarding on reboot) = RED. Applying the `docker_host` container-forward drop-in and rebooting survived = GREEN. Nine shakedown findings captured in `docs/FRICTION.md`; key learnings (UEFI boot, claude sudo) recorded in ADR-025. `docs/TODO.md` item 2.4 closed. |

View file

@ -98,7 +98,12 @@ Manual, on bare metal:
account/key can be revoked without touching the operator's access. (ADR-021 left the account/key can be revoked without touching the operator's access. (ADR-021 left the
on-`ubongo` agent identity unspecified; this records it.) on-`ubongo` agent identity unspecified; this records it.)
**Amendment (2026-06-18) — `claude` now has `NOPASSWD:ALL` sudo.** During the **Amendment (2026-06-18) — `claude` now has `NOPASSWD:ALL` sudo.**
> **Superseded by [ADR-025](025-local-vm-integration-testing.md)** (per ADR-023 §4): the
> "no local sudo" sub-decision is reversed. The shakedown that necessitated it is ADR-025;
> the resulting two-account access model is ADR-021; the accepted risk is R7.
During the
integration-testing harness shakedown, the original "no local sudo" sub-decision was integration-testing harness shakedown, the original "no local sudo" sub-decision was
reversed. No-sudo blocked the AI-worker from diagnosing a failed VM: `virsh`, reversed. No-sudo blocked the AI-worker from diagnosing a failed VM: `virsh`,
`virt-install`, `cloud-localds`, `journalctl`, `nft` — nearly all low-level `virt-install`, `cloud-localds`, `journalctl`, `nft` — nearly all low-level

View file

@ -173,7 +173,7 @@ The nine full shakedown findings (including the UEFI boot-loop) are in
- ADR-006 — Terraform owns production VM existence (boundary this ADR respects). - ADR-006 — Terraform owns production VM existence (boundary this ADR respects).
- ADR-008 — Testing methodology (Levels 14); this ADR is the concrete build of Level 2/3. - ADR-008 — Testing methodology (Levels 14); this ADR is the concrete build of Level 2/3.
- ADR-015 — Control host (ubongo); this ADR reconciles "not a hypervisor" with ephemeral test VMs; amended 2026-06-18 for claude sudo. - ADR-015 — Control host (ubongo); this ADR reconciles "not a hypervisor" with ephemeral test VMs. **Supersedes** ADR-015's "no local sudo" sub-decision for the AI-worker — the shakedown necessitated `claude` NOPASSWD sudo (ADR-023 §4; access model in ADR-021, risk R7).
- ADR-016 — Mesh VPN; the "be askari" profile includes the coordinator role. - ADR-016 — Mesh VPN; the "be askari" profile includes the coordinator role.
- ADR-020 — Firewall strategy; firewall × Docker interaction is what this harness tests. - ADR-020 — Firewall strategy; firewall × Docker interaction is what this harness tests.
- ADR-021 — Operational access; sudo model for `claude` and `sjat` on `ubongo`. - ADR-021 — Operational access; sudo model for `claude` and `sjat` on `ubongo`.

View file

@ -31,5 +31,5 @@ the tooling needed to spin up short-lived test VMs (see ADR-015).
## Related decisions ## Related decisions
- [ADR-025](../../docs/decisions/025-integration-testing.md) — local VM integration testing - [ADR-025](../../docs/decisions/025-local-vm-integration-testing.md) — local VM integration testing
- [ADR-015](../../docs/decisions/015-control-host.md) — control host scope (ubongo is not a hypervisor) - [ADR-015](../../docs/decisions/015-control-host.md) — control host scope (ubongo is not a hypervisor)

View file

@ -11,7 +11,6 @@ import json
import os import os
import pathlib import pathlib
import re import re
import shutil
import subprocess import subprocess
import sys import sys
import time import time