Accepted security risks
Conscious security trade-offs we are choosing to live with — recorded so "what we are not doing" is explicit and revisitable, not forgotten. This register is a living document, deliberately kept out of ADR-002 (which records durable decisions) so the ADR stays stable.
Owned by ADR-002 (Security baseline and strategy). Re-challenged during the
periodic security review (planned /security-review; see docs/TODO.md).
Each entry: the risk · why we accept it (rationale) · what would make us revisit (trigger).
| # | Accepted risk | Rationale | Revisit trigger |
|---|---|---|---|
| R1 | Active supply-chain scanning deferred: baseline hygiene is required (tiered image pinning per ADR-011: stateful tag@digest, stateless rolling; prefer official/verified images; gitleaks), but images and dependencies are not actively vulnerability-scanned (Trivy/Grype) or signature-verified | Scanning only pays off with the capacity to triage its output; the realistic threat is opportunistic, not a targeted supply-chain attack | A monitoring/triage stack is live; hosting high-value data/finances for others; a relevant upstream compromise |
| R2 | SELinux not used — no SELinux mandatory access control | AppArmor — Debian-native and enforced via the CIS baseline — already provides MAC; adding SELinux means two MAC systems, non-native to Debian, for no real gain | A service that ships and requires its own SELinux policy; threat model shifts toward targeted attackers |
| R3 | Self-hosted mesh control plane is a public target on askari: the NetBird coordinator (ADR-016) exposes a management API + dashboard (TCP 80/443) and STUN (UDP 3478) on askari's public IP; the management API controls the whole mesh (NetBird v0.72.4 embeds STUN in the combined server, so there is no separate Coturn) | Self-hosting means no third-party trust and an off-site control plane that survives a homelab outage (boma's sovereignty ethos). Residual surface is on askari (already a public VPS) and is mitigated: TLS + embedded-IdP login, source-IP restriction where practical, base hardening, version-pinned NetBird (ADR-011) patched on boma's cadence | A coordinator compromise or unpatched NetBird CVE; the management plane is reachable without auth/IP-limits; the operational burden makes a hosted coordinator worth reconsidering |
| R4 | No cryptographic WORM for logs: shipped logs are append-only via Loki's push API and copied off-site to askari (ADR-018), but the stored chunks are not object-locked/immutable; a root-on-askari attacker could edit history | Append-only push + off-site copy already defeats the realistic threat (a host attacker covering their tracks), and the off-site copy survives even a full-cluster compromise. True WORM (object-lock) is forensic-grade cost for boma's opportunistic threat model (R1) | Threat model shifts toward targeted/forensic; a regulatory/evidentiary need appears; askari itself is assessed as a likely target |
| R5 | No disk encryption on ubongo: the control node's SSD (SanDisk X600 256 GB, TCG-Opal-capable but Opal unused) is unencrypted at rest, so it holds recovery-critical secrets in plaintext: the Ansible Vault password's rbw local cache and (future) Terraform state. Physical theft of the box would expose them | ubongo is always-on in a physically controlled location; compensating controls are a BIOS supervisor password and disabled external/USB + PXE boot (an attacker cannot trivially boot another OS to read the disk), and the offline-recoverable design means the irreducible root secret (Vaultwarden master password) is never stored on the box anyway. Full-disk encryption was weighed against the always-on/unattended-reboot requirement (LUKS+TPM auto-unlock or passphrase) and deferred for simplicity at this trust level | ubongo is relocated to a less-trusted physical location; the box starts holding additional high-value secrets; or a reinstall onto LUKS (TPM-sealed) is undertaken |
| R6 | le-prod-wildcard integration runs: when CERTS=le-prod-wildcard is passed to make test-integration, the production Gandi PAT (vault.gandi.pat) is passed to an ephemeral local test VM via the var overlay, and transient _acme-challenge TXT records are written into the real wingu.me DNS zone to satisfy the Let's Encrypt DNS-01 challenge. A compromised or long-lived test VM could exfiltrate the PAT; the real zone is briefly (seconds) modified | Scope is on-demand only: le-staging is the default cert tier (CERTS=internal for incident repro); le-prod-wildcard is an explicit opt-in. Compensating controls: the VM is ephemeral and destroyed on success; it sits on an isolated libvirt NAT network (no LAN/mesh access); TXT records are auto-removed by Caddy immediately after validation; the PAT is not persisted inside the VM after the run. ADR-025 documents the cert-tier design and the three isolation invariants | The PAT is exfiltrated from a test VM; the wingu.me zone shows unexpected records; a CERTS=le-prod-wildcard run must be audited or the tier must be revoked |
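
The "unexpected records" trigger for R6 could be checked after a run by listing the zone's records and flagging any _acme-challenge TXT entries that outlived validation. A minimal sketch, assuming the record list has already been fetched by some other means; the function name and the (name, type) tuple shape are hypothetical and not part of the integration harness:

```python
def leftover_acme_challenges(records, zone="wingu.me"):
    """Return names of TXT records that look like DNS-01 challenge leftovers.

    `records` is an iterable of (name, record_type) tuples with fully
    qualified names. How the list is obtained is out of scope here.
    """
    leftovers = []
    for name, rtype in records:
        # Normalise: drop the trailing root dot and compare case-insensitively.
        fqdn = name.rstrip(".").lower()
        in_zone = fqdn.endswith("." + zone)
        is_challenge = fqdn.split(".")[0] == "_acme-challenge"
        if rtype.upper() == "TXT" and in_zone and is_challenge:
            leftovers.append(fqdn)
    return leftovers
```

An empty result after a run is consistent with the TXT records having been removed; a non-empty one is a reason to look at the run more closely.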
Last reviewed: 2026-06-11. The prior gaps (full CIS hardening, SELinux/AppArmor,
IDS) were re-challenged and adopted rather than accepted: CIS Debian L1+L2 + CIS
Docker, AppArmor (enforce), AIDE file-integrity, and Suricata network IDS are now
part of the security strategy (ADR-002). See STATUS.md / docs/TODO.md for build
status. As CIS is implemented, any specific item that proves impractical is added
here as a named exception.
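
Since rows are added to this register over time, the three-part entry format (risk, rationale, revisit trigger) lends itself to a mechanical check. A minimal sketch, assuming each entry is a single pipe-table row whose first cell is an R<n> identifier and that no cell contains a literal pipe character; the function names are hypothetical and not part of the repository:

```python
import re

FIELDS = ("id", "risk", "rationale", "trigger")


def parse_register(markdown):
    """Parse rows shaped like | R<n> | risk | rationale | trigger |."""
    entries = []
    for line in markdown.splitlines():
        line = line.strip()
        # Skip the header row, the separator row, and surrounding prose.
        if not re.match(r"^\|\s*R\d+\s*\|", line):
            continue
        cells = [cell.strip() for cell in line.strip("|").split("|")]
        # Pad short rows so that missing cells show up as empty strings.
        cells += [""] * (len(FIELDS) - len(cells))
        entries.append(dict(zip(FIELDS, cells[: len(FIELDS)])))
    return entries


def incomplete_entries(entries):
    """Return the ids of entries with an empty risk, rationale, or trigger."""
    return [e["id"] for e in entries if not all(e[f] for f in FIELDS)]
```

Run against the register file, a non-empty result from incomplete_entries would point at rows that still need a rationale or a revisit trigger.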