From 2f4218814a8f37ee664db324adf49cec5df2f6c0 Mon Sep 17 00:00:00 2001
From: sjat
Date: Thu, 4 Jun 2026 19:21:36 +0200
Subject: [PATCH] Reconcile image pinning to a tiered tag@digest rule
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Resolve the conflict between ADR-011 (tags-not-digests) and the security
work (digest pinning) with one coherent rule that respects ADR-011's
stateless/stateful split:

- Stateful → pin `tag@digest` (readable tag + integrity digest): legible
  diffs AND tamper-evidence. Snapshots cover broken updates; the digest
  covers swapped images.
- Stateless → rolling tags (latest/stable); digest-pinning would defeat the
  rolling design. Integrity rests on official/verified images + disposability.

Aligned across ADR-011 (decision 2), ADR-004 (image management), ADR-002
(supply-chain row), accepted-risk R1, the service checklist, and TODO 15.6.
TODO 16.7 marked decided.

Co-Authored-By: Claude Opus 4.8 (1M context)
---
 docs/TODO.md                            | 14 +++++++-------
 docs/decisions/002-security.md          |  2 +-
 docs/decisions/004-docker-model.md      |  9 ++++++---
 docs/decisions/011-update-management.md | 16 ++++++++++------
 docs/security/accepted-risks.md         |  2 +-
 docs/security/service-checklist.md      |  2 +-
 6 files changed, 26 insertions(+), 19 deletions(-)

diff --git a/docs/TODO.md b/docs/TODO.md
index 98676ae..5b2f8fb 100644
--- a/docs/TODO.md
+++ b/docs/TODO.md
@@ -113,9 +113,9 @@
     4. Network IDS: enable Suricata on OPNsense (IDS first; IPS later?).
     5. Active security alerting: wire AIDE, `auditd`, `fail2ban`, and Suricata into the
        Loki/Grafana alerting stack (ties to 3.6).
-    6. Supply-chain hygiene: enforce image digest pinning + official/verified images
-       via the service checklist; revisit active scanning (Trivy/Grype) once a
-       triage stack exists (accepted-risk R1).
+    6. Supply-chain hygiene: enforce tiered image pinning (stateful `tag@digest`;
+       stateless rolling tags — ADR-011) + official/verified images via the service
+       checklist; revisit active scanning (Trivy/Grype) once a triage stack exists (R1).

 16. **ADR-011 (update management) — resolve open questions + accept.**
     Committed as **Proposed**; resolve before marking Accepted:
@@ -129,7 +129,7 @@
        Friday timing enough at this scale?
     6. Notification/control channel — boma's own ntfy topics (ADR-013) + a "skip this
        week" / "pause" switch (ties to TODO 9).
-    7. **Reconcile pinning conflict:** ADR-011 decision 2 chose *tags, not digests*
-       (readability + snapshot/backup immutability), but the security work says
-       *digest pinning* (accepted-risk R1, service checklist, 15.6 above). Decide one
-       coherent rule (e.g. readable tag + recorded digest?) and align all of them.
+    7. ~~Reconcile pinning conflict (tags vs digests).~~ DECIDED: tiered rule —
+       **stateful `tag@digest`** (readable tag + integrity digest), **stateless
+       rolling tags**. Aligned across ADR-011 (dec. 2), ADR-004, ADR-002 supply-chain
+       row + accepted-risk R1, the service checklist, and 15.6.
diff --git a/docs/decisions/002-security.md b/docs/decisions/002-security.md
index 6259f74..d3174de 100644
--- a/docs/decisions/002-security.md
+++ b/docs/decisions/002-security.md
@@ -25,7 +25,7 @@ What we deliberately design against — and, just as importantly, what we do not
 | **Opportunistic external** — bots scanning, credential stuffing, mass-exploiting known CVEs in exposed services | Yes — primary | SSH key-only + fail2ban, deny-by-default firewall, security auto-patching, minimal attack surface, services behind a reverse proxy with auth |
 | **Lateral movement / blast radius** — assume one service *is* compromised; limit how far it spreads | Yes | VLAN segmentation (ADR-007), least-privilege containers, no host network mode, per-service isolation, no shared credentials |
 | **Operator / agent error** — accidental secret leak, misconfiguration, or an AI agent making an unsafe change | Yes | Vault + gitleaks, declarative firewall (no ad-hoc ports), review gates, agent guardrails (below), pre-commit hooks |
-| **Supply chain** — compromised images, base images, dependencies, collections | Acknowledged, lower priority | Baseline hygiene required: image digest pinning + prefer official/verified images (ADR-011, service checklist), gitleaks. Active vuln scanning deferred — accepted risk |
+| **Supply chain** — compromised images, base images, dependencies, collections | Acknowledged, lower priority | Baseline hygiene required: tiered image pinning (stateful `tag@digest`, stateless rolling — ADR-011) + prefer official/verified images, gitleaks. Active vuln scanning deferred — accepted risk |
 | **Targeted / physical** — a determined adversary specifically after this homelab, or physical device access | Out of scope | Not designed against at this scale; revisit if the threat model changes |

 Supply chain is consciously deprioritized, not forgotten — see
diff --git a/docs/decisions/004-docker-model.md b/docs/decisions/004-docker-model.md
index f2fb6ce..2d18b31 100644
--- a/docs/decisions/004-docker-model.md
+++ b/docs/decisions/004-docker-model.md
@@ -86,9 +86,12 @@ Managed by the `docker_host` role. Key settings:

 ## Image management

-- Images are always pinned to a specific digest or tag in templates
-- `latest` is never used in production Compose files
-- Image updates are a deliberate operation: update the tag variable, run deploy
+- Image pinning follows the tiered model in ADR-011: **stateful** services pin
+  `tag@digest` (readable tag + integrity digest); **stateless** services use rolling
+  tags (`latest`/`stable`), refreshed deliberately and watched by DIUN
+- Bare `latest` is therefore acceptable only on the stateless tier; the stateful tier
+  is always pinned
+- Image updates are a deliberate operation: update the tag/digest variable, run deploy

 ## Persistent data

diff --git a/docs/decisions/011-update-management.md b/docs/decisions/011-update-management.md
index 979a351..6497ab4 100644
--- a/docs/decisions/011-update-management.md
+++ b/docs/decisions/011-update-management.md
@@ -27,13 +27,17 @@ Each container role declares its class, e.g. `__stateful: true|false` (def
 ### 2. Image pinning follows the split

 - **Stateless → rolling tags** (`latest`/`stable`), refreshed by the weekly run and
-  watched by DIUN. Always-current, cheap to roll back.
-- **Stateful → pinned** to a readable tag, **minor** where the image offers it
-  (e.g. `mariadb:11.4`, not bare `:11` and not a digest). Reproducible; upgrades are
-  deliberate, never incidental.
+  watched by DIUN. Always-current, cheap to roll back. No digest pin — it would
+  defeat the rolling design.
+- **Stateful → pinned `tag@digest`** — a readable **minor** tag where the image
+  offers it (e.g. `mariadb:11.4`, not bare `:11`) **plus its digest**
+  (`mariadb:11.4@sha256:…`). Reproducible and tamper-evident; upgrades are deliberate
+  (bump tag and digest together), never incidental.

-Tags, not digests — readable in diffs; immutability is bought instead via
-snapshot-before and backups.
+Readable tag **and** digest, not one or the other: the tag keeps diffs legible, the
+digest pins the exact bytes for supply-chain integrity (ADR-002, accepted-risk R1).
+Snapshot-before + backups remain the rollback mechanism for a *broken* update; the
+digest is what guards against a *swapped* image, which snapshots cannot.

 ### 3. Weekly OS + stateless run — Friday night, fail-stop, staggered

diff --git a/docs/security/accepted-risks.md b/docs/security/accepted-risks.md
index 2e7f776..c71a41c 100644
--- a/docs/security/accepted-risks.md
+++ b/docs/security/accepted-risks.md
@@ -13,7 +13,7 @@ revisit (trigger).

 | # | Accepted risk | Rationale | Revisit trigger |
 |---|---|---|---|
-| R1 | **Active supply-chain scanning deferred** — baseline hygiene *is* required (image digest pinning + prefer official/verified images, ADR-011 / service checklist; gitleaks), but images and dependencies are not actively vulnerability-scanned (Trivy/Grype) or signature-verified | Scanning only pays off with the capacity to triage its output; the realistic threat is opportunistic, not a targeted supply-chain attack | A monitoring/triage stack is live; hosting high-value data/finances for others; a relevant upstream compromise |
+| R1 | **Active supply-chain scanning deferred** — baseline hygiene *is* required (tiered image pinning per ADR-011 — stateful `tag@digest`, stateless rolling — prefer official/verified images; gitleaks), but images and dependencies are not actively vulnerability-scanned (Trivy/Grype) or signature-verified | Scanning only pays off with the capacity to triage its output; the realistic threat is opportunistic, not a targeted supply-chain attack | A monitoring/triage stack is live; hosting high-value data/finances for others; a relevant upstream compromise |
 | R2 | **SELinux not used** — no SELinux mandatory access control | AppArmor — Debian-native and enforced via the CIS baseline — already provides MAC; adding SELinux means two MAC systems, non-native to Debian, for no real gain | A service that ships and requires its own SELinux policy; threat model shifts toward targeted attackers |

 _Last reviewed: 2026-06-04. The prior gaps (full CIS hardening, SELinux/AppArmor,
diff --git a/docs/security/service-checklist.md b/docs/security/service-checklist.md
index 0f1e112..5ad7a79 100644
--- a/docs/security/service-checklist.md
+++ b/docs/security/service-checklist.md
@@ -41,7 +41,7 @@ This checklist is the generic **bar**. Each service answers it in its own

 ## Updates & provenance

-- [ ] Image/source version is pinned (tag or digest), not floating `latest` (ADR-011)
+- [ ] Image pinned per ADR-011's tiered rule — stateful: `tag@digest`; stateless: rolling tag (`latest`/`stable`) acceptable
 - [ ] The update path is known — how this service gets patched

 ## Operability (security-adjacent)
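
Review note (not part of the patch): the tiered rule is simple enough to check mechanically. The following is a minimal sketch of such a check, assuming image references have the usual `name[:tag][@sha256:<hex>]` shape. The names `split_image_ref`, `check_pinning`, and `ROLLING_TAGS` are hypothetical and do not exist in this repository, and the sketch deliberately flags only what the ADR text states (it does not require stateless services to use a rolling tag).

```python
import re
from typing import List, Optional, Tuple

# Hypothetical helper for illustration; not part of the repository.
ROLLING_TAGS = {"latest", "stable"}
SHA256_DIGEST = re.compile(r"sha256:[0-9a-f]{64}")


def split_image_ref(ref: str) -> Tuple[str, Optional[str], Optional[str]]:
    """Split 'name[:tag][@digest]' into (name, tag, digest)."""
    name_tag, _, digest = ref.partition("@")
    # A colon marks a tag only if it comes after the last '/',
    # so a registry port such as 'host:5000/app' is not mistaken for a tag.
    last_slash = name_tag.rfind("/")
    colon = name_tag.rfind(":")
    if colon > last_slash:
        return name_tag[:colon], name_tag[colon + 1:], digest or None
    return name_tag, None, digest or None


def check_pinning(ref: str, stateful: bool) -> List[str]:
    """Return rule violations for one image reference (empty list = conforms)."""
    _, tag, digest = split_image_ref(ref)
    problems: List[str] = []
    if stateful:
        # Stateful tier: readable, non-rolling tag plus an integrity digest.
        if tag is None or tag in ROLLING_TAGS:
            problems.append("stateful image needs a readable, non-rolling tag")
        if digest is None or not SHA256_DIGEST.fullmatch(digest):
            problems.append("stateful image needs a sha256 digest")
    elif digest is not None:
        # Stateless tier: a digest pin would defeat the rolling design.
        problems.append("stateless image should not be digest-pinned")
    return problems
```

A check like this could run in pre-commit or CI against rendered Compose files, with `stateful` taken from each role's declared class; where that value comes from is left open here.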