From 2f4218814a8f37ee664db324adf49cec5df2f6c0 Mon Sep 17 00:00:00 2001
From: sjat <sjat@ziethen.dk>
Date: Thu, 4 Jun 2026 19:21:36 +0200
Subject: [PATCH] Reconcile image pinning to a tiered tag@digest rule
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Resolve the conflict between ADR-011 (tags-not-digests) and the security work
(digest pinning) with one coherent rule that respects ADR-011's stateless/stateful
split:

- Stateful → pin `tag@digest` (readable tag + integrity digest): legible diffs AND
  tamper-evidence. Snapshots cover broken updates; the digest covers swapped images.
- Stateless → rolling tags (latest/stable); digest-pinning would defeat the rolling
  design. Integrity rests on official/verified images + disposability.

Aligned across ADR-011 (decision 2), ADR-004 (image management), ADR-002
(supply-chain row), accepted-risk R1, the service checklist, and TODO 15.6.
TODO 16.7 marked decided.
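
The rule is mechanical enough to lint. As a rough sketch only (not part of this
patch: `split_ref` and `violates_rule` are hypothetical names, and the checks are
an assumption about how the rule might be enforced), an image reference could be
validated along these lines:

```python
import re
from typing import Optional, Tuple

# Hypothetical lint helper; illustrates the tiered rule, not shipped by this patch.
_DIGEST = re.compile(r"sha256:[0-9a-f]{64}")
_ROLLING = {"latest", "stable"}


def split_ref(ref: str) -> Tuple[str, Optional[str], Optional[str]]:
    """Split 'name[:tag][@digest]' into (name, tag, digest)."""
    name, _, digest = ref.partition("@")
    # Only a colon after the last slash is a tag; an earlier one is a registry port.
    tag = None
    if ":" in name.rsplit("/", 1)[-1]:
        name, _, tag = name.rpartition(":")
    return name, tag, (digest or None)


def violates_rule(ref: str, stateful: bool) -> Optional[str]:
    """Return a reason if ref breaks the tiered rule, else None."""
    _, tag, digest = split_ref(ref)
    if stateful:
        # Stateful tier: readable version tag plus integrity digest.
        if tag is None or tag in _ROLLING:
            return "stateful image needs a readable version tag"
        if digest is None or not _DIGEST.fullmatch(digest):
            return "stateful image needs an @sha256 digest"
        return None
    # Stateless tier: rolling tags; a digest would defeat the rolling design.
    if digest is not None:
        return "stateless image should not be digest-pinned"
    return None
```

For instance, `violates_rule("mariadb:11.4", stateful=True)` returns a reason
string because the digest is missing, while `violates_rule("nginx:stable",
stateful=False)` returns `None`.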

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
---
 docs/TODO.md                            | 14 +++++++-------
 docs/decisions/002-security.md          |  2 +-
 docs/decisions/004-docker-model.md      |  9 ++++++---
 docs/decisions/011-update-management.md | 16 ++++++++++------
 docs/security/accepted-risks.md         |  2 +-
 docs/security/service-checklist.md      |  2 +-
 6 files changed, 26 insertions(+), 19 deletions(-)

diff --git a/docs/TODO.md b/docs/TODO.md
index 98676ae..5b2f8fb 100644
--- a/docs/TODO.md
+++ b/docs/TODO.md
@@ -113,9 +113,9 @@
     4. Network IDS: enable Suricata on OPNsense (IDS first; IPS later?).
     5. Active security alerting: wire AIDE, `auditd`, `fail2ban`, and Suricata into
        the Loki/Grafana alerting stack (ties to 3.6).
-    6. Supply-chain hygiene: enforce image digest pinning + official/verified images
-       via the service checklist; revisit active scanning (Trivy/Grype) once a
-       triage stack exists (accepted-risk R1).
+    6. Supply-chain hygiene: enforce tiered image pinning (stateful `tag@digest`;
+       stateless rolling tags — ADR-011) + official/verified images via the service
+       checklist; revisit active scanning (Trivy/Grype) once a triage stack exists (R1).
 
 16. **ADR-011 (update management) — resolve open questions + accept.** Committed as
     **Proposed**; resolve before marking Accepted:
@@ -129,7 +129,7 @@
        Friday timing enough at this scale?
     6. Notification/control channel — boma's own ntfy topics (ADR-013) + a "skip this
        week" / "pause" switch (ties to TODO 9).
-    7. **Reconcile pinning conflict:** ADR-011 decision 2 chose *tags, not digests*
-       (readability + snapshot/backup immutability), but the security work says
-       *digest pinning* (accepted-risk R1, service checklist, 15.6 above). Decide one
-       coherent rule (e.g. readable tag + recorded digest?) and align all of them.
+    7. ~~Reconcile pinning conflict (tags vs digests).~~ DECIDED: tiered rule —
+       **stateful `tag@digest`** (readable tag + integrity digest), **stateless
+       rolling tags**. Aligned across ADR-011 (dec. 2), ADR-004, ADR-002 supply-chain
+       row + accepted-risk R1, the service checklist, and 15.6.
diff --git a/docs/decisions/002-security.md b/docs/decisions/002-security.md
index 6259f74..d3174de 100644
--- a/docs/decisions/002-security.md
+++ b/docs/decisions/002-security.md
@@ -25,7 +25,7 @@ What we deliberately design against — and, just as importantly, what we do not
 | **Opportunistic external** — bots scanning, credential stuffing, mass-exploiting known CVEs in exposed services | Yes — primary | SSH key-only + fail2ban, deny-by-default firewall, security auto-patching, minimal attack surface, services behind a reverse proxy with auth |
 | **Lateral movement / blast radius** — assume one service *is* compromised; limit how far it spreads | Yes | VLAN segmentation (ADR-007), least-privilege containers, no host network mode, per-service isolation, no shared credentials |
 | **Operator / agent error** — accidental secret leak, misconfiguration, or an AI agent making an unsafe change | Yes | Vault + gitleaks, declarative firewall (no ad-hoc ports), review gates, agent guardrails (below), pre-commit hooks |
-| **Supply chain** — compromised images, base images, dependencies, collections | Acknowledged, lower priority | Baseline hygiene required: image digest pinning + prefer official/verified images (ADR-011, service checklist), gitleaks. Active vuln scanning deferred — accepted risk |
+| **Supply chain** — compromised images, base images, dependencies, collections | Acknowledged, lower priority | Baseline hygiene required: tiered image pinning (stateful `tag@digest`, stateless rolling — ADR-011) + prefer official/verified images, gitleaks. Active vuln scanning deferred — accepted risk |
 | **Targeted / physical** — a determined adversary specifically after this homelab, or physical device access | Out of scope | Not designed against at this scale; revisit if the threat model changes |
 
 Supply chain is consciously deprioritized, not forgotten — see
diff --git a/docs/decisions/004-docker-model.md b/docs/decisions/004-docker-model.md
index f2fb6ce..2d18b31 100644
--- a/docs/decisions/004-docker-model.md
+++ b/docs/decisions/004-docker-model.md
@@ -86,9 +86,12 @@ Managed by the `docker_host` role. Key settings:
 
 ## Image management
 
-- Images are always pinned to a specific digest or tag in templates
-- `latest` is never used in production Compose files
-- Image updates are a deliberate operation: update the tag variable, run deploy
+- Image pinning follows the tiered model in ADR-011: **stateful** services pin
+  `tag@digest` (readable tag + integrity digest); **stateless** services use rolling
+  tags (`latest`/`stable`), refreshed by the weekly run and watched by DIUN
+- Bare `latest` is therefore acceptable only on the stateless tier; the stateful tier
+  is always pinned
+- Stateful image updates are deliberate: bump tag and digest together, run deploy
 
 ## Persistent data
 
diff --git a/docs/decisions/011-update-management.md b/docs/decisions/011-update-management.md
index 979a351..6497ab4 100644
--- a/docs/decisions/011-update-management.md
+++ b/docs/decisions/011-update-management.md
@@ -27,13 +27,17 @@ Each container role declares its class, e.g. `<role>__stateful: true|false` (def
 ### 2. Image pinning follows the split
 
 - **Stateless → rolling tags** (`latest`/`stable`), refreshed by the weekly run and
-  watched by DIUN. Always-current, cheap to roll back.
-- **Stateful → pinned** to a readable tag, **minor** where the image offers it
-  (e.g. `mariadb:11.4`, not bare `:11` and not a digest). Reproducible; upgrades are
-  deliberate, never incidental.
+  watched by DIUN. Always-current, cheap to roll back. No digest pin — it would
+  defeat the rolling design.
+- **Stateful → pinned `tag@digest`** — a readable **minor** tag where the image
+  offers it (e.g. `mariadb:11.4`, not bare `:11`) **plus its digest**
+  (`mariadb:11.4@sha256:…`). Reproducible and tamper-evident; upgrades are deliberate
+  (bump tag and digest together), never incidental.
 
-Tags, not digests — readable in diffs; immutability is bought instead via
-snapshot-before and backups.
+Readable tag **and** digest, not one or the other: the tag keeps diffs legible, the
+digest pins the exact bytes for supply-chain integrity (ADR-002, accepted-risk R1).
+Snapshot-before + backups remain the rollback mechanism for a *broken* update; the
+digest is what guards against a *swapped* image, which snapshots cannot.
 
 ### 3. Weekly OS + stateless run — Friday night, fail-stop, staggered
 
diff --git a/docs/security/accepted-risks.md b/docs/security/accepted-risks.md
index 2e7f776..c71a41c 100644
--- a/docs/security/accepted-risks.md
+++ b/docs/security/accepted-risks.md
@@ -13,7 +13,7 @@ revisit (trigger).
 
 | # | Accepted risk | Rationale | Revisit trigger |
 |---|---|---|---|
-| R1 | **Active supply-chain scanning deferred** — baseline hygiene *is* required (image digest pinning + prefer official/verified images, ADR-011 / service checklist; gitleaks), but images and dependencies are not actively vulnerability-scanned (Trivy/Grype) or signature-verified | Scanning only pays off with the capacity to triage its output; the realistic threat is opportunistic, not a targeted supply-chain attack | A monitoring/triage stack is live; hosting high-value data/finances for others; a relevant upstream compromise |
+| R1 | **Active supply-chain scanning deferred** — baseline hygiene *is* required (tiered image pinning per ADR-011, i.e. stateful `tag@digest` and stateless rolling tags, plus a preference for official/verified images; gitleaks), but images and dependencies are not actively vulnerability-scanned (Trivy/Grype) or signature-verified | Scanning only pays off with the capacity to triage its output; the realistic threat is opportunistic, not a targeted supply-chain attack | A monitoring/triage stack is live; hosting high-value data/finances for others; a relevant upstream compromise |
 | R2 | **SELinux not used** — no SELinux mandatory access control | AppArmor — Debian-native and enforced via the CIS baseline — already provides MAC; adding SELinux means two MAC systems, non-native to Debian, for no real gain | A service that ships and requires its own SELinux policy; threat model shifts toward targeted attackers |
 
 _Last reviewed: 2026-06-04. The prior gaps (full CIS hardening, SELinux/AppArmor,
diff --git a/docs/security/service-checklist.md b/docs/security/service-checklist.md
index 0f1e112..5ad7a79 100644
--- a/docs/security/service-checklist.md
+++ b/docs/security/service-checklist.md
@@ -41,7 +41,7 @@ This checklist is the generic **bar**. Each service answers it in its own
 
 ## Updates & provenance
 
-- [ ] Image/source version is pinned (tag or digest), not floating `latest` (ADR-011)
+- [ ] Image pinned per ADR-011's tiered rule — stateful: `tag@digest`; stateless: rolling tag (`latest`/`stable`) acceptable
 - [ ] The update path is known — how this service gets patched
 
 ## Operability (security-adjacent)