Add per-service SECURITY.md convention; one role per service

Revise ADR-004 to a service-role standard: every service is its own
self-contained role with a required file set including SECURITY.md, uniform
deploy mechanics, and a deferred shared-engine option (with revisit trigger)
recorded in the ADR.

Add the per-service security record:
- docs/security/service-security-template.md: canonical SECURITY.md template
  (exposure, checklist status, service-specific hardening, residual risks)
- roles/<service>/SECURITY.md is where each service records how it meets the
  bar; /security-review aggregates roles/*/SECURITY.md and cross-checks
  against config
- service-checklist.md noted as the generic bar the record answers

Wire-up: new-role runbook step writes SECURITY.md from the template; ADR-002
governance bullet points at it; CLAUDE.md role conventions require it and
mandate one-role-per-service; STATUS records the convention as
defined-not-yet-applied.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-04 16:09:33 +02:00
commit 3b029352b6
parent 19dd89b875
7 changed files with 112 additions and 18 deletions
--- a/CLAUDE.md
+++ b/CLAUDE.md
@@ -80,6 +80,8 @@ Full design rationale: `docs/decisions/`
 - Every role must have `molecule/default/` scenario targeting Debian 13
 - Every role must have a populated `README.md`
 - Every role must have `meta/main.yml` filled in
+- Every **service** role must have a populated `SECURITY.md` (ADR-002/004) — copy `docs/security/service-security-template.md`
+- One service = one self-contained role; no shared multi-service roles (ADR-004)
 - Role names: `snake_case`, descriptive nouns (`base`, `docker_host`, `reverse_proxy`)
 - Use `make new-role NAME=<name>` to scaffold — never create role structure by hand

@@ -169,6 +171,7 @@ Single-contributor, trunk-based (no merge requests / approval gates):
 | Security baseline & strategy | `docs/decisions/002-security.md`      |
 | Accepted security risks | `docs/security/accepted-risks.md`     |
 | Per-service security checklist | `docs/security/service-checklist.md` |
+| Per-service security record (template) | `docs/security/service-security-template.md` |
 | Toolchain choices      | `docs/decisions/003-toolchain.md`     |
 | Docker & Compose model | `docs/decisions/004-docker-model.md`  |
 | Bootstrapping hosts    | `docs/decisions/005-bootstrapping.md` |
--- a/STATUS.md
+++ b/STATUS.md
@@ -24,6 +24,7 @@ _Last reviewed: 2026-05-30._
 | `docs/hardware/reference.md` + `scripts/capacity-scan.py` | Present — reference doc (skeleton until real hardware) + stdlib scan; emits capacity JSON |
 | `/capacity-review` | Works — on-demand capacity evaluation → `docs/hardware/reviews/`. Intent-based (no live usage yet) |
 | ADR-002 security strategy + `docs/security/{accepted-risks,service-checklist}.md` | Present — threat model, principles, governance frame; checklist + risk register are docs, enforced manually in review |
+| Service-role standard + per-service `SECURITY.md` convention | Defined (ADR-004 + `docs/security/service-security-template.md`); not yet applied — no service roles exist |

 ## Scaffolded but empty — NOT implemented

--- a/docs/decisions/002-security.md
+++ b/docs/decisions/002-security.md
@@ -145,9 +145,12 @@ mechanisms; each lives where change is cheap and is linked from here.

 - **Per-service security bar** — every exposed service must clear a defined
  checklist before deploy (secrets in vault, no default creds, least-privilege /
-  non-root, declared firewall ports, reverse-proxy + auth if exposed). Lives in
-  `docs/security/service-checklist.md`; referenced from `docs/runbooks/new-role.md`.
-  Enforced manually in review today; the planned `/security-review` will automate it.
+  non-root, declared firewall ports, reverse-proxy + auth if exposed). The generic
+  bar lives in `docs/security/service-checklist.md`, and each service
+  records how it meets the bar (plus service-specific hardening) in its own
+  `roles/<service>/SECURITY.md`, created from `docs/security/service-security-template.md`
+  (ADR-004). Enforced manually in review today; the planned `/security-review`
+  aggregates every `roles/*/SECURITY.md` and cross-checks it against the role's config.
 - **Periodic security review** — a recurring review that re-checks posture,
  surfaces drift, and re-challenges accepted risks. Planned as a `/security-review`
  skill (sibling to `/review-repo`); see `docs/TODO.md` (Scheduled work). Not built
--- a/docs/decisions/004-docker-model.md
+++ b/docs/decisions/004-docker-model.md
@@ -28,16 +28,44 @@ defines how services are structured, deployed, and maintained.
 All services live under `/opt/services/`. The path is defined in
 `group_vars/all/vars.yml` as `services__base_dir`.

-## Compose file delivery
+## Service-role standard

-Each service has a corresponding Ansible role (or is managed by a shared role
-with per-service variables). The role:
+**Every service has its own self-contained role** — one service, one role. Shared
+roles serving multiple services are no longer used (see "Why not a shared engine"
+below). Each service role contains a standard set of files:

-1. Creates `/opt/services/servicename/` directory
-2. Renders `docker-compose.yml` from `templates/docker-compose.yml.j2`
-3. Renders `.env` from `templates/env.j2` (pulling secrets from vault variables)
-4. Runs `docker compose up -d --remove-orphans` via `ansible.builtin.command`
-5. Optionally runs `docker compose pull` before up (controlled by variable)
+| File | Purpose |
+|---|---|
+| `tasks/main.yml` | The standard deploy mechanics (below) |
+| `templates/docker-compose.yml.j2` | The Compose definition |
+| `templates/env.j2` | `.env` rendered from vault variables |
+| `defaults/main.yml` | Tuneables, `rolename__` namespace |
+| `README.md` | Purpose, variables, usage (role convention) |
+| `SECURITY.md` | Per-service security record — see ADR-002 and `docs/security/service-security-template.md` |
+| `meta/main.yml`, `molecule/default/` | Metadata + Debian 13 test scenario |
+
+### Standard deploy mechanics
+
+Every service role's `tasks/main.yml` follows the same sequence, so all roles are
+uniform and predictable:
+
+1. Create `/opt/services/<service>/` directory
+2. Render `docker-compose.yml` from `templates/docker-compose.yml.j2`
+3. Render `.env` from `templates/env.j2` (secrets from vault variables)
+4. Run `docker compose up -d --remove-orphans` via `ansible.builtin.command`
+5. Optionally run `docker compose pull` before up (controlled by a variable)
+
+### Why not a shared engine
+
+A shared `compose_service` engine role — service roles delegating the mechanics to
+one place — is **intentionally not built**. Duplicating the ~5 standard tasks per
+role is accepted in favour of legible, self-contained roles a reader can understand
+without indirection, and AI authorship makes the duplication cheap to generate
+uniformly from this standard.
+
+**Revisit trigger:** extract a shared engine role if maintaining the duplicated
+mechanics across service roles becomes painful — a pattern change that means editing
+many roles, or drift between them that this standard alone isn't preventing.

 ## Docker daemon configuration

--- a/docs/runbooks/new-role.md
+++ b/docs/runbooks/new-role.md
@@ -71,14 +71,16 @@ Fix any lint or test failures before committing.
 Add the role to the appropriate playbook in `playbooks/` and add the host group
 to `inventories/staging/hosts.yml` for integration testing.

-### 9. Clear the security checklist (services)
+### 9. Write the per-service security record (services)

-If the role is a **service** — especially one reachable beyond its own host —
-walk `docs/security/service-checklist.md` and confirm every item passes (secrets
-in vault, no default creds, least-privilege, declared firewall ports, behind the
-reverse proxy with auth if exposed). Record any conscious deviation in
-`docs/security/accepted-risks.md`. This bar is established by ADR-002; enforcement
-is manual in review today, with the planned `/security-review` to automate it.
+For a **service** role, copy `docs/security/service-security-template.md` to
+`roles/<rolename>/SECURITY.md` and fill it in: exposure, the checklist status
+(from `docs/security/service-checklist.md`), service-specific hardening, and any
+residual/accepted risks. Filling the **Checklist status** section is how the
+service clears the security bar — record any conscious deviation in
+`docs/security/accepted-risks.md`. The bar is established by ADR-002; enforcement is
+manual in review today, with the planned `/security-review` aggregating every
+`roles/*/SECURITY.md` to automate it.

 ### 10. Commit

--- a/docs/security/service-checklist.md
+++ b/docs/security/service-checklist.md
@@ -9,6 +9,10 @@ Enforced manually in review today; the planned `/security-review` skill (see
 Treat each item as must-pass **unless** a deviation is recorded in
 `docs/security/accepted-risks.md` with a rationale and a revisit trigger.

+This checklist is the generic **bar**. Each service answers it in its own
+`roles/<service>/SECURITY.md` (the "Checklist status" section), created from
+`docs/security/service-security-template.md` — see ADR-004.
+
 ## Secrets & credentials

 - [ ] All secrets live in an encrypted `vault.yml` (`vault.<service>.<key>`); none in
--- a/docs/security/service-security-template.md
+++ b/docs/security/service-security-template.md
@@ -0,0 +1,53 @@
+# Per-service security record — template
+
+Copy this file to `roles/<service>/SECURITY.md` and fill it in when building a
+service role (ADR-004). It is the per-service security **record**: the generic bar
+lives in `docs/security/service-checklist.md` (the questions); this file is *this
+service's answers, plus service-specific measures*. Filling the **Checklist status**
+section is how a service clears the bar (`docs/runbooks/new-role.md`).
+
+The planned `/security-review` aggregates every `roles/*/SECURITY.md` and
+cross-checks the claims here against the role's actual config (Compose template,
+declared firewall ports), so keep this honest and current.
+
+Delete this preamble in the copy and start from the heading below.
+
+---
+
+# Security — &lt;service&gt;
+
+## Exposure
+
+- **Published ports:** &lt;ports&gt; — and which are declared in the `group_vars` firewall vars
+- **Auth surface:** &lt;how it authenticates — e.g. behind reverse proxy + SSO, or app-native login&gt;
+- **Reachability:** &lt;which VLANs / sources can reach it; internal-only vs LAN vs public&gt;
+- **Data sensitivity:** &lt;what data it holds; backup/restore pointer&gt;
+
+## Checklist status
+
+Each item from `docs/security/service-checklist.md`, with this service's status:
+✅ met · ⚠️ deviation (link to `docs/security/accepted-risks.md`) · n/a.
+
+- [ ] Secrets in vault; no default creds; nothing secret in git/images
+- [ ] Non-root; no `privileged`/host-network unless justified; minimal mounts; caps dropped
+- [ ] Ports declared in `group_vars`; behind reverse proxy + auth if exposed; least-privilege inter-service reach
+- [ ] Image pinned (tag/digest), update path known
+- [ ] Logs reviewable; backup/restore covered if stateful
+
+## Service-specific hardening
+
+Measures beyond the generic bar — the application's own relevant settings. Examples
+of the *kind* of thing that belongs here (replace with this service's actual measures):
+
+- App-level brute-force / rate-limiting protection enabled
+- Trusted-domains / allowed-hosts restricted
+- Unused features or remote-access modes disabled
+- Server-side encryption / secure cookie / security-header settings
+
+## Residual / accepted risks
+
+Anything not done for this service — each with rationale and a revisit trigger.
+Link global trade-offs to `docs/security/accepted-risks.md`; note service-local ones
+here.
+
+- &lt;none yet, or: risk · rationale · revisit trigger&gt;
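
Reviewer note (not part of the commit): the `/security-review` aggregation over `roles/*/SECURITY.md` is described above as planned, not built. The following is a minimal Python sketch of what the aggregation half could look like, under stated assumptions. The function names `checklist_status` and `aggregate` are hypothetical, and so is the convention that a ticked `- [x]` box means "met"; the template itself only lists the status markers and does not say how a box is ticked. The sketch does not attempt the cross-check against the role's config, and it cannot tell service roles from non-service roles such as `base`, so it reports a missing record as `None` rather than as a failure.

```python
import re
from pathlib import Path

# Assumption: checklist items are Markdown task-list lines, "- [ ] text" or "- [x] text".
CHECKBOX = re.compile(r"^- \[( |x|X)\] (.+)$")


def checklist_status(security_md_text):
    """Return (met_count, open_items) for the '## Checklist status' section only."""
    met, open_items = 0, []
    in_section = False
    for line in security_md_text.splitlines():
        if line.startswith("## "):
            # Track whether we are inside the section the template names.
            in_section = line.strip() == "## Checklist status"
            continue
        if not in_section:
            continue
        match = CHECKBOX.match(line.strip())
        if match:
            if match.group(1).lower() == "x":
                met += 1
            else:
                open_items.append(match.group(2))
    return met, open_items


def aggregate(repo_root):
    """Map each role directory name to its checklist summary.

    A role without SECURITY.md maps to None; deciding whether that role is a
    service (and therefore must have one) is left to the reviewer.
    """
    roles_dir = Path(repo_root) / "roles"
    report = {}
    if not roles_dir.is_dir():
        return report
    for role_dir in sorted(roles_dir.iterdir()):
        if not role_dir.is_dir():
            continue
        record = role_dir / "SECURITY.md"
        if record.is_file():
            report[role_dir.name] = checklist_status(record.read_text(encoding="utf-8"))
        else:
            report[role_dir.name] = None
    return report


if __name__ == "__main__":
    for role, status in aggregate(".").items():
        if status is None:
            print(f"{role}: no SECURITY.md")
        else:
            met, open_items = status
            print(f"{role}: {met} met, {len(open_items)} open")
```

Scoping the parse to the one named section keeps checkboxes elsewhere in the record (for example under a hardening heading) from being counted against the generic bar.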