Add per-service SECURITY.md convention; one role per service

Revise ADR-004 to a service-role standard: every service is its own
self-contained role with a required file set including SECURITY.md, uniform
deploy mechanics, and a deferred shared-engine option (with revisit trigger)
recorded in the ADR.

Add the per-service security record:
- docs/security/service-security-template.md — canonical SECURITY.md template
  (exposure, checklist status, service-specific hardening, residual risks)
- roles/<service>/SECURITY.md is where each service records how it meets the bar;
  /security-review aggregates roles/*/SECURITY.md and cross-checks against config
- service-checklist.md noted as the generic bar the record answers

Wire-up: new-role runbook step writes SECURITY.md from the template; ADR-002
governance bullet points at it; CLAUDE.md role conventions require it and mandate
one-role-per-service; STATUS records the convention as defined-not-yet-applied.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
sjat 2026-06-04 16:09:33 +02:00
parent 19dd89b875
commit 3b029352b6
7 changed files with 112 additions and 18 deletions

@@ -80,6 +80,8 @@ Full design rationale: `docs/decisions/`
- Every role must have `molecule/default/` scenario targeting Debian 13
- Every role must have a populated `README.md`
- Every role must have `meta/main.yml` filled in
- Every **service** role must have a populated `SECURITY.md` (ADR-002/004) — copy `docs/security/service-security-template.md`
- One service = one self-contained role; no shared multi-service roles (ADR-004)
- Role names: `snake_case`, descriptive nouns (`base`, `docker_host`, `reverse_proxy`)
- Use `make new-role NAME=<name>` to scaffold — never create role structure by hand
@@ -169,6 +171,7 @@ Single-contributor, trunk-based (no merge requests / approval gates):
| Security baseline & strategy | `docs/decisions/002-security.md` |
| Accepted security risks | `docs/security/accepted-risks.md` |
| Per-service security checklist | `docs/security/service-checklist.md` |
| Per-service security record (template) | `docs/security/service-security-template.md` |
| Toolchain choices | `docs/decisions/003-toolchain.md` |
| Docker & Compose model | `docs/decisions/004-docker-model.md` |
| Bootstrapping hosts | `docs/decisions/005-bootstrapping.md` |

@@ -24,6 +24,7 @@ _Last reviewed: 2026-05-30._
| `docs/hardware/reference.md` + `scripts/capacity-scan.py` | Present — reference doc (skeleton until real hardware) + stdlib scan; emits capacity JSON |
| `/capacity-review` | Works — on-demand capacity evaluation → `docs/hardware/reviews/`. Intent-based (no live usage yet) |
| ADR-002 security strategy + `docs/security/{accepted-risks,service-checklist}.md` | Present — threat model, principles, governance frame; checklist + risk register are docs, enforced manually in review |
| Service-role standard + per-service `SECURITY.md` convention | Defined (ADR-004 + `docs/security/service-security-template.md`); not yet applied — no service roles exist |
## Scaffolded but empty — NOT implemented

@@ -145,9 +145,12 @@ mechanisms; each lives where change is cheap and is linked from here.
- **Per-service security bar** — every exposed service must clear a defined
checklist before deploy (secrets in vault, no default creds, least-privilege /
non-root, declared firewall ports, reverse-proxy + auth if exposed). Lives in
`docs/security/service-checklist.md`; referenced from `docs/runbooks/new-role.md`.
Enforced manually in review today; the planned `/security-review` will automate it.
non-root, declared firewall ports, reverse-proxy + auth if exposed). The generic
bar lives in `docs/security/service-checklist.md`, and each service
records how it meets the bar (plus service-specific hardening) in its own
`roles/<service>/SECURITY.md`, created from `docs/security/service-security-template.md`
(ADR-004). Enforced manually in review today; the planned `/security-review`
aggregates every `roles/*/SECURITY.md` and cross-checks it against the role's config.
- **Periodic security review** — a recurring review that re-checks posture,
surfaces drift, and re-challenges accepted risks. Planned as a `/security-review`
skill (sibling to `/review-repo`); see `docs/TODO.md` (Scheduled work). Not built

@@ -28,16 +28,44 @@ defines how services are structured, deployed, and maintained.
All services live under `/opt/services/`. The path is defined in
`group_vars/all/vars.yml` as `services__base_dir`.
## Compose file delivery
## Service-role standard
Each service has a corresponding Ansible role (or is managed by a shared role
with per-service variables). The role:
**Every service has its own self-contained role** — one service, one role. Shared
roles serving multiple services are no longer used (see "Why not a shared engine"
below). Each service role contains a standard set of files:
1. Creates `/opt/services/servicename/` directory
2. Renders `docker-compose.yml` from `templates/docker-compose.yml.j2`
3. Renders `.env` from `templates/env.j2` (pulling secrets from vault variables)
4. Runs `docker compose up -d --remove-orphans` via `ansible.builtin.command`
5. Optionally runs `docker compose pull` before up (controlled by variable)
| File | Purpose |
|---|---|
| `tasks/main.yml` | The standard deploy mechanics (below) |
| `templates/docker-compose.yml.j2` | The Compose definition |
| `templates/env.j2` | `.env` rendered from vault variables |
| `defaults/main.yml` | Tuneables, `rolename__` namespace |
| `README.md` | Purpose, variables, usage (role convention) |
| `SECURITY.md` | Per-service security record — see ADR-002 and `docs/security/service-security-template.md` |
| `meta/main.yml`, `molecule/default/` | Metadata + Debian 13 test scenario |
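The required file set in the table above lends itself to a mechanical check. The following is a minimal sketch, not part of this commit: the function name `missing_entries` and the choice to treat an empty `README.md` or `SECURITY.md` as missing (one reading of "populated") are assumptions made for illustration.

```python
from pathlib import Path

# Entries taken from the service-role standard table; order follows the table.
REQUIRED_ENTRIES = (
    "tasks/main.yml",
    "templates/docker-compose.yml.j2",
    "templates/env.j2",
    "defaults/main.yml",
    "README.md",
    "SECURITY.md",
    "meta/main.yml",
    "molecule/default",
)

# Assumption: "populated" is approximated as "file is not empty".
MUST_BE_POPULATED = {"README.md", "SECURITY.md"}


def missing_entries(role_dir: Path) -> list[str]:
    """Return the required entries that are absent (or empty) in role_dir."""
    missing = []
    for entry in REQUIRED_ENTRIES:
        path = role_dir / entry
        if not path.exists():
            missing.append(entry)
        elif entry in MUST_BE_POPULATED and path.stat().st_size == 0:
            missing.append(entry)
    return missing
```

Calling `missing_entries(Path("roles/<service>"))` for each service role would give a reviewer a quick list of gaps. It does not judge whether the content of a file is adequate.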
### Standard deploy mechanics
Every service role's `tasks/main.yml` follows the same sequence, so all roles are
uniform and predictable:
1. Create `/opt/services/<service>/` directory
2. Render `docker-compose.yml` from `templates/docker-compose.yml.j2`
3. Render `.env` from `templates/env.j2` (secrets from vault variables)
4. Run `docker compose up -d --remove-orphans` via `ansible.builtin.command`
5. Optionally run `docker compose pull` before up (controlled by a variable)
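Step 5 is listed last but runs before step 4. The sketch below spells out the resulting command order in plain Python. It is an illustration only: the real roles run these commands through `ansible.builtin.command`, and the helper name `compose_commands`, its parameters, and the use of `--project-directory` are assumptions rather than anything this ADR prescribes.

```python
from pathlib import PurePosixPath


def compose_commands(
    service: str,
    base_dir: str = "/opt/services",  # mirrors services__base_dir from the ADR
    pull: bool = False,
) -> list[list[str]]:
    """Return the docker compose invocations for one service, in run order."""
    project_dir = str(PurePosixPath(base_dir) / service)
    base = ["docker", "compose", "--project-directory", project_dir]
    commands = []
    if pull:
        # The optional pull comes first so that "up" starts the fresh images.
        commands.append(base + ["pull"])
    commands.append(base + ["up", "-d", "--remove-orphans"])
    return commands
```

With `pull=True` the function yields two commands, `pull` followed by `up -d --remove-orphans`. With the default it yields only the `up` command.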
### Why not a shared engine
A shared `compose_service` engine role — service roles delegating the mechanics to
one place — is **intentionally not built**. Duplicating the ~5 standard tasks per
role is accepted in favour of legible, self-contained roles a reader can understand
without indirection, and AI authorship makes the duplication cheap to generate
uniformly from this standard.
**Revisit trigger:** extract a shared engine role if maintaining the duplicated
mechanics across service roles becomes painful — a pattern change that means editing
many roles, or drift between them that this standard alone isn't preventing.
## Docker daemon configuration

@@ -71,14 +71,16 @@ Fix any lint or test failures before committing.
Add the role to the appropriate playbook in `playbooks/` and add the host group
to `inventories/staging/hosts.yml` for integration testing.
### 9. Clear the security checklist (services)
### 9. Write the per-service security record (services)
If the role is a **service** — especially one reachable beyond its own host —
walk `docs/security/service-checklist.md` and confirm every item passes (secrets
in vault, no default creds, least-privilege, declared firewall ports, behind the
reverse proxy with auth if exposed). Record any conscious deviation in
`docs/security/accepted-risks.md`. This bar is established by ADR-002; enforcement
is manual in review today, with the planned `/security-review` to automate it.
For a **service** role, copy `docs/security/service-security-template.md` to
`roles/<rolename>/SECURITY.md` and fill it in: exposure, the checklist status
(from `docs/security/service-checklist.md`), service-specific hardening, and any
residual/accepted risks. Filling the **Checklist status** section is how the
service clears the security bar — record any conscious deviation in
`docs/security/accepted-risks.md`. The bar is established by ADR-002; enforcement is
manual in review today, with the planned `/security-review` aggregating every
`roles/*/SECURITY.md` to automate it.
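The `/security-review` skill is only planned, so its design is not fixed. As a rough sketch of the aggregation half, assuming a helper called `collect_security_records` (a hypothetical name) and a flat `roles/` directory, the collection step might look like this:

```python
from pathlib import Path


def collect_security_records(roles_dir: Path) -> tuple[dict[str, str], list[str]]:
    """Gather roles/*/SECURITY.md.

    Returns a mapping of role name to record text, plus the names of role
    directories that have no SECURITY.md.
    """
    records: dict[str, str] = {}
    missing: list[str] = []
    for role_dir in sorted(p for p in roles_dir.iterdir() if p.is_dir()):
        record = role_dir / "SECURITY.md"
        if record.is_file():
            records[role_dir.name] = record.read_text(encoding="utf-8")
        else:
            missing.append(role_dir.name)
    return records, missing
```

The sketch cannot tell service roles from non-service roles such as `base`, so a name in the second list is a prompt for a human to look, not an automatic failure. Cross-checking the claims against the Compose template and firewall variables is left out entirely.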
### 10. Commit

@@ -9,6 +9,10 @@ Enforced manually in review today; the planned `/security-review` skill (see
Treat each item as must-pass **unless** a deviation is recorded in
`docs/security/accepted-risks.md` with a rationale and a revisit trigger.
This checklist is the generic **bar**. Each service answers it in its own
`roles/<service>/SECURITY.md` (the "Checklist status" section), created from
`docs/security/service-security-template.md` — see ADR-004.
## Secrets & credentials
- [ ] All secrets live in an encrypted `vault.yml` (`vault.<service>.<key>`); none in

@@ -0,0 +1,53 @@
# Per-service security record — template
Copy this file to `roles/<service>/SECURITY.md` and fill it in when building a
service role (ADR-004). It is the per-service security **record**: the generic bar
lives in `docs/security/service-checklist.md` (the questions); this file is *this
service's answers, plus service-specific measures*. Filling the **Checklist status**
section is how a service clears the bar (`docs/runbooks/new-role.md`).
The planned `/security-review` aggregates every `roles/*/SECURITY.md` and
cross-checks the claims here against the role's actual config (Compose template,
declared firewall ports), so keep this honest and current.
Delete this preamble in the copy and start from the heading below.
---
# Security — &lt;service&gt;
## Exposure
- **Published ports:** &lt;ports&gt; — and which are declared in the `group_vars` firewall vars
- **Auth surface:** &lt;how it authenticates — e.g. behind reverse proxy + SSO, or app-native login&gt;
- **Reachability:** &lt;which VLANs / sources can reach it; internal-only vs LAN vs public&gt;
- **Data sensitivity:** &lt;what data it holds; backup/restore pointer&gt;
## Checklist status
Each item from `docs/security/service-checklist.md`, with this service's status:
✅ met · ⚠️ deviation (link to `docs/security/accepted-risks.md`) · n/a.
- [ ] Secrets in vault; no default creds; nothing secret in git/images
- [ ] Non-root; no `privileged`/host-network unless justified; minimal mounts; caps dropped
- [ ] Ports declared in `group_vars`; behind reverse proxy + auth if exposed; least-privilege inter-service reach
- [ ] Image pinned (tag/digest), update path known
- [ ] Logs reviewable; backup/restore covered if stateful
## Service-specific hardening
Measures beyond the generic bar — the application's own relevant settings. Examples
of the *kind* of thing that belongs here (replace with this service's actual measures):
- App-level brute-force / rate-limiting protection enabled
- Trusted-domains / allowed-hosts restricted
- Unused features or remote-access modes disabled
- Server-side encryption / secure cookie / security-header settings
## Residual / accepted risks
Anything not done for this service — each with rationale and a revisit trigger.
Link global trade-offs to `docs/security/accepted-risks.md`; note service-local ones
here.
- &lt;none yet, or: risk · rationale · revisit trigger&gt;
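Because the **Checklist status** section of the template uses Markdown task-list items, a filled-in record can be read by a script. The sketch below is one possible approach and is not part of the template. It assumes that a ticked box (`[x]`) marks an item as addressed, which the template does not state explicitly, and the name `checklist_items` is hypothetical.

```python
import re

# Matches "- [ ] text" and "- [x] text" task-list items.
ITEM = re.compile(r"^- \[( |x|X)\] (.+)$")


def checklist_items(
    markdown: str, heading: str = "## Checklist status"
) -> list[tuple[bool, str]]:
    """Return (ticked, text) for each task-list item under the given heading."""
    items = []
    in_section = False
    for line in markdown.splitlines():
        if line.startswith("## "):
            # Any level-2 heading opens or closes the section of interest.
            in_section = line.strip() == heading
            continue
        if in_section:
            match = ITEM.match(line.strip())
            if match:
                items.append((match.group(1) != " ", match.group(2)))
    return items
```

A review could then list every unticked item per service. The ✅ / ⚠️ / n/a markers from the legend would need their own parsing rule once their exact placement in a filled-in record is settled.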