boma/docs/decisions/004-docker-model.md

# ADR-004 — Docker and Compose service model

## Context

All services run as Docker containers managed via Docker Compose. This document
defines how services are structured, deployed, and maintained.

## Core principles

- **No hand-edited files on hosts**: all Compose files are rendered by Ansible
  from Jinja2 templates. If a file exists on a host, it was put there by Ansible.
- **Compose per service**: each service (or tightly coupled service group) gets
  its own Compose file and directory under a standard path.
- **Variables drive differences**: the same template renders differently per host
  via `group_vars` and `host_vars`. No host-specific templates.

## Directory layout on hosts

```
/opt/services/
├── servicename/
│   ├── docker-compose.yml    # rendered by Ansible, never edited manually
│   ├── .env                  # rendered by Ansible from vault variables
│   └── data/                 # persistent volumes (bind mounts)
│       └── ...
```

All services live under `/opt/services/`. The path is defined in
`group_vars/all/vars.yml` as `services__base_dir`.
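
As a minimal sketch, the variable definition might look like the following. Only the variable name, file, and path come from this document; the comment wording is illustrative.

```yaml
# group_vars/all/vars.yml
# Base directory under which every service gets its own subdirectory.
services__base_dir: /opt/services
```

Templates and tasks can then build paths from `{{ services__base_dir }}` rather than hard-coding `/opt/services/`.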

## Service-role standard

**Every service has its own self-contained role** — one service, one role. Shared
roles serving multiple services are no longer used (see "Why not a shared engine"
below). Each service role contains a standard set of files:

| File | Purpose |
|---|---|
| `tasks/main.yml` | The standard deploy mechanics (below) |
| `templates/docker-compose.yml.j2` | The Compose definition |
| `templates/env.j2` | `.env` rendered from vault variables |
| `defaults/main.yml` | Tunables, in the `rolename__` variable namespace |
| `README.md` | Purpose, variables, usage (role convention) |
| `SECURITY.md` | Per-service security record — see ADR-002 and `docs/security/service-security-template.md` |
| `meta/main.yml`, `molecule/default/` | Metadata + Debian 13 test scenario |

### Standard deploy mechanics

Every service role's `tasks/main.yml` follows the same sequence, so all roles are
uniform and predictable:

1. Create the `/opt/services/<service>/` directory
2. Render `docker-compose.yml` from `templates/docker-compose.yml.j2`
3. Render `.env` from `templates/env.j2` (secrets from vault variables)
4. Optionally run `docker compose pull` (controlled by a variable)
5. Run `docker compose up -d --remove-orphans` via `ansible.builtin.command`
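
The sequence above could be sketched as the following `tasks/main.yml`. This is an illustration under stated assumptions, not the repository's actual role: the role name `servicename`, the `servicename__pull` variable, and the file modes are hypothetical.

```yaml
# roles/servicename/tasks/main.yml (sketch; names and modes are assumptions)
- name: Create service directory
  ansible.builtin.file:
    path: "{{ services__base_dir }}/servicename"
    state: directory
    mode: "0750"

- name: Render docker-compose.yml
  ansible.builtin.template:
    src: docker-compose.yml.j2
    dest: "{{ services__base_dir }}/servicename/docker-compose.yml"
    mode: "0640"

- name: Render .env from vault variables
  ansible.builtin.template:
    src: env.j2
    dest: "{{ services__base_dir }}/servicename/.env"
    mode: "0600"
  no_log: true  # keep rendered secrets out of task output

- name: Pull images before starting (optional)
  ansible.builtin.command:
    cmd: docker compose pull
    chdir: "{{ services__base_dir }}/servicename"
  when: servicename__pull | default(false) | bool
  changed_when: true  # command cannot tell whether anything was pulled

- name: Start or update the service
  ansible.builtin.command:
    cmd: docker compose up -d --remove-orphans
    chdir: "{{ services__base_dir }}/servicename"
  changed_when: true  # reported as changed on every run
```

Because `ansible.builtin.command` has no built-in change detection, this sketch reports the last two tasks as changed on every run; a real role may want a stricter `changed_when` condition.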

### Why not a shared engine

A shared `compose_service` engine role, to which service roles would delegate the
mechanics, is **intentionally not built**. Duplicating the ~5 standard tasks per
role is accepted in favour of legible, self-contained roles that a reader can
understand without indirection. AI authorship also makes the duplication cheap to
generate uniformly from this standard.

**Revisit trigger:** extract a shared engine role if maintaining the duplicated
mechanics across service roles becomes painful: for example, a pattern change that
requires editing many roles, or drift between roles that this standard alone is not
preventing.

## Docker daemon configuration

Managed by the `docker_host` role. Key settings:

- `"log-driver": "json-file"` with size limits (prevents disk exhaustion)
- `"iptables": false` — firewall managed entirely by nftables (see ADR-002)
- TCP socket disabled — Unix socket only (`/var/run/docker.sock`)
- User namespace remapping: evaluated per use case, not enabled by default
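
A minimal sketch of how the `docker_host` role might carry these settings as a variable. The variable name `docker_host__daemon_config` and the size limits are assumptions for illustration; the keys themselves are standard Docker daemon options.

```yaml
# roles/docker_host/defaults/main.yml (sketch; variable name is hypothetical)
docker_host__daemon_config:
  log-driver: json-file
  log-opts:
    max-size: "10m"   # illustrative limit per log file
    max-file: "3"     # illustrative number of rotated files
  iptables: false     # nftables owns the firewall (see ADR-002)
# A template task could render this to /etc/docker/daemon.json with:
#   {{ docker_host__daemon_config | to_nice_json }}
# No "hosts" key is set, so this config adds no TCP listener.
```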

## Networking

- Each service Compose file defines its own named network(s)
- Services that need to communicate are placed on a shared named network
  defined in a dedicated `docker-compose.networks.yml` (if cross-service
  networking is needed on a host)
- External port publishing is explicit and matches nftables rules
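
On the service side, the rules above might look like the sketch below. The service name `app`, the image, the network names, and the port numbers are all hypothetical, and the shared network is assumed to already exist on the host.

```yaml
# servicename/docker-compose.yml (sketch; all names and ports are placeholders)
services:
  app:
    image: "example/app:stable"
    ports:
      - "8080:80"          # explicit publish; needs a matching nftables rule
    networks:
      - internal
      - shared_backend

networks:
  internal: {}             # the service's own named network
  shared_backend:
    external: true         # created outside this Compose file
```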

## Image management

- Image pinning follows the tiered model in ADR-011: **stateful** services pin
  `tag@digest` (readable tag + integrity digest); **stateless** services use rolling
  tags (`latest`/`stable`), refreshed deliberately and watched by DIUN
- Bare `latest` is therefore acceptable only on the stateless tier; the stateful tier
  is always pinned
- Image updates are a deliberate operation: update the tag/digest variable, run deploy
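
For a stateful service, the `tag@digest` pin could be driven by role variables along these lines. The role name `exampledb`, the variable names, and the image are assumptions, and the digest is deliberately left as a placeholder rather than a real value.

```yaml
# roles/exampledb/defaults/main.yml (sketch; hypothetical stateful service)
exampledb__image: "postgres"
exampledb__image_tag: "17"
exampledb__image_digest: "sha256:REPLACE_WITH_REAL_DIGEST"  # placeholder
# templates/docker-compose.yml.j2 would then reference:
#   image: "{{ exampledb__image }}:{{ exampledb__image_tag }}@{{ exampledb__image_digest }}"
```

Updating the image then means changing the tag and digest variables together and running the deploy, which keeps the change visible in a single diff.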

## Persistent data

- Bind mounts preferred over named volumes for data that must be backed up
- All bind mount paths are under `/opt/services/<name>/data/`
- Backup strategy is defined separately (not in scope of this repo)
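
In a Compose template, a bind mount following this layout might be written as below. The service name, image, and container path are hypothetical.

```yaml
# templates/docker-compose.yml.j2 excerpt (sketch; names are placeholders)
services:
  app:
    image: "example/app:stable"
    volumes:
      # host path under the standard data directory : path inside the container
      - "{{ services__base_dir }}/servicename/data/app:/var/lib/app"
```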

## Decision

Docker Compose was chosen over Kubernetes/Swarm because:
- Appropriate complexity level for 2–5 hosts with independent service sets
- Compose files are human-readable and easily auditable
- No distributed state to manage
- Straightforward to back up and restore