Compare commits
No commits in common. "19dd89b8758e60b754c0c394519df89f678520b6" and "c57910eda87d97f3710806c636e4d2f4a863d609" have entirely different histories.
19dd89b875...c57910eda8
7 changed files with 24 additions and 245 deletions
|
|
@@ -154,10 +154,6 @@ Single-contributor, trunk-based (no merge requests / approval gates):
|
||||||
- Edit vault-encrypted files directly — decrypt first, re-encrypt after
|
- Edit vault-encrypted files directly — decrypt first, re-encrypt after
|
||||||
- Force-push or rewrite already-pushed history on `main`
|
- Force-push or rewrite already-pushed history on `main`
|
||||||
- Add a collection to `requirements.yml` without a specific module need in existing role tasks
|
- Add a collection to `requirements.yml` without a specific module need in existing role tasks
|
||||||
- Open a firewall port anywhere but the `group_vars` firewall definitions — never ad-hoc on a host (ADR-002)
|
|
||||||
- Disable or weaken a baseline control from ADR-002 (SSH hardening, nftables default-deny, fail2ban, auditd)
|
|
||||||
- Expose a service to the LAN/WAN without it sitting behind the reverse proxy with authentication (ADR-002)
|
|
||||||
- Deploy a service that hasn't cleared `docs/security/service-checklist.md` (record any deviation in `docs/security/accepted-risks.md`)
|
|
||||||
|
|
||||||
---
|
---
|
||||||
|
|
||||||
|
|
@@ -166,9 +162,7 @@ Single-contributor, trunk-based (no merge requests / approval gates):
|
||||||
| Topic | File |
|
| Topic | File |
|
||||||
|------------------------|---------------------------------------|
|
|------------------------|---------------------------------------|
|
||||||
| Architecture overview | `docs/decisions/001-architecture.md` |
|
| Architecture overview | `docs/decisions/001-architecture.md` |
|
||||||
| Security baseline & strategy | `docs/decisions/002-security.md` |
|
| Security baseline | `docs/decisions/002-security.md` |
|
||||||
| Accepted security risks | `docs/security/accepted-risks.md` |
|
|
||||||
| Per-service security checklist | `docs/security/service-checklist.md` |
|
|
||||||
| Toolchain choices | `docs/decisions/003-toolchain.md` |
|
| Toolchain choices | `docs/decisions/003-toolchain.md` |
|
||||||
| Docker & Compose model | `docs/decisions/004-docker-model.md` |
|
| Docker & Compose model | `docs/decisions/004-docker-model.md` |
|
||||||
| Bootstrapping hosts | `docs/decisions/005-bootstrapping.md` |
|
| Bootstrapping hosts | `docs/decisions/005-bootstrapping.md` |
|
||||||
|
|
|
||||||
|
|
@@ -23,7 +23,6 @@ _Last reviewed: 2026-05-30._
|
||||||
| Terraform HCL (`terraform/`) | Written (proxmox VM module + envs) — but never run; see below |
|
| Terraform HCL (`terraform/`) | Written (proxmox VM module + envs) — but never run; see below |
|
||||||
| `docs/hardware/reference.md` + `scripts/capacity-scan.py` | Present — reference doc (skeleton until real hardware) + stdlib scan; emits capacity JSON |
|
| `docs/hardware/reference.md` + `scripts/capacity-scan.py` | Present — reference doc (skeleton until real hardware) + stdlib scan; emits capacity JSON |
|
||||||
| `/capacity-review` | Works — on-demand capacity evaluation → `docs/hardware/reviews/`. Intent-based (no live usage yet) |
|
| `/capacity-review` | Works — on-demand capacity evaluation → `docs/hardware/reviews/`. Intent-based (no live usage yet) |
|
||||||
| ADR-002 security strategy + `docs/security/{accepted-risks,service-checklist}.md` | Present — threat model, principles, governance frame; checklist + risk register are docs, enforced manually in review |
|
|
||||||
|
|
||||||
## Scaffolded but empty — NOT implemented
|
## Scaffolded but empty — NOT implemented
|
||||||
|
|
||||||
|
|
@@ -48,9 +47,6 @@ So `make deploy PLAYBOOK=site` currently **fails** on a clean clone — the `bas
|
||||||
| Per-service roles | ADR-004 | Model defined; no service roles built |
|
| Per-service roles | ADR-004 | Model defined; no service roles built |
|
||||||
| Forgejo Actions CI | ADR-003 / ADR-008 | Remote is live (pushed); Actions/`act_runner` pipeline not yet built |
|
| Forgejo Actions CI | ADR-003 / ADR-008 | Remote is live (pushed); Actions/`act_runner` pipeline not yet built |
|
||||||
| Live usage stats for `/capacity-review` | ADR-012 / TODO 8.4 | `gather_usage()` stubbed; source undecided (Proxmox RRD vs PLG stack); needs the cluster |
|
| Live usage stats for `/capacity-review` | ADR-012 / TODO 8.4 | `gather_usage()` stubbed; source undecided (Proxmox RRD vs PLG stack); needs the cluster |
|
||||||
| `/security-review` skill | ADR-002 / TODO 8.5 | Periodic posture re-check + accepted-risk re-challenge; planned, not built |
|
|
||||||
| CIS hardening (Debian L1+L2 + Docker) | ADR-002 / TODO 15 | Implemented by the (unbuilt) `base`/`docker_host` roles; brings AppArmor + AIDE as baseline. L2 partitions affect VM provisioning (ADR-006) |
|
|
||||||
| Network IDS + security alerting | ADR-002 / TODO 15 | Suricata on OPNsense + AIDE/`auditd`/`fail2ban` alerting into the monitoring stack; not built |
|
|
||||||
|
|
||||||
## Keeping this honest
|
## Keeping this honest
|
||||||
|
|
||||||
|
|
|
||||||
26
docs/TODO.md
|
|
@@ -60,12 +60,6 @@
|
||||||
Prometheus/Loki/Grafana/Grafana-Alloy stack we will likely set up anyway
|
Prometheus/Loki/Grafana/Grafana-Alloy stack we will likely set up anyway
|
||||||
(richer, per-process, but more to run) — see TODO 3.6. Don't build the
|
(richer, per-process, but more to run) — see TODO 3.6. Don't build the
|
||||||
Proxmox-RRD hook before settling this, to avoid throwaway work.
|
Proxmox-RRD hook before settling this, to avoid throwaway work.
|
||||||
5. Build a `/security-review` skill (sibling to `/review-repo`): re-check the
|
|
||||||
security posture against ADR-002, surface drift, and re-challenge the
|
|
||||||
accepted-risk register (`docs/security/accepted-risks.md`). Could pair a
|
|
||||||
deterministic pre-scan (undeclared open ports, disabled baseline controls,
|
|
||||||
world-readable secrets, services not behind auth) with a judgement pass.
|
|
||||||
Open question: standalone, or folded into the kaizen `/retro` (item 11)?
|
|
||||||
9. Should we make a basic function so that tools (and AI) can send messages to the user - email, matrix or ntfy?
|
9. Should we make a basic function so that tools (and AI) can send messages to the user - email, matrix or ntfy?
|
||||||
|
|
||||||
10. **Claude setup** — DECIDED: brainstorm for intent, capture as ADRs (skip plan
|
10. **Claude setup** — DECIDED: brainstorm for intent, capture as ADRs (skip plan
|
||||||
|
|
@@ -74,9 +68,6 @@
|
||||||
1. Policy for how we collaborate with references to baobabAnsibleV4 without misusing it.
|
1. Policy for how we collaborate with references to baobabAnsibleV4 without misusing it.
|
||||||
2. Policy for how we write key documents like ADRs.
|
2. Policy for how we write key documents like ADRs.
|
||||||
3. Further development on how we collaborate on designing the foundation for the project - separate from how we implement new containers etc.
|
3. Further development on how we collaborate on designing the foundation for the project - separate from how we implement new containers etc.
|
||||||
4. How do we make sure agents always use the latest official documentation for the technologies etc. we use?
|
|
||||||
5. Always subagent driven?
|
|
||||||
6. When AI deploys, i.e. runs playbooks etc., should we make a methodology so that it does not have to poll all the time or review all the output. Perhaps something about the MAKE method could provide only the relevant feedback?
|
|
||||||
|
|
||||||
11. **Kaizen loop** — set up ~2026-06-06 (one week from now).
|
11. **Kaizen loop** — set up ~2026-06-06 (one week from now).
|
||||||
1. Build `/retro`: reads `docs/FRICTION.md` + recurring `/review-repo`
|
1. Build `/retro`: reads `docs/FRICTION.md` + recurring `/review-repo`
|
||||||
|
|
@@ -97,20 +88,3 @@
|
||||||
whether selectively allowing libraries (e.g. PyYAML — already present via
|
whether selectively allowing libraries (e.g. PyYAML — already present via
|
||||||
Ansible) is a better fit in general: weigh the parsing-correctness win
|
Ansible) is a better fit in general: weigh the parsing-correctness win
|
||||||
against losing zero-setup portability. Decide a clear rule and record it.
|
against losing zero-setup portability. Decide a clear rule and record it.
|
||||||
|
|
||||||
15. **Security hardening implementation** — build out the ADR-002 hardening standard.
|
|
||||||
1. Implement the CIS Debian Benchmark **Level 1 + Level 2** in the `base` role
|
|
||||||
(local tasks; CIS / `dev-sec` as reference only — no Galaxy roles). Includes
|
|
||||||
AppArmor (enforce mode) and AIDE file-integrity.
|
|
||||||
2. Implement the CIS Docker Benchmark: daemon/engine settings in `docker_host`;
|
|
||||||
per-container settings enforced via `docs/security/service-checklist.md`.
|
|
||||||
3. VM disk layout for CIS L2: separate `/tmp`, `/var`, `/var/log`, `/home`
|
|
||||||
partitions with `nodev,nosuid,noexec` — a Terraform/cloud-init concern
|
|
||||||
(ADR-006). Decide the template layout **before** provisioning, since it is
|
|
||||||
painful to retrofit.
|
|
||||||
4. Network IDS: enable Suricata on OPNsense (IDS first; IPS later?).
|
|
||||||
5. Active security alerting: wire AIDE, `auditd`, `fail2ban`, and Suricata into
|
|
||||||
the Loki/Grafana alerting stack (ties to 3.6).
|
|
||||||
6. Supply-chain hygiene: enforce image digest pinning + official/verified images
|
|
||||||
via the service checklist; revisit active scanning (Trivy/Grype) once a
|
|
||||||
triage stack exists (accepted-risk R1).
|
|
||||||
|
|
|
||||||
|
|
@@ -1,61 +1,24 @@
|
||||||
# ADR-002 — Security baseline and strategy
|
# ADR-002 — Security baseline
|
||||||
|
|
||||||
## Context
|
## Context
|
||||||
|
|
||||||
Security here is not a single control but the sum of several combined efforts —
|
Every managed host must reach a defined security baseline before any services
|
||||||
host hardening, network segmentation, secrets handling, supply-chain hygiene, and
|
are deployed. This baseline is applied by the `base` role and is non-negotiable —
|
||||||
disciplined automation. This ADR is the frame that organizes them: it records the
|
it runs first, on every host, every time.
|
||||||
**threat model** we design against, the **principles** every control serves, the
|
|
||||||
host-level **baseline** the `base` role enforces, and the **governance** that keeps
|
|
||||||
security sharp as the homelab grows.
|
|
||||||
|
|
||||||
The goal is a principled, maintainable posture for a homelab with some
|
The goal is a principled, maintainable baseline appropriate for a homelab with
|
||||||
public-facing services — effective against a realistic threat model, not a
|
some public-facing services — not a compliance exercise.
|
||||||
compliance exercise.
|
|
||||||
|
|
||||||
Related decisions: network segmentation (ADR-007), secrets structure (ADR-003),
|
## Baseline components
|
||||||
per-service roles (ADR-004), CI secret-scanning (ADR-010).
|
|
||||||
|
|
||||||
## Threat model
|
### Access & authentication
|
||||||
|
|
||||||
What we deliberately design against — and, just as importantly, what we do not:
|
|
||||||
|
|
||||||
| Threat | In scope? | What it drives |
|
|
||||||
|---|---|---|
|
|
||||||
| **Opportunistic external** — bots scanning, credential stuffing, mass-exploiting known CVEs in exposed services | Yes — primary | SSH key-only + fail2ban, deny-by-default firewall, security auto-patching, minimal attack surface, services behind a reverse proxy with auth |
|
|
||||||
| **Lateral movement / blast radius** — assume one service *is* compromised; limit how far it spreads | Yes | VLAN segmentation (ADR-007), least-privilege containers, no host network mode, per-service isolation, no shared credentials |
|
|
||||||
| **Operator / agent error** — accidental secret leak, misconfiguration, or an AI agent making an unsafe change | Yes | Vault + gitleaks, declarative firewall (no ad-hoc ports), review gates, agent guardrails (below), pre-commit hooks |
|
|
||||||
| **Supply chain** — compromised images, base images, dependencies, collections | Acknowledged, lower priority | Baseline hygiene required: image digest pinning + prefer official/verified images (ADR-011, service checklist), gitleaks. Active vuln scanning deferred — accepted risk |
|
|
||||||
| **Targeted / physical** — a determined adversary specifically after this homelab, or physical device access | Out of scope | Not designed against at this scale; revisit if the threat model changes |
|
|
||||||
|
|
||||||
Supply chain is consciously deprioritized, not forgotten — see
|
|
||||||
`docs/security/accepted-risks.md`.
|
|
||||||
|
|
||||||
## Security principles
|
|
||||||
|
|
||||||
Every control below should trace back to one of these:
|
|
||||||
|
|
||||||
- **Defense in depth** — no single control is load-bearing; layers compensate.
|
|
||||||
- **Least privilege** — accounts, containers, and automation get the minimum they need.
|
|
||||||
- **Deny / secure by default** — closed unless explicitly opened; safe defaults.
|
|
||||||
- **Contain the blast radius** — segment and isolate so one compromise isn't total.
|
|
||||||
- **Automated & reproducible** — the baseline is reached by Ansible, never by hand.
|
|
||||||
- **Explicit & revisitable** — decisions and accepted risks are written down and
|
|
||||||
re-challenged, not left implicit.
|
|
||||||
|
|
||||||
## Baseline controls
|
|
||||||
|
|
||||||
Applied by the `base` role, non-negotiable — it runs first, on every host, every
|
|
||||||
time. Each heading tags the threat(s) it primarily serves.
|
|
||||||
|
|
||||||
### Access & authentication — *opportunistic, agent error*
|
|
||||||
|
|
||||||
- SSH key authentication only — password auth disabled
|
- SSH key authentication only — password auth disabled
|
||||||
- Root login disabled — `PermitRootLogin no`
|
- Root login disabled — `PermitRootLogin no`
|
||||||
- Dedicated `ansible` user with locked-down sudo (NOPASSWD for automation)
|
- Dedicated `ansible` user with locked-down sudo (NOPASSWD for automation)
|
||||||
- No shared user accounts — per-person SSH keys in `group_vars/all/vars.yml`
|
- No shared user accounts — per-person SSH keys in `group_vars/all/vars.yml`
|
||||||
|
|
||||||
### Firewall — *opportunistic, blast radius, agent error*
|
### Firewall
|
||||||
|
|
||||||
- `nftables` (native on Debian 13, replaces iptables)
|
- `nftables` (native on Debian 13, replaces iptables)
|
||||||
- Default policy: deny inbound, allow established/related, allow loopback
|
- Default policy: deny inbound, allow established/related, allow loopback
|
||||||
|
|
@@ -67,45 +30,29 @@ time. Each heading tags the threat(s) it primarily serves.
|
||||||
> This is addressed by setting `"iptables": false` in Docker daemon config and managing
|
> This is addressed by setting `"iptables": false` in Docker daemon config and managing
|
||||||
> all rules via nftables explicitly. See `docs/decisions/004-docker-model.md`.
|
> all rules via nftables explicitly. See `docs/decisions/004-docker-model.md`.
|
||||||
|
|
||||||
### Intrusion deterrence — *opportunistic*
|
### Intrusion deterrence
|
||||||
|
|
||||||
- `fail2ban` monitoring SSH (and optionally reverse proxy logs)
|
- `fail2ban` monitoring SSH (and optionally reverse proxy logs)
|
||||||
- Configured to ban after 5 failed attempts, 1-hour ban
|
- Configured to ban after 5 failed attempts, 1-hour ban
|
||||||
|
|
||||||
### Updates — *opportunistic*
|
### Updates
|
||||||
|
|
||||||
- `unattended-upgrades` enabled for **security patches only**
|
- `unattended-upgrades` enabled for **security patches only**
|
||||||
- Full system upgrades triggered deliberately via Ansible (`make deploy PLAYBOOK=upgrade`)
|
- Full system upgrades triggered deliberately via Ansible (`make deploy PLAYBOOK=upgrade`)
|
||||||
- No automatic reboots — reboots are a conscious operational decision
|
- No automatic reboots — reboots are a conscious operational decision
|
||||||
|
|
||||||
### Minimal attack surface — *opportunistic, blast radius*
|
### Minimal attack surface
|
||||||
|
|
||||||
- No unnecessary packages installed
|
- No unnecessary packages installed
|
||||||
- Docker daemon TCP socket disabled — Unix socket only
|
- Docker daemon TCP socket disabled — Unix socket only
|
||||||
- No open ports beyond those explicitly defined in firewall rules
|
- No open ports beyond those explicitly defined in firewall rules
|
||||||
|
|
||||||
### Audit trail — *agent error, blast radius*
|
### Audit trail
|
||||||
|
|
||||||
- `auditd` installed and running with a baseline ruleset
|
- `auditd` installed and running with a baseline ruleset
|
||||||
- Logs shipped to a central location if a log aggregation service is available
|
- Logs shipped to a central location if a log aggregation service is available
|
||||||
|
|
||||||
### Mandatory access control — *blast radius*
|
## Secrets management
|
||||||
|
|
||||||
- **AppArmor** enabled with profiles in enforce mode — Debian-native MAC, default-on,
|
|
||||||
and required by the CIS Debian benchmark. Docker applies its `docker-default`
|
|
||||||
profile to containers; tighter per-service profiles are authored as needed.
|
|
||||||
- **SELinux is not used** — non-native to Debian and redundant with AppArmor
|
|
||||||
(see `docs/security/accepted-risks.md`).
|
|
||||||
|
|
||||||
### File integrity & intrusion detection — *opportunistic, blast radius, agent error*
|
|
||||||
|
|
||||||
- **AIDE** file-integrity monitoring (required by the CIS Debian benchmark) — detects
|
|
||||||
unexpected changes to system files
|
|
||||||
- **Network IDS** — Suricata on OPNsense (planned; see STATUS.md / TODO)
|
|
||||||
- **Active alerting** wires AIDE, `auditd`, `fail2ban`, and Suricata into the
|
|
||||||
monitoring/alerting stack (planned; ties to the Loki/Grafana effort)
|
|
||||||
|
|
||||||
## Secrets management — *agent error, opportunistic*
|
|
||||||
|
|
||||||
- Ansible Vault for all secrets (API keys, passwords, certificates), structured as a
|
- Ansible Vault for all secrets (API keys, passwords, certificates), structured as a
|
||||||
nested `vault.<service>.<key>` map (ADR-003)
|
nested `vault.<service>.<key>` map (ADR-003)
|
||||||
|
|
@@ -115,65 +62,15 @@ time. Each heading tags the threat(s) it primarily serves.
|
||||||
`rbw unlock`; nothing decryptable sits at rest in the repo or working tree
|
`rbw unlock`; nothing decryptable sits at rest in the repo or working tree
|
||||||
- See `docs/runbooks/rotate-secrets.md` for `rbw` setup and rotation
|
- See `docs/runbooks/rotate-secrets.md` for `rbw` setup and rotation
|
||||||
|
|
||||||
## Hardening standard
|
## What this baseline does not include
|
||||||
|
|
||||||
The baseline above is implemented to a recognised benchmark rather than ad-hoc:
|
- Full CIS benchmark hardening — adds complexity for marginal gain at this scale
|
||||||
|
- SELinux / AppArmor — not applied by default, revisit if threat model changes
|
||||||
- **Hosts** — the **CIS Debian Benchmark, Levels 1 and 2**, applied by the `base`
|
- Intrusion detection (IDS) — out of scope for now
|
||||||
role. Some L2 items require separate partitions (`/tmp`, `/var`, `/var/log`,
|
|
||||||
`/home`) with restrictive mount options (`nodev,nosuid,noexec`) — that reaches into
|
|
||||||
VM disk layout, a provisioning concern (Terraform / cloud-init, ADR-006), not just
|
|
||||||
the `base` role.
|
|
||||||
- **Container runtime** — the **CIS Docker Benchmark**: daemon/engine settings in the
|
|
||||||
`docker_host` role; per-container run settings (non-root, read-only rootfs, dropped
|
|
||||||
capabilities, no `privileged`, no host namespaces) enforced via
|
|
||||||
`docs/security/service-checklist.md`.
|
|
||||||
- **Application containers** — no CIS benchmark exists for the app long tail
|
|
||||||
(Jellyfin, Nextcloud, Forgejo, …); they are covered by the CIS Docker run settings
|
|
||||||
plus the service checklist plus upstream hardening guidance.
|
|
||||||
|
|
||||||
Hardening controls are **implemented as local roles** (per the no-Galaxy-roles
|
|
||||||
policy, ADR-003), using the CIS benchmarks and community roles (e.g. `dev-sec`) only
|
|
||||||
as reference. Any specific CIS item that proves impractical is exempted into
|
|
||||||
`docs/security/accepted-risks.md` with a rationale — so the register records named
|
|
||||||
exceptions, not a blanket opt-out.
|
|
||||||
|
|
||||||
## Governance
|
|
||||||
|
|
||||||
Security is maintained, not achieved once. This ADR **establishes** four
|
|
||||||
mechanisms; each lives where change is cheap and is linked from here.
|
|
||||||
|
|
||||||
- **Per-service security bar** — every exposed service must clear a defined
|
|
||||||
checklist before deploy (secrets in vault, no default creds, least-privilege /
|
|
||||||
non-root, declared firewall ports, reverse-proxy + auth if exposed). Lives in
|
|
||||||
`docs/security/service-checklist.md`; referenced from `docs/runbooks/new-role.md`.
|
|
||||||
Enforced manually in review today; the planned `/security-review` will automate it.
|
|
||||||
- **Periodic security review** — a recurring review that re-checks posture,
|
|
||||||
surfaces drift, and re-challenges accepted risks. Planned as a `/security-review`
|
|
||||||
skill (sibling to `/review-repo`); see `docs/TODO.md` (Scheduled work). Not built
|
|
||||||
yet — see STATUS.md.
|
|
||||||
- **Accepted-risk register** — the conscious trade-offs we choose to live with, each
|
|
||||||
with rationale and a revisit trigger. Lives in `docs/security/accepted-risks.md`
|
|
||||||
(expected to change; kept out of this ADR so the ADR stays stable).
|
|
||||||
- **Agent / automation guardrails** — what AI agents and automation may do
|
|
||||||
unsupervised vs. what needs a human gate, since operator/agent error is in the
|
|
||||||
threat model. Encoded in `CLAUDE.md` ("What Claude must not do without explicit
|
|
||||||
instruction") and enforced by PreToolUse hooks (generated-file guard, `rbw`
|
|
||||||
pre-flight).
|
|
||||||
|
|
||||||
## Decision
|
## Decision
|
||||||
|
|
||||||
This posture was chosen to be:
|
This baseline was chosen to be:
|
||||||
|
- **Effective** against the realistic threat model (exposed services, shared repo)
|
||||||
- **Effective** against the stated threat model (opportunistic external, lateral
|
- **Maintainable** by a small team without security expertise overhead
|
||||||
movement, operator/agent error)
|
- **Automated** — no manual steps should be needed to reach baseline state
|
||||||
- **Maintainable** by a small team without security-expertise overhead
|
|
||||||
- **Automated** — no manual steps to reach baseline state
|
|
||||||
- **Legible & revisitable** — the threat model, principles, and accepted risks are
|
|
||||||
written down and reviewed over time, not implicit
|
|
||||||
- **Benchmarked** — host and container hardening follow CIS (Debian L1+L2, Docker),
|
|
||||||
not ad-hoc choices
|
|
||||||
|
|
||||||
Out-of-scope items and conscious trade-offs are recorded in
|
|
||||||
`docs/security/accepted-risks.md` rather than here, so this decision record stays
|
|
||||||
stable while the risk posture evolves.
|
|
||||||
|
|
|
||||||
|
|
@@ -71,16 +71,7 @@ Fix any lint or test failures before committing.
|
||||||
Add the role to the appropriate playbook in `playbooks/` and add the host group
|
Add the role to the appropriate playbook in `playbooks/` and add the host group
|
||||||
to `inventories/staging/hosts.yml` for integration testing.
|
to `inventories/staging/hosts.yml` for integration testing.
|
||||||
|
|
||||||
### 9. Clear the security checklist (services)
|
### 9. Commit
|
||||||
|
|
||||||
If the role is a **service** — especially one reachable beyond its own host —
|
|
||||||
walk `docs/security/service-checklist.md` and confirm every item passes (secrets
|
|
||||||
in vault, no default creds, least-privilege, declared firewall ports, behind the
|
|
||||||
reverse proxy with auth if exposed). Record any conscious deviation in
|
|
||||||
`docs/security/accepted-risks.md`. This bar is established by ADR-002; enforcement
|
|
||||||
is manual in review today, with the planned `/security-review` to automate it.
|
|
||||||
|
|
||||||
### 10. Commit
|
|
||||||
|
|
||||||
```bash
|
```bash
|
||||||
git checkout -b role/<rolename>
|
git checkout -b role/<rolename>
|
||||||
|
|
|
||||||
|
|
@@ -1,24 +0,0 @@
|
||||||
# Accepted security risks
|
|
||||||
|
|
||||||
Conscious security trade-offs we are choosing to live with — recorded so "what we
|
|
||||||
are *not* doing" is explicit and revisitable, not forgotten. This register is a
|
|
||||||
**living document**, deliberately kept out of ADR-002 (which records durable
|
|
||||||
decisions) so the ADR stays stable.
|
|
||||||
|
|
||||||
Owned by **ADR-002** (Security baseline and strategy). Re-challenged during the
|
|
||||||
periodic security review (planned `/security-review`; see `docs/TODO.md`).
|
|
||||||
|
|
||||||
**Each entry:** the risk · why we accept it (rationale) · what would make us
|
|
||||||
revisit (trigger).
|
|
||||||
|
|
||||||
| # | Accepted risk | Rationale | Revisit trigger |
|
|
||||||
|---|---|---|---|
|
|
||||||
| R1 | **Active supply-chain scanning deferred** — baseline hygiene *is* required (image digest pinning + prefer official/verified images, ADR-011 / service checklist; gitleaks), but images and dependencies are not actively vulnerability-scanned (Trivy/Grype) or signature-verified | Scanning only pays off with the capacity to triage its output; the realistic threat is opportunistic, not a targeted supply-chain attack | A monitoring/triage stack is live; hosting high-value data/finances for others; a relevant upstream compromise |
|
|
||||||
| R2 | **SELinux not used** — no SELinux mandatory access control | AppArmor — Debian-native and enforced via the CIS baseline — already provides MAC; adding SELinux means two MAC systems, non-native to Debian, for no real gain | A service that ships and requires its own SELinux policy; threat model shifts toward targeted attackers |
|
|
||||||
|
|
||||||
_Last reviewed: 2026-06-04. The prior gaps (full CIS hardening, SELinux/AppArmor,
|
|
||||||
IDS) were re-challenged and **adopted rather than accepted**: CIS Debian L1+L2 + CIS
|
|
||||||
Docker, AppArmor (enforce), AIDE file-integrity, and Suricata network IDS are now
|
|
||||||
part of the security strategy (ADR-002). See STATUS.md / `docs/TODO.md` for build
|
|
||||||
status. As CIS is implemented, any specific item that proves impractical is added
|
|
||||||
here as a named exception._
|
|
||||||
|
|
@@ -1,49 +0,0 @@
|
||||||
# Per-service security checklist
|
|
||||||
|
|
||||||
The bar every service (a per-service role — ADR-004) must clear **before deploy**,
|
|
||||||
especially anything reachable beyond its own host. Established by **ADR-002**
|
|
||||||
(Security baseline and strategy); referenced from `docs/runbooks/new-role.md`.
|
|
||||||
Enforced manually in review today; the planned `/security-review` skill (see
|
|
||||||
`docs/TODO.md`) will automate the check.
|
|
||||||
|
|
||||||
Treat each item as must-pass **unless** a deviation is recorded in
|
|
||||||
`docs/security/accepted-risks.md` with a rationale and a revisit trigger.
|
|
||||||
|
|
||||||
## Secrets & credentials
|
|
||||||
|
|
||||||
- [ ] All secrets live in an encrypted `vault.yml` (`vault.<service>.<key>`); none in
|
|
||||||
plaintext files, templates, or Compose env literals
|
|
||||||
- [ ] No default or vendor-shipped credentials remain — admin passwords/tokens are
|
|
||||||
generated and stored in vault
|
|
||||||
- [ ] Nothing secret is baked into an image or committed to git (gitleaks must pass)
|
|
||||||
|
|
||||||
## Least privilege
|
|
||||||
|
|
||||||
- [ ] Container runs as a non-root user where the image supports it
|
|
||||||
- [ ] No `privileged: true` and no host network mode unless explicitly justified
|
|
||||||
- [ ] Only the volumes/paths the service needs are mounted; read-only where possible
|
|
||||||
- [ ] Linux capabilities dropped to what's required (no blanket grants)
|
|
||||||
|
|
||||||
## Network & exposure
|
|
||||||
|
|
||||||
- [ ] Every listening port is declared in `group_vars` firewall definitions — never
|
|
||||||
opened ad-hoc on a host
|
|
||||||
- [ ] The service is not published directly to a LAN/WAN port if it can sit behind the
|
|
||||||
reverse proxy instead
|
|
||||||
- [ ] Anything reachable beyond the `srv` VLAN is behind the reverse proxy **with
|
|
||||||
authentication** (and TLS)
|
|
||||||
- [ ] Inter-service reach follows least privilege — no broad `srv`→`srv` access where a
|
|
||||||
single declared dependency suffices
|
|
||||||
|
|
||||||
## Updates & provenance
|
|
||||||
|
|
||||||
- [ ] Image/source version is pinned (tag or digest), not floating `latest` (ADR-011)
|
|
||||||
- [ ] The update path is known — how this service gets patched
|
|
||||||
|
|
||||||
## Operability (security-adjacent)
|
|
||||||
|
|
||||||
- [ ] Logs go somewhere reviewable (central aggregation when available)
|
|
||||||
- [ ] Backup/restore is covered if the service holds state
|
|
||||||
|
|
||||||
> Deviations are allowed but must be **conscious**: record them in
|
|
||||||
> `docs/security/accepted-risks.md`, don't leave them implicit.
|
|
||||||
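Several items in the per-service checklist above (no `privileged: true`, no host network mode, dropped capabilities, a pinned image rather than floating `latest`) are mechanical enough that a deterministic pre-scan could cover them. The following is a minimal sketch only, assuming each service arrives as an already-parsed Compose-style mapping. `check_service` and the finding strings are hypothetical names for illustration, not anything that exists in the repository, and how the Compose file gets parsed is left open.

```python
from typing import Any


def check_service(name: str, service: dict[str, Any]) -> list[str]:
    """Return checklist findings for one Compose-style service mapping.

    Hypothetical helper: the keys read here (privileged, network_mode,
    cap_drop, image) are standard Compose service keys, but the function
    itself is an illustration, not part of the repository.
    """
    findings: list[str] = []

    # Least privilege: no privileged mode and no host network mode.
    if service.get("privileged") is True:
        findings.append(f"{name}: privileged mode is enabled")
    if service.get("network_mode") == "host":
        findings.append(f"{name}: uses host network mode")

    # Least privilege: capabilities should be dropped explicitly.
    if not service.get("cap_drop"):
        findings.append(f"{name}: no capabilities dropped (cap_drop missing)")

    # Provenance: image must be pinned by tag or digest, never floating.
    image = str(service.get("image", ""))
    if "@sha256:" not in image:
        # Look for the tag only in the last path segment, so a registry
        # port such as 'host:5000/app' is not mistaken for a tag.
        last_segment = image.rsplit("/", 1)[-1]
        tag = last_segment.partition(":")[2]
        if tag in ("", "latest"):
            findings.append(f"{name}: image '{image}' is not pinned")

    return findings


if __name__ == "__main__":
    example = {
        "image": "registry.example/app:latest",
        "privileged": True,
        "cap_drop": ["ALL"],
    }
    for finding in check_service("app", example):
        print(finding)
```

A check like this only flags candidates. Under the checklist's own rule, a flagged item could still be a conscious deviation recorded in `docs/security/accepted-risks.md`, so the judgement pass stays with the reviewer.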