Revise ADR-004 to a service-role standard: every service is its own
self-contained role with a required file set including SECURITY.md, uniform
deploy mechanics, and a deferred shared-engine option (with revisit trigger)
recorded in the ADR.
Add the per-service security record:
- docs/security/service-security-template.md — canonical SECURITY.md template
(exposure, checklist status, service-specific hardening, residual risks)
- roles/<service>/SECURITY.md is where each service records how it meets the bar;
/security-review aggregates roles/*/SECURITY.md and cross-checks against config
- service-checklist.md noted as the generic bar the record answers
Wire-up: new-role runbook step writes SECURITY.md from the template; ADR-002
governance bullet points at it; CLAUDE.md role conventions require it and mandate
one-role-per-service; STATUS records the convention as defined-not-yet-applied.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Walked the seeded accepted-risk register (R1-R4) and turned inherited gaps into
deliberate decisions:
- Supply chain (R1): tightened to required baseline hygiene (digest pinning,
official/verified images); active scanning deferred — stays an accepted risk
- CIS (R2): adopted as a positive decision — CIS Debian L1+L2 (base role) + CIS
Docker (docker_host + service checklist); app layer via the checklist
- SELinux/AppArmor (R3): AppArmor becomes a baseline control (CIS-enforced);
register keeps a clean "no SELinux" accept
- IDS (R4): adopt AIDE (baseline via CIS) + Suricata on OPNsense + active alerting
Register shrinks from 4 inherited gaps to 2 deliberate accepts. ADR-002 gains a
Hardening standard section; STATUS + TODO 15 track the (unbuilt) implementation,
including the CIS L2 partition impact on VM provisioning (ADR-006).
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Add a managerial security frame on top of the host baseline: explicit threat
model (opportunistic external, lateral movement/blast radius, operator/agent
error; supply chain accepted-lower-priority), security principles, and four
governance mechanisms that ADR-002 establishes and links out to:
- docs/security/service-checklist.md — per-service security bar (referenced
from the new-role runbook)
- docs/security/accepted-risks.md — living accepted-risk register (R1-R4)
- planned /security-review skill (TODO 8.5)
- agent guardrails in CLAUDE.md "what Claude must not do"
STATUS.md records the frame as present (manual enforcement) and /security-review
as planned-not-built.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Forgejo origin is trunk-based with no merge-request gate, so the
finishing-a-development-branch "open a PR" option doesn't apply — merge
locally then push. Also carries earlier uncommitted FRICTION.md edits
(emphasis normalization + 2026-05-31 ADR-status entry).
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Final-review minor: the /capacity-review skill overwrites a latest.md
pointer alongside the dated report; record that in the ADR.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Adds gather_usage() (stubbed, returns available:false), known_hostnames()
with graceful degradation when terraform/ansible-inventory are absent,
_run_json() helper, and main() that parses reference.md and emits JSON.
Three new TDD tests (12 total, all passing). Script exits 0 with valid
JSON even when no cluster is provisioned.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Implements the parse_table() function and pytest test harness for the
capacity-scan script. Tests cover header matching and graceful empty
return when the required header is absent.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Brainstormed design for docs/hardware/reference.md (physical compute +
network gear + workload placement intent), a stdlib-only capacity-scan.py,
and an on-demand /capacity-review skill that reports to docs/hardware/reviews/.
Mirrors the repo-scan -> /review-repo -> docs/reviews triad.
TODO additions: schedule /capacity-review later and decide its usage-stats
source (Proxmox RRD vs the Prometheus/Loki/Grafana/Alloy stack) before
building any hook (8.4); reevaluate the stdlib-only script policy (#14).
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Two project hooks (deny-only, fail open): block Write/Edit of generated
inventories/<env>/hosts.yml, and block git commit when the rbw vault agent is
locked. Both pipe-tested across all paths. Activate with a Claude Code restart
(the watcher only tracks settings.json present at session start).
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
docs/FRICTION.md: a running log of friction/gotchas/recurring-fixes/unused tooling,
seeded with this session's real signals — raw material for the periodic kaizen
review. docs/TODO.md: schedule building /retro in ~1 week, and record the Claude-setup
decision. (Also carries your earlier backlog edits.)
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
ADR-010: API tokens as least-privilege managed secrets, declarative-first (no
click-ops), automation boundary, planned trunk-based CI. CLAUDE.md/AGENTS.md:
check 'rbw unlocked' before vault-dependent tasks (incl. commits) rather than
failing partway.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Forgejo's /raw/ API is read-only so it cannot serve as a Terraform HTTP state
backend. Switch both envs to local state on the control node (ADR-006); remove
the dead TF_HTTP_* credential hints.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
The mkdir used shell brace expansion {tasks,handlers,...}, which /bin/sh (dash)
does not support, so new-role created one literally-named dir and then errored.
make new-role had never worked on this host. Use explicit mkdir paths.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
R9: pass vlan_tag (default 20 = srv VLAN, ADR-007) from both envs to the
proxmox_vm module so VMs are tagged, not on untagged vmbr0. R11: make new-role
now sed-substitutes ROLE_NAME_PLACEHOLDER so scaffolded molecule converge works
out of the box.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
R6/R7: ADR-003 & ADR-008 CI pipelines rewritten trunk-based (push to main ->
test -> staging -> [manual gate] production); CLAUDE.md no longer forbids pushing
to main. R8: STATUS/roles-README/site.yml now say base & docker_host are not built
(not in git), so a clean clone errors. R15/R16: ADR-001 table flagged as intended
design; dropped the unbuilt 'monitoring agent' from the baseline.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Point ADR-005, the new-host runbook, CONTRIBUTING, and AGENTS at the
rbw/Vaultwarden flow instead of a .vault_pass file. Also record the cron-section
idea in docs/TODO.md.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
First /review-repo run on boma. Hardened repo-scan.py (no TODO.md/prose false
positives). Applied 7 safe fixes (DNS staleness x2, STATUS factual correction,
hosts.yml path generalisation, trunk-based wording x2, scripts/README). Recorded
the run and 17 open findings in docs/reviews/2026-05-30-*.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Match the uppercase convention of the other top-level docs; includes the new
Scheduled-work and sanity-check items, and repoints references in STATUS.md and
the /review-repo command.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
New on-demand repo audit: scripts/repo-scan.py does the cheap deterministic
checks (markers, broken refs, unencrypted vaults) and inventory; the command
fans out judgement reviewers across four dimensions, applies only safe/obvious
fixes, and writes a tracked report to docs/reviews/. Cron + email deferred.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Broaden the intro beyond Ansible (Terraform + Ansible), state the
infrastructure-not-personal-devices scope, and explain the Swahili name.
Also replace the stale .vault_pass quick-start step with 'rbw unlock'.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Master vault password is fetched from Vaultwarden via the rbw agent
(scripts/vault-pass-client.sh, wired as vault_password_file) instead of a
plaintext .vault_pass. Vault secrets use a nested vault.<service>.<key> map.
Encrypted vault.yml files are excluded from lint. Includes the host rename in
Makefile and STATUS.md.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>