2026-05-30 14:10:01 +02:00
# CLAUDE.md — Ansible homelab monorepo

This file is read by Claude Code at the start of every session.
Keep it dense and command-focused. Verbose detail lives in `docs/`.

> **Before assuming a role, provider, or pipeline exists, check `STATUS.md`.**
> Much of the design in `docs/decisions/` is intended, not yet built (e.g. the
> `base`/`docker_host` roles are currently empty; Terraform is not `init`ed).

---

## Project in one paragraph

Homelab infrastructure automation for a Proxmox cluster running 2–5 Debian 13 VMs.
All hosts share a hardened base configuration. Each host runs a defined set of Docker
services deployed via Compose files rendered from Ansible templates. Ansible runs from
a dedicated control VM. CI runs on Forgejo Actions (self-hosted).

Full design rationale: `docs/decisions/`

---

## Key commands

| Action                       | Command                                          |
|------------------------------|--------------------------------------------------|
| Lint everything              | `make lint`                                      |
| Test a single role           | `make test ROLE=<name>`                          |
| Test all roles               | `make test-all`                                  |
| Check mode (dry run)         | `make check PLAYBOOK=<name>`                     |
| Deploy a playbook            | `make deploy PLAYBOOK=<name>`                    |
| Scaffold a new role          | `make new-role NAME=<name>`                      |
| Review repo for drift/cruft  | `/review-repo` (Claude command)                  |
| Review hardware capacity     | `/capacity-review` (Claude command)              |
| Encrypt a vault file         | `make encrypt FILE=<path>`                       |
| Decrypt a vault file         | `make decrypt FILE=<path>`                       |
| Install Python deps          | `make setup`                                     |
| Install Ansible collections  | `make collections`                               |
| Initialise Terraform         | `make tf-init [TF_ENV=staging]`                  |
| Terraform plan               | `make tf-plan [TF_ENV=staging]`                  |
| Terraform apply              | `make tf-apply [TF_ENV=staging]`                 |
| Regenerate Ansible inventory | `make tf-inventory TF_ENV=<staging\|production>` |

**Always `tf-plan` before `tf-apply`. Always `check` before `deploy`. Never skip lint.**

`TF_ENV` defaults to `staging`. Always specify `TF_ENV=production` explicitly for production.

---

## Ansible conventions

- **FQCN always**: `ansible.builtin.template`, never `template`
- **Tags**: every task must have at least one tag; playbooks support `--tags` filtering
- **Handlers**: notify `listen:` topic strings, not handler names
- **Variables**: namespace role defaults as `rolename__varname` (double underscore)
- **No inline vars in playbooks**: use `group_vars/` or `host_vars/` only
- **Loops**: prefer `loop:` over `with_items:`
- **Booleans**: write `true`/`false`, not `yes`/`no`
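
The conventions above can be sketched together in one small role. This is an illustration only: the role name `reverse_proxy` is borrowed from the naming examples in this file, while the variable `reverse_proxy__sites`, the template `site.conf.j2`, the destination path, the service name, and the topic string are all hypothetical. The three YAML documents stand for three separate files.

```yaml
---
# roles/reverse_proxy/defaults/main.yml
# Role default, namespaced with a double underscore (hypothetical variable).
reverse_proxy__sites:
  - name: grafana
    enabled: true   # boolean written as true/false, not yes/no
---
# roles/reverse_proxy/tasks/main.yml
- name: Render one config file per enabled site
  ansible.builtin.template:          # FQCN, never bare `template`
    src: site.conf.j2                # hypothetical template
    dest: "/etc/reverse_proxy/sites/{{ item.name }}.conf"  # hypothetical path
    mode: "0644"
  loop: "{{ reverse_proxy__sites }}" # loop:, not with_items:
  when: item.enabled
  notify: reverse_proxy config changed   # a listen topic, not a handler name
  tags:
    - reverse_proxy                  # every task carries at least one tag
---
# roles/reverse_proxy/handlers/main.yml
- name: Reload the reverse proxy
  ansible.builtin.service:
    name: reverse-proxy              # hypothetical service name
    state: reloaded
  listen: reverse_proxy config changed
```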

---

## Secrets

- Encrypted files are always named `vault.yml` and sit alongside `vars.yml`
- Never put secrets in any file other than a `vault.yml`
- Structure secrets as a nested map `vault.<service>.<key>` (e.g.
  `vault.grafana.admin_password`); reference as `{{ vault.grafana.admin_password }}`
- Vault password comes from Vaultwarden via `rbw` (`scripts/vault-pass-client.sh`,
  wired as `vault_password_file`). Unlock once per session: `rbw unlock`
- **Before any vault-dependent task** (`make deploy`, `make check`, `make encrypt`,
  `make decrypt`, or **any git commit**, because the pre-commit ansible-lint hook
  decrypts `vault.yml`), run `rbw unlocked`; if it exits non-zero, ask the user to run
  `rbw unlock` and wait, rather than starting and failing partway. The `rbw` agent
  stays unlocked for 5 hours.
- To edit a vault file: `make decrypt FILE=<path>`, edit, `make encrypt FILE=<path>`
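
As a rough sketch of the `vault.<service>.<key>` layout described above, a decrypted `vault.yml` and the `vars.yml` next to it might look like the following. The password is a placeholder, and `grafana__admin_password` is a hypothetical role variable used only to show the reference; the real file must be encrypted again before it is committed.

```yaml
---
# group_vars/all/vault.yml (shown decrypted for illustration only)
vault:
  grafana:
    admin_password: "CHANGE_ME"   # placeholder, not a real secret
---
# group_vars/all/vars.yml (plaintext; holds references, never secret values)
grafana__admin_password: "{{ vault.grafana.admin_password }}"  # hypothetical variable
```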

---

## Role conventions

- Every role must have a `molecule/default/` scenario targeting Debian 13
- Every role must have a populated `README.md`
- Every role must have `meta/main.yml` filled in
- Role names: `snake_case`, descriptive nouns (`base`, `docker_host`, `reverse_proxy`)
- Use `make new-role NAME=<name>` to scaffold; never create role structure by hand
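
For orientation, a filled-in `meta/main.yml` could look roughly like the sketch below. It uses the standard `galaxy_info` keys; the author, description, licence, and minimum Ansible version are assumptions, not values taken from this repo, and the scaffold produced by `make new-role` remains the authority on which fields are required. `trixie` is the Debian 13 codename.

```yaml
---
# roles/docker_host/meta/main.yml (illustrative values only)
galaxy_info:
  role_name: docker_host
  author: homelab                      # assumption
  description: Docker engine and Compose runtime for service hosts  # assumption
  license: MIT                         # assumption
  min_ansible_version: "2.15"          # assumption
  platforms:
    - name: Debian
      versions:
        - trixie                       # Debian 13
dependencies: []                       # roles are local; no Galaxy dependencies
```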

---

## Inventory structure

```
inventories/
  production/            # live hosts; edit with care
    hosts.yml
    group_vars/
      all/               # applies to every host
        vars.yml
        vault.yml
      docker_hosts/      # hosts running Docker services
      proxmox_hosts/     # Proxmox nodes themselves
    host_vars/           # per-host overrides
  staging/               # safe to run freely
```

Host groups: `all`, `control`, `docker_hosts`, `proxmox_hosts`

(`control` holds the one manually provisioned control node; see ADR-009.)

---

## Git conventions

Single-contributor, trunk-based (no merge requests / approval gates):

- `main` is the trunk and must always work; small, safe changes commit straight to it
- Branch for sweeping or AI-driven changes you want to review as one diff or be able
  to abandon: `role/<name>`, `fix/<description>`, `feat/<description>`,
  `chore/<description>`; merge to `main` when reviewed, then delete the branch
- Run `make lint` (and `make test ROLE=<name>` for each touched role) before committing
- Commit in logical units; imperative subject ≤72 chars
- AI agents commit their own work in logical units with a `Co-Authored-By` trailer
- Push to the Forgejo `origin` often; it is the off-machine backup
- Never commit secrets; a `vault.yml` must be `$ANSIBLE_VAULT`-encrypted (pre-commit
  enforces this, plus gitleaks secret scanning)

---

## Dependencies policy

- **No Galaxy roles**: all roles are local; never add a Galaxy role to `requirements.yml`
- **Collections on demand**: only add a collection when a task in a committed role
  uses a module from it; add a comment in `requirements.yml` naming the module(s) used
- Full rationale: `docs/decisions/003-toolchain.md` (Collections and roles policy)
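
A `requirements.yml` entry that follows the collections-on-demand rule might look like this sketch. The collection and the role that uses it are hypothetical examples, not a statement about what this repo currently installs; the point is the comment naming the module that justifies the entry.

```yaml
---
# requirements.yml (sketch; entry shown is a hypothetical example)
collections:
  # Used by roles/docker_host: community.docker.docker_compose_v2
  - name: community.docker
# No `roles:` key: all roles are local, so no Galaxy roles are listed.
```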

---

## Terraform conventions

- Terraform owns VM existence only: nothing inside a VM, and no DNS records
- Internal DNS is entirely Ansible (the `dns` role renders the zone from inventory)
- OPNsense is entirely Ansible; do not reach for a Terraform OPNsense provider
- Environments are separate directories (`staging/`, `production/`), not workspaces
- Secrets via `TF_VAR_*` env vars only; never in `.tfvars` files
- `terraform.tfvars.example` is tracked; `terraform.tfvars` is gitignored
- `.terraform.lock.hcl` is tracked (pins provider versions)
- Full rationale: `docs/decisions/006-terraform.md`

---

## What Claude must not do without explicit instruction

- Run `make deploy`; always run `make check` first and show output
- Run `make tf-apply`; always run `make tf-plan` first and show output
- Modify `inventories/<env>/hosts.yml` directly; regenerate via `make tf-inventory`
- Edit vault-encrypted files directly; decrypt first, re-encrypt after
- Force-push or rewrite already-pushed history on `main`
- Add a collection to `requirements.yml` without a specific module need in existing role tasks
- Open a firewall port anywhere but the `group_vars` firewall definitions; never ad hoc on a host (ADR-002)
- Disable or weaken a baseline control from ADR-002 (SSH hardening, nftables default-deny, fail2ban, auditd)
- Expose a service to the LAN/WAN without it sitting behind the reverse proxy with authentication (ADR-002)
- Deploy a service that hasn't cleared `docs/security/service-checklist.md` (record any deviation in `docs/security/accepted-risks.md`)

---

## Further reading

| Topic                          | File                                         |
|--------------------------------|----------------------------------------------|
| Architecture overview          | `docs/decisions/001-architecture.md`         |
| Security baseline & strategy   | `docs/decisions/002-security.md`             |
| Accepted security risks        | `docs/security/accepted-risks.md`            |
| Per-service security checklist | `docs/security/service-checklist.md`         |
| Toolchain choices              | `docs/decisions/003-toolchain.md`            |
| Docker & Compose model         | `docs/decisions/004-docker-model.md`         |
| Bootstrapping hosts            | `docs/decisions/005-bootstrapping.md`        |
| Terraform                      | `docs/decisions/006-terraform.md`            |
| Network topology               | `docs/decisions/007-network.md`              |
| Testing methodology            | `docs/decisions/008-testing.md`              |
| TF ↔ Ansible handoff           | `docs/decisions/009-provisioning-handoff.md` |
| Forgejo & CI                   | `docs/decisions/010-forgejo-ci.md`           |
| Hardware & capacity            | `docs/decisions/012-hardware-capacity.md`    |
| Adding a new role              | `docs/runbooks/new-role.md`                  |
| Adding a new host              | `docs/runbooks/new-host.md`                  |
| Rotating vault secrets         | `docs/runbooks/rotate-secrets.md`            |