sjat bc8592616b fix: address final whole-branch review findings
- ADR-023 §4: ADR-015 no-sudo sub-decision now Superseded-by ADR-025 (bidirectional), not just an in-place amendment.
- STATUS: drop the deferred `reset` verb; honest integration_test (molecule not run in this env; applied to ubongo) + verify (forward/DNAT, not wt0); RED->GREEN validated.
- driver: remove unused `import shutil`.
- README: fix the ADR-025 link filename.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-18 21:52:28 +02:00
.claude fix(hooks): scope vault-preflight to staged ansible; catch prose exec re-asks 2026-06-17 17:49:55 +02:00
.docker feat(docker): custom Caddy image with the Gandi DNS-01 plugin 2026-06-15 06:57:38 +02:00
.scaffold Fix Forgejo registry path to owner/image format (review R10a) 2026-05-30 21:34:02 +02:00
docs fix: address final whole-branch review findings 2026-06-18 21:52:28 +02:00
inventories feat(base): codify AI-worker NOPASSWD sudo (ADR-015 amended) 2026-06-18 21:36:31 +02:00
playbooks feat(integration_test): KVM/libvirt substrate role on the control node 2026-06-18 12:09:35 +02:00
roles fix: address final whole-branch review findings 2026-06-18 21:52:28 +02:00
scripts fix: address final whole-branch review findings 2026-06-18 21:52:28 +02:00
terraform revert: back out mesh-hardening 1/3 on askari after it broke the Docker host 2026-06-17 22:16:17 +02:00
tests fix(integration): verify probes :80 without following redirects 2026-06-18 16:57:47 +02:00
.ansible-lint fix(integration): exclude transient .run/ from linters; --- in generated inventory 2026-06-18 16:44:12 +02:00
.gitignore feat(make): test-integration / test-integration-clean targets 2026-06-18 12:45:38 +02:00
.pre-commit-config.yaml chore(tooling): scope ansible-lint to ansible content; venv PATH in make test 2026-06-10 12:51:30 +02:00
.yamllint fix(integration): exclude transient .run/ from linters; --- in generated inventory 2026-06-18 16:44:12 +02:00
AGENTS.md Add ADR-010 (Forgejo integration) and rbw-unlocked pre-flight convention 2026-05-30 21:34:07 +02:00
ansible.cfg feat(make): offsite TF token injection + directory inventory + tf-inventory-offsite 2026-06-14 12:05:41 +02:00
CLAUDE.md docs: wire ADR-025 into testing/control-host/risks/status/capacity 2026-06-18 12:51:22 +02:00
CONTRIBUTING.md Purge residual .vault_pass references (review R1-R5) 2026-05-30 19:17:25 +02:00
Makefile feat(make): test-integration / test-integration-clean targets 2026-06-18 12:45:38 +02:00
README.md docs(review): 2026-06-14 repo audit — M4a doc drift + Traefik→Caddy lag 2026-06-14 18:37:54 +02:00
requirements.txt Harden lint setup and clean inventory placeholders 2026-05-30 14:56:16 +02:00
requirements.yml feat(reverse_proxy): Caddy role (Gandi DNS-01, on-host image build, route catalog) 2026-06-14 17:36:58 +02:00
STATUS.md fix: address final whole-branch review findings 2026-06-18 21:52:28 +02:00

boma

Infrastructure-as-code for a self-hosted homelab: a Proxmox cluster of Debian 13 VMs running Docker services, provisioned with Terraform and configured with Ansible. Stable, secure, reproducible, and fully version-controlled.

Scope — this repo manages infrastructure: the cluster's VMs, their hardened base OS, and the containerised services they run. It does not manage personal machines (laptops, desktops, phones). Terraform owns VM existence; Ansible owns everything inside a VM. See STATUS.md for what's built vs planned and docs/decisions/ for the design rationale.

The name boma is Swahili for a fortified homestead enclosure (a stockade guarding what's within) — fitting for a hardened, self-contained home setup. It keeps company with the project's other Swahili names: askari (the external sentinel) and nyumbani ("home").

Quick start (control node)

git clone <repo-url> ~/ansible
cd ~/ansible

# Create venv and install dependencies
make setup
make collections

# Unlock the vault password from Vaultwarden via rbw
# (one-time rbw setup: docs/runbooks/rotate-secrets.md)
rbw unlock

# Verify setup
make lint
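
The `rbw unlock` step above can be checked before running anything that needs the vault password. The following is a minimal sketch, not the repo's actual pre-flight hook: it assumes only that the `rbw unlocked` subcommand exits 0 when the agent is unlocked. The function names `vault_is_unlocked` and `_run_exit_code` are hypothetical.

```python
import subprocess
import sys


def _run_exit_code(cmd):
    """Run a command quietly and return its exit code (127 if the binary is missing)."""
    try:
        return subprocess.run(list(cmd), check=False, capture_output=True).returncode
    except FileNotFoundError:
        return 127


def vault_is_unlocked(run=_run_exit_code):
    """Return True when `rbw unlocked` exits 0.

    `run` is injectable so the check can be exercised without rbw installed.
    """
    return run(["rbw", "unlocked"]) == 0


if __name__ == "__main__":
    if not vault_is_unlocked():
        sys.exit("rbw is locked or not installed: run `rbw unlock` first")
    print("rbw is unlocked")
```

Run as a script, this exits non-zero with a hint when the vault is locked, so it could sit in front of a deploy command.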

Common operations

What                    Command
Lint everything         make lint
Dry-run site playbook   make check PLAYBOOK=site
Deploy everything       make deploy PLAYBOOK=site
Test a role             make test ROLE=base
Scaffold a new role     make new-role NAME=myservice

See Makefile for the full list of targets.

Project structure

.
├── CLAUDE.md               # Claude Code session context
├── Makefile                # All operations go through here
├── ansible.cfg             # Project-scoped Ansible config
├── requirements.txt        # Python dependencies
├── requirements.yml        # Ansible collections
│
├── docs/
│   ├── decisions/          # Architecture decision records (ADRs)
│   ├── runbooks/           # Step-by-step operational procedures
│   ├── security/           # Per-service security checklist + templates + accepted risks
│   ├── testing/            # VERIFY.md template + service-UI verification reports
│   ├── access/             # ACCESS.md template (ADR-021)
│   ├── backup/             # BACKUP.md template (ADR-022)
│   ├── hardware/           # Physical capacity reference + reviews
│   └── reviews/            # /review-repo reports
│
├── inventories/
│   ├── production/         # Live hosts — edit carefully
│   └── staging/            # Test hosts — safe to run freely
│
├── playbooks/              # Orchestration playbooks
│   ├── site.yml            # Full standard state
│   ├── workstation.yml     # Developer environment (control group)
│   └── bootstrap.yml       # First-run new host setup
│
├── roles/                  # Ansible roles
│   ├── base/               # OS baseline applied to all hosts
│   ├── dev_env/            # Interactive developer environment
│   └── docker_host/        # Docker runtime setup
│
├── terraform/              # VM provisioning only — no DNS (see ADR-006/009)
│   ├── modules/            # Reusable modules (proxmox_vm)
│   └── environments/       # Per-env state: staging/, production/
│
└── scripts/                # Helper scripts (tf_to_inventory.py)
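
The tree lists `scripts/tf_to_inventory.py` as the helper that bridges Terraform (which owns VM existence) and Ansible (which owns everything inside a VM). Its contents are not shown here, so the following is only a sketch of how such a handoff might work, not the script itself. It assumes a Terraform output named `vms` that maps each hostname to an `ip` and a `group`; that output name, those keys, and the host names in the example are all hypothetical. The only real format it relies on is the shape of `terraform output -json`, where each output's data sits under a `value` key.

```python
import json


def outputs_to_inventory(tf_outputs, output_name="vms"):
    """Convert parsed `terraform output -json` data into an Ansible inventory dict.

    Assumed (hypothetical) output shape:
        {"vms": {"value": {"<hostname>": {"ip": "...", "group": "..."}}}}
    """
    vms = tf_outputs[output_name]["value"]
    inventory = {"all": {"children": {}}}
    for name in sorted(vms):
        vm = vms[name]
        # Create the group on first use, then register the host under it.
        group = inventory["all"]["children"].setdefault(vm["group"], {"hosts": {}})
        group["hosts"][name] = {"ansible_host": vm["ip"]}
    return inventory


if __name__ == "__main__":
    # Example input using documentation-range addresses (RFC 5737).
    example = {
        "vms": {
            "sensitive": False,
            "value": {
                "web01": {"ip": "192.0.2.10", "group": "docker_hosts"},
                "ctl01": {"ip": "192.0.2.5", "group": "control"},
            },
        }
    }
    # JSON is valid YAML, so this output can be saved as an inventory file.
    print(json.dumps(outputs_to_inventory(example), indent=2))
```

Keeping the conversion a pure function of the Terraform outputs means the inventory can be regenerated at any time and never needs hand-editing.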

Documentation

  • Current state (built vs planned): STATUS.md — read this before assuming something exists; the ADRs describe intent, not necessarily reality.
  • AI agents: AGENTS.md (points to CLAUDE.md, the authoritative guide)
  • Architecture: docs/decisions/001-architecture.md
  • Security baseline: docs/decisions/002-security.md
  • Toolchain decisions: docs/decisions/003-toolchain.md
  • Docker model: docs/decisions/004-docker-model.md
  • Bootstrapping: docs/decisions/005-bootstrapping.md
  • Terraform: docs/decisions/006-terraform.md
  • Network topology: docs/decisions/007-network.md
  • Testing methodology: docs/decisions/008-testing.md
  • Terraform ↔ Ansible handoff: docs/decisions/009-provisioning-handoff.md
  • Forgejo & CI: docs/decisions/010-forgejo-ci.md
  • Update management: docs/decisions/011-update-management.md
  • Hardware & capacity: docs/decisions/012-hardware-capacity.md
  • Heritage / V4 policy: docs/decisions/013-heritage-v4.md
  • Sourcing technical knowledge: docs/decisions/014-knowledge-sourcing.md
  • Control / AI-worker host (ubongo): docs/decisions/015-control-host.md
  • Mesh VPN (NetBird): docs/decisions/016-mesh-vpn.md
  • Service-UI verification (Level 4): docs/decisions/017-service-ui-verification.md
  • Logging & log integrity: docs/decisions/018-logging.md
  • Tagging & run-targeting: docs/decisions/019-tagging.md
  • Firewall strategy: docs/decisions/020-firewall.md
  • Operational access: docs/decisions/021-operational-access.md
  • Backup & disaster recovery: docs/decisions/022-backup.md
  • ADR structure & lifecycle: docs/decisions/023-adr-structure.md
  • Reverse proxy (Caddy): docs/decisions/024-reverse-proxy.md

(CLAUDE.md carries the full cross-referenced table, including the runbooks and security/testing docs.)

Contributing

See CONTRIBUTING.md for conventions, branching strategy, and how to add roles.