sjat/boma

sjat 64f1e821d8 docs(review): 2026-06-14 repo audit — M4a doc drift + Traefik→Caddy lag

11 safe auto-fixes (docs/comments only): reverse_proxy meta stale DNS-01
description, base/playbooks/scripts/terraform/public_dns README build-state,
CAPABILITIES reverse-proxy Traefik→Caddy, README ADR list → 024, TF cax11→cx23
stamps, public_dns wildcard DNS-01→HTTP-01 comment. 29 open findings reported.
make lint green. No stale-deferred (ADR-011 open questions still open).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

2026-06-14 18:37:54 +02:00

5.4 KiB


boma

Infrastructure-as-code for a self-hosted homelab: a Proxmox cluster of Debian 13 VMs running Docker services, provisioned with Terraform and configured with Ansible. Stable, secure, reproducible, and fully version-controlled.

Scope — this repo manages infrastructure: the cluster's VMs, their hardened base OS, and the containerised services they run. It does not manage personal machines (laptops, desktops, phones). Terraform owns VM existence; Ansible owns everything inside a VM. See STATUS.md for what's built vs planned and docs/decisions/ for the design rationale.
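The Terraform-to-Ansible split described above implies a handoff step: Terraform knows which VMs exist, and Ansible needs an inventory listing them. The following is a minimal sketch of that idea, not the contents of scripts/tf_to_inventory.py. It assumes a hypothetical Terraform output named `vms` that maps each hostname to an `ip` and a `group`; the real output names and inventory layout are defined in the repo and ADR-009.

```python
import json


def inventory_from_tf_output(tf_output: dict) -> dict:
    """Turn parsed `terraform output -json` data into an Ansible JSON inventory.

    Assumption (hypothetical): a Terraform output called "vms" whose value
    maps hostname -> {"ip": "<address>", "group": "<ansible group>"}.
    """
    # `terraform output -json` wraps every output in an object with a "value" key.
    vms = tf_output["vms"]["value"]

    # Ansible's JSON inventory format: one key per group, plus _meta.hostvars.
    inventory = {"_meta": {"hostvars": {}}}
    for name, attrs in sorted(vms.items()):
        group = inventory.setdefault(attrs["group"], {"hosts": []})
        group["hosts"].append(name)
        inventory["_meta"]["hostvars"][name] = {"ansible_host": attrs["ip"]}
    return inventory


if __name__ == "__main__":
    # Illustrative data only; hostnames, group, and addresses are made up.
    sample = {
        "vms": {
            "value": {
                "docker1": {"ip": "192.0.2.10", "group": "docker_hosts"},
                "docker2": {"ip": "192.0.2.11", "group": "docker_hosts"},
            }
        }
    }
    print(json.dumps(inventory_from_tf_output(sample), indent=2))
```

Keeping the conversion a pure function of Terraform's output means Ansible never has to query Proxmox itself, which matches the ownership boundary stated above.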

The name — boma is Swahili for a fortified homestead enclosure (a stockade guarding what's within) — fitting for a hardened, self-contained home setup. It keeps company with the project's other Swahili names: askari (the external sentinel) and nyumbani ("home").

Quick start (control node)

git clone <repo-url> ~/ansible
cd ~/ansible

# Create venv and install dependencies
make setup
make collections

# Unlock the vault password from Vaultwarden via rbw
# (one-time rbw setup: docs/runbooks/rotate-secrets.md)
rbw unlock

# Verify setup
make lint
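The `rbw unlock` step works because Ansible accepts an executable as its vault password source and reads the password from that program's standard output. As a rough sketch of how such a client could look (the repo's actual wiring is described in docs/runbooks/rotate-secrets.md and may differ), the function below shells out to `rbw get`. The entry name `ansible-vault` is a hypothetical placeholder, not taken from the repo.

```python
import subprocess

# Hypothetical Vaultwarden entry name; the real one is defined by the repo's setup.
VAULT_ENTRY = "ansible-vault"


def fetch_vault_password(entry: str = VAULT_ENTRY, runner=subprocess.run) -> str:
    """Return the password stored under `entry` by asking the rbw CLI.

    `runner` is injectable so the function can be exercised without rbw installed.
    check=True makes a locked or missing vault raise instead of returning "".
    """
    result = runner(
        ["rbw", "get", entry],
        capture_output=True,
        text=True,
        check=True,
    )
    # Strip only the trailing newline so meaningful whitespace in the secret survives.
    return result.stdout.rstrip("\n")


# Used as a vault password client, the script would finish with:
#     print(fetch_vault_password())
# and be referenced from ansible.cfg via `vault_password_file`.
```

With this pattern the vault password never sits in a file on disk; if rbw is locked, the command fails and Ansible stops instead of running with a wrong or empty password.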

Common operations

| What | Command |
|---|---|
| Lint everything | `make lint` |
| Dry-run site playbook | `make check PLAYBOOK=site` |
| Deploy everything | `make deploy PLAYBOOK=site` |
| Test a role | `make test ROLE=base` |
| Scaffold a new role | `make new-role NAME=myservice` |
See Makefile for the full list of targets.
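To make the difference between the dry-run and deploy targets concrete, here is a small sketch of the `ansible-playbook` invocations they plausibly wrap. This is an assumption about the Makefile, not a copy of it: the function name, the default inventory, and the exact flag set are illustrative, and only the `playbooks/` and `inventories/` paths come from the project structure.

```python
def ansible_argv(
    playbook: str,
    *,
    dry_run: bool,
    inventory: str = "inventories/staging",
) -> list[str]:
    """Build an ansible-playbook command line for a named playbook.

    Assumption: `make check` adds --check --diff (report changes, apply
    nothing), while `make deploy` runs the same command without them.
    """
    argv = ["ansible-playbook", "-i", inventory, f"playbooks/{playbook}.yml"]
    if dry_run:
        # --check predicts changes without applying; --diff shows file-level deltas.
        argv += ["--check", "--diff"]
    return argv


if __name__ == "__main__":
    print(" ".join(ansible_argv("site", dry_run=True)))
    print(" ".join(ansible_argv("site", dry_run=False)))
```

Defaulting to the staging inventory in this sketch reflects the note in the project structure that staging is safe to run freely while production should be edited carefully.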

Project structure

.
├── CLAUDE.md               # Claude Code session context
├── Makefile                # All operations go through here
├── ansible.cfg             # Project-scoped Ansible config
├── requirements.txt        # Python dependencies
├── requirements.yml        # Ansible collections
│
├── docs/
│   ├── decisions/          # Architecture decision records (ADRs)
│   ├── runbooks/           # Step-by-step operational procedures
│   ├── security/           # Per-service security checklist + templates + accepted risks
│   ├── testing/            # VERIFY.md template + service-UI verification reports
│   ├── access/             # ACCESS.md template (ADR-021)
│   ├── backup/             # BACKUP.md template (ADR-022)
│   ├── hardware/           # Physical capacity reference + reviews
│   └── reviews/            # /review-repo reports
│
├── inventories/
│   ├── production/         # Live hosts — edit carefully
│   └── staging/            # Test hosts — safe to run freely
│
├── playbooks/              # Orchestration playbooks
│   ├── site.yml            # Full standard state
│   ├── workstation.yml     # Developer environment (control group)
│   └── bootstrap.yml       # First-run new host setup
│
├── roles/                  # Ansible roles
│   ├── base/               # OS baseline applied to all hosts
│   ├── dev_env/            # Interactive developer environment
│   └── docker_host/        # Docker runtime setup
│
├── terraform/              # VM provisioning only — no DNS (see ADR-006/009)
│   ├── modules/            # Reusable modules (proxmox_vm)
│   └── environments/       # Per-env state: staging/, production/
│
└── scripts/                # Helper scripts (tf_to_inventory.py)

Documentation

Current state (built vs planned): STATUS.md — read this before assuming something exists; the ADRs describe intent, not necessarily reality.
AI agents: AGENTS.md (points to CLAUDE.md, the authoritative guide)
Architecture: docs/decisions/001-architecture.md
Security baseline: docs/decisions/002-security.md
Toolchain decisions: docs/decisions/003-toolchain.md
Docker model: docs/decisions/004-docker-model.md
Bootstrapping: docs/decisions/005-bootstrapping.md
Terraform: docs/decisions/006-terraform.md
Network topology: docs/decisions/007-network.md
Testing methodology: docs/decisions/008-testing.md
Terraform ↔ Ansible handoff: docs/decisions/009-provisioning-handoff.md
Forgejo & CI: docs/decisions/010-forgejo-ci.md
Update management: docs/decisions/011-update-management.md
Hardware & capacity: docs/decisions/012-hardware-capacity.md
Heritage / V4 policy: docs/decisions/013-heritage-v4.md
Sourcing technical knowledge: docs/decisions/014-knowledge-sourcing.md
Control / AI-worker host (ubongo): docs/decisions/015-control-host.md
Mesh VPN (NetBird): docs/decisions/016-mesh-vpn.md
Service-UI verification (Level 4): docs/decisions/017-service-ui-verification.md
Logging & log integrity: docs/decisions/018-logging.md
Tagging & run-targeting: docs/decisions/019-tagging.md
Firewall strategy: docs/decisions/020-firewall.md
Operational access: docs/decisions/021-operational-access.md
Backup & disaster recovery: docs/decisions/022-backup.md
ADR structure & lifecycle: docs/decisions/023-adr-structure.md
Reverse proxy (Caddy): docs/decisions/024-reverse-proxy.md

(CLAUDE.md carries the full cross-referenced table, including the runbooks and security/testing docs.)

Contributing

See CONTRIBUTING.md for conventions, branching strategy, and how to add roles.
