# boma

Infrastructure-as-code for a self-hosted homelab: a Proxmox cluster of Debian 13 VMs
running Docker services, provisioned with **Terraform** and configured with
**Ansible**. Stable, secure, reproducible, and fully version-controlled.

**Scope.** This repo manages *infrastructure*: the cluster's VMs, their hardened
base OS, and the containerised services they run. It does **not** manage personal
machines (laptops, desktops, phones). Terraform owns VM existence; Ansible owns
everything inside a VM. See `STATUS.md` for what's built vs planned and
`docs/decisions/` for the design rationale.

**The name.** *boma* is Swahili for a fortified homestead enclosure (a stockade
guarding what's within), fitting for a hardened, self-contained home setup. It
keeps company with the project's other Swahili names: `askari` (the external
sentinel) and `nyumbani` ("home").
## Quick start (control node)

```bash
git clone <repo-url> ~/ansible
cd ~/ansible

# Create venv and install dependencies
make setup
make collections

# Unlock the vault password from Vaultwarden via rbw
# (one-time rbw setup: docs/runbooks/rotate-secrets.md)
rbw unlock

# Verify setup
make lint
```
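How the vault password gets from `rbw` to Ansible is not shown in this README. One common pattern is an executable vault password script, which Ansible runs and whose standard output it uses as the password. The sketch below illustrates that pattern only; the entry name `ansible-vault` and the function names are hypothetical, not taken from this repo.

```python
#!/usr/bin/env python3
"""Sketch of a vault password script backed by rbw (hypothetical names)."""
import subprocess
import sys

# Assumption: the Vaultwarden entry holding the vault password has this name.
ENTRY_NAME = "ansible-vault"


def fetch_vault_password(entry=ENTRY_NAME, run=subprocess.run):
    """Return the password stored under `entry`, via `rbw get <entry>`.

    `run` is injectable so the function can be exercised without rbw installed.
    """
    result = run(["rbw", "get", entry], capture_output=True, text=True, check=True)
    return result.stdout.strip()


def main():
    """Print the password to stdout, or exit non-zero with a message on stderr."""
    try:
        print(fetch_vault_password())
    except (subprocess.CalledProcessError, FileNotFoundError) as exc:
        sys.exit(f"could not read vault password from rbw: {exc}")
```

Under these assumptions, the file would end with a call to `main()`, be marked executable, and be referenced by `vault_password_file` in `ansible.cfg`. If the agent is locked, `rbw get` fails and the script exits non-zero, which is why `rbw unlock` comes first.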
## Common operations

| What                  | Command                        |
| --------------------- | ------------------------------ |
| Lint everything       | `make lint`                    |
| Dry-run site playbook | `make check PLAYBOOK=site`     |
| Deploy everything     | `make deploy PLAYBOOK=site`    |
| Test a role           | `make test ROLE=base`          |
| Scaffold a new role   | `make new-role NAME=myservice` |

See `Makefile` for the full list of targets.

## Project structure
```
.
├── CLAUDE.md              # Claude Code session context
├── Makefile               # All operations go through here
├── ansible.cfg            # Project-scoped Ansible config
├── requirements.txt       # Python dependencies
├── requirements.yml       # Ansible collections
├── docs/
│   ├── decisions/         # Architecture decision records (ADRs)
│   ├── runbooks/          # Step-by-step operational procedures
│   ├── security/          # Per-service security checklist + templates + accepted risks
│   ├── testing/           # VERIFY.md template + service-UI verification reports
│   ├── access/            # ACCESS.md template (ADR-021)
│   ├── backup/            # BACKUP.md template (ADR-022)
│   ├── hardware/          # Physical capacity reference + reviews
│   └── reviews/           # /review-repo reports
├── inventories/
│   ├── production/        # Live hosts; edit carefully
│   └── staging/           # Test hosts; safe to run freely
├── playbooks/             # Orchestration playbooks
│   ├── site.yml           # Full standard state
│   ├── workstation.yml    # Developer environment (control group)
│   └── bootstrap.yml      # First-run new host setup
├── roles/                 # Ansible roles
│   ├── base/              # OS baseline applied to all hosts
│   ├── dev_env/           # Interactive developer environment
│   └── docker_host/       # Docker runtime setup
├── terraform/             # VM provisioning only, no DNS (see ADR-006/009)
│   ├── modules/           # Reusable modules (proxmox_vm)
│   └── environments/      # Per-env state: staging/, production/
└── scripts/               # Helper scripts (tf_to_inventory.py)
```
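Terraform creates the VMs and Ansible configures them, so host details have to cross from one tool to the other; `scripts/tf_to_inventory.py` is the helper listed for that. Its real behaviour is documented in ADR-009, not here. As a rough illustration of the idea only, the sketch below turns the JSON printed by `terraform output -json` into an Ansible-style inventory dict. The output name `vms`, the group name `docker_hosts`, and the name-to-IP map shape are assumptions made for the example.

```python
import json


def tf_outputs_to_inventory(tf_json, output_name="vms", group="docker_hosts"):
    """Build an Ansible inventory dict from `terraform output -json` text.

    Assumes (hypothetically) that the Terraform output `output_name` is a
    map of hostname -> IP address.
    """
    outputs = json.loads(tf_json)
    # `terraform output -json` wraps each output as {"value": ..., "type": ..., "sensitive": ...}
    vms = outputs[output_name]["value"]
    # Sort hosts so the generated inventory is stable between runs.
    hosts = {name: {"ansible_host": ip} for name, ip in sorted(vms.items())}
    return {"all": {"children": {group: {"hosts": hosts}}}}


# Sample input in the shape Terraform emits; the hosts and addresses are made up.
sample = json.dumps({
    "vms": {
        "sensitive": False,
        "type": ["map", "string"],
        "value": {"vm-b": "10.0.0.12", "vm-a": "10.0.0.11"},
    }
})

inventory = tf_outputs_to_inventory(sample)
print(json.dumps(inventory, indent=2))
```

Keeping the conversion a pure function of Terraform's output means the inventory can be regenerated at any time and never needs hand edits.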
## Documentation

- **Current state (built vs planned): `STATUS.md`.** Read this before assuming
  something exists; the ADRs describe intent, not necessarily reality.
- AI agents: `AGENTS.md` (points to `CLAUDE.md`, the authoritative guide)
- Architecture: `docs/decisions/001-architecture.md`
- Security baseline: `docs/decisions/002-security.md`
- Toolchain decisions: `docs/decisions/003-toolchain.md`
- Docker model: `docs/decisions/004-docker-model.md`
- Bootstrapping: `docs/decisions/005-bootstrapping.md`
- Terraform: `docs/decisions/006-terraform.md`
- Network topology: `docs/decisions/007-network.md`
- Testing methodology: `docs/decisions/008-testing.md`
- Terraform ↔ Ansible handoff: `docs/decisions/009-provisioning-handoff.md`
- Forgejo & CI: `docs/decisions/010-forgejo-ci.md`
- Update management: `docs/decisions/011-update-management.md`
- Hardware & capacity: `docs/decisions/012-hardware-capacity.md`
- Heritage / V4 policy: `docs/decisions/013-heritage-v4.md`
- Sourcing technical knowledge: `docs/decisions/014-knowledge-sourcing.md`
- Control / AI-worker host (`ubongo`): `docs/decisions/015-control-host.md`
- Mesh VPN (NetBird): `docs/decisions/016-mesh-vpn.md`
- Service-UI verification (Level 4): `docs/decisions/017-service-ui-verification.md`
- Logging & log integrity: `docs/decisions/018-logging.md`
- Tagging & run-targeting: `docs/decisions/019-tagging.md`
- Firewall strategy: `docs/decisions/020-firewall.md`
- Operational access: `docs/decisions/021-operational-access.md`
- Backup & disaster recovery: `docs/decisions/022-backup.md`
- ADR structure & lifecycle: `docs/decisions/023-adr-structure.md`
- Reverse proxy (Caddy): `docs/decisions/024-reverse-proxy.md`

(`CLAUDE.md` carries the full cross-referenced table, including the runbooks and
security/testing docs.)

## Contributing

See `CONTRIBUTING.md` for conventions, branching strategy, and how to add roles.