boma/docs/decisions/006-terraform.md
sjat fe4228fb38 Add architecture decision records and runbooks
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-05-30 14:10:01 +02:00

# ADR-006 — Terraform for infrastructure provisioning
## Context
Ansible manages host configuration well but has no state model for infrastructure
existence. Adding Terraform handles the "what exists" layer — creating and destroying
VMs on Proxmox — while Ansible continues to own everything that runs inside them,
including all internal DNS records.
This complements rather than replaces Ansible. The two tools do not overlap. The
exact boundary, handoff pipeline, and data contract between them live in **ADR-009
(provisioning handoff)** — this ADR covers Terraform's own internals only.
---
## Responsibility split
The canonical responsibility-split table lives in **ADR-009**. In short: Terraform
owns VM existence only; Ansible owns everything inside a VM, including all internal
DNS records.
**OPNsense is entirely Ansible.** The available Terraform providers for OPNsense
are community-maintained with real risk of provider rot across OPNsense releases.
OPNsense firewall rules also change on a service cadence, not an infrastructure
cadence, making them a poor fit for Terraform state.
---
## Providers
**`bpg/proxmox` (`~> 0.70`)**: Chosen over `telmate/proxmox` for active maintenance,
full Proxmox 8 API support, and better cloud-init integration. This is the only
provider.
Terraform does **not** manage DNS. An earlier design used `hashicorp/dns` (RFC 2136)
to write A records, but that created a bootstrap cycle — the first DNS server cannot
register itself — and split DNS ownership across two tools. Ansible's `dns` role now
owns the entire internal zone, rendered from inventory. See ADR-009.
No Ansible Galaxy roles are involved. Terraform manages its own provider
dependencies via `required_providers` and `.terraform.lock.hcl`, which is tracked
in git once `terraform init` has been run.
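
As a minimal sketch, the provider pin described above could be declared like this. The local name `proxmox` and the file placement in `providers.tf` are assumptions; the source address and version constraint come from this ADR.

```hcl
# providers.tf (sketch): pins the single provider this ADR allows.
terraform {
  required_providers {
    # "proxmox" is the local name used by provider and resource blocks.
    proxmox = {
      source = "bpg/proxmox"
      # "~> 0.70" accepts 0.70 and later 0.x releases, but never 1.0.
      version = "~> 0.70"
    }
  }
}
```

Running `terraform init` against a block like this resolves a concrete version and records it, with checksums, in `.terraform.lock.hcl`.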
---
## State backend
**Choice**: Forgejo HTTP backend (self-hosted at git.baobab.band)
Keeps all state on the same self-hosted stack without additional services.
Authentication uses a Forgejo personal access token via `TF_HTTP_USERNAME` and
`TF_HTTP_PASSWORD` environment variables.
**Note**: The backend URL in `backend.tf` is a placeholder — confirm the exact
endpoint path against your running Forgejo instance's API documentation before
running `terraform init`. If Forgejo's HTTP state is unavailable, remove the
`backend` block from `backend.tf` to fall back to local state on the control node.
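
A rough sketch of what `backend.tf` might contain, using Terraform's generic `http` backend. The `address` value is a deliberate placeholder, since this ADR does not specify the Forgejo endpoint path.

```hcl
# backend.tf (sketch): generic "http" backend pointed at the Forgejo host.
terraform {
  backend "http" {
    # Placeholder only: replace with the state endpoint confirmed against
    # the running Forgejo instance. Each environment uses its own path.
    address = "https://git.baobab.band/REPLACE-WITH-STATE-ENDPOINT"

    # No credentials here. Terraform reads TF_HTTP_USERNAME and
    # TF_HTTP_PASSWORD from the environment when it talks to the backend.
  }
}
```

Backend blocks cannot reference input variables, which is one reason the credentials are supplied through environment variables rather than through `variables.tf`.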
---
## Structure
```
terraform/
  modules/
    proxmox_vm/       # reusable VM module (Proxmox only, no DNS)
  environments/
    staging/          # staging VMs, separate state file
    production/       # production VMs, separate state file
```
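Environments use separate directories rather than Terraform workspaces for the
clearest isolation. Each directory has its own backend configuration and state, so
the environment being changed is always the one you are standing in, which greatly
reduces the risk of applying against the wrong state.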
Each environment directory contains:
- `providers.tf` — provider version pins and configuration
- `backend.tf` — Forgejo state backend (environment-specific path)
- `variables.tf` — input declarations
- `terraform.tfvars.example` — tracked template; copy to `terraform.tfvars` for actual values
- `main.tf` — `local.vms` map and module calls (no DNS resources)
- `outputs.tf` — VM map consumed by `make tf-inventory`
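
To illustrate how `main.tf` and `outputs.tf` fit together, here is a hedged sketch of a `local.vms` map fanned out over the module with `for_each`. The VM name, the attribute names (`cores`, `memory_mb`, `ipv4`), and the module's input variables are hypothetical; the real module interface and the `vms` output contract are defined elsewhere (see ADR-009).

```hcl
# main.tf (sketch): one map entry per VM, one module instance per entry.
locals {
  vms = {
    # Hypothetical example VM; keys become the VM names.
    "app-01" = {
      cores     = 2
      memory_mb = 4096
      ipv4      = "10.0.10.11"
    }
  }
}

module "vm" {
  source   = "../../modules/proxmox_vm"
  for_each = local.vms

  # Input names below are assumptions about the module's interface.
  name      = each.key
  cores     = each.value.cores
  memory_mb = each.value.memory_mb
  ipv4      = each.value.ipv4
}

# outputs.tf (sketch): the map later read by `make tf-inventory`.
output "vms" {
  value = { for name, vm in local.vms : name => { ipv4 = vm.ipv4 } }
}
```

Keeping every VM in a single map means adding or removing a host is a one-entry change, and the same map drives both the module calls and the output.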
---
## Secrets handling
The only secret input (the Proxmox API token) is passed via a `TF_VAR_*`
environment variable and declared `sensitive = true` in `variables.tf`. It never
appears in `.tfvars` files. Non-secret configuration lives in tracked
`terraform.tfvars.example`; the real `terraform.tfvars` is gitignored.
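
A minimal sketch of the pattern described above. The variable names `proxmox_api_token` and `proxmox_endpoint` are assumptions; `endpoint` and `api_token` are arguments of the `bpg/proxmox` provider.

```hcl
# variables.tf (sketch)
variable "proxmox_endpoint" {
  type        = string
  description = "Proxmox API URL. Non-secret, so it may live in terraform.tfvars."
}

variable "proxmox_api_token" {
  type        = string
  description = "Proxmox API token. Supplied only via the environment."
  # Redacts the value in plan and apply output.
  sensitive = true
}

# providers.tf (sketch)
provider "proxmox" {
  endpoint  = var.proxmox_endpoint
  api_token = var.proxmox_api_token
}

# On the control node, before plan/apply (shell, shown here as a comment):
#   export TF_VAR_proxmox_api_token='<user>@<realm>!<token-id>=<secret>'
```

Note that `sensitive = true` only hides the value from CLI output; it is still stored in plain text in the state, so access to the state backend needs the same care as the token itself.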
---
## Ansible integration
After `terraform apply`, run `make tf-inventory TF_ENV=<env>` to regenerate
`inventories/<env>/hosts.yml` from the `vms` output. The full handoff pipeline,
the `vms` output → inventory data contract, and the generator script
(`scripts/tf_to_inventory.py`) are documented in **ADR-009 (provisioning
handoff)**.
---
## What was ruled out
| Option | Reason |
|---|---|
| `telmate/proxmox` provider | Less actively maintained; weaker cloud-init and Proxmox 8 support |
| OPNsense Terraform provider | Community-maintained; provider rot risk across OPNsense releases |
| Terraform workspaces | All environments share one working directory and backend configuration, separated only by the selected workspace; accidental cross-env apply possible |
| Separate Terraform repo | Cross-referencing between infra and config adds friction; monorepo keeps the full picture together |