2026-05-30 14:10:01 +02:00
# ADR-006 — Terraform for infrastructure provisioning

## Context

Ansible manages host configuration well but has no state model for whether
infrastructure exists. Terraform handles that "what exists" layer (creating and
destroying VMs on Proxmox), while Ansible continues to own everything that runs
inside them, including all internal DNS records.

Terraform complements rather than replaces Ansible, and the two tools do not overlap.
The exact boundary, handoff pipeline, and data contract between them live in
**ADR-009 (provisioning handoff)**; this ADR covers Terraform's own internals only.

---

## Responsibility split

The canonical responsibility-split table lives in **ADR-009**. In short: Terraform
owns VM existence only; Ansible owns everything inside a VM, including all internal
DNS records.

**OPNsense is entirely Ansible.** The available Terraform providers for OPNsense
are community-maintained with real risk of provider rot across OPNsense releases.
OPNsense firewall rules also change on a service cadence, not an infrastructure
cadence, making them a poor fit for Terraform state.

---

## Providers

**`bpg/proxmox` (`~> 0.70`)**: Chosen over `telmate/proxmox` for active maintenance,
full Proxmox 8 API support, and better cloud-init integration. This is the only
provider.

Terraform does **not** manage DNS. An earlier design used `hashicorp/dns` (RFC 2136)
to write A records, but that created a bootstrap cycle (the first DNS server cannot
register itself) and split DNS ownership across two tools. Ansible's `dns` role now
owns the entire internal zone, rendered from inventory. See ADR-009.

Terraform manages its own provider dependencies via `required_providers` and
`.terraform.lock.hcl` (tracked in git once `terraform init` has been run).
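
As a rough illustration of the pin described in this section, the `terraform` block in `providers.tf` could look like the sketch below. The provider source and the `~> 0.70` constraint come from this ADR; the local name `proxmox` and the exact layout are assumptions, not a copy of the repository's file.

```hcl
terraform {
  required_providers {
    # "proxmox" is the local name used by provider and resource blocks.
    proxmox = {
      source = "bpg/proxmox"
      # Pessimistic constraint: accepts 0.70 and later 0.x releases,
      # but never 1.0 or above. The exact version that was selected is
      # recorded in .terraform.lock.hcl by `terraform init`.
      version = "~> 0.70"
    }
  }
}
```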

---

## State backend

**Choice**: Local state on the control node.

Forgejo (Gitea-based) has no usable Terraform HTTP state backend: its API `/raw/`
endpoint is read-only, so state cannot be written there. State therefore lives
locally as `terraform.tfstate` (gitignored) on the control node, which is persistent
and backed up with the rest of the node.

At this scale (solo operator, a handful of VMs) local state is sufficient: no
concurrent applies, so no remote locking is needed. If a remote backend with locking
becomes worthwhile later, add a `backend` block to `backend.tf` pointing at a real
backend such as MinIO/S3; Forgejo is not an option. See ADR-010 for the Forgejo
integration boundary.
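
Terraform falls back to local state whenever no `backend` block is present, so nothing has to be declared for the current choice. If the decision should be visible in `backend.tf`, a sketch like the following would make it explicit. The `path` shown is Terraform's default, written out here only for illustration; whether the repository declares the block at all is an assumption.

```hcl
terraform {
  backend "local" {
    # Resolved relative to the directory Terraform runs in, so each
    # environment directory keeps its own state file.
    path = "terraform.tfstate"
  }
}
```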

---

## Structure

```
terraform/
  modules/
    proxmox_vm/       # reusable VM module (Proxmox only, no DNS)
  environments/
    staging/          # staging VMs, separate state file
    production/       # production VMs, separate state file
```

Separate environment directories (not Terraform workspaces) give the clearest
isolation: each environment has its own directory and state file, so there is no
risk of accidentally applying against the wrong state.

Each environment directory contains:
- `providers.tf`: provider version pins and configuration
- `backend.tf`: state backend configuration (local state for now; see State backend)
- `variables.tf`: input declarations
- `terraform.tfvars.example`: tracked template; copy to `terraform.tfvars` for actual values
- `main.tf`: `local.vms` map and module calls (no DNS resources)
- `outputs.tf`: `vms` output map consumed by `make tf-inventory`
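
To make the roles of `main.tf` and `outputs.tf` concrete, here is a minimal sketch of a `local.vms` map driving the `proxmox_vm` module and being re-exported as the `vms` output. The VM names, the attributes inside each entry (`node`, `cores`, `memory_mb`, `ip`), and the module's input names are all hypothetical; the real module interface lives in the repository, and the actual output contract is defined in ADR-009.

```hcl
locals {
  # Hypothetical VM definitions, keyed by hostname.
  vms = {
    "app-01" = { node = "pve1", cores = 2, memory_mb = 4096, ip = "10.0.10.11" }
    "db-01"  = { node = "pve1", cores = 4, memory_mb = 8192, ip = "10.0.10.12" }
  }
}

module "vm" {
  # Path from environments/<env>/ to the shared module.
  source   = "../../modules/proxmox_vm"
  for_each = local.vms

  # Input names below are assumptions about the module's interface.
  name      = each.key
  node      = each.value.node
  cores     = each.value.cores
  memory_mb = each.value.memory_mb
  ip        = each.value.ip
}

output "vms" {
  description = "VM map consumed by make tf-inventory"
  # Re-exports the input map; the real output may add module-computed fields.
  value = local.vms
}
```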

---

## Secrets handling

The only secret input (the Proxmox API token) is passed via a `TF_VAR_*`
environment variable and declared `sensitive = true` in `variables.tf`. It never
appears in `.tfvars` files. Non-secret configuration lives in tracked
`terraform.tfvars.example`; the real `terraform.tfvars` is gitignored.
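
The pattern can be sketched as follows, assuming the variable is named `proxmox_api_token` (so the environment variable would be `TF_VAR_proxmox_api_token`) and that a non-secret `proxmox_endpoint` variable carries the API URL. Both variable names are illustrative and may differ from the repository. `endpoint` and `api_token` are arguments of the `bpg/proxmox` provider block.

```hcl
variable "proxmox_endpoint" {
  description = "Proxmox API URL; non-secret, set in terraform.tfvars"
  type        = string
}

variable "proxmox_api_token" {
  description = "Proxmox API token; supplied via TF_VAR_proxmox_api_token"
  type        = string
  sensitive   = true # value is redacted in plan and apply output
}

provider "proxmox" {
  endpoint  = var.proxmox_endpoint
  api_token = var.proxmox_api_token
}
```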

---

## Ansible integration

After `terraform apply`, run `make tf-inventory TF_ENV=<env>` to regenerate
`inventories/<env>/hosts.yml` from the `vms` output. The full handoff pipeline,
the `vms` output → inventory data contract, and the generator script
(`scripts/tf_to_inventory.py`) are documented in **ADR-009 (provisioning
handoff)**.

---

## What was ruled out

| Option | Reason |
|---|---|
| `telmate/proxmox` provider | Less actively maintained; weaker cloud-init and Proxmox 8 support |
| OPNsense Terraform provider | Community-maintained; provider rot risk across OPNsense releases |
| Terraform workspaces | One configuration directory and backend shared by all environments; the target depends on the currently selected workspace, so an accidental cross-env apply is possible |
| Separate Terraform repo | Cross-referencing between infra and config adds friction; monorepo keeps the full picture together |