ADR-006 — Terraform for infrastructure provisioning
Status
Accepted (2026-05-30)
Context
Ansible manages host configuration well but has no state model for infrastructure existence. Adding Terraform handles the "what exists" layer — creating and destroying VMs on Proxmox and Hetzner — while Ansible continues to own everything that runs inside them, including all internal DNS records.
This complements rather than replaces Ansible. The two tools do not overlap. The exact boundary, handoff pipeline, and data contract between them live in ADR-009 (provisioning handoff) — this ADR covers Terraform's own internals only.
Decision
Responsibility split
The canonical responsibility-split table lives in ADR-009. In short: Terraform owns VM existence only; Ansible owns everything inside a VM, including all internal DNS records.
OPNsense is entirely Ansible. The available Terraform providers for OPNsense are community-maintained with real risk of provider rot across OPNsense releases. OPNsense firewall rules also change on a service cadence, not an infrastructure cadence, making them a poor fit for Terraform state.
Providers
bpg/proxmox (~> 0.70): Chosen over telmate/proxmox for active maintenance,
full Proxmox 8 API support, and better cloud-init integration. This is the provider
for Proxmox VMs.
hetznercloud/hcloud (~> 1.65): Owns off-site VM existence (askari). ADR-006's
scope is now Proxmox + Hetzner: "Terraform owns VM existence" generalizes across
providers. The offsite environment and hetzner_vm module live alongside the Proxmox
environments and proxmox_vm module; each environment has its own local state.
Terraform does not manage DNS. An earlier design used hashicorp/dns (RFC 2136)
to write A records, but that created a bootstrap cycle — the first DNS server cannot
register itself — and split DNS ownership across two tools. Ansible's dns role now
owns the entire internal zone, rendered from inventory. See ADR-009.
Terraform manages its own provider dependencies via required_providers and
.terraform.lock.hcl (tracked in git once terraform init has been run).
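As a minimal sketch of the pin described above, a Proxmox environment's providers.tf might declare its provider as follows. Only the source address and version constraint come from this ADR; the local name proxmox is an assumption. The offsite environment would presumably declare hetznercloud/hcloud with ~> 1.65 in the same way.

```hcl
# providers.tf for a Proxmox environment (illustrative sketch, not the actual file)
terraform {
  required_providers {
    # "proxmox" is an assumed local name; source and constraint are from this ADR.
    proxmox = {
      source = "bpg/proxmox"
      # ~> 0.70 accepts 0.70 and later 0.x releases, but never 1.0.
      version = "~> 0.70"
    }
  }
}
```

Running terraform init resolves the constraint to one exact version and records it, with checksums, in .terraform.lock.hcl, which is why that file is tracked in git.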
State backend
Choice: Local state on the control node.
Forgejo (Gitea-based) has no usable Terraform HTTP state backend — its API /raw/
endpoint is read-only, so state cannot be written there. State therefore lives
locally as terraform.tfstate (gitignored) on the control node, which is persistent
and backed up with the rest of the node.
At this scale (solo operator, a handful of VMs) local state is sufficient: no
concurrent applies, so no remote locking is needed. If a remote backend with locking
becomes worthwhile later, add a backend block to backend.tf pointing at a real
backend such as MinIO/S3 — Forgejo is not an option. See ADR-010 for the Forgejo
integration boundary.
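The local-state choice can be written out explicitly in backend.tf. This is a sketch only: Terraform falls back to the local backend when no backend block is present, and the path shown is that backend's default, so the block mainly documents intent.

```hcl
# backend.tf (illustrative sketch): state stays on the control node.
terraform {
  backend "local" {
    # Default path, relative to the environment directory; the file is gitignored.
    path = "terraform.tfstate"
  }
}
```

If a remote backend is adopted later, this block would be replaced and the existing state moved across with terraform init -migrate-state.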
Structure
terraform/
  modules/
    proxmox_vm/     # reusable VM module (Proxmox only, no DNS)
    hetzner_vm/     # reusable VM module (Hetzner Cloud, no DNS)
  environments/
    staging/        # staging Proxmox VMs, separate state file
    production/     # production Proxmox VMs, separate state file
    offsite/        # off-site Hetzner VMs (askari), separate state file
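Environments are separate directories rather than Terraform workspaces. This gives the clearest isolation: every apply runs from a directory tied to exactly one state file, which removes the risk of accidentally applying against the wrong state.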
Each environment directory contains:
- providers.tf: provider version pins and configuration
- backend.tf: backend configuration (local state on the control node; no remote backend; see "State backend" above)
- variables.tf: input declarations
- terraform.tfvars.example: tracked template; copy to terraform.tfvars for actual values
- main.tf: local.vms map and module calls (no DNS resources)
- outputs.tf: VM map consumed by make tf-inventory
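To make the main.tf and outputs.tf roles concrete, here is a hedged sketch of how a local.vms map could drive the module and feed the vms output. The VM name, the attribute names (cores, memory_mb, ipv4), the module's input variables, and the output shape are all hypothetical; the real vms contract is defined in ADR-009. The two files are shown together for brevity.

```hcl
# main.tf (illustrative sketch; every name and value below is a placeholder)
locals {
  vms = {
    "app-01" = { cores = 2, memory_mb = 4096, ipv4 = "192.0.2.11" }
    "app-02" = { cores = 4, memory_mb = 8192, ipv4 = "192.0.2.12" }
  }
}

# One module instance per map entry; adding or removing a key adds or removes a VM.
module "vm" {
  source   = "../../modules/proxmox_vm"
  for_each = local.vms

  # Assumed module inputs; the actual proxmox_vm variables may differ.
  name      = each.key
  cores     = each.value.cores
  memory_mb = each.value.memory_mb
  ipv4      = each.value.ipv4
}

# outputs.tf (illustrative sketch): the map read by make tf-inventory.
output "vms" {
  value = { for name, vm in local.vms : name => { ipv4 = vm.ipv4 } }
}
```

Keying the module on a map rather than a list means each VM is addressed by name in state (for example module.vm["app-01"]), so removing one entry does not shift the others.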
Secrets handling
The only secret input (the Proxmox API token) is passed via a TF_VAR_*
environment variable and declared sensitive = true in variables.tf. It never
appears in .tfvars files. Non-secret configuration lives in tracked
terraform.tfvars.example; the real terraform.tfvars is gitignored.
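A sketch of the corresponding declaration, assuming the variable is called proxmox_api_token (the actual name is not given in this ADR):

```hcl
# variables.tf (illustrative sketch; the variable name is hypothetical)
variable "proxmox_api_token" {
  description = "Proxmox API token, read from the TF_VAR_proxmox_api_token environment variable"
  type        = string
  sensitive   = true # redacts the value in plan and apply output
}
```

With this name, Terraform would read the value from TF_VAR_proxmox_api_token in the shell environment. Note that sensitive = true only redacts CLI output; the value can still end up in terraform.tfstate in plain text, which is a further reason the state file stays gitignored.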
Ansible integration
After terraform apply, run make tf-inventory TF_ENV=<env> to regenerate
inventories/<env>/hosts.yml from the vms output. The full handoff pipeline,
the vms output → inventory data contract, and the generator script
(scripts/tf_to_inventory.py) are documented in ADR-009 (provisioning
handoff).
What was ruled out
| Option | Reason |
|---|---|
| telmate/proxmox provider | Less actively maintained; weaker cloud-init and Proxmox 8 support |
| OPNsense Terraform provider | Community-maintained; provider rot risk across OPNsense releases |
| Terraform workspaces | One configuration directory and backend shared by all environments, with state separated only by workspace name; accidental cross-env apply possible |
| Separate Terraform repo | Cross-referencing between infra and config adds friction; monorepo keeps the full picture together |
Consequences
Drawn from the "What was ruled out" section and the decisions stated above:
- bpg/proxmox is the provider for Proxmox VMs; telmate/proxmox was ruled out for weaker maintenance and Proxmox 8 / cloud-init support (Providers; What was ruled out).
- hetznercloud/hcloud is the provider for off-site VM existence (askari); ADR-006's scope now covers Proxmox + Hetzner (Providers).
- OPNsense stays entirely in Ansible, with no Terraform OPNsense provider, to avoid community-provider rot across OPNsense releases (Responsibility split; What was ruled out).
- Terraform writes no DNS records; Ansible's dns role owns the entire internal zone, avoiding the bootstrap cycle and split DNS ownership the earlier hashicorp/dns design created (Providers).
- State is local on the control node because Forgejo offers no usable HTTP state backend; this is sufficient at solo-operator scale (no concurrent applies, no remote locking), with a real backend such as MinIO/S3 to be added later if warranted (State backend).
- Separate environment directories are used instead of Terraform workspaces to remove the risk of applying the wrong state (Structure; What was ruled out).
- Terraform and Ansible internals are kept in one monorepo rather than a separate Terraform repo to avoid cross-referencing friction (What was ruled out).