boma/docs/decisions/001-architecture.md
sjat 93f2a847c7 Reconcile CI to trunk-based; mark base/docker_host not-built (R6-R8,R15-R16)
R6/R7: ADR-003 & ADR-008 CI pipelines rewritten trunk-based (push to main ->
test -> staging -> [manual gate] production); CLAUDE.md no longer forbids pushing
to main. R8: STATUS/roles-README/site.yml now say base & docker_host are not built
(not in git), so a clean clone errors. R15/R16: ADR-001 table flagged as intended
design; dropped the unbuilt 'monitoring agent' from the baseline.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-05-30 19:32:37 +02:00


ADR-001 — Architecture overview

Context

This document describes the overall architecture of the homelab infrastructure and the boundaries of what this Ansible monorepo manages.

Infrastructure

  • Hypervisor: Proxmox cluster (2+ nodes)
  • Guest OS: Debian 13 (all managed hosts)
  • Scale: 25 VMs. This is a small fleet whose hosts are treated as individuals, not cattle
  • Control node: A dedicated Debian 13 VM on the cluster. Ansible runs from here. The control node is the one host that cannot fully bootstrap itself from scratch and requires manual initial setup (see docs/runbooks/new-host.md).

What this repo manages

| Layer | Managed by | Notes |
| --- | --- | --- |
| VM existence | Terraform (terraform/) | Clones the cloud-init template; control node is the one manual exception (see ADR-009) |
| Internal DNS records | Ansible dns role | Internal zone rendered from inventory (see ADR-007/009) |
| OS baseline | Ansible base role | Users, SSH, firewall, updates, audit |
| Docker runtime | Ansible docker_host role | Engine, daemon config, log driver |
| Service deployment | Ansible per-service roles | Compose rendered from templates |
| Secrets | Ansible Vault | Encrypted vault.yml files in repo |

The Terraform↔Ansible boundary and handoff are defined in ADR-009. This table describes the intended design — see STATUS.md for what is actually built.
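
As a rough illustration of how the Ansible rows of the table could map onto plays, the sketch below applies the base role to the control node and the Docker hosts, and the docker_host role to the Docker hosts only. The role and group names come from this document. The play layout, the use of `become`, and the exact set of hosts that receive base are assumptions, and the sketch reflects the intended design rather than what is currently built.

```yaml
# site.yml (illustrative sketch of the intended design, not the actual playbook)

- name: Apply the OS baseline
  hosts: control:docker_hosts   # assumption: base applies to both groups
  become: true                  # assumption: tasks need root
  roles:
    - base                      # users, SSH, firewall, updates, audit

- name: Install the Docker runtime
  hosts: docker_hosts           # the control group never gets this role
  become: true
  roles:
    - docker_host               # engine, daemon config, log driver
```

Proxmox nodes are left out of the sketch because their limited management scope is not spelled out here.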

Host groups

all
├── control           # the control node itself — baseline config only, runs no services
├── docker_hosts      # VMs running Docker services (most hosts)
└── proxmox_hosts     # Proxmox nodes themselves (limited management scope)

The control group holds the single manually-provisioned control node; it is managed for baseline config (SSH, firewall, updates) but never runs the docker_host role. Proxmox nodes are managed only for basic baseline tasks (SSH). Proxmox configuration itself (storage, clustering, networking) is out of scope.
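
The group tree above could be expressed as a YAML inventory along the following lines. The group names are taken from this document; the file path and every hostname are hypothetical placeholders.

```yaml
# inventory/hosts.yml (hypothetical path; all hostnames are placeholders)
all:
  children:
    control:
      hosts:
        ctl01.lab.example:      # the single manually provisioned control node
    docker_hosts:
      hosts:
        app01.lab.example:
        app02.lab.example:
    proxmox_hosts:
      hosts:
        pve01.lab.example:
        pve02.lab.example:
```

Keeping the control node in its own group lets plays target it for baseline config while excluding it from anything scoped to docker_hosts.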

Service interaction model

Services run as Docker containers on one or more docker_hosts. Where services need to interact, they do so via:

  • Docker networks (same host)
  • Internal DNS / hostname resolution (cross-host)
  • Explicitly defined published ports (external access)
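
A minimal Compose sketch of the three mechanisms listed above, assuming a hypothetical `app` service with a cache on the same host and a database on another host. Service names, images, ports, and hostnames are all illustrative and not taken from this repo.

```yaml
# Rendered Compose file on one docker_host (illustrative only)
services:
  app:
    image: example/app:1.0          # hypothetical image
    networks:
      - backend                     # same-host traffic goes over a Docker network
    environment:
      DB_HOST: db01.lab.example     # cross-host traffic uses internal DNS (placeholder name)
    ports:
      - "8080:8080"                 # external access only through an explicitly published port

  cache:
    image: example/cache:1.0        # hypothetical image
    networks:
      - backend                     # reachable from app as "cache"; no published port

networks:
  backend: {}
```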

All Compose files are rendered by Ansible from Jinja2 templates. Compose files on hosts are never edited by hand: every deploy regenerates them, so any manual change is overwritten.
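
A per-service role might implement this with two tasks like the ones below. This is a sketch under stated assumptions: the `service_name` variable, the `/opt/...` path, and the template filename are hypothetical, and using `community.docker.docker_compose_v2` (from the community.docker collection) for the deploy step is one possible choice, not something this document specifies.

```yaml
# roles/<service>/tasks/main.yml (illustrative sketch)

- name: Render the Compose file from its Jinja2 template
  ansible.builtin.template:
    src: compose.yml.j2                          # hypothetical template name
    dest: "/opt/{{ service_name }}/compose.yml"  # hypothetical path and variable
    owner: root
    group: root
    mode: "0640"

- name: Bring the Compose project in line with the rendered file
  community.docker.docker_compose_v2:
    project_src: "/opt/{{ service_name }}"       # directory containing compose.yml
    state: present
```

Because the template task rewrites the file on every run, the repo stays the single source of truth for what each host runs.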

Decision

This architecture prioritises:

  • Simplicity: few moving parts, no orchestration layer (no Kubernetes, no Swarm)
  • Reproducibility: any host except the manually bootstrapped control node can be rebuilt from scratch (VM created by Terraform, then configured by Ansible)
  • Legibility: a human reading the repo can understand what runs where