# ADR-007 — Network topology and addressing
## Context
The boma homelab is a Proxmox cluster on a dedicated private network behind an
OPNsense firewall. This document records the agreed physical topology, VLAN
design, IP addressing conventions, naming scheme, and DNS zone structure.
Everything here feeds directly into Terraform variables, Ansible inventory,
and OPNsense configuration.

---
## Physical topology
```
ISP
└── OPNsense (dedicated hardware)
    ├── WAN: ISP uplink
    └── LAN: 802.1q trunk to managed switch
              ┌──────────────┼──────────────┬──────────────┐
              │              │              │              │
             pve0           pve1           pve2        AP1 / AP2
        (eno1 trunk)   (eno1 trunk)   (eno1 trunk)      (trunk)
       (eno2 corosync)(eno2 corosync)(eno2 corosync)
              └──────────────┴──────────────┘
   172.16.0.0/24 (corosync ring, not on managed switch)
```
**Dual NICs per Proxmox node:**

- `eno1`: VLAN-aware trunk. Carries all VLANs via a single VLAN-aware bridge
  (`vmbr0`). VMs get their VLAN tag assigned in Proxmox.
- `eno2`: Dedicated corosync ring (`vmbr1`). Direct link or tiny unmanaged
  switch between the three nodes only. Never touches the main switch fabric.

**Access points** broadcast multiple SSIDs, each tagged to its corresponding VLAN
(trusted WiFi → VLAN 30, IoT → VLAN 40, guest → VLAN 50).

---
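As a rough illustration of the dual-NIC layout, the following Python sketch renders an ifupdown2-style `/etc/network/interfaces` fragment for one node. The bridge names, NIC names, and the mgmt gateway come from this document; the function name `render_interfaces`, the `vmbr0.10` sub-interface for the host's management address, and the `bridge-vids 2-4094` range are assumptions, not settings recorded in this ADR.

```python
def render_interfaces(mgmt_ip: str, ring_ip: str, mgmt_gateway: str = "10.10.0.1") -> str:
    """Render a hypothetical network config for one Proxmox node.

    mgmt_ip is the node's VLAN 10 address, ring_ip its corosync address.
    """
    return f"""\
auto eno1
iface eno1 inet manual

auto eno2
iface eno2 inet manual

# VLAN-aware trunk bridge; VMs attach here with a VLAN tag.
auto vmbr0
iface vmbr0 inet manual
    bridge-ports eno1
    bridge-stp off
    bridge-fd 0
    bridge-vlan-aware yes
    bridge-vids 2-4094

# Assumption: the host's own management address lives on VLAN 10.
auto vmbr0.10
iface vmbr0.10 inet static
    address {mgmt_ip}/24
    gateway {mgmt_gateway}

# Corosync ring; never connected to the managed switch.
auto vmbr1
iface vmbr1 inet static
    address {ring_ip}/24
    bridge-ports eno2
    bridge-stp off
    bridge-fd 0
"""


if __name__ == "__main__":
    print(render_interfaces("10.10.0.200", "172.16.0.200"))
```

Passing the addresses in as arguments keeps the sketch independent of how they are assigned; the output would still need review against the real hardware before use.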
## VLAN design
| VLAN | Name | Subnet | Purpose |
|---|---|---|---|
| 10 | `mgmt` | `10.10.0.0/24` | Proxmox hosts, OPNsense, managed switch. No internet except update repos. |
| 20 | `srv` | `10.20.0.0/24` | All Debian VMs and Docker services. 100% static. Terraform provisions here. |
| 30 | `lan` | `10.30.0.0/24` | Trusted home devices. DHCP. Access to selected `srv` services via OPNsense. |
| 40 | `iot` | `10.40.0.0/24` | Smart home, cameras, printers. DHCP. Internet egress only + HA exception. |
| 50 | `guest` | `10.50.0.0/24` | Guest WiFi. DHCP. Internet only, fully isolated. |
| 99 | `vpn` | `10.99.0.0/24` | WireGuard peers. `askari` (Hetzner) + road-warrior clients. |
---
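Every subnet in the table follows the pattern `10.<vlan-id>.0.0/24`. A minimal sketch of a consistency check, assuming a hypothetical `Vlan` record and `check_vlans` helper (neither exists in the repository as far as this document states), might look like this:

```python
import ipaddress
from dataclasses import dataclass


@dataclass(frozen=True)
class Vlan:
    vlan_id: int
    name: str
    subnet: ipaddress.IPv4Network


# Values copied from the VLAN design table.
VLANS = [
    Vlan(10, "mgmt", ipaddress.ip_network("10.10.0.0/24")),
    Vlan(20, "srv", ipaddress.ip_network("10.20.0.0/24")),
    Vlan(30, "lan", ipaddress.ip_network("10.30.0.0/24")),
    Vlan(40, "iot", ipaddress.ip_network("10.40.0.0/24")),
    Vlan(50, "guest", ipaddress.ip_network("10.50.0.0/24")),
    Vlan(99, "vpn", ipaddress.ip_network("10.99.0.0/24")),
]


def check_vlans(vlans):
    """Return a list of human-readable problems; empty means consistent."""
    errors = []
    for v in vlans:
        expected = ipaddress.ip_network(f"10.{v.vlan_id}.0.0/24")
        if v.subnet != expected:
            errors.append(f"{v.name}: expected {expected}, got {v.subnet}")
    for i, a in enumerate(vlans):
        for b in vlans[i + 1:]:
            if a.subnet.overlaps(b.subnet):
                errors.append(f"{a.name} overlaps {b.name}")
    return errors
```

A check of this kind could run before Terraform or Ansible consume the values, so a typo in one subnet is caught early.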
## IP addressing
### VLAN 10 — mgmt (10.10.0.0/24) — no DHCP
| Address | Host |
|---|---|
| `10.10.0.1` | OPNsense LAN (mgmt) |
| `10.10.0.2` | Managed switch |
| `10.10.0.200` | `pve0` |
| `10.10.0.201` | `pve1` |
| `10.10.0.202` | `pve2` |
### VLAN 20 — srv (10.20.0.0/24) — no DHCP, all static
| Range | Purpose |
|---|---|
| `10.20.0.1` | OPNsense gateway |
| `10.20.0.10`–`.19` | Core infrastructure VMs (DNS, proxy) |
| `10.20.0.20`–`.49` | Additional static infrastructure |
| `10.20.0.50`–`.249` | Terraform-provisioned VMs |

Assigned infrastructure addresses:

| Address | Host | Role |
|---|---|---|
| `10.20.0.10` | `dns1` | Primary DNS server |
| `10.20.0.11` | `dns2` | Secondary DNS server |
| `10.20.0.12` | `proxy` | Reverse proxy |
| `10.20.0.13` | `homeassistant` | Home Assistant (IoT controller) |
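
The `srv` ranges above can be expressed as data, which makes it easy to check which block an address belongs to. This is a sketch only; `SRV_RANGES` and `classify_srv` are hypothetical names, and addresses outside the documented ranges are reported as `unassigned`.

```python
import ipaddress

SRV_NET = ipaddress.ip_network("10.20.0.0/24")

# (first host, last host, purpose), taken from the srv range table.
SRV_RANGES = [
    (1, 1, "gateway"),
    (10, 19, "core infrastructure"),
    (20, 49, "additional static infrastructure"),
    (50, 249, "terraform-provisioned"),
]


def classify_srv(address: str) -> str:
    """Return the purpose of the srv range that contains the address."""
    ip = ipaddress.ip_address(address)
    if ip not in SRV_NET:
        raise ValueError(f"{address} is not in {SRV_NET}")
    host = int(ip) - int(SRV_NET.network_address)
    for low, high, purpose in SRV_RANGES:
        if low <= host <= high:
            return purpose
    return "unassigned"
```

For example, `classify_srv("10.20.0.13")` returns `"core infrastructure"`, matching the `homeassistant` entry in the table.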
### VLAN 30 — lan (10.30.0.0/24)
| Range | Purpose |
|---|---|
| `10.30.0.1` | OPNsense gateway |
| `10.30.0.100`–`.249` | DHCP pool |
### VLAN 40 — iot (10.40.0.0/24)
| Range | Purpose |
|---|---|
| `10.40.0.1` | OPNsense gateway |
| `10.40.0.100`–`.249` | DHCP pool |
### VLAN 50 — guest (10.50.0.0/24)
| Range | Purpose |
|---|---|
| `10.50.0.1` | OPNsense gateway |
| `10.50.0.100`–`.249` | DHCP pool |
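
The `lan`, `iot`, and `guest` VLANs share the same pool layout: hosts `.100` through `.249`, which is 150 leases per VLAN. A small sketch of that arithmetic, using a hypothetical `dhcp_pool` helper:

```python
import ipaddress


def dhcp_pool(subnet: str, first_host: int = 100, last_host: int = 249):
    """Return (first address, last address, pool size) for a DHCP VLAN."""
    net = ipaddress.ip_network(subnet)
    first = net.network_address + first_host
    last = net.network_address + last_host
    if first not in net or last not in net or first_host > last_host:
        raise ValueError(f"pool {first_host}-{last_host} does not fit {net}")
    return first, last, last_host - first_host + 1


if __name__ == "__main__":
    for subnet in ("10.30.0.0/24", "10.40.0.0/24", "10.50.0.0/24"):
        first, last, size = dhcp_pool(subnet)
        print(f"{subnet}: {first} to {last} ({size} leases)")
```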
### VLAN 99 — vpn (10.99.0.0/24) — WireGuard
| Address | Host |
|---|---|
| `10.99.0.1` | OPNsense (WireGuard endpoint) |
| `10.99.0.2` | `askari` (Hetzner VPS) |
| `10.99.0.10`+ | Road-warrior clients |
### Corosync ring (172.16.0.0/24) — not on managed switch
| Address | Host |
|---|---|
| `172.16.0.200` | `pve0` |
| `172.16.0.201` | `pve1` |
| `172.16.0.202` | `pve2` |
---
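In the tables above, each Proxmox node uses the same host octet on both the mgmt VLAN and the corosync ring: `pve<n>` maps to `.20<n>`. Assuming that pattern is intentional and continues for any future node (this ADR does not say so explicitly), the addresses can be derived from the hostname. The helper name `node_addresses` is hypothetical.

```python
import re


def node_addresses(hostname: str) -> dict:
    """Derive mgmt and corosync addresses for a node named pve<n>."""
    match = re.fullmatch(r"pve(\d+)", hostname)
    if match is None:
        raise ValueError(f"{hostname!r} is not a pve<n> hostname")
    host_octet = 200 + int(match.group(1))
    if host_octet > 254:
        raise ValueError(f"{hostname!r} does not fit in a /24")
    return {
        "mgmt": f"10.10.0.{host_octet}",
        "corosync": f"172.16.0.{host_octet}",
    }
```

A derivation like this could feed an Ansible inventory so the two addresses never drift apart.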
## OPNsense firewall rules (intent)
| Source | Destination | Policy |
|---|---|---|
| `mgmt` | anywhere | allow (administrator access) |
| `srv` | `srv` | allow (inter-service communication) |
| `srv` | internet | allow (updates, image pulls) |
| `lan` | `srv` (allow-list) | allow specific published ports only |
| `lan` | internet | allow |
| `iot` | internet | allow egress only |
| `iot` | `srv` (HA IP only) | allow on integration ports |
| `guest` | internet | allow, isolated from all internal |
| `vpn` | `srv` (metrics ports) | allow (monitoring) |
| `vpn` | `mgmt` | allow (administration from askari) |

**Home Assistant ↔ IoT**: The HA VM at `10.20.0.13` can reach the IoT VLAN on the
required ports. OPNsense Avahi (mDNS reflector) bridges `srv` ↔ `iot` for device
discovery. Apart from the integration ports to the HA IP listed in the table,
IoT devices cannot initiate connections to `srv`.

---
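The intent table can be modelled as a first-match lookup at zone level. The sketch below is not an OPNsense rule export; `POLICY` and `lookup_policy` are hypothetical names, and the default-deny result for unlisted pairs is an assumption based on the isolation goals described here.

```python
# (source zone, destination zone, condition), copied from the intent table.
# "*" stands for "anywhere".
POLICY = [
    ("mgmt", "*", "administrator access"),
    ("srv", "srv", "inter-service communication"),
    ("srv", "internet", "updates, image pulls"),
    ("lan", "srv", "allow-listed published ports only"),
    ("lan", "internet", "unrestricted"),
    ("iot", "internet", "egress only"),
    ("iot", "srv", "Home Assistant IP only, integration ports"),
    ("guest", "internet", "isolated from all internal networks"),
    ("vpn", "srv", "metrics ports"),
    ("vpn", "mgmt", "administration from askari"),
]


def lookup_policy(src: str, dst: str):
    """Return the condition under which src may reach dst, or None.

    None means no rule matched, which this sketch treats as deny.
    """
    for rule_src, rule_dst, condition in POLICY:
        if rule_src == src and rule_dst in ("*", dst):
            return condition
    return None
```

With this table, `lookup_policy("guest", "srv")` returns `None`, while `lookup_policy("iot", "srv")` returns the Home Assistant exception.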
## Naming scheme
| Layer | Convention | Examples |
|---|---|---|
| Homelab name | `boma` | — |
| Proxmox nodes | `pve<n>` | `pve0`, `pve1`, `pve2` |
| Infrastructure VMs | `<role><n>` | `dns1`, `dns2`, `proxy` |
| Hetzner VPS | `askari` | Swahili for guard/sentinel |
| Internal FQDN | `<host>.boma.baobab.band` | `dns1.boma.baobab.band` |
| Public service FQDN | `<service>.baobab.band` | `forgejo.nyumbani.baobab.band` |
---
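The naming conventions are regular enough to validate mechanically. The following sketch assumes lowercase names and treats the trailing number in `<role><n>` as optional, since `proxy` has none; the helper names and the returned labels are hypothetical.

```python
import re

INTERNAL_ZONE = "boma.baobab.band"
NODE_RE = re.compile(r"pve\d+")
VM_RE = re.compile(r"[a-z]+\d*")


def classify_host(name: str) -> str:
    """Classify a short hostname according to the naming table."""
    if NODE_RE.fullmatch(name):
        return "proxmox-node"
    if name == "askari":
        return "hetzner-vps"
    if VM_RE.fullmatch(name):
        return "infrastructure-vm"
    raise ValueError(f"{name!r} does not match any naming convention")


def internal_fqdn(name: str) -> str:
    """Build <host>.boma.baobab.band for hosts inside the homelab."""
    if classify_host(name) == "hetzner-vps":
        # askari lives outside the homelab and is not in the internal zone.
        raise ValueError("askari has no internal FQDN")
    return f"{name}.{INTERNAL_ZONE}"
```

For example, `internal_fqdn("dns1")` yields `dns1.boma.baobab.band`, the same example given in the table.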
## DNS zones and split-horizon
**Internal zone**: `boma.baobab.band`, served by `dns1` and `dns2`.
The zone is rendered by the Ansible `dns` role: host A records come from the
inventory (which derives from Terraform's `local.vms` via `make tf-inventory`),
and service/alias/split-horizon records are explicit zone data in `group_vars`.
Terraform itself writes no DNS records; see ADR-009.

**Public zone**: `baobab.band`, served by external DNS (Cloudflare or equivalent).
Public-facing services resolve to the public IP or Cloudflare proxy.

**Split-horizon**: `dns1`/`dns2` serve internal answers for any hostname that has
both a public and private face. Example: `forgejo.nyumbani.baobab.band` resolves to
`10.20.0.12` (proxy) internally and to the public IP externally.

OPNsense DNS resolver forwards `boma.baobab.band` queries to `dns1`/`dns2`.
All other queries go upstream (e.g., `1.1.1.1`, `9.9.9.9`).

---
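The forwarding rule for the OPNsense resolver amounts to a suffix match on the query name. A minimal sketch of that decision, using the server addresses from this document and a hypothetical `forward_targets` function (this models only the stated rule, not the resolver's actual configuration):

```python
INTERNAL_ZONE = "boma.baobab.band"
INTERNAL_SERVERS = ["10.20.0.10", "10.20.0.11"]  # dns1, dns2
UPSTREAM_SERVERS = ["1.1.1.1", "9.9.9.9"]  # example upstreams from the text


def forward_targets(qname: str) -> list:
    """Return the servers a query for qname would be forwarded to."""
    name = qname.rstrip(".").lower()
    # Match the zone apex or any name below it, but not look-alike
    # names such as "notboma.baobab.band".
    if name == INTERNAL_ZONE or name.endswith("." + INTERNAL_ZONE):
        return INTERNAL_SERVERS
    return UPSTREAM_SERVERS
```

Matching on `"." + INTERNAL_ZONE` rather than the bare suffix avoids forwarding unrelated names that merely end in the same characters.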
## External monitoring — askari
`askari` (Hetzner VPS) connects via WireGuard to OPNsense (`10.99.0.1`).
Its peer address is `10.99.0.2`. OPNsense routes `10.99.0.0/24` into the VPN
tunnel and allows `askari` narrow access to `srv` metrics endpoints and `mgmt`
for administration.

`askari` is provisioned and managed independently of the Proxmox cluster. It
must stay reachable even when the homelab is down, because monitoring the
homelab from outside is its entire purpose.

FQDN: `askari.baobab.band`.