docs(spec): host nftables firewall design (ADR-020 build #1)
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
parent
2ad50e4d5b
commit
d7fbaca554
1 changed files with 219 additions and 0 deletions
# Design: Host nftables firewall (the `firewall` concern of `base`)

- **Date:** 2026-06-06
- **Status:** Approved design, pending implementation plan
- **Implements:** ADR-020 deferred build #1 (host nftables in `base`)
- **Scope:** The **`firewall`-tagged concern of the `base` role only**. Other `base`
  concerns (SSH hardening, fail2ban, auditd, packages, users) are separate future efforts.
  Docker netfilter is deferred to the `docker_host` role.

---

## Problem

ADR-020 settled the firewall *strategy*: a per-host nftables layer doing default-deny
inbound + east-west allowlisting + permissive egress, rendered from a shared
`group_vars` service catalog. Nothing is built yet; `roles/base/` is empty. This spec
designs the concrete host firewall: the catalog schema, how rules are resolved and
rendered, how they are applied without locking out the host, and how it is tested.

Two hard constraints shape the design:

1. **Molecule runs in a privileged Docker container that shares the dev host's
   (`ubongo`) kernel netfilter.** Applying real nftables rules there could mutate the
   live host, so Level-1 testing renders and syntax-checks but does **not** apply.
2. **Lockout risk.** A bad ruleset can cut off SSH/Ansible. On-cluster hosts have the
   Proxmox console as break-glass; offsite `askari` (Hetzner) has no comparably cheap
   fallback.

## Scope decisions (settled in brainstorming)

- **Host firewall only**, coherent on any host (even one with no services). Docker
  `iptables:false` + container forward/NAT/masquerade are **deferred to `docker_host`**,
  which contributes rules via an extension hook (below).
- **Placement lives in the catalog** (`host:` | `group:` | `hosts:`), giving one source
  of truth that also resolves symbolic sources. Proxmox HA/migration moves a *VM*
  between physical nodes, but the VM keeps its static `srv` IP and inventory identity, so
  node-level failover is invisible to the firewall. A planned service relocation is a
  one-line catalog edit + `--tags firewall` re-deploy (which re-renders opened ports
  *and* every source resolution consistently). Within-group HA is handled by placing a
  service on a `group`/`hosts` list; the allowlist then already covers every member.
- **Level-1 testing = render + `nft -c` syntax check, no apply.** Enforcement is
  verified at Level 2 on staging VMs.
- **Auto-rollback safety net** on apply (critical for offsite `askari`).

## Role layout

Scaffold with `make new-role base`, then implement the firewall concern:

```
roles/base/
  tasks/main.yml                    # include_tasks firewall.yml (tags: [firewall]); grows later
  tasks/firewall.yml                # install nftables, render, validate, safe-apply
  filter_plugins/firewall_rules.py  # pure catalog→resolved-rules resolver (pytest-unit-tested)
  templates/nftables.conf.j2
  defaults/main.yml                 # base__firewall_* behaviour knobs
  handlers/main.yml
  molecule/default/                 # fixture catalog + inventory; converge + verify
  README.md, meta/main.yml
```

`base` is infrastructure, not a *service* role, so the service-role `SECURITY.md` /
`VERIFY.md` conventions (ADR-004) do not apply. The firewall role import in a playbook
carries the `base` role-name tag (enforced by `check-tags.py`, ADR-019); the firewall
tasks within carry the `firewall` concern tag.

## Data model: shared catalog + zones

Two new **global inventory facts** (read by `base` now and OPNsense later, so plain
names, not role-namespaced) in `inventories/<env>/group_vars/all/firewall.yml`:

```yaml
# Zone → subnet (from ADR-007)
firewall_zones:
  lan: 10.30.0.0/24
  srv: 10.20.0.0/24
  mgmt: 10.10.0.0/24
  iot: 10.40.0.0/24
  guest: 10.50.0.0/24

# Service catalog: name → placement + ingress
firewall_catalog:
  reverse_proxy:
    host: docker01 # placement: host | group | hosts:[...]
    ingress:
      - { from: lan, port: 443, proto: tcp }
  photoprism:
    host: docker01
    ingress:
      - { from: reverse_proxy, port: 2342, proto: tcp }
```

- **Placement** is exactly one of `host: <name>`, `group: <group>`, or `hosts: [<name>, …]`.
- **`from`** resolves three ways, checked in this order: (1) a key in `firewall_zones`
  → that subnet; (2) a key in `firewall_catalog` → that service's placement → host
  IP(s) as `/32`; (3) an inventory group or host name → its IP(s) as `/32`. An
  unresolvable `from` is a hard error (fail fast, never silently open/skip).
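
The schema rules above lend themselves to a fail-fast check before any resolution happens. The following is a minimal sketch, not part of the approved design: the function names `validate_entry` and `validate_catalog`, the `ALLOWED_PROTOS` tuple, and the port-range check are assumptions added for illustration. Only the "exactly one placement key" rule comes from the spec itself.

```python
PLACEMENT_KEYS = ("host", "group", "hosts")
ALLOWED_PROTOS = ("tcp", "udp")  # assumption: the spec's examples only show tcp


def validate_entry(name, entry):
    """Return a list of problems for one catalog entry (empty list if it is valid)."""
    problems = []

    # Spec rule: placement is exactly one of host / group / hosts.
    present = [k for k in PLACEMENT_KEYS if k in entry]
    if len(present) != 1:
        problems.append(
            f"{name}: placement must be exactly one of "
            f"{'/'.join(PLACEMENT_KEYS)}, found {present or 'none'}"
        )

    # Hypothetical extra checks on each ingress rule's shape.
    for i, rule in enumerate(entry.get("ingress", [])):
        missing = [k for k in ("from", "port", "proto") if k not in rule]
        if missing:
            problems.append(f"{name}: ingress[{i}] is missing {missing}")
            continue
        if rule["proto"] not in ALLOWED_PROTOS:
            problems.append(f"{name}: ingress[{i}] has unknown proto {rule['proto']!r}")
        if not (isinstance(rule["port"], int) and 1 <= rule["port"] <= 65535):
            problems.append(f"{name}: ingress[{i}] has invalid port {rule['port']!r}")
    return problems


def validate_catalog(catalog):
    """Collect problems across the whole catalog so they can be reported in one pass."""
    return [p for name, entry in catalog.items() for p in validate_entry(name, entry)]
```

Collecting every problem before failing, instead of raising on the first one, is a design choice of this sketch; it keeps the hard-error behaviour while making a broken catalog fixable in one edit.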

Role **behaviour knobs** stay role-namespaced in `roles/base/defaults/main.yml`:

| Variable | Default | Purpose |
|---|---|---|
| `base__firewall_mgmt_interface` | `wt0` | interface SSH is accepted on (NetBird overlay, ADR-016) |
| `base__firewall_ssh_port` | `22` | SSH port allowed on the mgmt interface |
| `base__firewall_rollback_timeout` | `45` | seconds before auto-revert fires |
| `base__firewall_dropin_dir` | `/etc/nftables.d` | extension dir included by the ruleset |

## Resolution & rendering

The resolver is a **pure Python filter plugin**, `roles/base/filter_plugins/firewall_rules.py`,
exposing `resolve_firewall_rules(catalog, zones, inventory_hostname, hostvars)`. It:

1. selects catalog entries placed on `inventory_hostname` (matching `host`, membership
   in `group`, or presence in `hosts`);
2. for each entry's `ingress` rules, resolves `from` to a list of source CIDRs (zone /
   service-placement / group-or-host, per the order above);
3. returns a **deterministic, de-duplicated, sorted** list of
   `{proto, port, sources: [cidr, …]}`.

This was chosen over inline Jinja (unreadable, untestable) and a `set_fact` loop
(awkward to unit-test): a filter plugin matches the house style of `check-tags.py` /
`capacity-scan.py` and is pytest-unit-testable in isolation. Host→IP resolution reads
`hostvars[<host>].ansible_host` (the static `srv` IP the Terraform-generated inventory
provides).
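
The three resolver steps could look roughly like the sketch below. It keeps the signature named in this spec, but several details are assumptions: group membership is read from `hostvars[<host>]["group_names"]`, rules that share the same `(proto, port)` are merged into one entry with the union of their sources, and the `FirewallCatalogError` class and underscore-prefixed helpers are hypothetical names.

```python
from collections import defaultdict


class FirewallCatalogError(ValueError):
    """Raised when a placement or a `from` source cannot be resolved."""


def _hosts_in_group(group, hostvars):
    # Assumption: each host's group membership is exposed as hostvars[h]["group_names"].
    return sorted(h for h, v in hostvars.items() if group in v.get("group_names", []))


def _placed_on(entry, hostvars):
    keys = [k for k in ("host", "group", "hosts") if k in entry]
    if len(keys) != 1:
        raise FirewallCatalogError(f"placement must be exactly one of host/group/hosts, got {keys}")
    if keys[0] == "host":
        return [entry["host"]]
    if keys[0] == "hosts":
        return list(entry["hosts"])
    return _hosts_in_group(entry["group"], hostvars)


def _sources(name, catalog, zones, hostvars):
    if name in zones:                                   # (1) zone → subnet
        return [zones[name]]
    if name in catalog:                                 # (2) service → placement hosts
        hosts = _placed_on(catalog[name], hostvars)
    else:                                               # (3) inventory group, then host
        hosts = _hosts_in_group(name, hostvars) or ([name] if name in hostvars else [])
    if not hosts:
        raise FirewallCatalogError(f"unresolvable source {name!r}")
    try:
        return [f"{hostvars[h]['ansible_host']}/32" for h in hosts]
    except KeyError as exc:
        raise FirewallCatalogError(f"no ansible_host available for {exc}") from exc


def resolve_firewall_rules(catalog, zones, inventory_hostname, hostvars):
    merged = defaultdict(set)                           # (proto, port) → set of CIDRs
    for entry in catalog.values():
        if inventory_hostname not in _placed_on(entry, hostvars):
            continue
        for rule in entry.get("ingress", []):
            key = (rule["proto"], int(rule["port"]))
            merged[key].update(_sources(rule["from"], catalog, zones, hostvars))
    return [
        {"proto": proto, "port": port, "sources": sorted(cidrs)}
        for (proto, port), cidrs in sorted(merged.items())
    ]


class FilterModule:
    """Ansible discovers filters through a FilterModule class with a filters() method."""

    def filters(self):
        return {"resolve_firewall_rules": resolve_firewall_rules}
```

Because the function touches nothing but its arguments, a pytest case can call it directly with small dictionaries and compare the returned list, which is the property the spec relies on.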

`tasks/firewall.yml` builds `base__firewall_resolved` from the filter; the template
renders that flat list:

```jinja
#!/usr/sbin/nft -f
flush ruleset
table inet filter {
  chain input {
    type filter hook input priority 0; policy drop;
    iif "lo" accept
    ct state established,related accept
    ct state invalid drop
    # iifname matches by name, so the ruleset still loads where the interface is absent
    iifname "{{ base__firewall_mgmt_interface }}" tcp dport {{ base__firewall_ssh_port }} accept
    ip protocol icmp accept
    ip6 nexthdr ipv6-icmp accept
{% for r in base__firewall_resolved %}
    ip saddr { {{ r.sources | join(', ') }} } {{ r.proto }} dport {{ r.port }} accept
{% endfor %}
  }
  chain forward { type filter hook forward priority 0; policy drop; }
  chain output { type filter hook output priority 0; policy accept; }
}
include "{{ base__firewall_dropin_dir }}/*.nft"
```

A host with no catalog entries still gets a valid default-deny + management-plane
ruleset. The `include` is the `docker_host` extension hook (forward/NAT drop-ins).
Sorted resolved rules give stable diffs and deterministic tests.
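
To make the expected output of the template loop concrete, the sketch below reproduces just that loop in plain Python. It is illustrative only: `render_accept_lines` is a hypothetical helper, not something the role ships, and the real rendering is done by Jinja in `nftables.conf.j2`.

```python
def render_accept_lines(resolved):
    """Mirror the template's for-loop: one accept line per resolved rule."""
    lines = []
    for rule in resolved:
        sources = ", ".join(rule["sources"])
        # Doubled braces emit the literal { } of an nftables anonymous set.
        lines.append(
            f"ip saddr {{ {sources} }} {rule['proto']} dport {rule['port']} accept"
        )
    return lines


if __name__ == "__main__":
    example = [
        {"proto": "tcp", "port": 443, "sources": ["10.30.0.0/24"]},
        {"proto": "tcp", "port": 2342, "sources": ["10.20.0.11/32"]},
    ]
    print("\n".join(render_accept_lines(example)))
```

An empty resolved list yields no lines at all, which matches the statement that a host without catalog entries still gets a valid ruleset.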

## Safe apply (lockout protection)

`tasks/firewall.yml` renders `/etc/nftables.conf`; when it changes, a **linear**
safe-apply sequence runs. It lives deliberately in tasks, not a handler, so the
confirm/cancel step is controllable: a small, justified deviation from the handler
idiom, noted in the role README.

1. **Validate:** `nft -c -f /etc/nftables.conf`; fail the play if invalid, before
   touching the live ruleset.
2. **Snapshot:** `nft list ruleset > /etc/nftables.rollback` (empty/flush on first run).
   The snapshot file must begin with a `flush ruleset` line: `nft list ruleset` does not
   emit one, and without it `nft -f` merges the old rules into the live ruleset instead
   of replacing it.
3. **Arm revert:** `systemd-run --on-active={{ base__firewall_rollback_timeout }}
   --unit=nft-rollback nft -f /etc/nftables.rollback` (transient timer, no `at`
   dependency).
4. **Apply:** `nft -f /etc/nftables.conf`.
5. **Confirm + disarm:** the next Ansible task running proves the connection survived,
   so it runs `systemctl stop nft-rollback.timer`. The `.timer` suffix matters: a bare
   `nft-rollback` resolves to the not-yet-started `.service` unit and leaves the timer
   armed. If the apply bricked connectivity, the play cannot continue, the timer fires,
   and the host self-heals (the offsite-`askari` safeguard).
6. **Persist:** enable `nftables.service` so `/etc/nftables.conf` loads on boot.

`established/related` (rendered in the ruleset) means the in-flight Ansible session
survives the swap; atomic `nft -f` avoids partial states.
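
The ordering of the six steps can be captured as data, which makes it easy to review and to dry-run. The sketch below is an assumption-laden illustration, not the role's implementation (the role uses Ansible tasks): `safe_apply_commands` and `run_steps` are hypothetical names, the snapshot step prepends `flush ruleset` so the revert replaces the live ruleset, and the disarm step stops the transient unit by its `.timer` name. It only models ordering and stop-on-failure; it does not model the loss of connectivity that the real timer protects against.

```python
import shlex


def safe_apply_commands(timeout_s=45,
                        conf="/etc/nftables.conf",
                        rollback="/etc/nftables.rollback",
                        unit="nft-rollback"):
    """Return the safe-apply sequence as ordered (step name, argv) pairs."""
    snapshot = f"{{ echo 'flush ruleset'; nft list ruleset; }} > {shlex.quote(rollback)}"
    return [
        ("validate", ["nft", "-c", "-f", conf]),
        ("snapshot", ["sh", "-c", snapshot]),
        ("arm", ["systemd-run", f"--on-active={timeout_s}", f"--unit={unit}",
                 "nft", "-f", rollback]),
        ("apply", ["nft", "-f", conf]),
        ("disarm", ["systemctl", "stop", f"{unit}.timer"]),
        ("persist", ["systemctl", "enable", "nftables.service"]),
    ]


def run_steps(steps, run):
    """Run steps in order with run(argv) -> exit code; stop at the first failure."""
    completed = []
    for name, argv in steps:
        if run(argv) != 0:
            break
        completed.append(name)
    return completed


if __name__ == "__main__":
    # Dry run: print each command instead of executing it.
    def show(argv):
        print(" ".join(shlex.quote(a) for a in argv))
        return 0

    run_steps(safe_apply_commands(), show)
```

Passing the runner in as a parameter keeps the sequence testable without root or a live netfilter; a failing validate step means nothing after it is attempted.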

**NetBird dependency:** locking SSH to `wt0`-only assumes NetBird (ADR-016) is built.
Until then, `base__firewall_mgmt_interface` is pointed at an interface that is actually
reachable (with an additional management source allowed if needed), so the role is
deployable independently. This is a config knob, not a code dependency.

## Testing (ADR-008)

- **Level 1 / pytest:** unit-test `firewall_rules.py` against fixture catalogs, covering
  zone resolution, service→host-IP resolution, `group`/`hosts` multi-host placement, a
  host with no services, source de-dup/sort, and an unresolvable `from` raising. Mirrors
  `tests/test_check_tags.py` (import the module, assert on return values).
- **Level 1 / Molecule:** fixture `firewall_catalog` + fixture inventory (host_vars/
  group_vars) in the scenario; `converge` renders `/etc/nftables.conf`; `verify` asserts
  (a) expected accept lines are present for the fixture and (b) `nft -c -f
  /etc/nftables.conf` validates syntax. **No apply** (kernel safety).
- **Level 2 / staging:** real apply on staging VMs verifies enforcement *and* the
  safe-apply + auto-rollback path (steps 2–5), which Level 1 cannot safely cover.
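
Check (a) of the Molecule `verify` step amounts to comparing the rendered file against a list of expected lines. A minimal sketch of that comparison follows; `missing_accept_lines` is a hypothetical helper and the sample ruleset text is a hand-written fixture, not captured output. Check (b), the `nft -c -f` syntax check, needs the `nft` binary and is deliberately left out.

```python
def missing_accept_lines(rendered, expected):
    """Return the expected rule lines that do not appear in the rendered ruleset.

    Lines are compared after stripping surrounding whitespace, so template
    indentation changes do not cause false failures.
    """
    present = {line.strip() for line in rendered.splitlines()}
    return [line for line in expected if line.strip() not in present]


SAMPLE_RENDERED = """\
table inet filter {
  chain input {
    type filter hook input priority 0; policy drop;
    ip saddr { 10.30.0.0/24 } tcp dport 443 accept
  }
}
"""

if __name__ == "__main__":
    expected = [
        "ip saddr { 10.30.0.0/24 } tcp dport 443 accept",
        "ip saddr { 10.20.0.11/32 } tcp dport 2342 accept",
    ]
    print(missing_accept_lines(SAMPLE_RENDERED, expected))
```

Returning the missing lines, instead of a bare boolean, gives the failing assertion a message that names exactly which rule was not rendered.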

The Molecule base image is not guaranteed to ship `nft`. The role installs the
`nftables` package as its first firewall task, so by the time `verify` runs the `nft -c`
syntax check, `nft` is present (installed during `converge`).

## Open dependencies / notes

- **NetBird/ADR-016 unbuilt:** see the mgmt-interface knob above; full `wt0`-only
  lockdown lands when NetBird does.
- The safe-apply orchestration (steps 2–5) has **no Level-1 coverage** by design; it is
  integration-tested at Level 2. Called out so the gap is explicit.

## Scope summary

**Built here:** `firewall_catalog`/`firewall_zones` schema; `firewall_rules.py`
resolver + pytest; `nftables.conf.j2` (default-deny input, mgmt plane, permissive
egress, drop-in `include` hook); safe-apply-with-rollback tasks; Molecule
render/syntax scenario; `base` role scaffolding (README, meta, defaults, handlers).

**Deferred:** Docker `iptables:false` + container forward/NAT (→ `docker_host` spec, via
the drop-in hook); OPNsense rendering from the same catalog (→ OPNsense-as-code spec);
drift-detection check (ADR-020); all other `base` concerns.

## Related

ADR-020 (firewall strategy), ADR-002 (security baseline), ADR-004 (Docker model:
`iptables:false`, one service = one role), ADR-007 (VLANs/subnets), ADR-008 (testing
levels), ADR-016 (NetBird mesh: SSH on `wt0`), ADR-019 (`firewall` tag).