Incident 2026-06-17: applying base's nftables default-deny (forward policy drop) to askari, a Docker host, broke container forwarding/NAT on reboot, and the wt0-only sshd ListenAddress left no break-glass (ip_nonlocal_bind did NOT beat the boot race).

Recovery: disable nftables + restart docker (restore the wiped NAT masquerade) + force-recreate the coordinator (it FATAL-looped, unable to download its GeoLite2 DB with no egress) -> mesh re-formed.

Back out the enablement so a future deploy can't re-break askari:
- offsite_hosts: base__ssh_listen_mesh_only=false, base__firewall_apply=false
- remove host_vars/askari.yml (manage over the WAN again, not wt0)
- tf/offsite: re-open WAN :22 to ubongo only (break-glass; already applied)

askari now: sshd on all interfaces (Ansible-managed), nftables disabled, WAN :22 open -> stable + reboot-survivable. The base feature code (sshd ListenAddress option, firewall public zone) stays; it's just not enabled on Docker hosts. Mesh-hardening 1/3 to be re-spec'd before any retry.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
inventories/
Ansible inventories, one directory per environment (staging/, production/).
Defines which hosts exist and their group membership; group_vars/ and host_vars/
hold per-group and per-host configuration.
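
As a rough illustration of per-group configuration, the two `base__` overrides that the incident note lists for `offsite_hosts` could sit in a group_vars file like the one below. The file path is an assumption; only the variable names and values are taken from the note.

```yaml
# inventories/production/group_vars/offsite_hosts.yml  (path is an assumption)
# Applies to every host in the offsite_hosts group (currently askari).

# Keep sshd listening on all interfaces rather than on the mesh interface only.
base__ssh_listen_mesh_only: false

# Do not apply base's nftables default-deny ruleset on these hosts.
base__firewall_apply: false
```

A host_vars file with the same keys would override these values for a single host, since Ansible gives host variables precedence over group variables.
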
- `hosts.yml` is generated from Terraform outputs by `make tf-inventory`; do not hand-edit. The control node is the one manual exception.
- `offsite.yml` (in `production/`) is a second generated inventory file, written by `make tf-inventory-offsite` from the offsite Terraform env; it holds the `offsite_hosts` group (askari). Ansible merges it with `hosts.yml`, so both can declare the same group names harmlessly (the offsite generator emits all four groups, most empty).
- Host groups: `all`, `control`, `docker_hosts`, `proxmox_hosts`, `offsite_hosts`.
- Terraform→inventory data flow and the data contract: ADR-009.
- Addressing conventions (subnets, ranges): ADR-007.
- Layout and host groups: see CLAUDE.md ("Inventory structure").
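
The merge behaviour described above can be pictured with a pair of minimal YAML inventories. This is only a sketch: the real files are generated and their exact shape is defined by ADR-009, and every host name except askari is a hypothetical placeholder.

```yaml
# production/hosts.yml (illustrative shape only; the real file is generated)
all:
  children:
    control:
      hosts:
        control-node.example: {}   # hypothetical name
    docker_hosts:
      hosts:
        docker-01.example: {}      # hypothetical name
    proxmox_hosts:
      hosts:
        pve-01.example: {}         # hypothetical name
    offsite_hosts: {}              # empty here; populated by offsite.yml
```

```yaml
# production/offsite.yml (illustrative shape only; the real file is generated)
all:
  children:
    control: {}         # declared but empty
    docker_hosts: {}    # declared but empty (assumed; the generator may differ)
    proxmox_hosts: {}   # declared but empty
    offsite_hosts:
      hosts:
        askari: {}
```

When Ansible is pointed at the `production/` directory, it reads both files and takes the union of hosts for each group, so an empty group declaration in one file does not remove hosts that the other file adds. Running `ansible-inventory -i inventories/production --graph` is one way to inspect the merged result.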