Live cutover complete: base INPUT-only default-deny + wt0-primary SSH + permanent WAN break-glass on askari, netbird_coordinator geo-disabled. A real reboot recovered unattended — firewall persisted, Docker forwarding + public services up, coordinator geo-disabled (no FATAL), mesh + both SSH paths back. ROADMAP sub-project 3 (askari redesign) marked DONE; next = relay-SPOF reduction.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
After an operator reboot of ubongo, verified live that the INPUT-only default-deny ruleset re-applied on boot (input chain policy drop + the full wt0/ssh-from-control/admin-addr allow-list), the wt0 mesh came back (Management+Signal Connected), and both SSH paths recovered clean. Closes the 'real-host reboot validation pending' item for mesh-hardening 2/3.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
base firewall applied + live-verified on ubongo (INPUT-only default-deny;
base__firewall_input_only). Records the Docker-nat-flush caveat (needs a restart
docker on a Docker host), the claude self-SSH grant, and reboot-validation-pending.
ROADMAP: sub-project 2 done; remaining = NetBird ACL + askari redesign.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
mamba + work laptop enrolled in the mesh → ubongo reachable from anywhere; the
mobile-access goal is met and Phase 1 (remote access) is complete. Adds
docs/runbooks/netbird-client.md (reusable client-enrollment runbook) + STATUS/
ROADMAP flips + CLAUDE.md reading-table entry.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
ADR-024 Status/Consequences, STATUS.md, ROADMAP M4a, and the FRICTION ledger now
record that the DNS-01 path is built and proven, with the root cause of the M4a
failure (version skew: pre-Bearer libdns/gandi sent the deprecated Apikey header;
plus building on a Hetzner IP). Traefik was reconsidered and rejected again — lego's
Gandi provider has the same PAT-vs-Apikey question, so it would not have helped.
Dated review reports and spec/plan snapshots are left as historical records.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
- ADR-007: document ubongo on the legacy V4 net at 10.20.10.151 (transitional,
outside the planned srv /24 until the LAN is re-cut) (O10); single authoritative
boma.baobab.band -> boma.wingu.me transition note already added earlier
- terraform tfvars.example + variables.tf (both envs): pve01 -> pve0 and
<host>.boma.baobab.band per ADR-007 naming (O11)
- ADR-012/013/015/016/017/018: convert "See also:" prose to `## Related` sections
placed after Consequences, matching ADR-014/019-023 (O13)
- docs/README + inventories/README: list the missing subdirs / offsite_hosts +
offsite.yml merge behaviour (O14, O29 note)
- ADR-009: drop the retired `nyumbani` example; use vaultwarden.wingu.me split-horizon (O19)
- ROADMAP M2: askari shipped as cx23/x86 (CAX11/ARM out of stock) (O20)
- ADR-020: 80/443/3478 opened in M4a (past tense); coordinator role is M4b (O21)
- netbird -> netbird_coordinator across ROADMAP M4b, the M4b plan, ADR-024 (O23)
- ADR-024: align the M1 DNS-01 wildcard scope wording with ROADMAP (O24)
- capacity-scan.py: read the inventory directory so offsite.yml (askari) is seen (O28)
- tf_to_inventory.py: generated header now warns it overwrites the manual control node (O9)
- tests/tags.yml: proxy concern comment Traefik -> Caddy (missed in the O3 sweep)
O9's existing stub hosts.yml header stays as-is (generator-owned, hook-protected);
the fix lives in the generator for the next regeneration. make lint + pytest (57) green.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Adds ADR-024 pinning Caddy (xcaddy + caddy-dns/gandi) as boma's reverse
proxy, superseding the soft Traefik assumption in the roadmap and ADR-017
prose. Updates CLAUDE.md Further reading table and ROADMAP.md Phase-2 step 5.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
askari is provisioned as IaC: Terraform owns its existence too, generalizing
ADR-006 from "Proxmox VM existence" to Proxmox + Hetzner (new hetznercloud/hcloud
provider, hetzner_vm module, offsite stack with local state). CAX11 (ARM) in
Helsinki on Debian 13, behind a TF-managed Hetzner Cloud Firewall (SSH-from-ubongo
now; NetBird ports in M4). Token via TF_VAR_hcloud_token from vault.hetzner.token.
Handoff stays ADR-009-shaped (tf_to_inventory.py extended to emit askari into
offsite_hosts). State in the ADR-022 backup scope; DR via terraform import.
Amends ADR-006/009/020/007/016. Point ROADMAP.md M2 at the spec.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
boma's domain is wingu.me (registered at Gandi; 'wingu' = Swahili for cloud).
Replace the parametric <boma-domain> placeholder with wingu.me throughout. The
zone was NOT empty — Gandi auto-seeded 13 default records (parking A, www redirect,
a full Gandi mailbox set), so M1 includes a one-time purge to a clean baseline plus
an anti-spoof null-mail set (null MX, SPF -all, DMARC reject) since wingu.me sends
no mail. Domain-pick open item closed.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Settles the M1 design: full registrar transfer Cloudflare -> Gandi; three-tier
naming scheme (host.boma / service.bare / service.askari), nyumbani dropped,
mesh/LAN-only default; public-DNS-as-code via a control-node `public_dns` role
driven by group_vars data, using community.general.gandi_livedns with a PAT
(api_key is deprecated/rejected by Gandi — verified per ADR-014). Stale records +
unused MX cleaned by omission. Cert scope is DNS+PAT only (issuance deferred to
M4/Phase 2). Human/agent division of labour + token-scoping recorded.
Resolves TODO 4 and review finding O12 once the ADR-007 amendment lands. Point
ROADMAP.md M1 at the spec.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
High-level build order for the project (Approach A): one Off-site/Remote-access
track first (Gandi DNS-as-code -> askari -> NetBird control plane -> enroll
ubongo + road-warrior laptops -> harden), a procurement gate sized by
/capacity-review, then the Cluster track. Sequences the docs/TODO.md backlog into
milestones and records why the order is what it is.
Decisions captured this session: Gandi over Cloudflare is values-driven and
independent of NetBird (sequenced first so records are born at Gandi); public DNS
managed as code (Ansible, consistent with internal DNS + Terraform-owns-no-DNS);
NetBird-on-ubongo before base default-deny (chicken-and-egg); cluster procurement
gated on patterns proven on two cheap hosts.
Wire ROADMAP.md into CLAUDE.md's Further-reading index and point TODO.md at it.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>