From e3461375f5fe7e2f3368ecc122e25c61b50bd02e Mon Sep 17 00:00:00 2001 From: sjat Date: Sun, 14 Jun 2026 18:20:04 +0200 Subject: [PATCH] =?UTF-8?q?docs(plan):=20M4b=20=E2=80=94=20NetBird=20coord?= =?UTF-8?q?inator=20service=20role?= MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Capture NetBird's configure.sh reference for a pinned version → translate into boma role templates (compose + management.json + dex/openid + turnserver), external-proxy mode behind the M4a Caddy (netbird.askari.wingu.me). First service role: full ADR-004 standard files; secrets generated/CHANGEME-stubbed (setup key for M5). Gated live deploy + verify. Co-Authored-By: Claude Opus 4.8 (1M context) --- .../plans/2026-06-14-m4b-netbird.md | 91 +++++++++++++++++++ 1 file changed, 91 insertions(+) create mode 100644 docs/superpowers/plans/2026-06-14-m4b-netbird.md diff --git a/docs/superpowers/plans/2026-06-14-m4b-netbird.md b/docs/superpowers/plans/2026-06-14-m4b-netbird.md new file mode 100644 index 0000000..eef8e7a --- /dev/null +++ b/docs/superpowers/plans/2026-06-14-m4b-netbird.md @@ -0,0 +1,91 @@ +# M4b — NetBird coordinator (service role) Implementation Plan + +> **For agentic workers:** REQUIRED SUB-SKILL: superpowers:subagent-driven-development (recommended) or superpowers:executing-plans. Steps use `- [ ]` checkboxes. + +**Goal:** Deploy the self-hosted NetBird control plane on askari as boma's first real service role (`netbird`), fronted by the M4a Caddy, reachable at `https://netbird.askari.wingu.me` with the embedded Dex login. + +**Architecture:** NetBird's own `configure.sh` generates the canonical compose + config for a pinned version; boma **captures that reference once and translates it into role templates** (ADR-004/013 — don't run their imperative script in production, render from templates). Runs in **external-reverse-proxy mode** (no bundled Traefik); Caddy adds a `netbird.askari.wingu.me` route. Secrets (datastore encryption key, TURN password, Dex secrets) are generated into vault; the setup key is stubbed `CHANGEME` for M5. + +**Tech Stack:** NetBird (combined `netbird-server` container if stable for the pinned version, else the multi-container set), embedded Dex IdP, Coturn, Docker Compose, Caddy (M4a), Ansible. + +**Spec:** `docs/superpowers/specs/2026-06-14-netbird-coordinator-m4-design.md` · **Prereq:** M4a (Docker + Caddy) ✓ on askari. + +**Execution context:** Task 1 runs `configure.sh` in a scratch dir (capture only). Tasks 2–6 author. **Task 7 deploys live to askari** (gated). NetBird self-hosting is finicky — expect live debugging. + +--- + +### Task 1: Capture NetBird's reference setup (pin the version) + +- [ ] **Step 1:** Pick + pin the NetBird version (ADR-014 — check the latest stable release). Record it. +- [ ] **Step 2:** In a scratch dir (on ubongo, throwaway), fetch NetBird's `getting-started`/`configure.sh` for that version and run it with answers for: domain `netbird.askari.wingu.me`, **external reverse proxy** (disable bundled Traefik/Caddy), **embedded Dex** (no external SSO), Let's Encrypt off (Caddy terminates TLS). +- [ ] **Step 3:** Capture the generated files verbatim into the plan/notes: `docker-compose.yml`, `management.json` (or `config.yaml`), `turnserver.conf`, `openid-configuration.json`, dashboard env. Also capture NetBird's **Caddy external-proxy template** (their docs ship one) — it shows the exact upstreams + HTTP/2/gRPC routing the dashboard/management/signal/relay need. +- [ ] **Step 4:** No commit (reference capture; informs Tasks 2–4). + +--- + +### Task 2: `netbird` service role — templates + +**Files:** `roles/netbird/` (scaffold via `make new-role NAME=netbird`): `defaults/main.yml`, `tasks/main.yml`, `templates/{docker-compose.yml,management.json,turnserver.conf,openid-configuration.json,dashboard.env}.j2`, `handlers/main.yml`, `README.md`. + +- [ ] **Step 1:** Translate the captured compose into `templates/docker-compose.yml.j2` — containers, the shared `boma` Docker network (so Caddy reaches them by name), **no host port mappings except what Caddy/Coturn need** (Coturn 3478/udp; everything else internal, Caddy fronts it). Pin image tags (ADR-011). +- [ ] **Step 2:** Translate `management.json`/`config.yaml` into a template — fill `Datadir`, `DataStoreEncryptionKey` (`{{ vault.netbird.datastore_key }}`), `HttpConfig` (public URL `https://netbird.askari.wingu.me`), `TURNConfig` (coturn host + `{{ vault.netbird.turn_password }}`), `Signal`, `Relay`, `Store` (sqlite), and the embedded-Dex IdP block (DeviceAuthorizationFlow/PKCE, `openid-configuration.json` URL). +- [ ] **Step 3:** `turnserver.conf.j2` (realm = `netbird.askari.wingu.me`, the TURN secret), `openid-configuration.json.j2`, `dashboard.env.j2` (`NETBIRD_MGMT_API_ENDPOINT=https://netbird.askari.wingu.me`, the `AUTH_*` Dex values). +- [ ] **Step 4:** `defaults/main.yml` (`netbird__*` knobs: version, base_dir `/opt/services/netbird`, domain) + `tasks/main.yml` (ADR-004 deploy mechanics: ensure dir, render all files, `community.docker.docker_compose_v2` up; `netbird__manage` toggle for Molecule). +- [ ] **Step 5:** `make lint`; commit `feat(netbird): coordinator service role (compose + config templates)`. + +--- + +### Task 3: Secrets (CHANGEME convention + generated) + +- [ ] **Step 1:** Add to vault (`make edit-vault`): `vault.netbird.datastore_key`, `vault.netbird.turn_password`, any Dex client secret — **generate** strong values (or stub `CHANGEME` + a comment if operator-supplied). Add `vault.netbird.setup_key: CHANGEME` with a comment "created in the NetBird dashboard after first boot — M5 enrolment". +- [ ] **Step 2:** `make check-vault` confirms structure + lists the `setup_key` placeholder. +- [ ] **Step 3:** Commit the vault. + +--- + +### Task 4: Wire Caddy + DNS + +- [ ] **Step 1:** Append to `reverse_proxy__routes` (`group_vars/all/reverse_proxy.yml`): `{host: netbird.askari.wingu.me, upstream: ""}` — per the captured Caddy template (NetBird needs HTTP/2 + gRPC; add the required Caddy directives, e.g. separate handles for the management gRPC path if the template shows them). +- [ ] **Step 2:** `netbird.askari.wingu.me` already resolves via the `*.askari.wingu.me` wildcard (M4a) — no new DNS record. +- [ ] **Step 3:** Commit. + +--- + +### Task 5: Service-role standard files (ADR-004, authored) + +- [ ] **Step 1:** Author `roles/netbird/SECURITY.md` (copy `docs/security/service-security-template.md`; record the public surface = Caddy 443 + Coturn 3478, embedded-Dex auth, accepted-risk R3). +- [ ] **Step 2:** `VERIFY.md` (copy the template; the `/verify-service` UI spec — run later when the playwright harness exists). +- [ ] **Step 3:** `ACCESS.md` (ADR-021; the dashboard/admin access + `access__*` intent). +- [ ] **Step 4:** `BACKUP.md` (ADR-022; the **datastore is stateful** → `backup__*` data; record that off-site backup is **pending `fisi`** — an accepted risk for now). +- [ ] **Step 5:** `make lint`; commit `docs(netbird): service-role standard files (SECURITY/VERIFY/ACCESS/BACKUP)`. + +--- + +### Task 6: Add netbird to the offsite playbook + +- [ ] **Step 1:** In `playbooks/offsite.yml`, add `netbird` after `reverse_proxy` (role-name tag). `make lint`. Commit. + +--- + +### Task 7: Deploy to askari + verify (gated, live — expect debugging) + +> NetBird self-hosting is finicky; budget for iterating on the management config + Caddy routing. + +- [ ] **Step 1:** `make check PLAYBOOK=offsite LIMIT=askari TAGS=netbird` — review. +- [ ] **Step 2:** `make deploy PLAYBOOK=offsite LIMIT=askari TAGS=netbird` → `make deploy ... TAGS=reverse_proxy` (Caddy reloads with the netbird route). +- [ ] **Step 3:** Verify: `docker compose ps` all healthy; `curl -sI https://netbird.askari.wingu.me` → 200 with the M4a cert; the **dashboard loads** in a browser; the management API responds. Iterate on config/routing until green. +- [ ] **Step 4:** No repo commit (host state). + +--- + +### Task 8: Docs + +- [ ] **Step 1:** STATUS — `netbird` coordinator built + applied (dashboard live); the first service role. ROADMAP M4b done; **M5 (enrol) next**. `make lint`; commit. + +--- + +## Self-Review (completed) + +- **Spec coverage:** external-proxy NetBird + embedded Dex (Decisions 3) → Tasks 1,2,4; first service role + standard files (Decision 7) → Tasks 2,5; firewall 3478 (Decision 5) → done in M4a; setup key M5 + CHANGEME (Decision 8) → Task 3; Caddy front (M4a) → Task 4. Enrolment → M5, correct. +- **Placeholder scan:** the concrete config field *values* are intentionally captured from `configure.sh` (Task 1) rather than invented — version-sensitive, and inventing them would be wrong. The plan pins the method, not guesses. +- **Risk:** NetBird's external-proxy + gRPC routing is the hard part — Task 1 captures NetBird's own Caddy template to get it right, and Task 7 budgets for live iteration.