docs(plan): M4b — NetBird coordinator service role

Capture NetBird's configure.sh reference for a pinned version → translate into boma role templates (compose + management.json + dex/openid + turnserver), external-proxy mode behind the M4a Caddy (netbird.askari.wingu.me). First service role: full ADR-004 standard files; secrets generated/CHANGEME-stubbed (setup key for M5). Gated live deploy + verify.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
parent 1862b7a828
commit e3461375f5
1 changed files with 91 additions and 0 deletions
91
docs/superpowers/plans/2026-06-14-m4b-netbird.md
Normal file
@@ -0,0 +1,91 @@
# M4b — NetBird coordinator (service role) Implementation Plan

> **For agentic workers:** REQUIRED SUB-SKILL: superpowers:subagent-driven-development (recommended) or superpowers:executing-plans. Steps use `- [ ]` checkboxes.

**Goal:** Deploy the self-hosted NetBird control plane on askari as boma's first real service role (`netbird`), fronted by the M4a Caddy, reachable at `https://netbird.askari.wingu.me` with the embedded Dex login.

**Architecture:** NetBird's own `configure.sh` generates the canonical compose + config for a pinned version; boma **captures that reference once and translates it into role templates** (ADR-004/013 — don't run their imperative script in production, render from templates). Runs in **external-reverse-proxy mode** (no bundled Traefik); Caddy adds a `netbird.askari.wingu.me` route. Secrets (datastore encryption key, TURN password, Dex secrets) are generated into vault; the setup key is stubbed `CHANGEME` for M5.

**Tech Stack:** NetBird (combined `netbird-server` container if stable for the pinned version, else the multi-container set), embedded Dex IdP, Coturn, Docker Compose, Caddy (M4a), Ansible.

**Spec:** `docs/superpowers/specs/2026-06-14-netbird-coordinator-m4-design.md` · **Prereq:** M4a (Docker + Caddy) ✓ on askari.

**Execution context:** Task 1 runs `configure.sh` in a scratch dir (capture only). Tasks 2–6 author. **Task 7 deploys live to askari** (gated). NetBird self-hosting is finicky — expect live debugging.

---

### Task 1: Capture NetBird's reference setup (pin the version)

- [ ] **Step 1:** Pick + pin the NetBird version (ADR-014 — check the latest stable release). Record it.
- [ ] **Step 2:** In a scratch dir (on ubongo, throwaway), fetch NetBird's `getting-started`/`configure.sh` for that version and run it with answers for: domain `netbird.askari.wingu.me`, **external reverse proxy** (disable bundled Traefik/Caddy), **embedded Dex** (no external SSO), Let's Encrypt off (Caddy terminates TLS).
- [ ] **Step 3:** Capture the generated files verbatim into the plan/notes: `docker-compose.yml`, `management.json` (or `config.yaml`), `turnserver.conf`, `openid-configuration.json`, dashboard env. Also capture NetBird's **Caddy external-proxy template** (their docs ship one) — it shows the exact upstreams + HTTP/2/gRPC routing the dashboard/management/signal/relay need.
- [ ] **Step 4:** No commit (reference capture; informs Tasks 2–4).
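
The capture in Step 3 could be scripted roughly as follows. This is a minimal sketch, assuming the script leaves its output in the scratch directory under the file names listed above. The function name `capture_reference` and both directory arguments are hypothetical and not part of boma or NetBird, and the dashboard env file is left out because its name depends on the pinned version. Files that were not generated are reported rather than treated as errors, since the plan allows either `management.json` or `config.yaml`. A call might look like `capture_reference ./scratch ./notes/netbird-reference`.

```shell
#!/bin/sh
set -eu

# Copy the generated reference files from a scratch dir into a notes dir,
# reporting any that were not produced for the pinned version.
# Hypothetical helper: the name and arguments are illustrative only.
capture_reference() {
  cap_src="$1"
  cap_dest="$2"
  mkdir -p "$cap_dest"
  for cap_file in docker-compose.yml management.json config.yaml \
                  turnserver.conf openid-configuration.json; do
    if [ -f "$cap_src/$cap_file" ]; then
      # -p keeps timestamps and modes so the copy stays a faithful reference.
      cp -p "$cap_src/$cap_file" "$cap_dest/$cap_file"
      echo "captured: $cap_file"
    else
      echo "absent:   $cap_file"
    fi
  done
}
```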

---

### Task 2: `netbird` service role — templates

**Files:** `roles/netbird/` (scaffold via `make new-role NAME=netbird`): `defaults/main.yml`, `tasks/main.yml`, `templates/{docker-compose.yml,management.json,turnserver.conf,openid-configuration.json,dashboard.env}.j2`, `handlers/main.yml`, `README.md`.

- [ ] **Step 1:** Translate the captured compose into `templates/docker-compose.yml.j2` — containers, the shared `boma` Docker network (so Caddy reaches them by name), **no host port mappings except what Caddy/Coturn need** (Coturn 3478/udp; everything else internal, Caddy fronts it). Pin image tags (ADR-011).
- [ ] **Step 2:** Translate `management.json`/`config.yaml` into a template — fill `Datadir`, `DataStoreEncryptionKey` (`{{ vault.netbird.datastore_key }}`), `HttpConfig` (public URL `https://netbird.askari.wingu.me`), `TURNConfig` (coturn host + `{{ vault.netbird.turn_password }}`), `Signal`, `Relay`, `Store` (sqlite), and the embedded-Dex IdP block (DeviceAuthorizationFlow/PKCE, `openid-configuration.json` URL).
- [ ] **Step 3:** `turnserver.conf.j2` (realm = `netbird.askari.wingu.me`, the TURN secret), `openid-configuration.json.j2`, `dashboard.env.j2` (`NETBIRD_MGMT_API_ENDPOINT=https://netbird.askari.wingu.me`, the `AUTH_*` Dex values).
- [ ] **Step 4:** `defaults/main.yml` (`netbird__*` knobs: version, base_dir `/opt/services/netbird`, domain) + `tasks/main.yml` (ADR-004 deploy mechanics: ensure dir, render all files, `community.docker.docker_compose_v2` up; `netbird__manage` toggle for Molecule).
- [ ] **Step 5:** `make lint`; commit `feat(netbird): coordinator service role (compose + config templates)`.

---

### Task 3: Secrets (CHANGEME convention + generated)

- [ ] **Step 1:** Add to vault (`make edit-vault`): `vault.netbird.datastore_key`, `vault.netbird.turn_password`, any Dex client secret — **generate** strong values (or stub `CHANGEME` + a comment if operator-supplied). Add `vault.netbird.setup_key: CHANGEME` with a comment "created in the NetBird dashboard after first boot — M5 enrolment".
- [ ] **Step 2:** `make check-vault` confirms structure + lists the `setup_key` placeholder.
- [ ] **Step 3:** Commit the vault.
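
One way to generate the values for Step 1, sketched with coreutils only. Several things here are assumptions: the 32-byte base64 datastore key and the 24-byte hex TURN password are illustrative sizes and encodings, so check them against what the Task 1 capture expects; the helper names `gen_base64` and `gen_hex` are hypothetical; and the nesting of the printed fragment assumes the vault file mirrors the `vault.netbird.*` variable paths. The output is meant to be pasted by hand during `make edit-vault`, not written to disk.

```shell
#!/bin/sh
set -eu

# N random bytes, base64-encoded on a single line.
gen_base64() { head -c "$1" /dev/urandom | base64 | tr -d '\n'; }

# N random bytes as lowercase hex (no characters that need quoting in config files).
gen_hex() { head -c "$1" /dev/urandom | od -An -tx1 | tr -d ' \n'; }

datastore_key="$(gen_base64 32)"   # assumed size; confirm against the Task 1 capture
turn_password="$(gen_hex 24)"      # assumed size and encoding

# Emit a YAML fragment; setup_key stays a placeholder until M5.
cat <<EOF
vault:
  netbird:
    datastore_key: "${datastore_key}"
    turn_password: "${turn_password}"
    setup_key: CHANGEME  # created in the NetBird dashboard after first boot (M5 enrolment)
EOF
```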

---

### Task 4: Wire Caddy + DNS

- [ ] **Step 1:** Append to `reverse_proxy__routes` (`group_vars/all/reverse_proxy.yml`): `{host: netbird.askari.wingu.me, upstream: "<netbird container:port>"}` — per the captured Caddy template (NetBird needs HTTP/2 + gRPC; add the required Caddy directives, e.g. separate handles for the management gRPC path if the template shows them).
- [ ] **Step 2:** `netbird.askari.wingu.me` already resolves via the `*.askari.wingu.me` wildcard (M4a) — no new DNS record.
- [ ] **Step 3:** Commit.

---

### Task 5: Service-role standard files (ADR-004, authored)

- [ ] **Step 1:** Author `roles/netbird/SECURITY.md` (copy `docs/security/service-security-template.md`; record the public surface = Caddy 443 + Coturn 3478, embedded-Dex auth, accepted-risk R3).
- [ ] **Step 2:** `VERIFY.md` (copy the template; the `/verify-service` UI spec — run later when the playwright harness exists).
- [ ] **Step 3:** `ACCESS.md` (ADR-021; the dashboard/admin access + `access__*` intent).
- [ ] **Step 4:** `BACKUP.md` (ADR-022; the **datastore is stateful** → `backup__*` data; record that off-site backup is **pending `fisi`** — an accepted risk for now).
- [ ] **Step 5:** `make lint`; commit `docs(netbird): service-role standard files (SECURITY/VERIFY/ACCESS/BACKUP)`.

---

### Task 6: Add netbird to the offsite playbook

- [ ] **Step 1:** In `playbooks/offsite.yml`, add `netbird` after `reverse_proxy` (role-name tag). `make lint`. Commit.

---

### Task 7: Deploy to askari + verify (gated, live — expect debugging)

> NetBird self-hosting is finicky; budget for iterating on the management config + Caddy routing.

- [ ] **Step 1:** `make check PLAYBOOK=offsite LIMIT=askari TAGS=netbird` — review.
- [ ] **Step 2:** `make deploy PLAYBOOK=offsite LIMIT=askari TAGS=netbird` → `make deploy ... TAGS=reverse_proxy` (Caddy reloads with the netbird route).
- [ ] **Step 3:** Verify: `docker compose ps` all healthy; `curl -sI https://netbird.askari.wingu.me` → 200 with the M4a cert; the **dashboard loads** in a browser; the management API responds. Iterate on config/routing until green.
- [ ] **Step 4:** No repo commit (host state).
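
The `curl` check in Step 3 can be wrapped so it is easy to repeat while iterating. This is a rough sketch, not part of the repo: `http_status` and `check_https` are hypothetical helper names, and the expected status defaults to the 200 named in Step 3. A passing HEAD request only shows that Caddy serves a valid certificate and routes to something, so the `docker compose ps`, dashboard and management API checks still need to be done separately. Example call: `check_https netbird.askari.wingu.me`.

```shell
#!/bin/sh
set -eu

# Extract the numeric status code from the first header line read on stdin.
# Carriage returns are stripped because HTTP header lines end in CRLF.
http_status() { tr -d '\r' | awk 'NR == 1 { print $2 }'; }

# Succeed only if a HEAD request to https://<host> returns the expected status.
# curl verifies the TLS certificate by default, so a bad cert also fails here.
check_https() {
  chk_host="$1"
  chk_want="${2:-200}"
  chk_got="$(curl -sI "https://${chk_host}" | http_status)"
  if [ "$chk_got" = "$chk_want" ]; then
    echo "ok: ${chk_host} -> ${chk_got}"
  else
    echo "FAIL: ${chk_host} -> ${chk_got:-no response} (wanted ${chk_want})" >&2
    return 1
  fi
}
```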

---

### Task 8: Docs

- [ ] **Step 1:** STATUS — `netbird` coordinator built + applied (dashboard live); the first service role. ROADMAP M4b done; **M5 (enrol) next**. `make lint`; commit.

---

## Self-Review (completed)

- **Spec coverage:** external-proxy NetBird + embedded Dex (Decision 3) → Tasks 1,2,4; first service role + standard files (Decision 7) → Tasks 2,5; firewall 3478 (Decision 5) → done in M4a; setup key M5 + CHANGEME (Decision 8) → Task 3; Caddy front (M4a) → Task 4. Enrolment → M5, correct.
- **Placeholder scan:** the concrete config field *values* are intentionally captured from `configure.sh` (Task 1) rather than invented — version-sensitive, and inventing them would be wrong. The plan pins the method, not guesses.
- **Risk:** NetBird's external-proxy + gRPC routing is the hard part — Task 1 captures NetBird's own Caddy template to get it right, and Task 7 budgets for live iteration.