boma/roles/netbird_coordinator/VERIFY.md
sjat 070d6f293b docs(netbird): service-role standard files (SECURITY/VERIFY/ACCESS/BACKUP)
Author the four ADR-mandated service-role docs for netbird_coordinator and
add the cross-role access__*/backup__* data (ADR-021/022). First stateful
service: backup__state=true; off-site capture pending the fisi pull node.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-15 18:01:29 +02:00


Verify — netbird_coordinator (NetBird control plane)

Authored now, executed later. This is the acceptance spec for /verify-service netbird_coordinator. It cannot run yet: it needs the Playwright UI harness (ADR-017) and a live deploy of this role behind the M4a Caddy on askari. Until both exist, treat this as the spec to drive once they do — verification is deferred, not skipped.

NetBird's coordinator does have a real web UI (the dashboard), so this is a genuine Level-4 UI spec, not just an HTTP/TLS check.

Critical user journeys

The acceptance criteria — what "working" means. Numbered; action → expected result.

  1. Dashboard loads over a valid LE cert — request https://netbird.askari.wingu.me → the dashboard SPA renders; the browser shows a valid Let's Encrypt certificate (trusted chain, SAN matches the host, not expired).
  2. First-boot /setup creates the first admin — on a fresh deploy (zero users), https://netbird.askari.wingu.me/setup is reachable and creating the first admin account succeeds; re-visiting /setup afterwards no longer offers admin creation (the window self-closes once a user exists).
  3. Login via the embedded Dex IdP succeeds — logging in with the just-created admin (OIDC redirect through /oauth2, public PKCE client, no client secret) lands on the dashboard's authenticated home / peers view.
  4. The management API responds behind auth — an authenticated dashboard session can list peers / setup keys (the dashboard calls the management REST API at /api); an unauthenticated request to /api/... is rejected (401/403), confirming the API is not open.
  5. STUN answers on 3478/udp — out of band (not browser): a STUN binding request to askari:3478/udp returns a binding response (confirms the host-published UDP port is live).
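
The certificate criteria in journey 1 can also be checked outside the browser. The following is a minimal Python sketch, not part of the ADR-017 harness: the helper names are hypothetical, and matching the issuer organization "Let's Encrypt" is an assumption about how the issuing CA identifies itself in the certificate.

```python
import socket
import ssl
from datetime import datetime, timezone


def fetch_peer_cert(host: str, port: int = 443, timeout: float = 10.0) -> dict:
    """Return the server certificate as a dict after a fully verified handshake."""
    # The default context enforces chain trust and hostname (SAN) matching, so an
    # untrusted or mismatched certificate raises ssl.SSLCertVerificationError here.
    context = ssl.create_default_context()
    with socket.create_connection((host, port), timeout=timeout) as sock:
        with context.wrap_socket(sock, server_hostname=host) as tls:
            return tls.getpeercert()


def issuer_organization(cert: dict) -> str:
    """Extract the issuer's organizationName from a getpeercert() dict."""
    for rdn in cert.get("issuer", ()):
        for key, value in rdn:
            if key == "organizationName":
                return value
    return ""


def days_until_expiry(cert: dict, now: datetime) -> float:
    """Days between `now` and the certificate's notAfter timestamp."""
    expires = datetime.fromtimestamp(
        ssl.cert_time_to_seconds(cert["notAfter"]), tz=timezone.utc
    )
    return (expires - now).total_seconds() / 86400


def looks_like_valid_le_cert(cert: dict, now: datetime) -> bool:
    """True if the issuer is Let's Encrypt (assumed O= value) and the cert is unexpired."""
    return issuer_organization(cert) == "Let's Encrypt" and days_until_expiry(cert, now) > 0
```

Calling `looks_like_valid_le_cert(fetch_peer_cert("netbird.askari.wingu.me"), datetime.now(timezone.utc))` would cover trust, SAN match, issuer, and expiry in one pass, because the verified handshake already rejects untrusted or mismatched certificates.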

What good looks like

Key states/screens to confirm (and screenshot):

  • The browser padlock shows a valid Let's Encrypt cert for netbird.askari.wingu.me.
  • The /setup page renders the admin-creation form on a fresh deploy, and the dashboard reports an authenticated session after first login.
  • The dashboard's peers/setup-keys view loads its data from the management API (no error toast, no infinite spinner), proving that the /api routing through Caddy works. gRPC routing is not exercised by the dashboard and is not covered by this check.
  • An anonymous /api request returns 401/403, not data.
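
The anonymous-request check in the last bullet can be sketched without a browser. This is a hedged illustration using only the Python standard library; the function names are hypothetical, and the caller supplies whichever /api URL the harness ends up probing.

```python
import urllib.error
import urllib.request


def anonymous_status(url: str, timeout: float = 10.0) -> int:
    """Send a GET with no credentials and return the final HTTP status code."""
    request = urllib.request.Request(url, method="GET")
    try:
        # Note: urlopen follows redirects, so a redirect to a login page
        # would surface as that page's status rather than as a 3xx.
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        # urllib raises on 4xx/5xx; the status code is still the value we want.
        return err.code


def api_is_closed(status: int) -> bool:
    """The spec accepts only 401 or 403 as proof that the API rejects anonymous calls."""
    return status in (401, 403)
```

Treating anything other than 401/403 as a failure is deliberate: a 200 would mean data leaked, and a 404 or 5xx would mean the request never reached the auth layer at all.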

Not browser-verifiable

Route these to the manual-test handoff:

  • STUN on 3478/udp (journey 5) — UDP, not HTTP; verify with a STUN client, not a browser.
  • gRPC over h2c (management + signal exchange) and the relay WebSocket — exercised end-to-end only by a real peer enrolling (M5), not by a headless dashboard session.
  • Peer enrolment via setup keys — depends on the M5 client work; out of scope here.
  • Datastore encryption / restore — proven by the BACKUP.md restore drill, not the UI.
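
For the STUN item above, a manual tester does not strictly need a dedicated client: a binding request is a single 20-byte UDP datagram. The sketch below follows the STUN message header layout (message type, length, magic cookie, 12-byte transaction ID); the function names are hypothetical, it only handles IPv4, and it does not decode the mapped address in the reply.

```python
import os
import socket
import struct

MAGIC_COOKIE = 0x2112A442   # fixed value defined by the STUN specification
BINDING_REQUEST = 0x0001
BINDING_SUCCESS = 0x0101


def build_binding_request(transaction_id: bytes) -> bytes:
    """Build an attribute-less STUN binding request (20-byte header only)."""
    if len(transaction_id) != 12:
        raise ValueError("transaction id must be 12 bytes")
    # Header: type (2 bytes), attribute length (2 bytes, zero here), cookie (4 bytes).
    return struct.pack("!HHI", BINDING_REQUEST, 0, MAGIC_COOKIE) + transaction_id


def is_binding_success(datagram: bytes, transaction_id: bytes) -> bool:
    """Check that a datagram is a binding success response to our transaction."""
    if len(datagram) < 20:
        return False
    msg_type, length, cookie = struct.unpack("!HHI", datagram[:8])
    return (
        msg_type == BINDING_SUCCESS
        and cookie == MAGIC_COOKIE
        and datagram[8:20] == transaction_id
        and len(datagram) == 20 + length
    )


def probe_stun(host: str, port: int = 3478, timeout: float = 3.0) -> bool:
    """Send one binding request over UDP and report whether a valid reply arrived."""
    transaction_id = os.urandom(12)
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        sock.sendto(build_binding_request(transaction_id), (host, port))
        try:
            datagram, _ = sock.recvfrom(2048)
        except socket.timeout:
            return False
    return is_binding_success(datagram, transaction_id)
```

Because UDP is lossy, a single `False` from `probe_stun` is weak evidence; retrying a few times before declaring the port dead would be a reasonable refinement.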

Test data

This service runs only on production askari: there is no staging Authentik group and no SSO in front of it (it ships its own embedded IdP). The journeys therefore provision their own test data:

  • A fresh deploy with zero users so journey 2 (/setup) is reachable; journey 2 itself creates the single admin account used by journeys 3 and 4. No pre-seeded peers.
  • A public DNS A record for netbird.askari.wingu.me pointing at askari, so that Caddy can complete the ACME HTTP-01 challenge and obtain its certificate. This record is already provisioned with the M4a Caddy.