
# Verify — netbird_coordinator (NetBird control plane)
> **Authored now, executed later.** This is the acceptance spec for `/verify-service
> netbird_coordinator`. It cannot run yet: it needs the Playwright UI harness (ADR-017)
> **and** a live deploy of this role behind the M4a Caddy on askari. Until both exist,
> treat this as the spec to drive once they do: verification is deferred, not skipped.

NetBird's coordinator does have a real web UI (the dashboard), so this is a genuine
Level-4 UI spec, not just an HTTP/TLS check.
## Critical user journeys
The acceptance criteria — what "working" means. Numbered; action → expected result.
1. **Dashboard loads over a valid LE cert** — request
`https://netbird.askari.wingu.me` → the dashboard SPA renders; the browser shows a
valid Let's Encrypt certificate (trusted chain, SAN matches the host, not expired).
2. **First-boot `/setup` creates the first admin** — on a fresh deploy (zero users),
`https://netbird.askari.wingu.me/setup` is reachable and creating the first admin
account succeeds; re-visiting `/setup` afterwards no longer offers admin creation
(the window self-closes once a user exists).
3. **Login via the embedded Dex IdP succeeds** — logging in with the just-created admin
(OIDC redirect through `/oauth2`, public PKCE client, no client secret) lands on the
dashboard's authenticated home / peers view.
4. **The management API responds behind auth** — an authenticated dashboard session can
list peers / setup keys (the dashboard calls the management REST API at `/api`); an
**unauthenticated** request to `/api/...` is rejected (401/403), confirming the API
is not open.
5. **STUN answers on 3478/udp** — out of band (not browser): a STUN binding request to
`askari:3478/udp` returns a binding response (confirms the host-published UDP port is
live).
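
The certificate conditions in journey 1 (trusted chain, SAN matches the host, not expired) can also be probed without a browser. The following is a minimal Python sketch under stated assumptions, not part of the ADR-017 harness: the helper names `fetch_peer_cert`, `issuer_organization`, and `days_remaining` are hypothetical, and the sketch relies on Python's default TLS context to enforce the chain and hostname checks.

```python
import socket
import ssl
import time
from typing import Optional

HOST = "netbird.askari.wingu.me"  # host named in journey 1


def fetch_peer_cert(host: str, port: int = 443, timeout: float = 10.0) -> dict:
    """Handshake with default verification and return the parsed leaf certificate.

    ssl.create_default_context() verifies the chain against the system trust
    store and matches the hostname against the certificate, so an untrusted
    chain, a name mismatch, or an expired certificate raises
    ssl.SSLCertVerificationError instead of returning.
    """
    context = ssl.create_default_context()
    with socket.create_connection((host, port), timeout=timeout) as sock:
        with context.wrap_socket(sock, server_hostname=host) as tls:
            return tls.getpeercert()


def issuer_organization(cert: dict) -> str:
    """Return the issuer's organizationName from a getpeercert() dict, or ""."""
    for rdn in cert.get("issuer", ()):
        for key, value in rdn:
            if key == "organizationName":
                return value
    return ""


def days_remaining(cert: dict, now: Optional[float] = None) -> float:
    """Days until notAfter; a negative value means the certificate has expired."""
    expires = ssl.cert_time_to_seconds(cert["notAfter"])
    current = time.time() if now is None else now
    return (expires - current) / 86400
```

Once the deploy exists, `cert = fetch_peer_cert(HOST)` followed by `issuer_organization(cert)` and `days_remaining(cert)` would give a quick pre-check before the Playwright run. The browser-level check in journey 1 remains the acceptance criterion.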
## What good looks like
Key states/screens to confirm (and screenshot):
- The browser padlock shows a valid Let's Encrypt cert for `netbird.askari.wingu.me`.
- The `/setup` page renders the admin-creation form on a fresh deploy, and the dashboard
reports an authenticated session after first login.
- The dashboard's peers/setup-keys view loads its data from the management API (no error
toast, no infinite spinner), which shows that `/api` routing through Caddy works (gRPC
routing is covered under "Not browser-verifiable").
- An anonymous `/api` request returns 401/403, not data.
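
The anonymous-request check is easy to script alongside the browser run. Below is a rough sketch, assuming the standard library's `urllib` is acceptable for the probe; the path `/api/peers` is an assumption chosen for illustration (any route that requires auth would serve), and `anonymous_status` and `is_rejected` are hypothetical helper names.

```python
import urllib.error
import urllib.request

BASE_URL = "https://netbird.askari.wingu.me"
API_PATH = "/api/peers"  # assumption: any authenticated management route would do


def anonymous_status(url: str, timeout: float = 10.0) -> int:
    """Return the HTTP status of a GET sent with no cookies or Authorization header."""
    request = urllib.request.Request(url, method="GET")
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        # urllib raises on 4xx/5xx responses; the status code is what we want.
        return err.code


def is_rejected(status: int) -> bool:
    """Journey 4 passes only when the anonymous request gets 401 or 403."""
    return status in (401, 403)
```

A call such as `is_rejected(anonymous_status(BASE_URL + API_PATH))` should be true on a healthy deploy. Note that `urlopen` follows redirects, so a redirect to a login page that ends in 200 is reported as not rejected, which matches the spec's requirement of an explicit 401/403.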
## Not browser-verifiable
Route these to the manual-test handoff:
- **STUN on 3478/udp** (journey 5) — UDP, not HTTP; verify with a STUN client, not a
browser.
- **gRPC over h2c** (management + signal exchange) and the **relay WebSocket** — exercised
end-to-end only by a real peer enrolling (M5), not by a headless dashboard session.
- **Peer enrolment via setup keys** — depends on the M5 client work; out of scope here.
- **Datastore encryption / restore** — proven by the `BACKUP.md` restore drill, not the UI.
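
For the STUN item above, the manual handoff could use any STUN client. As an illustration of what such a client does, here is a minimal Python sketch of a binding request and response check following the STUN header layout (message type, length, magic cookie `0x2112A442`, 12-byte transaction ID). The function names are hypothetical, and the sketch only checks the response header; it does not parse the mapped-address attribute.

```python
import os
import socket
import struct

MAGIC_COOKIE = 0x2112A442
BINDING_REQUEST = 0x0001
BINDING_SUCCESS = 0x0101


def build_binding_request(transaction_id: bytes) -> bytes:
    """Build the 20-byte STUN header: type, length 0 (no attributes), cookie, ID."""
    if len(transaction_id) != 12:
        raise ValueError("transaction ID must be 12 bytes")
    return struct.pack("!HHI", BINDING_REQUEST, 0, MAGIC_COOKIE) + transaction_id


def is_binding_success(packet: bytes, transaction_id: bytes) -> bool:
    """True if packet is a Binding Success Response echoing our transaction ID."""
    if len(packet) < 20:
        return False
    msg_type, _length, cookie = struct.unpack("!HHI", packet[:8])
    return (
        msg_type == BINDING_SUCCESS
        and cookie == MAGIC_COOKIE
        and packet[8:20] == transaction_id
    )


def stun_probe(host: str, port: int = 3478, timeout: float = 3.0) -> bool:
    """Send one binding request over UDP; False on timeout, error, or a bad reply."""
    transaction_id = os.urandom(12)
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        try:
            sock.sendto(build_binding_request(transaction_id), (host, port))
            packet, _addr = sock.recvfrom(2048)
        except OSError:
            # Covers socket timeouts as well as resolution and network errors.
            return False
    return is_binding_success(packet, transaction_id)
```

Running `stun_probe("askari")` from a machine outside the host would exercise the published UDP port; how `askari` resolves from that machine is an assumption of this sketch. UDP is lossy, so a single `False` is worth retrying before treating it as a failure.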
## Test data
This service runs **only on production askari** — there is no staging Authentik group and
no SSO in front of it (it ships its own embedded IdP). The journeys provision their own:
- A **fresh deploy with zero users** so journey 2 (`/setup`) is reachable; journey 2
itself creates the single admin account used by journeys 3 and 4. No pre-seeded peers.
- Public DNS A-record for `netbird.askari.wingu.me` pointing at askari (so Caddy can
complete the ACME HTTP-01 challenge and the cert can issue); already provisioned with
the M4a Caddy.