# Backup: netbird_coordinator (NetBird control plane)

Rendered from the role's `backup__*` data (`roles/netbird_coordinator/defaults/main.yml`), the source of truth that also drives `/check-backup`. Regenerate from the data; edit the data, not the tables.

Host: `askari` (off-site Hetzner; ADR-007/016).

This is boma's **first stateful service** (`backup__state: true`). It holds the entire mesh control-plane state in an encrypted SQLite datastore.

## State captured

Rendered from `backup__*`:

| What | Source | How captured |
|---|---|---|
| datastore volume | `/var/lib/netbird` (Docker named volume `netbird_data`) | file-level, pulled read-only: the SQLite DB (peers, setup keys, ACLs, embedded-IdP users) |

- **Encryption key is part of the backup contract.** The datastore is **encrypted** with `vault.netbird.datastore_key` (`server.store.encryptionKey`, base64 32 bytes). A restore needs **both** the captured volume **and** that key. The key already lives in the Ansible Vault (off-host, in the repo); it is **not** re-captured by the data backup and must not be, because the vault is its own backup. Lose the key and the snapshot is unreadable.
- **Quiesce:** `false`. SQLite is captured file-level from the named volume. ADR-022 Decision 7 prefers a logical dump; NetBird exposes no dump command and uses an embedded store, so this is the file-level escape hatch (Decision 7 B). If a live file-level copy proves inconsistent in practice, flip `backup__quiesce: true` (stop → snapshot → restart); the stack tolerates a brief restart.
- **RPO:** ~24 h (nightly; ADR-022 Decision 2), **once the pipeline exists** (see Restore notes).

## Restore procedure

1. Re-provision the host (Terraform) and redeploy this role (Ansible) (Model A). This renders `config.yaml` with `vault.netbird.datastore_key` from the vault (the *same* key the snapshot was encrypted under; do not rotate it across a restore).
2. Stop the stack, `restic restore` the latest snapshot for `netbird_coordinator` into the `netbird_data` volume / `/var/lib/netbird`, then start the stack.
3. No logical dump to replay (file-level store).
4. Confirm with this role's `VERIFY.md` checks (ADR-008/017): dashboard loads, login via the embedded IdP works, the management API lists the restored peers/keys.

## Restore notes

- **The encryption key must match the snapshot.** The datastore is unreadable without the exact `vault.netbird.datastore_key` it was written under. Restore the vault first (or confirm the key is unchanged) before restoring the data; never rotate the datastore key as part of a restore.
- **Off-site backup is NOT yet captured (accepted risk).** The restic / `fisi` pull node (ADR-022 Plan 2) is **not built yet**, so right now this state is **not** backed up off-host. Until `fisi` lands, a loss of askari loses the mesh control-plane state; the only recovery is to re-bootstrap a fresh coordinator (`/setup`) and re-enrol peers (M5). Accepted for now; this record exists so the gap is explicit and `/check-backup` flags it. Revisit when the `fisi` pull node + restic repo are live.
- **Compose project name is `netbird`** (the base-dir basename), not `netbird_coordinator`. This matters when stopping the stack to quiesce a restore.
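
The ordering in restore step 2 and the key requirement in the notes above can be sketched as a small dry-run helper. This is an illustration under stated assumptions, not the role's tooling: the restic repository does not exist yet, so the snapshot tag (`netbird_coordinator`) and the restore target path are hypothetical, and the key check assumes "base64 32 bytes" means the value decodes to exactly 32 raw bytes. The function names are invented for this sketch. Nothing is executed; the helper only builds and prints the commands.

```python
import base64
import binascii
import shlex
from typing import List


def datastore_key_ok(key_b64: str) -> bool:
    """True if key_b64 is strict base64 that decodes to exactly 32 bytes.

    Assumption: this is what "base64 32 bytes" means for
    vault.netbird.datastore_key. Adjust if the real format differs.
    """
    try:
        raw = base64.b64decode(key_b64, validate=True)
    except (binascii.Error, ValueError):
        return False
    return len(raw) == 32


def restore_plan(target: str,
                 project: str = "netbird",
                 tag: str = "netbird_coordinator") -> List[List[str]]:
    """Ordered argv lists for step 2: stop, restore, start. Dry-run only.

    project: Compose project name ("netbird", per the restore notes).
    tag:     hypothetical snapshot tag; the real scheme is not defined yet.
    target:  hypothetical restore target directory passed to restic.
    """
    return [
        ["docker", "compose", "-p", project, "stop"],
        ["restic", "restore", "latest", "--tag", tag, "--target", target],
        ["docker", "compose", "-p", project, "start"],
    ]


if __name__ == "__main__":
    # Placeholder key for demonstration only (32 zero bytes); never a real key.
    demo_key = base64.b64encode(bytes(32)).decode()
    if not datastore_key_ok(demo_key):
        raise SystemExit("datastore key is not 32 bytes of base64; fix the vault first")
    # "/srv/restore-staging" is a made-up path used only for this example.
    for argv in restore_plan(target="/srv/restore-staging"):
        print(shlex.join(argv))
```

Checking the key shape before touching the volume mirrors the note that the vault must be restored (or confirmed unchanged) first. A well-formed key is still not proof that it is the key the snapshot was written under; only the `VERIFY.md` checks confirm that.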