boma/docs/superpowers/plans/2026-06-05-ubongo-control-host.md
sjat 0e9f179bfc Add implementation plan for ubongo control host
Task-by-task docs plan: author ADR-015 and reconcile ADR-001/005/008/009/012,
the new-host and rotate-secrets runbooks, accepted-risks, STATUS, and CLAUDE.md.
Documentation-only; the physical box stays "designed, not built".

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-05 09:29:10 +02:00


Ubongo Control / AI-Worker Host — Implementation Plan

For agentic workers: REQUIRED SUB-SKILL: Use superpowers:subagent-driven-development (recommended) or superpowers:executing-plans to implement this plan task-by-task. Steps use checkbox (- [ ]) syntax for tracking.

Goal: Record the decision to replace the cluster-resident control VM with a dedicated always-on physical host (ubongo) outside the Proxmox cluster, by authoring ADR-015 and reconciling every doc that currently assumes the control node is a cluster VM.

Architecture: This is a documentation-only change. No code, no roles, no inventory data. ubongo is recorded as designed, not built (per STATUS.md discipline) — the physical box, its OS install, and its inventory wiring are a future manual build, not part of this plan. The work is: one new ADR (the home of record) plus targeted amendments to the ADRs/runbooks/registers that contradict it, each cross-linking ADR-015.

Tech Stack: Markdown only. Verification is the repo's pre-commit hooks (trailing-whitespace, end-of-file, gitleaks, ansible-lint, vault-encryption guard) plus manual internal-consistency checks. There is no markdown linter in the toolchain, so "tests" are hook-pass + cross-reference-resolves greps.


Pre-flight (read once before starting)

  • rbw must be unlocked before every commit. The pre-commit ansible-lint hook decrypts vault.yml. Run rbw unlocked (exit 0 = good); if not, stop and ask the user to rbw unlock. Do not start a task you cannot commit.
  • Commit style: one commit per task, imperative subject ≤72 chars, with the trailer:
    Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
    
  • Order matters: Task 1 (ADR-015) must land first — every later task links to it.
  • Spec reference: docs/superpowers/specs/2026-06-05-ubongo-control-host-design.md.
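
The first two pre-flight rules can be checked mechanically before a task starts. The following is a minimal sketch, not part of the repo: the function names `agent_unlocked` and `commit_message_problems` are hypothetical, and the only facts taken from this plan are that `rbw unlocked` exits 0 when the agent is unlocked, the 72-character subject limit, and the trailer text. It does not try to judge whether a subject is imperative.

```python
import subprocess

MAX_SUBJECT_LEN = 72
TRAILER = "Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>"


def agent_unlocked(cmd=("rbw", "unlocked")):
    """Return True when the check command exits 0 (the plan's "exit 0 = good")."""
    try:
        return subprocess.run(list(cmd), capture_output=True).returncode == 0
    except FileNotFoundError:
        # Assumption: a missing binary is treated like a locked agent (do not start).
        return False


def commit_message_problems(message):
    """List violations of the plan's commit style; an empty list means it conforms."""
    lines = message.strip().splitlines()
    subject = lines[0] if lines else ""
    problems = []
    if not subject:
        problems.append("empty subject")
    if len(subject) > MAX_SUBJECT_LEN:
        problems.append(f"subject is {len(subject)} chars (max {MAX_SUBJECT_LEN})")
    if TRAILER not in lines[1:]:
        problems.append("missing Co-Authored-By trailer")
    return problems
```

A worker could call `agent_unlocked()` before touching any file and stop if it returns `False`, then run `commit_message_problems(...)` on the drafted message before committing.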

File map

| File | Action | Responsibility after change |
|---|---|---|
| docs/decisions/015-control-host.md | Create | Home of record for the ubongo decision |
| docs/decisions/001-architecture.md | Modify | Control node = physical box outside cluster |
| docs/decisions/005-bootstrapping.md | Modify | Control-node bootstrap = bare-metal Debian install |
| docs/decisions/009-provisioning-handoff.md | Modify | Control-node exception is genuinely physical |
| docs/decisions/008-testing.md | Modify | All test levels run on ubongo; stub future UI level |
| docs/decisions/012-hardware-capacity.md | Modify | ubongo is in-scope physical compute |
| docs/hardware/reference.md | Modify | ubongo row in node-capacity + physical-compute section |
| docs/runbooks/new-host.md | Modify | Part E: control node is bare-metal, not qm clone |
| docs/runbooks/rotate-secrets.md | Modify | Offline break-glass vault-password requirement |
| docs/security/accepted-risks.md | Modify | Reserve mesh-VPN coordinator risk (pending VPN choice) |
| STATUS.md | Modify | Row: ubongo (designed, not built) |
| CLAUDE.md | Modify | ADR-015 in Further reading; control-group note |

Task 1: Author ADR-015 (the home of record)

Files:

  • Create: docs/decisions/015-control-host.md

  • Step 1: Create the ADR file

Create docs/decisions/015-control-host.md with exactly this content:

# ADR-015 — Control / development / AI-worker host (`ubongo`)

## Context

Earlier ADRs framed the control node — the host that runs Terraform and Ansible —
as a **single Debian 13 VM on the Proxmox cluster**, manually provisioned as the one
documented exception to "Terraform owns VM existence" (ADR-009). That framing treats
the control node purely as a control-plane runner.

It fails four needs, all confirmed as drivers:

1. **Cold-start bootstrap** — the VM that runs Terraform/Ansible cannot exist until
   something else creates it; the bootstrap is circular and awkward.
2. **Always-on availability** — the operator wants to SSH in from a work PC or
   anywhere to drive Claude Code. A cluster VM is gone whenever the cluster is down
   or being rebuilt.
3. **Recovery / disaster** — the tool used to rebuild the cluster must not live
   inside the thing it rebuilds.
4. **Dev ergonomics** — a persistent home for Claude Code + the repo, not entangled
   with production VM lifecycle.

A laptop-only answer fails always-on and recovery. A VM-only answer fails cold-start
and recovery. A small dedicated always-on physical machine outside the cluster
satisfies all four.

## Decision

Introduce **`ubongo`** (Swahili: *brain*, consistent with the fleet's theme): a
single dedicated x86-64 mini-PC, always-on, living **outside** the Proxmox cluster.
It becomes *the* control node and collapses four roles into one box:

- Terraform + Ansible runner (control plane)
- Claude Code / AI-worker host the operator SSHes into
- Local test runner (Molecule/Docker, lint, and later a browser stack)
- Persistent dev home for the repo

There is **no longer a control VM on the cluster.** The `control` inventory group
points at this physical box. This *strengthens* the ADR-009 control-node exception:
it is genuinely outside Terraform's world, not a VM pretending to be the exception.
Every other host stays a Terraform-managed VM exactly as designed.

`ubongo` runs **plain Debian 13** (the `base` role applies). It is not a hypervisor
and runs no `docker_host` services.

### Hardware target

| Spec | Target | Why |
|---|---|---|
| CPU | 4 cores, x86-64 (Intel N100-class or better) | Molecule containers + Chromium prefer x86 |
| RAM | 16 GB | Docker + headless Chromium + toolchain headroom |
| Disk | 250 GB SSD/NVMe | Docker images, molecule layers, repos, browser cache |
| Network | Wired GbE | Always-on reliability over Wi-Fi |
| Power | Low draw (≤15 W idle) | Runs 24/7 |

Indicative: a refurb Dell/Lenovo/HP micro (USFF) or an N100 mini-PC (~€150–250).
Claude Code itself is light (the model runs in Anthropic's cloud); the sizing driver
is **all testing being local** — Molecule (Docker), lint, and a future
headless-Chromium/Playwright stack.

### Provisioning (bootstrap path)

Manual, on bare metal:

1. Install Debian 13 on the box (one-time, by hand).
2. `git clone` the repo; `make setup`; `make collections`; set up `rbw` + unlock.
3. Join the mesh VPN (choice deferred — see below).
4. From then on `ubongo` manages every other host normally; Ansible manages *it* for
   baseline config via the `control` group (`base` role only).

### Access & security

- Remote access is via the **mesh VPN** (choice deferred). SSH to `ubongo` over the
  mesh; nothing is published to the public internet — this stays inside ADR-002.
- `ubongo` runs the `base` role: SSH hardening, nftables default-deny, fail2ban,
  auditd, unattended-upgrades. Inbound SSH is allowed **only on the mesh interface**,
  denied on the physical NIC.

### Recovery model

`ubongo` is the rebuild tool, so three things must survive a full cluster loss:

1. **`mamba` (laptop) is a break-glass clone** — repo + toolchain + mesh + `rbw`,
   able to drive the fleet if `ubongo` dies.
2. **Terraform state** lives on `ubongo`, backed up encrypted off-box (synced to
   `mamba`). For a 25 VM fleet it is also reconstructable via `terraform import`.
3. **Vault password**: `ubongo` gets it from Vaultwarden via `rbw`. `rbw` keeps a
   local encrypted copy of the vault and decrypts it offline with the operator's
   Vaultwarden master password, so `ubongo` can decrypt the Ansible vault with the
   whole cluster down — provided `rbw` has synced once and the operator keeps the
   Vaultwarden master password offline (memorised + paper in a safe). Mirror onto
   `mamba`.

There is always exactly one irreducible offline root secret; here it is the
Vaultwarden master password. Mirroring Vaultwarden onto `ubongo` is rejected: it
would make the control node run a service (against its remit) and still need that
master password.

> verified: rbw offline-cache decryption · TO VERIFY before relying on the recovery
> model · rbw docs · (ADR-014, security-relevant — confirm during build)

## Consequences

- The control node is physical compute outside the cluster, so it appears in
  `docs/hardware/reference.md` even though it is not a cluster node (ADR-012).
- All testing (Molecule, lint, staging/external) runs on `ubongo` (ADR-008).
- A future **service-UI acceptance** testing level (Claude driving a headless browser
  against a deployed service) is anticipated; `ubongo` is sized for it. The harness
  is a separate spec.

## Deferred (separate specs / discussions)

1. **Mesh VPN choice** — Tailscale vs NetBird, hosted vs self-hosted. Recovery
   dimension: a hosted coordinator keeps the mesh up when the cluster is down; a
   self-hosted coordinator must live off-cluster (on `ubongo`), never on the fleet,
   or it recreates the chicken-and-egg.
2. **Browser-E2E verification harness** — Playwright/headless-Chromium, test-user
   generation, screenshot-back-to-Claude, and the new ADR-008 level.
3. **`rbw` offline-cache verification** — confirm offline decryption before relying
   on it (ADR-014).

## What was ruled out

| Option | Reason |
|---|---|
| Keep control node as a cluster VM | Fails cold-start, recovery, always-on. |
| Laptop-only (`mamba` for everything) | Fails always-on. Retained as break-glass backup. |
| Split roles (control VM + thin jump box) | Two toolchains, split control plane, heavy testing back on a cluster VM. |
| Mirror Vaultwarden onto `ubongo` | Control node would run a service; still needs the master password. |
| Self-hosted mesh coordinator on the cluster | Recreates the chicken-and-egg. |
| Raspberry Pi | Chokes running Docker + Chromium + toolchain together. |

See also: ADR-001 (architecture), ADR-005 (bootstrapping), ADR-008 (testing),
ADR-009 (provisioning handoff), ADR-012 (hardware/capacity), ADR-002 (security).
  • Step 2: Confirm rbw is unlocked, then verify hooks pass

Run: `rbw unlocked && pre-commit run --files docs/decisions/015-control-host.md`
Expected: `rbw` exits 0; hooks report Passed/Skipped (ansible-lint skips non-YAML; trailing-whitespace + end-of-file Passed).

  • Step 3: Commit
git add docs/decisions/015-control-host.md
git commit -m "Add ADR-015 (control/AI-worker host ubongo)"

Task 2: Amend ADR-001 (architecture)

Files:

  • Modify: docs/decisions/001-architecture.md

  • Step 1: Update the control-node bullet

Find (lines ~13–15):

- **Control node**: A dedicated Debian 13 VM on the cluster. Ansible runs from here.
  The control node is the one host that cannot fully bootstrap itself from scratch
  and requires manual initial setup (see `docs/runbooks/new-host.md`).

Replace with:

- **Control node**: `ubongo` — a dedicated always-on **physical** x86-64 machine
  **outside** the cluster. Ansible runs from here. It cannot be created by the
  Terraform it hosts, so it is provisioned manually (see ADR-015 and
  `docs/runbooks/new-host.md`).
  • Step 2: Update the VM-existence table row

Find:

| VM existence       | Terraform (`terraform/`) | Clones the cloud-init template; control node is the one manual exception (see ADR-009) |

Replace with:

| VM existence       | Terraform (`terraform/`) | Clones the cloud-init template; `ubongo` (control node) is a physical box outside the cluster, the one manual exception (see ADR-009/ADR-015) |
  • Step 3: Update the control host-group comment

Find:

├── control           # the control node itself — baseline config only, runs no services

Replace with:

├── control           # ubongo — physical control node outside the cluster; baseline config only, runs no services
  • Step 4: Verify and commit

Run: `rbw unlocked && pre-commit run --files docs/decisions/001-architecture.md`
Expected: hooks Passed/Skipped.

git add docs/decisions/001-architecture.md
git commit -m "ADR-001: control node is physical ubongo outside cluster"

Task 3: Amend ADR-005 (bootstrapping)

Files:

  • Modify: docs/decisions/005-bootstrapping.md

  • Step 1: Replace the "Control node bootstrapping" section body

Find (the numbered list under ## Control node bootstrapping, lines ~52–69):

The control node is a special case — it runs Terraform and Ansible, so it cannot
be created by the Terraform it hosts (chicken-and-egg). It is the one documented
exception to Terraform-owned VM existence (see ADR-009). The control node requires:

1. Manual VM provisioning — clone this cloud-init template by hand (Proxmox UI or
   `qm clone`), since Terraform is not yet available to do it
2. Manual setup of the Ansible environment:

Replace with:

The control node is a special case — it runs Terraform and Ansible, so it cannot
be created by the Terraform it hosts (chicken-and-egg). It is `ubongo`, a dedicated
**physical** machine outside the cluster, and the one documented exception to
Terraform-owned VM existence (see ADR-009 and ADR-015). The control node requires:

1. Manual OS provisioning — install Debian 13 on the physical box by hand (it is not
   a Proxmox guest, so there is no template to clone)
2. Manual setup of the Ansible environment:
  • Step 2: Update the trailing reference to the control node listing

Find:

The control node itself is listed in `inventories/production/hosts.yml` under
a `control` group and can be managed for baseline config (SSH, firewall, updates)
but not for the `docker_host` role (it does not run services).

Replace with:

`ubongo` is listed in `inventories/production/hosts.yml` under the `control` group
and can be managed for baseline config (SSH, firewall, updates) but not for the
`docker_host` role (it does not run services). Hardware target and recovery model
are in ADR-015.
  • Step 3: Verify and commit

Run: `rbw unlocked && pre-commit run --files docs/decisions/005-bootstrapping.md`
Expected: hooks Passed/Skipped.

git add docs/decisions/005-bootstrapping.md
git commit -m "ADR-005: control node bootstrap is bare-metal Debian on ubongo"

Task 4: Amend ADR-009 (provisioning handoff)

Files:

  • Modify: docs/decisions/009-provisioning-handoff.md

  • Step 1: Strengthen the control-node exception section

Find (under ## The control-node exception, lines ~129–138):

The control node — the host that runs Terraform and Ansible — is the one VM
Terraform does **not** create. It cannot provision the infrastructure that would
provision itself (chicken-and-egg). It is therefore the single documented exception
to "Terraform owns VM existence":

- Provisioned and bootstrapped manually, per the control-node section of ADR-005.
- Listed in `inventories/<env>/hosts.yml` under the `control` group, and managed by
  Ansible for baseline config only (no `docker_host` role).

Replace with:

The control node — the host that runs Terraform and Ansible — is `ubongo`, a
dedicated **physical** machine outside the cluster. It is not a VM at all, so
Terraform genuinely never touches it: it cannot provision the infrastructure that
would provision itself (chicken-and-egg). It is therefore the single documented
exception to "Terraform owns VM existence":

- Provisioned and bootstrapped manually on bare metal, per the control-node section
  of ADR-005; rationale, hardware, and recovery model in ADR-015.
- Listed in `inventories/<env>/hosts.yml` under the `control` group, and managed by
  Ansible for baseline config only (no `docker_host` role).
  • Step 2: Verify and commit

Run: `rbw unlocked && pre-commit run --files docs/decisions/009-provisioning-handoff.md`
Expected: hooks Passed/Skipped.

git add docs/decisions/009-provisioning-handoff.md
git commit -m "ADR-009: control-node exception is a physical box, not a VM"

Task 5: Amend ADR-008 (testing)

Files:

  • Modify: docs/decisions/008-testing.md

  • Step 1: Make Level 1 say it runs on ubongo

Find:

Runs in Docker on the control node or in CI. Fast (~5 min per role).

Replace with:

Runs in Docker on the control node (`ubongo`) or in CI. Fast (~5 min per role).
  • Step 2: Add a future service-UI acceptance level stub

Find (the end of ### Level 3 — External smoke test from askari, lines ~51–55):

### Level 3 — External smoke test from askari

Once `askari` is operational: scripted checks from outside the network confirming
that public-facing services respond correctly. Catches firewall and reverse proxy
configuration issues invisible to Ansible check mode.

Replace with:

### Level 3 — External smoke test from askari

Once `askari` is operational: scripted checks from outside the network confirming
that public-facing services respond correctly. Catches firewall and reverse proxy
configuration issues invisible to Ansible check mode.

### Level 4 — Service-UI acceptance (planned, not built)

Claude drives a headless browser from `ubongo` against a *deployed* service: loads
the rendered UI, creates test users, exercises features, and hands the operator a
manual test script for the rest. Catches application-level regressions that no lower
level sees. The harness (Playwright/headless-Chromium, screenshot-back-to-Claude) is
a **separate spec**; `ubongo` is sized for it (ADR-015). Status: designed, not built
(STATUS.md).
  • Step 3: Verify and commit

Run: `rbw unlocked && pre-commit run --files docs/decisions/008-testing.md`
Expected: hooks Passed/Skipped.

git add docs/decisions/008-testing.md
git commit -m "ADR-008: tests run on ubongo; stub Level 4 service-UI acceptance"

Task 6: Amend ADR-012 and the hardware reference

Files:

  • Modify: docs/decisions/012-hardware-capacity.md

  • Modify: docs/hardware/reference.md

  • Step 1: Note ubongo as in-scope physical compute in ADR-012

In docs/decisions/012-hardware-capacity.md, find the first bullet under ## Decision:

- `docs/hardware/reference.md` is the single, hand-maintained source of truth for
  physical compute + network gear and workload placement intent. Two
  machine-readable tables (node capacity, workload placement) carry the numbers.

Replace with:

- `docs/hardware/reference.md` is the single, hand-maintained source of truth for
  physical compute + network gear and workload placement intent. Two
  machine-readable tables (node capacity, workload placement) carry the numbers.
  This includes `ubongo`, the physical control node (ADR-015), even though it sits
  outside the Proxmox cluster.
  • Step 2: Add ubongo to the physical-compute section of the reference

In docs/hardware/reference.md, find:

_(repeat for pve1, pve2, askari)_

Replace with:

### ubongo (control node — outside the cluster)
- **Model / form factor:** _TBD (x86-64 mini-PC / USFF, e.g. N100 or refurb micro)_
- **CPU:** _TBD (target 4 cores, x86-64)_
- **RAM:** _TBD (target 16 GB)_
- **Storage:** _TBD (target 250 GB SSD/NVMe)_
- **NICs:** _wired GbE_
- **Notes:** _always-on; control plane + AI-worker + local test runner (ADR-015); not a Proxmox guest_

_(repeat for pve1, pve2, askari)_
  • Step 3: Add ubongo to the machine-readable node-capacity table

In docs/hardware/reference.md, find the node-capacity table:

| node | cores | ram_gb | disk_gb |
|------|-------|--------|---------|
| pve0 | 20    | 64     | 4000    |
| pve1 | 20    | 64     | 4000    |

Replace with:

| node | cores | ram_gb | disk_gb |
|------|-------|--------|---------|
| pve0 | 20    | 64     | 4000    |
| pve1 | 20    | 64     | 4000    |
| ubongo | 4   | 16     | 250     |

Note: the header row (node | cores | ram_gb | disk_gb) is a parser contract for scripts/capacity-scan.py — only a data row is added, the header is untouched.
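
To illustrate why the header row counts as a parser contract, here is a minimal sketch of a parser that keys on it. This is an assumption about the general technique, not the actual code of `scripts/capacity-scan.py`; the names `EXPECTED_HEADER` and `parse_node_capacity` are hypothetical. It finds the table by its exact header cells, so a renamed or reordered header would make the table invisible to it, while an extra data row is picked up without any change.

```python
EXPECTED_HEADER = ["node", "cores", "ram_gb", "disk_gb"]


def _cells(line):
    """Split one pipe-table line into stripped cell strings."""
    return [cell.strip() for cell in line.strip().strip("|").split("|")]


def parse_node_capacity(markdown):
    """Return {node: {"cores": int, "ram_gb": int, "disk_gb": int}} for the first matching table."""
    lines = markdown.splitlines()
    for i, line in enumerate(lines):
        if line.lstrip().startswith("|") and _cells(line) == EXPECTED_HEADER:
            break
    else:
        raise ValueError("node-capacity header row not found")

    nodes = {}
    for line in lines[i + 2:]:  # skip the header row and the |---| separator row
        if not line.lstrip().startswith("|"):
            break  # the table ends at the first non-table line
        node, *numbers = _cells(line)
        nodes[node] = dict(zip(EXPECTED_HEADER[1:], map(int, numbers)))
    return nodes
```

Cell padding is stripped before comparison, so the narrower spacing in the `ubongo` row does not matter to a parser written this way.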

  • Step 4: Verify the capacity scan still parses, hooks pass, then commit

Run: `python3 scripts/capacity-scan.py 2>&1 | head -c 400`
Expected: it runs without a parse error and the output reflects the new `ubongo` row (no traceback). If the script needs an argument or env, consult its `--help`; a clean exit with JSON is success.

Run: `rbw unlocked && pre-commit run --files docs/decisions/012-hardware-capacity.md docs/hardware/reference.md`
Expected: hooks Passed/Skipped.

git add docs/decisions/012-hardware-capacity.md docs/hardware/reference.md
git commit -m "ADR-012/hardware: add ubongo as physical control node"

Task 7: Update the new-host runbook (Part E)

Files:

  • Modify: docs/runbooks/new-host.md

  • Step 1: Replace Part E with the bare-metal control-node procedure

Find the whole ## Part E — Control node (manual exception) section (lines ~113–133), from the heading through the paragraph ending "every other host comes from make tf-inventory." Replace it with:

## Part E — Control node (`ubongo`, manual exception)

The control node runs Terraform and Ansible, so it cannot be created by the
Terraform it hosts (chicken-and-egg). It is `ubongo`, a dedicated **physical**
machine outside the cluster — not a Proxmox guest. It is the **one** host
provisioned manually. Rationale, hardware target, and recovery model: ADR-015.

1. Install Debian 13 on the physical box by hand (no template to clone).
2. Create the `ansible` user and install its SSH public key.
3. Set up the Ansible environment on it:
   ```bash
   git clone <repo> ~/ansible
   cd ~/ansible
   make setup        # venv + Python deps
   make collections  # Ansible collections
   rbw login && rbw unlock   # vault password from Vaultwarden (see rotate-secrets.md)
   ```
4. Join the mesh VPN (choice deferred; see ADR-015) so it is reachable over SSH from
   elsewhere.
5. Add `ubongo` to `inventories/<env>/hosts.yml` under the `control` group.

Because `ubongo` is not in `local.vms`, this is the only case where editing
`hosts.yml` by hand is expected. Known limitation: `make tf-inventory` regenerates
`hosts.yml` from Terraform outputs and will overwrite a hand-added `control` entry,
so re-add `ubongo` after running it (preserving the `control` entry in the generator
is tracked separately, not yet built).
  • Step 2: Update the Prerequisites note that assumes a template

Find:

- Proxmox VM template exists (Debian 13 cloud-init image — see below if not)

Replace with:

- Proxmox VM template exists (Debian 13 cloud-init image — see below if not).
  Not needed for the control node `ubongo`, which is bare-metal (Part E).
  • Step 3: Verify and commit

Run: `rbw unlocked && pre-commit run --files docs/runbooks/new-host.md`
Expected: hooks Passed/Skipped.

git add docs/runbooks/new-host.md
git commit -m "new-host runbook: control node ubongo is bare-metal"

Task 8: Update the rotate-secrets runbook (offline break-glass)

Files:

  • Modify: docs/runbooks/rotate-secrets.md

  • Step 1: Add a break-glass section after the rbw setup section

Find the end of the ## One-time — `rbw` setup on a new machine section:

Once unlocked, `make encrypt/decrypt/check/deploy` and the pre-commit ansible-lint
hook all obtain the password automatically. If the agent is locked you'll see a
clear "run: rbw unlock" error rather than a hang.

Replace with:

Once unlocked, `make encrypt/decrypt/check/deploy` and the pre-commit ansible-lint
hook all obtain the password automatically. If the agent is locked you'll see a
clear "run: rbw unlock" error rather than a hang.

---

## Break-glass — vault access during a full cluster outage

The control node `ubongo` (ADR-015) is the tool used to rebuild the cluster, so it
must be able to decrypt the vault even when Vaultwarden (if hosted on the cluster)
is down. `rbw` keeps a **local encrypted copy** of the Vaultwarden vault and decrypts
it **offline** with your Vaultwarden master password — no live server needed for
entries it has already synced. The recovery design therefore requires:

- `rbw` on `ubongo` (and on `mamba`, the break-glass laptop) has **synced at least
  once** while Vaultwarden was reachable (`rbw sync`).
- Your **Vaultwarden master password** is kept **offline** — in a password manager on
  `mamba` and on paper in a safe — independent of any cluster-hosted Vaultwarden.

There is always exactly one irreducible offline root secret; here it is the
Vaultwarden master password. Keep it recoverable without the cluster.

> **To verify (ADR-014, security-relevant):** confirm `rbw` actually decrypts its
> local cache fully offline on your pinned `rbw` version before relying on this.
  • Step 2: Verify and commit

Run: `rbw unlocked && pre-commit run --files docs/runbooks/rotate-secrets.md`
Expected: hooks Passed/Skipped.

git add docs/runbooks/rotate-secrets.md
git commit -m "rotate-secrets: document offline vault break-glass for ubongo"

Task 9: Reserve the mesh-VPN accepted-risk entry

Files:

  • Modify: docs/security/accepted-risks.md

  • Step 1: Add R3 to the risk table

Find the table row for R2:

| R2 | **SELinux not used** — no SELinux mandatory access control | AppArmor — Debian-native and enforced via the CIS baseline — already provides MAC; adding SELinux means two MAC systems, non-native to Debian, for no real gain | A service that ships and requires its own SELinux policy; threat model shifts toward targeted attackers |

Add immediately after it:

| R3 | **Mesh-VPN coordinator dependency (pending VPN choice)** — remote SSH to the control node `ubongo` (ADR-015) rides a mesh VPN whose coordination plane may be a third party (e.g. hosted Tailscale/NetBird) | A hosted coordinator keeps the mesh up when the cluster is down, which *helps* recovery; nothing is exposed to the public internet (ADR-002 preserved). Provisional — finalised when the VPN is chosen (separate discussion) | The VPN choice is settled (replace this entry with the concrete decision); a self-hosted coordinator is adopted; the provider's trust/security posture changes |
  • Step 2: Update the "Last reviewed" footer date

Find:

_Last reviewed: 2026-06-04. The prior gaps

Replace 2026-06-04 with 2026-06-05 (only the date changes; leave the rest of the sentence intact):

_Last reviewed: 2026-06-05. The prior gaps
  • Step 3: Verify and commit

Run: `rbw unlocked && pre-commit run --files docs/security/accepted-risks.md`
Expected: hooks Passed/Skipped.

git add docs/security/accepted-risks.md
git commit -m "accepted-risks: reserve R3 mesh-VPN coordinator (pending choice)"

Task 10: Add the ubongo row to STATUS.md

Files:

  • Modify: STATUS.md

  • Step 1: Add a row to the "Designed but not built" table

Find the last row of the ## Designed but not built table:

| Network IDS + security alerting | ADR-002 / TODO 15 | Suricata on OPNsense + AIDE/`auditd`/`fail2ban` alerting into the monitoring stack; not built |

Add immediately after it:

| `ubongo` — physical control / AI-worker host | ADR-015 | Replaces the cluster control VM with a dedicated always-on x86 box outside the cluster. Decision recorded; box not yet acquired/installed, not in inventory. |
  • Step 2: Verify and commit

Run: `rbw unlocked && pre-commit run --files STATUS.md`
Expected: hooks Passed/Skipped.

git add STATUS.md
git commit -m "STATUS: record ubongo control host as designed, not built"

Task 11: Update CLAUDE.md (index + control-group note)

Files:

  • Modify: CLAUDE.md

  • Step 1: Add ADR-015 to the Further reading table

Find:

| Bootstrapping hosts    | `docs/decisions/005-bootstrapping.md` |

Replace with:

| Bootstrapping hosts    | `docs/decisions/005-bootstrapping.md` |
| Control / AI-worker host (`ubongo`) | `docs/decisions/015-control-host.md` |
  • Step 2: Update the control-group parenthetical in the Inventory structure section

Find:

(`control` holds the one manually-provisioned control node — see ADR-009.)

Replace with:

(`control` holds `ubongo`, the one manually-provisioned **physical** control node
outside the cluster — see ADR-009 and ADR-015.)
  • Step 3: Verify and commit

Run: `rbw unlocked && pre-commit run --files CLAUDE.md`
Expected: hooks Passed/Skipped.

git add CLAUDE.md
git commit -m "CLAUDE.md: link ADR-015; note ubongo as physical control node"

Task 12: Final consistency sweep

Files: none modified (verification only)

  • Step 1: Confirm no doc still calls the control node a VM

Run:

grep -rniE "control node.*(VM|virtual)|dedicated Debian 13 VM" docs/ CLAUDE.md STATUS.md

Expected: no hit that asserts the control node is a VM. (Hits inside ADR-015's "What was ruled out" table that describe the rejected option are fine.) If any other doc still frames the control node as a VM, fix it the same way as the relevant task above and amend that task's commit.
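
If the raw grep output is noisy, the same check can be sketched in Python with the allowed hits filtered out. This is an illustrative sketch only: `vm_wording_hits` is a hypothetical helper, and skipping the whole of ADR-015 is a simplifying assumption (the plan only exempts its "What was ruled out" table). The pattern is the one from the grep above, applied line by line with case-insensitive matching, so any remaining hit still needs a human to decide whether it asserts the control node is a VM.

```python
import re

# Same pattern as the grep above; re.IGNORECASE mirrors grep's -i flag.
VM_WORDING = re.compile(
    r"control node.*(VM|virtual)|dedicated Debian 13 VM", re.IGNORECASE
)
# Assumption: every hit inside ADR-015 is treated as the allowed "ruled out" wording.
ALLOWED_FILES = ("015-control-host.md",)


def vm_wording_hits(docs):
    """Return (path, line_number, line) for each matching line outside the allowed files.

    `docs` maps a file path to that file's text.
    """
    hits = []
    for path, text in sorted(docs.items()):
        if path.endswith(ALLOWED_FILES):
            continue
        for number, line in enumerate(text.splitlines(), start=1):
            if VM_WORDING.search(line):
                hits.append((path, number, line.strip()))
    return hits
```

An empty result corresponds to the expected outcome of this step.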

  • Step 2: Confirm every ADR-015 cross-link resolves

Run:

grep -rl "ADR-015\|015-control-host" docs/ CLAUDE.md STATUS.md
test -f docs/decisions/015-control-host.md && echo "ADR-015 present"

Expected: the file exists and the referencing docs (001, 005, 008, 009, 012, runbooks, accepted-risks, STATUS, CLAUDE.md) appear.
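
The `grep -rl` output has to be compared against the expected list by eye. A small sketch of the same comparison in Python follows; `missing_referrers` is a hypothetical helper, and the expected paths are simply the "Modify" rows of the file map in this plan.

```python
import re

# Same alternation as the grep above.
ADR_REF = re.compile(r"ADR-015|015-control-host")

# The files this plan modifies, each of which should end up mentioning ADR-015.
EXPECTED_REFERRERS = [
    "docs/decisions/001-architecture.md",
    "docs/decisions/005-bootstrapping.md",
    "docs/decisions/008-testing.md",
    "docs/decisions/009-provisioning-handoff.md",
    "docs/decisions/012-hardware-capacity.md",
    "docs/hardware/reference.md",
    "docs/runbooks/new-host.md",
    "docs/runbooks/rotate-secrets.md",
    "docs/security/accepted-risks.md",
    "STATUS.md",
    "CLAUDE.md",
]


def missing_referrers(docs, expected=EXPECTED_REFERRERS):
    """Return the expected files whose text never mentions ADR-015.

    `docs` maps a file path to that file's text; a path absent from `docs`
    counts as missing the reference.
    """
    return [path for path in expected if not ADR_REF.search(docs.get(path, ""))]
```

An empty list means every amended document links back to the home of record.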

  • Step 3: Full hook run

Run: `rbw unlocked && pre-commit run --all-files`
Expected: all hooks Passed/Skipped. Fix anything that fails (most likely trailing whitespace or end-of-file) and amend the owning commit.

  • Step 4: Push (only if the user asks)

Per CLAUDE.md, push to origin is the off-machine backup. If the user wants it pushed:

git push origin main

Self-review notes (author)

  • Spec coverage: every spec section maps to a task: host decision/hardware/bootstrap/access/recovery → Task 1 (ADR-015); the doc-changes table → Tasks 2–11; testing implication → Task 5; deferrals are recorded in ADR-015 and not implemented here (correct, since they are separate specs). ✓
  • Not in scope (intentional): acquiring/installing the box, mesh-VPN selection, the browser harness, adding ubongo to live inventory, and modifying tf_to_inventory.py to preserve the control entry (logged as a known limitation in Task 7). ✓
  • No placeholders: every edit shows exact find/replace text; the only _TBD_ strings are deliberate hardware-reference skeleton fields matching that file's existing style. ✓
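
Because every task is an exact find/replace, a worker applying the plan programmatically can refuse to guess when the find text has drifted. The following is a minimal sketch under that assumption; `apply_exact_edit` is a hypothetical helper, not something the repo provides.

```python
def apply_exact_edit(text, find, replace):
    """Replace `find` with `replace`, refusing to guess unless the match is unique."""
    count = text.count(find)
    if count != 1:
        raise ValueError(
            f"expected exactly one match for the find text, found {count}"
        )
    return text.replace(find, replace, 1)
```

For example, Task 5 Step 1 becomes `apply_exact_edit(text, "on the control node or in CI", "on the control node (`ubongo`) or in CI")`. A zero count suggests the target file has changed since the plan was written; a count above one means the find text needs more surrounding context.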