Document the 2026-06-18 incident class: a road-warrior laptop losing DNS on a network transition strands NetBird (can't resolve the coordinator FQDN), taking ubongo unreachable until DNS recovers. Adds triage (local DNS vs coordinator), device mitigations (reliable resolvers + hosts-file pin), the non-mesh LAN break-glass to ubongo, and why ubongo is relay-only (deferred mesh-hardening, not a bug) — including the break-glass rule that hardening must preserve.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
- New docs/runbooks/integration-testing.md: when to use (firewall/
sshd/boot/Docker changes); make test-integration commands; lower-
level driver sub-commands; cert tier guidance; diagnostics dir;
VM inspection (virsh console / SSH); safety invariants; resource
constraints; adding a new profile; self-validating acceptance test.
- docs/runbooks/new-host.md: pre-flight warning before deploying
lockout-risky changes (firewall/sshd/boot) while break-glass is open
- docs/runbooks/new-role.md: step 13 pre-flight for lockout-risky roles
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
scripts/registry-login.sh reads vault.forgejo.registry_token and pipes it to
docker login --password-stdin (never echoed, never on argv); 'make registry-login'
wires it with the venv binaries. Adds the operator-minted CHANGEME vault stub
(fill via make edit-vault) and a per-machine prereq note in the claude-code-setup
runbook, so 'make caddy-image-push'/'molecule-image-push' become agent-completable
non-interactively. Consumes the 2026-06-15 signal in docs/FRICTION.md.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
mamba + work laptop enrolled in the mesh → ubongo reachable from anywhere; the
mobile-access goal is met and Phase 1 (remote access) is complete. Adds
docs/runbooks/netbird-client.md (reusable client-enrollment runbook) + STATUS/
ROADMAP flips + CLAUDE.md reading-table entry.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
- STATUS: docker_host is built+applied, not scaffold-only (O1)
- ADR-004: backup points to ADR-022, not "out of scope"; service-role file
table gains ACCESS.md + BACKUP.md rows (O2, O5)
- Finish Traefik->Caddy: ADR-008/011/017/019, CAPABILITIES, TODO (O3); scope
ADR-024's custom-image/NetBird claims to the deferred DNS-01/M4b paths (O22)
- ADR-016/017/018 now lead with ## Status per ADR-023 (O4)
- ADR-002: caveat `PLAYBOOK=upgrade` as planned/unbuilt (O6)
- CAPABILITIES: carve out ubongo's dev_env from the nvim/tmux exclusion (O7)
- ADR-007: one authoritative boma.baobab.band -> boma.wingu.me transition note (O18)
- new-host Part E: note ubongo is managed as sjat, ansible-user bootstrap pending (O15)
O9 (hosts.yml header) left open: the file is generator-owned (hook-protected);
fixing it needs a tf_to_inventory.py change or a tf-inventory run, not a hand-edit.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Move ubongo to 'Built (partial)' in STATUS; fill real M70q hardware specs
(i3-10100T, 16 GB, 256 GB SanDisk X600 SATA, no disk encryption). Record in
ADR-015 the dedicated claude AI-worker identity, LAN-SSH-only operational
reality, and the no-encryption decision; close the rbw offline-cache
recovery-verification item (ADR-015 + rotate-secrets). Add accepted-risk R5
(control-node disk unencrypted at rest) with its compensating controls.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Review O1-O3: ADR-017's per-service VERIFY.md requirement now appears in the
ADR-004 service-role file table, as a new-role runbook step, and the README
docs index/tree are refreshed (ADRs 010-017, security/testing/hardware dirs).
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Auto-fixes from /review-repo:
- ADR-005 + new-host.md: drop "Terraform writes the host's DNS A record"
(contradicts ADR-009 — dns role owns the zone; recurs from the 2026-05-30 run)
- ADR-005: control node is physical ubongo, not cloned from the template (ADR-015)
- CLAUDE.md: add the VERIFY.md template to Further reading
- TODO.md: typo fixes (we we / seperate)
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Reviewed the Claude Code config against boma's capabilities and committed a
reproducible, leaner toolchain:
- .claude/settings.json now declares extraKnownMarketplaces + enabledPlugins so a
fresh clone prompts to install the active set: superpowers, context7, terraform
(we use TF, ADR-006), claude-md-management (doc/ADR-heavy). Drops code-simplifier.
- Adds a conservative, read-only/verify permissions allowlist (git status/diff/log,
make lint/test/check, pytest, rbw unlocked, ls/cat/rg/find) — mutations and
outward/destructive commands stay gated, consistent with ADR-002.
- docs/runbooks/claude-code-setup.md: per-machine bootstrap, the deferred
enable-when plugins (security-guidance/semgrep, playwright, hookify, skill-creator),
rbw/venv prerequisites, and a note to keep the dangerous-mode prompt on.
Closes TODO 10.7. Plugin install remains a per-machine /plugin action (no native
auto-install).
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Revise ADR-004 to a service-role standard: every service is its own
self-contained role with a required file set including SECURITY.md, uniform
deploy mechanics, and a deferred shared-engine option (with revisit trigger)
recorded in the ADR.
Add the per-service security record:
- docs/security/service-security-template.md — canonical SECURITY.md template
(exposure, checklist status, service-specific hardening, residual risks)
- roles/<service>/SECURITY.md is where each service records how it meets the bar;
/security-review aggregates roles/*/SECURITY.md and cross-checks against config
- service-checklist.md noted as the generic bar the record answers
Wire-up: new-role runbook step writes SECURITY.md from the template; ADR-002
governance bullet points at it; CLAUDE.md role conventions require it and mandate
one-role-per-service; STATUS records the convention as defined-not-yet-applied.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Add a managerial security frame on top of the host baseline: explicit threat
model (opportunistic external, lateral movement/blast radius, operator/agent
error; supply chain accepted-lower-priority), security principles, and four
governance mechanisms that ADR-002 establishes and links out to:
- docs/security/service-checklist.md — per-service security bar (referenced
from the new-role runbook)
- docs/security/accepted-risks.md — living accepted-risk register (R1-R4)
- planned /security-review skill (TODO 8.5)
- agent guardrails in CLAUDE.md "what Claude must not do"
STATUS.md records the frame as present (manual enforcement) and /security-review
as planned-not-built.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Point ADR-005, the new-host runbook, CONTRIBUTING, and AGENTS at the
rbw/Vaultwarden flow instead of a .vault_pass file. Also record the cron-section
idea in docs/TODO.md.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
First /review-repo run on boma. Hardened repo-scan.py (no TODO.md/prose false
positives). Applied 7 safe fixes (DNS staleness x2, STATUS factual correction,
hosts.yml path generalisation, trunk-based wording x2, scripts/README). Recorded
the run and 17 open findings in docs/reviews/2026-05-30-*.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Master vault password is fetched from Vaultwarden via the rbw agent
(scripts/vault-pass-client.sh, wired as vault_password_file) instead of a
plaintext .vault_pass. Vault secrets use a nested vault.<service>.<key> map.
Encrypted vault.yml files are excluded from lint. Includes the host rename in
Makefile and STATUS.md.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>