All logs -> on-cluster Loki for troubleshooting/trends; a security-relevant
subset also ships write-only off-site to askari (append-only, tamper-resistant
against full-cluster compromise); skip WORM (accepted-risk R4). Alloy agent in
base; loki/grafana service roles; disk-wear handled as a design parameter.
Basis for ADR-018.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Task-by-task: author ADR-017, expand ADR-008 Level 4, create the VERIFY.md
template + /verify-service skill, and reconcile the checklist/CLAUDE.md/
gitignore/STATUS/TODO. Buildable-now artifacts; live run stays deferred.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Resolves ADR-015 deferred item #2 + TODO 2.2/2.3: a Claude-driven exploratory
browser harness (/verify-service) that exercises staging service UIs through
real SSO, backed by a per-service VERIFY.md, with test users in staging
Authentik and a manual-test handoff. Basis for ADR-017.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Resolves ADR-015 deferred item #1: the mesh VPN is NetBird, self-hosted on
askari, replacing ADR-007's VLAN-99 OPNsense WireGuard. Agent-per-host
enrollment via base, embedded local-user IdP, coordinator off-site for
outage survival. Basis for ADR-016.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Task-by-task docs plan: author ADR-015 and reconcile ADR-001/005/008/009/012,
the new-host and rotate-secrets runbooks, accepted-risks, STATUS, and CLAUDE.md.
Documentation-only; the physical box stays "designed, not built".
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Records the decision to replace the cluster-resident control VM with a
dedicated always-on physical mini-PC (ubongo) outside the Proxmox
cluster, collapsing control plane, AI-worker host, dev home, and local
test runner into one box. Basis for ADR-015.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Brainstormed design for docs/hardware/reference.md (physical compute +
network gear + workload placement intent), a stdlib-only capacity-scan.py,
and an on-demand /capacity-review skill that reports to docs/hardware/reviews/.
Mirrors the repo-scan -> /review-repo -> docs/reviews triad.
TODO additions: schedule /capacity-review later and decide its usage-stats
source (Proxmox RRD vs the Prometheus/Loki/Grafana/Alloy stack) before
building any hook (8.4); reevaluate the stdlib-only script policy (#14).
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>