Adds role_tag_problems() to check-tags.py: every role imported in a
play's roles: block must carry its own role name as a tag (extra tags
allowed; templated role names skipped). Wires the check into main() so
make lint catches violations. 6 new unit tests (29 total, all passing).
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
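A minimal sketch of the rule that commit describes, not the actual code in
check-tags.py. It assumes the play is already parsed into a dict; the
signature, the message wording, and the "{{" test for templated names are
illustrative assumptions.

```python
def role_tag_problems(play):
    """Return one message per role entry that lacks its own name as a tag.

    Hypothetical signature: `play` is a single parsed play (a dict).
    """
    problems = []
    for entry in play.get("roles") or []:
        if isinstance(entry, str):
            # Bare "- rolename" form carries no tags at all.
            name, tags = entry, []
        else:
            name = entry.get("role") or entry.get("name") or ""
            tags = entry.get("tags") or []
        if isinstance(tags, str):
            # Ansible accepts a single tag written as a plain string.
            tags = [tags]
        if not name or "{{" in name:
            # Templated role names cannot be resolved statically: skip.
            continue
        if name not in tags:
            problems.append(f"role '{name}' is missing tag '{name}'")
    return problems
```

Extra tags pass because the check only requires the role name to be present
among the tags, not to be the only one.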
Task-by-task docs plan: author ADR-018 and reconcile ADR-002, accepted-risks
(R4), CAPABILITIES, ADR-012, STATUS, TODO, CLAUDE.md. Roles/pipeline work is
deferred pending the base + service-role machinery.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
All logs -> on-cluster Loki for troubleshooting/trends; a security-relevant
subset also ships write-only off-site to askari (append-only, tamper-resistant
against full-cluster compromise); skip WORM (accepted-risk R4). Alloy agent in
base; loki/grafana service roles; disk-wear handled as a design parameter.
Basis for ADR-018.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
- ADR-011: ruled-out row was "digest-pinning stateful" (contradicted Decision 2);
now "digest-only (no readable tag)" — tag@digest is adopted (O7)
- ADR-003/010: act_runner names ubongo as the runner host, with a runner VM as a
future option (O8)
- ADR-008: WireGuard Molecule-exclusion row reframed to NetBird wt0 data plane (O9)
- ADR-011: scheduled_jobs xref points to TODO 8.3, not ADR-010 (O10)
- CAPABILITIES: add /verify-service Level 4 capability row (O11)
- TODO 3.10: rewrite the garbled base-container question (O12)
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Review O5/O6: ADR-006 mislabeled backend.tf as "Forgejo state backend" (its own
State-backend section chooses local state — Forgejo's API is read-only); ADR-014
called plugin reproducibility open though TODO 10.7 is done.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Review O4: ADR-016 said askari gets "its own inventory group" but never named it.
Settled as offsite_hosts (off-site, distinct from on-site-but-off-cluster ubongo).
Added to VALID_GROUPS (tf_to_inventory.py), ADR-009 valid groups, ADR-001/ADR-016
host-group enumerations, and CLAUDE.md. Generated hosts.yml picks up the section on
the next make tf-inventory (a manual-exception group like control).
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
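A hedged sketch of the kind of group validation that commit implies, not the
actual tf_to_inventory.py code. VALID_GROUPS here is a hypothetical subset
holding only the two group names this log mentions, and invalid_groups() is
an illustrative helper name.

```python
# Hypothetical subset: the real set in tf_to_inventory.py has more entries.
VALID_GROUPS = {"control", "offsite_hosts"}


def invalid_groups(host_groups):
    """Map each host to the sorted list of its groups not in VALID_GROUPS."""
    bad = {}
    for host, groups in host_groups.items():
        unknown = sorted(set(groups) - VALID_GROUPS)
        if unknown:
            bad[host] = unknown
    return bad
```

Under this sketch, a host assigned to a misspelled or unregistered group is
reported instead of silently producing a new inventory section.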
Review O1-O3: ADR-017's per-service VERIFY.md requirement now appears in the
ADR-004 service-role file table and as a new-role runbook step; the README
docs index/tree are refreshed (ADRs 010-017, security/testing/hardware dirs).
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Stale-deferred check exercised: 6 open-deferred-items all confirmed genuinely
open, 0 stale-deferred. Top open: thread ADR-017 VERIFY.md convention through
ADR-004/new-role/README; name the askari inventory group (ADR-016).
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Auto-fixes from /review-repo:
- ADR-005 + new-host.md: drop "Terraform writes the host's DNS A record"
(contradicts ADR-009 — dns role owns the zone; recurs from the 2026-05-30 run)
- ADR-005: control node is physical ubongo, not cloned from the template (ADR-015)
- CLAUDE.md: add the VERIFY.md template to Further reading
- TODO.md: typo fixes (we we / seperate)
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
ubongo, the NetBird mesh, and Level 4 verification are design-resolved
(ADR-015/016/017 + specs + plans); STATUS now says so while keeping build
status honest. Also resolves ADR-015 deferred #2 (browser harness), which
was left open when ADR-017 landed.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Task-by-task: author ADR-017, expand ADR-008 Level 4, create the VERIFY.md
template + /verify-service skill, and reconcile the checklist/CLAUDE.md/
gitignore/STATUS/TODO. Buildable-now artifacts; live run stays deferred.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Resolves ADR-015 deferred item #2 + TODO 2.2/2.3: a Claude-driven exploratory
browser harness (/verify-service) that exercises staging service UIs through
real SSO, backed by a per-service VERIFY.md, with test users in staging
Authentik and a manual-test handoff. Basis for ADR-017.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Resolves ADR-015 deferred item #1: the mesh VPN is NetBird, self-hosted on
askari, replacing ADR-007's VLAN-99 OPNsense WireGuard. Agent-per-host
enrollment via base, embedded local-user IdP, coordinator off-site for
outage survival. Basis for ADR-016.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Task-by-task docs plan: author ADR-015 and reconcile ADR-001/005/008/009/012,
the new-host and rotate-secrets runbooks, accepted-risks, STATUS, and CLAUDE.md.
Documentation-only; the physical box stays "designed, not built".
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>