Commit graph

302 commits

SHA1 Message Date
1021c6d25d STATUS: record logging pipeline + security alerting (ADR-018) 2026-06-06 07:06:06 +02:00
c6aa45037d ADR-012: track log-storage allocation + SSD wearout (ADR-018) 2026-06-06 07:05:15 +02:00
687d623a52 CAPABILITIES: Loki decided + Alloy agent + security alerting (ADR-018) 2026-06-06 07:04:26 +02:00
6f68f8b8c5 accepted-risks: add R4 (no cryptographic WORM for logs) 2026-06-06 07:03:27 +02:00
30c6a93c28 ADR-002: make central-logging + alerting controls concrete (ADR-018) 2026-06-06 07:02:32 +02:00
2894319f01 Add ADR-018 (logging and log integrity)
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-06-06 07:01:36 +02:00
96f8f20c05 Add implementation plan for logging + log integrity (ADR-018)
Task-by-task docs plan: author ADR-018 and reconcile ADR-002, accepted-risks
(R4), CAPABILITIES, ADR-012, STATUS, TODO, CLAUDE.md. Roles/pipeline deferred
on the base + service-role machinery.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-06 06:59:58 +02:00
8eb5ccf97d Add design spec for logging + log integrity (ship all to Loki)
All logs -> on-cluster Loki for troubleshooting/trends; a security-relevant
subset also ships write-only off-site to askari (append-only, tamper-resistant
against full-cluster compromise); skip WORM (accepted-risk R4). Alloy agent in
base; loki/grafana service roles; disk-wear handled as a design parameter.
Basis for ADR-018.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-05 22:03:31 +02:00
568729e7bd repo-scan: cut broken-path-ref + marker false positives
- broken-path-ref: skip template/generated-report paths — a placeholder
  (<service>) immediately following the match, a YYYY-MM-DD date token, or a
  path under a generated-report reviews/ dir (14 -> 0 on the current tree).
- marker: skip numbered-backlog references (TODO 8.2, TODO-3.1, TODO (2.2,
  TODO item 16) which point at the backlog, not code markers (35 -> 2; the
  remaining two are literal "TODO:" strings in a plan doc). Real code markers
  (TODO:, FIXME, etc.) still caught — verified with a synthetic fixture.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-05 20:37:40 +02:00
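
The marker rule described in the commit above could be sketched roughly as follows. This is a minimal illustration, not the code in repo-scan.py: the function name `is_code_marker`, the marker word list, and both regexes are assumptions chosen only to match the examples quoted in the commit message.

```python
import re

# Hypothetical patterns; the real repo-scan.py may use different ones.
MARKER = re.compile(r"\b(TODO|FIXME|XXX|HACK)\b")

# Numbered-backlog references such as "TODO 8.2", "TODO-3.1",
# "TODO (2.2" and "TODO item 16": the marker word is followed
# (optionally via "item", spaces, a hyphen or an open paren) by a number.
BACKLOG_REF = re.compile(r"TODO(?:\s+item)?[\s\-(]*\d+(?:\.\d+)?")


def is_code_marker(line: str) -> bool:
    """Return True if the line holds a real code marker, not a backlog reference."""
    for match in MARKER.finditer(line):
        # Skip marker words that start a numbered-backlog reference.
        if BACKLOG_REF.match(line, match.start()):
            continue
        return True
    return False
```

Under these assumptions, `is_code_marker("# TODO: fix this")` is true while `is_code_marker("see TODO 8.2")` is false, which mirrors the split the commit describes between code markers and pointers into the backlog.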
db76be2a63 review-repo: clear O7-O12 clarity items
- ADR-011: ruled-out row was "digest-pinning stateful" (contradicted Decision 2);
  now "digest-only (no readable tag)" — tag@digest is adopted (O7)
- ADR-003/010: act_runner names ubongo as the runner host, runner VM as a future
  option (O8)
- ADR-008: WireGuard Molecule-exclusion row reframed to NetBird wt0 data plane (O9)
- ADR-011: scheduled_jobs xref points to TODO 8.3, not ADR-010 (O10)
- CAPABILITIES: add /verify-service Level 4 capability row (O11)
- TODO 3.10: rewrite the garbled base-container question (O12)

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-05 19:28:07 +02:00
8e4bf3dd88 ADR-006/014: clear two stale labels
Review O5/O6: ADR-006 mislabeled backend.tf as "Forgejo state backend" (its own
State-backend section chooses local state — Forgejo's API is read-only); ADR-014
called plugin reproducibility open though TODO 10.7 is done.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-05 18:55:17 +02:00
d8afa94c4b Name and propagate the offsite_hosts inventory group (askari)
Review O4: ADR-016 said askari gets "its own inventory group" but never named it.
Settled as offsite_hosts (off-site, distinct from on-site-but-off-cluster ubongo).
Added to VALID_GROUPS (tf_to_inventory.py), ADR-009 valid groups, ADR-001/ADR-016
host-group enumerations, and CLAUDE.md. Generated hosts.yml picks up the section on
the next make tf-inventory (a manual-exception group like control).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-05 18:54:54 +02:00
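
A group allow-list like the VALID_GROUPS mentioned in the commit above might be checked along these lines. This is only a sketch under stated assumptions: the set below is a hypothetical two-entry subset (just the two group names this commit message mentions), and `check_groups` is an illustrative helper, not a function known to exist in tf_to_inventory.py.

```python
# Hypothetical subset; the real VALID_GROUPS in tf_to_inventory.py lists more groups.
VALID_GROUPS = frozenset({"control", "offsite_hosts"})


def check_groups(host: str, groups: list[str]) -> None:
    """Raise ValueError if a host is assigned to a group outside VALID_GROUPS."""
    unknown = sorted(set(groups) - VALID_GROUPS)
    if unknown:
        raise ValueError(
            f"{host}: unknown inventory group(s): {', '.join(unknown)}"
        )


# A host in the newly named group passes silently.
check_groups("askari", ["offsite_hosts"])
```

Rejecting unknown names at generation time would catch a misspelled group before it reaches hosts.yml, which is one plausible reason to keep such a list in the generator.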
f0d189ca09 Thread the VERIFY.md convention through ADR-004/new-role/README
Review O1-O3: ADR-017's per-service VERIFY.md requirement now appears in the
ADR-004 service-role file table, as a new-role runbook step, and the README
docs index/tree are refreshed (ADRs 010-017, security/testing/hardware dirs).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-05 18:52:42 +02:00
3dd03d4198 review-repo: 2026-06-05 report (4 auto-fixed, 12 open)
Stale-deferred check exercised: 6 open-deferred-items all confirmed genuinely
open, 0 stale-deferred. Top open: thread ADR-017 VERIFY.md convention through
ADR-004/new-role/README; name the askari inventory group (ADR-016).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-05 18:24:39 +02:00
666ad42634 review-repo: fix DNS-write contradictions + stale control-node/template refs
Auto-fixes from /review-repo:
- ADR-005 + new-host.md: drop "Terraform writes the host's DNS A record"
  (contradicts ADR-009 — dns role owns the zone; recurs from the 2026-05-30 run)
- ADR-005: control node is physical ubongo, not cloned from the template (ADR-015)
- CLAUDE.md: add the VERIFY.md template to Further reading
- TODO.md: typo fixes (we we / seperate)

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-05 18:23:16 +02:00
f566fd17eb review-repo: add stale-deferred check for ADR Deferred entries
repo-scan.py now enumerates open ADR "Deferred/Open" items and flags any that
another file describes as resolved but which isn't marked resolved in place
(the recurring miss in docs/FRICTION.md). review-repo.md's Phase 2 reviewer
confirms each open item against later ADRs/STATUS.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-05 18:13:49 +02:00
66d11cc352 FRICTION: stale-deferred-item pattern recurred a 3rd time — build the check
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-05 18:06:26 +02:00
d5c62c99ad STATUS/ADR-015: mark the three deferred design threads resolved
ubongo, the NetBird mesh, and Level 4 verification are design-resolved
(ADR-015/016/017 + specs + plans); STATUS now says so while keeping build
status honest. Also resolves ADR-015 deferred #2 (browser harness), which
was left open when ADR-017 landed.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-05 18:01:14 +02:00
91d851fe4d TODO: mark headless-browsing + test-user standard decided (ADR-017) 2026-06-05 13:20:40 +02:00
01e4f96983 STATUS: record Level 4 service-UI verification (ADR-017) 2026-06-05 13:19:53 +02:00
eb415db96e Git-ignore verify screenshots; add testing/reviews dir
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-06-05 13:19:04 +02:00
920e47b50d CLAUDE.md: VERIFY.md role convention; link ADR-017
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-06-05 13:18:07 +02:00
22c0747c0b service-checklist: add Level 4 UI verification to the gate 2026-06-05 13:17:16 +02:00
25f04002df Add /verify-service skill for Level 4 UI verification (ADR-017) 2026-06-05 13:16:25 +02:00
05abb3b6a5 Add VERIFY.md template for service-UI acceptance (ADR-017) 2026-06-05 13:15:13 +02:00
2df1f98153 ADR-008: expand Level 4 into the verify-service harness (ADR-017) 2026-06-05 13:14:12 +02:00
cc3337502f Add ADR-017 (service-UI acceptance verification, Level 4) 2026-06-05 13:13:09 +02:00
be6a064f44 Add implementation plan for service-UI verification (Level 4)
Task-by-task: author ADR-017, expand ADR-008 Level 4, create the VERIFY.md
template + /verify-service skill, and reconcile the checklist/CLAUDE.md/
gitignore/STATUS/TODO. Buildable-now artifacts; live run stays deferred.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-05 13:11:43 +02:00
2bd11b5aa9 Add design spec for service-UI verification (ADR-008 Level 4)
Resolves ADR-015 deferred item #2 + TODO 2.2/2.3: a Claude-driven exploratory
browser harness (/verify-service) that exercises staging service UIs through
real SSO, backed by a per-service VERIFY.md, with test users in staging
Authentik and a manual-test handoff. Basis for ADR-017.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-05 13:05:11 +02:00
5322cce5c6 FRICTION: resolving a deferred decision needs a doc-wide grep sweep
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-05 12:20:20 +02:00
cd62c5e098 new-host runbook: mesh VPN resolved to NetBird (ADR-016) 2026-06-05 11:52:22 +02:00
ed9fdcc10a CLAUDE.md: link ADR-016 (mesh VPN) 2026-06-05 11:51:36 +02:00
787aa3b8e1 STATUS: record NetBird mesh (coordinator + base enrollment) 2026-06-05 11:50:53 +02:00
841f666de9 CAPABILITIES: VPN decided — NetBird self-hosted (ADR-016) 2026-06-05 11:50:04 +02:00
08165ffb68 accepted-risks: R3 now the concrete NetBird coordinator risk 2026-06-05 11:48:58 +02:00
2ae5cf4535 ADR-015: resolve mesh-VPN deferral — NetBird on askari (ADR-016) 2026-06-05 11:48:04 +02:00
5a32dd46d3 ADR-007: retire VLAN-99 WireGuard for the NetBird mesh (ADR-016) 2026-06-05 11:47:03 +02:00
ff796c64ca Add ADR-016 (mesh VPN — NetBird self-hosted on askari) 2026-06-05 11:45:45 +02:00
4b85b14f1f Add implementation plan for NetBird mesh VPN
Task-by-task docs plan: author ADR-016 and reconcile ADR-007 (retire VLAN-99
WireGuard), ADR-015 (resolve deferred #1), accepted-risks R3, CAPABILITIES,
STATUS, CLAUDE.md. Documentation-only; role/deployment waits on the base role.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-05 11:44:05 +02:00
99ace3eb48 Add design spec for mesh VPN (NetBird self-hosted on askari)
Resolves ADR-015 deferred item #1: the mesh VPN is NetBird, self-hosted on
askari, replacing ADR-007's VLAN-99 OPNsense WireGuard. Agent-per-host
enrollment via base, embedded local-user IdP, coordinator off-site for
outage survival. Basis for ADR-016.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-05 10:58:35 +02:00
a53941dffe CLAUDE.md: fix capabilities doc link after rename to CAPABILITIES.md 2026-06-05 09:50:28 +02:00
7a48a60f14 CLAUDE.md: fix project summary — control node is physical ubongo 2026-06-05 09:49:23 +02:00
a30c1af3f0 CLAUDE.md: link ADR-015; note ubongo as physical control node 2026-06-05 09:48:09 +02:00
9653a34241 STATUS: record ubongo control host as designed, not built 2026-06-05 09:47:24 +02:00
55a3666d16 accepted-risks: reserve R3 mesh-VPN coordinator (pending choice) 2026-06-05 09:46:40 +02:00
a2db8058e7 rotate-secrets: document offline vault break-glass for ubongo 2026-06-05 09:45:27 +02:00
b89ca8835a new-host runbook: control node ubongo is bare-metal
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-06-05 09:44:31 +02:00
3fb780c286 ADR-012/hardware: add ubongo as physical control node
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-06-05 09:43:09 +02:00
66064be7b2 ADR-008: tests run on ubongo; stub Level 4 service-UI acceptance 2026-06-05 09:42:01 +02:00
07bc1c83f0 ADR-009: control-node exception is a physical box, not a VM 2026-06-05 09:41:03 +02:00