Commit graph

56 commits

Author SHA1 Message Date
0286c78f36 docs(plan): mesh-hardening SPOF — accept + DNS-resilience implementation plan
Two tasks: a base mesh coordinator-FQDN /etc/hosts pin (Molecule TDD) + the accept-and-document docs (R8, ADR-016 availability amendment, STATUS/ROADMAP). Coordinator backup deferred to ADR-022.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-20 10:49:26 +02:00
3ba22d199a docs(spec): mesh-hardening SPOF — accept single-coordinator SPOF + DNS-resilience pin
Sub-project 3 of the mesh-hardening follow-on. Accepts the single off-site coordinator as a documented availability SPOF (R8 + ADR-016 amendment) given the narrow blast radius (LAN/intra-cluster/local traffic unaffected; only remote relayed mesh access breaks). Hardens the one real gap: a base mesh coordinator-FQDN /etc/hosts pin so managed hosts survive a local-DNS hiccup. Coordinator off-site backup explicitly deferred to an ADR-022 kickoff (no throwaway infra).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-20 10:42:19 +02:00
6be758bece docs(plan): mesh-hardening redesign — askari implementation plan
Four tasks: netbird_coordinator geolocation disable (TDD via Molecule) -> inventory enablement (INPUT-only firewall + WAN break-glass + manage over wt0) -> an askari_inputonly integration profile (the reboot-safety GREEN gate) -> the operator-gated supervised live cutover + STATUS/ROADMAP update. Tasks 1-3 are autonomously implementable; Task 4 is operator-gated (live off-site host, lockout risk).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-19 16:32:27 +02:00
a178729587 docs(spec): mesh-hardening redesign — askari wt0-primary + WAN break-glass
Redesign of the backed-out 2026-06-17 askari SSH->wt0 attempt. Mirrors the proven ubongo 2/3 pattern (INPUT-only default-deny, SSH scoped by iifname wt0, no sshd ListenAddress change -> no boot-race) and adds the coordinator-host exception the incident demanded: a permanent non-mesh break-glass (WAN :22 from ubongo's static WAN IP + the Hetzner console), WAN :22 deliberately left open. Folds in the netbird_coordinator geo-DB robustness fix (FRICTION #4) so a transient egress blip can't FATAL the control plane. Harness-GREEN gate before a supervised live cutover.

Operator decision (2026-06-19): do this redesign first, then a separate sub-project to reduce askari's SPOF role.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-19 16:25:26 +02:00
66a9a0af08 docs: ubongo admin-addrs add 10.20.10.17 + flag raw-lease follow-up
Allow a second operator workstation (10.20.10.17) onto ubongo's LAN SSH
alongside mamba (10.20.10.50). Both are raw DHCP leases; recorded a FRICTION
open signal to replace them with MAC-pinned OPNsense reservations when
OPNsense-as-code lands (ADR-020 / TODO 3.5).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-19 09:26:04 +02:00
e14e347047 docs(plan): mesh-hardening 2/3 — ubongo implementation plan
Five tasks: base knobs (input-only forward policy + admin-addr SSH allow,
TDD via Molecule) → enable on the control group → a 'be ubongo' integration
profile (profile-aware verify) → the real-VM harness GREEN gate → the
operator-supervised live cutover (signal-6 order, physical-console break-glass).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-19 09:26:04 +02:00
24a1d909c9 docs(spec): mesh-hardening 2/3 — ubongo INPUT-only default-deny
Sub-project 2 of the mesh-hardening follow-on (the post-incident roadmap
ordering puts ubongo first). Harden the control node's inbound surface via
base's nftables firewall as INPUT-only default-deny: the forward chain stays
permissive (new base__firewall_input_only knob) so Docker egress + the
libvirt-NAT integration harness keep working, and there is no sshd ListenAddress
change — sidestepping the ip_nonlocal_bind boot-race that sank askari. SSH
allowed from wt0, ssh-from-control (Ansible self), and mamba on the LAN (new
base__firewall_admin_addrs). Harness-validated before an operator-supervised
cutover; the physical console is the permanent break-glass.

Design maps to the four relevant 2026-06-17 incident lessons (FRICTION signals
1/2/3/6).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-19 09:12:58 +02:00
65533be4d9 docs(plan): implementation plan for local VM integration testing (2.4)
20-task TDD plan: integration_test substrate role, stdlib virsh driver, askari profile, tiered certs, RED->GREEN acceptance, docker_host container-forward fix, ADR-025 + docs. Follows the 2026-06-18 design spec.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-18 11:56:04 +02:00
02e1eb7449 docs(spec): design local VM integration testing on ubongo (2.4)
Throwaway KVM VMs on ubongo (libvirt, Approach A) that mirror a real host (real Docker, real reboot, real role apply) to catch the reboot/firewall/boot-order class Molecule cannot - the 2026-06-17 mesh-hardening incident. First profile: be askari; tiered certs (internal + le-staging built, le-prod-wildcard on-demand). Concrete build of ADR-008 Level 2/3; to be recorded as ADR-025.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-18 11:35:51 +02:00
dfa363cecd docs(plan): mesh-hardening 1/3 — askari SSH onto wt0 implementation plan
5 tasks: base sshd ListenAddress+ip_nonlocal_bind (Molecule), firewall public
zone + askari catalog, inventory wt0 override, TF retire WAN :22, then the live
operator-supervised staged cutover.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-17 20:25:59 +02:00
292c204752 docs(spec): mesh-hardening 1/3 — move askari SSH onto wt0
Decomposes the M5 mesh-hardening follow-on into 3 independent sub-specs; this
is sub-project 1. Three-layer SSH-on-wt0 (sshd ListenAddress=mesh + nftables
iifname wt0 + retire the Hetzner WAN :22), ip_nonlocal_bind to beat the
post-boot wt0 bind race (fail-closed), live wt0 fact for the listen addr,
staged cutover with the firewall auto-rollback as the safety gate.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-17 20:15:12 +02:00
55776fb03c docs(plan): M5 mesh-enrollment implementation plan
8 tasks: build the base 'mesh' concern + tag + vault stub + per-host opt-in
(autonomous), operator handoff for /setup + setup key, gated live enrol of
ubongo + askari, operator laptop enrol, docs. Reachability-only; lockdown deferred.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-17 15:49:28 +02:00
4142bb15f8 docs(spec): M5 mesh-enrollment design (reachability-only)
base 'mesh' concern enrols NetBird agents on ubongo + askari via a reusable scoped
setup key (vault); laptops enrolled by the operator. Reachability via the default
peer policy; the base nftables default-deny on ubongo + ACL tightening are deferred
to a follow-on. Resolves ROADMAP M5 design; next: writing-plans.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-17 15:44:13 +02:00
d14639e80a docs(plan): /kaizen command — implementation plan (TODO 11)
7 tasks: friction-scan.py (TDD, --json/--nudge) + tests; kaizen.md command;
/review-repo nudge hookup + STATUS/TODO; dogfood run. Mirrors /review-repo.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-14 21:09:29 +02:00
1a0e30e278 docs(spec): /kaizen — kaizen-loop command (TODO 11)
Curate-only consume pass over FRICTION.md Open signals: interactive guided
session, add/change/park/remove verdicts (park-with-resurrection-trigger to
protect out-of-phase tooling on a solo project), single source = FRICTION.md,
ledger is the durable record. Mirrors /review-repo (command md + stdlib scanner).
Stage 1 on-demand + stage-2 nudge; headless/cron deferred (TODO 11.3).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-14 21:05:09 +02:00
9e0c264658 docs: reconcile lower-severity review findings (O9-O24)
- ADR-007: document ubongo on the legacy V4 net at 10.20.10.151 (transitional,
  outside the planned srv /24 until the LAN is re-cut) (O10); single authoritative
  boma.baobab.band -> boma.wingu.me transition note already added earlier
- terraform tfvars.example + variables.tf (both envs): pve01 -> pve0 and
  <host>.boma.baobab.band per ADR-007 naming (O11)
- ADR-012/013/015/016/017/018: convert "See also:" prose to `## Related` sections
  placed after Consequences, matching ADR-014/019-023 (O13)
- docs/README + inventories/README: list the missing subdirs / offsite_hosts +
  offsite.yml merge behaviour (O14, O29 note)
- ADR-009: drop the retired `nyumbani` example; use vaultwarden.wingu.me split-horizon (O19)
- ROADMAP M2: askari shipped as cx23/x86 (CAX11/ARM out of stock) (O20)
- ADR-020: 80/443/3478 opened in M4a (past tense); coordinator role is M4b (O21)
- netbird -> netbird_coordinator across ROADMAP M4b, the M4b plan, ADR-024 (O23)
- ADR-024: align the M1 DNS-01 wildcard scope wording with ROADMAP (O24)
- capacity-scan.py: read the inventory directory so offsite.yml (askari) is seen (O28)
- tf_to_inventory.py: generated header now warns it overwrites the manual control node (O9)
- tests/tags.yml: proxy concern comment Traefik -> Caddy (missed in the O3 sweep)

O9's existing stub hosts.yml header stays as-is (generator-owned, hook-protected);
the fix lives in the generator for the next regeneration. make lint + pytest (57) green.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-14 19:31:40 +02:00
e3461375f5 docs(plan): M4b — NetBird coordinator service role
Capture NetBird's configure.sh reference for a pinned version → translate into
boma role templates (compose + management.json + dex/openid + turnserver),
external-proxy mode behind the M4a Caddy (netbird.askari.wingu.me). First service
role: full ADR-004 standard files; secrets generated/CHANGEME-stubbed (setup key
for M5). Gated live deploy + verify.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-14 18:20:04 +02:00
dd8c6825ba docs(plan): M4a — Docker + Caddy reverse proxy platform
First of M4's two build phases: docker_host (Docker engine), custom xcaddy Caddy
image (caddy-dns/gandi), reverse_proxy role (Caddyfile from a route catalog,
DNS-01 wildcard cert for *.askari.wingu.me via vault.gandi.pat), ADR-024 (Caddy is
boma's reverse proxy), firewall 80/443 + DNS, proven by serving a test route over
TLS. M4b (NetBird) follows, reading NetBird's current self-host compose then.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-14 17:20:53 +02:00
65cf20a993 docs(spec): M4 — NetBird coordinator on askari + Caddy reverse proxy
Caddy becomes boma's standard reverse proxy (amends the soft Traefik assumption;
new ADR) with Gandi DNS-01 certs (custom xcaddy image, reuses vault.gandi.pat) —
the only cert path for mesh/LAN-only services. NetBird self-hosted in
external-proxy mode (embedded Dex), compose rendered from boma templates
(ADR-004/013). Three roles: docker_host (first real content), reverse_proxy (new,
Caddy), netbird (first service role w/ full ADR-004 standard files). Firewall +
DNS amendments; backup execution deferred (fisi). caddy-dns/gandi + NetBird
self-host facts verified.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-14 17:19:21 +02:00
cff368ece2 docs(spec,plan): M3 — base ssh hardening + fail2ban
ADR-002 baseline (key-only, no root, fail2ban 5/1h) as two base task files under
the existing 'hardening' concern tag; applied to askari by tag (NOT the host
firewall — that's mesh-gated to avoid lockout; Hetzner Cloud Firewall is the
perimeter until M5). NetBird agent deferred to M4. Adds a LIMIT=/TAGS= passthrough
to make check/deploy.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-14 16:38:38 +02:00
29921428c4 docs(plan): M2 — askari provisioning (Terraform + Hetzner Cloud)
9-task plan: verify hcloud facts; hetzner_vm module (server+firewall+ssh+cloud-init);
offsite env (CAX11/hel1/debian-13, local state); Makefile token-injection + directory
inventory + tf-inventory-offsite; offsite-handoff pytest; init/validate/plan; GATED
apply (billed VPS) + bootstrap; ADR-006/009/020/007/016 amendments. Resolves the
inventory-handoff open item via a directory inventory.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-14 11:53:08 +02:00
b131ee317e docs(plan): M1 — public_dns implementation plan
Bite-sized TDD plan: add community.general; scaffold public_dns; wingu.me record
data + pytest; role tasks (gandi_livedns present/absent loops, apply toggle);
Molecule (apply=false, no live API); dns.yml play; gated live run on ubongo
(purge Gandi defaults + anti-spoof baseline + dig verify); ADR-007 amendment +
TODO 4 resolution + STATUS/CAPABILITIES.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-14 10:23:26 +02:00
602550fdaa docs(spec): M2 — provision askari via Terraform + Hetzner Cloud
askari is provisioned as IaC: Terraform owns its existence too, generalizing
ADR-006 from "Proxmox VM existence" to Proxmox + Hetzner (new hetznercloud/hcloud
provider, hetzner_vm module, offsite stack with local state). CAX11 (ARM) in
Helsinki on Debian 13, behind a TF-managed Hetzner Cloud Firewall (SSH-from-ubongo
now; NetBird ports in M4). Token via TF_VAR_hcloud_token from vault.hetzner.token.
Handoff stays ADR-009-shaped (tf_to_inventory.py extended to emit askari into
offsite_hosts). State in the ADR-022 backup scope; DR via terraform import.

Amends ADR-006/009/020/007/016. Point ROADMAP.md M2 at the spec.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-14 10:12:10 +02:00
32d480efcf docs(spec): note project (boma) vs domain (wingu.me) in the naming scheme
Decided to keep the project named boma with wingu.me as its domain (boma was not
available as a domain). Record why the infra tier reads <host>.boma.wingu.me so it
isn't re-litigated; folds into the ADR-007 amendment.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-14 09:47:13 +02:00
f7fac5f5e3 docs(spec): M1 — finalize for wingu.me (greenfield), record Gandi-defaults purge
boma's domain is wingu.me (registered at Gandi; 'wingu' = Swahili for cloud).
Replace the parametric <boma-domain> placeholder with wingu.me throughout. The
zone was NOT empty — Gandi auto-seeded 13 default records (parking A, www redirect,
a full Gandi mailbox set), so M1 includes a one-time purge to a clean baseline plus
an anti-spoof null-mail set (null MX, SPF -all, DMARC reject) since wingu.me sends
no mail. Domain-pick open item closed.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-14 09:14:10 +02:00
7a47dd9dec docs(spec): M1 — public DNS migration to Gandi (DNS-as-code) design
Settles the M1 design: full registrar transfer Cloudflare -> Gandi; three-tier
naming scheme (host.boma / service.bare / service.askari), nyumbani dropped,
mesh/LAN-only default; public-DNS-as-code via a control-node `public_dns` role
driven by group_vars data, using community.general.gandi_livedns with a PAT
(api_key is deprecated/rejected by Gandi — verified per ADR-014). Stale records +
unused MX cleaned by omission. Cert scope is DNS+PAT only (issuance deferred to
M4/Phase 2). Human/agent division of labour + token-scoping recorded.

Resolves TODO 4 and review finding O12 once the ADR-007 amendment lands. Point
ROADMAP.md M1 at the spec.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-11 23:17:19 +02:00
f3f382ae69 Add dev_env role: zsh/tmux/nvim for workstation-class hosts
A new role (separate from base) that gives workstation-class hosts (ubongo
now, mamba later) a clean interactive environment: zsh + oh-my-zsh +
oh-my-posh, tmux + TPM plugins, and neovim. Dotfiles are real files deployed
via GNU stow (not templated); pinned nvim v0.12.2 + oh-my-posh 29.0.1.

Configs re-derived (ADR-013) from AnsibleBaobabV4 + the operator's fisi setup
on boma's terms: no Nerd Font (headless host), no system LSP suite (nvim uses
mason), versions pinned (V4 tracks latest). Applied via playbooks/workstation.yml
to the control group for users sjat + claude. Lint + Molecule (idempotent) green.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-11 13:50:11 +02:00
b9daf2a0ad plan: record ubongo build outcome (done/deferred/follow-ups)
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-11 10:33:18 +02:00
7b190e4313 Add ubongo physical-build plan (2026-06-11 session)
Captures the interactive build decisions (no-encryption + accepted risk,
simple partition, dedicated claude identity, LAN-only access, pinned
versions) and the A-F + H task breakdown. Sequel to the 2026-06-05
docs-only ADR-015 plan.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-11 10:01:41 +02:00
6d7d27b03b docs(adr): add Proposed lifecycle state; mark ADR-011 Proposed
Revisits the lifecycle decision on the evidence of ADR-011 (a real draft
with open questions). Adds a fourth state, Proposed (YYYY-MM-DD), to ADR-023,
the template, the adr-structure check (+test), spec and plan. Sets ADR-011's
Status to Proposed and removes its now-redundant inline 'Proposed' line.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-10 14:48:55 +02:00
3c920ae630 docs(adr): sync plan Task 2 with flat-comment template fix
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-10 14:31:23 +02:00
89179dd7c9 docs(adr): revise spec+plan — full retroactive restructure of 001-018
Replaces the Status-only backfill with a faithful presentational
restructure bringing the whole back-catalogue to 4-section conformance
(no grandfathering). Adds the faithfulness rule and per-file worklist.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-10 14:28:20 +02:00
ce3319cbed docs(adr): implementation plan + FRICTION signal for ADR structure
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-10 13:55:16 +02:00
dfbe37916f docs(adr): design spec for ADR structure & lifecycle (ADR-023)
Codifies the structure ADRs 019-022 converged on, pins an
Accepted/Superseded/Deprecated lifecycle with a no-silent-rewrite rule,
adds an adr-template.md scaffold, and plans a Status-header backfill of
ADRs 001-018. Basis for ADR-023.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-10 13:45:21 +02:00
ed6d5463aa docs(backup): final-review fixes — stateless BACKUP.md, dump-step wording, spec sync
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-10 11:32:06 +02:00
f5c97d1f36 docs(backup): record ADR-022; wire into CLAUDE.md, STATUS, TODO
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-06-10 11:19:01 +02:00
2041bd3b70 docs(backup): add foundation-layer implementation plan (ADR-022)
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-10 11:05:17 +02:00
eaffd8d900 docs(backup): add backup & DR strategy design (→ ADR-022)
Data-only restic backups, rebuild-from-code recovery (Model A); central
off-cluster pull node (fisi) with 8TB mirror; 3-2-1 via pCloud (rclone)
+ rotated USB air-gap. Per-service backup__* contract + BACKUP.md as a
hard convention. Two-tier restore testing (ubongo container restore-verify
+ semi-annual staging DR rehearsal). One restic password escrowed to
Vaultwarden + paper (restic + vault passwords) for a non-circular
break-glass. Dead-man's-switch alerting via Uptime Kuma.

Resolves TODO 3.8; grounds ADR-011's backup-first assumption.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-10 11:00:01 +02:00
cdbd66410a docs(access): implementation plan for ADR-021 operational access
Splits the work into Tranche A (land now: ADR-021, ADR-016/020
reconciliation, ssh-from-control firewall source, ACCESS.md template,
/check-access command, governance + index wiring) and Tranche B
(build-pending on infra: per-service access__* + rendered ACCESS.md,
/check-access running).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-09 17:16:49 +02:00
fd4bbbc977 docs(access): design operational-access doctrine (ADR-021)
Brainstorming spec for ADR-021: operational access as a deployment
deliverable. Two layers (host baseline + per-service), a three-tier
access ladder (mesh SSH -> LAN SSH from ubongo -> console break-glass),
declarative access__* data rendering ACCESS.md and driving a
/check-access verifier. Resolves TODO 3.2 (API access) and 7.2 (host
access); amends ADR-016 (SSH also from ubongo) and ADR-020
(ssh-from-control source).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-09 17:10:54 +02:00
03329d7d25 docs(plan): host nftables firewall implementation plan
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-06 18:47:48 +02:00
d7fbaca554 docs(spec): host nftables firewall design (ADR-020 build #1)
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-06 18:40:50 +02:00
f700f4a475 docs(plan): firewall strategy ADR-020 landing plan
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-06 15:42:17 +02:00
2a65391c0e docs(spec): firewall strategy design (TODO 3.5 → ADR-020)
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-06 15:36:24 +02:00
04bfc26422 docs(plan): tagging standard implementation plan (ADR-019)
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-06 09:21:15 +02:00
4ed9e9a8bf docs(spec): tagging standard design (TODO 3.7/3.11 → ADR-019)
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-06 09:15:44 +02:00
96f8f20c05 Add implementation plan for logging + log integrity (ADR-018)
Task-by-task docs plan: author ADR-018 and reconcile ADR-002, accepted-risks
(R4), CAPABILITIES, ADR-012, STATUS, TODO, CLAUDE.md. Roles/pipeline deferred
on the base + service-role machinery.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-06 06:59:58 +02:00
8eb5ccf97d Add design spec for logging + log integrity (ship all to Loki)
All logs -> on-cluster Loki for troubleshooting/trends; a security-relevant
subset also ships write-only off-site to askari (append-only, tamper-resistant
against full-cluster compromise); skip WORM (accepted-risk R4). Alloy agent in
base; loki/grafana service roles; disk-wear handled as a design parameter.
Basis for ADR-018.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-05 22:03:31 +02:00
be6a064f44 Add implementation plan for service-UI verification (Level 4)
Task-by-task: author ADR-017, expand ADR-008 Level 4, create the VERIFY.md
template + /verify-service skill, and reconcile the checklist/CLAUDE.md/
gitignore/STATUS/TODO. Buildable-now artifacts; live run stays deferred.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-05 13:11:43 +02:00
2bd11b5aa9 Add design spec for service-UI verification (ADR-008 Level 4)
Resolves ADR-015 deferred item #2 + TODO 2.2/2.3: a Claude-driven exploratory
browser harness (/verify-service) that exercises staging service UIs through
real SSO, backed by a per-service VERIFY.md, with test users in staging
Authentik and a manual-test handoff. Basis for ADR-017.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-05 13:05:11 +02:00