Documents three blockers found while developing the askari_inputonly
integration-test profile:
1. inet filter default-deny silently blocks libvirt dnsmasq DHCP: nftables
multi-table independence means ip filter LIBVIRT_INP accept does NOT
prevent inet filter drop. Diagnosed via strace; fixed with a drop-in.
2. libvirt leaseshelper PID-file: virPidFileReleasePath unlinks the file after
every call; the unprivileged user "nobody" cannot recreate it in /run/.
Fix: setuid-root C wrapper.
3. cloud-init rejects underscores in local-hostname → skips network-config
→ no DHCP. Fix: sanitize with replace("_", "-") in meta-data hostname.
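
A minimal sketch of the hostname fix from item 3, assuming the meta-data is
rendered from Python. The function names and the label check are illustrative
assumptions, not the harness's actual code; only the replace("_", "-") step
comes from this commit.

```python
import re

# One DNS label: letters, digits, hyphens; no leading or trailing hyphen.
_LABEL_RE = re.compile(r"^[A-Za-z0-9]([A-Za-z0-9-]{0,61}[A-Za-z0-9])?$")


def sanitize_hostname(name):
    """Replace underscores with hyphens and verify the result is a valid label."""
    host = name.replace("_", "-")
    if not _LABEL_RE.match(host):
        # Fail loudly rather than let cloud-init skip network-config silently.
        raise ValueError(f"not a valid hostname label: {host!r}")
    return host


def render_meta_data(instance_id, profile_name):
    """Render a cloud-init meta-data document (hypothetical helper)."""
    host = sanitize_hostname(profile_name)
    return f"instance-id: {instance_id}\nlocal-hostname: {host}\n"
```

Raising on an invalid label is a design choice for this sketch: it surfaces the
problem at render time instead of as a VM that never gets a DHCP lease.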
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Applying base's nftables (even INPUT-only/forward-accept) to a Docker host
flushes Docker's ip nat -> container egress breaks until 'systemctl restart
docker'. Found on the ubongo mesh-hardening 2/3 live cutover; the Docker-less
test VM couldn't surface it. Self-heals on reboot (dockerd re-adds nat;
forward=accept doesn't block). Runbook/docker_host follow-ups noted.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Final-review finding: the default Molecule scenario only renders the forward
drop (input_only off) branch; the accept branch is covered by the integration
harness only. Tracked for a kaizen decision (2nd scenario vs accept the split).
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Two signals from running the ubongo harness gate: (1) the operator wants a
standard that pre-authorises isolated VM integration tests on ubongo so the
agent doesn't ask each time; (2) a stale agent session (a shell predating the
integration_test libvirt-group grant) still carries its old process groups, so
the harness's qemu-img/file writes are denied -> run via 'sg libvirt -c ...';
self-heal idea noted.
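
One possible shape for the self-heal idea, as a sketch only: check whether the
running process already carries the libvirt group and, if not, wrap the command
in sg. The helper names are hypothetical and the harness may solve this
differently.

```python
import os
import shlex


def current_process_gids():
    """Group IDs the running process actually holds (not the group database)."""
    return set(os.getgroups()) | {os.getegid()}


def lookup_gid(group_name):
    """Resolve a group name to its GID (Unix only)."""
    import grp
    return grp.getgrnam(group_name).gr_gid


def wrap_command(argv, group_name, group_gid, process_gids):
    """Return argv unchanged if the group is held, else wrapped in `sg`."""
    if group_gid in process_gids:
        return list(argv)
    # A stale session lacks the group; sg starts the command with it applied.
    return ["sg", group_name, "-c", shlex.join(argv)]
```

A caller would pass lookup_gid("libvirt") and current_process_gids() and hand
the returned list to subprocess.run.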
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Allow a second operator workstation (10.20.10.17) onto ubongo's LAN SSH
alongside mamba (10.20.10.50). Both are raw DHCP leases; recorded a FRICTION
open signal to replace them with MAC-pinned OPNsense reservations when
OPNsense-as-code lands (ADR-020 / TODO 3.5).
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
firewall-breaks-Docker-hosts, ip_nonlocal_bind didn't beat the boot race,
coordinator-host circular bootstrap, NetBird geo-DB FATAL dependency, no
off-site coordinator backup, and reboot-tested-after-removing-break-glass.
For the next /kaizen + the mesh-hardening re-spec.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Migrate the single-file-bind-mount/stale-config gotcha (reload-in-place needs a
directory mount; restart-based roles don't) to docs/testing/gotchas.md, and move
all 7 open signals out of FRICTION.md's Open-signals section into the new
2026-06-17 decisions-ledger block: all consumed, 1 PARK (the ubongo
self-management gap, tracked in STATUS), 0 REMOVE. Relax test_load_signals to
accept an empty Open-signals section (the goal state after a kaizen pass).
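
A sketch of a loader that treats an empty Open-signals section as valid. The
heading text, the "- " bullet format, and the function name are assumptions
for illustration; the real FRICTION.md layout and test_load_signals may differ.

```python
def load_open_signals(text, heading="## Open signals"):
    """Collect bullet items under the given heading; an empty section yields []."""
    signals = []
    in_section = False
    for line in text.splitlines():
        if line.startswith("## "):
            # Any level-2 heading either opens or closes the section.
            in_section = line.strip() == heading
            continue
        if in_section and line.startswith("- "):
            signals.append(line[2:].strip())
    return signals
```

A relaxed test would then assert that the result is a list, not that it is
non-empty.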
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
I re-surfaced two already-settled decisions as questions (push to origin; subagent
vs inline) at the M5 handoff. The existing execution-mode guard only matches the
writing-plans menu's literal text, so free-form prose re-asks slip through. Default:
push as backup and go subagent-driven without asking.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Building images is fully automatable; pushing to the Forgejo registry needs an
interactive docker login, and registry creds aren't in vault — so an agent can't
complete a push. Captured for the next kaizen review.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
ADR-024 Status/Consequences, STATUS.md, ROADMAP M4a, and the FRICTION ledger now
record that the DNS-01 path is built and proven, with the root cause of the M4a
failure (version skew: pre-Bearer libdns/gandi sent the deprecated Apikey header;
plus building on a Hetzner IP). Traefik was reconsidered and rejected again — lego's
Gandi provider has the same PAT-vs-Apikey question, so it would not have helped.
Dated review reports and spec/plan snapshots are left as historical records.
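
To illustrate the version skew, a sketch of the two Authorization header forms
involved, assuming the scheme names "Bearer" (personal access token) and
"Apikey" (deprecated) as described above. The helper is hypothetical and is
not libdns/gandi or lego code.

```python
def gandi_auth_header(token, scheme="Bearer"):
    """Build the Authorization header for the chosen scheme."""
    if scheme not in ("Bearer", "Apikey"):
        raise ValueError(f"unknown auth scheme: {scheme!r}")
    return {"Authorization": f"{scheme} {token}"}


# Placeholder token; a PAT sent under the old scheme is what the skew produced.
current = gandi_auth_header("example-pat")
legacy = gandi_auth_header("example-pat", scheme="Apikey")
```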
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Three new Open signals: ansible-lint no-role-prefix vs ADR-021/022 access__/
backup__ conventions (first service role); Molecule tag-propagation now testable
via tagged converge + full-then-partial; ADRs over-claiming cross-doc reconciliation
(repo-scan check candidate, cousin of stale-deferred).
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
M1 public_dns applied to wingu.me (purge + SPF/DMARC, idempotent). Friction:
item.values dict-method collision, Gandi null-MX rejection, and the apply=false-
Molecule/data-only-pytest gap that let both bugs reach a live apply.
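
The item.values collision can be reproduced in plain Python. The sketch below
approximates how dotted template access resolves a name (attribute first, then
key); the record contents are made up for illustration.

```python
def dotted_lookup(obj, name):
    """Approximate template `obj.name`: try the attribute, then the key."""
    try:
        return getattr(obj, name)
    except AttributeError:
        return obj[name]


def subscript_lookup(obj, key):
    """Template `obj['key']` for a dict: plain key access."""
    return obj[key]


# Hypothetical DNS record with a key that shadows a dict method name.
record = {"name": "@", "type": "TXT", "values": ["v=spf1 -all"]}

method = dotted_lookup(record, "values")    # bound dict.values, not the list
data = subscript_lookup(record, "values")   # the intended list
```

Keys such as values, keys, and items are safest accessed with subscript syntax.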
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
5th occurrence (06-14): presented the subagent-driven/inline menu at the M1 plan
handoff. The 06-10 ledger claims a Stop hook blocks this; it didn't fire. Flag to
verify the hook is present and that its matcher catches the writing-plans menu
wording.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Complete the 2026-06-09 entry (third recurrence of presenting the
execution-mode menu despite the standing subagent-driven preference) and
restore two continuation-line indents a markdown formatter had stripped.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Forgejo origin is trunk-based with no merge-request gate, so the
finishing-a-development-branch "open a PR" option doesn't apply — merge
locally then push. Also carries earlier uncommitted FRICTION.md edits
(emphasis normalization + 2026-05-31 ADR-status entry).
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
docs/FRICTION.md: a running log of friction/gotchas/recurring-fixes/unused tooling,
seeded with this session's real signals — raw material for the periodic kaizen
review. docs/TODO.md: schedule building /retro in ~1 week, and record the Claude-setup
decision. (Also carries your earlier backlog edits.)
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>