diff --git a/docs/FRICTION.md b/docs/FRICTION.md
index 2ce85fa..f555f04 100644
--- a/docs/FRICTION.md
+++ b/docs/FRICTION.md
@@ -197,6 +197,23 @@ harness on ubongo and shaking it down against real KVM (spec/plan in docs/superp
   integration-only coverage. Final-review finding; not a cutover blocker (the accept
   branch is a literal, and a var-name break would fail the drop branch too → caught).
 
+- `[gotcha]` **Applying base's firewall to a Docker host flushes Docker's nat → container
+  egress dies until `restart docker`** (2026-06-19, mesh-hardening 2/3 live cutover): base's
+  `nftables.conf.j2` starts with `flush ruleset`, which wipes ALL tables incl. Docker's
+  `ip nat`/`ip filter` (+ libvirt's). On ubongo I chose INPUT-only so `forward` stays `accept`
+  — yet the apply STILL broke CONTAINER egress: `docker pull` worked (dockerd uses HOST egress)
+  but a container `ping` FAILED — the masquerade (SNAT) was gone, so replies couldn't return.
+  `forward accept` permits forwarding but can't replace the missing nat. The spec's "input-only
+  keeps Docker egress working" was therefore **incomplete**, and the local-VM harness couldn't
+  catch it (the test VM runs no Docker). Fix on the live host: `systemctl restart docker`
+  re-adds its `ip nat`/`ip filter` (egress restored; coexists fine with base's `inet filter`).
+  On REBOOT it self-heals (dockerd re-adds nat on boot; `forward accept` doesn't block — unlike
+  the 2026-06-17 `forward drop` incident). → (1) any cutover/runbook applying base firewall to a
+  Docker host MUST `restart docker` + check container egress after the apply; (2) the pending
+  `docker_host` nftables integration should own re-adding/persisting Docker's rules so base's
+  `flush` is safe; (3) the firewall final-review checklist should include "does the host run
+  Docker/libvirt? the flush wipes their nat."
+
 ---
 
 ## Kaizen reviews — decisions ledger
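
Follow-ups (1) and (3) in the new entry both reduce to one question after an apply: is Docker's `ip nat` table, with its masquerade rule, still in the ruleset? A minimal sketch of that check follows. It assumes the input is the text printed by `nft list ruleset`, with top-level tables opening as `table <family> <name> {` and closing with `}` in column 0. The function names and both sample rulesets are hypothetical and hand-written for illustration; they are not captured from ubongo and not part of this change.

```python
import re

# Hypothetical helpers. The table names `ip nat` and `inet filter` come from
# the FRICTION entry; the text layout is assumed to match `nft list ruleset`.


def list_tables(ruleset: str) -> set:
    """Return (family, name) for each top-level `table <family> <name> {` line."""
    pattern = re.compile(r"^table\s+(\S+)\s+(\S+)\s*\{", re.MULTILINE)
    return {(m.group(1), m.group(2)) for m in pattern.finditer(ruleset)}


def docker_nat_present(ruleset: str) -> bool:
    """True if an `ip nat` table exists and contains a masquerade rule."""
    # Top-level tables close with `}` in column 0 and nested chains are
    # indented, so the non-greedy match stops at the end of the `ip nat` table.
    block = re.search(r"^table ip nat \{(.*?)^\}", ruleset, re.MULTILINE | re.DOTALL)
    return block is not None and "masquerade" in block.group(1)


# Illustrative sample: only base's table survives the `flush ruleset`.
AFTER_APPLY = """table inet filter {
    chain input {
        type filter hook input priority filter; policy drop;
    }
    chain forward {
        type filter hook forward priority filter; policy accept;
    }
}
"""

# Illustrative sample: Docker's nat table is back alongside base's table.
AFTER_RESTART = AFTER_APPLY + """table ip nat {
    chain POSTROUTING {
        type nat hook postrouting priority srcnat; policy accept;
        oifname != "docker0" ip saddr 172.17.0.0/16 counter masquerade
    }
}
"""

if __name__ == "__main__":
    for label, text in (("after apply", AFTER_APPLY), ("after restart", AFTER_RESTART)):
        print(label, sorted(list_tables(text)), docker_nat_present(text))
```

On a live host the ruleset text could come from `subprocess.run(["nft", "list", "ruleset"], capture_output=True, text=True, check=True).stdout`. A passing table check does not prove container egress works, so the container `ping` named in the entry remains the real test.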