Two bugs caught by the live make check/deploy on askari:
- include_tasks with a tag selects the include but NOT its tasks, so --tags hardening
ran nothing. Use apply: {tags:} to propagate (also fixed the firewall include).
- fail2ban service start + restart handler fail in a first-run --check (package not
installed yet); guard both with when: not ansible_check_mode so check is clean.
Applied to askari: SSH hardened, fail2ban active, ping still works (no lockout).
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Add explicit base__ssh_authorised_keys: [] default to prevent
undefined-var errors in Molecule. Extend verify.yml with sshd
drop-in validation, PasswordAuthentication check, and fail2ban
jail assertion. Pre-create /run/sshd in ssh.yml so sshd -t
works in containers before the service has ever started.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Bare 'nft list ruleset' has no leading flush, so the timer's 'nft -f rollback'
was a no-op on first apply (empty file) and errored ('table exists') on later
applies — the auto-rollback silently did nothing, defeating the askari lockout
safeguard. Prepend 'flush ruleset' so the revert is atomic + self-contained.
Verified the snapshot->lockout->revert round-trip in an isolated netns.
Also fix stale STATUS prose (base is partially built, not absent).
established/related keeps the in-flight session alive across the swap, so the
prior 'next task runs' confirm always passed even if new connections were
bricked — the rollback was theater. reset_connection + wait_for_connection now
force a fresh handshake through the new ruleset; failure aborts the play and the
armed timer reverts. (meta: reset_connection ignores 'when' — benign extra
reconnect on no-op runs; verified idempotent in molecule.)