ADR-002 baseline (key-only, no root, fail2ban 5/1h) as two base task files under the existing 'hardening' concern tag; applied to askari by tag (NOT the host firewall — that's mesh-gated to avoid lockout; Hetzner Cloud Firewall is the perimeter until M5). NetBird agent deferred to M4. Adds a LIMIT=/TAGS= passthrough to make check/deploy. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
base SSH hardening + fail2ban (M3) Implementation Plan
For agentic workers: REQUIRED SUB-SKILL: Use superpowers:subagent-driven-development (recommended) or superpowers:executing-plans to implement this plan task-by-task. Steps use checkbox (- [ ]) syntax for tracking.
Goal: Add SSH-hardening + fail2ban concerns to the base role (ADR-002 baseline) and apply them to askari — without locking anything out.
Architecture: Two new base task files (ssh.yml, fail2ban.yml), both under the existing hardening concern tag, included after firewall.yml. Applied to askari by tag (hardening) so the host firewall (default-deny) is NOT applied pre-mesh — the Hetzner Cloud Firewall remains askari's perimeter until M5. A LIMIT=/TAGS= passthrough on make check/deploy enables the targeted apply.
Tech Stack: Ansible (ansible.builtin, ansible.posix.authorized_key — already vendored), sshd drop-in config, fail2ban.
Spec: docs/superpowers/specs/2026-06-14-base-ssh-fail2ban-m3-design.md
Execution context: Tasks 1–3 author + Molecule (Docker available). Task 4 applies to live askari (gated; reachable from ubongo). No new billed resources.
Task 1: make check/deploy LIMIT + TAGS passthrough
Files: Modify Makefile (the check and deploy recipes).
- Step 1: In the check: recipe, change the command line to:
$(PLAYBOOK_BIN) $(INVENTORY) $(VAULT_ARGS) $(if $(LIMIT),--limit $(LIMIT)) $(if $(TAGS),--tags $(TAGS)) --check --diff playbooks/$(PLAYBOOK).yml
- Step 2: In the deploy: recipe, change the command line to:
$(PLAYBOOK_BIN) $(INVENTORY) $(VAULT_ARGS) $(if $(LIMIT),--limit $(LIMIT)) $(if $(TAGS),--tags $(TAGS)) playbooks/$(PLAYBOOK).yml
- Step 3: Add help lines noting that [LIMIT=<host>] [TAGS=<tags>] are optional on check/deploy.
- Step 4: Sanity-check that it parses:
make check PLAYBOOK=dns LIMIT=control TAGS=public_dns 2>&1 | tail -2
(should run check mode scoped to control). Expected: no make/syntax error.
- Step 5: Commit:
git add Makefile
git commit -m "feat(make): optional LIMIT= and TAGS= passthrough on check/deploy"
(append Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>)
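The passthrough in Steps 1 and 2 emits each flag only when its variable is non-empty, so a plain make check behaves as before. The shell sketch below mirrors that logic to make the three cases concrete; build_scope_args is a hypothetical helper used only for illustration and is not part of the repo or the Makefile.

```shell
#!/bin/sh
# Sketch only: build_scope_args is a hypothetical helper, not repo code.
# It mirrors $(if $(LIMIT),--limit $(LIMIT)) $(if $(TAGS),--tags $(TAGS)):
# each flag is emitted only when its variable is non-empty.
build_scope_args() {
  limit="$1"
  tags="$2"
  args=""
  if [ -n "$limit" ]; then
    args="--limit $limit"
  fi
  if [ -n "$tags" ]; then
    # ${args:+$args } prepends the existing flags plus a space, if any.
    args="${args:+$args }--tags $tags"
  fi
  printf '%s\n' "$args"
}

build_scope_args askari hardening   # LIMIT and TAGS both set
build_scope_args "" hardening       # TAGS only
build_scope_args "" ""              # neither set: no extra flags
```

When both variables are empty the helper prints an empty line, which corresponds to the original, unscoped ansible-playbook invocation.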
Task 2: base hardening concern — ssh + fail2ban
Files: Create roles/base/tasks/ssh.yml, roles/base/tasks/fail2ban.yml, roles/base/templates/sshd_hardening.conf.j2, roles/base/templates/fail2ban_sshd.local.j2; modify roles/base/tasks/main.yml, roles/base/defaults/main.yml, roles/base/handlers/main.yml, inventories/production/group_vars/all/vars.yml.
- Step 1: Append to roles/base/defaults/main.yml:
# SSH hardening + fail2ban (ADR-002) — `hardening` concern.
base__ssh_password_authentication: "no"
base__ssh_permit_root_login: "no"
base__fail2ban_maxretry: 5
base__fail2ban_bantime: 1h
base__fail2ban_findtime: 10m
# base__ssh_authorised_keys lives in group_vars/all/vars.yml (per-person control keys).
- Step 2: Create roles/base/templates/sshd_hardening.conf.j2:
# Managed by Ansible (base role, ADR-002). Do not edit on the host.
PasswordAuthentication {{ base__ssh_password_authentication }}
PermitRootLogin {{ base__ssh_permit_root_login }}
PubkeyAuthentication yes
KbdInteractiveAuthentication no
- Step 3: Create roles/base/templates/fail2ban_sshd.local.j2:
# Managed by Ansible (base role, ADR-002).
[sshd]
enabled = true
maxretry = {{ base__fail2ban_maxretry }}
bantime = {{ base__fail2ban_bantime }}
findtime = {{ base__fail2ban_findtime }}
- Step 4: Create roles/base/tasks/ssh.yml:
---
- name: Ensure openssh-server is installed
  ansible.builtin.apt:
    name: openssh-server
    state: present
    update_cache: true
- name: Render hardened sshd drop-in
  ansible.builtin.template:
    src: sshd_hardening.conf.j2
    dest: /etc/ssh/sshd_config.d/10-boma.conf
    owner: root
    group: root
    mode: "0644"
  notify: reload sshd
- name: Validate the full sshd config (drop-in included)
  ansible.builtin.command: sshd -t
  changed_when: false
- name: Authorise control SSH keys for the ansible user
  ansible.posix.authorized_key:
    user: "{{ ansible_user | default('ansible') }}"
    key: "{{ base__ssh_authorised_keys | join('\n') }}"
    exclusive: true
  # default([]) keeps the task skippable where the variable is not defined (e.g. Molecule).
  when: base__ssh_authorised_keys | default([]) | length > 0
- Step 5: Create roles/base/tasks/fail2ban.yml:
---
- name: Install fail2ban
  ansible.builtin.apt:
    name: fail2ban
    state: present
    update_cache: true
- name: Configure the sshd jail
  ansible.builtin.template:
    src: fail2ban_sshd.local.j2
    dest: /etc/fail2ban/jail.d/sshd.local
    owner: root
    group: root
    mode: "0644"
  notify: restart fail2ban
- name: Enable and start fail2ban
  ansible.builtin.service:
    name: fail2ban
    enabled: true
    state: started
- Step 6: Replace roles/base/handlers/main.yml:
---
- name: Reload sshd
  listen: reload sshd
  ansible.builtin.service:
    name: ssh
    state: reloaded
- name: Restart fail2ban
  listen: restart fail2ban
  ansible.builtin.service:
    name: fail2ban
    state: restarted
- Step 7: In roles/base/tasks/main.yml, add after the firewall include:
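# import_tasks is static, so the tasks inside each file inherit the tag;
# a tag on include_tasks applies only to the include statement itself.
- name: SSH hardening
  ansible.builtin.import_tasks: ssh.yml
  tags: [hardening]
- name: fail2ban intrusion deterrence
  ansible.builtin.import_tasks: fail2ban.yml
  tags: [hardening]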
- Step 8: In inventories/production/group_vars/all/vars.yml, set base__ssh_authorised_keys (replace the empty []):
base__ssh_authorised_keys:
- "ssh-ed25519 AAAAC3NzaC1lZDI1NTE5AAAAIKSx1TFLJ9H8vCe5ZJSu7MYmAiH0/OC8evloQjGR0Bqw claude@ubongo"
- Step 9: Run make lint; expect 0 failure(s) + check-tags: OK (the hardening tag is already in tests/tags.yml).
- Step 10: Commit:
git add roles/base inventories/production/group_vars/all/vars.yml
git commit -m "feat(base): ssh hardening + fail2ban (hardening concern, ADR-002)"
(Co-Authored-By trailer)
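The jail defaults from Step 1 (maxretry 5, findtime 10m, bantime 1h) describe a sliding-window threshold: a source is banned once it accumulates maxretry failures inside findtime. The sketch below is a simplified model of that rule, not fail2ban's actual implementation; should_ban and the timestamps are hypothetical and exist only to illustrate how the three values interact.

```shell
#!/bin/sh
# Simplified model of the jail threshold (assumption: ban when the number of
# failures inside the findtime window reaches maxretry).
# Usage: should_ban MAXRETRY FINDTIME_SECONDS NOW FAILURE_TIMESTAMP...
should_ban() {
  maxretry="$1"
  findtime="$2"
  now="$3"
  shift 3
  recent=0
  for ts in "$@"; do
    # Count only failures that happened within the last findtime seconds.
    if [ "$ts" -le "$now" ] && [ $((now - ts)) -le "$findtime" ]; then
      recent=$((recent + 1))
    fi
  done
  [ "$recent" -ge "$maxretry" ]
}

# Five failures within 600 s (10m) of "now" (illustrative epoch offsets).
if should_ban 5 600 1000 500 600 700 800 900; then
  echo "ban for 1h"
else
  echo "no ban"
fi
```

In this model a failure older than findtime drops out of the count, so four recent failures plus one stale one would not trigger a ban.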
Task 3: Molecule coverage
Files: Modify roles/base/molecule/default/converge.yml, roles/base/molecule/default/verify.yml.
- Step 1: In converge.yml, the role already runs with base__firewall_apply: false. Leave base__ssh_authorised_keys unset (defaults to [] → the authorized_key task is skipped, no test user needed). No converge change is needed unless vars are missing; confirm the play still has roles: [base].
- Step 2: Append assertions to verify.yml (after the existing firewall checks):
- name: sshd drop-in present and config valid
  ansible.builtin.command: sshd -t
  changed_when: false
  tags: [verify]
- name: PasswordAuthentication is disabled
  ansible.builtin.command: grep -q '^PasswordAuthentication no' /etc/ssh/sshd_config.d/10-boma.conf
  changed_when: false
  tags: [verify]
- name: fail2ban sshd jail configured
  ansible.builtin.command: grep -q '^\[sshd\]' /etc/fail2ban/jail.d/sshd.local
  changed_when: false
  tags: [verify]
- Step 3: Run make test ROLE=base. Expected: converge installs openssh-server + fail2ban, renders the drop-ins, validates sshd, starts fail2ban; verify passes; idempotence clean. If the Molecule image lacks systemd-for-fail2ban or apt fails offline, capture the error (the image is systemd-enabled per molecule.yml).
- Step 4: Commit:
git add roles/base/molecule
git commit -m "test(base): Molecule coverage for ssh hardening + fail2ban"
(Co-Authored-By trailer)
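The grep-based assertions in verify.yml can be tried locally before running Molecule. The sketch below applies the same kind of check to a throwaway copy of the rendered template with its default values; check_dropin is a hypothetical helper, and the additional PermitRootLogin check is an assumption that goes slightly beyond what verify.yml asserts.

```shell
#!/bin/sh
# Sketch only: check_dropin is a hypothetical helper, not part of the repo.
# It succeeds when the given file disables both password and root login.
check_dropin() {
  file="$1"
  grep -q '^PasswordAuthentication no' "$file" &&
    grep -q '^PermitRootLogin no' "$file"
}

# Throwaway copy of sshd_hardening.conf.j2 as rendered with the role defaults.
tmp="$(mktemp)"
cat > "$tmp" <<'EOF'
# Managed by Ansible (base role, ADR-002). Do not edit on the host.
PasswordAuthentication no
PermitRootLogin no
PubkeyAuthentication yes
KbdInteractiveAuthentication no
EOF

if check_dropin "$tmp"; then
  echo "drop-in OK"
else
  echo "drop-in FAILED"
fi
rm -f "$tmp"
```

This checks only the text of the drop-in; sshd -t remains the authority on whether the combined configuration is valid.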
Task 4: Apply to askari (gated — live host)
Runs against live askari (reachable from ubongo). Precondition: rbw unlocked. Applies ONLY the hardening concern (--tags hardening) so the host firewall is not touched.
- Step 1: Dry-run.
make check PLAYBOOK=site LIMIT=askari TAGS=hardening
Review: openssh-server present, sshd drop-in (PasswordAuthentication no, PermitRootLogin no), authorized_key for ansible, fail2ban installed + sshd jail. Confirm NO firewall tasks appear.
- Step 2: Apply.
make deploy PLAYBOOK=site LIMIT=askari TAGS=hardening
Expect changed for the drop-in and the fail2ban install/config; failed=0.
- Step 3: Verify SSH still works (lock-out guard).
.venv/bin/ansible offsite_hosts -m ping → pong.
.venv/bin/ansible offsite_hosts -b -m command -a 'sshd -t' → rc=0.
- Step 4: Verify fail2ban.
.venv/bin/ansible offsite_hosts -b -m command -a 'fail2ban-client status sshd' → shows the sshd jail active.
- Step 5: Idempotence. Re-run Step 2 → changed=0.
- Step 6: No repo commit (configures the host, not the repo).
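The changed=0 and failed=0 expectations in Steps 2 and 5 can also be checked mechanically against the play recap. The sketch below assumes the usual ansible-playbook recap line shape (host : ok=N changed=N unreachable=N failed=N ...); recap_is_clean is a hypothetical helper, and the sample line uses invented counts purely for illustration.

```shell
#!/bin/sh
# Sketch only: recap_is_clean is a hypothetical helper, not part of the repo.
# Reads ansible-playbook output on stdin and succeeds only if at least one
# recap line was seen and every recap line reports changed=0, unreachable=0
# and failed=0.
recap_is_clean() {
  awk '
    /ok=[0-9]+/ && /changed=[0-9]+/ {
      seen = 1
      if ($0 !~ /changed=0([^0-9]|$)/) bad = 1
      if ($0 !~ /unreachable=0([^0-9]|$)/) bad = 1
      if ($0 !~ /failed=0([^0-9]|$)/) bad = 1
    }
    END { if (seen && !bad) exit 0; exit 1 }
  '
}

# Illustrative recap line (counts are made up for the demo).
sample='askari : ok=9 changed=0 unreachable=0 failed=0 skipped=2 rescued=0 ignored=0'
if printf '%s\n' "$sample" | recap_is_clean; then
  echo "idempotent"
else
  echo "not idempotent"
fi

# Possible use for Step 5 (not run here):
#   make deploy PLAYBOOK=site LIMIT=askari TAGS=hardening | recap_is_clean
```

Treating missing recap output as a failure is a deliberate choice in this sketch, so that a run that aborts before the recap is not mistaken for a clean one.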
Task 5: Docs
Files: Modify STATUS.md, docs/ROADMAP.md.
- Step 1: In STATUS.md, update the roles/base/ row (under "Scaffolded but empty"/partial) to note that the hardening concern (ssh + fail2ban) is now built and applied to askari; the firewall concern is still pending application (mesh-gated). If askari's row exists in "Real and working today," append "SSH hardened + fail2ban (M3)".
- Step 2: In docs/ROADMAP.md, mark M3 as done (ssh + fail2ban built + applied to askari; NetBird agent deferred to M4; host firewall + ubongo hardening at M5).
- Step 3: Run make lint; commit:
git add STATUS.md docs/ROADMAP.md
git commit -m "docs(base): M3 — ssh hardening + fail2ban applied to askari; STATUS + roadmap"
(Co-Authored-By trailer)
Self-Review (completed)
- Spec coverage: ssh + fail2ban concerns under hardening (Decision 1) → Task 2; apply-by-tag, no firewall (Decision 2) → Task 4 (TAGS=hardening); base__ssh_authorised_keys populated (Decision 3) → Task 2 Step 8; LIMIT/TAGS passthrough (Decision 4) → Task 1; ADR-002 controls (key-only, no root, fail2ban 5/1h) → Task 2; Molecule + live verify (testing) → Tasks 3, 4. Deferrals (agent/M4, host-fw+ubongo/M5, auditd/Phase 2) honoured.
- Placeholder scan: none; all task/template/handler content is concrete.
- Name consistency: base__ssh_* / base__fail2ban_* / base__ssh_authorised_keys are used identically across defaults, templates, tasks, and group_vars; handler listen-topics (reload sshd, restart fail2ban) match the notify: strings.
- Lock-out guard: sshd hardening only disables password+root (we use key+sudo); the ansible user's key is preserved (base__ssh_authorised_keys has it); sshd -t validates before reload; firewall untouched (--tags hardening). Task 4 verifies SSH post-apply.