Project status — what's real vs planned

This repo is partly aspirational: the ADRs in docs/decisions/ describe the intended design, and some of it is not built yet. This file is the ground truth. Before relying on a role, provider, or pipeline existing, check here. If something is listed as "designed, not built", do not assume it works.

Last reviewed: 2026-05-30.

Real and working today

| Thing | State |
| --- | --- |
| `playbooks/bootstrap.yml` | Works; self-contained (installs Python, creates the ansible user + sudoers) |
| `scripts/tf_to_inventory.py` | Works; stdlib only; `terraform output -json` → `hosts.yml` |
| `.docker/molecule-debian13/Dockerfile` | Present; custom Molecule test image (ADR-008) |
| `docs/decisions/*`, `docs/runbooks/*` | Current and mutually reconciled |
| `Makefile`, lint config (`.ansible-lint`, `.yamllint`), `.gitignore` | Present and used |
| git | Initialized, trunk-based on main, pushed to origin (forgejo.nyumbani.baobab.band:7577). |
| Pre-commit hooks | Configured: lint, gitleaks, vault-encryption guard. Activate with `pre-commit install` after `make setup`. |
| Vault password client | `scripts/vault-pass-client.sh` fetches the master password from Vaultwarden via rbw (wired as `vault_password_file`). Requires rbw installed + `rbw unlock`. |
| `/review-repo` | Repo audit: `scripts/repo-scan.py` (Phase 0) + `.claude/commands/review-repo.md`, reports to `docs/reviews/`. On-demand only; cron + email deferred (`docs/TODO.md`). |
| Terraform HCL (`terraform/`) | Written (proxmox VM module + envs) but never run; see below |
| `docs/hardware/reference.md` + `scripts/capacity-scan.py` | Present; reference doc (skeleton until real hardware) + stdlib scan; emits capacity JSON |
| `/capacity-review` | Works; on-demand capacity evaluation → `docs/hardware/reviews/`. Intent-based (no live usage yet) |
| ADR-002 security strategy + `docs/security/{accepted-risks,service-checklist}.md` | Present; threat model, principles, governance frame; checklist + risk register are docs, enforced manually in review |
| Service-role standard + per-service SECURITY.md convention | Defined (ADR-004 + `docs/security/service-security-template.md`); not yet applied, since no service roles exist |
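
For orientation, here is a minimal sketch of the kind of conversion `scripts/tf_to_inventory.py` is described as doing, using only the standard library. It is not the real script. The output name `vms`, the host-name-to-IP shape of its value, and the function name `inventory_from_tf_output` are all assumptions made for illustration. The only format fact relied on is that `terraform output -json` wraps each output in an object with a `value` key.

```python
import json


def inventory_from_tf_output(tf_json, output_name="vms"):
    """Render a minimal Ansible YAML inventory from `terraform output -json` text.

    Assumption (hypothetical): the named output maps host name -> IP address.
    """
    outputs = json.loads(tf_json)
    # Each Terraform output is wrapped as {"value": ..., "type": ..., "sensitive": ...}.
    hosts = outputs.get(output_name, {}).get("value", {})

    lines = ["all:"]
    if not hosts:
        # No hosts yet: emit an empty host map, like the current stubs.
        lines.append("  hosts: {}")
    else:
        lines.append("  hosts:")
        for name in sorted(hosts):
            lines.append(f"    {name}:")
            # json.dumps yields a double-quoted scalar, which is also valid YAML.
            lines.append(f"      ansible_host: {json.dumps(hosts[name])}")
    return "\n".join(lines) + "\n"
```

Because the standard library has no YAML writer, the sketch builds the text by hand and keeps to a shape simple enough that quoting is the only concern.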

Scaffolded but empty — NOT implemented

| Thing | State |
| --- | --- |
| `roles/base/` | Not in git; only an empty, untracked directory on disk. `site.yml` references it, so `make deploy PLAYBOOK=site` errors on a clean clone until the role is built. |
| `roles/docker_host/` | Not in git; same situation as `roles/base/`. |
| `inventories/*/hosts.yml` | Structured stubs with empty host maps (`hosts: {}`); regenerated by `make tf-inventory` once Terraform has hosts |
| `inventories/production/group_vars/{docker_hosts,proxmox_hosts}/` | Empty dirs |

So make deploy PLAYBOOK=site currently fails on a clean clone — the base and docker_host roles it calls do not exist yet.
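
A small preflight check can surface this before a deploy is attempted. The sketch below is not part of the repo. The function name `missing_roles` is hypothetical, and it assumes a role counts as built once `roles/<name>/tasks/main.yml` exists, which is the conventional Ansible layout but not the only valid one.

```python
from pathlib import Path


def missing_roles(repo_root, role_names):
    """Return the role names that have no roles/<name>/tasks/main.yml under repo_root.

    Assumption: a role is "built" once its tasks/main.yml file exists.
    """
    root = Path(repo_root)
    return [
        name
        for name in role_names
        if not (root / "roles" / name / "tasks" / "main.yml").is_file()
    ]


if __name__ == "__main__":
    # The two roles that site.yml is said to call.
    print(missing_roles(".", ["base", "docker_host"]))
```

On a clone in the state described above, both role names would be expected to come back as missing.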

Designed but not built

| Thing | Designed in | Notes |
| --- | --- | --- |
| dns role (renders the internal zone) | ADR-007 / ADR-009 | Does not exist. Internal DNS ownership is assigned to it by design. |
| Terraform actually provisioning | ADR-006 / ADR-009 | `terraform init` has never been run: no `.terraform.lock.hcl`, no state, no real `local.vms` entries |
| CI (Forgejo Actions) | ADR-003 / ADR-008 | Pipeline described; remote is live (pushed), but the Actions/act_runner pipeline is not yet built |
| Level 2 / 3 testing (staging, askari smoke) | ADR-008 | Depends on real VMs / askari, which don't exist yet |
| Per-service roles | ADR-004 | Model defined; no service roles built |
| Live usage stats for `/capacity-review` | ADR-012 / TODO 8.4 | `gather_usage()` stubbed; source undecided (Proxmox RRD vs PLG stack); needs the cluster |
| `/security-review` skill | ADR-002 / TODO 8.5 | Periodic posture re-check + accepted-risk re-challenge; planned, not built |
| CIS hardening (Debian L1+L2 + Docker) | ADR-002 / TODO 15 | To be implemented by the (unbuilt) base/docker_host roles; brings AppArmor + AIDE as baseline. L2 partitions affect VM provisioning (ADR-006) |
| Network IDS + security alerting | ADR-002 / TODO 15 | Suricata on OPNsense + AIDE/auditd/fail2ban alerting into the monitoring stack; not built |
| ubongo: physical control / AI-worker host | ADR-015 | Replaces the cluster control VM with a dedicated always-on x86 box outside the cluster. Decision recorded; box not yet acquired/installed, not in inventory. |
| NetBird mesh: coordinator on askari | ADR-016 | Self-hosted NetBird control plane (management/signal/relay) on askari; replaces ADR-007 WireGuard. Decision recorded; not deployed (askari + service-role machinery not built). |
| NetBird agent enrollment in base | ADR-016 | Every Linux host joins the mesh via the base role (setup keys in vault); SSH allowed only on wt0. Designed; base role not built. |
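
The "never run" state of the Terraform row can be confirmed from disk. The sketch below is illustrative only. The function name `terraform_evidence` is hypothetical, the environment directory is passed in rather than assumed, and the `terraform.tfstate` check only applies if the default local backend is in use, which this file does not state.

```python
from pathlib import Path


def terraform_evidence(env_dir):
    """Report which on-disk traces of `terraform init` / `terraform apply` exist in env_dir."""
    env = Path(env_dir)
    return {
        # Dependency lock file written by `terraform init`.
        "lock_file": (env / ".terraform.lock.hcl").is_file(),
        # Working directory holding downloaded providers and modules.
        "working_dir": (env / ".terraform").is_dir(),
        # Assumption: local backend, so state would live next to the HCL.
        "local_state": (env / "terraform.tfstate").is_file(),
    }
```

If every value is `False` for an environment directory, that matches the row above; any `True` value means this file is due for an update.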

Keeping this honest

Update this file whenever you build, stub, or remove something. It is the first place an AI tool or new contributor should look to learn what they can actually rely on. When a row moves from "designed" to "working", move it up — don't leave stale optimism here.