diff --git a/CLAUDE.md b/CLAUDE.md
index f45ae4d..9b3cd93 100644
--- a/CLAUDE.md
+++ b/CLAUDE.md
@@ -187,6 +187,7 @@ Single-contributor, trunk-based (no merge requests / approval gates):
 | Topic | File |
 |------------------------|---------------------------------------|
 | Architecture overview | `docs/decisions/001-architecture.md` |
+| Capabilities overview (what boma does) | `docs/capabilities.md` |
 | Security baseline & strategy | `docs/decisions/002-security.md` |
 | Accepted security risks | `docs/security/accepted-risks.md` |
 | Per-service security checklist | `docs/security/service-checklist.md` |
diff --git a/docs/capabilities.md b/docs/capabilities.md
new file mode 100644
index 0000000..c128b11
--- /dev/null
+++ b/docs/capabilities.md
@@ -0,0 +1,143 @@
+# boma capabilities overview
+
+A high-level, **living** map of what boma is intended to do — organised by capability
+domain, decided on boma's own terms. This is the frame of reference for "what does
+what" alignment, for judging which tools/plugins are relevant, and (later) for
+planning what runs on which node. It is **intent**, not status — see `STATUS.md` for
+what actually exists.
+
+Decided fresh per **ADR-013** (V4 is not a source of scope); a V4 completeness check
+is recorded at the bottom. Individual service *picks* (e.g. Jellyfin vs Plex) and
+node placement are **surfaced here, not resolved here** — they are downstream
+decisions this frame enables.
+
+## Legend
+
+- **Tier** — **P** platform (others depend on it) · **A** user-facing app · **S** supporting
+- **Commitment** — **core** (foundational, committed) · **planned** (intended) ·
+  **candidate** (wanted, option not settled) · **maybe-later** (nice to have)
+- ⚠️ = interacts with an existing ADR; reconcile before building.
+
+---
+
+## 1. Edge & networking — [P]
+
+| Capability | Candidate service(s) | Tier | Commitment | What it does | Notes / open |
+|---|---|---|---|---|---|
+| Reverse proxy / TLS | Traefik | P | core | Edge routing + ACME certs for everything exposed | Spin-up order names it (TODO 12) |
+| Internal DNS | `dns` role → dns1/dns2 | P | core | Authoritative internal zone (ADR-007) | Ansible-rendered zone |
+| VPN / remote access | Netbird · *or* OPNsense WireGuard | P | candidate | Secure remote access to `srv`/`mgmt` | ⚠️ ADR-007 commits WireGuard-via-OPNsense; Netbird (mesh) is a real alternative to weigh |
+| Service portal / dashboard | Homepage | A | candidate | One landing page listing all services — a "what does what" front door | Gap surfaced by V4; fits boma's legibility goal |
+
+_(DHCP, firewall, mDNS reflection live on OPNsense — Ansible-managed, not containers.)_
+
+## 2. Identity & access — [P]
+
+| Capability | Candidate service(s) | Tier | Commitment | What it does | Notes / open |
+|---|---|---|---|---|---|
+| SSO / identity provider | Authentik | P | planned | Central auth / forward-auth for exposed services | Named in spin-up order (TODO 12) |
+| Secrets / password vault | Vaultwarden | P | core | Personal vault; also holds the Ansible vault master password | Already used by `rbw` (ADR-002) |
+
+## 3. Observability — [P]
+
+| Capability | Candidate service(s) | Tier | Commitment | What it does | Notes / open |
+|---|---|---|---|---|---|
+| Metrics | Prometheus | P | planned | Time-series metrics + alert rules | TODO 3.6 |
+| Logs | Loki | P | planned | Log aggregation | TODO 3.6 |
+| Dashboards | Grafana | P | planned | Visualisation + alerting | TODO 3.6 |
+| Uptime checks | Uptime Kuma | P | planned | Endpoint up/down checks | TODO 3.6 |
+| External watchdog | askari (Hetzner VPS) | P | core | Off-site monitoring that survives a homelab outage | ADR-007 |
+| Notify / alerting | ntfy · Matrix · email (multi-channel) | S | planned | Deliver alerts to the user across channels | TODO 9; Matrix homeserver in §8 |
+| Metric exporters | node_exporter, cAdvisor, … | S | planned | Feed Prometheus | per host/service |
+
+## 4. Source & CI — [P/A]
+
+| Capability | Candidate service(s) | Tier | Commitment | What it does | Notes / open |
+|---|---|---|---|---|---|
+| Git hosting | Forgejo | P/A | core | Repo hosting + off-machine backup | **Live** (ADR-010) |
+| CI runner | Forgejo Actions (`act_runner`) | P | planned | Pipelines (lint/test/deploy) | ADR-010 / ADR-008 |
+
+## 5. Home automation & IoT — [A]
+
+| Capability | Candidate service(s) | Tier | Commitment | What it does | Notes / open |
+|---|---|---|---|---|---|
+| Automation hub | Home Assistant | A | planned | IoT controller; bridges `srv`↔`iot` | ADR-007 (`10.20.0.13`) |
+| Device messaging | MQTT broker (Mosquitto) | S | candidate | Messaging backbone for HA/devices | Only if devices need it |
+| Radio / firmware bridges | Zigbee2MQTT · ESPHome | S | maybe-later | Zigbee/ESP device integration | Hardware-dependent |
+
+## 6. Media — [A]
+
+| Capability | Candidate service(s) | Tier | Commitment | What it does | Notes / open |
+|---|---|---|---|---|---|
+| Media server | Jellyfin · *vs* Plex | A | candidate | Stream media to the household | Jellyfin = FOSS default |
+| Library automation | \*arr stack (Sonarr/Radarr/…) | A | candidate | Acquire/organise media | ADR-011 example |
+| Download client | qBittorrent (+ VPN) | S | candidate | Fetching | Egress isolation needed |
+| Indexer helper | FlareSolverr | S | candidate | CAPTCHA/Cloudflare for indexers | ADR-011 example |
+| Books / audiobooks / podcasts | Audiobookshelf · Calibre-Web · (Readarr) | A | candidate | Ebook + audiobook + podcast library | Gap surfaced by V4; video-only was too narrow |
+| Download VPN egress | Gluetun | S | candidate | Routes the download client through a commercial VPN | Pairs with qBittorrent |
+
+## 7. Personal cloud & files — [A]
+
+| Capability | Candidate service(s) | Tier | Commitment | What it does | Notes / open |
+|---|---|---|---|---|---|
+| Files / sync | Nextcloud | A | planned | Personal cloud, file sync & share; CalDAV/CardDAV (contacts & calendar) | ADR-011 |
+| Photos | PhotoPrism · *vs* Immich | A | candidate | Photo library + phone backup | ADR-011 lists PhotoPrism; Immich is the modern alt |
+| Office documents | Collabora Online | A | candidate | In-browser document editing for Nextcloud | Gap surfaced by V4 |
+| LAN file shares | Samba · NFS | S | candidate | Raw SMB/NFS shares (distinct from Nextcloud sync) | Gap surfaced by V4; only if a direct-share need exists |
+
+## 8. Communications — [A]
+
+| Capability | Candidate service(s) | Tier | Commitment | What it does | Notes / open |
+|---|---|---|---|---|---|
+| Real-time chat | Matrix homeserver (Synapse · *vs* Conduit) | A | planned | Self-hosted messaging; also an alert route | Stateful + internet-facing → careful exposure, own `SECURITY.md` |
+| Bridges | mautrix-* | S | maybe-later | Bridge other networks into Matrix | After the homeserver is stable |
+| Self-hosted email | Poste.io · Mailcow | A | maybe-later | Run boma's own mail server | ⚠️ Deliverability + security are heavy; V4 ran one — re-justify hard before committing |
+
+## 9. Data & backup — [P/S]
+
+| Capability | Candidate service(s) | Tier | Commitment | What it does | Notes / open |
+|---|---|---|---|---|---|
+| Databases | Postgres/MariaDB — central *vs* per-app | P | candidate | Backing store for stateful apps | Open: central server vs per-service (TODO 3.9) |
+| Backup engine | Proxmox Backup Server · restic | P | planned | VM backups (PBS) + file/DB dumps (restic) | TODO 3.8 |
+| Off-site target | pCloud | S | planned | Off-site copy of backups (3-2-1) | |
+| Air-gap target | USB hard drives | S | maybe-later | Periodic cold/air-gapped copy | Manual rotation |
+
+## 10. Operations & support — [S]
+
+| Capability | Candidate service(s) | Tier | Commitment | What it does | Notes / open |
+|---|---|---|---|---|---|
+| Update watcher | DIUN | S | planned | New-image alerts driving the update process | ADR-011 |
+| Scheduled jobs | `scheduled_jobs` role + `claude -p` jobs | S | planned | Declarative cron: `/review-repo`, security/capacity reviews, sanity checks | TODO 8 |
+| Sanity / smoke | whoami + health checks | S | planned | Verification endpoints + "is it actually working" checks | ADR-011 / TODO 8.2 |
+
+---
+
+## V4 completeness check
+
+Run against AnsibleBaobabV4's role set per ADR-013/014 — used **only** as a coverage
+check, not a source of scope. Each finding was re-justified on boma's terms before it
+changed anything here.
+
+**Strong alignment (confirms the fresh frame).** Most of boma's picks correspond to
+services V4 actually ran: Traefik, DNS, Vaultwarden, Forgejo (+runner), Prometheus,
+Grafana, Grafana-Alloy, Loki, Uptime Kuma, exporters, ntfy, DIUN, Nextcloud (+db),
+PhotoPrism (+db), Jellyfin, the \*arr stack, qBittorrent, FlareSolverr, Home Assistant,
+a Matrix homeserver (V4: Conduwuit) + Element web client, WireGuard, backups (incl.
+cloud). That overlap is reassuring — these were arrived at independently and match.
+
+**Gaps it surfaced (added above as `candidate`/`maybe-later`, re-justified):** books/
+audiobooks/podcasts library; Gluetun download-VPN egress; Collabora office docs;
+Samba/NFS LAN shares; CalDAV/CardDAV via Nextcloud; a service portal (Homepage);
+self-hosted email (parked maybe-later — heavy). Also noted: a Matrix deployment needs
+a **client** (Element web), not just the homeserver.
+
+**Long-tail surfaced but parked (maybe-later, not added as rows):** local self-hosted
+AI/LLM, a game server (Minecraft), generic static-site hosting. Plausible someday;
+none are committed.
+
+**Confirmed exclusions (V4 had them; boma deliberately does not).** V4 mixed in a lot
+of **workstation/desktop** config — XFCE/GNOME desktops, kiosk mode, nvim/kitty/tmux,
+LibreOffice, antivirus, remote desktop. boma is **server-only**, so these are correctly
+absent. Likewise the removed Knowledge domain (Discourse, Snipe-IT, MRBS booking) and
+V4-specific project websites — out of boma's scope by design. The narrower surface is
+intentional, not an oversight.