boma/docs/decisions/019-tagging.md

ADR-019 — Tagging standard for targeted, predictable runs

Status

Accepted (2026-06-06). Resolves TODO 3.7 ("Define a tagging standard that lets us target runs without over-tagging") and TODO 3.11 ("Deliberate tagging strategy").

Context

boma wants targeted playbook runs (a single service, a single layer, or a single cross-cutting concern) that are transparent and predictable: a reader should know from a --tags invocation exactly what it will and won't touch. CLAUDE.md already requires tag-filterable tasks, but no vocabulary or convention existed, and the TODO explicitly warns against the opposite failure mode: over-tagging.

Decision

Two-tier tagging

Tier 1 — role/service tag (mechanical). The tag equals the role name, applied once at the role-import level:

roles:
  - role: photoprism
    tags: [photoprism]

Ansible propagates it to every task in the role. Because one service = one role (ADR-004), this single rule covers both the layer/role and single-service targeting axes with zero per-task burden. Role-less lifecycle playbooks (e.g. bootstrap.yml) carry a single playbook-identity tag instead.

Tier 2 — concern tag (curated). A small closed list of cross-cutting concern tags, applied per-task/block only where a task genuinely belongs to that concern.

The closed concern list

A concern earns a tag only if it (a) appears in 2+ roles, (b) is worth running as a slice on its own, and (c) doesn't overlap confusingly with another.

| Tag | Covers |
| --- | --- |
| packages | apt package install/management |
| users | accounts, groups, sudo |
| firewall | nftables rulesets & port definitions (ADR-002) |
| hardening | security baseline: sshd config, fail2ban, auditd, sysctl |
| logging | Alloy / log-shipping config (ADR-018) |
| monitoring | metric exporters / health checks |
| config | render templated config/compose files to disk, no restart |
| deploy | bring services up / restart (compose up -d) |
| proxy | reverse-proxy + TLS registration (Caddy routes, Authentik) |

The config/deploy split lets you re-render and diff configuration (--tags config) without bouncing services, then restart deliberately (--tags deploy). backup and secrets are intentionally omitted until the roles needing them exist.

always / never

  • always — reserved for cheap preflight assertions (vault unlocked, OS is Debian 13, required vars present), so even --tags config runs its safety guards.
  • never — reserved for destructive/expensive opt-in tasks, each paired with a descriptive tag (e.g. tags: [never, force_pull]); they run only when named.

Predictability principle: tags are union-only

--tags a,b runs tasks tagged a OR b; Ansible has no native AND. boma therefore targets one axis at a time: either a role/service or a concern, never an intersection like "photoprism's firewall only." If that narrower slice is ever wanted, run the whole role with --tags photoprism instead; the run is idempotent and fast, so the extra tasks cost little. Designing for intersection is the over-tagging trap; we decline it on purpose.
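
The selection rules described here and under always / never can be modelled in a few lines of Python. This is a simplified sketch of how Ansible evaluates --tags, not boma code: the function name `would_run` is hypothetical, and --skip-tags and the special `tagged` / `untagged` selectors are deliberately left out.

```python
def would_run(task_tags, requested=("all",)):
    """Return True if a task carrying task_tags runs under --tags requested.

    Simplified model (assumption): no --skip-tags, no 'tagged'/'untagged'.
    The default ("all",) mirrors a run without any --tags option.
    """
    tags = set(task_tags)
    wanted = set(requested)
    if "always" in tags:
        # Preflight guards run no matter which tags were requested.
        return True
    if "all" in wanted and "never" not in tags:
        # Plain run: everything except opt-in (never) tasks.
        return True
    # Union semantics: one shared tag is enough, there is no AND.
    return bool(tags & wanted)


# A firewall task inside the photoprism role inherits the role tag.
photoprism_fw = ["photoprism", "firewall"]
# A firewall task inside another role (caddy is used as an example name).
caddy_fw = ["caddy", "firewall"]

# --tags photoprism,firewall selects BOTH tasks, not the intersection.
print(would_run(photoprism_fw, ["photoprism", "firewall"]))  # True
print(would_run(caddy_fw, ["photoprism", "firewall"]))       # True

# Opt-in tasks are skipped on a plain run and run only when named.
print(would_run(["never", "force_pull"]))                    # False
print(would_run(["never", "force_pull"], ["force_pull"]))    # True
```

The second print is the reason intersection targeting is declined: asking for a role tag and a concern tag together widens the run to every task carrying either one.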

Terraform / Proxmox VM tags (metadata only)

Every Terraform-managed VM carries exactly three Proxmox tags:

| Tag | Value | Purpose |
| --- | --- | --- |
| env | staging \| production | which environment |
| role/group | docker_hosts, proxmox_hosts, … | matches the inventory group |
| managed-by | terraform | distinguishes IaC VMs from hand-made ones |

These are pure metadata for transparency (glanceable in the Proxmox UI). They do not drive run-targeting and do not feed inventory — scripts/tf_to_inventory.py keeps building groups from the group output field, the single source of truth.

Enforcement

tests/tags.yml is the single source of truth for the allowed concern/special/opt-in/playbook tags. scripts/check-tags.py (run by make lint, covered by tests/test_check_tags.py) scans roles/ and playbooks/ and fails on any tag outside {role directory names} ∪ {tests/tags.yml entries}. Molecule scenario files (roles/*/molecule/**) are excluded from the scan because they are test orchestration, not the production run-targeting surface this standard governs. The script also checks that every role imported in a play's roles: block carries its own role name as a tag (additional tags are allowed).
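
The two checks can be illustrated with a minimal sketch. It is not the actual scripts/check-tags.py: the function names and input shapes are assumptions, and the functions take already-parsed data rather than reading roles/, playbooks/, or tests/tags.yml from disk.

```python
def unknown_tags(used_tags, role_names, approved_tags):
    """Tags that are neither a role directory name nor an approved entry."""
    allowed = set(role_names) | set(approved_tags)
    return sorted(set(used_tags) - allowed)


def roles_missing_own_tag(role_entries):
    """Names of role imports that do not carry their own role name as a tag.

    role_entries is the parsed `roles:` list of one play. An entry is
    either a bare string or a mapping with `role` (or `name`) and an
    optional `tags` value, which may be a string or a list.
    """
    missing = []
    for entry in role_entries:
        if isinstance(entry, str):
            name, tags = entry, []
        else:
            name = entry.get("role") or entry.get("name")
            tags = entry.get("tags", [])
            if isinstance(tags, str):
                tags = [tags]
        if name not in tags:
            missing.append(name)
    return missing


# Example data (hypothetical): "restart" is in neither set, so it fails.
print(unknown_tags(
    used_tags=["photoprism", "config", "restart"],
    role_names=["photoprism"],
    approved_tags=["config", "deploy"],
))  # ['restart']

# Extra tags are fine; a missing role-name tag is not.
print(roles_missing_own_tag([
    {"role": "photoprism", "tags": ["photoprism", "deploy"]},
    {"role": "caddy", "tags": ["proxy"]},
]))  # ['caddy']
```

Keeping both checks as pure functions over parsed data is a design choice of this sketch; it makes them easy to cover with unit tests without a repository checkout.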

Extending the vocabulary

To add a concern tag: (1) add it to tests/tags.yml; (2) add a row to the concern table above with a one-line justification showing it passes the three-part test for the closed list (appears in 2+ roles, worth running as a slice on its own, no confusing overlap with an existing tag). That is the whole gate: lightweight, but it leaves a paper trail.

Consequences

  • Targeted runs are predictable: only two kinds of tags exist, one of them mechanical.
  • Over-tagging is structurally resisted (closed list + lint enforcement).
  • Intersection targeting is unavailable by design.
  • Authors must keep role tags = role names. make lint enforces both the vocabulary (every tag is a known role name or approved tag) and that each role import in a roles: block carries its own role-name tag (extra tags allowed).

Related

ADR-002 (security baseline / firewall), ADR-004 (one service = one role), ADR-009 (TF↔Ansible handoff / inventory), ADR-018 (logging).