fix(tags): exclude molecule scenarios from tag scan; clarify ADR enforcement

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
docs(tags): ADR-019 + CLAUDE.md/TODO/CAPABILITIES (tagging standard)
2026-06-06 09:50:14 +02:00 · 2026-06-06 09:42:22 +02:00 · 2026-06-06 09:39:19 +02:00 · 2026-06-06 09:37:43 +02:00 · 2026-06-06 09:33:12 +02:00 · 2026-06-06 09:28:03 +02:00
13 changed files with 1295 additions and 6 deletions
--- a/CLAUDE.md
+++ b/CLAUDE.md
@@ -51,7 +51,11 @@ Full design rationale: `docs/decisions/`
 ## Ansible conventions

 - **FQCN always**: `ansible.builtin.template`, never `template`
- **Tags**: every task must have at least one tag; playbooks support `--tags` filtering
+- **Tags** (ADR-019): import each role with its role-name tag once at the play level
+  (every task in the role inherits it). Tag a task/block with a concern tag from the
+  approved list (`tests/tags.yml`) only where it genuinely belongs to that concern —
+  don't invent tags or tag for tagging's sake. Target one axis at a time (role/service
+  *or* concern; tags are union/OR, never intersected). `make lint` enforces the vocabulary.
 - **Handlers**: use `listen:` topic strings, not direct name references
 - **Variables**: `rolename__varname` double-underscore namespace for role defaults
 - **No inline vars in playbooks**: use `group_vars/` or `host_vars/` only
@@ -144,6 +148,9 @@ Single-contributor, trunk-based (no merge requests / approval gates):
 ## Terraform conventions

 - Terraform owns VM existence only — nothing inside a VM, and no DNS records
+- Every TF-managed VM carries three Proxmox tags — `<env>`, its inventory `group`, and
+  `managed-by=terraform` — as **metadata only** (ADR-019). They do not feed inventory
+  or run-targeting; `tf_to_inventory.py` still groups by the `group` output field.
 - Internal DNS is entirely Ansible (the `dns` role renders the zone from inventory)
 - OPNsense is entirely Ansible; do not reach for a Terraform OPNsense provider
 - Environments are separate directories (`staging/`, `production/`), not workspaces
@@ -215,6 +222,7 @@ Single-contributor, trunk-based (no merge requests / approval gates):
 | Update management      | `docs/decisions/011-update-management.md` |
 | Hardware & capacity    | `docs/decisions/012-hardware-capacity.md` |
 | Logging & log integrity | `docs/decisions/018-logging.md` |
+| Tagging & run-targeting | `docs/decisions/019-tagging.md` |
 | Adding a new role      | `docs/runbooks/new-role.md`           |
 | Adding a new host      | `docs/runbooks/new-host.md`           |
 | Rotating vault secrets | `docs/runbooks/rotate-secrets.md`     |
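
Reviewer note: the "tags are union/OR, never intersected" rule above can be illustrated with a small model. This is a sketch only, not Ansible's implementation. It assumes an explicit `--tags` list, ignores `--skip-tags` and the special `all`/`tagged`/`untagged` values, and uses made-up task names. `runs`, `select` and `TASKS` are hypothetical names that do not exist in the repo.

```python
# Simplified model of `--tags` selection (assumption: explicit --tags only).
def runs(task_tags, requested):
    """True if a task carrying `task_tags` is selected by `--tags <requested>`."""
    task_tags, requested = set(task_tags), set(requested)
    if "always" in task_tags:
        # `always` tasks run regardless of which tags were requested.
        return True
    # Union semantics: any one shared tag is enough to select the task.
    return not task_tags.isdisjoint(requested)


# Hypothetical tasks; each already includes its inherited role-name tag.
TASKS = {
    "base: assert Debian 13": {"base", "always"},
    "base: nftables ruleset": {"base", "firewall"},
    "photoprism: open port": {"photoprism", "firewall"},
    "photoprism: render compose": {"photoprism", "config"},
}


def select(requested):
    """Names of the tasks that would run, sorted for stable output."""
    return sorted(name for name, tags in TASKS.items() if runs(tags, requested))


if __name__ == "__main__":
    # Selects every photoprism task plus every firewall task, not just
    # "photoprism's firewall": there is no intersection.
    for name in select({"photoprism", "firewall"}):
        print(name)
```

In this model `select({"photoprism", "firewall"})` returns all four tasks, which is why the convention targets one axis at a time.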
--- a/1
+++ b/1
@@ -67,6 +67,7 @@ collections:
 lint:
 	$(VENV)/bin/yamllint .
 	$(LINT)
+	$(PYTHON) scripts/check-tags.py

 # ── Testing ───────────────────────────────────────────────────────────────────

--- a/docs/CAPABILITIES.md
+++ b/docs/CAPABILITIES.md
@@ -112,6 +112,10 @@ _(DHCP, firewall, mDNS reflection live on OPNsense — Ansible-managed, not cont
 | Sanity / smoke | whoami + health checks | S | planned | Verification endpoints + "is it actually working" checks | ADR-011 / TODO 8.2 |
 | Service-UI verification | `/verify-service` skill | S | planned | Claude-driven exploratory Level 4 acceptance check of a deployed service's UI | Decided (ADR-017); running deferred on ubongo + playwright + Authentik |

+- **Targeted runs** (ADR-019): playbooks are sliced with `--tags` along two axes —
+  role/service (tag = role name) or a closed list of cross-cutting concerns
+  (`firewall`, `logging`, `config`, `deploy`, …); the vocabulary is lint-enforced.
+
 ---

 ## V4 completeness check
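
Reviewer note: the "tag = role name" axis is a convention that the vocabulary lint does not check on its own. One way it could be checked is sketched below. This is not part of the change. It assumes the play has already been parsed into a dict (for example by PyYAML), and `role_tag_mismatches` is a hypothetical helper name.

```python
def role_tag_mismatches(play):
    """Return the roles in a play's `roles:` list whose tags omit the role name."""
    missing = []
    for entry in play.get("roles") or []:
        if isinstance(entry, str):
            # Bare `- base` form carries no tags at all.
            missing.append(entry)
            continue
        name = entry.get("role") or entry.get("name")
        tags = entry.get("tags") or []
        if isinstance(tags, str):
            # `tags: base` (string form) is equivalent to `tags: [base]`.
            tags = [tags]
        if name not in tags:
            missing.append(name)
    return missing


if __name__ == "__main__":
    # Illustrative play: the second role is tagged with something other than its name.
    play = {
        "hosts": "docker_hosts",
        "roles": [
            {"role": "base", "tags": ["base"]},
            {"role": "docker_host", "tags": ["docker"]},
        ],
    }
    print(role_tag_mismatches(play))
```

Working on parsed data keeps the sketch independent of how the YAML is loaded.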
--- a/docs/TODO.md
+++ b/docs/TODO.md
@@ -28,11 +28,13 @@
      (all logs) + off-site security subset on `askari` + Grafana on-cluster (not the
      whole stack on `askari`). Still to design/build: Prometheus + metric exporters,
      Uptime Kuma, and exactly which alerts live where.
-   7. Define a tagging standard that lets us target runs without over-tagging.
+   7. ~~Define a tagging standard that lets us target runs without over-tagging.~~
+      DECIDED (ADR-019): two-tier — role-name tags (auto, at play level) + a closed
+      9-tag concern list (`tests/tags.yml`); union-only targeting; enforced by `make lint`.
   8. Ensure the right things are backed up (incl. database dumps if we land on PBS).
   9. Decide: a central database server, or individual database services per app?
   10. Should we keep the custom base-container (Molecule test image) method for role testing, or revisit it as boma's testing approach matures (ADR-008)?
-   11. Deliberate tagging strategy.
+   11. ~~Deliberate tagging strategy.~~ DECIDED (ADR-019) — folded into 3.7.

 4. **Split-horizon FQDN** — adopt split-horizon FQDN with or without nyumbani?

--- a/docs/decisions/019-tagging.md
+++ b/docs/decisions/019-tagging.md
@@ -0,0 +1,112 @@
+# ADR-019 — Tagging standard for targeted, predictable runs
+
+## Status
+
+Accepted (2026-06-06). Resolves TODO 3.7 ("Define a tagging standard that lets us
+target runs without over-tagging") and TODO 3.11 ("Deliberate tagging strategy").
+
+## Context
+
+boma wants to run playbooks **targeted** — a single service, a single layer, or a
+single cross-cutting concern — **transparently and predictably**: a reader should
+know from a `--tags` invocation exactly what it will and won't touch. CLAUDE.md
+already requires tag-filterable tasks, but no vocabulary or convention existed, and
+the TODO explicitly warns against the opposite failure mode: **over-tagging**.
+
+## Decision
+
+### Two-tier tagging
+
+**Tier 1 — role/service tag (mechanical).** The tag equals the role name, applied
+once at the role-import level:
+
+```yaml
+roles:
+  - role: photoprism
+    tags: [photoprism]
+```
+
+Ansible propagates it to every task in the role. Because one service = one role
+(ADR-004), this single rule covers both the *layer/role* and *single-service*
+targeting axes with zero per-task burden. Role-less lifecycle playbooks
+(e.g. `bootstrap.yml`) carry a single playbook-identity tag instead.
+
+**Tier 2 — concern tag (curated).** A small **closed list** of cross-cutting concern
+tags, applied per-task/block **only where a task genuinely belongs to that concern**.
+
+### The closed concern list
+
+A concern earns a tag only if it (a) appears in 2+ roles, (b) is worth running as a
+slice on its own, and (c) doesn't overlap confusingly with another.
+
+| Tag | Covers |
+|-----|--------|
+| `packages`   | apt package install/management |
+| `users`      | accounts, groups, sudo |
+| `firewall`   | nftables rulesets & port definitions (ADR-002) |
+| `hardening`  | security baseline — sshd config, fail2ban, auditd, sysctl |
+| `logging`    | Alloy / log-shipping config (ADR-018) |
+| `monitoring` | metric exporters / health checks |
+| `config`     | render templated config/compose files to disk — **no restart** |
+| `deploy`     | bring services up / restart (`compose up -d`) |
+| `proxy`      | reverse-proxy + TLS registration (Traefik routes, Authentik) |
+
+The `config`/`deploy` split lets you re-render and diff configuration (`--tags
+config`) without bouncing services, then restart deliberately (`--tags deploy`).
+`backup` and `secrets` are intentionally omitted until the roles needing them exist.
+
+### `always` / `never`
+
+- **`always`** — reserved for cheap preflight assertions (vault unlocked, OS is
+  Debian 13, required vars present), so even `--tags config` runs its safety guards.
+- **`never`** — reserved for destructive/expensive opt-in tasks, each paired with a
+  descriptive tag (e.g. `tags: [never, force_pull]`); they run only when named.
+
+### Predictability principle: tags are union-only
+
+`--tags a,b` runs tasks tagged a **OR** b — Ansible has no native AND. boma therefore
+targets **one axis at a time**: either a role/service *or* a concern, never an
+intersection like "photoprism's firewall only." If that's ever needed, just run
+`--tags photoprism` (idempotent and fast). Designing for intersection is the
+over-tagging trap; we decline it on purpose.
+
+### Terraform / Proxmox VM tags (metadata only)
+
+Every Terraform-managed VM carries exactly three Proxmox tags:
+
+| Tag | Value | Purpose |
+|-----|-------|---------|
+| env        | `staging` \| `production`          | which environment |
+| role/group | `docker_hosts`, `proxmox_hosts`, … | matches the inventory group |
+| managed-by | `terraform`                        | distinguishes IaC VMs from hand-made ones |
+
+These are **pure metadata for transparency** (glanceable in the Proxmox UI). They do
+**not** drive run-targeting and do **not** feed inventory — `scripts/tf_to_inventory.py`
+keeps building groups from the `group` output field, the single source of truth.
+
+## Enforcement
+
+`tests/tags.yml` is the single source of truth for the allowed concern/special/
+opt-in/playbook tags. `scripts/check-tags.py` (run by `make lint`, covered by
+`tests/test_check_tags.py`) scans `roles/` and `playbooks/` and fails on any tag
+outside `{role directory names} ∪ {tests/tags.yml entries}`.
+Molecule scenario files (`roles/*/molecule/**`) are excluded from the scan — they are test orchestration, not the production run-targeting surface this standard governs.
+
+## Extending the vocabulary
+
+To add a concern tag: (1) add it to `tests/tags.yml`; (2) add a row to the concern
+table above with a one-line justification showing it passes the litmus test
+(cross-cutting, 2+ roles, distinct). That is the whole gate — lightweight, but it
+leaves a paper trail.
+
+## Consequences
+
+- Targeted runs are predictable: only two kinds of tags exist, one of them mechanical.
+- Over-tagging is structurally resisted (closed list + lint enforcement).
+- Intersection targeting is unavailable by design.
+- Authors must keep role tags = role names. The linter enforces the *vocabulary* (every tag must be a known role name or an approved tag); the role-tag-equals-role-name rule itself is a convention the linter does not separately check.
+
+## Related
+
+ADR-002 (security baseline / firewall), ADR-004 (one service = one role),
+ADR-009 (TF↔Ansible handoff / inventory), ADR-018 (logging).
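
Reviewer note: the Enforcement section says Molecule scenario files (`roles/*/molecule/**`) are excluded from the scan. A minimal sketch of such a path filter follows. It assumes every path sits under the repo root, and `is_molecule_path` and `filter_scannable` are hypothetical names, so the actual checker may implement the exclusion differently.

```python
import pathlib


def is_molecule_path(path, repo):
    """True for files under roles/<role>/molecule/ (test orchestration)."""
    parts = pathlib.PurePath(path).relative_to(repo).parts
    # Expected shape: ("roles", "<role>", "molecule", ..., "<file>")
    return len(parts) >= 4 and parts[0] == "roles" and parts[2] == "molecule"


def filter_scannable(paths, repo):
    """Drop Molecule scenario files, keep everything else in order."""
    return [p for p in paths if not is_molecule_path(p, repo)]


if __name__ == "__main__":
    repo = "/repo"  # illustrative root, not a real location
    candidates = [
        "/repo/roles/base/tasks/main.yml",
        "/repo/roles/base/molecule/default/converge.yml",
        "/repo/playbooks/site.yml",
    ]
    for kept in filter_scannable(candidates, repo):
        print(kept)
```

Matching on the third path component, rather than on any component named `molecule`, means a role that happens to be called `molecule` would still be scanned.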
--- a/docs/superpowers/plans/2026-06-06-tagging-strategy.md
+++ b/docs/superpowers/plans/2026-06-06-tagging-strategy.md
@@ -0,0 +1,728 @@
+# Ansible Tagging Standard Implementation Plan
+
+> **For agentic workers:** REQUIRED SUB-SKILL: Use superpowers:subagent-driven-development (recommended) or superpowers:executing-plans to implement this plan task-by-task. Steps use checkbox (`- [ ]`) syntax for tracking.
+
+**Goal:** Establish a two-tier Ansible tagging standard (role-name tags + a closed concern list) with machine-enforced vocabulary, plus a Proxmox VM metadata-tag convention, so playbook runs are targeted, transparent, and predictable.
+
+**Architecture:** A single source-of-truth YAML (`tests/tags.yml`) lists the allowed concern/special/opt-in/playbook tags. A Python checker (`scripts/check-tags.py`) scans `roles/` and `playbooks/`, computes the allowed set as `{role dir names} ∪ {tags.yml entries}`, and fails `make lint` on any unknown tag. Terraform gets a documented three-tag VM convention (metadata only). The standard is recorded as ADR-019 and folded into CLAUDE.md.
+
+**Tech Stack:** Python 3 (stdlib + PyYAML, already present via ansible-core), pytest (already in `requirements.txt`), Make, Terraform (HCL edit only — not `init`ed), Markdown docs.
+
+---
+
+## File structure
+
+| File | Responsibility | Action |
+|------|----------------|--------|
+| `tests/tags.yml` | Single source of truth: allowed concern/special/opt-in/playbook tags | Create |
+| `scripts/check-tags.py` | Scan `roles/`+`playbooks/`, fail on tags outside the allowed set | Create |
+| `tests/test_check_tags.py` | Unit tests for the checker (mirrors `tests/test_capacity_scan.py`) | Create |
+| `Makefile` | Wire `check-tags.py` into the `lint` target | Modify |
+| `playbooks/site.yml` | Fix `docker_host` role tag (`docker` → `docker_host`) | Modify |
+| `docs/decisions/019-tagging.md` | The ADR (the standard itself) | Create |
+| `CLAUDE.md` | Reword tag rule; add Proxmox tag convention; add ADR-019 to Further reading | Modify |
+| `terraform/environments/staging/main.tf` | Add `managed-by=terraform` tag | Modify |
+| `terraform/environments/production/main.tf` | Add `managed-by=terraform` tag | Modify |
+| `docs/TODO.md` | Mark 3.7 and 3.11 DECIDED | Modify |
+| `docs/CAPABILITIES.md` | Note targeted runs as a capability | Modify |
+
+Notes for the implementer:
+- The repo venv is `.venv`. Run Python as `.venv/bin/python` (Makefile vars: `PYTHON := .venv/bin/python`). If `.venv` is missing, run `make setup` first.
+- PyYAML is available in the venv (ansible-core depends on it) — `import yaml` works.
+- Terraform is **not** `init`ed in this repo, so `terraform validate`/`plan` will fail offline. Only use `terraform fmt` (offline-safe) for the HCL tasks.
+- Before any `git commit`, the pre-commit hook decrypts `vault.yml`, so the vault agent must be unlocked: run `rbw unlocked` (exit 0 = good). If locked, ask the user to `rbw unlock` and wait. None of these tasks touch vault files, but the hook still runs.
+
+---
+
+### Task 1: Tag vocabulary file (`tests/tags.yml`)
+
+**Files:**
+- Create: `tests/tags.yml`
+
+- [ ] **Step 1: Create the vocabulary file**
+
+Create `tests/tags.yml` with exactly this content:
+
+```yaml
+---
+# Allowed Ansible tag vocabulary — single source of truth for scripts/check-tags.py.
+# Authoritative reference & rationale: docs/decisions/019-tagging.md.
+#
+# The full allowed set the linter enforces is:
+#   {role directory names under roles/} ∪ everything listed below.
+#
+# To add a CONCERN tag: add it here AND add a row to the ADR-019 table with a
+# one-line justification (cross-cutting, used in 2+ roles, distinct).
+
+# Cross-cutting concern tags, applied per-task/block where a task belongs to the
+# concern. Targeted one at a time (tags are union/OR, never intersected).
+concerns:
+  - packages     # apt package install/management
+  - users        # accounts, groups, sudo
+  - firewall     # nftables rulesets & port definitions (ADR-002)
+  - hardening    # security baseline — sshd config, fail2ban, auditd, sysctl
+  - logging      # Alloy / log-shipping config (ADR-018)
+  - monitoring   # metric exporters / health checks
+  - config       # render templated config/compose files to disk — no restart
+  - deploy       # bring services up / restart (compose up -d)
+  - proxy        # reverse-proxy + TLS registration (Traefik routes, Authentik)
+
+# Ansible built-in special tags. Narrow use only:
+#   always — cheap preflight assertions (run regardless of --tags)
+#   never  — destructive/expensive tasks, paired with an opt-in tag below
+special:
+  - always
+  - never
+
+# `never`-paired opt-in tags: destructive/expensive tasks that only run when
+# named explicitly (e.g. `tags: [never, force_pull]`). Empty until a role adds one.
+opt_ins: []
+
+# Playbook-level identity tags for role-less lifecycle plays (e.g. bootstrap.yml).
+playbooks:
+  - bootstrap
+```
+
+- [ ] **Step 2: Verify it parses and has the expected shape**
+
+Run:
+```bash
+.venv/bin/python -c "import yaml; d=yaml.safe_load(open('tests/tags.yml')); assert len(d['concerns'])==9, d['concerns']; assert d['special']==['always','never']; assert d['opt_ins']==[]; assert d['playbooks']==['bootstrap']; print('tags.yml OK')"
+```
+Expected: prints `tags.yml OK` and exits 0.
+
+- [ ] **Step 3: Commit**
+
+```bash
+git add tests/tags.yml
+git commit -m "feat(tags): add allowed-tag vocabulary (tests/tags.yml)"
+```
+
+---
+
+### Task 2: Checker core — tag collection & allowed-set helpers
+
+**Files:**
+- Create: `scripts/check-tags.py`
+- Test: `tests/test_check_tags.py`
+
+- [ ] **Step 1: Write the failing tests**
+
+Create `tests/test_check_tags.py`:
+
+```python
+import importlib.util
+import pathlib
+
+_PATH = pathlib.Path(__file__).resolve().parent.parent / "scripts" / "check-tags.py"
+_spec = importlib.util.spec_from_file_location("check_tags", _PATH)
+ct = importlib.util.module_from_spec(_spec)
+_spec.loader.exec_module(ct)
+
+
+def test_collect_tags_list_form():
+    node = {"name": "t", "tags": ["firewall", "users"]}
+    assert ct.collect_tags(node) == {"firewall", "users"}
+
+
+def test_collect_tags_string_form():
+    node = {"name": "t", "tags": "always"}
+    assert ct.collect_tags(node) == {"always"}
+
+
+def test_collect_tags_nested_blocks_and_roles():
+    doc = [
+        {"hosts": "all", "roles": [{"role": "base", "tags": ["base"]}]},
+        {"block": [{"name": "x", "tags": ["config"]}], "tags": ["deploy"]},
+    ]
+    assert ct.collect_tags(doc) == {"base", "config", "deploy"}
+
+
+def test_collect_tags_ignores_templated_values():
+    node = {"tags": ["{{ dynamic }}", "logging"]}
+    assert ct.collect_tags(node) == {"logging"}
+
+
+def test_load_vocab_unions_all_categories():
+    vocab = ct.load_vocab()
+    assert "firewall" in vocab      # concern
+    assert "always" in vocab        # special
+    assert "bootstrap" in vocab     # playbook identity
+    assert len(vocab) >= 12  # 9 concerns + 2 special + 1 playbook tag
+
+
+def test_role_names_reads_role_dirs():
+    names = ct.role_names()
+    assert "base" in names
+    assert "docker_host" in names
+```
+
+- [ ] **Step 2: Run tests to verify they fail**
+
+Run: `.venv/bin/python -m pytest tests/test_check_tags.py -v`
+Expected: an error during collection: `FileNotFoundError` for `scripts/check-tags.py` (the script doesn't exist yet, so the test module can't load it).
+
+- [ ] **Step 3: Write the minimal implementation**
+
+Create `scripts/check-tags.py`:
+
+```python
+#!/usr/bin/env python3
+"""
+Validate that every Ansible tag used under roles/ and playbooks/ belongs to the
+approved vocabulary. Single source of truth: tests/tags.yml. Rationale: ADR-019.
+
+Allowed set = {role directory names under roles/} ∪ {concerns, special, opt_ins,
+playbooks from tests/tags.yml}. Templated tags (containing "{{") are skipped —
+they can't be statically validated.
+
+Usage:  python3 scripts/check-tags.py
+Exit 0 = all tags allowed; exit 1 = unknown tag(s) found.
+"""
+import pathlib
+import sys
+
+import yaml
+
+REPO = pathlib.Path(__file__).resolve().parent.parent
+VOCAB_FILE = REPO / "tests" / "tags.yml"
+SCAN_DIRS = ("roles", "playbooks")
+
+
+class _IgnoreUnknownTags(yaml.SafeLoader):
+    """SafeLoader that tolerates custom YAML tags (e.g. !vault) instead of crashing."""
+
+
+def _ignore(loader, tag_suffix, node):
+    return None
+
+
+_IgnoreUnknownTags.add_multi_constructor("", _ignore)
+_IgnoreUnknownTags.add_multi_constructor("!", _ignore)
+
+
+def _static_str(value):
+    return isinstance(value, str) and "{{" not in value
+
+
+def load_vocab(path=VOCAB_FILE):
+    data = yaml.safe_load(path.read_text()) or {}
+    vocab = set()
+    for key in ("concerns", "special", "opt_ins", "playbooks"):
+        vocab.update(data.get(key) or [])
+    return vocab
+
+
+def role_names(repo=REPO):
+    roles_dir = repo / "roles"
+    if not roles_dir.is_dir():
+        return set()
+    return {p.name for p in roles_dir.iterdir() if p.is_dir()}
+
+
+def collect_tags(node):
+    """Recursively collect every static tag string under any 'tags:' key."""
+    tags = set()
+    if isinstance(node, dict):
+        for key, value in node.items():
+            if key == "tags":
+                if _static_str(value):
+                    tags.add(value)
+                elif isinstance(value, list):
+                    tags.update(t for t in value if _static_str(t))
+            tags |= collect_tags(value)
+    elif isinstance(node, list):
+        for item in node:
+            tags |= collect_tags(item)
+    return tags
+
+
+if __name__ == "__main__":  # pragma: no cover
+    sys.exit(0)
+```
+
+- [ ] **Step 4: Run tests to verify they pass**
+
+Run: `.venv/bin/python -m pytest tests/test_check_tags.py -v`
+Expected: PASS (all 6 tests).
+
+- [ ] **Step 5: Commit**
+
+```bash
+git add scripts/check-tags.py tests/test_check_tags.py
+git commit -m "feat(tags): checker helpers — tag collection & allowed-set"
+```
+
+---
+
+### Task 3: Checker validation — scan files and fail on unknown tags
+
+**Files:**
+- Modify: `scripts/check-tags.py`
+- Test: `tests/test_check_tags.py`
+
+- [ ] **Step 1: Write the failing tests**
+
+Append to `tests/test_check_tags.py`:
+
+```python
+def test_scan_text_collects_from_yaml_string():
+    text = """
+- hosts: all
+  roles:
+    - role: base
+      tags: [base]
+  tasks:
+    - name: open port
+      tags: [firewall]
+"""
+    assert ct.scan_text(text) == {"base", "firewall"}
+
+
+def test_scan_text_tolerates_custom_yaml_tags():
+    text = "- name: t\n  secret: !vault xxx\n  tags: [users]\n"
+    assert ct.scan_text(text) == {"users"}
+
+
+def test_find_violations_flags_unknown_tag():
+    allowed = {"base", "firewall"}
+    used = {"base", "frewall"}  # typo
+    assert ct.find_violations(used, allowed) == ["frewall"]
+
+
+def test_find_violations_empty_when_all_allowed():
+    assert ct.find_violations({"base", "firewall"}, {"base", "firewall"}) == []
+```
+
+- [ ] **Step 2: Run tests to verify they fail**
+
+Run: `.venv/bin/python -m pytest tests/test_check_tags.py -v`
+Expected: FAIL — `AttributeError: module 'check_tags' has no attribute 'scan_text'` (and `find_violations`).
+
+- [ ] **Step 3: Add the scanning + validation functions**
+
+In `scripts/check-tags.py`, replace the final block:
+
+```python
+if __name__ == "__main__":  # pragma: no cover
+    sys.exit(0)
+```
+
+with:
+
+```python
+def scan_text(text):
+    """Collect static tags from a (possibly multi-document) YAML string."""
+    found = set()
+    for doc in yaml.load_all(text, Loader=_IgnoreUnknownTags):
+        found |= collect_tags(doc)
+    return found
+
+
+def iter_yaml_files(repo=REPO, scan_dirs=SCAN_DIRS):
+    for name in scan_dirs:
+        base = repo / name
+        if not base.is_dir():
+            continue
+        for ext in ("*.yml", "*.yaml"):
+            yield from sorted(base.rglob(ext))
+
+
+def find_violations(used, allowed):
+    return sorted(used - allowed)
+
+
+def main():
+    allowed = load_vocab() | role_names()
+    violations = []
+    for path in iter_yaml_files():
+        try:
+            used = scan_text(path.read_text())
+        except yaml.YAMLError as exc:
+            print(f"warning: could not parse {path}: {exc}", file=sys.stderr)
+            continue
+        for tag in find_violations(used, allowed):
+            violations.append((path.relative_to(REPO), tag))
+
+    if violations:
+        print(
+            "error: Ansible tag(s) not in tests/tags.yml or role names "
+            "(see docs/decisions/019-tagging.md):",
+            file=sys.stderr,
+        )
+        for relpath, tag in violations:
+            print(f"  {relpath}: '{tag}'", file=sys.stderr)
+        print(f"\nallowed: {', '.join(sorted(allowed))}", file=sys.stderr)
+        sys.exit(1)
+
+    print(f"check-tags: OK ({len(allowed)} tags allowed across {len(SCAN_DIRS)} dirs)")
+
+
+if __name__ == "__main__":
+    main()
+```
+
+- [ ] **Step 4: Run tests to verify they pass**
+
+Run: `.venv/bin/python -m pytest tests/test_check_tags.py -v`
+Expected: PASS (all 10 tests).
+
+- [ ] **Step 5: Commit**
+
+```bash
+git add scripts/check-tags.py tests/test_check_tags.py
+git commit -m "feat(tags): scan roles/+playbooks/ and fail on unknown tags"
+```
+
+---
+
+### Task 4: Reconcile existing tags & wire into `make lint`
+
+**Files:**
+- Modify: `playbooks/site.yml:18-19`
+- Modify: `Makefile` (the `lint:` target)
+
+- [ ] **Step 1: Run the checker against the current repo (expect one violation)**
+
+Run: `.venv/bin/python scripts/check-tags.py`
+Expected: FAIL (exit 1) reporting `playbooks/site.yml: 'docker'` — because the `docker_host` role is tagged `[docker]`, which is neither a role name nor a vocabulary tag. This confirms the checker works end-to-end.
+
+- [ ] **Step 2: Fix the role tag to equal the role name**
+
+In `playbooks/site.yml`, change:
+
+```yaml
+    - role: docker_host
+      tags: [docker]
+```
+
+to:
+
+```yaml
+    - role: docker_host
+      tags: [docker_host]
+```
+
+- [ ] **Step 3: Re-run the checker (expect clean)**
+
+Run: `.venv/bin/python scripts/check-tags.py`
+Expected: PASS — prints `check-tags: OK (... tags allowed across 2 dirs)` and exits 0.
+(Allowed set now includes role names `base`, `docker_host`; used tags are `base`, `docker_host`, `bootstrap` — all allowed.)
+
+- [ ] **Step 4: Wire the checker into `make lint`**
+
+In `Makefile`, change the `lint:` target from:
+
+```makefile
+lint:
+	$(VENV)/bin/yamllint .
+	$(LINT)
+```
+
+to:
+
+```makefile
+lint:
+	$(VENV)/bin/yamllint .
+	$(LINT)
+	$(PYTHON) scripts/check-tags.py
+```
+
+- [ ] **Step 5: Run the full lint suite and the test suite**
+
+Run: `make lint && .venv/bin/python -m pytest tests/test_check_tags.py -v`
+Expected: yamllint passes, ansible-lint passes, `check-tags: OK`, and all pytest tests PASS.
+
+- [ ] **Step 6: Commit**
+
+```bash
+git add playbooks/site.yml Makefile
+git commit -m "feat(tags): enforce tag vocabulary in make lint; fix docker_host tag"
+```
+
+---
+
+### Task 5: Terraform Proxmox VM tag convention
+
+**Files:**
+- Modify: `terraform/environments/staging/main.tf` (the `tags =` line in `module "vms"`)
+- Modify: `terraform/environments/production/main.tf` (the `tags =` line in `module "vms"`)
+
+- [ ] **Step 1: Add `managed-by=terraform` to the staging VM tags**
+
+In `terraform/environments/staging/main.tf`, change:
+
+```hcl
+  tags              = ["staging", each.value.group]
+```
+
+to:
+
+```hcl
+  tags              = ["staging", each.value.group, "managed-by=terraform"]
+```
+
+- [ ] **Step 2: Add `managed-by=terraform` to the production VM tags**
+
+In `terraform/environments/production/main.tf`, change:
+
+```hcl
+  tags              = ["production", each.value.group]
+```
+
+to:
+
+```hcl
+  tags              = ["production", each.value.group, "managed-by=terraform"]
+```
+
+- [ ] **Step 3: Format-check the HCL (offline-safe)**
+
+Run: `terraform -chdir=terraform/environments/staging fmt && terraform -chdir=terraform/environments/production fmt`
+Expected: either no output (already formatted) or the filename printed (reformatted). Exit 0.
+(Do NOT run `terraform validate`/`plan` — Terraform is not `init`ed in this repo and they will fail offline.)
+
+- [ ] **Step 4: Confirm the edits**
+
+Run: `grep -n "managed-by=terraform" terraform/environments/staging/main.tf terraform/environments/production/main.tf`
+Expected: one match in each file.
+
+- [ ] **Step 5: Commit**
+
+```bash
+git add terraform/environments/staging/main.tf terraform/environments/production/main.tf
+git commit -m "feat(tags): Proxmox VM metadata convention (managed-by=terraform)"
+```
+
+---
+
+### Task 6: Documentation — ADR-019, CLAUDE.md, TODO, CAPABILITIES
+
+**Files:**
+- Create: `docs/decisions/019-tagging.md`
+- Modify: `CLAUDE.md` (Ansible conventions; Terraform conventions; Further reading)
+- Modify: `docs/TODO.md` (items 3.7 and 3.11)
+- Modify: `docs/CAPABILITIES.md`
+
+- [ ] **Step 1: Write the ADR**
+
+Create `docs/decisions/019-tagging.md`:
+
+````markdown
+# ADR-019 — Tagging standard for targeted, predictable runs
+
+## Status
+
+Accepted (2026-06-06). Resolves TODO 3.7 ("Define a tagging standard that lets us
+target runs without over-tagging") and TODO 3.11 ("Deliberate tagging strategy").
+
+## Context
+
+boma wants to run playbooks **targeted** — a single service, a single layer, or a
+single cross-cutting concern — **transparently and predictably**: a reader should
+know from a `--tags` invocation exactly what it will and won't touch. CLAUDE.md
+already requires tag-filterable tasks, but no vocabulary or convention existed, and
+the TODO explicitly warns against the opposite failure mode: **over-tagging**.
+
+## Decision
+
+### Two-tier tagging
+
+**Tier 1 — role/service tag (mechanical).** The tag equals the role name, applied
+once at the role-import level:
+
+```yaml
+roles:
+  - role: photoprism
+    tags: [photoprism]
+```
+
+Ansible propagates it to every task in the role. Because one service = one role
+(ADR-004), this single rule covers both the *layer/role* and *single-service*
+targeting axes with zero per-task burden. Role-less lifecycle playbooks
+(e.g. `bootstrap.yml`) carry a single playbook-identity tag instead.
+
+**Tier 2 — concern tag (curated).** A small **closed list** of cross-cutting concern
+tags, applied per-task/block **only where a task genuinely belongs to that concern**.
+
+### The closed concern list
+
+A concern earns a tag only if it (a) appears in 2+ roles, (b) is worth running as a
+slice on its own, and (c) doesn't overlap confusingly with another.
+
+| Tag | Covers |
+|-----|--------|
+| `packages`   | apt package install/management |
+| `users`      | accounts, groups, sudo |
+| `firewall`   | nftables rulesets & port definitions (ADR-002) |
+| `hardening`  | security baseline — sshd config, fail2ban, auditd, sysctl |
+| `logging`    | Alloy / log-shipping config (ADR-018) |
+| `monitoring` | metric exporters / health checks |
+| `config`     | render templated config/compose files to disk — **no restart** |
+| `deploy`     | bring services up / restart (`compose up -d`) |
+| `proxy`      | reverse-proxy + TLS registration (Traefik routes, Authentik) |
+
+The `config`/`deploy` split lets you re-render and diff configuration (`--tags
+config`) without bouncing services, then restart deliberately (`--tags deploy`).
+`backup` and `secrets` are intentionally omitted until the roles needing them exist.
+
+### `always` / `never`
+
+- **`always`** — reserved for cheap preflight assertions (vault unlocked, OS is
+  Debian 13, required vars present), so even `--tags config` runs its safety guards.
+- **`never`** — reserved for destructive/expensive opt-in tasks, each paired with a
+  descriptive tag (e.g. `tags: [never, force_pull]`); they run only when named.
+
+### Predictability principle: tags are union-only
+
+`--tags a,b` runs tasks tagged a **OR** b — Ansible has no native AND. boma therefore
+targets **one axis at a time**: either a role/service *or* a concern, never an
+intersection like "photoprism's firewall only." If that's ever needed, just run
+`--tags photoprism` (idempotent and fast). Designing for intersection is the
+over-tagging trap; we decline it on purpose.
+
+### Terraform / Proxmox VM tags (metadata only)
+
+Every Terraform-managed VM carries exactly three Proxmox tags:
+
+| Tag | Value | Purpose |
+|-----|-------|---------|
+| env        | `staging` \| `production`          | which environment |
+| role/group | `docker_hosts`, `proxmox_hosts`, … | matches the inventory group |
+| managed-by | `terraform`                        | distinguishes IaC VMs from hand-made ones |
+
+These are **pure metadata for transparency** (glanceable in the Proxmox UI). They do
+**not** drive run-targeting and do **not** feed inventory — `scripts/tf_to_inventory.py`
+keeps building groups from the `group` output field, the single source of truth.
+
+## Enforcement
+
+`tests/tags.yml` is the single source of truth for the allowed concern/special/
+opt-in/playbook tags. `scripts/check-tags.py` (run by `make lint`, covered by
+`tests/test_check_tags.py`) scans `roles/` and `playbooks/` and fails on any tag
+outside `{role directory names} ∪ {tests/tags.yml entries}`.
+
+## Extending the vocabulary
+
+To add a concern tag: (1) add it to `tests/tags.yml`; (2) add a row to the concern
+table above with a one-line justification showing it passes the litmus test
+(cross-cutting, 2+ roles, distinct). That is the whole gate — lightweight, but it
+leaves a paper trail.
+
+## Consequences
+
+- Targeted runs are predictable: only two kinds of tags exist, one of them mechanical.
+- Over-tagging is structurally resisted (closed list + lint enforcement).
+- Intersection targeting is unavailable by design.
+- Authors must keep role tags = role names; the linter rejects unknown tags, not misplaced ones.
+
+## Related
+
+ADR-002 (security baseline / firewall), ADR-004 (one service = one role),
+ADR-009 (TF↔Ansible handoff / inventory), ADR-018 (logging).
+````
+
+- [ ] **Step 2: Reword the tag rule in CLAUDE.md**
+
+In `CLAUDE.md`, under **Ansible conventions**, change:
+
+```markdown
+- **Tags**: every task must have at least one tag; playbooks support `--tags` filtering
+```
+
+to:
+
+```markdown
+- **Tags** (ADR-019): import each role with its role-name tag once at the play level
+  (Ansible inherits it to every task). Tag a task/block with a concern tag from the
+  approved list (`tests/tags.yml`) only where it genuinely belongs to that concern —
+  don't invent tags or tag for tagging's sake. Target one axis at a time (role/service
+  *or* concern; tags are union/OR, never intersected). `make lint` enforces the vocabulary.
+```
+
+- [ ] **Step 3: Add the Proxmox tag convention to CLAUDE.md**
+
+In `CLAUDE.md`, under **Terraform conventions**, add this bullet after the existing
+"Terraform owns VM existence only" bullet:
+
+```markdown
+- Every TF-managed VM carries three Proxmox tags — `<env>`, its inventory `group`, and
+  `managed-by=terraform` — as **metadata only** (ADR-019). They do not feed inventory
+  or run-targeting; `tf_to_inventory.py` still groups by the `group` output field.
+```
+
+- [ ] **Step 4: Add ADR-019 to the Further reading table**
+
+In `CLAUDE.md`, in the **Further reading** table, add this row immediately after the
+`Logging & log integrity` row:
+
+```markdown
+| Tagging & run-targeting | `docs/decisions/019-tagging.md` |
+```
+
+- [ ] **Step 5: Mark the TODO items decided**
+
+In `docs/TODO.md`, change the line for item 3.7:
+
+```markdown
+   7. Define a tagging standard that lets us target runs without over-tagging.
+```
+
+to:
+
+```markdown
+   7. ~~Define a tagging standard that lets us target runs without over-tagging.~~
+      DECIDED (ADR-019): two-tier — role-name tags (auto, at play level) + a closed
+      9-tag concern list (`tests/tags.yml`); union-only targeting; enforced by `make lint`.
+```
+
+and change item 3.11:
+
+```markdown
+   11. Deliberate tagging strategy.
+```
+
+to:
+
+```markdown
+   11. ~~Deliberate tagging strategy.~~ DECIDED (ADR-019) — folded into 3.7.
+```
+
+- [ ] **Step 6: Note the capability in CAPABILITIES.md**
+
+Run: `grep -n "^## \|^### " docs/CAPABILITIES.md` to locate the section covering
+operations / CI / how playbooks are run. Add this bullet under the most appropriate
+existing section (operations or testing/CI):
+
+```markdown
+- **Targeted runs** (ADR-019): playbooks are sliced with `--tags` along two axes —
+  role/service (tag = role name) or a closed list of cross-cutting concerns
+  (`firewall`, `logging`, `config`, `deploy`, …); the vocabulary is lint-enforced.
+```
+
+- [ ] **Step 7: Verify docs are consistent and lint still passes**
+
+Run:
+```bash
+grep -n "019-tagging" CLAUDE.md && grep -c "managed-by=terraform" CLAUDE.md && make lint
+```
+Expected: the ADR-019 row is found in CLAUDE.md, `managed-by=terraform` appears at
+least once, and `make lint` passes (including `check-tags: OK`).
+
+- [ ] **Step 8: Commit**
+
+```bash
+git add docs/decisions/019-tagging.md CLAUDE.md docs/TODO.md docs/CAPABILITIES.md
+git commit -m "docs(tags): ADR-019 + CLAUDE.md/TODO/CAPABILITIES (tagging standard)"
+```
+
+---
+
+## Final verification
+
+- [ ] Run the full suite once more: `make lint && .venv/bin/python -m pytest tests/ -v`
+  Expected: yamllint + ansible-lint pass, `check-tags: OK`, all tests PASS.
+- [ ] Confirm a deliberate violation is caught: temporarily add `tags: [bogus]` to a
+  task in `playbooks/site.yml`, run `.venv/bin/python scripts/check-tags.py`, confirm it
+  exits 1 reporting `'bogus'`, then revert the edit.
+- [ ] `git log --oneline -7` shows the six task commits above the plan commit.
--- a/docs/superpowers/specs/2026-06-06-tagging-strategy-design.md
+++ b/docs/superpowers/specs/2026-06-06-tagging-strategy-design.md
@ -0,0 +1,188 @@
+# Design — Ansible tagging standard (targeted, predictable runs)
+
+- **Date:** 2026-06-06
+- **Status:** Approved design — pending implementation plan
+- **Resolves:** TODO 3.7 ("Define a tagging standard that lets us target runs without
+  over-tagging") and TODO 3.11 ("Deliberate tagging strategy") — the same thread
+- **Becomes:** ADR-019 (this design is the basis for that ADR)
+
+---
+
+## Problem
+
+boma wants to run playbooks **targeted** — a single service, a single layer, or a
+single cross-cutting concern — and to do so **transparently and predictably**: you
+should be able to look at a `--tags` invocation and know exactly what it will and won't
+touch. CLAUDE.md already mandates that every task be tag-filterable, but no *vocabulary*
+or *naming convention* exists. Without one, tags proliferate ad-hoc per role and the
+"predictable" property is lost — and the TODO explicitly warns against the opposite
+failure mode, **over-tagging**.
+
+The repo is effectively greenfield for this: `base` and `docker_host` are empty, and the
+only tags in existence are `[base]`/`[docker]` in `site.yml` and `[bootstrap]` in
+`bootstrap.yml`. So we can bake the standard into role-authoring conventions *before*
+there are a dozen service roles to retrofit.
+
+## Targeting axes (what we want to slice by)
+
+1. **Layer / role** — `--tags base`, `--tags docker_host`
+2. **Single service** — `--tags photoprism`, `--tags traefik`
+3. **Concern / function** — `--tags firewall`, `--tags logging`, …
+
+Lifecycle phases (bootstrap/config/deploy) are **not** a tag axis — `bootstrap.yml` vs
+`site.yml` already separate those as whole playbooks.
+
+Key simplification: because of ADR-004 (*one service = one role*, role name = service
+name), axes 1 and 2 are the **same mechanism** — a tag equal to the role name. Only the
+concern axis needs a curated vocabulary.
+
+## Approach (chosen): two-tier tagging
+
+**Tier 1 — role/service tag (mechanical).** The tag *equals the role name*, applied
+**once** at the role-import level in the playbook:
+
+```yaml
+roles:
+  - role: photoprism
+    tags: [photoprism]
+```
+
+Ansible propagates the tag to every task in the role. This covers both the layer/role
+and single-service axes with one rule and **zero per-task burden**.
+
+**Tier 2 — concern tag (curated).** A small **closed, documented list** of cross-cutting
+concern tags, applied per-task/block **only where a task genuinely belongs to that
+concern**. `--tags firewall` then hits firewall tasks in `base` and in every service
+role.
+
+Rejected alternatives: *concern-only/flat* (loses natural `--tags <service>` ergonomics);
+*rich multi-dimensional* (role+service+concern+lifecycle+ad-hoc per task) — that is
+precisely the over-tagging the TODO warns against.
+
+## The closed concern list
+
+Litmus test for earning a spot: a concern must (a) appear in **2+ roles**, (b) be
+something you'd realistically want to run as a slice on its own, and (c) not overlap
+confusingly with another.
+
+**Baseline concerns** (mostly in `base`, some echoed in service roles):
+
+| Tag | Covers |
+|-----|--------|
+| `packages`   | apt package install/management |
+| `users`      | accounts, groups, sudo |
+| `firewall`   | nftables rulesets & port definitions (ADR-002) |
+| `hardening`  | security baseline — sshd config, fail2ban, auditd, sysctl |
+| `logging`    | Alloy / log-shipping config (ADR-018) |
+| `monitoring` | metric exporters / health checks |
+
+**Service concerns** (in every service role, ADR-004):
+
+| Tag | Covers |
+|-----|--------|
+| `config` | render templated config/compose files to disk — **no restart** |
+| `deploy` | bring services up / restart (`compose up -d`) |
+| `proxy`  | reverse-proxy + TLS registration (Traefik routes, Authentik) |
+
+Nine tags total. The `config`/`deploy` split is deliberate and high-value: `--tags
+config` re-renders and lets you diff configuration without bouncing services; `--tags
+deploy` does the restart.
+
+`backup` and `secrets` are **intentionally omitted** until the roles that need them
+exist — they enter via the extend process, not speculative reservation.
+
+## `always` / `never` policy
+
+boma uses Ansible's two built-in special tags, narrowly:
+
+- **`always`** — reserved strictly for **cheap preflight assertions** (vault unlocked,
+  OS is Debian 13, required vars present). Ensures even `--tags config` runs its safety
+  guards.
+- **`never`** — reserved for **destructive/expensive opt-in tasks**, each paired with a
+  descriptive tag (e.g. `never, force_pull` or `never, restore`). They never run unless
+  explicitly named, keeping dangerous actions out of normal runs. The descriptive
+  partner tag is a documented `never`-paired opt-in (allowed by the linter).
+
+## Predictability principle: tags are union-only
+
+`--tags a,b` runs tasks tagged a **OR** b — Ansible has no native AND. Rather than fight
+this, we make it an explicit principle: **boma targets one axis at a time** — *either* a
+role/service (`--tags photoprism`) *or* a concern (`--tags firewall`), never an
+intersection like "photoprism's firewall only." If that is ever genuinely needed, the
+answer is "just run `--tags photoprism`" (idempotent and fast). Designing for
+intersection is the over-tagging trap; we decline it on purpose.
+
+## Reconciling the existing CLAUDE.md rule
+
+CLAUDE.md currently says *"every task must have at least one tag."* Under the two-tier
+model the role tag is applied **once at the play/import level** and **inherited** by
+every task, so tasks are always reachable without hand-tagging each one. The rule is
+**reworded** to:
+
+> Import each role with its role-name tag (once, at the play level). Within a role, tag a
+> task/block with a concern tag from the approved list **only where it genuinely belongs
+> to that concern** — don't invent tags or tag for tagging's sake.
+
+This directly resolves the "without over-tagging" tension.
+
+## Terraform / Proxmox VM tags (metadata only)
+
+Formalize the convention that already half-exists in `staging/main.tf`
+(`tags = ["staging", each.value.group]`). Every TF-managed VM gets exactly three tags:
+
+| Tag | Value | Purpose |
+|-----|-------|---------|
+| env        | `staging` \| `production`            | which environment |
+| role/group | `docker_hosts`, `proxmox_hosts`, …   | matches the inventory group |
+| managed-by | `terraform`                          | distinguishes IaC VMs from hand-made ones |
+
+Set as `tags = ["<env>", each.value.group, "managed-by=terraform"]` in the env
+`main.tf`, where `<env>` is the literal environment name (constant per directory).
+
+**Explicit non-goals** (stated so nobody wires them up later): these tags are **pure
+metadata for transparency** — glanceable in the Proxmox UI. They do **not** drive
+run-targeting and do **not** feed inventory. `scripts/tf_to_inventory.py` keeps building
+groups from the `group` output field, which stays the single source of truth.
+
+## Enforcement
+
+A small **lint check wired into `make lint`**: a script collects every `tags:` value
+across `roles/` and `playbooks/` and fails if any tag is not in the allowed set:
+
+```
+{role names} ∪ {9 concern tags} ∪ {always, never} ∪ {documented never-paired opt-ins}
+```
+
+The allowed concern list (and the `never`-paired opt-ins) live in **one
+machine-readable file, `tests/tags.yml`**, which both the linter reads and the ADR
+documents — so doc and enforcement cannot drift. This is more honest than ansible-lint's
+limited built-in tags rule. A unit test (mirroring `tests/test_capacity_scan.py`) covers
+the checker.
+
+## The "propose to extend" process
+
+To add a concern tag: (1) add it to `tests/tags.yml`; (2) add a row to the ADR-019 table
+with a one-line justification showing it passes the litmus test (cross-cutting, 2+
+roles, distinct). That is the whole gate — lightweight, but it leaves a paper trail.
+
+## Deliverables
+
+- **New `docs/decisions/019-tagging.md`** — the standard: rationale, two-tier model,
+  concern table, union-only principle, `always`/`never` policy, Proxmox tag convention,
+  extend process.
+- **`tests/tags.yml`** — machine-readable allowed concern list + `never`-paired opt-ins.
+- **Lint checker script** (e.g. `scripts/check-tags.py`) + **`make lint`** wiring +
+  **`tests/test_check_tags.py`**.
+- **CLAUDE.md** — reword the tag bullet under *Ansible conventions*; add the Proxmox tag
+  convention under *Terraform conventions*; add ADR-019 to *Further reading*.
+- **`terraform/environments/{staging,production}/main.tf`** — apply the three-tag
+  convention.
+- **`docs/TODO.md`** — mark 3.7 and 3.11 DECIDED (ADR-019).
+- **`docs/CAPABILITIES.md`** — note targeted runs as a capability, if it fits.
+
+## Out of scope
+
+- Intersection targeting (role ∩ concern) — declined on purpose (see principle).
+- Lifecycle-phase tags — handled by separate playbooks.
+- Proxmox tags feeding inventory or run-targeting — metadata only.
+- `backup`/`secrets` concern tags — added later via the extend process.
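Reviewer sketch (not part of the diff): the union-only principle in the spec above can be modeled in a few lines of Python. This is a simplified model under stated assumptions, not Ansible's implementation. The task names and the `effective_tags`/`selected` helpers are hypothetical, role-import tags are treated as simply inherited by every task in the role, and the special tags `always`/`never` are ignored.

```python
# Simplified model of `--tags` selection (assumption: no special tags, no --skip-tags).
# A task runs when its effective tags share at least one tag with the requested set.


def effective_tags(role_tags, task_tags):
    """Union of the tags inherited from the role import and the task's own tags."""
    return set(role_tags) | set(task_tags)


def selected(tasks, requested):
    """Return names of tasks whose effective tags intersect `requested` (OR semantics)."""
    requested = set(requested)
    return [
        name
        for name, role_tags, task_tags in tasks
        if effective_tags(role_tags, task_tags) & requested
    ]


# Hypothetical tasks: (name, tags inherited from the role import, task's own tags).
TASKS = [
    ("base: open ssh port", ["base"], ["firewall"]),
    ("base: install packages", ["base"], ["packages"]),
    ("photoprism: open web port", ["photoprism"], ["firewall"]),
    ("photoprism: render compose", ["photoprism"], ["config"]),
]

print(selected(TASKS, ["photoprism"]))              # both photoprism tasks
print(selected(TASKS, ["firewall"]))                # firewall tasks in base and photoprism
print(selected(TASKS, ["photoprism", "firewall"]))  # three tasks: the union, not the overlap
```

Requesting both tags returns three tasks rather than the single "photoprism firewall" task, which is the behavior the spec accepts by targeting one axis at a time.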
--- a/playbooks/site.yml
+++ b/playbooks/site.yml
@ -16,4 +16,4 @@
  become: true
  roles:
    - role: docker_host
-      tags: [docker]
+      tags: [docker_host]
--- a/scripts/check-tags.py
+++ b/scripts/check-tags.py
@ -0,0 +1,124 @@
+#!/usr/bin/env python3
+"""
+Validate that every Ansible tag used under roles/ and playbooks/ belongs to the
+approved vocabulary. Single source of truth: tests/tags.yml. Rationale: ADR-019.
+
+Allowed set = {role directory names under roles/} ∪ {concerns, special, opt_ins,
+playbooks from tests/tags.yml}. Templated tags (containing "{{") are skipped —
+they can't be statically validated.
+
+Usage:  python3 scripts/check-tags.py
+Exit 0 = all tags allowed; exit 1 = unknown tag(s) found.
+"""
+import pathlib
+import sys
+
+import yaml
+
+REPO = pathlib.Path(__file__).resolve().parent.parent
+VOCAB_FILE = REPO / "tests" / "tags.yml"
+SCAN_DIRS = ("roles", "playbooks")
+
+
+class _IgnoreUnknownTags(yaml.SafeLoader):
+    """SafeLoader that tolerates custom YAML tags (e.g. !vault) instead of crashing."""
+
+
+def _ignore(loader, tag_suffix, node):
+    return None
+
+
+_IgnoreUnknownTags.add_multi_constructor("", _ignore)
+
+
+def _static_str(value):
+    return isinstance(value, str) and "{{" not in value
+
+
+def load_vocab(path=VOCAB_FILE):
+    data = yaml.safe_load(path.read_text()) or {}
+    vocab = set()
+    for key in ("concerns", "special", "opt_ins", "playbooks"):
+        vocab.update(data.get(key) or [])
+    return vocab
+
+
+def role_names(repo=REPO):
+    roles_dir = repo / "roles"
+    if not roles_dir.is_dir():
+        return set()
+    return {p.name for p in roles_dir.iterdir() if p.is_dir()}
+
+
+def collect_tags(node):
+    """Recursively collect every static tag string under any 'tags:' key."""
+    # Matches any dict key literally named `tags`; Ansible-tag semantics assumed.
+    tags = set()
+    if isinstance(node, dict):
+        for key, value in node.items():
+            if key == "tags":
+                if _static_str(value):
+                    tags.add(value)
+                elif isinstance(value, list):
+                    tags.update(t for t in value if _static_str(t))
+            tags |= collect_tags(value)
+    elif isinstance(node, list):
+        for item in node:
+            tags |= collect_tags(item)
+    return tags
+
+
+def scan_text(text):
+    """Collect static tags from a (possibly multi-document) YAML string."""
+    found = set()
+    for doc in yaml.load_all(text, Loader=_IgnoreUnknownTags):
+        found |= collect_tags(doc)
+    return found
+
+
+def iter_yaml_files(repo=REPO, scan_dirs=SCAN_DIRS):
+    for name in scan_dirs:
+        base = repo / name
+        if not base.is_dir():
+            continue
+        for ext in ("*.yml", "*.yaml"):
+            for path in sorted(base.rglob(ext)):
+                # Molecule scenarios are test orchestration, not the production
+                # run-targeting surface this standard governs (ADR-019). Skip them.
+                if "molecule" in path.relative_to(base).parts:
+                    continue
+                yield path
+
+
+def find_violations(used, allowed):
+    return sorted(used - allowed)
+
+
+def main():
+    allowed = load_vocab() | role_names()
+    violations = []
+    for path in iter_yaml_files():
+        try:
+            used = scan_text(path.read_text())
+        except yaml.YAMLError as exc:
+            print(f"warning: could not parse {path}: {exc}", file=sys.stderr)
+            continue
+        for tag in find_violations(used, allowed):
+            violations.append((path.relative_to(REPO), tag))
+
+    if violations:
+        print(
+            "error: Ansible tag(s) not in tests/tags.yml or role names "
+            "(see docs/decisions/019-tagging.md):",
+            file=sys.stderr,
+        )
+        for relpath, tag in violations:
+            print(f"  {relpath}: '{tag}'", file=sys.stderr)
+        print(f"\nallowed: {', '.join(sorted(allowed))}", file=sys.stderr)
+        sys.exit(1)
+
+    print(f"check-tags: OK ({len(allowed)} tags allowed across {len(SCAN_DIRS)} dirs)")
+
+
+if __name__ == "__main__":
+    main()
--- a/terraform/environments/production/main.tf
+++ b/terraform/environments/production/main.tf
@ -35,7 +35,7 @@ module "vms" {
  ssh_public_keys   = var.ssh_public_keys
  cores             = each.value.cores
  memory_mb         = each.value.memory_mb
-  tags              = ["production", each.value.group]
+  tags              = ["production", each.value.group, "managed-by=terraform"]
 }

 # Internal DNS records are NOT managed here. Terraform owns VM existence only;
--- a/terraform/environments/staging/main.tf
+++ b/terraform/environments/staging/main.tf
@ -29,7 +29,7 @@ module "vms" {
  ssh_public_keys   = var.ssh_public_keys
  cores             = each.value.cores
  memory_mb         = each.value.memory_mb
-  tags              = ["staging", each.value.group]
+  tags              = ["staging", each.value.group, "managed-by=terraform"]
 }

 # Internal DNS records are NOT managed here. Terraform owns VM existence only;
--- a/tests/tags.yml
+++ b/tests/tags.yml
@ -0,0 +1,37 @@
+---
+# Allowed Ansible tag vocabulary — single source of truth for scripts/check-tags.py.
+# Authoritative reference & rationale: docs/decisions/019-tagging.md.
+#
+# The full allowed set the linter enforces is:
+#   {role directory names under roles/} ∪ everything listed below.
+#
+# To add a CONCERN tag: add it here AND add a row to the ADR-019 table with a
+# one-line justification (cross-cutting, used in 2+ roles, distinct).
+
+# Cross-cutting concern tags, applied per-task/block where a task belongs to the
+# concern. Targeted one at a time (tags are union/OR, never intersected).
+concerns:
+  - packages     # apt package install/management
+  - users        # accounts, groups, sudo
+  - firewall     # nftables rulesets & port definitions (ADR-002)
+  - hardening    # security baseline — sshd config, fail2ban, auditd, sysctl
+  - logging      # Alloy / log-shipping config (ADR-018)
+  - monitoring   # metric exporters / health checks
+  - config       # render templated config/compose files to disk — no restart
+  - deploy       # bring services up / restart (compose up -d)
+  - proxy        # reverse-proxy + TLS registration (Traefik routes, Authentik)
+
+# Ansible built-in special tags. Narrow use only:
+#   always — cheap preflight assertions (run regardless of --tags)
+#   never  — destructive/expensive tasks, paired with an opt-in tag below
+special:
+  - always
+  - never
+
+# `never`-paired opt-in tags: destructive/expensive tasks that only run when
+# named explicitly (e.g. `tags: [never, force_pull]`). Empty until a role adds one.
+opt_ins: []
+
+# Playbook-level identity tags for role-less lifecycle plays (e.g. bootstrap.yml).
+playbooks:
+  - bootstrap
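Reviewer sketch (not part of the diff): one consequence of the empty `opt_ins` list is that the ADR's own example, `tags: [never, force_pull]`, would be rejected until `force_pull` is registered. A minimal illustration, assuming the vocabulary above and the two role directories asserted in `tests/test_check_tags.py` (`base`, `docker_host`). The `allowed_tags` and `violations` helpers are hypothetical stand-ins for the logic in `scripts/check-tags.py`, not the script itself.

```python
# Allowed set = role directory names ∪ every entry in the vocabulary file.
ROLE_NAMES = {"base", "docker_host"}  # assumption: the role dirs named in the tests

VOCAB = {
    "concerns": [
        "packages", "users", "firewall", "hardening", "logging",
        "monitoring", "config", "deploy", "proxy",
    ],
    "special": ["always", "never"],
    "opt_ins": [],  # empty, as in tests/tags.yml in this diff
    "playbooks": ["bootstrap"],
}


def allowed_tags(vocab, role_names):
    """Union of role names and all vocabulary categories."""
    allowed = set(role_names)
    for entries in vocab.values():
        allowed.update(entries or [])
    return allowed


def violations(used, allowed):
    """Sorted list of used tags that are not in the allowed set."""
    return sorted(set(used) - allowed)


used = ["never", "force_pull"]

# With opt_ins empty, the descriptive partner tag is unknown to the checker.
print(violations(used, allowed_tags(VOCAB, ROLE_NAMES)))  # ['force_pull']

# Registering the opt-in first makes the same task pass.
VOCAB_WITH_OPT_IN = dict(VOCAB, opt_ins=["force_pull"])
print(violations(used, allowed_tags(VOCAB_WITH_OPT_IN, ROLE_NAMES)))  # []
```

In other words, a role that introduces a `never`-paired task would need to add its partner tag to `opt_ins` in the same change, or `make lint` would fail.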
--- a/tests/test_check_tags.py
+++ b/tests/test_check_tags.py
@ -0,0 +1,85 @@
+import importlib.util
+import pathlib
+
+_PATH = pathlib.Path(__file__).resolve().parent.parent / "scripts" / "check-tags.py"
+_spec = importlib.util.spec_from_file_location("check_tags", _PATH)
+ct = importlib.util.module_from_spec(_spec)
+_spec.loader.exec_module(ct)
+
+
+def test_collect_tags_list_form():
+    node = {"name": "t", "tags": ["firewall", "users"]}
+    assert ct.collect_tags(node) == {"firewall", "users"}
+
+
+def test_collect_tags_string_form():
+    node = {"name": "t", "tags": "always"}
+    assert ct.collect_tags(node) == {"always"}
+
+
+def test_collect_tags_nested_blocks_and_roles():
+    doc = [
+        {"hosts": "all", "roles": [{"role": "base", "tags": ["base"]}]},
+        {"block": [{"name": "x", "tags": ["config"]}], "tags": ["deploy"]},
+    ]
+    assert ct.collect_tags(doc) == {"base", "config", "deploy"}
+
+
+def test_collect_tags_ignores_templated_values():
+    node = {"tags": ["{{ dynamic }}", "logging"]}
+    assert ct.collect_tags(node) == {"logging"}
+
+
+def test_load_vocab_unions_all_categories():
+    vocab = ct.load_vocab()
+    assert "firewall" in vocab      # concern
+    assert "always" in vocab        # special
+    assert "bootstrap" in vocab     # playbook identity
+    assert len(vocab) >= 10
+
+
+def test_role_names_reads_role_dirs():
+    names = ct.role_names()
+    assert "base" in names
+    assert "docker_host" in names
+
+
+def test_scan_text_collects_from_yaml_string():
+    text = """
+- hosts: all
+  roles:
+    - role: base
+      tags: [base]
+  tasks:
+    - name: open port
+      tags: [firewall]
+"""
+    assert ct.scan_text(text) == {"base", "firewall"}
+
+
+def test_scan_text_tolerates_custom_yaml_tags():
+    text = "- name: t\n  secret: !vault xxx\n  tags: [users]\n"
+    assert ct.scan_text(text) == {"users"}
+
+
+def test_find_violations_flags_unknown_tag():
+    allowed = {"base", "firewall"}
+    used = {"base", "frewall"}  # typo
+    assert ct.find_violations(used, allowed) == ["frewall"]
+
+
+def test_find_violations_empty_when_all_allowed():
+    assert ct.find_violations({"base", "firewall"}, {"base", "firewall"}) == []
+
+
+def test_iter_yaml_files_skips_molecule(tmp_path):
+    role = tmp_path / "roles" / "demo"
+    (role / "tasks").mkdir(parents=True)
+    (role / "tasks" / "main.yml").write_text("---\n")
+    mol = role / "molecule" / "default"
+    mol.mkdir(parents=True)
+    (mol / "verify.yml").write_text("---\n")
+    found = list(ct.iter_yaml_files(repo=tmp_path, scan_dirs=("roles",)))
+    names = [p.name for p in found]
+    assert "main.yml" in names
+    assert "verify.yml" not in names
Author	SHA1	Message	Date
sjat	2e5a1e1e23	fix(tags): exclude molecule scenarios from tag scan; clarify ADR enforcement Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-06-06 09:50:14 +02:00
sjat	24b5e9361e	docs(tags): ADR-019 + CLAUDE.md/TODO/CAPABILITIES (tagging standard) Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-06-06 09:42:22 +02:00
sjat	9584cc2c76	feat(tags): Proxmox VM metadata convention (managed-by=terraform) Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-06-06 09:39:19 +02:00
sjat	0b59107b33	feat(tags): enforce tag vocabulary in make lint; fix docker_host tag Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-06-06 09:37:43 +02:00
sjat	a3ea2aceb2	feat(tags): scan roles/+playbooks/ and fail on unknown tags Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-06-06 09:33:12 +02:00
sjat	b45118dac3	feat(tags): checker helpers — tag collection & allowed-set Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-06-06 09:28:03 +02:00
sjat	24397fa280	feat(tags): add allowed-tag vocabulary (tests/tags.yml)	2026-06-06 09:26:20 +02:00
sjat	04bfc26422	docs(plan): tagging standard implementation plan (ADR-019) Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-06 09:21:15 +02:00
sjat	4ed9e9a8bf	docs(spec): tagging standard design (TODO 3.7/3.11 → ADR-019) Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-06 09:15:44 +02:00