Compare commits

...

9 commits

Author SHA1 Message Date
2e5a1e1e23 fix(tags): exclude molecule scenarios from tag scan; clarify ADR enforcement
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-06-06 09:50:14 +02:00
24b5e9361e docs(tags): ADR-019 + CLAUDE.md/TODO/CAPABILITIES (tagging standard)
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-06-06 09:42:22 +02:00
9584cc2c76 feat(tags): Proxmox VM metadata convention (managed-by=terraform)
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-06-06 09:39:19 +02:00
0b59107b33 feat(tags): enforce tag vocabulary in make lint; fix docker_host tag
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-06-06 09:37:43 +02:00
a3ea2aceb2 feat(tags): scan roles/+playbooks/ and fail on unknown tags
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-06-06 09:33:12 +02:00
b45118dac3 feat(tags): checker helpers — tag collection & allowed-set
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-06-06 09:28:03 +02:00
24397fa280 feat(tags): add allowed-tag vocabulary (tests/tags.yml)
2026-06-06 09:26:20 +02:00
04bfc26422 docs(plan): tagging standard implementation plan (ADR-019)
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-06 09:21:15 +02:00
4ed9e9a8bf docs(spec): tagging standard design (TODO 3.7/3.11 → ADR-019)
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-06 09:15:44 +02:00
13 changed files with 1295 additions and 6 deletions

View file

@@ -51,7 +51,11 @@ Full design rationale: `docs/decisions/`
 ## Ansible conventions
 - **FQCN always**: `ansible.builtin.template`, never `template`
-- **Tags**: every task must have at least one tag; playbooks support `--tags` filtering
+- **Tags** (ADR-019): import each role with its role-name tag once at the play level
+  (Ansible inherits it to every task). Tag a task/block with a concern tag from the
+  approved list (`tests/tags.yml`) only where it genuinely belongs to that concern —
+  don't invent tags or tag for tagging's sake. Target one axis at a time (role/service
+  *or* concern; tags are union/OR, never intersected). `make lint` enforces the vocabulary.
 - **Handlers**: use `listen:` topic strings, not direct name references
 - **Variables**: `rolename__varname` double-underscore namespace for role defaults
 - **No inline vars in playbooks**: use `group_vars/` or `host_vars/` only
@@ -144,6 +148,9 @@ Single-contributor, trunk-based (no merge requests / approval gates):
 ## Terraform conventions
 - Terraform owns VM existence only — nothing inside a VM, and no DNS records
+- Every TF-managed VM carries three Proxmox tags — `<env>`, its inventory `group`, and
+  `managed-by=terraform` — as **metadata only** (ADR-019). They do not feed inventory
+  or run-targeting; `tf_to_inventory.py` still groups by the `group` output field.
 - Internal DNS is entirely Ansible (the `dns` role renders the zone from inventory)
 - OPNsense is entirely Ansible; do not reach for a Terraform OPNsense provider
 - Environments are separate directories (`staging/`, `production/`), not workspaces
@@ -215,6 +222,7 @@ Single-contributor, trunk-based (no merge requests / approval gates):
 | Update management | `docs/decisions/011-update-management.md` |
 | Hardware & capacity | `docs/decisions/012-hardware-capacity.md` |
 | Logging & log integrity | `docs/decisions/018-logging.md` |
+| Tagging & run-targeting | `docs/decisions/019-tagging.md` |
 | Adding a new role | `docs/runbooks/new-role.md` |
 | Adding a new host | `docs/runbooks/new-host.md` |
 | Rotating vault secrets | `docs/runbooks/rotate-secrets.md` |

View file

@@ -67,6 +67,7 @@ collections:
 lint:
 	$(VENV)/bin/yamllint .
 	$(LINT)
+	$(PYTHON) scripts/check-tags.py
 # ── Testing ───────────────────────────────────────────────────────────────────

View file

@@ -112,6 +112,10 @@ _(DHCP, firewall, mDNS reflection live on OPNsense — Ansible-managed, not cont
 | Sanity / smoke | whoami + health checks | S | planned | Verification endpoints + "is it actually working" checks | ADR-011 / TODO 8.2 |
 | Service-UI verification | `/verify-service` skill | S | planned | Claude-driven exploratory Level 4 acceptance check of a deployed service's UI | Decided (ADR-017); running deferred on ubongo + playwright + Authentik |
+- **Targeted runs** (ADR-019): playbooks are sliced with `--tags` along two axes —
+  role/service (tag = role name) or a closed list of cross-cutting concerns
+  (`firewall`, `logging`, `config`, `deploy`, …); the vocabulary is lint-enforced.
 ---
 ## V4 completeness check

View file

@@ -28,11 +28,13 @@
 (all logs) + off-site security subset on `askari` + Grafana on-cluster (not the
 whole stack on `askari`). Still to design/build: Prometheus + metric exporters,
 Uptime Kuma, and exactly which alerts live where.
-7. Define a tagging standard that lets us target runs without over-tagging.
+7. ~~Define a tagging standard that lets us target runs without over-tagging.~~
+   DECIDED (ADR-019): two-tier — role-name tags (auto, at play level) + a closed
+   9-tag concern list (`tests/tags.yml`); union-only targeting; enforced by `make lint`.
 8. Ensure the right things are backed up (incl. database dumps if we land on PBS).
 9. Decide: a central database server, or individual database services per app?
 10. Should we keep the custom base-container (Molecule test image) method for role testing, or revisit it as boma's testing approach matures (ADR-008)?
-11. Deliberate tagging strategy.
+11. ~~Deliberate tagging strategy.~~ DECIDED (ADR-019) — folded into 3.7.
 4. **Split-horizon FQDN** — adopt split-horizon FQDN with or without nyumbani?

View file

@@ -0,0 +1,112 @@
# ADR-019 — Tagging standard for targeted, predictable runs
## Status
Accepted (2026-06-06). Resolves TODO 3.7 ("Define a tagging standard that lets us
target runs without over-tagging") and TODO 3.11 ("Deliberate tagging strategy").
## Context
boma wants to run playbooks **targeted** — a single service, a single layer, or a
single cross-cutting concern — **transparently and predictably**: a reader should
know from a `--tags` invocation exactly what it will and won't touch. CLAUDE.md
already requires tag-filterable tasks, but no vocabulary or convention existed, and
the TODO explicitly warns against the opposite failure mode: **over-tagging**.
## Decision
### Two-tier tagging
**Tier 1 — role/service tag (mechanical).** The tag equals the role name, applied
once at the role-import level:
```yaml
roles:
  - role: photoprism
    tags: [photoprism]
```
Ansible propagates it to every task in the role. Because one service = one role
(ADR-004), this single rule covers both the *layer/role* and *single-service*
targeting axes with zero per-task burden. Role-less lifecycle playbooks
(e.g. `bootstrap.yml`) carry a single playbook-identity tag instead.
**Tier 2 — concern tag (curated).** A small **closed list** of cross-cutting concern
tags, applied per-task/block **only where a task genuinely belongs to that concern**.
### The closed concern list
A concern earns a tag only if it (a) appears in 2+ roles, (b) is worth running as a
slice on its own, and (c) doesn't overlap confusingly with another.
| Tag | Covers |
|-----|--------|
| `packages` | apt package install/management |
| `users` | accounts, groups, sudo |
| `firewall` | nftables rulesets & port definitions (ADR-002) |
| `hardening` | security baseline — sshd config, fail2ban, auditd, sysctl |
| `logging` | Alloy / log-shipping config (ADR-018) |
| `monitoring` | metric exporters / health checks |
| `config` | render templated config/compose files to disk — **no restart** |
| `deploy` | bring services up / restart (`compose up -d`) |
| `proxy` | reverse-proxy + TLS registration (Traefik routes, Authentik) |
The `config`/`deploy` split lets you re-render and diff configuration (`--tags
config`) without bouncing services, then restart deliberately (`--tags deploy`).
`backup` and `secrets` are intentionally omitted until the roles needing them exist.
### `always` / `never`
- **`always`** — reserved for cheap preflight assertions (vault unlocked, OS is
Debian 13, required vars present), so even `--tags config` runs its safety guards.
- **`never`** — reserved for destructive/expensive opt-in tasks, each paired with a
descriptive tag (e.g. `tags: [never, force_pull]`); they run only when named.
### Predictability principle: tags are union-only
`--tags a,b` runs tasks tagged a **OR** b — Ansible has no native AND. boma therefore
targets **one axis at a time**: either a role/service *or* a concern, never an
intersection like "photoprism's firewall only." If that's ever needed, just run
`--tags photoprism` (idempotent and fast). Designing for intersection is the
over-tagging trap; we decline it on purpose.
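The union rule can be sketched as a small selection function. This is a simplified model for illustration, not Ansible's implementation: `would_run`, `selected`, and the sample task names are hypothetical, and `--skip-tags` is ignored.

```python
def would_run(task_tags, requested):
    """Simplified model of `--tags` selection (ignores --skip-tags)."""
    task_tags = set(task_tags)
    requested = set(requested)
    # `never` tasks run only when one of their tags is named explicitly.
    if "never" in task_tags:
        return bool(task_tags & requested)
    # `always` tasks run regardless of which tags were requested.
    if "always" in task_tags:
        return True
    # Everything else: union/OR. One matching tag is enough.
    return bool(task_tags & requested)


def selected(tasks, requested):
    """Return the names of the tasks that would run, in definition order."""
    return [name for name, tags in tasks.items() if would_run(tags, requested)]


# Hypothetical tasks; role tags are shown as already inherited from the play.
TASKS = {
    "render compose": {"photoprism", "config"},
    "open port": {"photoprism", "firewall"},
    "base nftables": {"base", "firewall"},
    "preflight": {"always"},
    "force pull": {"never", "force_pull"},
}

if __name__ == "__main__":
    # One axis at a time: the firewall concern across all roles.
    print(selected(TASKS, ["firewall"]))
    # Two tags widen the selection; they never narrow it to an intersection.
    print(selected(TASKS, ["photoprism", "firewall"]))
```

In this model, `photoprism,firewall` selects every photoprism task plus every firewall task in any role, which is why "photoprism's firewall only" cannot be expressed.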
### Terraform / Proxmox VM tags (metadata only)
Every Terraform-managed VM carries exactly three Proxmox tags:
| Tag | Value | Purpose |
|-----|-------|---------|
| env | `staging` \| `production` | which environment |
| role/group | `docker_hosts`, `proxmox_hosts`, … | matches the inventory group |
| managed-by | `terraform` | distinguishes IaC VMs from hand-made ones |
These are **pure metadata for transparency** (glanceable in the Proxmox UI). They do
**not** drive run-targeting and do **not** feed inventory — `scripts/tf_to_inventory.py`
keeps building groups from the `group` output field, the single source of truth.
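A minimal sketch of checking the three-tag convention against a VM's tag list. The helper `vm_tag_problems` is hypothetical and not part of the repository; it only compares strings and says nothing about how tags reach Proxmox.

```python
MANAGED_BY = "managed-by=terraform"


def vm_tag_problems(tags, env, group):
    """Compare a VM's tags with the expected three-tag set (hypothetical helper)."""
    expected = {env, group, MANAGED_BY}
    actual = set(tags)
    return {
        "missing": sorted(expected - actual),      # convention tags not present
        "unexpected": sorted(actual - expected),   # tags outside the convention
    }


if __name__ == "__main__":
    print(vm_tag_problems(["staging", "docker_hosts"], "staging", "docker_hosts"))
```

A VM carrying exactly the three tags yields two empty lists.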
## Enforcement
`tests/tags.yml` is the single source of truth for the allowed concern/special/
opt-in/playbook tags. `scripts/check-tags.py` (run by `make lint`, covered by
`tests/test_check_tags.py`) scans `roles/` and `playbooks/` and fails on any tag
outside `{role directory names} ∪ {tests/tags.yml entries}`.
Molecule scenario files (`roles/*/molecule/**`) are excluded from the scan — they are test orchestration, not the production run-targeting surface this standard governs.
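The enforcement rule, including the Molecule exclusion, can be sketched as follows. This is an assumption about one way to express it, not the code in `scripts/check-tags.py`: `is_molecule_path` and `violations` are hypothetical names, and the tags per file are passed in already collected.

```python
import pathlib


def is_molecule_path(relpath):
    """True for files matching roles/*/molecule/** (repo-relative POSIX path)."""
    parts = pathlib.PurePosixPath(relpath).parts
    return len(parts) > 3 and parts[0] == "roles" and parts[2] == "molecule"


def violations(tags_by_file, role_names, vocab):
    """Return (path, tag) pairs for tags outside role names ∪ vocabulary."""
    allowed = set(role_names) | set(vocab)
    found = []
    for relpath, used in sorted(tags_by_file.items()):
        if is_molecule_path(relpath):
            continue  # test orchestration, outside the governed surface
        found.extend((relpath, tag) for tag in sorted(set(used) - allowed))
    return found


if __name__ == "__main__":
    sample = {
        "roles/base/tasks/main.yml": {"base", "frewall"},  # typo on purpose
        "roles/base/molecule/default/converge.yml": {"molecule-only"},
        "playbooks/site.yml": {"docker_host"},
    }
    print(violations(sample, {"base", "docker_host"}, {"firewall"}))
```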
## Extending the vocabulary
To add a concern tag: (1) add it to `tests/tags.yml`; (2) add a row to the concern
table above with a one-line justification showing it passes the litmus test
(cross-cutting, 2+ roles, distinct). That is the whole gate — lightweight, but it
leaves a paper trail.
## Consequences
- Targeted runs are predictable: only two kinds of tags exist, one of them mechanical.
- Over-tagging is structurally resisted (closed list + lint enforcement).
- Intersection targeting is unavailable by design.
- Authors must keep role tags = role names. The linter enforces the *vocabulary* (every tag must be a known role name or an approved tag); the role-tag-equals-role-name rule itself is a convention the linter does not separately check.
## Related
ADR-002 (security baseline / firewall), ADR-004 (one service = one role),
ADR-009 (TF↔Ansible handoff / inventory), ADR-018 (logging).

View file

@@ -0,0 +1,728 @@
# Ansible Tagging Standard Implementation Plan
> **For agentic workers:** REQUIRED SUB-SKILL: Use superpowers:subagent-driven-development (recommended) or superpowers:executing-plans to implement this plan task-by-task. Steps use checkbox (`- [ ]`) syntax for tracking.
**Goal:** Establish a two-tier Ansible tagging standard (role-name tags + a closed concern list) with machine-enforced vocabulary, plus a Proxmox VM metadata-tag convention, so playbook runs are targeted, transparent, and predictable.
**Architecture:** A single source-of-truth YAML (`tests/tags.yml`) lists the allowed concern/special/opt-in/playbook tags. A Python checker (`scripts/check-tags.py`) scans `roles/` and `playbooks/`, computes the allowed set as `{role dir names} ∪ {tags.yml entries}`, and fails `make lint` on any unknown tag. Terraform gets a documented three-tag VM convention (metadata only). The standard is recorded as ADR-019 and folded into CLAUDE.md.
**Tech Stack:** Python 3 (stdlib + PyYAML, already present via ansible-core), pytest (already in `requirements.txt`), Make, Terraform (HCL edit only — not `init`ed), Markdown docs.
---
## File structure
| File | Responsibility | Action |
|------|----------------|--------|
| `tests/tags.yml` | Single source of truth: allowed concern/special/opt-in/playbook tags | Create |
| `scripts/check-tags.py` | Scan `roles/`+`playbooks/`, fail on tags outside the allowed set | Create |
| `tests/test_check_tags.py` | Unit tests for the checker (mirrors `tests/test_capacity_scan.py`) | Create |
| `Makefile` | Wire `check-tags.py` into the `lint` target | Modify |
| `playbooks/site.yml` | Fix `docker_host` role tag (`docker` → `docker_host`) | Modify |
| `docs/decisions/019-tagging.md` | The ADR (the standard itself) | Create |
| `CLAUDE.md` | Reword tag rule; add Proxmox tag convention; add ADR-019 to Further reading | Modify |
| `terraform/environments/staging/main.tf` | Add `managed-by=terraform` tag | Modify |
| `terraform/environments/production/main.tf` | Add `managed-by=terraform` tag | Modify |
| `docs/TODO.md` | Mark 3.7 and 3.11 DECIDED | Modify |
| `docs/CAPABILITIES.md` | Note targeted runs as a capability | Modify |
Notes for the implementer:
- The repo venv is `.venv`. Run Python as `.venv/bin/python` (Makefile vars: `PYTHON := .venv/bin/python`). If `.venv` is missing, run `make setup` first.
- PyYAML is available in the venv (ansible-core depends on it) — `import yaml` works.
- Terraform is **not** `init`ed in this repo, so `terraform validate`/`plan` will fail offline. Only use `terraform fmt` (offline-safe) for the HCL tasks.
- Before any `git commit`, the pre-commit hook decrypts `vault.yml`, so the vault agent must be unlocked: run `rbw unlocked` (exit 0 = good). If locked, ask the user to `rbw unlock` and wait. None of these tasks touch vault files, but the hook still runs.
---
### Task 1: Tag vocabulary file (`tests/tags.yml`)
**Files:**
- Create: `tests/tags.yml`
- [ ] **Step 1: Create the vocabulary file**
Create `tests/tags.yml` with exactly this content:
```yaml
---
# Allowed Ansible tag vocabulary — single source of truth for scripts/check-tags.py.
# Authoritative reference & rationale: docs/decisions/019-tagging.md.
#
# The full allowed set the linter enforces is:
#   {role directory names under roles/} ∪ everything listed below.
#
# To add a CONCERN tag: add it here AND add a row to the ADR-019 table with a
# one-line justification (cross-cutting, used in 2+ roles, distinct).
# Cross-cutting concern tags, applied per-task/block where a task belongs to the
# concern. Targeted one at a time (tags are union/OR, never intersected).
concerns:
- packages # apt package install/management
- users # accounts, groups, sudo
- firewall # nftables rulesets & port definitions (ADR-002)
- hardening # security baseline — sshd config, fail2ban, auditd, sysctl
- logging # Alloy / log-shipping config (ADR-018)
- monitoring # metric exporters / health checks
- config # render templated config/compose files to disk — no restart
- deploy # bring services up / restart (compose up -d)
- proxy # reverse-proxy + TLS registration (Traefik routes, Authentik)
# Ansible built-in special tags. Narrow use only:
# always — cheap preflight assertions (run regardless of --tags)
# never — destructive/expensive tasks, paired with an opt-in tag below
special:
- always
- never
# `never`-paired opt-in tags: destructive/expensive tasks that only run when
# named explicitly (e.g. `tags: [never, force_pull]`). Empty until a role adds one.
opt_ins: []
# Playbook-level identity tags for role-less lifecycle plays (e.g. bootstrap.yml).
playbooks:
- bootstrap
```
- [ ] **Step 2: Verify it parses and has the expected shape**
Run:
```bash
.venv/bin/python -c "import yaml; d=yaml.safe_load(open('tests/tags.yml')); assert len(d['concerns'])==9, d['concerns']; assert d['special']==['always','never']; assert d['opt_ins']==[]; assert d['playbooks']==['bootstrap']; print('tags.yml OK')"
```
Expected: prints `tags.yml OK` and exits 0.
- [ ] **Step 3: Commit**
```bash
git add tests/tags.yml
git commit -m "feat(tags): add allowed-tag vocabulary (tests/tags.yml)"
```
---
### Task 2: Checker core — tag collection & allowed-set helpers
**Files:**
- Create: `scripts/check-tags.py`
- Test: `tests/test_check_tags.py`
- [ ] **Step 1: Write the failing tests**
Create `tests/test_check_tags.py`:
```python
import importlib.util
import pathlib

# Load the hyphenated script as a module (it cannot be imported by name).
_PATH = pathlib.Path(__file__).resolve().parent.parent / "scripts" / "check-tags.py"
_spec = importlib.util.spec_from_file_location("check_tags", _PATH)
ct = importlib.util.module_from_spec(_spec)
_spec.loader.exec_module(ct)


def test_collect_tags_list_form():
    node = {"name": "t", "tags": ["firewall", "users"]}
    assert ct.collect_tags(node) == {"firewall", "users"}


def test_collect_tags_string_form():
    node = {"name": "t", "tags": "always"}
    assert ct.collect_tags(node) == {"always"}


def test_collect_tags_nested_blocks_and_roles():
    doc = [
        {"hosts": "all", "roles": [{"role": "base", "tags": ["base"]}]},
        {"block": [{"name": "x", "tags": ["config"]}], "tags": ["deploy"]},
    ]
    assert ct.collect_tags(doc) == {"base", "config", "deploy"}


def test_collect_tags_ignores_templated_values():
    node = {"tags": ["{{ dynamic }}", "logging"]}
    assert ct.collect_tags(node) == {"logging"}


def test_load_vocab_unions_all_categories():
    vocab = ct.load_vocab()
    assert "firewall" in vocab  # concern
    assert "always" in vocab  # special
    assert "bootstrap" in vocab  # playbook identity
    assert len(vocab) >= 12


def test_role_names_reads_role_dirs():
    names = ct.role_names()
    assert "base" in names
    assert "docker_host" in names
```
- [ ] **Step 2: Run tests to verify they fail**
Run: `.venv/bin/python -m pytest tests/test_check_tags.py -v`
Expected: FAIL — `ModuleNotFoundError` / file not found for `scripts/check-tags.py` (the module can't be imported yet).
- [ ] **Step 3: Write the minimal implementation**
Create `scripts/check-tags.py`:
```python
#!/usr/bin/env python3
"""
Validate that every Ansible tag used under roles/ and playbooks/ belongs to the
approved vocabulary. Single source of truth: tests/tags.yml. Rationale: ADR-019.

Allowed set = {role directory names under roles/} ∪ {concerns, special, opt_ins,
playbooks from tests/tags.yml}. Templated tags (containing "{{") are skipped —
they can't be statically validated.

Usage: python3 scripts/check-tags.py
Exit 0 = all tags allowed; exit 1 = unknown tag(s) found.
"""
import pathlib
import sys

import yaml

REPO = pathlib.Path(__file__).resolve().parent.parent
VOCAB_FILE = REPO / "tests" / "tags.yml"
SCAN_DIRS = ("roles", "playbooks")


class _IgnoreUnknownTags(yaml.SafeLoader):
    """SafeLoader that tolerates custom YAML tags (e.g. !vault) instead of crashing."""


def _ignore(loader, tag_suffix, node):
    return None


_IgnoreUnknownTags.add_multi_constructor("", _ignore)
_IgnoreUnknownTags.add_multi_constructor("!", _ignore)


def _static_str(value):
    return isinstance(value, str) and "{{" not in value


def load_vocab(path=VOCAB_FILE):
    data = yaml.safe_load(path.read_text()) or {}
    vocab = set()
    for key in ("concerns", "special", "opt_ins", "playbooks"):
        vocab.update(data.get(key) or [])
    return vocab


def role_names(repo=REPO):
    roles_dir = repo / "roles"
    if not roles_dir.is_dir():
        return set()
    return {p.name for p in roles_dir.iterdir() if p.is_dir()}


def collect_tags(node):
    """Recursively collect every static tag string under any 'tags:' key."""
    tags = set()
    if isinstance(node, dict):
        for key, value in node.items():
            if key == "tags":
                if _static_str(value):
                    tags.add(value)
                elif isinstance(value, list):
                    tags.update(t for t in value if _static_str(t))
            # Recurse into every value so nested blocks and roles are covered.
            tags |= collect_tags(value)
    elif isinstance(node, list):
        for item in node:
            tags |= collect_tags(item)
    return tags


if __name__ == "__main__":  # pragma: no cover
    sys.exit(0)
```
- [ ] **Step 4: Run tests to verify they pass**
Run: `.venv/bin/python -m pytest tests/test_check_tags.py -v`
Expected: PASS (all 6 tests).
- [ ] **Step 5: Commit**
```bash
git add scripts/check-tags.py tests/test_check_tags.py
git commit -m "feat(tags): checker helpers — tag collection & allowed-set"
```
---
### Task 3: Checker validation — scan files and fail on unknown tags
**Files:**
- Modify: `scripts/check-tags.py`
- Test: `tests/test_check_tags.py`
- [ ] **Step 1: Write the failing tests**
Append to `tests/test_check_tags.py`:
```python
def test_scan_text_collects_from_yaml_string():
    text = """
- hosts: all
  roles:
    - role: base
      tags: [base]
  tasks:
    - name: open port
      tags: [firewall]
"""
    assert ct.scan_text(text) == {"base", "firewall"}


def test_scan_text_tolerates_custom_yaml_tags():
    text = "- name: t\n  secret: !vault xxx\n  tags: [users]\n"
    assert ct.scan_text(text) == {"users"}


def test_find_violations_flags_unknown_tag():
    allowed = {"base", "firewall"}
    used = {"base", "frewall"}  # typo
    assert ct.find_violations(used, allowed) == ["frewall"]


def test_find_violations_empty_when_all_allowed():
    assert ct.find_violations({"base", "firewall"}, {"base", "firewall"}) == []
```
- [ ] **Step 2: Run tests to verify they fail**
Run: `.venv/bin/python -m pytest tests/test_check_tags.py -v`
Expected: FAIL — `AttributeError: module 'check_tags' has no attribute 'scan_text'` (and `find_violations`).
- [ ] **Step 3: Add the scanning + validation functions**
In `scripts/check-tags.py`, replace the final block:
```python
if __name__ == "__main__":  # pragma: no cover
    sys.exit(0)
```
with:
```python
def scan_text(text):
    """Collect static tags from a (possibly multi-document) YAML string."""
    found = set()
    for doc in yaml.load_all(text, Loader=_IgnoreUnknownTags):
        found |= collect_tags(doc)
    return found


def iter_yaml_files(repo=REPO, scan_dirs=SCAN_DIRS):
    for name in scan_dirs:
        base = repo / name
        if not base.is_dir():
            continue
        for ext in ("*.yml", "*.yaml"):
            yield from sorted(base.rglob(ext))


def find_violations(used, allowed):
    return sorted(used - allowed)


def main():
    allowed = load_vocab() | role_names()
    violations = []
    for path in iter_yaml_files():
        try:
            used = scan_text(path.read_text())
        except yaml.YAMLError as exc:
            # An unparseable file is reported but does not fail the check.
            print(f"warning: could not parse {path}: {exc}", file=sys.stderr)
            continue
        for tag in find_violations(used, allowed):
            violations.append((path.relative_to(REPO), tag))
    if violations:
        print(
            "error: Ansible tag(s) not in tests/tags.yml or role names "
            "(see docs/decisions/019-tagging.md):",
            file=sys.stderr,
        )
        for relpath, tag in violations:
            print(f"  {relpath}: '{tag}'", file=sys.stderr)
        print(f"\nallowed: {', '.join(sorted(allowed))}", file=sys.stderr)
        sys.exit(1)
    print(f"check-tags: OK ({len(allowed)} tags allowed across {len(SCAN_DIRS)} dirs)")


if __name__ == "__main__":
    main()
```
- [ ] **Step 4: Run tests to verify they pass**
Run: `.venv/bin/python -m pytest tests/test_check_tags.py -v`
Expected: PASS (all 10 tests).
- [ ] **Step 5: Commit**
```bash
git add scripts/check-tags.py tests/test_check_tags.py
git commit -m "feat(tags): scan roles/+playbooks/ and fail on unknown tags"
```
---
### Task 4: Reconcile existing tags & wire into `make lint`
**Files:**
- Modify: `playbooks/site.yml:18-19`
- Modify: `Makefile` (the `lint:` target)
- [ ] **Step 1: Run the checker against the current repo (expect one violation)**
Run: `.venv/bin/python scripts/check-tags.py`
Expected: FAIL (exit 1) reporting `playbooks/site.yml: 'docker'` — because the `docker_host` role is tagged `[docker]`, which is neither a role name nor a vocabulary tag. This confirms the checker works end-to-end.
- [ ] **Step 2: Fix the role tag to equal the role name**
In `playbooks/site.yml`, change:
```yaml
- role: docker_host
  tags: [docker]
```
to:
```yaml
- role: docker_host
  tags: [docker_host]
```
- [ ] **Step 3: Re-run the checker (expect clean)**
Run: `.venv/bin/python scripts/check-tags.py`
Expected: PASS — prints `check-tags: OK (... tags allowed across 2 dirs)` and exits 0.
(Allowed set now includes role names `base`, `docker_host`; used tags are `base`, `docker_host`, `bootstrap` — all allowed.)
- [ ] **Step 4: Wire the checker into `make lint`**
In `Makefile`, change the `lint:` target from:
```makefile
lint:
	$(VENV)/bin/yamllint .
	$(LINT)
```
to:
```makefile
lint:
	$(VENV)/bin/yamllint .
	$(LINT)
	$(PYTHON) scripts/check-tags.py
```
- [ ] **Step 5: Run the full lint suite and the test suite**
Run: `make lint && .venv/bin/python -m pytest tests/test_check_tags.py -v`
Expected: yamllint passes, ansible-lint passes, `check-tags: OK`, and all pytest tests PASS.
- [ ] **Step 6: Commit**
```bash
git add playbooks/site.yml Makefile
git commit -m "feat(tags): enforce tag vocabulary in make lint; fix docker_host tag"
```
---
### Task 5: Terraform Proxmox VM tag convention
**Files:**
- Modify: `terraform/environments/staging/main.tf` (the `tags =` line in `module "vms"`)
- Modify: `terraform/environments/production/main.tf` (the `tags =` line in `module "vms"`)
- [ ] **Step 1: Add `managed-by=terraform` to the staging VM tags**
In `terraform/environments/staging/main.tf`, change:
```hcl
tags = ["staging", each.value.group]
```
to:
```hcl
tags = ["staging", each.value.group, "managed-by=terraform"]
```
- [ ] **Step 2: Add `managed-by=terraform` to the production VM tags**
In `terraform/environments/production/main.tf`, change:
```hcl
tags = ["production", each.value.group]
```
to:
```hcl
tags = ["production", each.value.group, "managed-by=terraform"]
```
- [ ] **Step 3: Format-check the HCL (offline-safe)**
Run: `terraform -chdir=terraform/environments/staging fmt && terraform -chdir=terraform/environments/production fmt`
Expected: either no output (already formatted) or the filename printed (reformatted). Exit 0.
(Do NOT run `terraform validate`/`plan` — Terraform is not `init`ed in this repo and they will fail offline.)
- [ ] **Step 4: Confirm the edits**
Run: `grep -n "managed-by=terraform" terraform/environments/staging/main.tf terraform/environments/production/main.tf`
Expected: one match in each file.
- [ ] **Step 5: Commit**
```bash
git add terraform/environments/staging/main.tf terraform/environments/production/main.tf
git commit -m "feat(tags): Proxmox VM metadata convention (managed-by=terraform)"
```
---
### Task 6: Documentation — ADR-019, CLAUDE.md, TODO, CAPABILITIES
**Files:**
- Create: `docs/decisions/019-tagging.md`
- Modify: `CLAUDE.md` (Ansible conventions; Terraform conventions; Further reading)
- Modify: `docs/TODO.md` (items 3.7 and 3.11)
- Modify: `docs/CAPABILITIES.md`
- [ ] **Step 1: Write the ADR**
Create `docs/decisions/019-tagging.md`:
````markdown
# ADR-019 — Tagging standard for targeted, predictable runs
## Status
Accepted (2026-06-06). Resolves TODO 3.7 ("Define a tagging standard that lets us
target runs without over-tagging") and TODO 3.11 ("Deliberate tagging strategy").
## Context
boma wants to run playbooks **targeted** — a single service, a single layer, or a
single cross-cutting concern — **transparently and predictably**: a reader should
know from a `--tags` invocation exactly what it will and won't touch. CLAUDE.md
already requires tag-filterable tasks, but no vocabulary or convention existed, and
the TODO explicitly warns against the opposite failure mode: **over-tagging**.
## Decision
### Two-tier tagging
**Tier 1 — role/service tag (mechanical).** The tag equals the role name, applied
once at the role-import level:
```yaml
roles:
  - role: photoprism
    tags: [photoprism]
```
Ansible propagates it to every task in the role. Because one service = one role
(ADR-004), this single rule covers both the *layer/role* and *single-service*
targeting axes with zero per-task burden. Role-less lifecycle playbooks
(e.g. `bootstrap.yml`) carry a single playbook-identity tag instead.
**Tier 2 — concern tag (curated).** A small **closed list** of cross-cutting concern
tags, applied per-task/block **only where a task genuinely belongs to that concern**.
### The closed concern list
A concern earns a tag only if it (a) appears in 2+ roles, (b) is worth running as a
slice on its own, and (c) doesn't overlap confusingly with another.
| Tag | Covers |
|-----|--------|
| `packages` | apt package install/management |
| `users` | accounts, groups, sudo |
| `firewall` | nftables rulesets & port definitions (ADR-002) |
| `hardening` | security baseline — sshd config, fail2ban, auditd, sysctl |
| `logging` | Alloy / log-shipping config (ADR-018) |
| `monitoring` | metric exporters / health checks |
| `config` | render templated config/compose files to disk — **no restart** |
| `deploy` | bring services up / restart (`compose up -d`) |
| `proxy` | reverse-proxy + TLS registration (Traefik routes, Authentik) |
The `config`/`deploy` split lets you re-render and diff configuration (`--tags
config`) without bouncing services, then restart deliberately (`--tags deploy`).
`backup` and `secrets` are intentionally omitted until the roles needing them exist.
### `always` / `never`
- **`always`** — reserved for cheap preflight assertions (vault unlocked, OS is
Debian 13, required vars present), so even `--tags config` runs its safety guards.
- **`never`** — reserved for destructive/expensive opt-in tasks, each paired with a
descriptive tag (e.g. `tags: [never, force_pull]`); they run only when named.
### Predictability principle: tags are union-only
`--tags a,b` runs tasks tagged a **OR** b — Ansible has no native AND. boma therefore
targets **one axis at a time**: either a role/service *or* a concern, never an
intersection like "photoprism's firewall only." If that's ever needed, just run
`--tags photoprism` (idempotent and fast). Designing for intersection is the
over-tagging trap; we decline it on purpose.
### Terraform / Proxmox VM tags (metadata only)
Every Terraform-managed VM carries exactly three Proxmox tags:
| Tag | Value | Purpose |
|-----|-------|---------|
| env | `staging` \| `production` | which environment |
| role/group | `docker_hosts`, `proxmox_hosts`, … | matches the inventory group |
| managed-by | `terraform` | distinguishes IaC VMs from hand-made ones |
These are **pure metadata for transparency** (glanceable in the Proxmox UI). They do
**not** drive run-targeting and do **not** feed inventory — `scripts/tf_to_inventory.py`
keeps building groups from the `group` output field, the single source of truth.
## Enforcement
`tests/tags.yml` is the single source of truth for the allowed concern/special/
opt-in/playbook tags. `scripts/check-tags.py` (run by `make lint`, covered by
`tests/test_check_tags.py`) scans `roles/` and `playbooks/` and fails on any tag
outside `{role directory names} ∪ {tests/tags.yml entries}`.
## Extending the vocabulary
To add a concern tag: (1) add it to `tests/tags.yml`; (2) add a row to the concern
table above with a one-line justification showing it passes the litmus test
(cross-cutting, 2+ roles, distinct). That is the whole gate — lightweight, but it
leaves a paper trail.
## Consequences
- Targeted runs are predictable: only two kinds of tags exist, one of them mechanical.
- Over-tagging is structurally resisted (closed list + lint enforcement).
- Intersection targeting is unavailable by design.
- Authors must keep role tags = role names; the linter enforces it.
## Related
ADR-002 (security baseline / firewall), ADR-004 (one service = one role),
ADR-009 (TF↔Ansible handoff / inventory), ADR-018 (logging).
````
- [ ] **Step 2: Reword the tag rule in CLAUDE.md**
In `CLAUDE.md`, under **Ansible conventions**, change:
```markdown
- **Tags**: every task must have at least one tag; playbooks support `--tags` filtering
```
to:
```markdown
- **Tags** (ADR-019): import each role with its role-name tag once at the play level
(Ansible inherits it to every task). Tag a task/block with a concern tag from the
approved list (`tests/tags.yml`) only where it genuinely belongs to that concern —
don't invent tags or tag for tagging's sake. Target one axis at a time (role/service
*or* concern; tags are union/OR, never intersected). `make lint` enforces the vocabulary.
```
- [ ] **Step 3: Add the Proxmox tag convention to CLAUDE.md**
In `CLAUDE.md`, under **Terraform conventions**, add this bullet after the existing
"Terraform owns VM existence only" bullet:
```markdown
- Every TF-managed VM carries three Proxmox tags — `<env>`, its inventory `group`, and
`managed-by=terraform` — as **metadata only** (ADR-019). They do not feed inventory
or run-targeting; `tf_to_inventory.py` still groups by the `group` output field.
```
- [ ] **Step 4: Add ADR-019 to the Further reading table**
In `CLAUDE.md`, in the **Further reading** table, add this row immediately after the
`Logging & log integrity` row:
```markdown
| Tagging & run-targeting | `docs/decisions/019-tagging.md` |
```
- [ ] **Step 5: Mark the TODO items decided**
In `docs/TODO.md`, change the line for item 3.7:
```markdown
7. Define a tagging standard that lets us target runs without over-tagging.
```
to:
```markdown
7. ~~Define a tagging standard that lets us target runs without over-tagging.~~
DECIDED (ADR-019): two-tier — role-name tags (auto, at play level) + a closed
9-tag concern list (`tests/tags.yml`); union-only targeting; enforced by `make lint`.
```
and change item 3.11:
```markdown
11. Deliberate tagging strategy.
```
to:
```markdown
11. ~~Deliberate tagging strategy.~~ DECIDED (ADR-019) — folded into 3.7.
```
- [ ] **Step 6: Note the capability in CAPABILITIES.md**
Run: `grep -n "^## \|^### " docs/CAPABILITIES.md` to locate the section covering
operations / CI / how playbooks are run. Add this bullet under the most appropriate
existing section (operations or testing/CI):
```markdown
- **Targeted runs** (ADR-019): playbooks are sliced with `--tags` along two axes —
role/service (tag = role name) or a closed list of cross-cutting concerns
(`firewall`, `logging`, `config`, `deploy`, …); the vocabulary is lint-enforced.
```
- [ ] **Step 7: Verify docs are consistent and lint still passes**
Run:
```bash
grep -n "019-tagging" CLAUDE.md && grep -c "managed-by=terraform" CLAUDE.md && make lint
```
Expected: the ADR-019 row is found in CLAUDE.md, `managed-by=terraform` appears at
least once, and `make lint` passes (including `check-tags: OK`).
- [ ] **Step 8: Commit**
```bash
git add docs/decisions/019-tagging.md CLAUDE.md docs/TODO.md docs/CAPABILITIES.md
git commit -m "docs(tags): ADR-019 + CLAUDE.md/TODO/CAPABILITIES (tagging standard)"
```
---
## Final verification
- [ ] Run the full suite once more: `make lint && .venv/bin/python -m pytest tests/ -v`
Expected: yamllint + ansible-lint pass, `check-tags: OK`, all tests PASS.
- [ ] Confirm a deliberate violation is caught: temporarily add `tags: [bogus]` to a
task in `playbooks/site.yml`, run `.venv/bin/python scripts/check-tags.py`, confirm it
exits 1 reporting `'bogus'`, then revert the edit.
- [ ] `git log --oneline -7` shows the six task commits.


@ -0,0 +1,188 @@
# Design — Ansible tagging standard (targeted, predictable runs)
- **Date:** 2026-06-06
- **Status:** Approved design — pending implementation plan
- **Resolves:** TODO 3.7 ("Define a tagging standard that lets us target runs without
over-tagging") and TODO 3.11 ("Deliberate tagging strategy") — the same thread
- **Becomes:** ADR-019 (this design is the basis for that ADR)
---
## Problem
boma wants to run playbooks **targeted** — a single service, a single layer, or a
single cross-cutting concern — and to do so **transparently and predictably**: you
should be able to look at a `--tags` invocation and know exactly what it will and won't
touch. CLAUDE.md already mandates that every task be tag-filterable, but no *vocabulary*
or *naming convention* exists. Without one, tags proliferate ad-hoc per role and the
"predictable" property is lost — and the TODO explicitly warns against the opposite
failure mode, **over-tagging**.
The repo is effectively greenfield for this: `base` and `docker_host` are empty, and the
only tags in existence are `[base]`/`[docker]` in `site.yml` and `[bootstrap]` in
`bootstrap.yml`. So we can bake the standard into role-authoring conventions *before*
there are a dozen service roles to retrofit.
## Targeting axes (what we want to slice by)
1. **Layer / role**: `--tags base`, `--tags docker`
2. **Single service**: `--tags photoprism`, `--tags traefik`
3. **Concern / function**: `--tags firewall`, `--tags logging`, …
Lifecycle phases (bootstrap/config/deploy) are **not** a tag axis — `bootstrap.yml` vs
`site.yml` already separate those as whole playbooks.
Key simplification: because of ADR-004 (*one service = one role*, role name = service
name), axes 1 and 2 are the **same mechanism** — a tag equal to the role name. Only the
concern axis needs a curated vocabulary.
## Approach (chosen): two-tier tagging
**Tier 1 — role/service tag (mechanical).** The tag *equals the role name*, applied
**once** at the role-import level in the playbook:
```yaml
roles:
  - role: photoprism
    tags: [photoprism]
```
Ansible propagates the tag to every task in the role. This covers both the layer/role
and single-service axes with one rule and **zero per-task burden**.
**Tier 2 — concern tag (curated).** A small **closed, documented list** of cross-cutting
concern tags, applied per-task/block **only where a task genuinely belongs to that
concern**. `--tags firewall` then hits firewall tasks in `base` and in every service
role.
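
The two tiers can be pictured with a rough Python model. This is only a sketch of the inheritance idea, not how Ansible implements it; `effective_tags` and the sample task names are hypothetical and exist purely for illustration.

```python
def effective_tags(role_tag, task_tags=()):
    """Model tag inheritance: the play-level role tag is added to every task."""
    return {role_tag} | set(task_tags)


# Hypothetical tasks inside a 'photoprism' role; only one carries a concern tag.
tasks = {
    "render compose file": ["config"],
    "create data directory": [],
}

for name, concern_tags in tasks.items():
    print(name, sorted(effective_tags("photoprism", concern_tags)))
```

Every task ends up reachable through the role tag, while only the tasks that were deliberately tagged are reachable through a concern.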
Rejected alternatives: *concern-only/flat* (loses natural `--tags <service>` ergonomics);
*rich multi-dimensional* (role+service+concern+lifecycle+ad-hoc per task) — that is
precisely the over-tagging the TODO warns against.
## The closed concern list
Litmus test for earning a spot: a concern must (a) appear in **2+ roles**, (b) be
something you'd realistically want to run as a slice on its own, and (c) not overlap
confusingly with another.
**Baseline concerns** (mostly in `base`, some echoed in service roles):
| Tag | Covers |
|-----|--------|
| `packages` | apt package install/management |
| `users` | accounts, groups, sudo |
| `firewall` | nftables rulesets & port definitions (ADR-002) |
| `hardening` | security baseline — sshd config, fail2ban, auditd, sysctl |
| `logging` | Alloy / log-shipping config (ADR-018) |
| `monitoring` | metric exporters / health checks |
**Service concerns** (in every service role, ADR-004):
| Tag | Covers |
|-----|--------|
| `config` | render templated config/compose files to disk — **no restart** |
| `deploy` | bring services up / restart (`compose up -d`) |
| `proxy` | reverse-proxy + TLS registration (Traefik routes, Authentik) |
Nine tags total. The `config`/`deploy` split is deliberate and high-value: `--tags
config` re-renders and lets you diff configuration without bouncing services; `--tags
deploy` does the restart.
`backup` and `secrets` are **intentionally omitted** until the roles that need them
exist — they enter via the extend process, not speculative reservation.
## `always` / `never` policy
boma uses Ansible's two built-in special tags, narrowly:
- **`always`** — reserved strictly for **cheap preflight assertions** (vault unlocked,
OS is Debian 13, required vars present). Ensures even `--tags config` runs its safety
guards.
- **`never`** — reserved for **destructive/expensive opt-in tasks**, each paired with a
descriptive tag (e.g. `never, force_pull` or `never, restore`). They never run unless
explicitly named, keeping dangerous actions out of normal runs. The descriptive
partner tag is a documented `never`-paired opt-in (allowed by the linter).
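
As a simplified illustration of how these two special tags affect selection, the sketch below models a single task. It is an approximation written for this document, not Ansible's code: the `runs` helper is hypothetical, and `--skip-tags` as well as the special values `all`, `tagged`, and `untagged` are deliberately left out.

```python
def runs(task_tags, requested=None):
    """Simplified model of Ansible tag selection for a single task.

    requested=None models a run without --tags.
    """
    tags = set(task_tags)
    if requested is None:
        # No --tags: everything runs except tasks tagged 'never'.
        return "never" not in tags
    requested = set(requested)
    if "never" in tags:
        # 'never' tasks run only when one of their tags is named explicitly.
        return bool(tags & requested)
    # 'always' tasks run regardless of which tags were requested.
    return "always" in tags or bool(tags & requested)


print(runs(["always"], ["config"]))                  # preflight guard still runs
print(runs(["never", "force_pull"]))                 # skipped in a normal run
print(runs(["never", "force_pull"], ["force_pull"]))  # runs only when named
```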
## Predictability principle: tags are union-only
`--tags a,b` runs tasks tagged a **OR** b — Ansible has no native AND. Rather than fight
this, we make it an explicit principle: **boma targets one axis at a time**: *either* a
role/service (`--tags photoprism`) *or* a concern (`--tags firewall`), never an
intersection like "photoprism's firewall only." If that is ever genuinely needed, the
answer is "just run `--tags photoprism`" (idempotent and fast). Designing for
intersection is the over-tagging trap; we decline it on purpose.
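
The union behaviour is easy to see in a toy model. The sketch below assumes three hypothetical tasks with their effective tags (role tag plus any concern tag); `TASKS` and `select` are illustrative names, not part of the repo.

```python
# Hypothetical tasks with their effective tags (role tag plus any concern tag).
TASKS = {
    "base: open ssh port": {"base", "firewall"},
    "photoprism: open web port": {"photoprism", "firewall"},
    "photoprism: render compose": {"photoprism", "config"},
}


def select(requested):
    """Return the tasks matched by --tags: any shared tag is enough (OR)."""
    return sorted(name for name, tags in TASKS.items() if tags & set(requested))


print(select(["photoprism", "firewall"]))  # all three tasks, not just one
print(select(["photoprism"]))              # the two photoprism tasks
```

Asking for `photoprism,firewall` widens the run instead of narrowing it, which is why the single-axis rule keeps invocations readable.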
## Reconciling the existing CLAUDE.md rule
CLAUDE.md currently says *"every task must have at least one tag."* Under the two-tier
model the role tag is applied **once at the play/import level** and **inherited** by
every task, so tasks are always reachable without hand-tagging each one. The rule is
**reworded** to:
> Import each role with its role-name tag (once, at the play level). Within a role, tag a
> task/block with a concern tag from the approved list **only where it genuinely belongs
> to that concern** — don't invent tags or tag for tagging's sake.
This directly resolves the "without over-tagging" tension.
## Terraform / Proxmox VM tags (metadata only)
Formalize the convention that already half-exists in `staging/main.tf`
(`tags = ["staging", each.value.group]`). Every TF-managed VM gets exactly three tags:
| Tag | Value | Purpose |
|-----|-------|---------|
| env | `staging` \| `production` | which environment |
| role/group | `docker_hosts`, `proxmox_hosts`, … | matches the inventory group |
| managed-by | `terraform` | distinguishes IaC VMs from hand-made ones |
Set as `tags = ["${env}", each.value.group, "managed-by=terraform"]` in the env
`main.tf` (env is constant per directory).
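
For readers who think in Python rather than HCL, the convention amounts to the following. This is a hypothetical sketch of the rule only; `vm_tags`, `ENVS`, and `MANAGED_BY` are made-up names, and the real tags are set in Terraform as shown above.

```python
ENVS = {"staging", "production"}
MANAGED_BY = "managed-by=terraform"


def vm_tags(env, group):
    """Build the three-tag list for one VM, mirroring the main.tf expression."""
    if env not in ENVS:
        raise ValueError(f"unknown environment: {env!r}")
    return [env, group, MANAGED_BY]


print(vm_tags("staging", "docker_hosts"))
```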
**Explicit non-goals** (stated so nobody wires them up later): these tags are **pure
metadata for transparency** — glanceable in the Proxmox UI. They do **not** drive
run-targeting and do **not** feed inventory. `scripts/tf_to_inventory.py` keeps building
groups from the `group` output field, which stays the single source of truth.
## Enforcement
A small **lint check wired into `make lint`**: a script collects every `tags:` value
across `roles/` and `playbooks/` and fails if any tag is not in the allowed set:
```
{role names} ∪ {9 concern tags} ∪ {always, never} ∪ {documented never-paired opt-ins}
```
The allowed concern list (and the `never`-paired opt-ins) live in **one
machine-readable file, `tests/tags.yml`**, which both the linter reads and the ADR
documents — so doc and enforcement cannot drift. This is more honest than ansible-lint's
limited built-in tags rule. A unit test (mirroring `tests/test_capacity_scan.py`) covers
the checker.
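
The core of such a check could look roughly like this. It is a minimal sketch under stated assumptions, not the final script: the vocabulary is passed in as a plain dict instead of being read from `tests/tags.yml`, and `allowed_tags` and `unknown_tags` are hypothetical helper names.

```python
def allowed_tags(role_names, vocab):
    """Union of role names and every category listed in the vocabulary."""
    allowed = set(role_names)
    for entries in vocab.values():
        allowed.update(entries or [])
    return allowed


def unknown_tags(used, allowed):
    """Tags that appear in the repo but are not in the allowed set."""
    return sorted(set(used) - allowed)


# Hypothetical inputs standing in for roles/ and tests/tags.yml.
vocab = {"concerns": ["firewall", "config"], "special": ["always", "never"], "opt_ins": []}
allowed = allowed_tags({"base", "docker_host"}, vocab)
print(unknown_tags({"base", "firewall", "frewall"}, allowed))
```

Because the check is a plain set difference, a typo such as `frewall` surfaces as an unknown tag instead of silently creating a new one.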
## The "propose to extend" process
To add a concern tag: (1) add it to `tests/tags.yml`; (2) add a row to the ADR-019 table
with a one-line justification showing it passes the litmus test (cross-cutting, 2+
roles, distinct). That is the whole gate — lightweight, but it leaves a paper trail.
## Deliverables
- **New `docs/decisions/019-tagging.md`** — the standard: rationale, two-tier model,
concern table, union-only principle, `always`/`never` policy, Proxmox tag convention,
extend process.
- **`tests/tags.yml`** — machine-readable allowed concern list + `never`-paired opt-ins.
- **Lint checker script** (e.g. `scripts/check-tags.py`) + **`make lint`** wiring +
**`tests/test_check_tags.py`**.
- **CLAUDE.md** — reword the tag bullet under *Ansible conventions*; add the Proxmox tag
convention under *Terraform conventions*; add ADR-019 to *Further reading*.
- **`terraform/environments/{staging,production}/main.tf`** — apply the three-tag
convention.
- **`docs/TODO.md`** — mark 3.7 and 3.11 DECIDED (ADR-019).
- **`docs/CAPABILITIES.md`** — note targeted runs as a capability, if it fits.
## Out of scope
- Intersection targeting (role ∩ concern) — declined on purpose (see principle).
- Lifecycle-phase tags — handled by separate playbooks.
- Proxmox tags feeding inventory or run-targeting — metadata only.
- `backup`/`secrets` concern tags — added later via the extend process.


@ -16,4 +16,4 @@
   become: true
   roles:
     - role: docker_host
-      tags: [docker]
+      tags: [docker_host]

scripts/check-tags.py Normal file

@ -0,0 +1,124 @@
#!/usr/bin/env python3
"""
Validate that every Ansible tag used under roles/ and playbooks/ belongs to the
approved vocabulary. Single source of truth: tests/tags.yml. Rationale: ADR-019.

Allowed set = {role directory names under roles/} ∪ {concerns, special, opt_ins,
playbooks from tests/tags.yml}. Templated tags (containing "{{") are skipped;
they can't be statically validated.

Usage: python3 scripts/check-tags.py
Exit 0 = all tags allowed; exit 1 = unknown tag(s) found.
"""
import pathlib
import sys

import yaml

REPO = pathlib.Path(__file__).resolve().parent.parent
VOCAB_FILE = REPO / "tests" / "tags.yml"
SCAN_DIRS = ("roles", "playbooks")


class _IgnoreUnknownTags(yaml.SafeLoader):
    """SafeLoader that tolerates custom YAML tags (e.g. !vault) instead of crashing."""


def _ignore(loader, tag_suffix, node):
    return None


_IgnoreUnknownTags.add_multi_constructor("", _ignore)


def _static_str(value):
    return isinstance(value, str) and "{{" not in value


def load_vocab(path=VOCAB_FILE):
    data = yaml.safe_load(path.read_text()) or {}
    vocab = set()
    for key in ("concerns", "special", "opt_ins", "playbooks"):
        vocab.update(data.get(key) or [])
    return vocab


def role_names(repo=REPO):
    roles_dir = repo / "roles"
    if not roles_dir.is_dir():
        return set()
    return {p.name for p in roles_dir.iterdir() if p.is_dir()}


def collect_tags(node):
    """Recursively collect every static tag string under any 'tags:' key."""
    # Matches any dict key literally named `tags`; Ansible-tag semantics assumed.
    tags = set()
    if isinstance(node, dict):
        for key, value in node.items():
            if key == "tags":
                if _static_str(value):
                    tags.add(value)
                elif isinstance(value, list):
                    tags.update(t for t in value if _static_str(t))
            tags |= collect_tags(value)
    elif isinstance(node, list):
        for item in node:
            tags |= collect_tags(item)
    return tags


def scan_text(text):
    """Collect static tags from a (possibly multi-document) YAML string."""
    found = set()
    for doc in yaml.load_all(text, Loader=_IgnoreUnknownTags):
        found |= collect_tags(doc)
    return found


def iter_yaml_files(repo=REPO, scan_dirs=SCAN_DIRS):
    for name in scan_dirs:
        base = repo / name
        if not base.is_dir():
            continue
        for ext in ("*.yml", "*.yaml"):
            for path in sorted(base.rglob(ext)):
                # Molecule scenarios are test orchestration, not the production
                # run-targeting surface this standard governs (ADR-019). Skip them.
                if "molecule" in path.relative_to(base).parts:
                    continue
                yield path


def find_violations(used, allowed):
    return sorted(used - allowed)


def main():
    allowed = load_vocab() | role_names()
    violations = []
    for path in iter_yaml_files():
        try:
            used = scan_text(path.read_text())
        except yaml.YAMLError as exc:
            print(f"warning: could not parse {path}: {exc}", file=sys.stderr)
            continue
        for tag in find_violations(used, allowed):
            violations.append((path.relative_to(REPO), tag))
    if violations:
        print(
            "error: Ansible tag(s) not in tests/tags.yml or role names "
            "(see docs/decisions/019-tagging.md):",
            file=sys.stderr,
        )
        for relpath, tag in violations:
            print(f" {relpath}: '{tag}'", file=sys.stderr)
        print(f"\nallowed: {', '.join(sorted(allowed))}", file=sys.stderr)
        sys.exit(1)
    print(f"check-tags: OK ({len(allowed)} tags allowed across {len(SCAN_DIRS)} dirs)")


if __name__ == "__main__":
    main()


@ -35,7 +35,7 @@ module "vms" {
   ssh_public_keys = var.ssh_public_keys
   cores = each.value.cores
   memory_mb = each.value.memory_mb
-  tags = ["production", each.value.group]
+  tags = ["production", each.value.group, "managed-by=terraform"]
 }

 # Internal DNS records are NOT managed here. Terraform owns VM existence only;


@ -29,7 +29,7 @@ module "vms" {
   ssh_public_keys = var.ssh_public_keys
   cores = each.value.cores
   memory_mb = each.value.memory_mb
-  tags = ["staging", each.value.group]
+  tags = ["staging", each.value.group, "managed-by=terraform"]
 }

 # Internal DNS records are NOT managed here. Terraform owns VM existence only;

tests/tags.yml Normal file

@ -0,0 +1,37 @@
---
# Allowed Ansible tag vocabulary — single source of truth for scripts/check-tags.py.
# Authoritative reference & rationale: docs/decisions/019-tagging.md.
#
# The full allowed set the linter enforces is:
# {role directory names under roles/} ∪ everything listed below.
#
# To add a CONCERN tag: add it here AND add a row to the ADR-019 table with a
# one-line justification (cross-cutting, used in 2+ roles, distinct).
# Cross-cutting concern tags, applied per-task/block where a task belongs to the
# concern. Targeted one at a time (tags are union/OR, never intersected).
concerns:
- packages # apt package install/management
- users # accounts, groups, sudo
- firewall # nftables rulesets & port definitions (ADR-002)
- hardening # security baseline — sshd config, fail2ban, auditd, sysctl
- logging # Alloy / log-shipping config (ADR-018)
- monitoring # metric exporters / health checks
- config # render templated config/compose files to disk — no restart
- deploy # bring services up / restart (compose up -d)
- proxy # reverse-proxy + TLS registration (Traefik routes, Authentik)
# Ansible built-in special tags. Narrow use only:
# always — cheap preflight assertions (run regardless of --tags)
# never — destructive/expensive tasks, paired with an opt-in tag below
special:
- always
- never
# `never`-paired opt-in tags: destructive/expensive tasks that only run when
# named explicitly (e.g. `tags: [never, force_pull]`). Empty until a role adds one.
opt_ins: []
# Playbook-level identity tags for role-less lifecycle plays (e.g. bootstrap.yml).
playbooks:
- bootstrap

tests/test_check_tags.py Normal file

@ -0,0 +1,85 @@
import importlib.util
import pathlib

_PATH = pathlib.Path(__file__).resolve().parent.parent / "scripts" / "check-tags.py"
_spec = importlib.util.spec_from_file_location("check_tags", _PATH)
ct = importlib.util.module_from_spec(_spec)
_spec.loader.exec_module(ct)


def test_collect_tags_list_form():
    node = {"name": "t", "tags": ["firewall", "users"]}
    assert ct.collect_tags(node) == {"firewall", "users"}


def test_collect_tags_string_form():
    node = {"name": "t", "tags": "always"}
    assert ct.collect_tags(node) == {"always"}


def test_collect_tags_nested_blocks_and_roles():
    doc = [
        {"hosts": "all", "roles": [{"role": "base", "tags": ["base"]}]},
        {"block": [{"name": "x", "tags": ["config"]}], "tags": ["deploy"]},
    ]
    assert ct.collect_tags(doc) == {"base", "config", "deploy"}


def test_collect_tags_ignores_templated_values():
    node = {"tags": ["{{ dynamic }}", "logging"]}
    assert ct.collect_tags(node) == {"logging"}


def test_load_vocab_unions_all_categories():
    vocab = ct.load_vocab()
    assert "firewall" in vocab  # concern
    assert "always" in vocab  # special
    assert "bootstrap" in vocab  # playbook identity
    assert len(vocab) >= 10


def test_role_names_reads_role_dirs():
    names = ct.role_names()
    assert "base" in names
    assert "docker_host" in names


def test_scan_text_collects_from_yaml_string():
    text = """
- hosts: all
  roles:
    - role: base
      tags: [base]
  tasks:
    - name: open port
      tags: [firewall]
"""
    assert ct.scan_text(text) == {"base", "firewall"}


def test_scan_text_tolerates_custom_yaml_tags():
    text = "- name: t\n  secret: !vault xxx\n  tags: [users]\n"
    assert ct.scan_text(text) == {"users"}


def test_find_violations_flags_unknown_tag():
    allowed = {"base", "firewall"}
    used = {"base", "frewall"}  # typo
    assert ct.find_violations(used, allowed) == ["frewall"]


def test_find_violations_empty_when_all_allowed():
    assert ct.find_violations({"base", "firewall"}, {"base", "firewall"}) == []


def test_iter_yaml_files_skips_molecule(tmp_path):
    role = tmp_path / "roles" / "demo"
    (role / "tasks").mkdir(parents=True)
    (role / "tasks" / "main.yml").write_text("---\n")
    mol = role / "molecule" / "default"
    mol.mkdir(parents=True)
    (mol / "verify.yml").write_text("---\n")
    found = list(ct.iter_yaml_files(repo=tmp_path, scan_dirs=("roles",)))
    names = [p.name for p in found]
    assert "main.yml" in names
    assert "verify.yml" not in names