docs(plan): tagging standard implementation plan (ADR-019)
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
parent 4ed9e9a8bf
commit 04bfc26422
1 changed file with 728 additions and 0 deletions
docs/superpowers/plans/2026-06-06-tagging-strategy.md (normal file, 728 lines added)
@@ -0,0 +1,728 @@
# Ansible Tagging Standard Implementation Plan

> **For agentic workers:** REQUIRED SUB-SKILL: Use superpowers:subagent-driven-development (recommended) or superpowers:executing-plans to implement this plan task-by-task. Steps use checkbox (`- [ ]`) syntax for tracking.

**Goal:** Establish a two-tier Ansible tagging standard (role-name tags + a closed concern list) with machine-enforced vocabulary, plus a Proxmox VM metadata-tag convention, so playbook runs are targeted, transparent, and predictable.

**Architecture:** A single source-of-truth YAML (`tests/tags.yml`) lists the allowed concern/special/opt-in/playbook tags. A Python checker (`scripts/check-tags.py`) scans `roles/` and `playbooks/`, computes the allowed set as `{role dir names} ∪ {tags.yml entries}`, and fails `make lint` on any unknown tag. Terraform gets a documented three-tag VM convention (metadata only). The standard is recorded as ADR-019 and folded into CLAUDE.md.
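
The allowed-set rule in the architecture can be sketched in a few lines of stdlib Python. This is an illustration only, not the checker itself: `VOCAB` is a hypothetical, shortened stand-in for the parsed `tests/tags.yml`, `allowed_tags` is a made-up helper name, and the role directories are created in a temporary folder rather than read from the repo.

```python
import pathlib
import tempfile

# Hypothetical stand-in for tests/tags.yml after parsing (values are illustrative).
VOCAB = {
    "concerns": ["firewall", "config", "deploy"],
    "special": ["always", "never"],
    "opt_ins": [],
    "playbooks": ["bootstrap"],
}


def allowed_tags(repo, vocab):
    """Allowed set = role directory names under roles/ plus every vocabulary entry."""
    roles_dir = pathlib.Path(repo) / "roles"
    roles = set()
    if roles_dir.is_dir():
        roles = {p.name for p in roles_dir.iterdir() if p.is_dir()}
    entries = {tag for group in vocab.values() for tag in group}
    return roles | entries


with tempfile.TemporaryDirectory() as repo:
    # Fake repo layout: two role directories, no playbooks.
    for role in ("base", "docker_host"):
        (pathlib.Path(repo) / "roles" / role).mkdir(parents=True)

    allowed = allowed_tags(repo, VOCAB)
    used = {"base", "docker", "firewall"}  # `docker` is neither a role nor in VOCAB
    unknown = sorted(used - allowed)

print(unknown)
```

Because role names come from the directory listing, adding a role extends the allowed set without touching the vocabulary file; only concern, special, opt-in, and playbook tags need an explicit entry.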

**Tech Stack:** Python 3 (stdlib + PyYAML, already present via ansible-core), pytest (already in `requirements.txt`), Make, Terraform (HCL edit only — not `init`ed), Markdown docs.

---

## File structure

| File | Responsibility | Action |
|------|----------------|--------|
| `tests/tags.yml` | Single source of truth: allowed concern/special/opt-in/playbook tags | Create |
| `scripts/check-tags.py` | Scan `roles/`+`playbooks/`, fail on tags outside the allowed set | Create |
| `tests/test_check_tags.py` | Unit tests for the checker (mirrors `tests/test_capacity_scan.py`) | Create |
| `Makefile` | Wire `check-tags.py` into the `lint` target | Modify |
| `playbooks/site.yml` | Fix `docker_host` role tag (`docker` → `docker_host`) | Modify |
| `docs/decisions/019-tagging.md` | The ADR (the standard itself) | Create |
| `CLAUDE.md` | Reword tag rule; add Proxmox tag convention; add ADR-019 to Further reading | Modify |
| `terraform/environments/staging/main.tf` | Add `managed-by=terraform` tag | Modify |
| `terraform/environments/production/main.tf` | Add `managed-by=terraform` tag | Modify |
| `docs/TODO.md` | Mark 3.7 and 3.11 DECIDED | Modify |
| `docs/CAPABILITIES.md` | Note targeted runs as a capability | Modify |

Notes for the implementer:
- The repo venv is `.venv`. Run Python as `.venv/bin/python` (Makefile vars: `PYTHON := .venv/bin/python`). If `.venv` is missing, run `make setup` first.
- PyYAML is available in the venv (ansible-core depends on it) — `import yaml` works.
- Terraform is **not** `init`ed in this repo, so `terraform validate`/`plan` will fail offline. Only use `terraform fmt` (offline-safe) for the HCL tasks.
- Before any `git commit`, the pre-commit hook decrypts `vault.yml`, so the vault agent must be unlocked: run `rbw unlocked` (exit 0 = good). If locked, ask the user to `rbw unlock` and wait. None of these tasks touch vault files, but the hook still runs.

---

### Task 1: Tag vocabulary file (`tests/tags.yml`)

**Files:**
- Create: `tests/tags.yml`

- [ ] **Step 1: Create the vocabulary file**

Create `tests/tags.yml` with exactly this content:

```yaml
---
# Allowed Ansible tag vocabulary — single source of truth for scripts/check-tags.py.
# Authoritative reference & rationale: docs/decisions/019-tagging.md.
#
# The full allowed set the linter enforces is:
#   {role directory names under roles/} ∪ everything listed below.
#
# To add a CONCERN tag: add it here AND add a row to the ADR-019 table with a
# one-line justification (cross-cutting, used in 2+ roles, distinct).

# Cross-cutting concern tags, applied per-task/block where a task belongs to the
# concern. Targeted one at a time (tags are union/OR, never intersected).
concerns:
  - packages  # apt package install/management
  - users  # accounts, groups, sudo
  - firewall  # nftables rulesets & port definitions (ADR-002)
  - hardening  # security baseline — sshd config, fail2ban, auditd, sysctl
  - logging  # Alloy / log-shipping config (ADR-018)
  - monitoring  # metric exporters / health checks
  - config  # render templated config/compose files to disk — no restart
  - deploy  # bring services up / restart (compose up -d)
  - proxy  # reverse-proxy + TLS registration (Traefik routes, Authentik)

# Ansible built-in special tags. Narrow use only:
#   always — cheap preflight assertions (run regardless of --tags)
#   never — destructive/expensive tasks, paired with an opt-in tag below
special:
  - always
  - never

# `never`-paired opt-in tags: destructive/expensive tasks that only run when
# named explicitly (e.g. `tags: [never, force_pull]`). Empty until a role adds one.
opt_ins: []

# Playbook-level identity tags for role-less lifecycle plays (e.g. bootstrap.yml).
playbooks:
  - bootstrap
```

- [ ] **Step 2: Verify it parses and has the expected shape**

Run:
```bash
.venv/bin/python -c "import yaml; d=yaml.safe_load(open('tests/tags.yml')); assert len(d['concerns'])==9, d['concerns']; assert d['special']==['always','never']; assert d['opt_ins']==[]; assert d['playbooks']==['bootstrap']; print('tags.yml OK')"
```
Expected: prints `tags.yml OK` and exits 0.

- [ ] **Step 3: Commit**

```bash
git add tests/tags.yml
git commit -m "feat(tags): add allowed-tag vocabulary (tests/tags.yml)"
```

---

### Task 2: Checker core — tag collection & allowed-set helpers

**Files:**
- Create: `scripts/check-tags.py`
- Test: `tests/test_check_tags.py`

- [ ] **Step 1: Write the failing tests**

Create `tests/test_check_tags.py`:

```python
import importlib.util
import pathlib

# The script name contains a hyphen, so it is loaded by path instead of `import`.
_PATH = pathlib.Path(__file__).resolve().parent.parent / "scripts" / "check-tags.py"
_spec = importlib.util.spec_from_file_location("check_tags", _PATH)
ct = importlib.util.module_from_spec(_spec)
_spec.loader.exec_module(ct)


def test_collect_tags_list_form():
    node = {"name": "t", "tags": ["firewall", "users"]}
    assert ct.collect_tags(node) == {"firewall", "users"}


def test_collect_tags_string_form():
    node = {"name": "t", "tags": "always"}
    assert ct.collect_tags(node) == {"always"}


def test_collect_tags_nested_blocks_and_roles():
    doc = [
        {"hosts": "all", "roles": [{"role": "base", "tags": ["base"]}]},
        {"block": [{"name": "x", "tags": ["config"]}], "tags": ["deploy"]},
    ]
    assert ct.collect_tags(doc) == {"base", "config", "deploy"}


def test_collect_tags_ignores_templated_values():
    node = {"tags": ["{{ dynamic }}", "logging"]}
    assert ct.collect_tags(node) == {"logging"}


def test_load_vocab_unions_all_categories():
    vocab = ct.load_vocab()
    assert "firewall" in vocab  # concern
    assert "always" in vocab  # special
    assert "bootstrap" in vocab  # playbook identity
    assert len(vocab) >= 12


def test_role_names_reads_role_dirs():
    names = ct.role_names()
    assert "base" in names
    assert "docker_host" in names
```

- [ ] **Step 2: Run tests to verify they fail**

Run: `.venv/bin/python -m pytest tests/test_check_tags.py -v`
Expected: FAIL at import time. `scripts/check-tags.py` does not exist yet, so `exec_module` raises `FileNotFoundError` and pytest reports a collection error for the test file.
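
The test file loads the checker by path because `check-tags.py` is not a valid module name for a plain `import`. A minimal sketch of that loading pattern, and of the failure expected before the script exists, is below. The temporary directory, the `load_by_path` helper, and the one-function script body are assumptions made for the demonstration, not part of the plan.

```python
import importlib.util
import pathlib
import tempfile


def load_by_path(path, name="check_tags"):
    """Import a module from an explicit file path (works for hyphenated file names)."""
    spec = importlib.util.spec_from_file_location(name, str(path))
    module = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(module)  # the file is only opened here
    return module


with tempfile.TemporaryDirectory() as tmp:
    script = pathlib.Path(tmp) / "check-tags.py"

    # Before the script exists: building the spec succeeds, executing it does not.
    try:
        load_by_path(script)
        missing_error = None
    except FileNotFoundError as exc:
        missing_error = exc

    # After a (hypothetical, two-line) script is written, loading succeeds.
    script.write_text(
        "def find_violations(used, allowed):\n"
        "    return sorted(used - allowed)\n"
    )
    module = load_by_path(script)
    result = module.find_violations({"base", "frewall"}, {"base", "firewall"})

print(type(missing_error).__name__, result)
```

Since the load happens at module level in the test file, every test in it fails together until the script is created, which is the red state this step is looking for.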

- [ ] **Step 3: Write the minimal implementation**

Create `scripts/check-tags.py`:

```python
#!/usr/bin/env python3
"""
Validate that every Ansible tag used under roles/ and playbooks/ belongs to the
approved vocabulary. Single source of truth: tests/tags.yml. Rationale: ADR-019.

Allowed set = {role directory names under roles/} ∪ {concerns, special, opt_ins,
playbooks from tests/tags.yml}. Templated tags (containing "{{") are skipped —
they can't be statically validated.

Usage: python3 scripts/check-tags.py
Exit 0 = all tags allowed; exit 1 = unknown tag(s) found.
"""
import pathlib
import sys

import yaml

REPO = pathlib.Path(__file__).resolve().parent.parent
VOCAB_FILE = REPO / "tests" / "tags.yml"
SCAN_DIRS = ("roles", "playbooks")


class _IgnoreUnknownTags(yaml.SafeLoader):
    """SafeLoader that tolerates custom YAML tags (e.g. !vault) instead of crashing."""


def _ignore(loader, tag_suffix, node):
    return None


_IgnoreUnknownTags.add_multi_constructor("", _ignore)
_IgnoreUnknownTags.add_multi_constructor("!", _ignore)


def _static_str(value):
    return isinstance(value, str) and "{{" not in value


def load_vocab(path=VOCAB_FILE):
    data = yaml.safe_load(path.read_text()) or {}
    vocab = set()
    for key in ("concerns", "special", "opt_ins", "playbooks"):
        vocab.update(data.get(key) or [])
    return vocab


def role_names(repo=REPO):
    roles_dir = repo / "roles"
    if not roles_dir.is_dir():
        return set()
    return {p.name for p in roles_dir.iterdir() if p.is_dir()}


def collect_tags(node):
    """Recursively collect every static tag string under any 'tags:' key."""
    tags = set()
    if isinstance(node, dict):
        for key, value in node.items():
            if key == "tags":
                if _static_str(value):
                    tags.add(value)
                elif isinstance(value, list):
                    tags.update(t for t in value if _static_str(t))
            tags |= collect_tags(value)
    elif isinstance(node, list):
        for item in node:
            tags |= collect_tags(item)
    return tags


if __name__ == "__main__":  # pragma: no cover
    sys.exit(0)
```

- [ ] **Step 4: Run tests to verify they pass**

Run: `.venv/bin/python -m pytest tests/test_check_tags.py -v`
Expected: PASS (all 6 tests).
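
To see what the recursive walk picks up on a realistic shape, here is a condensed restatement of `collect_tags` applied to a nested play. It is a sketch for illustration: the play data is invented, and `example.cloud_vm` is a hypothetical module used only to show that the walk matches any key called `tags`, including a list-valued module argument of that name.

```python
def collect_tags(node):
    """Condensed restatement of the plan's collect_tags, for illustration only."""
    tags = set()
    if isinstance(node, dict):
        for key, value in node.items():
            if key == "tags":
                candidates = [value] if isinstance(value, str) else value
                if isinstance(candidates, list):
                    tags.update(
                        t for t in candidates if isinstance(t, str) and "{{" not in t
                    )
            tags |= collect_tags(value)  # keep descending into every value
    elif isinstance(node, list):
        for item in node:
            tags |= collect_tags(item)
    return tags


play = [
    {
        "hosts": "all",
        "tags": "always",  # string form on the play itself
        "roles": [{"role": "base", "tags": ["base"]}],
        "tasks": [
            {
                "name": "render config",
                "block": [{"name": "inner", "tags": ["config", "{{ dynamic }}"]}],
            },
            {
                "name": "create vm",
                # Hypothetical module whose own argument happens to be named `tags`.
                "example.cloud_vm": {"tags": ["blue-team"]},
            },
        ],
    }
]

found = collect_tags(play)
print(sorted(found))
```

The templated entry is skipped, while `blue-team` is collected even though it is a module argument rather than an Ansible tag. If a role ever uses such a module, the checker as planned would report that value unless it is also in the allowed set.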

- [ ] **Step 5: Commit**

```bash
git add scripts/check-tags.py tests/test_check_tags.py
git commit -m "feat(tags): checker helpers — tag collection & allowed-set"
```

---

### Task 3: Checker validation — scan files and fail on unknown tags

**Files:**
- Modify: `scripts/check-tags.py`
- Test: `tests/test_check_tags.py`

- [ ] **Step 1: Write the failing tests**

Append to `tests/test_check_tags.py`:

```python
def test_scan_text_collects_from_yaml_string():
    text = """
- hosts: all
  roles:
    - role: base
      tags: [base]
  tasks:
    - name: open port
      tags: [firewall]
"""
    assert ct.scan_text(text) == {"base", "firewall"}


def test_scan_text_tolerates_custom_yaml_tags():
    # Keys of the list item must share one indent (two spaces) to be valid YAML.
    text = "- name: t\n  secret: !vault xxx\n  tags: [users]\n"
    assert ct.scan_text(text) == {"users"}


def test_find_violations_flags_unknown_tag():
    allowed = {"base", "firewall"}
    used = {"base", "frewall"}  # typo
    assert ct.find_violations(used, allowed) == ["frewall"]


def test_find_violations_empty_when_all_allowed():
    assert ct.find_violations({"base", "firewall"}, {"base", "firewall"}) == []
```

- [ ] **Step 2: Run tests to verify they fail**

Run: `.venv/bin/python -m pytest tests/test_check_tags.py -v`
Expected: FAIL — `AttributeError: module 'check_tags' has no attribute 'scan_text'` (and `find_violations`).

- [ ] **Step 3: Add the scanning + validation functions**

In `scripts/check-tags.py`, replace the final block:

```python
if __name__ == "__main__":  # pragma: no cover
    sys.exit(0)
```

with:

```python
def scan_text(text):
    """Collect static tags from a (possibly multi-document) YAML string."""
    found = set()
    for doc in yaml.load_all(text, Loader=_IgnoreUnknownTags):
        found |= collect_tags(doc)
    return found


def iter_yaml_files(repo=REPO, scan_dirs=SCAN_DIRS):
    for name in scan_dirs:
        base = repo / name
        if not base.is_dir():
            continue
        for ext in ("*.yml", "*.yaml"):
            yield from sorted(base.rglob(ext))


def find_violations(used, allowed):
    return sorted(used - allowed)


def main():
    allowed = load_vocab() | role_names()
    violations = []
    for path in iter_yaml_files():
        try:
            used = scan_text(path.read_text())
        except yaml.YAMLError as exc:
            # Unparseable files are reported but do not fail the check.
            print(f"warning: could not parse {path}: {exc}", file=sys.stderr)
            continue
        for tag in find_violations(used, allowed):
            violations.append((path.relative_to(REPO), tag))

    if violations:
        print(
            "error: Ansible tag(s) not in tests/tags.yml or role names "
            "(see docs/decisions/019-tagging.md):",
            file=sys.stderr,
        )
        for relpath, tag in violations:
            print(f"  {relpath}: '{tag}'", file=sys.stderr)
        print(f"\nallowed: {', '.join(sorted(allowed))}", file=sys.stderr)
        sys.exit(1)

    print(f"check-tags: OK ({len(allowed)} tags allowed across {len(SCAN_DIRS)} dirs)")


if __name__ == "__main__":
    main()
```

- [ ] **Step 4: Run tests to verify they pass**

Run: `.venv/bin/python -m pytest tests/test_check_tags.py -v`
Expected: PASS (all 10 tests).

- [ ] **Step 5: Commit**

```bash
git add scripts/check-tags.py tests/test_check_tags.py
git commit -m "feat(tags): scan roles/+playbooks/ and fail on unknown tags"
```

---

### Task 4: Reconcile existing tags & wire into `make lint`

**Files:**
- Modify: `playbooks/site.yml:18-19`
- Modify: `Makefile` (the `lint:` target)

- [ ] **Step 1: Run the checker against the current repo (expect one violation)**

Run: `.venv/bin/python scripts/check-tags.py`
Expected: FAIL (exit 1) reporting `playbooks/site.yml: 'docker'` — because the `docker_host` role is tagged `[docker]`, which is neither a role name nor a vocabulary tag. This confirms the checker works end-to-end.

- [ ] **Step 2: Fix the role tag to equal the role name**

In `playbooks/site.yml`, change:

```yaml
- role: docker_host
  tags: [docker]
```

to:

```yaml
- role: docker_host
  tags: [docker_host]
```

- [ ] **Step 3: Re-run the checker (expect clean)**

Run: `.venv/bin/python scripts/check-tags.py`
Expected: PASS — prints `check-tags: OK (... tags allowed across 2 dirs)` and exits 0.
(Allowed set now includes role names `base`, `docker_host`; used tags are `base`, `docker_host`, `bootstrap` — all allowed.)

- [ ] **Step 4: Wire the checker into `make lint`**

In `Makefile`, change the `lint:` target from:

```makefile
lint:
	$(VENV)/bin/yamllint .
	$(LINT)
```

to:

```makefile
lint:
	$(VENV)/bin/yamllint .
	$(LINT)
	$(PYTHON) scripts/check-tags.py
```

- [ ] **Step 5: Run the full lint suite and the test suite**

Run: `make lint && .venv/bin/python -m pytest tests/test_check_tags.py -v`
Expected: yamllint passes, ansible-lint passes, `check-tags: OK`, and all pytest tests PASS.

- [ ] **Step 6: Commit**

```bash
git add playbooks/site.yml Makefile
git commit -m "feat(tags): enforce tag vocabulary in make lint; fix docker_host tag"
```

---

### Task 5: Terraform Proxmox VM tag convention

**Files:**
- Modify: `terraform/environments/staging/main.tf` (the `tags =` line in `module "vms"`)
- Modify: `terraform/environments/production/main.tf` (the `tags =` line in `module "vms"`)

- [ ] **Step 1: Add `managed-by=terraform` to the staging VM tags**

In `terraform/environments/staging/main.tf`, change:

```hcl
tags = ["staging", each.value.group]
```

to:

```hcl
tags = ["staging", each.value.group, "managed-by=terraform"]
```

- [ ] **Step 2: Add `managed-by=terraform` to the production VM tags**

In `terraform/environments/production/main.tf`, change:

```hcl
tags = ["production", each.value.group]
```

to:

```hcl
tags = ["production", each.value.group, "managed-by=terraform"]
```

- [ ] **Step 3: Format-check the HCL (offline-safe)**

Run: `terraform -chdir=terraform/environments/staging fmt && terraform -chdir=terraform/environments/production fmt`
Expected: either no output (already formatted) or the filename printed (reformatted). Exit 0.
(Do NOT run `terraform validate`/`plan` — Terraform is not `init`ed in this repo and they will fail offline.)

- [ ] **Step 4: Confirm the edits**

Run: `grep -n "managed-by=terraform" terraform/environments/staging/main.tf terraform/environments/production/main.tf`
Expected: one match in each file.
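
The three-tag convention is simple enough to express as a small predicate. The sketch below is an optional illustration and not a step of this plan: `check_vm_tags` is a hypothetical helper, and the group names are the two examples that appear in the ADR table.

```python
ENVIRONMENTS = {"staging", "production"}
MANAGED_BY = "managed-by=terraform"


def check_vm_tags(tags, known_groups):
    """Return a list of problems with one VM's tag list (an empty list means OK)."""
    problems = []
    if len(tags) != 3:
        problems.append(f"expected exactly 3 tags, got {len(tags)}")
    if sum(t in ENVIRONMENTS for t in tags) != 1:
        problems.append("expected exactly one environment tag")
    if sum(t in known_groups for t in tags) != 1:
        problems.append("expected exactly one inventory-group tag")
    if MANAGED_BY not in tags:
        problems.append(f"missing {MANAGED_BY!r}")
    return problems


groups = {"docker_hosts", "proxmox_hosts"}  # example group names from the ADR table
after_edit = check_vm_tags(["staging", "docker_hosts", MANAGED_BY], groups)
before_edit = check_vm_tags(["production", "docker_hosts"], groups)

print(after_edit)
print(before_edit)
```

The tag list from before the edit fails on count and on the missing `managed-by=terraform` entry, which is exactly the gap Steps 1 and 2 close.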

- [ ] **Step 5: Commit**

```bash
git add terraform/environments/staging/main.tf terraform/environments/production/main.tf
git commit -m "feat(tags): Proxmox VM metadata convention (managed-by=terraform)"
```

---

### Task 6: Documentation — ADR-019, CLAUDE.md, TODO, CAPABILITIES

**Files:**
- Create: `docs/decisions/019-tagging.md`
- Modify: `CLAUDE.md` (Ansible conventions; Terraform conventions; Further reading)
- Modify: `docs/TODO.md` (items 3.7 and 3.11)
- Modify: `docs/CAPABILITIES.md`

- [ ] **Step 1: Write the ADR**

Create `docs/decisions/019-tagging.md`:

````markdown
# ADR-019 — Tagging standard for targeted, predictable runs

## Status

Accepted (2026-06-06). Resolves TODO 3.7 ("Define a tagging standard that lets us
target runs without over-tagging") and TODO 3.11 ("Deliberate tagging strategy").

## Context

boma wants to run playbooks **targeted** — a single service, a single layer, or a
single cross-cutting concern — **transparently and predictably**: a reader should
know from a `--tags` invocation exactly what it will and won't touch. CLAUDE.md
already requires tag-filterable tasks, but no vocabulary or convention existed, and
the TODO explicitly warns against the opposite failure mode: **over-tagging**.

## Decision

### Two-tier tagging

**Tier 1 — role/service tag (mechanical).** The tag equals the role name, applied
once at the role-import level:

```yaml
roles:
  - role: photoprism
    tags: [photoprism]
```

Ansible propagates it to every task in the role. Because one service = one role
(ADR-004), this single rule covers both the *layer/role* and *single-service*
targeting axes with zero per-task burden. Role-less lifecycle playbooks
(e.g. `bootstrap.yml`) carry a single playbook-identity tag instead.

**Tier 2 — concern tag (curated).** A small **closed list** of cross-cutting concern
tags, applied per-task/block **only where a task genuinely belongs to that concern**.

### The closed concern list

A concern earns a tag only if it (a) appears in 2+ roles, (b) is worth running as a
slice on its own, and (c) doesn't overlap confusingly with another.

| Tag | Covers |
|-----|--------|
| `packages` | apt package install/management |
| `users` | accounts, groups, sudo |
| `firewall` | nftables rulesets & port definitions (ADR-002) |
| `hardening` | security baseline — sshd config, fail2ban, auditd, sysctl |
| `logging` | Alloy / log-shipping config (ADR-018) |
| `monitoring` | metric exporters / health checks |
| `config` | render templated config/compose files to disk — **no restart** |
| `deploy` | bring services up / restart (`compose up -d`) |
| `proxy` | reverse-proxy + TLS registration (Traefik routes, Authentik) |

The `config`/`deploy` split lets you re-render and diff configuration (`--tags
config`) without bouncing services, then restart deliberately (`--tags deploy`).
`backup` and `secrets` are intentionally omitted until the roles needing them exist.

### `always` / `never`

- **`always`** — reserved for cheap preflight assertions (vault unlocked, OS is
  Debian 13, required vars present), so even `--tags config` runs its safety guards.
- **`never`** — reserved for destructive/expensive opt-in tasks, each paired with a
  descriptive tag (e.g. `tags: [never, force_pull]`); they run only when named.

### Predictability principle: tags are union-only

`--tags a,b` runs tasks tagged a **OR** b — Ansible has no native AND. boma therefore
targets **one axis at a time**: either a role/service *or* a concern, never an
intersection like "photoprism's firewall only." If that's ever needed, just run
`--tags photoprism` (idempotent and fast). Designing for intersection is the
over-tagging trap; we decline it on purpose.

### Terraform / Proxmox VM tags (metadata only)

Every Terraform-managed VM carries exactly three Proxmox tags:

| Tag | Value | Purpose |
|-----|-------|---------|
| env | `staging` \| `production` | which environment |
| role/group | `docker_hosts`, `proxmox_hosts`, … | matches the inventory group |
| managed-by | `terraform` | distinguishes IaC VMs from hand-made ones |

These are **pure metadata for transparency** (glanceable in the Proxmox UI). They do
**not** drive run-targeting and do **not** feed inventory — `scripts/tf_to_inventory.py`
keeps building groups from the `group` output field, the single source of truth.

## Enforcement

`tests/tags.yml` is the single source of truth for the allowed concern/special/
opt-in/playbook tags. `scripts/check-tags.py` (run by `make lint`, covered by
`tests/test_check_tags.py`) scans `roles/` and `playbooks/` and fails on any tag
outside `{role directory names} ∪ {tests/tags.yml entries}`.

## Extending the vocabulary

To add a concern tag: (1) add it to `tests/tags.yml`; (2) add a row to the concern
table above with a one-line justification showing it passes the litmus test
(cross-cutting, 2+ roles, distinct). That is the whole gate — lightweight, but it
leaves a paper trail.

## Consequences

- Targeted runs are predictable: only two kinds of tags exist, one of them mechanical.
- Over-tagging is structurally resisted (closed list + lint enforcement).
- Intersection targeting is unavailable by design.
- Authors must keep role tags = role names; the linter flags any role tag that matches no role directory or vocabulary entry.

## Related

ADR-002 (security baseline / firewall), ADR-004 (one service = one role),
ADR-009 (TF↔Ansible handoff / inventory), ADR-018 (logging).
````
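
The union-only principle in the ADR can be made concrete with a small simulation. This is a simplified model written for this note, not Ansible's implementation: it ignores `--skip-tags` and the special `all`/`tagged`/`untagged` selectors, and the task names and tag sets are hypothetical (role tags are shown as already inherited from the role import).

```python
def would_run(task_tags, requested=()):
    """Simplified model of --tags selection (assumption: no --skip-tags in play)."""
    tags = set(task_tags)
    if "always" in tags:
        return True
    if not requested:  # plain run without --tags
        return "never" not in tags
    return bool(tags & set(requested))  # any overlap selects the task (OR)


# Hypothetical tasks with their effective tag sets.
tasks = {
    "preflight: assert OS": {"always"},
    "photoprism: render compose": {"photoprism", "config"},
    "photoprism: open port": {"photoprism", "firewall"},
    "base: nftables ruleset": {"base", "firewall"},
    "base: install packages": {"base", "packages"},
    "maintenance: force image pull": {"never", "force_pull"},
}


def selection(*requested):
    return sorted(name for name, tags in tasks.items() if would_run(tags, requested))


union = selection("photoprism", "firewall")
opt_in = selection("force_pull")
plain = selection()

print(union)
print(opt_in)
print(plain)
```

In this model, `--tags photoprism,firewall` selects every photoprism task plus the firewall task in `base`, not only "photoprism: open port", which is why the ADR asks for one axis at a time. The opt-in task is absent from the plain run and appears only when `force_pull` is named.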

- [ ] **Step 2: Reword the tag rule in CLAUDE.md**

In `CLAUDE.md`, under **Ansible conventions**, change:

```markdown
- **Tags**: every task must have at least one tag; playbooks support `--tags` filtering
```

to:

```markdown
- **Tags** (ADR-019): import each role with its role-name tag once at the play level
  (Ansible inherits it to every task). Tag a task/block with a concern tag from the
  approved list (`tests/tags.yml`) only where it genuinely belongs to that concern —
  don't invent tags or tag for tagging's sake. Target one axis at a time (role/service
  *or* concern; tags are union/OR, never intersected). `make lint` enforces the vocabulary.
```

- [ ] **Step 3: Add the Proxmox tag convention to CLAUDE.md**

In `CLAUDE.md`, under **Terraform conventions**, add this bullet after the existing
"Terraform owns VM existence only" bullet:

```markdown
- Every TF-managed VM carries three Proxmox tags — `<env>`, its inventory `group`, and
  `managed-by=terraform` — as **metadata only** (ADR-019). They do not feed inventory
  or run-targeting; `tf_to_inventory.py` still groups by the `group` output field.
```

- [ ] **Step 4: Add ADR-019 to the Further reading table**

In `CLAUDE.md`, in the **Further reading** table, add this row immediately after the
`Logging & log integrity` row:

```markdown
| Tagging & run-targeting | `docs/decisions/019-tagging.md` |
```

- [ ] **Step 5: Mark the TODO items decided**

In `docs/TODO.md`, change the line for item 3.7:

```markdown
7. Define a tagging standard that lets us target runs without over-tagging.
```

to:

```markdown
7. ~~Define a tagging standard that lets us target runs without over-tagging.~~
   DECIDED (ADR-019): two-tier — role-name tags (auto, at play level) + a closed
   9-tag concern list (`tests/tags.yml`); union-only targeting; enforced by `make lint`.
```

and change item 3.11:

```markdown
11. Deliberate tagging strategy.
```

to:

```markdown
11. ~~Deliberate tagging strategy.~~ DECIDED (ADR-019) — folded into 3.7.
```

- [ ] **Step 6: Note the capability in CAPABILITIES.md**

Run: `grep -n "^## \|^### " docs/CAPABILITIES.md` to locate the section covering
operations / CI / how playbooks are run. Add this bullet under the most appropriate
existing section (operations or testing/CI):

```markdown
- **Targeted runs** (ADR-019): playbooks are sliced with `--tags` along two axes —
  role/service (tag = role name) or a closed list of cross-cutting concerns
  (`firewall`, `logging`, `config`, `deploy`, …); the vocabulary is lint-enforced.
```

- [ ] **Step 7: Verify docs are consistent and lint still passes**

Run:
```bash
grep -n "019-tagging" CLAUDE.md && grep -c "managed-by=terraform" CLAUDE.md && make lint
```
Expected: the ADR-019 row is found in CLAUDE.md, `managed-by=terraform` appears at
least once, and `make lint` passes (including `check-tags: OK`).

- [ ] **Step 8: Commit**

```bash
git add docs/decisions/019-tagging.md CLAUDE.md docs/TODO.md docs/CAPABILITIES.md
git commit -m "docs(tags): ADR-019 + CLAUDE.md/TODO/CAPABILITIES (tagging standard)"
```

---

## Final verification

- [ ] Run the full suite once more: `make lint && .venv/bin/python -m pytest tests/ -v`
  Expected: yamllint + ansible-lint pass, `check-tags: OK`, all tests PASS.
- [ ] Confirm a deliberate violation is caught: temporarily add `tags: [bogus]` to a
  task in `playbooks/site.yml`, run `.venv/bin/python scripts/check-tags.py`, confirm it
  exits 1 reporting `'bogus'`, then revert the edit.
- [ ] `git log --oneline -7` shows the six task commits, followed by the commit they were built on.