Compare commits
3 commits
3dd03d4198...8e4bf3dd88
| Author | SHA1 | Date |
|---|---|---|
| | 8e4bf3dd88 | |
| | d8afa94c4b | |
| | f0d189ca09 | |
10 changed files with 55 additions and 15 deletions
@@ -101,14 +101,17 @@ inventories/
 vault.yml
 docker_hosts/ # hosts running Docker services
 proxmox_hosts/ # Proxmox nodes themselves
+offsite_hosts/ # off-site hosts (askari) — NetBird coordinator + watchdog
 host_vars/ # per-host overrides
 staging/ # safe to run freely
 ```
 
-Host groups: `all`, `control`, `docker_hosts`, `proxmox_hosts`
+Host groups: `all`, `control`, `docker_hosts`, `proxmox_hosts`, `offsite_hosts`
 
 (`control` holds `ubongo`, the one manually-provisioned **physical** control node
-outside the cluster — see ADR-009 and ADR-015.)
+outside the cluster; `offsite_hosts` holds `askari`, the off-site Hetzner host that
+runs the NetBird coordinator + watchdog — also added manually. See ADR-009, ADR-015,
+ADR-016.)
 
 ---
 
17
README.md
@@ -57,7 +57,11 @@ See `Makefile` for the full list of targets.
 │
 ├── docs/
 │ ├── decisions/ # Architecture decision records (ADRs)
-│ └── runbooks/ # Step-by-step operational procedures
+│ ├── runbooks/ # Step-by-step operational procedures
+│ ├── security/ # Per-service security checklist + templates + accepted risks
+│ ├── testing/ # VERIFY.md template + service-UI verification reports
+│ ├── hardware/ # Physical capacity reference + reviews
+│ └── reviews/ # /review-repo reports
 │
 ├── inventories/
 │ ├── production/ # Live hosts — edit carefully
@@ -92,6 +96,17 @@ See `Makefile` for the full list of targets.
 - Network topology: `docs/decisions/007-network.md`
 - Testing methodology: `docs/decisions/008-testing.md`
 - Terraform ↔ Ansible handoff: `docs/decisions/009-provisioning-handoff.md`
+- Forgejo & CI: `docs/decisions/010-forgejo-ci.md`
+- Update management: `docs/decisions/011-update-management.md`
+- Hardware & capacity: `docs/decisions/012-hardware-capacity.md`
+- Heritage / V4 policy: `docs/decisions/013-heritage-v4.md`
+- Sourcing technical knowledge: `docs/decisions/014-knowledge-sourcing.md`
+- Control / AI-worker host (`ubongo`): `docs/decisions/015-control-host.md`
+- Mesh VPN (NetBird): `docs/decisions/016-mesh-vpn.md`
+- Service-UI verification (Level 4): `docs/decisions/017-service-ui-verification.md`
+
+(CLAUDE.md carries the full cross-referenced table, including the runbooks and
+security/testing docs.)
 
 ## Contributing
 
@@ -35,12 +35,15 @@ describes the *intended* design — see STATUS.md for what is actually built.
 all
 ├── control # ubongo — physical control node outside the cluster; baseline config only, runs no services
 ├── docker_hosts # VMs running Docker services (most hosts)
-└── proxmox_hosts # Proxmox nodes themselves (limited management scope)
+├── proxmox_hosts # Proxmox nodes themselves (limited management scope)
+└── offsite_hosts # askari (off-site Hetzner) — NetBird coordinator + external watchdog
 ```
 
 The `control` group holds the single manually-provisioned control node; it is
 managed for baseline config (SSH, firewall, updates) but never runs the
-`docker_host` role. Proxmox nodes are managed only for basic baseline tasks (SSH).
+`docker_host` role. The `offsite_hosts` group holds `askari`, the off-site Hetzner
+host — also manually provisioned (ADR-016), managed for baseline config plus the
+`netbird_coordinator` service role. Proxmox nodes are managed only for basic baseline tasks (SSH).
 Proxmox configuration itself (storage, clustering, networking)
 is out of scope.
 
@@ -42,6 +42,7 @@ below). Each service role contains a standard set of files:
 | `defaults/main.yml` | Tuneables, `rolename__` namespace |
 | `README.md` | Purpose, variables, usage (role convention) |
 | `SECURITY.md` | Per-service security record — see ADR-002 and `docs/security/service-security-template.md` |
+| `VERIFY.md` | Per-service UI acceptance spec — see ADR-008 Level 4 / ADR-017 and `docs/testing/service-verify-template.md` |
 | `meta/main.yml`, `molecule/default/` | Metadata + Debian 13 test scenario |
 
 ### Standard deploy mechanics
@@ -75,7 +75,7 @@ isolation — no risk of accidentally applying the wrong state.
 
 Each environment directory contains:
 - `providers.tf` — provider version pins and configuration
-- `backend.tf` — Forgejo state backend (environment-specific path)
+- `backend.tf` — backend configuration (local state on the control node; no remote backend — see "State backend" above)
 - `variables.tf` — input declarations
 - `terraform.tfvars.example` — tracked template; copy to `terraform.tfvars` for actual values
 - `main.tf` — `local.vms` map and module calls (no DNS resources)
@@ -75,7 +75,12 @@ The seam's interface is a single Terraform output consumed by a single script.
 `terraform output -json` and writes `inventories/<env>/hosts.yml`. It validates the
 group against the allowed set and fails loudly on an unknown group.
 
-**Valid groups**: `control`, `docker_hosts`, `proxmox_hosts`.
+**Valid groups**: `control`, `docker_hosts`, `proxmox_hosts`, `offsite_hosts`.
 
+`control` and `offsite_hosts` are not produced by Terraform — they hold manually
+provisioned hosts (`ubongo` and `askari` respectively) added to the inventory by hand
+(see the control-node exception below and ADR-015/ADR-016). They are valid groups so
+the generated `hosts.yml` carries their (otherwise empty) sections.
+
 The generated `hosts.yml` carries a "do not edit manually" header and is owned by
 the generator. Treat it as a build artifact: the source of truth is `local.vms` in
@@ -85,10 +85,11 @@ The accelerators this policy prefers (`context7`, `deep-research`, `superpowers`
 `claude-code-guide`) are **plugins under `~/.claude/`** — local per machine, **not**
 synced by Claude account and **not** carried by the git repo (only `.claude/commands`,
 `.claude/hooks`, `.claude/settings.json` travel). A fresh clone therefore lacks the
-plugin toolchain until it is reinstalled. Making it reproducible from the repo
-(`extraKnownMarketplaces` + `enabledPlugins` in `.claude/settings.json`, plus a
-bootstrap step) is tracked in `docs/TODO.md` and tied to control-node/AI setup. Until
-then, the graceful-degradation fallback above keeps the policy working.
+plugin toolchain until it is reinstalled. Making it reproducible from the repo is
+**done** (TODO 10.7): `.claude/settings.json` declares `extraKnownMarketplaces` +
+`enabledPlugins`, and `docs/runbooks/claude-code-setup.md` documents the per-machine
+bootstrap. Until a fresh clone runs that bootstrap, the graceful-degradation fallback
+above keeps the policy working.
 
 ## Decision
 
@@ -77,7 +77,8 @@ allocated for it.
 - **Coordinator survival:** off-site on `askari` ⇒ mesh survives a homelab outage.
 NetBird's management datastore is backed up encrypted off `askari` (synced to
 `ubongo`/`mamba`); peers keep last-known config through a brief coordinator outage.
-- **`askari` is Ansible-managed:** its own inventory group, `base` role, plus a
+- **`askari` is Ansible-managed:** its own inventory group `offsite_hosts` (added
+manually like the control node — it is not Terraform-managed), `base` role, plus a
 dedicated `netbird_coordinator` service role (one service = one role, ADR-004; with
 `SECURITY.md`). Agent install/enrollment lives in `base`. NetBird server + agents are
 version-pinned (ADR-011). boma's `dns` role stays authoritative for
@@ -82,7 +82,16 @@ service clears the security bar — record any conscious deviation in
 manual in review today, with the planned `/security-review` aggregating every
 `roles/*/SECURITY.md` to automate it.
 
-### 10. Commit
+### 10. Write the per-service verification spec (services)
+
+For a **service** role, copy `docs/testing/service-verify-template.md` to
+`roles/<rolename>/VERIFY.md` and fill it in: the critical user journeys that define
+"working" for this service, what good looks like, what is not browser-verifiable
+(→ manual handoff), and the test data needed. This is the per-service backbone for the
+Level 4 `/verify-service` check (ADR-008 / ADR-017) and is part of the pre-production
+service-clearance gate (`docs/security/service-checklist.md`).
+
+### 11. Commit
 
 ```bash
 git checkout -b role/<rolename>
@@ -15,13 +15,15 @@ Expected Terraform output shape:
 }
 }
 
-Valid groups: control, docker_hosts, proxmox_hosts
+Valid groups: control, docker_hosts, proxmox_hosts, offsite_hosts
+(control and offsite_hosts hold manually-provisioned hosts not in Terraform; they
+are valid so their sections appear in the generated inventory — see ADR-009.)
 """
 
 import json
 import sys
 
-VALID_GROUPS = {"control", "docker_hosts", "proxmox_hosts"}
+VALID_GROUPS = {"control", "docker_hosts", "proxmox_hosts", "offsite_hosts"}
 
 
 def main() -> None:
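The group check that this change extends can be sketched roughly as follows. This is a minimal illustration, not the repository's generator: the input shape (hostname mapped to `group` and `ansible_host`) and the `build_inventory` helper are assumptions, since the real Terraform output shape is elided in the diff. It shows an unknown group failing loudly, and every valid group, including the manually populated `control` and `offsite_hosts`, getting a section even when Terraform supplies no hosts for it.

```python
import json

# Copied from the new side of the diff above.
VALID_GROUPS = {"control", "docker_hosts", "proxmox_hosts", "offsite_hosts"}


def build_inventory(vms: dict) -> dict:
    """Group hosts by inventory group, failing loudly on an unknown group.

    `vms` is a hypothetical mapping of hostname -> {"group": ..., "ansible_host": ...};
    the actual Terraform output shape is not visible in the diff.
    """
    # Seed every valid group so groups that Terraform never fills
    # (control, offsite_hosts) still get an empty section.
    children = {group: {"hosts": {}} for group in sorted(VALID_GROUPS)}
    for name, attrs in vms.items():
        group = attrs.get("group")
        if group not in VALID_GROUPS:
            raise ValueError(
                f"host {name!r}: unknown group {group!r}; "
                f"valid groups: {sorted(VALID_GROUPS)}"
            )
        children[group]["hosts"][name] = {"ansible_host": attrs["ansible_host"]}
    return {"all": {"children": children}}


if __name__ == "__main__":
    # Example host name and address are made up for illustration.
    example = {"web01": {"group": "docker_hosts", "ansible_host": "192.0.2.10"}}
    print(json.dumps(build_inventory(example), indent=2))
```

Seeding the dictionary from `VALID_GROUPS` keeps the set the single place where a new group such as `offsite_hosts` has to be registered, which matches the one-line change in the diff.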