diff --git a/api-docs/adr/ADR-161-homecore-server-layer-security.md b/api-docs/adr/ADR-161-homecore-server-layer-security.md
index 393085f5..4404c608 100644
--- a/api-docs/adr/ADR-161-homecore-server-layer-security.md
+++ b/api-docs/adr/ADR-161-homecore-server-layer-security.md
@@ -196,7 +196,8 @@ fields are **never read** for verification (only ever set to `None` in tests).
 re-doc'd **"(P4 — not yet enforced, ADR-161/B5)"** — parsed and round-tripped,
 but no integrity/signature check happens before a plugin runs. No verification
 code was added (that is P4); the doc now matches the code.
-**Grade: doc-honesty (no behavior change).**
+**Grade: doc-honesty (no behavior change).** *(Superseded by ADR-162 §P4:
+the hash/signature gate is now implemented and enforced.)*
 
 ## Negative Results (NO-ACTION positives — audited, found correct, cited not edited)
 
@@ -213,17 +214,23 @@ touched:
 
 ## Deferred Backlog (Nothing Dropped)
 
-- **Plugin authority-isolation (P5)** — `homecore_permissions` claims are parsed
-  but not enforced at the host-call boundary. **ACCEPTED-FUTURE.**
-- **Plugin signature/hash verification (P4)** — implement the
+- **Plugin authority-isolation (P5)** — ~~`homecore_permissions` claims are parsed
+  but not enforced at the host-call boundary.~~ **DONE — ADR-162 §P5.**
+  `hc_state_set` now consults a `PermissionSet` distilled from the manifest;
+  an undeclared write returns a typed `-3` to the guest.
+- **Plugin signature/hash verification (P4)** — ~~implement the
   `wasm_module_hash`/`wasm_module_sig`/`publisher_key` gate that B5 now honestly
-  says is absent. **ACCEPTED-FUTURE.**
+  says is absent.~~ **DONE — ADR-162 §P4.** `WasmtimeRuntime::load_plugin` now
+  SHA-256-checks the module, Ed25519-verifies the signature against
+  `publisher_key`, and enforces a `PluginPolicy` trust allowlist
+  (secure-default rejects unsigned/untrusted/tampered modules).
 - **HAP real pairing (P2)** — SRP/HKDF pairing + encrypted sessions; current
   bridge is an accessory-mapping surface. **ACCEPTED-FUTURE (honestly stubbed).**
-- **`RunMode::Queued`/`Restart`/`max` ordering** — `Single`/`Parallel` are
+- **`RunMode::Queued`/`Restart`/`max` ordering** — ~~`Single`/`Parallel` are
   honored; bounded queueing, restart-kill, and `max` concurrency are not yet
-  wired (every non-Single mode is parallel). **ACCEPTED-FUTURE** — the
-  `engine.rs` doc states exactly this, no over-claim.
+  wired (every non-Single mode is parallel).~~ **DONE — ADR-162 §A5.** Restart
+  aborts the in-flight task, Queued serializes via a per-automation async mutex,
+  and `max: N` caps concurrency via a per-automation semaphore.
 - **Automation YAML load-at-boot** — the engine starts empty; a YAML loader is
   P-next. The bin log states "0 automations registered" honestly.
 
diff --git a/api-docs/adr/ADR-162-plugin-security-and-bounded-runmodes.md b/api-docs/adr/ADR-162-plugin-security-and-bounded-runmodes.md
new file mode 100644
index 00000000..dd69042e
--- /dev/null
+++ b/api-docs/adr/ADR-162-plugin-security-and-bounded-runmodes.md
@@ -0,0 +1,186 @@
+# ADR-162: HOMECORE Plugin Security (Signature + Capability Isolation) & Bounded Automation RunModes — Making ADR-161's Deferred Claims TRUE
+
+- **Status**: accepted
+- **Date**: 2026-06-12
+- **Deciders**: ruv
+- **Tags**: homecore, homecore-plugins, homecore-automation, plugin-security, wasm-signature-verification, ed25519, capability-isolation, runmode, prove-everything, soundness, honest-labeling
+- **Amends**: ADR-161 (relabelled P4/P5 + §A5 deferrals → now enforced), ADR-128 (plugin manifest), ADR-129 (automation engine)
+
+## Context
+
+Beyond-SOTA sweep **Milestone 8**, scoped to `homecore-plugins` and
+`homecore-automation` only, under the project's **prove-everything /
+anti-"AI-slop"** directive.
+
+ADR-161 (Milestone 7) did the honest thing with three plugin/automation
+items it could not finish in that window: rather than fake them, it **relabelled
+them as deferred** —
+
+- **P4** (plugin signature verification): the manifest's `wasm_module_hash` /
+  `wasm_module_sig` / `publisher_key` were re-doc'd "(P4 — not yet enforced,
+  ADR-161/B5)" — parsed and round-tripped, but **never checked** before a
+  plugin runs.
+- **P5** (plugin authority isolation): `homecore_permissions` claims were
+  parsed but **never consulted**; `hc_state_set` let any plugin write any
+  entity, including `lock.*` / `alarm_control_panel.*`.
+- **§A5** (`RunMode`): `Single`/`Parallel` were honored; `Restart`/`Queued`/
+  `max: N` were honestly documented as still **unbounded-parallel**.
+
+### Headline — the deferred security items are now ENFORCED + TESTED
+
+M8 turns those honest deferrals into real, tested behavior. The plugin trust
+boundary is now sound (a tampered module, an untrusted publisher, or an
+unsigned module is rejected by the secure default), an over-privileged plugin
+write is denied with a typed error, and the bounded run-modes actually bound.
+**Every fix is pinned by a test that FAILS on the pre-M8 code** — each of the
+three RunMode tests was additionally run against a simulated unbounded-parallel
+dispatch and confirmed to panic.
+
+The Ed25519 crypto reuses the in-repo `cog-ha-matter::witness_signing` pattern
+(same `ed25519-dalek` 2.x API, same deterministic-test-key convention). SHA-256
+matches the `sha256:` prefix the manifest already declared and the
+`cog-ha-matter` cog manifest's `binary_sha256` hex convention. No new external
+dependency tree was introduced — `ed25519-dalek` / `sha2` / `hex` / `base64`
+were already in the workspace `Cargo.lock` (cog-ha-matter / bfld pull them in);
+only new dependency *edges* were added to `homecore-plugins`.
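Reviewer note: the `sha256:` prefix plus hex convention mentioned above can be illustrated with a small std-only sketch. It assumes the field is the literal prefix followed by exactly 64 hex characters; the function names are hypothetical and are not taken from `src/verify.rs`. The digest of the real `.wasm` bytes is passed in already computed, because the standard library has no SHA-256.

```rust
/// Hypothetical helper: decode a `sha256:<hex>` manifest field into 32 raw bytes.
/// Assumption: the field is the prefix followed by exactly 64 hex characters.
fn parse_sha256_field(field: &str) -> Result<[u8; 32], String> {
    let hex = field
        .strip_prefix("sha256:")
        .ok_or_else(|| "missing `sha256:` prefix".to_string())?;
    // Reject anything that is not plain ASCII hex before slicing by byte index.
    if hex.len() != 64 || !hex.bytes().all(|b| b.is_ascii_hexdigit()) {
        return Err("expected exactly 64 hex characters".to_string());
    }
    let mut out = [0u8; 32];
    for (i, byte) in out.iter_mut().enumerate() {
        *byte = u8::from_str_radix(&hex[2 * i..2 * i + 2], 16)
            .map_err(|e| format!("invalid hex at byte {i}: {e}"))?;
    }
    Ok(out)
}

/// Compare the declared digest with one computed over the actual module bytes.
/// A malformed field counts as a mismatch, so the caller rejects the module.
fn digest_matches(declared_field: &str, computed: &[u8; 32]) -> bool {
    match parse_sha256_field(declared_field) {
        Ok(declared) => &declared == computed,
        Err(_) => false,
    }
}
```

Treating a malformed field as a mismatch keeps the failure mode closed: a manifest that cannot be parsed never compares equal to a real digest.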
+
+Grading vocabulary (ADR-152 / ADR-158 / ADR-160 / ADR-161):
+- **MEASURED** — reproduced in this worktree, command + failing-on-old test recorded.
+- **ACCEPTED-FUTURE** — deliberately deferred, nothing dropped.
+
+## Decision — Fixes Landed
+
+### §P4 — Plugin signature & integrity verification (SECURITY) — MEASURED
+
+`homecore-plugins/src/manifest.rs` declared `wasm_module_hash` /
+`wasm_module_sig` / `publisher_key` but they were **never read** for
+verification; the load path (`wasmtime_runtime.rs`) instantiated any `.wasm`
+bytes handed to it.
+
+**Real fix** (`src/verify.rs`, wired into `WasmtimeRuntime::load_plugin`):
+before instantiation the runtime now —
+
+1. computes the **SHA-256** of the actual `.wasm` bytes and rejects if it ≠ the
+   manifest's `wasm_module_hash` (`sha256:`) — tamper detection;
+2. verifies the **Ed25519** `wasm_module_sig` (`ed25519:`, 64-byte raw)
+   over the 32-byte digest against `publisher_key` (`ed25519:`, 32-byte
+   raw) and rejects on failure;
+3. enforces a configurable **trust policy** — `PluginPolicy::trusted(&[keys])`
+   is an allowlist of publisher verifying keys; `PluginPolicy::AllowUnsigned`
+   is an explicit dev escape hatch that LOGS a loud `warn` on every load it
+   waves through. The **secure default rejects unsigned and unknown-publisher
+   modules.** `PluginPolicy::deny_all()` trusts no publisher.
+
+A typed `PluginError::SignatureRejected` is returned (no host panic). The
+legacy permission-free `load_wasm` is retained for first-party/trusted/test
+modules; production loading goes through `load_plugin`.
+
+**Failing-on-old tests** (`tests/integration.rs`, `--features wasmtime`) — all
+drive `load_plugin`, which **did not exist** on the old code (so the gate is
+genuinely new):
+- `p4_tampered_module_is_rejected` — a byte-flipped `.wasm` → hash mismatch → rejected.
+- `p4_valid_sig_from_trusted_key_loads` — a valid sig from an allowlisted key loads.
+- `p4_valid_sig_from_untrusted_key_is_rejected` — a correctly-signed module from a key NOT on the allowlist is rejected.
+- `p4_unsigned_module_rejected_by_default_loads_only_under_allow_unsigned` — unsigned rejected under `deny_all`, loads (with warn) only under `AllowUnsigned`.
+- Unit (`src/verify.rs`): `valid_sig_from_trusted_key_passes`, `tampered_module_is_rejected`, `valid_sig_from_untrusted_key_is_rejected`, `forged_signature_is_rejected`, `unsigned_module_rejected_under_default_policy`.
+
+A real deterministic keypair signs real `.wasm` bytes in the tests.
+The manifest doc now reads **"(P4 — ENFORCED, ADR-162)"**. **Grade: MEASURED. Milestone headline.**
+
+### §P5 — Plugin authority / capability isolation (SECURITY) — MEASURED
+
+`wasmtime_runtime.rs::hc_state_set` applied any write a plugin requested,
+ignoring the manifest's `homecore_permissions`.
+
+**Real fix** (`src/permissions.rs` + `hc_state_set`): the manifest's
+`homecore_permissions` (the `state:write:` form, or a bare entity glob
+like `light.*`) are distilled into a `PermissionSet` installed in the plugin's
+Wasmtime store. The `hc_state_set` host import consults
+`permissions.may_write(entity_id)` before applying a write and returns a typed
+`-3` (permission denied) to the guest on a violation — **the host is not
+panicked.** Wasmtime already gives memory isolation; this adds **authority**
+isolation. A plugin with **no** write grants can write nothing (secure default).
+
+**Failing-on-old tests** (`tests/integration.rs`, `--features wasmtime`):
+- `p5_declared_light_plugin_may_write_light_but_not_lock` — a `light.*` plugin writes `light.kitchen` (succeeds) but is REJECTED (`-3`, and the entity is not written) when it tries `lock.front_door`.
+- `p5_plugin_with_no_permissions_can_write_nothing` — a plugin with empty `homecore_permissions` cannot write `light.kitchen`.
+- Unit (`src/permissions.rs`): domain-glob, exact-grant, wildcard, read-grants-don't-confer-write, no-permissions, and explicit `state:write:` form.
+
+The manifest doc now reads **"(P5 — ENFORCED, ADR-162)"**. **Grade: MEASURED.**
+
+### §A5 — Bounded automation RunModes (Restart / Queued / max) — MEASURED
+
+`homecore-automation/src/engine.rs` (per ADR-161) honored `Single`/`Parallel`
+but spawned an unbounded parallel task for `Restart`/`Queued`/`max`.
+
+**Real fix** (`src/runmode.rs`, a per-automation `RunState` the engine owns and
+dispatches through at all three trigger sites — event loop, timer, test hook):
+- **Restart** — aborts the in-flight action task via `tokio::task::AbortHandle`, then starts a fresh one.
+- **Queued** — serializes runs in arrival order via a per-automation async `Mutex`: sequential, never concurrent, nothing dropped.
+- **max: N** — caps concurrency at N via a per-automation `Semaphore`; triggers beyond N **queue** (await a permit) rather than running concurrently. (HA bounded `parallel`/`queued` semantics — chosen and documented as *queue beyond N*, not drop.)
+- `Single`/`IgnoreFirst` re-entrancy guard and `Parallel` preserved.
+
+`engine.rs` trimmed to **433 lines**; the run-mode machinery lives in the new
+`runmode.rs` (153 lines) to keep both under the 500-line guideline.
+
+**Failing-on-old tests** (`tests/engine_behaviors.rs`) — each was run against a
+simulated unbounded-parallel dispatch and confirmed to panic:
+- `restart_mode_cancels_prior_run` — prior run is aborted: exactly **1** completion (old: both ran → 2).
+- `queued_mode_runs_sequentially_not_concurrently` — 3 rapid triggers all run, **max observed concurrency = 1** (old: 3).
+- `max_two_caps_concurrency_at_two` — 4 rapid triggers all run, **max observed concurrency ≤ 2** (old: 4).
+
+**Grade: MEASURED. Restart, Queued, and `max: N` all implemented — no remaining RunMode deferral.**
+
+## Threat model closed
+
+| Threat | Before (ADR-161) | After (ADR-162) |
+|--------|------------------|-----------------|
+| **Tampered module** — attacker swaps `.wasm` bytes after signing | loaded unconditionally (hash never checked) | rejected: SHA-256 mismatch |
+| **Untrusted publisher** — valid sig from a key the host doesn't trust | loaded (sig/key never read) | rejected: publisher_key not on allowlist |
+| **Unsigned module** — no integrity material at all | loaded | rejected by secure default; loads only under explicit `AllowUnsigned` (loud warn) |
+| **Over-privileged plugin write** — a `light.*` plugin writes `lock.front_door` / `alarm_control_panel.*` | applied (permissions never consulted) | denied: typed `-3` to guest, write not applied |
+| **Run-mode resource exhaustion** — `max`/`Queued` spawn unbounded tasks | unbounded parallel | bounded: Restart cancels, Queued serializes, `max: N` caps at N |
+
+## Remaining honest deferral (Nothing Dropped)
+
+- **Plugin-key provisioning / rotation** — the host's trust allowlist
+  (`PluginPolicy::trusted`) is supplied by the caller; sourcing it from the
+  Cognitum control-plane key store (as `cog-ha-matter` does for Seed keys) and
+  key rotation are **ACCEPTED-FUTURE** (out of M8 scope — same boundary
+  `witness_signing` draws).
+- **`InProcessRuntime` (native first-party plugins)** — has no `.wasm` bytes to
+  hash, so P4/P5 apply only to the WASM (`wasmtime`) path; native plugins remain
+  trusted-by-compilation. Honestly noted, not over-claimed.
+- **HAP real pairing (P2)** — unchanged from ADR-161; out of M8 scope.
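Reviewer note: the first three rows of the threat-model table and the §P4 trust policy can be read as a single decision function. The sketch below is a minimal std-only model of that decision, not the code in `src/verify.rs`. `Policy`, `Signed`, `Verdict`, and `decide` are hypothetical names, and the cryptographic results (hash comparison, Ed25519 verification) are taken as already-computed booleans. How `AllowUnsigned` treats a module that does carry valid integrity material is an assumption here.

```rust
/// Outcome of the load gate (hypothetical type for illustration).
#[derive(Debug, PartialEq)]
enum Verdict {
    Load,
    LoadWithWarning,
    Reject(&'static str),
}

/// Hypothetical stand-in for the ADR's trust policy.
enum Policy {
    /// Allowlist of 32-byte publisher verifying keys.
    Trusted(Vec<[u8; 32]>),
    /// Explicit dev escape hatch; every load it waves through should warn.
    AllowUnsigned,
}

impl Policy {
    /// An empty allowlist trusts no publisher.
    fn deny_all() -> Self {
        Policy::Trusted(Vec::new())
    }
}

/// Integrity material found in the manifest, with check results precomputed.
struct Signed {
    hash_matches: bool,
    signature_valid: bool,
    publisher_key: [u8; 32],
}

fn decide(policy: &Policy, evidence: Option<&Signed>) -> Verdict {
    match (evidence, policy) {
        (None, Policy::AllowUnsigned) => Verdict::LoadWithWarning,
        (None, Policy::Trusted(_)) => Verdict::Reject("unsigned module"),
        // Assumption: present-but-wrong integrity material is rejected under any policy.
        (Some(s), _) if !s.hash_matches => Verdict::Reject("hash mismatch"),
        (Some(s), _) if !s.signature_valid => Verdict::Reject("bad signature"),
        (Some(s), Policy::Trusted(keys)) => {
            if keys.contains(&s.publisher_key) {
                Verdict::Load
            } else {
                Verdict::Reject("untrusted publisher")
            }
        }
        // Assumption: the escape hatch skips the allowlist but still warns.
        (Some(_), Policy::AllowUnsigned) => Verdict::LoadWithWarning,
    }
}
```

Checking the hash before the signature mirrors the ordering of steps 1 to 3 in §P4: a tampered module is reported as a hash mismatch even when its publisher is trusted.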
+
+## Reproduction (MEASURED)
+
+```bash
+cd v2
+# P4/P5 (wasmtime feature needs rustc 1.91+; workspace pins 1.89 for the rest):
+cargo +1.91.1 test -p homecore-plugins --features wasmtime
+# Bounded RunModes:
+cargo test -p homecore-automation --no-default-features
+# Full workspace still builds (1.89 toolchain, no wasmtime):
+cargo build --workspace --no-default-features
+```
+
+Result at time of writing (all 0 failed):
+- **homecore-plugins** `--features wasmtime` — **32 passed** (lib 23; integration 9). (ADR-161 baseline was 15.)
+- **homecore-automation** `--no-default-features` — **45 passed** (lib 37; `engine_behaviors` 8). (ADR-161 baseline was 42.)
+- Full workspace `cargo build --workspace --no-default-features` succeeds.
+
+## Consequences
+
+- A HOMECORE WASM plugin can no longer be loaded with a tampered binary, an
+  untrusted publisher, or (by default) no signature at all — the trust boundary
+  ADR-161/B5 honestly said was absent is now real (P4).
+- A plugin can no longer write entities outside its declared
+  `homecore_permissions`; the lock/alarm escalation path is closed (P5).
+- The automation engine's `Restart`, `Queued`, and `max: N` run-modes are now
+  bounded as documented — no run-mode claims a capability the code lacks.
+- No new external dependency tree (reuses the cog-ha-matter Ed25519 stack
+  already in the lock); source files kept under the 500-line guideline
+  (`engine.rs` 433, `runmode.rs` 153, `verify.rs` 397, `permissions.rs` 168;
+  `wasmtime_runtime.rs` non-test source < 500, inline WAT tests as ADR-161 left
+  them).
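Reviewer note: the P5 consequence (no writes outside declared `homecore_permissions`) can be illustrated with a small std-only sketch of the capability check. This is a model under stated assumptions, not the contents of `src/permissions.rs`. The glob grammar (`*`, `domain.*`, or an exact entity id), the treatment of any other `scope:` claim as conferring no write, the success code `0`, and the `HashMap` state store are all assumptions. Only `PermissionSet`, `may_write`, the `state:write:` form, and the `-3` denial code come from the ADR text.

```rust
use std::collections::HashMap;

/// Denial code returned to the guest, as described in the ADR.
const PERMISSION_DENIED: i32 = -3;

/// Write grants distilled from manifest claims (sketch only).
struct PermissionSet {
    write_patterns: Vec<String>,
}

impl PermissionSet {
    fn from_claims(claims: &[&str]) -> Self {
        let write_patterns = claims
            .iter()
            .filter_map(|claim| {
                if let Some(pattern) = claim.strip_prefix("state:write:") {
                    Some(pattern.to_string()) // explicit write grant
                } else if claim.contains(':') {
                    None // assumption: other scoped claims (e.g. reads) confer no write
                } else {
                    Some(claim.to_string()) // bare entity glob such as `light.*`
                }
            })
            .collect();
        Self { write_patterns }
    }

    /// With no grants this is always false (secure default).
    fn may_write(&self, entity_id: &str) -> bool {
        self.write_patterns.iter().any(|p| glob_matches(p, entity_id))
    }
}

/// Assumed glob grammar: `*`, `domain.*`, or an exact entity id.
fn glob_matches(pattern: &str, entity_id: &str) -> bool {
    if pattern == "*" {
        return true;
    }
    match pattern.strip_suffix(".*") {
        Some(domain) => entity_id
            .split_once('.')
            .map_or(false, |(d, _)| d == domain),
        None => pattern == entity_id,
    }
}

/// Sketch of the host-call boundary: deny first, apply the write only if granted.
fn hc_state_set_sketch(
    perms: &PermissionSet,
    store: &mut HashMap<String, String>,
    entity_id: &str,
    value: &str,
) -> i32 {
    if !perms.may_write(entity_id) {
        return PERMISSION_DENIED; // typed error to the guest, no host panic
    }
    store.insert(entity_id.to_string(), value.to_string());
    0 // assumption: 0 signals success
}
```

For example, a plugin whose claims are `["light.*"]` can set `light.kitchen`, while an attempt on `lock.front_door` returns `-3` and leaves the store untouched, which is the behavior `p5_declared_light_plugin_may_write_light_but_not_lock` pins.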