ADR-162: HOMECORE Plugin Security (Signature + Capability Isolation) & Bounded Automation RunModes — Making ADR-161's Deferred Claims TRUE

  • Status: accepted
  • Date: 2026-06-12
  • Deciders: ruv
  • Tags: homecore, homecore-plugins, homecore-automation, plugin-security, wasm-signature-verification, ed25519, capability-isolation, runmode, prove-everything, soundness, honest-labeling
  • Amends: ADR-161 (relabelled P4/P5 + §A5 deferrals → now enforced), ADR-128 (plugin manifest), ADR-129 (automation engine)

Context

Beyond-SOTA sweep Milestone 8, scoped to homecore-plugins and homecore-automation only, under the project's prove-everything / anti-"AI-slop" directive.

ADR-161 (Milestone 7) did the honest thing with three plugin/automation items it could not finish in that window. Rather than fake them, it relabelled them as deferred:

  • P4 (plugin signature verification): the manifest's wasm_module_hash / wasm_module_sig / publisher_key were re-doc'd "(P4 — not yet enforced, ADR-161/B5)" — parsed and round-tripped, but never checked before a plugin runs.
  • P5 (plugin authority isolation): homecore_permissions claims were parsed but never consulted; hc_state_set let any plugin write any entity, including lock.* / alarm_control_panel.*.
  • §A5 (RunMode): Single/Parallel were honored; Restart/Queued/max: N were honestly documented as still unbounded-parallel.

Headline — the deferred security items are now ENFORCED + TESTED

M8 turns those honest deferrals into real, tested behavior. The plugin trust boundary is now sound (a tampered module, an untrusted publisher, or an unsigned module is rejected by the secure default), an over-privileged plugin write is denied with a typed error, and the bounded run-modes actually bound. Every fix is pinned by a test that FAILS on the pre-M8 code — each of the three RunMode tests was additionally run against a simulated unbounded-parallel dispatch and confirmed to panic.

The Ed25519 crypto reuses the in-repo cog-ha-matter::witness_signing pattern (same ed25519-dalek 2.x API, same deterministic-test-key convention). SHA-256 matches the sha256: prefix the manifest already declared and the cog-ha-matter cog manifest's binary_sha256 hex convention. No new external dependency tree was introduced — ed25519-dalek / sha2 / hex / base64 were already in the workspace Cargo.lock (cog-ha-matter / bfld pull them in); only new dependency edges were added to homecore-plugins.

Grading vocabulary (ADR-152 / ADR-158 / ADR-160 / ADR-161):

  • MEASURED — reproduced in this worktree, command + failing-on-old test recorded.
  • ACCEPTED-FUTURE — deliberately deferred, nothing dropped.

Decision — Fixes Landed

§P4 — Plugin signature & integrity verification (SECURITY) — MEASURED

homecore-plugins/src/manifest.rs declared wasm_module_hash / wasm_module_sig / publisher_key but they were never read for verification; the load path (wasmtime_runtime.rs) instantiated any .wasm bytes handed to it.

Real fix (src/verify.rs, wired into WasmtimeRuntime::load_plugin): before instantiation the runtime now —

  1. computes the SHA-256 of the actual .wasm bytes and rejects if it ≠ the manifest's wasm_module_hash (sha256:<hex>) — tamper detection;
  2. verifies the Ed25519 wasm_module_sig (ed25519:<base64>, 64-byte raw) over the 32-byte digest against publisher_key (ed25519:<base64>, 32-byte raw) and rejects on failure;
  3. enforces a configurable trust policy: PluginPolicy::trusted(&[keys]) is an allowlist of publisher verifying keys; PluginPolicy::AllowUnsigned is an explicit dev escape hatch that LOGS a loud warn on every load it waves through. The secure default rejects unsigned and unknown-publisher modules. PluginPolicy::deny_all() trusts no publisher.
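
Step 1 above (the hash comparison) can be illustrated with a minimal, std-only sketch. It assumes the digest of the .wasm bytes has already been computed elsewhere (for example with the sha2 crate the workspace already carries); the function names parse_sha256_field and digest_matches are hypothetical and are not taken from src/verify.rs. Accepting upper-case hex is also an assumption.

```rust
/// Parse a manifest hash field of the form `sha256:<64 hex chars>` into raw bytes.
/// Hypothetical helper: the real parser in src/verify.rs may differ.
pub fn parse_sha256_field(field: &str) -> Option<[u8; 32]> {
    let hex = field.strip_prefix("sha256:")?;
    // Require exactly 64 hex digits; this also guarantees ASCII, so slicing is safe.
    if hex.len() != 64 || !hex.bytes().all(|b| b.is_ascii_hexdigit()) {
        return None;
    }
    let mut out = [0u8; 32];
    for (i, byte) in out.iter_mut().enumerate() {
        *byte = u8::from_str_radix(&hex[2 * i..2 * i + 2], 16).ok()?;
    }
    Some(out)
}

/// True only when the declared field parses and equals the computed digest.
/// A malformed field is treated as a mismatch (reject), never as a pass.
pub fn digest_matches(declared_field: &str, computed: &[u8; 32]) -> bool {
    match parse_sha256_field(declared_field) {
        Some(declared) => declared == *computed,
        None => false,
    }
}
```

Treating a malformed field as a mismatch keeps the failure mode closed: a manifest that cannot be parsed is rejected the same way a tampered module is.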

A typed PluginError::SignatureRejected is returned (no host panic). The legacy permission-free load_wasm is retained for first-party/trusted/test modules; production loading goes through load_plugin.
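
The trust-policy step can be sketched as a small decision function. This is an illustration under stated assumptions, not the code in src/verify.rs: the names PluginPolicy::trusted, AllowUnsigned and deny_all come from this ADR, while the enum layout, the check method, and the PolicyRejection type are hypothetical. It also assumes that under AllowUnsigned a signed module passes the policy step without an allowlist lookup, which the ADR does not specify.

```rust
use std::collections::HashSet;

/// Raw 32-byte Ed25519 verifying key (the decoded `publisher_key`).
pub type PublisherKey = [u8; 32];

/// Hypothetical rejection reasons for the policy step only.
#[derive(Debug, PartialEq, Eq)]
pub enum PolicyRejection {
    Unsigned,
    UntrustedPublisher,
}

/// Assumed shape: an allowlist variant plus the explicit dev escape hatch.
pub enum PluginPolicy {
    Trusted(HashSet<PublisherKey>),
    AllowUnsigned,
}

impl PluginPolicy {
    /// Allowlist of publisher verifying keys.
    pub fn trusted(keys: &[PublisherKey]) -> Self {
        PluginPolicy::Trusted(keys.iter().copied().collect())
    }

    /// An empty allowlist: no publisher is trusted.
    pub fn deny_all() -> Self {
        PluginPolicy::Trusted(HashSet::new())
    }

    /// `publisher` is `Some(key)` when a signature has already been verified
    /// against `key`, and `None` when the module carries no signature.
    pub fn check(&self, publisher: Option<&PublisherKey>) -> Result<(), PolicyRejection> {
        match (self, publisher) {
            (PluginPolicy::AllowUnsigned, None) => {
                // Loud warning on every unsigned load that is waved through.
                eprintln!("WARN: loading UNSIGNED plugin under AllowUnsigned");
                Ok(())
            }
            // Assumption: signed modules pass the policy step in dev mode.
            (PluginPolicy::AllowUnsigned, Some(_)) => Ok(()),
            (PluginPolicy::Trusted(_), None) => Err(PolicyRejection::Unsigned),
            (PluginPolicy::Trusted(keys), Some(key)) => {
                if keys.contains(key) {
                    Ok(())
                } else {
                    Err(PolicyRejection::UntrustedPublisher)
                }
            }
        }
    }
}
```

Modelling deny_all as an empty allowlist means the secure default needs no special case: every key lookup simply fails.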

Failing-on-old tests (tests/integration.rs, --features wasmtime) — all drive load_plugin, which did not exist on the old code (so the gate is genuinely new):

  • p4_tampered_module_is_rejected — a byte-flipped .wasm → hash mismatch → rejected.
  • p4_valid_sig_from_trusted_key_loads — a valid sig from an allowlisted key loads.
  • p4_valid_sig_from_untrusted_key_is_rejected — a correctly-signed module from a key NOT on the allowlist is rejected.
  • p4_unsigned_module_rejected_by_default_loads_only_under_allow_unsigned — unsigned rejected under deny_all, loads (with warn) only under AllowUnsigned.
  • Unit (src/verify.rs): valid_sig_from_trusted_key_passes, tampered_module_is_rejected, valid_sig_from_untrusted_key_is_rejected, forged_signature_is_rejected, unsigned_module_rejected_under_default_policy.

A real deterministic keypair signs real .wasm bytes in the tests. The manifest doc now reads "(P4 — ENFORCED, ADR-162)". Grade: MEASURED. Milestone headline.

§P5 — Plugin authority / capability isolation (SECURITY) — MEASURED

wasmtime_runtime.rs::hc_state_set applied any write a plugin requested, ignoring the manifest's homecore_permissions.

Real fix (src/permissions.rs + hc_state_set): the manifest's homecore_permissions (the state:write:<glob> form, or a bare entity glob like light.*) are distilled into a PermissionSet installed in the plugin's Wasmtime store. The hc_state_set host import consults permissions.may_write(entity_id) before applying a write and, on a violation, returns a typed -3 (permission denied) to the guest; the host does not panic. Wasmtime already gives memory isolation; this adds authority isolation. A plugin with no write grants can write nothing (secure default).
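
A minimal sketch of the write gate described above, using only the standard library. PermissionSet, may_write and the -3 return code are named in this ADR; everything else is an assumption made for illustration: the from_manifest constructor, the host_state_set stand-in for the real Wasmtime host import, the success code 0, the HashMap used as a state store, and a glob dialect limited to exact matches and a trailing `*`.

```rust
use std::collections::HashMap;

pub const PERMISSION_DENIED: i32 = -3; // code named in this ADR
pub const WRITE_OK: i32 = 0; // assumption: the real success code is not stated

/// Write grants distilled from the manifest's `homecore_permissions`.
#[derive(Debug, Default)]
pub struct PermissionSet {
    write_globs: Vec<String>,
}

impl PermissionSet {
    /// Hypothetical constructor: keeps `state:write:<glob>` entries and bare
    /// entity globs; any other prefixed entry (e.g. a read grant) confers no write.
    pub fn from_manifest(entries: &[&str]) -> Self {
        let mut write_globs = Vec::new();
        for entry in entries {
            if let Some(glob) = entry.strip_prefix("state:write:") {
                write_globs.push(glob.to_string());
            } else if !entry.contains(':') {
                write_globs.push(entry.to_string());
            }
        }
        Self { write_globs }
    }

    /// An empty grant list matches nothing, which is the secure default.
    pub fn may_write(&self, entity_id: &str) -> bool {
        self.write_globs.iter().any(|g| glob_matches(g, entity_id))
    }
}

/// Assumed glob dialect: exact match, or prefix match when the glob ends in `*`.
fn glob_matches(glob: &str, entity_id: &str) -> bool {
    match glob.strip_suffix('*') {
        Some(prefix) => entity_id.starts_with(prefix),
        None => glob == entity_id,
    }
}

/// Stand-in for the host import: check first, write only when permitted.
pub fn host_state_set(
    perms: &PermissionSet,
    states: &mut HashMap<String, String>,
    entity_id: &str,
    value: &str,
) -> i32 {
    if !perms.may_write(entity_id) {
        return PERMISSION_DENIED; // typed error to the guest, no panic, no write
    }
    states.insert(entity_id.to_string(), value.to_string());
    WRITE_OK
}
```

Checking before mutating is what makes the denial observable in two ways at once: the guest sees -3 and the entity is left untouched.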

Failing-on-old tests (tests/integration.rs, --features wasmtime):

  • p5_declared_light_plugin_may_write_light_but_not_lock — a light.* plugin writes light.kitchen (succeeds) but is REJECTED (-3, and the entity is not written) when it tries lock.front_door.
  • p5_plugin_with_no_permissions_can_write_nothing — a plugin with empty homecore_permissions cannot write light.kitchen.
  • Unit (src/permissions.rs): domain-glob, exact-grant, wildcard, read-grants-don't-confer-write, no-permissions, and explicit state:write: form.

The manifest doc now reads "(P5 — ENFORCED, ADR-162)". Grade: MEASURED.

§A5 — Bounded automation RunModes (Restart / Queued / max) — MEASURED

homecore-automation/src/engine.rs (per ADR-161) honored Single/Parallel but spawned an unbounded parallel task for Restart/Queued/max.

Real fix (src/runmode.rs, a per-automation RunState the engine owns and dispatches through at all three trigger sites — event loop, timer, test hook):

  • Restart — aborts the in-flight action task via tokio::task::AbortHandle, then starts a fresh one.
  • Queued — serializes runs in arrival order via a per-automation async Mutex: sequential, never concurrent, nothing dropped.
  • max: N — caps concurrency at N via a per-automation Semaphore; triggers beyond N queue (await a permit) rather than running concurrently. (HA bounded parallel/queued semantics — chosen and documented as queue beyond N, not drop.)
  • Single/IgnoreFirst re-entrancy guard and Parallel preserved.
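
The Restart semantics can be illustrated without tokio. The sketch below deliberately swaps in a different technique: plain std threads with a generation counter and cooperative cancellation, rather than the tokio::task::AbortHandle the engine uses. RestartState and trigger_restart are hypothetical names, and the step count and sleep stand in for real action work.

```rust
use std::sync::atomic::{AtomicU64, AtomicUsize, Ordering};
use std::sync::Arc;
use std::thread::{self, JoinHandle};
use std::time::Duration;

/// Per-automation state for the Restart mode (illustrative only).
#[derive(Default)]
pub struct RestartState {
    generation: AtomicU64,
    completions: AtomicUsize,
}

impl RestartState {
    /// Number of runs that reached the end without being superseded.
    pub fn completions(&self) -> usize {
        self.completions.load(Ordering::SeqCst)
    }
}

/// Start a new run; any run started earlier becomes stale and stops
/// at its next checkpoint instead of completing.
pub fn trigger_restart(state: &Arc<RestartState>, steps: u32, step: Duration) -> JoinHandle<()> {
    // Bumping the generation is what "cancels" the in-flight run.
    let my_generation = state.generation.fetch_add(1, Ordering::SeqCst) + 1;
    let state = Arc::clone(state);
    thread::spawn(move || {
        for _ in 0..steps {
            if state.generation.load(Ordering::SeqCst) != my_generation {
                return; // superseded by a newer trigger
            }
            thread::sleep(step); // stand-in for one unit of action work
        }
        // Only the current generation may record a completion.
        if state.generation.load(Ordering::SeqCst) == my_generation {
            state.completions.fetch_add(1, Ordering::SeqCst);
        }
    })
}
```

Cooperative cancellation only takes effect at checkpoints, whereas aborting a tokio task stops it at its next await point; the observable property is the same in both cases: two rapid triggers yield one completion.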

engine.rs trimmed to 433 lines; the run-mode machinery lives in the new runmode.rs (153 lines) to keep both under the 500-line guideline.

Failing-on-old tests (tests/engine_behaviors.rs) — each was run against a simulated unbounded-parallel dispatch and confirmed to panic:

  • restart_mode_cancels_prior_run — prior run is aborted: exactly 1 completion (old: both ran → 2).
  • queued_mode_runs_sequentially_not_concurrently — 3 rapid triggers all run, max observed concurrency = 1 (old: 3).
  • max_two_caps_concurrency_at_two — 4 rapid triggers all run, max observed concurrency ≤ 2 (old: 4).
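
The "max observed concurrency" measurement used by these tests can be sketched with a counting semaphore and a high-water mark. This is a std-only stand-in: threads, a Mutex and a Condvar replace the tokio tasks and per-automation Semaphore of the real engine, and Permits and run_capped are hypothetical names. The cap must be at least 1, otherwise no run could ever start.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::{Arc, Condvar, Mutex};
use std::thread;
use std::time::Duration;

/// Minimal counting semaphore: triggers beyond the cap wait, they are not dropped.
pub struct Permits {
    available: Mutex<usize>,
    freed: Condvar,
}

impl Permits {
    pub fn new(n: usize) -> Self {
        Permits { available: Mutex::new(n), freed: Condvar::new() }
    }

    pub fn acquire(&self) {
        let mut available = self.available.lock().unwrap();
        while *available == 0 {
            available = self.freed.wait(available).unwrap();
        }
        *available -= 1;
    }

    pub fn release(&self) {
        *self.available.lock().unwrap() += 1;
        self.freed.notify_one();
    }
}

/// Fire `triggers` runs at once under a concurrency cap.
/// Returns (runs completed, maximum number of runs observed in flight).
pub fn run_capped(triggers: usize, cap: usize) -> (usize, usize) {
    let permits = Arc::new(Permits::new(cap));
    let running = Arc::new(AtomicUsize::new(0));
    let max_seen = Arc::new(AtomicUsize::new(0));
    let completed = Arc::new(AtomicUsize::new(0));

    let handles: Vec<_> = (0..triggers)
        .map(|_| {
            let permits = Arc::clone(&permits);
            let running = Arc::clone(&running);
            let max_seen = Arc::clone(&max_seen);
            let completed = Arc::clone(&completed);
            thread::spawn(move || {
                permits.acquire();
                // Record the high-water mark while holding a permit.
                let now = running.fetch_add(1, Ordering::SeqCst) + 1;
                max_seen.fetch_max(now, Ordering::SeqCst);
                thread::sleep(Duration::from_millis(20)); // stand-in for the action
                running.fetch_sub(1, Ordering::SeqCst);
                completed.fetch_add(1, Ordering::SeqCst);
                permits.release();
            })
        })
        .collect();

    for handle in handles {
        handle.join().unwrap();
    }
    (completed.load(Ordering::SeqCst), max_seen.load(Ordering::SeqCst))
}
```

With a cap of 1 the probe shows the never-concurrent property of Queued, though this semaphore does not guarantee arrival order the way the per-automation async Mutex described above does.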

Grade: MEASURED. Restart, Queued, and max: N all implemented — no remaining RunMode deferral.

Threat model closed

| Threat | Before (ADR-161) | After (ADR-162) |
|---|---|---|
| Tampered module: attacker swaps .wasm bytes after signing | loaded unconditionally (hash never checked) | rejected: SHA-256 mismatch |
| Untrusted publisher: valid sig from a key the host doesn't trust | loaded (sig/key never read) | rejected: publisher_key not on allowlist |
| Unsigned module: no integrity material at all | loaded | rejected by secure default; loads only under explicit AllowUnsigned (loud warn) |
| Over-privileged plugin write: a light.* plugin writes lock.front_door / alarm_control_panel.* | applied (permissions never consulted) | denied: typed -3 to guest, write not applied |
| Run-mode resource exhaustion: max/Queued spawn unbounded tasks | unbounded parallel | bounded: Restart cancels, Queued serializes, max: N caps at N |

Remaining honest deferral (Nothing Dropped)

  • Plugin-key provisioning / rotation — the host's trust allowlist (PluginPolicy::trusted) is supplied by the caller; sourcing it from the Cognitum control-plane key store (as cog-ha-matter does for Seed keys) and key rotation are ACCEPTED-FUTURE (out of M8 scope — same boundary witness_signing draws).
  • InProcessRuntime (native first-party plugins) — has no .wasm bytes to hash, so P4/P5 apply only to the WASM (wasmtime) path; native plugins remain trusted-by-compilation. Honestly noted, not over-claimed.
  • HAP real pairing (P2) — unchanged from ADR-161; out of M8 scope.

Reproduction (MEASURED)

cd v2
# P4/P5 (wasmtime feature needs rustc 1.91+; workspace pins 1.89 for the rest):
cargo +1.91.1 test -p homecore-plugins --features wasmtime
# Bounded RunModes:
cargo test -p homecore-automation --no-default-features
# Full workspace still builds (1.89 toolchain, no wasmtime):
cargo build --workspace --no-default-features

Result at time of writing (all 0 failed):

  • homecore-plugins --features wasmtime: 32 passed (lib 23; integration 9). (ADR-161 baseline was 15.)
  • homecore-automation --no-default-features: 45 passed (lib 37; engine_behaviors 8). (ADR-161 baseline was 42.)
  • Full workspace cargo build --workspace --no-default-features succeeds.

Consequences

  • A HOMECORE WASM plugin can no longer be loaded with a tampered binary, an untrusted publisher, or (by default) no signature at all — the trust boundary ADR-161/B5 honestly said was absent is now real (P4).
  • A plugin can no longer write entities outside its declared homecore_permissions; the lock/alarm escalation path is closed (P5).
  • The automation engine's Restart, Queued, and max: N run-modes are now bounded as documented — no run-mode claims a capability the code lacks.
  • No new external dependency tree (reuses the cog-ha-matter Ed25519 stack already in the lock); source files kept under the 500-line guideline (engine.rs 433, runmode.rs 153, verify.rs 397, permissions.rs 168; wasmtime_runtime.rs non-test source < 500, inline WAT tests as ADR-161 left them).