wifi-densepose

Commit Graph

Author	SHA1	Message	Date
rUv	6ee1b55896	feat: implement ADR-270 vendor provider beta (#1360)	2026-07-19 00:09:50 -04:00
rUv	76c80c33d7	feat(hardware): add Qualcomm CSI simulator and vendor roadmap (#1359)	2026-07-18 23:03:45 -04:00
rUv	232b1c79f6	feat(hardware): add MediaTek Filogic CSI simulator (#1358)	2026-07-18 22:00:09 -04:00
rUv	8a5af5dad4	feat: add RTL8720F Realtek radar beta support (#1356 ) * docs(adr): plan RTL8720F radar SDK integration * feat(hardware): add Rust RTL8720F radar simulator * feat(ruview): ingest RTL8720F radar frames	2026-07-18 19:48:40 -04:00
ruv	8c0bdaef92	fix open issue release blockers	2026-07-18 18:01:23 -04:00
rjperry36	1692b16f6e	fix(sensing-server): canonicalize calibration frames to 56-tone grid (real HT40 fix) Follow-up to the calibration deadlock fix. With the status gate unstuck, maybe_feed_calibration reached feed_calibration, but a real ESP32 HT40 node streams 128-wide amplitude frames while the single-link FieldModel is the canonical 56-tone grid. LinkStats::update returned DimensionMismatch, feed_calibration bubbled it, and maybe_feed_calibration swallowed it at debug level — so frame_count stayed pinned at 0 on live hardware (presence/vitals, which read the global history, were unaffected). Resample each frame onto the model's canonical 56-tone grid via HardwareNormalizer::resample_to_canonical before feeding — the same length-only canonicalization the multistatic fusion path uses (#1170). Pinned by maybe_feed_calibration_resamples_wide_frames_and_accumulates (128-wide -> Collecting + count 1). sensing-server bin: 173 passed, 0 failed. Co-Authored-By: claude-flow <ruv@ruv.net>	2026-07-18 17:27:30 -04:00
rjperry36	4bab48e15f	fix(sensing-server): unstick empty-room field-model calibration deadlock POST /api/v1/calibration/start created the FieldModel in Uncalibrated, but field_bridge::maybe_feed_calibration only fed frames while already Collecting — and the only thing that sets Collecting is feed_calibration on its first fed frame. The two gates deadlocked: no first frame was ever fed, so calibration_frame_count stayed 0 and status never left Uncalibrated. Observed live on a streaming ESP32 node as {"status":"Uncalibrated","frame_count":0} that never advanced. - field_bridge::maybe_feed_calibration: feed while Uncalibrated \| Collecting so the first frame flips the model to Collecting and the count advances. - calibration_stop: return structured {success:false, frame_count, frames_needed} instead of an opaque 500 when finalized with too few frames. - FieldModel::min_calibration_frames() accessor for the guard above. - Regression test: maybe_feed_calibration_advances_uncalibrated_to_collecting. Presence/motion/vitals were unaffected (separate auto rolling baseline). Co-Authored-By: claude-flow <ruv@ruv.net>	2026-07-18 17:27:08 -04:00
ruv	82c1b8fdf8	chore: bump wifi-densepose-signal 0.3.5 for crates.io (#1334 ) Published 0.3.4 predates HardwareNormalizer::resample_to_canonical and MultistaticConfig::for_tdm_schedule, which the sensing-server binary uses — its publish verify fails against the registry 0.3.4. The in-repo version had not been bumped since those APIs landed. Co-Authored-By: claude-flow <ruv@ruv.net>	2026-07-14 13:31:42 -04:00
ruv	9dceb976c7	chore(publish): version rufield deps + bump worldgraph/rufield submodules (#1334 ) - wifi-densepose-rufield: add version="0.1.0" to the four rufield path deps — rufield-core/-provenance/-privacy/-fusion are now published to crates.io, making this crate (and wifi-densepose-sensing-server 0.3.4) publishable - v2/crates/worldgraph -> 4441bc0: wifi-densepose-worldgraph 0.3.2 published (adds prune_semantic_states; unblocks wifi-densepose-engine 0.3.1 publish) - vendor/rufield -> f3c1492: breaks the fusion<->adapters circular dev-dependency (path-only dev-dep, stripped at publish) Closes the crates.io publish blockers in #1334. Co-Authored-By: claude-flow <ruv@ruv.net>	2026-07-14 13:18:43 -04:00
ruv	8c1d3d772a	chore: bump wifi-densepose-engine 0.3.1, wifi-densepose-sensing-server 0.3.4 engine 0.3.1: additive StreamingEngine::set_multistatic_config (#1312) sensing-server 0.3.4: bearer-auth pose-WS exemption + EngineBridge guard config threading (#1312, #1313); no public lib API change (engine_bridge is binary-internal) Co-Authored-By: claude-flow <ruv@ruv.net>	2026-07-14 12:12:48 -04:00
ruv	83f549d308	fix: post-review fixes for PRs #1311/#1312/#1313 + CHANGELOG - sdkconfig.defaults.devkitc: header build command idf v5.2 -> v5.4 (source needs esp_driver_uart, IDF >=5.3 — same reason the PR fixed the README refs) - engine/lib.rs: separate doc comments — set_room_adapter's ADR-150 adapter/witness doc had merged into set_multistatic_config's doc - ui: apply stored bearer token at api.service.js module load instead of QuickSettings.init — services + dashboard tab dispatch their first /api/v1/* requests before initializeEnhancements() runs, so the first requests on every load went out without the Authorization header - CHANGELOG: Unreleased entries for #1308/#1309/#1310 Co-Authored-By: claude-flow <ruv@ruv.net>	2026-07-14 11:08:37 -04:00
ruv	d59ca00baa	Merge PR #1313: exempt /api/v1/stream/pose WS from bearer auth + UI token field	2026-07-13 13:50:36 -04:00
erichkusuki	2ddb6a7b02	fix(sensing-server): exempt /api/v1/stream/pose WS from bearer auth; add UI token field Browsers cannot attach an Authorization header to a WebSocket upgrade, so with RUVIEW_API_TOKEN set the Live Demo pose stream at /api/v1/stream/pose always failed with 401 — the same reason /ws/sensing is already exempted (see bearer_auth module docs). Adds a narrow EXEMPT_PATHS list plus a regression test that the exemption does not leak to other /api/v1/* paths. Query-string tokens remain rejected (CWE-598 test untouched). Also adds an 'API Access' bearer-token field to the QuickSettings panel: ui/services/api.service.js had setAuthToken() but nothing ever called it, so enabling RUVIEW_API_TOKEN broke every /api/v1/* call from the bundled dashboard. The token is stored in localStorage and applied before the first request. Fixes #1310 Co-Authored-By: claude-flow <ruv@ruv.net>	2026-07-11 13:32:02 +02:00
erichkusuki	b5ce60081b	fix(sensing-server): wire WDP_GUARD_INTERVAL_US/WDP_TDM_* into EngineBridge EngineBridge::new built its own StreamingEngine, whose internal MultistaticFuser was hardcoded to MultistaticConfig::default() (60 ms guard) — the #1031/#1049 env overrides only reached the sibling multistatic_fuser field on AppState, so governed trust cycles failed against the default guard no matter what the deployment configured. - wifi-densepose-engine: add StreamingEngine::set_multistatic_config() - engine_bridge: EngineBridge::new takes Option<MultistaticConfig> - main.rs: thread the same env-derived config into both fusion paths Verified on a 2-node ESP32-S3 deployment: the failure message now reflects the configured guard, and with healthy nodes governed cycles ran 90 s with zero fusion errors. Fixes #1309 Co-Authored-By: claude-flow <ruv@ruv.net>	2026-07-11 13:32:01 +02:00
rUv	fca5e6f0a0	fix: multistatic canonicalization, csi_fps burst inflation, control-packet starvation (#1170 , #1180 , #1183 ) (#1193 ) #1170 — live multistatic bridge fed raw, un-canonicalized per-node CSI (64/128/192 bins) to MultistaticFuser, tripping DimensionMismatch every cycle and silently disabling fusion on mixed HT20/HT40 meshes. Add HardwareNormalizer::resample_to_canonical (resample-only, no z-score) and canonicalize every node frame onto the 56-tone grid before fusion. #1180 — update_csi_fps_ema only rejected dt<=0 or >=1s, so sub-ms UDP-burst arrivals (36us -> ~27kHz) inflated csi_fps_ema 40-840x. Add a 5ms plausibility floor and stop re-anchoring observe_csi_frame_arrival on burst deltas. #1183 — global ENOMEM backoff (CSI flood) starved <=48B/<=1Hz control packets. Add stream_sender_send_priority() bypassing the backoff gate without touching the streak; route feature_state/HEALTH/sync through it. Fix the misleading "HEALTH sent" log that printed even on rv_mesh_send failure. Verified: signal 501, sensing-server 677 tests (0 failed); firmware builds clean. Claude-Session: https://claude.ai/code/session_01AgpTcBLRJ32hUsKWxDXf36	2026-06-27 13:04:44 -04:00
rUv	315d7df09e	chore(deps): bump ruv-neural submodule — ColorMap no_std for ESP32 (#1126 ) Points to ruvnet/ruv-neural#3 (c9638fa): ruv-neural-viz::ColorMap now builds no_std, so it can run on the ESP32. Unblocks driving the onboard WS2812 from the viridis/cool-warm colormap.	2026-06-17 20:18:35 -04:00
rUv	bdd1eaf927	chore: untrack ruvector.db runtime artifacts + gitignore at any depth (#1124) These are hook/runtime-generated databases (ruvector/intelligence store) that kept showing as dirty and don't belong in version control. Removed from the index (files kept on disk) and ignored globally.	2026-06-17 17:49:47 -04:00
rUv	c6e7667676	Merge pull request #1104 from ruvnet/fix/issue-1049-configurable-guard fix(sensing-server): make multistatic guard interval configurable (closes #1049)	2026-06-17 09:53:23 -04:00
ruv	c84ea39e62	chore: bump ruv-neural submodule → current main (web console, closed-loop neuromod, ruvn mention) Advances the vendored ruv-neural submodule from the stale 'Initial' pin (1ece3af) to current main (81be9e1): the static web console UI, the closed-loop neuromodulation platform, repositioned README, and the @ruvnet/ruvn companion-tool mention. ruv-neural is not a v2 workspace member, so this does not affect the workspace build (cargo metadata resolves clean). Co-Authored-By: claude-flow <ruv@ruv.net>	2026-06-16 17:00:13 -04:00
ruv	9c751d0d92	chore(worldgraph): bump submodule — README + metadata polish Co-Authored-By: claude-flow <ruv@ruv.net>	2026-06-16 14:52:34 -04:00
ruv	a13e9b66cb	chore: bump ruv-drone + worldgraph submodules (LICENSE + CI polish) Co-Authored-By: claude-flow <ruv@ruv.net>	2026-06-16 14:43:10 -04:00
ruv	6db183bf3e	chore(swarm): bump ruv-drone submodule — README cleanup Co-Authored-By: claude-flow <ruv@ruv.net>	2026-06-16 14:35:06 -04:00
ruv	f65d0f79e7	chore(swarm): bump ruv-drone submodule (rescope + stray-file cleanup) Co-Authored-By: claude-flow <ruv@ruv.net>	2026-06-16 14:28:30 -04:00
ruv	7fb3b88061	chore(swarm): bump ruv-drone submodule — industrial rescope (drop ITAR/USML gating) Co-Authored-By: claude-flow <ruv@ruv.net>	2026-06-16 14:27:24 -04:00
ruv	aeac5f5543	chore(worldgraph): extract geo+worldgraph+worldmodel to ruvnet/worldgraph submodule - published as github.com/ruvnet/worldgraph (3-crate workspace, history via git-filter-repo) - replace the 3 in-tree crates with one submodule at v2/crates/worldgraph - parent workspace: drop the 3 members, exclude the submodule (it is its own workspace), repoint workspace.dependencies(worldmodel) + engine/sensing-server path-deps into it - cargo metadata resolves clean (geo/worldgraph/worldmodel consumed from the submodule) Co-Authored-By: claude-flow <ruv@ruv.net>	2026-06-16 14:14:34 -04:00
ruv	c257e67c3d	chore(swarm): extract ruview-swarm to ruvnet/ruv-drone submodule - ruview-swarm published as github.com/ruvnet/ruv-drone (history preserved via subtree split) - replace the in-tree crate with a submodule at v2/crates/ruview-swarm (branch main) - standalone repo dropped the unused wifi-densepose-core path-dep; export-control NOTICE added there - workspace member path unchanged; cargo metadata resolves ruview-swarm from the submodule Co-Authored-By: claude-flow <ruv@ruv.net>	2026-06-16 14:03:56 -04:00
ruv	c27d6cc98e	fix(sensing-server): make multistatic guard interval operator-configurable (#1049 ) Two ESP32-S3 nodes on WiFi/ESP-NOW sync drift 10-150 ms (~70 ms typ.), exceeding the 60 ms default guard → permanent trust demotion to Restricted, all pose output suppressed, 200k+ errors, no escape but a container restart. Add a direct WDP_GUARD_INTERVAL_US override (+ optional WDP_SOFT_GUARD_US) to multistatic_guard_config_from_env. Precedence (most-specific wins): direct override > WDP_TDM_SLOTS+WDP_TDM_SLOT_US schedule-derived > 60ms/20ms default. Soft band always clamped strictly below hard; malformed/zero ignored (falls back, never breaks fusion). Effective guard logged at startup. Pinned by 6 tests (multistatic_guard_config_tests). sensing-server bin tests 449 -> 455, 0 failed. Python proof PASS, hash unchanged (off signal path). Closes #1049. Co-Authored-By: claude-flow <ruv@ruv.net>	2026-06-15 13:41:43 -04:00
rUv	cafbeb1e81	fix(wasm-edge): sanitize non-finite host floats at the WASM↔host frame boundary (#1102 ) Closing beyond-SOTA security review of wifi-densepose-wasm-edge (ADR-040, ~70 edge modules). The two WASM↔host boundaries (lib.rs::on_frame/on_timer and bin/ghost_hunter.rs::on_frame) read raw IEEE-754 f32 from the csi_get_* imports with no finiteness check — the crate had zero is_finite/is_nan guards and its clamp helpers propagate NaN. A single non-finite host value latches NaN into long-lived per-module accumulators (EMA / Welford / phasor sums / anomaly baselines), after which detectors fail degraded (stuck gate state, silently-disabled checks) — silent corruption, not a crash. Add sanitize_host_f32() (non-finite -> 0.0, core-only for no_std) applied at every host_get_* float read: one chokepoint covering all downstream modules, mirroring the existing M-01 negative-n_subcarriers boundary clamp. LOW / defense-in-depth (the Tier-2 DSP firmware supplies the imports, a semi-trusted boundary). Pinned by boundary_tests::{sanitize_passes_finite_values_through, sanitize_maps_non_finite_to_zero, coherence_monitor_nan_latches_without_sanitize_but_not_with} — the last asserts on the current CoherenceMonitor that a raw NaN frame latches the smoothed score while the sanitized path stays finite. Other review dimensions attested clean with evidence (see CHANGELOG): no hot-path panics (all unwrap/expect are test-only or std-gated RVF builder), all bounds min()-clamped, all index-by-cast const-bounded or guarded, no leaking closures (no move\|\|/forget/leak), no secrets. Verified: host `cargo test --features std,medical-experimental` 672 passed / 0 failed (+3 new tests); all three wasm32-unknown-unknown release artifacts build clean (lib default no_std/panic=abort, ghost_hunter standalone-bin, medical-experimental); Python proof VERDICT PASS, hash unchanged.	2026-06-15 13:06:46 -04:00
rUv	c859f6f743	security(occworld-candle): int32-checkpoint crash + degenerate-input guards + ADR-179 (closes Milestone #9 ) (#1101 ) * fix(occworld-candle): security review fixes — int32 checkpoint crash + predict input validation Beyond-SOTA security + correctness review of wifi-densepose-occworld-candle (Milestone #9, crate 4/4 — the last ungated crate). Findings fixed: 1. HIGH (MEASURED) — checkpoint-load crash on any int32 tensor. model.rs mapped safetensors I32 -> candle DType::I64 and passed the raw int32 byte buffer (4 bytes/elem) to Tensor::from_raw_buffer(.., I64, ..). Candle derives elem_count = data.len() / dtype.size(), so the I64 path halved the count while keeping the original shape -> a tensor whose shape claims 2x its storage. Reading it PANICS (slice OOB: "range end index 6 out of range for slice of length 3") on any checkpoint containing an int32 tensor. Fixed: I32 -> DType::I32, I16 -> DType::I16 (both first-class candle dtypes). Reproduced on old code; pinned in tests/checkpoint_loading.rs. 2. LOW (MEASURED) — predict() lacked frame/batch validation at the input boundary. f_in > num_frames2 over-indexed the temporal embedding (cryptic candle "gather" error); zero frame/batch fed a zero-element tensor in. Now rejected with a clear ShapeMismatch. Pinned in tests/input_validation.rs. 3. LOW (MEASURED) — divide-by-zero panic in the public VQCodebook::encode on a rank-0 / empty-last-dim tensor (last == 0). Now fails closed with a clear error. Pinned in vqvae.rs unit tests. Dimensions confirmed clean with evidence: panic surface (no unwrap/expect/ panic in prod paths), NaN-state-poisoning (N/A — stateless engine, u8 input), unbounded-alloc/shape-data mismatch (defended upstream by safetensors:: validate), secrets (none). unsafe_code = forbid. 
Validation (MEASURED, Windows): crate 31/31 pass; workspace 0 failed (lone desktop api_integration "Access is denied" file-lock flake passes 21/21 in isolation); Python proof VERDICT PASS, hash f8e76f21…446f7a unchanged. Warrants ADR slot 179 (parent to author). Co-Authored-By: claude-flow <ruv@ruv.net> docs(adr): ADR-179 — occworld-candle checkpoint-load hardening (closes Milestone #9) Records the HIGH int32-checkpoint crash fix (I32→I64 dtype-widening → slice-OOB panic on load = DoS) + 2 LOW degenerate-input fixes from 5e77f47e5. Stateless engine (NaN-poisoning N/A), unsafe forbidden, safetensors validate() defends malloc upstream. occworld 31/31. Final ungated crate — Milestone #9 complete. Co-Authored-By: claude-flow <ruv@ruv.net>	2026-06-15 12:35:29 -04:00
rUv	10c813fde3	security(desktop): IPC serial-command-injection + over-broad shell capability + ADR-178 (#1100 ) * fix(security): desktop IPC serial-command-injection + over-broad shell capability (ADR-178) Beyond-SOTA security review of wifi-densepose-desktop (Tauri v2). Two real findings, each MEASURED on Windows (crate builds + tests under --no-default-features): WDP-DESK-01 (MODERATE) — serial command injection via configure_esp32_wifi. The #[tauri::command] handler concatenated webview-supplied ssid/password into newline-terminated serial commands with no validation; a \r\n let a compromised webview inject an arbitrary follow-up firmware command (reboot/erase). Added validate_wifi_credentials() enforcing WPA2 length bounds and rejecting all control characters, called fail-closed before any serial write. Pinned by 3 new tests (rejects \r\n / \n / NUL injection, rejects out-of-range, accepts valid boundaries). WDP-DESK-02 (MODERATE) — removed unused shell:allow-execute / shell:allow-open from capabilities/default.json. The Rust backend spawns processes via std::process::Command (bypassing the allowlist) and the UI only uses dialog.open; the shell perms were unused privilege granting the webview arbitrary host command execution on compromise. Regenerated capabilities.json confirms only core:default + dialog perms remain. lib tests 18 -> 21 (+3 pins), integration 21 -> 21, 0 failed. Python deterministic proof unchanged (f8e76f21...46f7a; desktop off the signal path). Co-Authored-By: claude-flow <ruv@ruv.net> * docs(adr): ADR-178 — desktop IPC injection fix + capability least-privilege Records the 2 MEASURED MODERATE fixes in feddcde9d: WDP-DESK-01 (webview ssid/password \r\n-injected arbitrary firmware serial commands → validated fail-closed) and WDP-DESK-02 (unused shell:allow-execute/open capability granted to the webview → removed). 30-command IPC surface + capability scope audited; 6 dimensions clean-with-evidence. desktop 18→21. 
Co-Authored-By: claude-flow <ruv@ruv.net>	2026-06-15 12:01:17 -04:00
rUv	20ad75f30c	feat(ADR-131): HOMECORE-UI dashboard + BFF gateway — review-fixed (supersedes #1082 ) (#1099 ) * feat(ADR-131): HOMECORE-UI operational dashboard + BFF gateway Complete two-tier Cognitum operator dashboard (ADR-131), served by homecore-server at /homecore, plus the single-origin BFF gateway that wires it to real backends. Front-end (zero-dep vanilla TS/JS + CSS, exact Cognitum design tokens): - All 10 panels (§4.1-4.10): dashboard, SEED fleet + detail, fleet map, entities (live WS subscribe_events, never polls), rooms, COGs, calibration wizard, events + automation builder, witness/audit, settings. - §6 UX invariants in code: first-class provenance, prominent stale/veto/ fragility, null(not-trained) vs withheld vs error, --mono everywhere, Hailo vs CPU COG distinction. - api.js calls the gateway routes in production; mock demoted to a dev-only ?demo=1 fixture (no mock in prod); typed error states. - Tests under plain node: import-graph, boot, render-smoke (22), interaction (3), prod-errors (13) — 5 files green; bundle ~137 KB (~37x smaller than HA), <2 ms/cold-render. BFF gateway (homecore-server/src/gateway.rs, compiled + tested on Rust 1.89): - /api/cal/* reverse-proxy to the calibration API (ADR-151). - GET /api/homecore/rooms with the RoomState adapter (breathing->breathing_bpm, heartbeat:null->heart_bpm:null, injected anomaly.threshold/room_id). - GET /api/homecore/cogs supervisor over /var/lib/cognitum/apps/. - GET /api/homecore/appliance from /proc + TCP service probes. - SEED-device/appliance routes return typed 503 upstream_unavailable. - cargo test -p homecore-server = 12/12; run live (curl-verified); fixed a real double-v1 proxy-URL bug found during live testing. Honest scope: W1/W2/W4/W6-appliance functional; W3/W5/W6-Hailo/federation return typed 503 (depend on services/hardware not in this repo). 
Co-Authored-By: claude-flow <ruv@ruv.net> * fix(homecore-ui): resolve code-review findings — SSRF guard, CORS/trace coverage, §6 honesty, crash guards Addresses the high-effort review of PR #1082: - SECURITY: cal_proxy rejects path-traversal/confused-deputy SSRF (`.`/`..` segments, backslash, %2e%2e/%2f, absolute) on raw+decoded forms → 400, before attaching the server-side calibration bearer. - CORRECTNESS: /api/homecore/* + /api/cal/* now covered by the shared CORS allowlist (build_cors_layer, exported from homecore-api) + TraceLayer — previously merged outside router()'s layers (no CORS, no tracing). - §6 HONESTY (no fabricated data): dashboard renders '—' for null metrics (not "null%"/"null°C"); cogs Hailo pill reflects the REAL appliance probe (not hardcoded "connected"); room anomaly threshold passed through / null, not a fabricated 0.5. - ROBUSTNESS: cogs asArray(hef) guards a non-array manifest field; calibration progress guards target<=0 (no NaN%/Infinity%); restart clears the poll timer. - CLEANUP: mock.js is now a cached DYNAMIC import (demo-only) — never bundled in production (§2.2). - New ui/tests/unit-fixes.mjs pins the above; ADR-131 + CHANGELOG updated. Co-Authored-By: claude-flow <ruv@ruv.net> --------- Co-authored-by: Nick Ruest <127058086+nicholas-ruest@users.noreply.github.com>	2026-06-15 11:11:19 -04:00
rUv	1df6d1e1ee	security(nvsim): guard degenerate input — config panic + NaN silent-corruption + ADR-177 (#1098 ) * fix(nvsim): guard degenerate input — config-induced panic + NaN-state poisoning Beyond-SOTA security review of the ADR-089 NV-diamond simulator (milestone #9, crate 2 of 4). Two real degenerate-input findings, each pinned fails-on-old: NVSIM-DT-01 (config panic/DoS, pipeline.rs): an external f_s_hz == 0 made dt == +Inf, dt_us saturated to u64::MAX, and `sample * dt_us` panicked with "attempt to multiply with overflow" at sample >= 2 (debug/WASM panic=abort; garbage t_us in release). Fix: sanitise dt (non-finite/non-positive -> 1 µs fallback), cap the u64 cast, and saturating_mul the timestamp. NVSIM-NAN-01 (NaN-state poisoning, digitiser.rs): a non-finite scene parameter (NaN dipole position / Inf moment / NaN loop radius) bypasses the near-field clamp (NaN < R_MIN_M is false) and yields a NaN field; at the ADC `NaN as i32` == 0 silently emitted b_pt=[0,0,0] with ADC_SATURATED CLEAR — indistinguishable from a legit zero-field reading. Fix at the funnel: adc_quantise treats any non-finite input as out-of-range -> clamps to code 0 AND raises the saturation flag, so the corruption is visible downstream. Determinism integrity, panic-free MagFrame deserialisation, and RNG seeding confirmed clean with evidence. The published cross-machine witness (cc8de9b0…93b4) is unchanged — guards only affect degenerate inputs. cargo test -p nvsim --no-default-features: 50 -> 53 passed, 0 failed. Workspace green; Python deterministic proof unchanged (f8e76f21…46f7a, nvsim off the signal proof path). Needs ADR slot 177. 
Co-Authored-By: claude-flow <ruv@ruv.net> * docs(adr): ADR-177 — nvsim degenerate-input hardening Records the 2 MEASURED MEDIUM fixes in `37764be55` (NVSIM-DT-01 config-induced overflow panic / WASM-abort DoS; NVSIM-NAN-01 non-finite scene param → silent fake zero-field reading with saturation flag clear) + 3 pins, and the clean-with-evidence determinism/deser/div-by-zero verdict. Cross-machine witness cc8de9b0…93b4 reproduces unchanged. Co-Authored-By: claude-flow <ruv@ruv.net>	2026-06-15 10:55:04 -04:00
rUv	4a083999e5	security(ruview-swarm): fail-closed on NaN/Inf at the swarm-comm trust boundary + ADR-176 (#1096 ) * fix(ruview-swarm): fail-closed on NaN/Inf at swarm-comm trust boundary (ADR-148) Beyond-SOTA security review of the ADR-148 drone swarm control plane found four IEEE-754 NaN/Inf fail-open / DoS bugs on data crossing the untrusted swarm-comm boundary (receive_peer_state / receive_peer_detection accept full DroneState/CsiDetection whose f64/f32 fields deserialize with no finite-check). - HIGH: failsafe::tick collision-avoidance + battery checks fail-open on NaN (NaN < threshold == false silently disabled collision avoidance / kept a NaN-battery drone Nominal). Now fails closed to EmergencyDiverge / RTH. - MED: geofence::check NaN-altitude bypass returned Safe through the point-in-polygon path. Now leading non-finite-coordinate guard -> HardBreach. - MED/DoS: antijamming FhssRadio panicked with "% 0" on an empty deserialized channels_mhz. Now len==0 early-returns (benign 0.0 sentinel). - LOW: multiview::fuse propagated a NaN victim_position into the fused "confirmed victim" location. Now requires finite confidence + position. Each fix pinned by a fails-on-old / passes-on-new test (MEASURED: old code returned Nominal/Safe or panicked). cargo test -p ruview-swarm --no-default-features: 117 -> 123 passed, 0 failed. Workspace green; Python deterministic proof unchanged (f8e76f21...46f7a, off the signal path). Documented-not-fixed (ADR slot 176): Raft AppendEntries lacks Log-Matching consistency check (topology/raft.rs); MavlinkSigner::verify uses non-constant -time tag compare + no replay-window rejection (already doc-flagged). 
Co-Authored-By: claude-flow <ruv@ruv.net> * docs(adr): ADR-176 — ruview-swarm NaN-fail-open safety review Records the 4 MEASURED fail-open safety bugs fixed in `f671000d7` (collision avoidance, battery RTH, geofence, anti-jamming %0 panic — all NaN/Inf defeating a safety comparison at the swarm-comm trust boundary) + 6 pins, 5 clean-with-evidence dimensions, and the 2 genuine issues deferred to a focused follow-up (Raft AppendEntries log-matching; MAVLink signer constant-time + replay window). Co-Authored-By: claude-flow <ruv@ruv.net>	2026-06-15 09:55:40 -04:00
rUv	0f64d23516	feat(bench): int8 quantization of WiFlow-STD half pose model — MEASURED trade-off (ADR-175, honest negative) (#1095 ) Sub-deliverable 8.2 of the benchmark/optimization milestone. Quantizes the 843,834-param "half" WiFlow-STD pose model (half_best.pth) to int8 two ways and MEASURES the accuracy/size trade-off vs fp32 under ONE locked normalization (ADR-173 torso-diameter PCK, upstream calculate_pck use_torso_norm=True), on the same seed-42 file-level 70/15/15 test split that produced the fp32 sweep numbers. MEASURED on ruvultra (RTX 5080, torch 2.11.0+cu128, fbgemm; clean test, torso-PCK): fp32 96.62% pck@20 99.47% pck@50 0.008981 mpjpe 3.351 MB int8 PTQ static 40.98% pck@20 94.98% pck@50 0.038262 mpjpe 1.046 MB (-55.64pp) int8 QAT (3 ep) 67.48% pck@20 98.69% pck@50 0.026548 mpjpe 1.043 MB (-29.15pp) Verdict (honest no): int8 is NOT a win at the strict PCK@20 edge target. Static PTQ collapses; QAT recovers a large share but still loses 29 pp @20 for a 3.2x size win — keep fp32/fp16 on the edge. Disclosed: QAT fake-quant val pck@20 was 83.45% but converted int8 scores 67.48% (~16pp convert_fx gap, reported honestly). Deliverables: - v2/crates/wifi-densepose-train/scripts/quantize_half_int8.py (reproducible: header carries the exact ssh command + run date; QAT primary, static PTQ fallback) - docs/adr/ADR-175-int8-quantization-half-pose-model-measured.md (MEASURED table, locked normalization, QAT-vs-PTQ labeling, verdict, reproduction, limitations) - CHANGELOG [Unreleased] ### Added entry No production Rust or signal-pipeline change. Python deterministic proof unchanged (f8e76f21a0f9852b70b6d9dd5318239f6b20cbcb4cdd995863263cecdc446f7a, bit-exact).	2026-06-15 09:16:22 -04:00
rUv	b209b8b778	ci(bench): compile-verify regression gate for v2 criterion benches + ADR-174 (#1094 ) * ci(bench): wire v2 criterion benches into CI as a compile-verify regression gate Sub-deliverable 8.3 of the benchmark/optimization milestone (needs ADR slot 174). The v2/ workspace ships 26 criterion benches across 18 crates, but benches are not part of `cargo test`, so nothing in CI compiled them and they silently rot when a public API they call changes. Add `.github/workflows/bench-regression.yml`: - bench-compile (HARD GATE): `cargo bench --workspace --no-default-features --no-run` compiles + links every default-feature bench (no measurement) plus the cir-gated cir_bench — a real, deterministic regression guard against bench bit-rot. - bench-fast-run (INFORMATIONAL, continue-on-error, never gates): runs a curated pure-CPU subset (nvsim, ruvector sketch/fusion) in criterion quick-mode and uploads logs as an artifact. No timing-regression gate, by design: wall-clock on shared GitHub runners varies 2-3x run-to-run, so a hard threshold or cross-runner `criterion --baseline` compare would manufacture false failures. The honest scope is compile-verify + informational-run; the workflow header documents the self-hosted-runner condition under which true timing-gating becomes honest. The crv-gated crv_bench is excluded because its crates.io dep ruvector-crv 0.1.1 fails to build upstream. Running the gate immediately caught one already-bit-rotted bench: wifi-densepose-mat/detection_bench failed to compile (E0063: missing field last_rssi in SensorPosition). Fixed (last_rssi: None) and re-verified. Validation (MEASURED): mat detection_bench + cir_bench + nvsim + ruvector + vitals + swarm benches compile under --no-default-features; fast subset runs; `cargo test -p wifi-densepose-mat --no-default-features` 174 passed / 0 failed; Python proof PASS, hash f8e76f21...46f7a unchanged. 
Co-Authored-By: claude-flow <ruv@ruv.net> * docs(adr): ADR-174 — CI bench-regression compile-verify gate Records sub-deliverable 8.3 (bench-regression.yml, committed `c4c59e085`): a hard compile-verify gate over all 26 v2 criterion benches (caught + fixed one real bit-rotted bench, mat/detection_bench E0063) + an informational fast-run. Documents the honest scope — no timing-regression gate, since shared-runner wall-clock varies 2-3x; states the self-hosted-runner condition under which timing gating becomes honest. Co-Authored-By: claude-flow <ruv@ruv.net>	2026-06-15 08:26:38 -04:00
rUv	90a88ada9a	feat(train): metric-locked PCK/MPJPE accuracy harness + ADR-173 (resolve PCK-definition ambiguity) (#1092 ) * feat(train): metric-locked PCK/MPJPE accuracy harness — resolve PCK-definition ambiguity The SOTA brief (docs/research/sota-nn-train-benchmark-brief.md §1/§3.1/§4) identifies metric ambiguity as the single biggest threat to any beyond-SOTA claim: three PCK@20 numbers (96.09% WiFlow-STD image-normalized, 81.63% AetherArena torso-PCK, 61.1% GraphPose-Fi standard PCK) cannot be lined up because each silently uses a different normalization. The project was retracted twice over this (a withdrawn 92.9% used absolute pixels, not torso). New src/accuracy.rs makes the normalizer explicit, selectable, and carried with every reported number: - PckNormalization enum: TorsoDiameter (standard MM-Fi/GraphPose-Fi hip↔hip), BoundingBoxDiagonal (looser WiFlow-STD image-normalized), AbsolutePixels(t) (retracted convention, reproducible + clearly non-comparable). - pck_at(pred, gt, vis, k, normalization) — one canonical PCK reusing the metrics_core geometric primitives (no duplicate kernel). - mpjpe(pred, gt, vis) — 2D/3D, mm. - PoseAccuracy { pck_at: BTreeMap<u8,f32>, mpjpe, normalization, n_keypoints, n_frames } via accuracy_report(frames, ks, normalization) — an unlabeled PCK number is structurally impossible. 17 hand-computed deterministic tests (no GPU, no datasets) prove the harness arithmetic, including the key proof that identical predictions score 0.50 / 1.00 / 0.75 under the three normalizations, plus graceful degenerate handling (zero torso, empty frames, NaN coords — no panic, never false-perfect). This is measurement infrastructure, NOT an accuracy claim. Public API worth an ADR — needs ADR slot 173 (parent to write). wifi-densepose-train lib 191→206, test_metrics 12→14, 0 failed; full workspace green (exit 0); Python deterministic proof unchanged (f8e76f21a0f9852b70b6d9dd5318239f6b20cbcb4cdd995863263cecdc446f7a). 
Co-Authored-By: claude-flow <ruv@ruv.net> * docs(adr): ADR-173 — metric-locked PCK/MPJPE accuracy harness Documents the accuracy harness (committed `3a8b2ed13`) that resolves the PCK-definition ambiguity flagged as the #1 beyond-SOTA risk in the SOTA brief (#1090): three historical numbers (96/81.6/61) used three unstated normalizations. The harness makes normalization explicit + selectable (PckNormalization enum) and every reported number carries its definition. Key proof: identical predictions → 0.50/1.00/0.75 under torso/bbox/abs. Co-Authored-By: claude-flow <ruv@ruv.net>	2026-06-15 00:41:02 -04:00
rUv	cfd0ad76cf	security(core,cli): pin CSI-deserialiser DoS-resistance + ADR-172 (clean-with-evidence) (#1091 ) * test(core,cli): pin DoS-resistance of CSI deserialisers (ADR-127 security review) Beyond-SOTA security review of wifi-densepose-core + wifi-densepose-cli. Load-bearing-question verdict: the NaN-state-poisoning bug class does NOT originate in core — core exposes no stateful accumulator (no Welford, von-Mises, IIR, voxel grid, running mean); each downstream crate rolls its own, so each fix is correctly local. Both crates confirmed clean on every reviewed dimension (panic-on-adversarial-input, NaN handling, unbounded memory, path traversal, secrets) — no production code changed. Adds 4 regression pins locking in two existing-but-untested DoS guards: - core: from_canonical_bytes shape guard (Vec::with_capacity bound) — proven to fail with `capacity overflow` when the saturating-mul guard is removed. - core: canonical decoder never panics on arbitrary/truncated bytes. - cli: parse_csi_packet rejects an oversized n_antennas*n_subcarriers claim before Array2 allocation (33 MB claim in a 2 KB datagram -> None). - cli: parse_csi_packet never panics on arbitrary UDP bytes. core: 35 -> 37 lib tests; cli: 24 -> 26 tests; 0 failed. Python proof unchanged (f8e76f21…46f7a — off the signal path). Co-Authored-By: claude-flow <ruv@ruv.net> * docs(adr): ADR-172 — wifi-densepose-cli + core CSI-deserialiser security review Records the clean-with-evidence verdict + 4 DoS-resistance regression pins (test-only, committed in `a1051607d`). Documents the load-bearing finding: the NaN-state-poisoning bug class does NOT originate in a shared core primitive (core exposes no stateful accumulator — MEASURED via grep), so the 3 prior downstream-local fixes are complete. Gives the wifi-densepose-cli review its own ADR slot (core portion cross-refs ADR-127 §9). Co-Authored-By: claude-flow <ruv@ruv.net>	2026-06-14 23:58:09 -04:00
rUv	5287497a4a	security(homecore-migrate): redact secret value from malformed secrets.yaml error (#1089 ) * fix(homecore-migrate): redact secret value from malformed secrets.yaml error (secret-leak) `read_secrets` wrapped serde_yaml's parse error into `MigrateError::YamlParse { source }`. serde_yaml's message for a typed-tag coercion failure embeds the offending scalar verbatim, e.g. `invalid value: string "<the-secret-value>"`. That error propagates out of `read_secrets`, is `?`-returned by the `InspectSecrets` CLI path in main.rs, and printed to stderr by anyhow — leaking a secret value despite the CLI's deliberate `<redacted>` design. Fix: secrets.yaml parse failures now map to a new redacting variant `MigrateError::SecretsParse { path, line, column }` that carries only the file path and a coarse location (from `serde_yaml::Error::location()`), never the scalar content. Other (non-secret) YAML files keep `YamlParse`. Pinned by `secrets::tests::malformed_secrets_error_never_contains_secret_value` (asserts the rendered error AND its full #[source] chain never contain the secret value; fails on the old `YamlParse` path) plus `malformed_secrets_error_reports_location` (still fail-closed + locatable). ADR-165 secret-handling rule: a secret value must never appear in output. Co-Authored-By: claude-flow <ruv@ruv.net> * docs(homecore-migrate): record secret-leak fix in ADR-165 + CHANGELOG Note the secrets.yaml error-redaction fix and the review's clean dimensions (read-only source / no traversal / no panic / fail-closed versioning / no injection) in ADR-165 §2.4, bump the test-evidence count 19→21 in §2.6, and add an [Unreleased] Security entry to CHANGELOG. Co-Authored-By: claude-flow <ruv@ruv.net>	2026-06-14 23:09:55 -04:00
rUv	bf1dfe79fd	fix(homecore core): TOCTOU race dropped/reordered state_changed events under concurrent writers (~93k→0) + 2 fail-closed hardenings (#1087 ) * fix(homecore): atomic state set — close TOCTOU lost/reordered state_changed events StateMachine::set did get() (release shard lock) → compute next + no-op decision → insert() (re-acquire lock) → send(). The read-modify-write was not atomic w.r.t. a concurrent writer on the same entity: a writer that read a stale `old` could mis-classify a real transition as a no-op and drop its state_changed event (a missed automation trigger) or fire an event whose new_state duplicated the previously delivered one (a spurious trigger for any automation keyed on old_state != new_state). ADR-127 §2.1 promises "writer atomically replaces the map entry"; the implementation did not. Fix: hold the DashMap shard write-lock across the whole read→decide→insert→ fire sequence via entry()/insert_entry(). tx.send is non-blocking, non-async, and never re-enters the map, so firing under the shard lock cannot deadlock and keeps global event order in lock-step with global commit order. Pinned by concurrent_set_fires_no_duplicate_adjacent_events: 4 writers toggling one entity A/B; asserts no two consecutive fired events carry the same new_state (impossible under correct serialisation). Fails reliably on the old code (~365-476 duplicate-adjacent events on the first trial), passes on the fix across repeated runs. Co-Authored-By: claude-flow <ruv@ruv.net> * harden(homecore): bound entity_id length — close memory-DoS at the REST boundary homecore-api/src/rest.rs parses untrusted path segments straight through EntityId::parse (get/delete/set_state). With no length cap, an otherwise-valid id like "a." + many MB of [a-z0-9_] was accepted; a POST /api/states/<giant> would persist it into the DashMap state store, permanently growing memory (amplification across distinct ids). 
Fix: reject ids longer than MAX_ENTITY_ID_LEN (255, HA-compatible) up front in parse(), before any per-char scan, with a new EntityIdError::TooLong. Fails closed at the boundary type so every caller (REST, registry deserialize, automation) is protected. Pinned by entity_id_length_boundary: exactly-MAX accepted, MAX+1 rejected, 4 MiB id rejected as TooLong. Fails on old code (oversized parses Ok). Co-Authored-By: claude-flow <ruv@ruv.net> * harden(homecore): isolate panicking service handlers (catch_unwind) ServiceRegistry::call already ran handlers outside the registry lock (the Arc<dyn ServiceHandler> is cloned out of the read guard first), so a panic could never poison the RwLock or block other callers — good. But a panicking handler unwound through call() into the caller's task; the task driving the engine (e.g. an axum request handler invoking a service) could be aborted by one buggy integration. Fix: wrap the handler future in AssertUnwindSafe + FutureExt::catch_unwind and convert a panic into ServiceError::HandlerPanicked. Mirrors HA isolating service-handler exceptions. The registry stays fully usable afterwards. Pinned by panicking_handler_is_isolated_and_registry_survives: the panicking call returns HandlerPanicked (not an unwind), a sibling healthy service still returns its value, and the bad service remains registered. Fails on old code (the await point panics instead of returning Err). Co-Authored-By: claude-flow <ruv@ruv.net> * test(homecore): pin event-bus lag safety (bounded broadcast, no DoS) Documents-with-evidence that the core EventBus does NOT have the homecore-api WS broadcast-lag failure: with EVENT_CHANNEL_CAPACITY=4096, firing 3x capacity while a subscriber never drains keeps fire_* non-blocking (publisher never waits on slow receivers), gives the slow receiver a recoverable Lagged(n) (drop-oldest + re-sync) rather than a closed channel, and leaves the bus live for a fresh fast subscriber. No code change — pins the clean dimension. 
Co-Authored-By: claude-flow <ruv@ruv.net> * docs(homecore): record ADR-127 §9 security+concurrency review + CHANGELOG Documents the three pinned fixes (HC-RACE-01 state-set TOCTOU, HC-EID-LEN-01 entity_id memory-DoS, HC-SVC-PANIC-01 service-handler isolation) and the clean dimensions (bounded event-bus lag handling, lock discipline / no lock-across-await, no panic-on-input) with their evidence. Co-Authored-By: claude-flow <ruv@ruv.net>	2026-06-14 22:28:05 -04:00
rUv	9b126e927e	harden(assist security): bound untrusted utterance (DoS); cmd-injection/ReDoS/NaN/fail-open all proven clean with evidence (#1086 ) * fix(homecore-assist): bound untrusted utterance length, fail closed (ADR-133 security) The intent recognizers accept utterances from untrusted callers (voice transcripts, the WebSocket `assist` command). Neither the regex nor the semantic path bounded utterance length, so a pathological multi-megabyte utterance forced an unbounded `to_lowercase()` clone plus a per-registered- pattern scan (and, in the semantic path, full tokenisation + feature-hash embedding) — an allocation/CPU amplification on attacker-controlled input. The `regex` crate is linear-time (no catastrophic backtracking), so this was a throughput/memory DoS rather than a hang, but it was still unbounded. Fix: introduce MAX_UTTERANCE_BYTES (4 KiB — far above any real spoken command) and check it at both recognizer boundaries BEFORE any allocation or scan. An over-length utterance fails closed: Ok(None) (no intent, no action), identical to an unrecognised phrase. No legitimate command is affected. Pinned by fails-on-old tests: - recognizer::over_length_utterance_fails_closed — an over-length utterance that contains a valid command resolves to None (would have matched before) - semantic_recognizer::over_length_utterance_fails_closed_semantic Co-Authored-By: claude-flow <ruv@ruv.net> * test(homecore-assist): pin clean security dimensions with evidence (ADR-133) Adds regression tests documenting the dimensions reviewed and found clean, so the properties cannot silently regress: - runner: no subprocess surface exists. RufloRunnerOpts.{script_path,env} are inert and never executed; even a hostile script_path/env spawns nothing. And the entity_id capture class [a-z0-9_ .] strips every shell metacharacter, so a resolved slot can never carry ; \| & $ ` / etc into a (future) argv — sanitisation by construction. 
(shell_metachars_never_survive_into_a_resolved_slot, runner_opts_are_inert_no_process_spawned) - recognizer: the regex crate is a linear-time finite automaton; a classic catastrophic-backtracking shape (a+)+$ on adversarial input completes in bounded time — no ReDoS. (pathological_backtracking_pattern_completes_in_bounded_time) - embedding: embeddings are structurally finite (FNV feature-hash + guarded L2 normalise, no external float input, no unguarded division), so a crafted utterance cannot inject NaN/Inf to poison cosine k-NN; cosine against the zero vector is a finite 0.0, never NaN. (embeddings_are_structurally_finite, cosine_with_zero_vector_is_finite_not_nan, empty_utterance_against_empty_index_no_panic_no_match) - pipeline: injection-shaped utterances never deliver a metacharacter into a service call; the worst case resolves to a clean entity token, and an unrecognised utterance fails closed to not_understood (no action). (pipeline_injection_shaped_utterance_carries_no_metachars_to_service) Co-Authored-By: claude-flow <ruv@ruv.net> * docs(homecore-assist): record ADR-133 security review (HC-ASSIST-01 + clean dims) CHANGELOG [Unreleased] Security entry + ADR-133 section 6 review notes for the homecore-assist voice/intent pipeline review. Co-Authored-By: claude-flow <ruv@ruv.net>	2026-06-14 21:34:38 -04:00
rUv	41bee64593	fix(recorder): bound history query (memory-DoS) + add missing transactional purge (disk-DoS); SQL-injection & NaN dims clean (#1084 ) * fix(homecore-recorder): bound history query + add transactional purge (memory-DoS + disk-DoS) Security review of the HA-compat state recorder (ADR-132) found two real bounding bugs; SQL-injection and NaN-index dimensions confirmed clean. (1) Memory-DoS: get_state_history carried no LIMIT — a wide [since,until] window over a high-frequency entity loaded an unbounded row set into a single in-memory Vec. Added LIMIT MAX_HISTORY_ROWS (1,000,000); the sibling search paths were already k-bounded. (2) Disk-DoS / documented-but-missing purge: README advertised Recorder::purge(older_than) but no retention path existed -> unbounded disk growth. Added a transactional purge with an EXCLUSIVE cutoff (idempotent, no off-by-one) that deletes old states+events and garbage-collects orphaned state_attributes blobs (dedup-shared blobs are kept until their last referencing state is gone). All three deletes run in one transaction so a mid-purge failure rolls back cleanly. Pinning tests (homecore-recorder 19->25 no-default / 25->31 ruvector, 0 failed): - malicious_entity_id_is_stored_literally_not_executed (SQL injection) - like_metacharacters_in_query_are_literal_not_wildcards (LIKE escape) - history_query_carries_a_limit_clause (memory-DoS bound) - purge_keeps_boundary_row_and_drops_older (exclusive-cutoff, true pin) - purge_gcs_orphaned_attributes_but_keeps_shared (dedup-safe GC) - purge_also_removes_old_events No behaviour change beyond the two fixes. Python deterministic proof unchanged (recorder is off the signal proof path). Co-Authored-By: claude-flow <ruv@ruv.net> * docs(homecore-recorder): record ADR-132 security review findings Add a "3a. 
Security review" section to ADR-132 and a CHANGELOG [Unreleased] Security entry covering the homecore-recorder review: SQL-injection and NaN-index dimensions confirmed clean with evidence (every query bound; LIKE pattern bound+escaped; SHA-256->i32->f32 embeddings always finite, empty index/k=0 probed no-panic), plus the two fixes (unbounded history LIMIT, transactional exclusive-cutoff purge with orphan-attribute GC). Co-Authored-By: claude-flow <ruv@ruv.net>	2026-06-14 21:00:52 -04:00
rUv	5bc3b634b7	fix(automation security): template-bomb DoS (100MB/11s render → fuel-bounded, HIGH) + delay panic-on-config (MEDIUM) (#1083 ) * fix(homecore-automation): bound template render to stop unbounded-expansion DoS (HC-SEC-01) A `template:` condition / value_template comes straight from user automation config and was rendered with MiniJinja's default (no instruction budget, no output cap). A single condition such as `{% for i in range(5000) %}{% for j in range(5000) %}xxxx{% endfor %}{% endfor %}` rendered a 100 MB string over ~11 s on one render call (proven empirically) — a CPU/memory denial of service, the bfld-class "unbounded expansion". Fix: - Enable MiniJinja's `fuel` feature and set a per-render instruction budget (`set_fuel(Some(1_000_000))`). A nested loop burns one unit per iteration, so the budget caps total work regardless of nesting; the attack now fails fast (~90 ms) with "engine ran out of fuel". - Reject template sources over 64 KiB before compilation (defense in depth so a pathological literal can neither compile nor emit verbatim). Legitimate HA templates (a few dozen instructions) are unaffected. Tests (fail on old — unbounded render / no rejection): - nested_loop_template_is_bounded_not_unbounded_dos - single_huge_repeat_template_is_bounded - oversized_template_source_is_rejected - legitimate_template_still_renders_within_fuel (no regression) Co-Authored-By: claude-flow <ruv@ruv.net> * fix(homecore-automation): stop crafted delay/timeout from panicking the run task (HC-SEC-02) `Action::Delay { seconds }` and `Action::WaitForTrigger { timeout_seconds }` fed the user-supplied float straight into `Duration::from_secs_f64`, which PANICS on negative, NaN, infinite, or overflowing inputs. 
All of those are reachable from a crafted (or simply typo'd) automation YAML — `delay: {seconds: -1}`, `.nan`, `.inf`, `1e308` — so one hostile config aborts the spawned automation task with a panic ("cannot convert float seconds to Duration: value is negative", proven empirically). Fix: a `safe_duration_from_secs` guard that saturates instead of panicking, matching Home Assistant's lenient "non-positive delay = no delay": - NaN / ±inf / negative -> Duration::ZERO - absurdly large (would overflow) -> clamped to ~100 years (MAX_DELAY_SECS) Tests (fail on old — panic = failure): - delay_negative_seconds_does_not_panic - delay_nan_seconds_does_not_panic - delay_infinite_seconds_does_not_panic - wait_for_trigger_negative_timeout_does_not_panic - safe_duration_saturates_hostile_values (incl. overflow clamp) Co-Authored-By: claude-flow <ruv@ruv.net> * docs(homecore-automation): record HC-SEC-01/02 security review (CHANGELOG + ADR-129 §8a) Document the two DoS findings (template unbounded-expansion HC-SEC-01, delay panic-on-config HC-SEC-02) and the dimensions probed clean (condition fail-closed, bounded run-modes, sandboxed read-only templates). Co-Authored-By: claude-flow <ruv@ruv.net>	2026-06-14 20:22:07 -04:00
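Note on HC-SEC-02 above: the saturating guard it describes can be sketched roughly as follows. This is an illustration under stated assumptions, not the code from the commit: the function name and `MAX_DELAY_SECS` come from the message, but the exact constant value and the treatment of `0.0` are guesses.

```rust
use std::time::Duration;

/// Upper bound for any delay, roughly 100 years.
/// Assumption: the commit only says "~100 years"; this exact value is illustrative.
const MAX_DELAY_SECS: f64 = 100.0 * 365.25 * 24.0 * 3600.0;

/// Convert user-supplied seconds into a `Duration` without ever panicking.
fn safe_duration_from_secs(seconds: f64) -> Duration {
    // NaN, +/-inf and non-positive values all mean "no delay".
    if !seconds.is_finite() || seconds <= 0.0 {
        return Duration::ZERO;
    }
    // Clamp first, so `from_secs_f64` never sees a value that would overflow.
    Duration::from_secs_f64(seconds.min(MAX_DELAY_SECS))
}
```

Checking `is_finite()` before the conversion matters because `Duration::from_secs_f64` panics on negative, NaN, infinite, or overflowing input, which is the failure the commit pins.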
rUv	e1f4897269	fix(geo numerical): parse_hgt underflow/inf-grid (HIGH) + haversine asin-NaN; pointcloud confirmed-robust (NaN-poisoning class, 3rd find) (#1081 ) * fix(geo numerical robustness): parse_hgt underflow panic + haversine asin-domain NaN Targeted numerical-robustness audit of wifi-densepose-geo (ADR-154-class sweep). Two real bugs, each pinned by a fails-on-old test: 1. terrain.rs parse_hgt — usize underflow panic on degenerate input. `side = sqrt(n_samples)`; for empty / sub-2x2 buffers side <= 1, so `1.0 / (side - 1)` underflows `usize` (panic "attempt to subtract with overflow" in debug; wraps to a huge value in release → garbage/inf cell_size_deg that poisons every ElevationGrid::get). A truncated HTTP body or a 404 HTML page reaches parse_hgt. Now bails with a clear error when side < 2. 2. coord.rs haversine — asin domain overflow → NaN for (near-)antipodal points. Floating rounding can push `h.sqrt()` to 1.0 + ~4e-16, and `asin(>1)` is NaN (verified: pair (-44.4994,-178.95722)→(44.49939999, 1.04278001) yields h=1.0000000000000004). A NaN distance silently breaks all downstream `<`/`>` comparisons. Clamp into [0,1] before asin. Also pins the ±90° pole-singularity (cos(lat)=0 division) as no-panic; the ENU transform itself is unchanged (no behavior change for valid inputs). Tests: wifi-densepose-geo 9→15 lib (6 new), 8 integration unchanged. 0 failed. Co-Authored-By: claude-flow <ruv@ruv.net> * test(pointcloud robustness): pin NaN-state-poisoning resistance + degenerate voxel fusion Numerical-robustness audit of wifi-densepose-pointcloud. No bug found — the crate is confirmed-robust against the proven NaN-state-poisoning class that bit calibration/vitals. This adds regression pins documenting why: 1. csi_pipeline.rs — persistent auto-accumulating state (occupancy EMA, vitals) is provably self-healing. 
The UDP parser only emits finite amplitudes/phases (sqrt/atan2 of i8), and even an adversarial hand-built CsiFrame with NaN/inf amplitudes+phases cannot latch non-finite state: motion_score = (NaN/100).min(1.0) → 1.0; breathing path → 0 → clamp(5,40) → 5.0; tomography EMA uses only integer rssi. The new test injects 40 poisoned frames and asserts occupancy/vitals stay finite AND the pipeline recovers to an in-range estimate afterward — so a future refactor that drops a `.min`/`.clamp` self-heal would fail this pin. 2. fusion.rs — fuse_clouds voxel averaging is div-by-zero-safe (per-voxel count >= 1 by construction). Pins empty / single-point / all-coincident inputs as no-panic with finite output. No behavior change. Tests: wifi-densepose-pointcloud 18→22 (4 new), 0 failed. Co-Authored-By: claude-flow <ruv@ruv.net> * docs(geo/pointcloud robustness): CHANGELOG + ADR-154 sibling-crate sweep note Record the wifi-densepose-geo + wifi-densepose-pointcloud numerical-robustness audit under CHANGELOG [Unreleased] → Fixed, and a sibling-crate-extension note on the ADR-154 horizon ledger (these crates are outside ADR-154's signal scope but the sweep is the same ADR-154 class). Co-Authored-By: claude-flow <ruv@ruv.net>	2026-06-14 19:37:08 -04:00
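Note on the haversine fix above: clamping before `asin` can be sketched as below. This is a minimal stand-alone version, assuming a spherical Earth with a hypothetical `EARTH_RADIUS_M` constant; the signature in `coord.rs` is not shown in the commit and may differ.

```rust
/// Mean Earth radius in metres (assumed constant for this sketch).
const EARTH_RADIUS_M: f64 = 6_371_000.0;

/// Great-circle distance in metres between two lat/lon points given in degrees.
fn haversine_m(lat1: f64, lon1: f64, lat2: f64, lon2: f64) -> f64 {
    let (p1, p2) = (lat1.to_radians(), lat2.to_radians());
    let dphi = (lat2 - lat1).to_radians();
    let dlam = (lon2 - lon1).to_radians();
    let h = (dphi / 2.0).sin().powi(2) + p1.cos() * p2.cos() * (dlam / 2.0).sin().powi(2);
    // Rounding can push sqrt(h) a hair above 1.0 for near-antipodal points,
    // and asin of anything above 1.0 is NaN, so clamp into [0, 1] first.
    let s = h.sqrt().clamp(0.0, 1.0);
    2.0 * EARTH_RADIUS_M * s.asin()
}
```

With the clamp in place the result is bounded by half the circumference of the sphere, so downstream `<` / `>` comparisons always see a finite number for finite input.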
rUv	9f80b66ae3	harden(cog-ha-matter crypto): domain-separate witness signing + verify_strict (signing chain otherwise sound — P2 crypto core verified) (#1080 ) * fix(cog-ha-matter): domain-separate witness signing chain + verify_strict (ADR-116 §2.2) Crypto review of the SHA-256 + Ed25519 witness chain that ADR-262 P2 reuses. The sibling wifi-densepose-engine bug class (unframed concatenation of operator-influenceable strings into a signed digest) is ABSENT here — canonical_bytes already length-prefixes kind/payload. Two real hardening gaps fixed: - CHM-WIT-01: add a versioned domain-separation tag (WITNESS_DOMAIN_TAG = b"cog-ha-matter/witness-event/v1\0") to canonical_bytes so the witness SHA-256 preimage / Ed25519 message cannot be replayed as a message for another signing context that shares key infrastructure (notably the manifest binary_signature). Completes the engine review's "domain-tag + length-prefix" rule. Witness bytes change by design (prior on-disk hashes/sigs invalidated); no in-repo crate consumes these bytes programmatically. - CHM-WIT-02: verify_signature uses VerifyingKey::verify_strict (rejects non-canonical encodings + small-order keys) for the audit-uniqueness property. Key stays caller-pinned (not read from the event). Pinned by fails-on-old tests: canonical_bytes_is_domain_separated, canonical_bytes_starts_with_domain_tag_then_prev_hash, witness_preimage_cannot_collide_with_a_bare_manifest_digest, signature_commits_to_domain_tag_not_bare_fields; key-pinning guarded by verify_uses_strict_path_and_pins_caller_key. cog-ha-matter 64 -> 68 tests, 0 failed. 
Co-Authored-By: claude-flow <ruv@ruv.net> * docs(cog-ha-matter): record ADR-116 crypto review findings (CHM-WIT-01/02) CHANGELOG [Unreleased] Security entry + ADR-116 §4.1 review notes: engine-class signed-digest collision confirmed ABSENT (length-prefixing already correct), domain-separation tag added, verify_strict hardening, and the clean dimensions (verify-before-trust, key-handling, determinism, fail-closed parsing) with byte-layout evidence. Co-Authored-By: claude-flow <ruv@ruv.net>	2026-06-14 19:04:09 -04:00
rUv	02cb84e0bb	fix(vitals safety): non-finite CSI frame permanently froze breathing+HR via IIR-state poisoning (self-heal) + noise-never-Valid pin (#1079 ) * fix(vitals): self-heal IIR filters after non-finite CSI frame (ADR-021/ADR-158 §A1) The 2nd-order resonator bandpass_filter in BreathingExtractor and HeartRateExtractor latches each output y[n] into the filter state (y1/y2). A single non-finite amplitude residual from a corrupt CSI frame produced a NaN output that was written into the state. The existing extract() is_finite() guard dropped that one sample from the history buffer but never sanitized the poisoned filter state, so every subsequent output stayed NaN, was rejected too, and the sliding-window history never refilled: breathing AND heart-rate extraction went silently dead (returning None forever) until reset(). On the vitals alert path this is a safety-relevant denial of service — one bad frame stops monitoring with no error surfaced. Same class as the calibration NaN bug (ADR-154 §3) and the firmware vitals fixes (#998/#996/#987): prior hardening guarded the history boundary but not the filter-state boundary. Fix: when bandpass_filter computes a non-finite output it resets the IIR state to default and returns 0.0, so the resonator recovers on the next clean frame (the 0.0 is still dropped by the caller's finite-check, so no spurious sample enters history). Also de-magic the safety-critical HR physiological plausibility band into named HR_PLAUSIBLE_MIN_BPM/HR_PLAUSIBLE_MAX_BPM consts (value-identical 40/180 BPM). 
Pinned by: - breathing::tests::nan_frame_does_not_permanently_poison_filter (FAILS pre-fix) - breathing::tests::inf_mid_stream_does_not_freeze_history (FAILS pre-fix) - heartrate::tests::nan_frame_does_not_permanently_poison_filter (FAILS pre-fix) - heartrate::tests::pure_noise_is_never_reported_valid (fabricated-vital negative) - heartrate::tests::plausibility_band_constants_pinned (de-magic value pin) wifi-densepose-vitals --no-default-features: 55->60 lib tests, 0 failed. Workspace green (3370 passed, 0 failed). Python proof unchanged (vitals off the deterministic proof's signal path). Co-Authored-By: claude-flow <ruv@ruv.net> * docs(vitals): record IIR NaN/inf self-heal fix (ADR-021, CHANGELOG) Document the wifi-densepose-vitals filter-state poisoning fix in ADR-021 Implementation Notes (parallel to the firmware #998/#996/#987 robustness class) and add a CHANGELOG [Unreleased] Fixed entry. Notes the confirmed clean dimensions with evidence (flat -> None; noise -> low-confidence Unreliable, never Valid; harmonic-rich breathing -> not a confident false HR; out-of-band BPM clamped). Co-Authored-By: claude-flow <ruv@ruv.net>	2026-06-14 18:01:47 -04:00
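Note on the IIR self-heal above: the recovery rule (reset the state and return 0.0 when an output is non-finite) can be sketched on a generic two-pole resonator. The `Resonator` and `IirState` types and the filter coefficients here are hypothetical and chosen only for illustration; the commit does not show the real `bandpass_filter`.

```rust
/// Delay-line state of a 2nd-order IIR filter (hypothetical layout).
#[derive(Debug, Default, Clone, Copy)]
struct IirState {
    x1: f64,
    x2: f64,
    y1: f64,
    y2: f64,
}

/// A simple two-pole resonator with poles at radius `r` (assumed form).
struct Resonator {
    r: f64,
    cos_w0: f64,
    state: IirState,
}

impl Resonator {
    fn new(center_hz: f64, sample_hz: f64, r: f64) -> Self {
        let w0 = 2.0 * std::f64::consts::PI * center_hz / sample_hz;
        Resonator { r, cos_w0: w0.cos(), state: IirState::default() }
    }

    /// Filter one sample. A non-finite output never reaches the state:
    /// the state is reset and 0.0 is returned, so the next clean sample recovers.
    fn step(&mut self, x: f64) -> f64 {
        let st = self.state;
        let y = (1.0 - self.r) * (x - st.x2)
            + 2.0 * self.r * self.cos_w0 * st.y1
            - self.r * self.r * st.y2;
        if !y.is_finite() {
            self.state = IirState::default();
            return 0.0;
        }
        self.state = IirState { x1: x, x2: st.x1, y1: y, y2: st.y1 };
        y
    }
}
```

Without the reset, a single NaN would be written into `y1`/`y2` and every later output would stay NaN, which is the permanent freeze the commit describes.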
rUv	ebfaee4437	fix(calibration): NaN-poisoning silently disabled presence specialist (Features::from_series unguarded) + de-magic (#1077 ) * fix(calibration): drop non-finite samples in Features::from_series (ADR-151) A single NaN/inf scalar sample (corrupt CSI frame) poisoned mean/variance into NaN, which — baked into a persisted PresenceSpecialist::threshold — silently disabled presence detection (every `f.variance > NaN` is false), no error raised. extract.rs is the live-inference + training feature path, yet (unlike geometry_embedding.rs) had no non-finite guard. Fix at the production boundary: filter non-finite samples before computing any statistic; an all-non-finite series degrades to Features::ZERO, same as the empty series. Value-identical for all-finite input (full_loop + existing extract tests unchanged). Pinned by two fails-on-old tests. Co-Authored-By: claude-flow <ruv@ruv.net> * refactor(calibration): de-magic specialist thresholds to named consts (ADR-151) Promote the bare default min-score literals (breathing 0.25, heartbeat 0.3) and the anomaly score scale / label cutoff (2.0× spread, > 0.5) to documented named consts. Value-identical — pinned by characterization tests asserting the consts equal the prior literals and the gate boundary (score >= floor). Co-Authored-By: claude-flow <ruv@ruv.net> * docs(calibration): record ADR-151 review — NaN fix + clean dimensions CHANGELOG [Unreleased] Security entry and ADR-151 §6.1 review note for the beyond-SOTA correctness+security review: NaN-poisoning fail-closed fix, file/path (no I/O in crate), untrusted-load, receipt/hash (absent), and the clean numerical paths — all with evidence. Co-Authored-By: claude-flow <ruv@ruv.net>	2026-06-14 17:22:20 -04:00
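Note on the `Features::from_series` fix above: dropping non-finite samples before computing any statistic can be sketched as below. The two-field `Features` struct is a simplified stand-in, since the real struct's fields are not listed in the commit; only `Features::ZERO` and the filtering rule are taken from the message.

```rust
/// Simplified stand-in for the calibration feature vector (fields are assumptions).
#[derive(Debug, Clone, Copy, PartialEq)]
struct Features {
    mean: f64,
    variance: f64,
}

impl Features {
    const ZERO: Features = Features { mean: 0.0, variance: 0.0 };

    /// Compute statistics over the finite samples only.
    fn from_series(samples: &[f64]) -> Features {
        // Drop NaN / +/-inf up front so they cannot poison mean or variance.
        let finite: Vec<f64> = samples.iter().copied().filter(|x| x.is_finite()).collect();
        if finite.is_empty() {
            // An empty or all-non-finite series degrades to ZERO.
            return Features::ZERO;
        }
        let n = finite.len() as f64;
        let mean = finite.iter().sum::<f64>() / n;
        let variance = finite.iter().map(|x| (x - mean).powi(2)).sum::<f64>() / n;
        Features { mean, variance }
    }
}
```

The guard matters because any comparison against NaN is false, so a NaN baked into a persisted threshold would silently turn `variance > threshold` into "never present".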
rUv	db3d94a313	fix(homecore-api security): auth-gate GET /api/ (was unauthenticated) + recover WS subscription on broadcast lag (#1076 ) * fix(homecore-api security): auth-gate GET /api/ (HC-API-AUTH-01, ADR-161) `rest::api_root` took no headers and unconditionally returned `200 {"message":"API running."}`, while every sibling REST route gates on `BearerAuth::from_headers`. HA's `APIStatusView` inherits `requires_auth = True`, so `/api/` must return 401 for a missing/wrong bearer — HA clients use it as a token-validation probe, so a 200 told a bad-token client its token was valid and let an unauthenticated party confirm a live endpoint. LOW severity (static body, no data leak), reported at true severity. Fix: `api_root(headers, State)` validates the bearer like `get_config`. Pinned by fails-on-old tests (200 -> assert 401): - api_root_rejects_missing_bearer - api_root_rejects_wrong_bearer guarded by api_root_accepts_correct_bearer (still 200 with valid token). Co-Authored-By: claude-flow <ruv@ruv.net> * fix(homecore-api security): recover WS subscription on broadcast lag (HC-WS-LAG-01, ADR-161) `subscribe_events`'s per-subscription task matched `Err(_) => break` on both broadcast `recv()` arms. `RecvError::Lagged(n)` (a slow consumer falling >EVENT_CHANNEL_CAPACITY=4,096 events behind) is recoverable — the bus doc says "Lagged receivers must re-sync" and HA keeps the subscription alive across a lag. The old code treated the first lag as fatal, so after an event burst the client's stream went permanently silent with no error frame — a self-inflicted event-delivery DoS under load. LOW severity. Fix: `Lagged(_) => continue` (skip dropped window, re-sync), `Closed => break`, on both the system and domain arms. Pinned by subscription_survives_broadcast_lag: subscribes, floods 6,000 filtered events past the 4,096 capacity to force a Lagged, then asserts a subsequent subscribed event is still delivered (old code: 5s timeout). 
Co-Authored-By: claude-flow <ruv@ruv.net> * docs(homecore-api security): record HC-API-AUTH-01 + HC-WS-LAG-01 review (ADR-161) CHANGELOG [Unreleased] Security entry + ADR-161 addendum documenting the beyond-SOTA network-API review: two LOW bugs fixed (unauthenticated GET /api/; WS subscription killed on broadcast lag) and the auth/traversal/injection/info-leak/CORS dimensions confirmed clean with evidence (no traversal surface — in-memory DashMap + EntityId allowlist; HashSet token compare, not a byte-== timing oracle). Co-Authored-By: claude-flow <ruv@ruv.net>	2026-06-14 16:48:57 -04:00
rUv	a369fbe66e	fix(bfld security): close HIGH privacy-bypass in process_to_frame (identity surface leaked despite restrictive class) + JSON-injection (#1075 ) * fix(bfld): route process_to_frame payload through PrivacyGate (ADR-141 privacy bypass) BfldPipeline::process_to_frame stamped the frame header with the active privacy class but serialized the caller-supplied BfldPayload UNCHANGED via BfldFrame::from_payload. This let a frame labeled Anonymous(2) or Restricted(3) carry the full identity-leaky compressed_angle_matrix (+ amplitude/phase proxies, csi_delta) that PrivacyGate::demote is documented and tested (privacy_gate_demote.rs) to strip at exactly those classes. A NetworkSink accepts class >= Derived(1), so such a frame would publish the beamforming angle matrix — the identity surface — across the node boundary despite its restrictive class byte. The class byte lied about payload content. Fix: after building the frame at the active class, apply PrivacyGate::demote to the same class. demote() strips sections by target-class threshold (independent of any class transition), so a same-class demote performs no class change but brings the payload into policy compliance. Research classes (Raw/Derived) keep the full payload — demote is a no-op there. Pinned by three fails-on-old tests in pipeline_to_frame.rs: - process_to_frame_at_anonymous_strips_identity_leaky_sections (FAILED pre-fix) - process_to_frame_in_privacy_mode_strips_amplitude_and_phase (FAILED pre-fix) - process_to_frame_at_derived_preserves_full_payload (guards against over-strip) The pre-existing round-trip test is updated to assert the gated payload. Co-Authored-By: claude-flow <ruv@ruv.net> * fix(bfld): JSON-escape zone_id in MQTT state-topic payload render_events emitted the zone_activity payload as format!("\"{zone}\"") with no escaping, while ha_discovery.rs already escapes operator-controlled strings via push_str_field. 
A zone name containing a double-quote or backslash therefore produced malformed / injectable JSON on the state topic that Home Assistant parses (e.g. zone `a"b` -> payload `"a"b"`). Fix: add json_string_literal() mirroring ha_discovery's escaping (", \, \n, \r, \t, control chars) and use it for the zone payload. Value-identical for normal zone names (living_room etc.). Pinned by zone_payload_escapes_json_metacharacters (FAILED pre-fix); the existing zone_payload_is_json_string_with_quotes still passes unchanged. Co-Authored-By: claude-flow <ruv@ruv.net> * docs(adr-141): record bfld privacy+security review findings + CHANGELOG Document the two fixed bugs (process_to_frame privacy-bypass; zone_id JSON injection) and the dimensions confirmed clean (event-field gating, witness/hash framing, fail-closed) in ADR-141, plus CHANGELOG [Unreleased] Security/Fixed entries. Co-Authored-By: claude-flow <ruv@ruv.net>	2026-06-14 16:15:42 -04:00
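Note on the zone_id escaping fix above: a JSON string-literal escaper covering the characters the commit lists can be sketched as follows. The function name is taken from the message, but this body is an assumption written for illustration, not the code in the repository.

```rust
/// Render `s` as a JSON string literal, including the surrounding quotes.
fn json_string_literal(s: &str) -> String {
    let mut out = String::with_capacity(s.len() + 2);
    out.push('"');
    for c in s.chars() {
        match c {
            '"' => out.push_str("\\\""),
            '\\' => out.push_str("\\\\"),
            '\n' => out.push_str("\\n"),
            '\r' => out.push_str("\\r"),
            '\t' => out.push_str("\\t"),
            // Remaining control characters use the \u00XX form.
            c if (c as u32) < 0x20 => out.push_str(&format!("\\u{:04x}", c as u32)),
            c => out.push(c),
        }
    }
    out.push('"');
    out
}
```

For an ordinary zone name such as `living_room` the output is the same as plain quoting, which matches the commit's "value-identical for normal zone names" claim.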
rUv	d2089c342a	fix(engine security): close witness domain-separation collision in governed-trust cycle + prove privacy monotonicity (#1074 ) * fix(engine): length-prefix witness fields to close domain-separation collision The BLAKE3 trust witness concatenated model_version, calibration_version, and privacy_decision boundary-to-boundary, with the variable-length evidence list lacking an explicit count. A string straddling a field boundary (e.g. a per-room adapter id absorbing the leading bytes of the calibration epoch, or a model_version absorbing a trailing evidence ref) collided with a different trust decision — silently un-distinguishing two distinct privacy-relevant inputs and defeating the ADR-137 tamper/drift audit guarantee. model_version is operator-influenceable via the adapter id (ADR-150 §3.4), so the ambiguity was reachable. Fix: domain-tag the hash and length-prefix every field (8-byte LE length), plus an explicit evidence count. Pinned by two fails-on-old tests: witness_distinguishes_model_calibration_boundary and witness_distinguishes_evidence_model_boundary. Co-Authored-By: claude-flow <ruv@ruv.net> * test(engine): pin privacy monotonicity, fail-closed boundaries; de-magic constants Review hardening for the governed-trust cycle (no behavior change): - forced_contradiction_never_relaxes_class: property test over all 5 privacy modes proving a forced contradiction only ever raises the emitted class byte (more restrictive) and a clean cycle emits exactly the base class — the ADR-141/120 information-only-removed invariant. - empty_cycle_fails_closed: a zero-frame cycle errors (fusion NoFrames), emits no SemanticState, and does not advance the cycle counter. - single_node_cycle_is_well_formed: characterizes the n=1 boundary (no mesh, no directional, base class, witness still emitted) — documents single-node sensing as a valid non-demoting mode, not a bypass. 
- De-magicked the engine-construction literals (coherence accept gate, ADR-143 SLAM discovery + static-anchor thresholds) into named documented consts, value-identical, pinned by engine_constants_match_prior_values. Co-Authored-By: claude-flow <ruv@ruv.net> * docs(engine-review): record witness domain-separation fix + monotonicity clean bill CHANGELOG [Unreleased] Security entry and review notes appended to ADR-137 (witness domain-separation fix) and ADR-141 (privacy monotonicity confirmed clean over all 5 modes, fail-closed boundaries pinned). Co-Authored-By: claude-flow <ruv@ruv.net>	2026-06-14 15:32:24 -04:00
rUv	df617145d6	feat(ADR-262 P3): live /api/field + /ws/field — RuView sensing speaks RuField (fail-closed egress) (#1071 ) * feat(ADR-262 P3): live RuField surface — RuView sensing speaks RuField on /api/field + /ws/field Wire the P1 `wifi-densepose-rufield` bridge into the live `wifi-densepose-sensing-server` so the governed sensing cycle emits real signed RuField `FieldEvent`s on two additive endpoints. - Cargo: add the `wifi-densepose-rufield` path dep (the single coupling point, ADR-262 §5.4 — no new RuView-internal coupling). - New `src/rufield_surface.rs` (kept out of the 8k-line main.rs): `FieldSurface` holds a dedicated ed25519 `Signer` + a bounded ring of recent events + the `/ws/field` broadcast topic; `GET /api/field` and `GET /ws/field` handlers; a standalone `router()` for isolated testing. - Signer (defers the P2 key decision, ADR-262 §8 Q1): a STANDALONE dev/sensing key from `WDP_RUFIELD_SIGNING_SEED`, else a deterministic dev default with a logged WARN. Reusing the `cog-ha-matter` Ed25519 key is the deferred P2 call — P3 does not pre-empt it. - Tap: at the ESP32 governed-trust cycle (`main.rs` ~5886 observe_cycle / ~5938 SensingUpdate build), `emit_rufield_event` joins the cycle's features/classification/signal_field with the engine's effective_class/demoted trust state into a `SensingSnapshot` and surfaces it via the bridge. Existing endpoints (`/ws/sensing` etc.) are unchanged — purely additive. - Privacy egress: `network_egress_allowed` is fail-closed for an unattended live surface — only P1/P2 leave the box; P0 raw and P3/P4/P5 (identity/biometric/aggregate) are held edge-local. A `Derived` cycle maps to P4/P5 and never surfaces. - No-phantom: `emit` drops no-presence cycles (no fabricated events). Gates (tests/rufield_surface_test.rs, tower::oneshot, 4/0): well-formed signed event (WifiCsi, P2 not P1, is_fusable, real timestamp); empty cycle → no phantom; Derived trust never surfaces; mixed stream surfaces only egress-safe events. 
Honesty (ADR-262 §0/§6): real plumbing on a live endpoint, NOT accuracy. Single-link CSI with its existing caveats (no validated room-coordinate accuracy); dedicated dev signing key pending the P2 ownership decision; no accuracy claim. Co-Authored-By: claude-flow <ruv@ruv.net> * docs(ADR-262 P3): mark P1+P3 implemented; document /api/field + /ws/field; CHANGELOG - ADR-262 Status → "P1 + P3 implemented"; add a P3 implementation-status block (tap site, endpoints, dedicated dev signer deferring the §8 Q1 key decision, fail-closed egress, gates). Keep the honesty framing: real plumbing on a live endpoint, not accuracy. - CHANGELOG [Unreleased]: add the ADR-262 P3 entry. - user-guide: add `/api/field` to the REST table + a "RuField surface (ADR-262 P3)" section covering `/api/field` + `/ws/field`, the fail-closed P1/P2-only egress, the WDP_RUFIELD_SIGNING_SEED dev key, and the no-accuracy honesty note. Co-Authored-By: claude-flow <ruv@ruv.net> * ci: checkout submodules everywhere + Dockerfile copies vendor/rufield Making wifi-densepose-rufield (ADR-262 bridge) a v2 workspace member means EVERY cargo-on-workspace context must have the vendor/rufield submodule present (cargo loads all member manifests). P1 only fixed the rust-tests job; this adds `submodules: recursive` to all workflow checkouts that run cargo (mqtt-integration was failing on the missing submodule manifest), and makes Dockerfile.rust COPY vendor/rufield/ to /vendor/rufield (matches the bridge's ../../../vendor/rufield path-dep under the collapsed Docker layout). update-submodules.yml left alone (it manages submodules itself). Co-Authored-By: claude-flow <ruv@ruv.net> --------- Co-authored-by: ruv <ruvnet@gmail.com>	2026-06-14 13:55:41 -04:00
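The witness fix in d2089c342a (#1074) replaces boundary-to-boundary concatenation with a domain tag, an 8-byte little-endian length prefix on every field, and an explicit evidence count. A minimal sketch of that framing, assuming string-valued fields and a made-up field order: `DOMAIN_TAG`, `push_field`, `naive_preimage` and `framed_preimage` are hypothetical names, not the engine's actual API. The `privacy_decision` field and the BLAKE3 hashing step are omitted, so this only builds the bytes that would be hashed.

```rust
/// Hypothetical domain tag; the real tag used by the engine is not shown in the commit message.
const DOMAIN_TAG: &[u8] = b"example/trust-witness/v1";

/// Ambiguous form: fields concatenated with no boundaries and no evidence count.
fn naive_preimage(model_version: &str, calibration_version: &str, evidence: &[&str]) -> Vec<u8> {
    let mut out = Vec::new();
    out.extend_from_slice(model_version.as_bytes());
    out.extend_from_slice(calibration_version.as_bytes());
    for e in evidence {
        out.extend_from_slice(e.as_bytes());
    }
    out
}

/// Append one field as an 8-byte little-endian length followed by the field bytes.
fn push_field(out: &mut Vec<u8>, field: &[u8]) {
    out.extend_from_slice(&(field.len() as u64).to_le_bytes());
    out.extend_from_slice(field);
}

/// Unambiguous form: domain tag, length-prefixed fields, explicit evidence count.
/// The field order here is an assumption made for illustration.
fn framed_preimage(model_version: &str, calibration_version: &str, evidence: &[&str]) -> Vec<u8> {
    let mut out = Vec::new();
    push_field(&mut out, DOMAIN_TAG);
    push_field(&mut out, model_version.as_bytes());
    push_field(&mut out, calibration_version.as_bytes());
    // The count separates "one ref `ab`" from "two refs `a` and `b`".
    out.extend_from_slice(&(evidence.len() as u64).to_le_bytes());
    for e in evidence {
        push_field(&mut out, e.as_bytes());
    }
    out
}
```

With the naive form, a model version of `room-a1` with calibration `2` yields the same bytes as `room-a` with calibration `12`, so any hash over them collides. The framed form keeps the two apart because each field's length is part of the hashed input.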

281 Commits