The module-level doc comment in matter/bridge.rs had a 4-space-indented
ASCII tree diagram. Rustdoc parses any 4-space-indented block in a doc
comment as a Rust code block (markdown indented-code-block syntax) and
runs it as a doctest. The tree text isn't valid Rust → doctest fails.
This broke the Rust Workspace Tests workflow on PR #778:
test crates/.../src/matter/bridge.rs - matter::bridge (line 6) ... FAILED
test result: FAILED. 0 passed; 1 failed
error: doctest failed, to rerun pass `-p ... --doc`
Wrapping the tree in a `text` fenced block tells rustdoc to render but
not compile it.
Verified locally:
cargo test -p wifi-densepose-sensing-server --no-default-features --doc
test result: ok. 0 passed; 0 failed; 1 ignored
Refs PR #778, issue #776.
Co-Authored-By: claude-flow <ruv@ruv.net>
state_messages_published_on_snapshot_broadcast was failing under CI
(0/3 → 2/3 passes; only this one red). Root cause: the test waited
only 700ms after spawn(publisher) before sending the first
VitalsSnapshot through the broadcast channel, and used a 3s capture
window after a 200ms inter-snapshot delay.
What's actually happening on the wire during those 700ms:
1. rumqttc::AsyncClient::new() returns immediately (connection is
lazy — happens on first publish)
2. publisher::run() awaits publish_all_discovery() which issues 21+
QoS-1 publishes on the discovery prefix. Each is an ack-waited
round-trip — median ~800ms total on local loopback, easily
>2s on a fresh GH Actions runner with cold rustls.
3. After discovery, the run loop reaches its tokio::select! and
starts draining state_rx.
The test was sending broadcasts WHILE the publisher was still in
discovery, so the broadcast::Receiver buffer (capacity 32) was
draining without the publisher ever processing them — the publisher's
select! only polls state_rx between rumqttc events.
Fix:
- Wait 3s after spawn() (well past observed ramp-up, doubled for
CI variance)
- Send 6 snapshots in a loop with 200ms gaps (one dropped won't
tank the test)
- Capture window 8s instead of 3s (room for rate-limited publishes
to land)
Local impact: test now reliably passes against `mosquitto -c
allow_anonymous=true` on loopback in ~12s wall time. CI matrix should
pick the same green outcome.
Other two integration tests (discovery + privacy_mode) already passed
on every prior run — they only assert on discovery topics, which the
publisher emits before any state.
Refs PR #778, issue #776.
Co-Authored-By: claude-flow <ruv@ruv.net>
4 new proptest cases in matter::commissioning::tests. Each fuzzes
~256 random (passcode, discriminator) pairs per cargo-test run →
~1,024 additional commissioning-code trials per CI cycle.
## Invariants enforced under random sampling
- `manual_code_shape_invariants` — for ANY valid (passcode, disc) in
range and not in the §5.1.6.1 disallowed set, from_input MUST
produce: exactly 11 ASCII digits, Verhoeff self-consistent body+
check, 4-3-4 display form with dashes at positions 4 and 8.
- `disallowed_passcodes_always_rejected` — every passcode in the
§5.1.6.1 list MUST be rejected regardless of discriminator.
- `oversized_inputs_always_rejected` — passcode ≥ 2^27 OR
discriminator ≥ 2^12 MUST be rejected, regardless of the other
axis's value.
- `manual_code_deterministic_under_random_input` — same input always
produces same code (uses prop_assume to skip the spec-disallowed
passcodes since they'd Err out before getting to the code-equality
check).
The DISALLOWED_PASSCODES const is hoisted from the example-based
test for reuse across proptest cases.
Lib test count: 420 → 424. Effective ADR-115 fuzz coverage rises to
~3,584 fuzzed trials per CI run (1,280 wire-boundary + 1,280
semantic-bus + 1,024 commissioning).
Refs #776, PR #778.
Co-Authored-By: claude-flow <ruv@ruv.net>
5 new proptest cases in semantic:🚌:tests. Each runs ~256
iterations per cargo-test invocation → ~1,280 additional fuzzed
snapshot trials per CI run, throwing every variety of RawSnapshot
the bus can plausibly see at the 10-primitive FSM dispatch.
The `arb_snapshot()` Strategy generates RawSnapshots with:
- since_start ∈ 0..86400 s (covers warmup + 24h primitives)
- timestamp_ms full positive range
- motion deliberately ∈ -0.5..2.0 (out-of-range to test clamping)
- motion_energy ∈ -1000..10000
- breathing_rate_bpm ∈ Option<0..200>
- heart_rate_bpm ∈ Option<0..250>
- n_persons ∈ 0..10
- rssi_dbm ∈ Option<-120..0>
- vital_confidence ∈ 0..1
- local_seconds_since_midnight ∈ 0..86400 (covers bed_exit window
wrap-around test)
- active_zones ∈ random vec of [a-z]{3,8} strings
Strategy is split into two nested tuples because proptest only impls
Strategy for tuples up to length 12 (we have 13 fields).
Invariants enforced:
- `bus_tick_never_panics_on_arbitrary_snapshot` — every primitive
handles every plausible input without panic. Pathological cases
include motion=1.7, HR=Some(0.0), empty zones, NULs nowhere
(RawSnapshot doesn't carry those), and odd timestamp combinations.
- `bus_events_carry_node_id_and_ts` — no event ever emitted with
empty node_id; timestamp_ms exactly matches the input snapshot's.
- `boolean_states_always_have_reason_tags` — when `changed=true`,
the `reason.tags` MUST be non-empty. The explainability contract
is enforced at the bus boundary, not just where convenient.
- `per_tick_event_count_bounded_by_primitive_count` — bus emits ≤
10 events per tick (one per primitive). Catches double-emission
bugs where a future primitive accidentally fires twice.
- `replay_same_snapshot_is_deterministic_per_fresh_bus` — replaying
the same snapshot to N fresh buses produces the same event-kind
list every time. Catches uninitialised internal state.
Lib test count: 415 → 420 (each proptest function = 1 test slot but
fuzzes ~256 cases internally). Effective coverage rises to ~1,955
assertions per CI lib run.
Refs #776, PR #778.
Co-Authored-By: claude-flow <ruv@ruv.net>
## Status flip — ADR-115 §Status
Per maintainer ACK (#776 issue body + 13 ACK'd open questions) and the
shipped implementation in PR #778 (410 lib tests, witness bundle
VERIFIED), the MQTT track is now Accepted. The Matter SDK wiring P8b
remains Proposed pending the §9.10 deferral to v0.7.1.
ADR header table updated:
- Status: "**Accepted** (MQTT track P1-P7 + P8a + P9 + P10 shipped
2026-05-23 in PR #778, 410 lib tests, witness bundle VERIFIED) /
**Proposed** (Matter SDK wiring P8b deferred to v0.7.1 per §9.10)"
- Codename: HA-DISCO (MQTT) + HA-FABRIC (Matter) + **HA-MIND** (semantic
primitives) — the third codename always belonged in the masthead.
- Tracking issue: now points at #776 + PR #778
`docs/adr/README.md` ADR index gets an ADR-115 row in the
"Platform and UI" section with the same Accepted/Proposed split.
## Property-based fuzzing — mqtt::security
Added 5 proptest cases (each runs ~256 iterations per cargo-test
invocation, so ~1280 additional assertions per CI run):
- topic_segment_rejects_anything_with_wildcards_or_separators —
random Unicode prefix/suffix + an injected '+', '#', NUL, or '/'
MUST be rejected
- topic_segment_accepts_safe_alphabet — any string built solely from
the safe alphabet MUST be accepted
- topic_segment_always_rejects_empty — invariant across seeds
- payload_size_check_is_monotonic — every size ≤ MAX is OK, every
size > MAX errors with the exact size
- path_safety_rejects_nul_or_newline_anywhere — NUL/newline at any
offset in the path MUST be rejected
`proptest` 1.5 added as dev-dep with default features off (no
proptest-derive needed). ~3 transitive crates added, dev-only.
Total lib tests: 410 → 415 passed, 0 failed, 1 properly ignored.
Refs #776, PR #778.
Co-Authored-By: claude-flow <ruv@ruv.net>
## Two CI failures on PR #778 fixed
### 1. Rust Workspace Tests (E0601: `main` not found in mqtt_publisher)
Default `cargo build --workspace` compiles examples without forwarding
`--features mqtt`. The example had a crate-level `#![cfg(feature =
"mqtt")]` so the entire file evaporated, leaving zero `main`. Now
provides a stub `main` when the feature is off (prints a hint and
exits 2), and gates the real implementation behind `#[cfg(feature =
"mqtt")]` per-item.
Local verification:
cargo check --no-default-features --examples → clean
### 2. mqtt-integration (mosquitto never became reachable)
`eclipse-mosquitto:2.x` rejects anonymous connections by default and
GH Actions `services:` containers don't easily support volume-mounting
a custom config. Removed the service container and start mosquitto
manually in a step with an inline `allow_anonymous true` listener on
port 11883. Same wire shape, no auth (CI tests protocol behaviour,
not security — production uses mTLS per ADR §3.9).
## Benchmark numbers captured (`docs/integrations/benchmarks.md`)
Ran `cargo bench --features mqtt --bench mqtt_throughput` locally:
| Hot path | Measured | Target | Better by |
|---------------------------------------|----------|--------|-----------|
| state::event_fall encode | 259 ns | <2 µs | 7.7× |
| rate_limiter::allow_first | 49.7 ns | <100 ns| 2× |
| rate_limiter::allow_within_gap | 62.1 ns | <100 ns| 1.6× |
| privacy::decide_hr_strip | 0.24 ns | <50 ns | 208× |
| privacy::decide_presence_keep | 0.24 ns | <50 ns | 208× |
| semantic::bus_tick_all_10_primitives | 717 ns | <10 µs | 14× |
At 1 Hz publish rate per node, the entire ADR-115 hot path costs
~1 µs per node per tick on commodity hardware. A Cognitum Seed
hosting 100 nodes would burn 100 µs/sec — 0.01% load floor. Memory:
~30 KB total FSM state for 10 primitives × 100 nodes.
The numbers exceed every target by ≥1.6×, several by 100×+. No need
to optimise further for v0.7.0.
Refs #776, PR #778.
Co-Authored-By: claude-flow <ruv@ruv.net>
Ships the SDK-independent half of the Matter Bridge production work:
## `matter::bridge` — endpoint tree assembly
`build_bridge_tree(nodes) -> BridgeTree` walks a list of `(node_id,
friendly_name, [EntityKind])` tuples and produces the Matter endpoint
graph the SDK will materialise:
EP 0 (BridgedDevicesAggregator)
EP 1 (BridgedNode for "Bedroom")
EP 2 (OccupancySensor for Presence + PersonCount vendor attr)
EP 3 (OccupancySensor for SomeoneSleeping)
EP 4 (GenericSwitch for FallDetected)
EP 5 (BridgedNode for "Living") …
Key invariants enforced by tests:
- `PersonCount` collapses onto Presence's endpoint as a vendor
attribute, never gets its own endpoint
- Biometric entities (HR/BR/pose) are skipped entirely — they
never appear in the tree
- Every child endpoint carries `BasicInformation` cluster
- Endpoint IDs are monotonic + unique (verified by sort+dedup test)
- Empty node list yields just the root aggregator
- Multi-node bridges keep per-node endpoint isolation
- `endpoint(id)` lookup resolves every assigned ID
## `matter::commissioning` — setup-code generation
`SetupCodeInput::dev(passcode, discriminator) -> ManualPairingCode`
produces the 11-digit human-readable Matter pairing code that users
scan/enter into Apple Home / Google Home / HA Matter integration.
Validates against Matter Core Spec §5.1.6.1 disallowed-values list
(11111111, 12345678, 87654321, all-same-digit patterns, 0). Rejects
oversized passcode (≥2^27) and discriminator (≥2^12).
The Verhoeff check digit is computed per spec §5.1.4.1.5 — full
D/P/INV tables transcribed. The check digit appended to the body is
self-consistent (verified by a recompute-and-compare test).
`ManualPairingCode::display_4_3_4()` returns the dashed form
(`1234-567-8901`) controllers actually display.
Bit-packing is a placeholder for v0.7.0 — the chunk values are
hashed-then-mod into their decimal widths so the output is
deterministic + input-sensitive + Verhoeff-valid, but not yet
bit-perfect spec-compliant. The fully spec-compliant code (with QR
base-38 payload) lands at P8b when `rs-matter` is integrated; see
ADR-115 §9.10. This module gives the SDK layer a stable testable
contract to build against.
## Tests
- 16 cluster mapping (existing)
- 11 bridge assembly (new): aggregator root, branch-per-node,
PersonCount collapsing, HR/BR skip, BasicInformation cluster on
every endpoint, monotonic+unique IDs, total endpoint count, lookup,
multi-node isolation, empty-node list
- 11 commissioning (new): dev VID/PID defaults, disallowed-passcode
rejection (12 spec values), oversized-passcode rejection,
oversized-discriminator rejection, canonical test vectors accepted,
11-digit code always, 4-3-4 display format, determinism, sensitivity
to passcode change, sensitivity to discriminator change, Verhoeff
self-consistency, invalid-input early return
Total lib tests: **410 passed**, 0 failed, 1 properly ignored.
Refs #776, PR #778.
Co-Authored-By: claude-flow <ruv@ruv.net>
Ships the **Matter cluster + device-type mapping table** as pure Rust
types independent of any specific Matter SDK. SDK choice between
`matter-rs` and chip-tool FFI per ADR-115 §9.10 lands in P8 once
spike-validated against real controllers; this commit gives the SDK
work a stable mapping target to build against.
## What this lands
- `matter::clusters` module:
- Spec-defined constants: `CLUSTER_OCCUPANCY_SENSING` (0x0406),
`CLUSTER_SWITCH` (0x003B), `CLUSTER_BOOLEAN_STATE` (0x0045),
`CLUSTER_BRIDGED_DEVICE_BASIC_INFORMATION` (0x0039),
`DEVICE_TYPE_OCCUPANCY_SENSOR` (0x0107),
`DEVICE_TYPE_GENERIC_SWITCH` (0x000F),
`DEVICE_TYPE_AGGREGATOR` (0x000E),
`DEVICE_TYPE_BRIDGED_NODE` (0x0013),
`VENDOR_ATTR_PERSON_COUNT` (0xFFF1_0001),
`EVENT_SWITCH_MULTI_PRESS_COMPLETE` (0x06).
Values transcribed from Matter Core Spec 1.3 §A.1 + Device Library 1.3.
- `matter_mapping(EntityKind) -> Option<MatterClusterMapping>` —
single source of truth implementing ADR §3.11.1:
* Presence / zones / sleeping / room-active / meeting / bathroom
→ OccupancySensing on OccupancySensor endpoints
* Fall / bed-exit / multi-room → Switch.MultiPressComplete events
on GenericSwitch endpoints
* Distress / elderly-anomaly / no-movement → BooleanState (NOT
occupancy — keeps controllers from binding motion-light scenes
to safety alerts)
* Person count → vendor-extension attribute on shared OccupancySensor
* Fall-risk score → vendor attribute on BridgedNode endpoint
* HR / BR / pose / motion-level / motion-energy / presence-score /
RSSI → explicit `None` (no Matter cluster represents them, stay
MQTT-only per §3.11.4)
- `entity_on_matter` + `next_endpoint` helpers.
## Tests (16/16 pass, lib total now 388)
- per-entity mapping correctness for every category (occupancy /
switch event / boolean state / vendor extension / explicitly None)
- distinction between presence (OccupancySensing) and distress
(BooleanState) — critical so controllers don't bind motion scenes to
safety alerts
- `someone_sleeping` lives on its own occupancy endpoint (NOT shared
with raw presence) so controllers can wire scenes independently
- biometric channels (HR / BR / pose) explicitly verified to have
`None` mapping — they NEVER reach Matter
- exhaustiveness canary: every `EntityKind` variant hit so adding a
new variant fails the test until the matter table is updated
- spec-ID sanity: cluster IDs match Matter 1.3 published values
## Why scaffolding-first
Per maintainer decision principle (§9): preserve clean protocols,
avoid fake semantics, ship MQTT first, validate Matter second. This
module locks in the cluster mapping table now so when P8 wires
`rs-matter` (or chip-tool FFI fallback), the wire surface is already
defined and tested — only the SDK calls change, not the protocol
contract.
P8 (Matter Bridge production using matter-rs) and P9 (multi-controller
validation against Apple Home / Google Home / HA) remain on the v0.7.1
docket per §9.10.
Refs #776, PR #778.
Co-Authored-By: claude-flow <ruv@ruv.net>
## Security audit (`mqtt::security`)
New module enforcing the ADR-115 §3.9 / §7 wire-level invariants as
pure functions, callable from both the publisher hot path and the
unit-test suite:
- **Topic safety** — reject `+`, `#`, `\0`, `/` in segment-level
identifiers (node_id, client_id, zone tag). Prevents a malicious
upstream payload from injecting MQTT wildcards that would corrupt
subscription semantics.
- **Path safety** — reject NUL / newline in TLS cert / CA paths.
- **Payload-size cap** — 32 KB hard limit per publish, well below
broker defaults (most brokers cap at 256 KB). Lets the publisher
drop oversized payloads with a WARN instead of crashing.
- **Credential hygiene** — `password_via_env_only` is a canary: if
the CLI ever grows an inline `--mqtt-password` flag, this test
fails on purpose. Today we only accept `--mqtt-password-env <VAR>`.
- **STRICT_TLS upgrade** — `RUVIEW_MQTT_STRICT_TLS=1` promotes the
`PlaintextOnPublicHost` advisory from `MqttConfig::validate` to
fatal. This is the planned v0.8.0 default per ADR §9.5.
- **Discovery prefix sanity** — rejects non-alphanumeric prefixes
outside [_-/], so a malformed `--mqtt-prefix` can't escape the HA
topic namespace.
15 unit tests (mqtt::security) covering every invariant + 1
properly-`#[ignore]`d test for the env-mutating STRICT_TLS path.
## Criterion benchmarks (`benches/mqtt_throughput.rs`)
Micro-benchmarks for the MQTT + semantic hot paths:
- discovery payload generation (presence / heart_rate / fall event)
- state encoders (boolean / numeric / event)
- rate-limiter `allow()` decisions (first sample + within-gap)
- privacy `decide()` (strip HR vs keep presence)
- full bus tick across all 10 semantic primitives
Bench targets (laptop-class release build):
- discovery payload: <5 µs state encode: <2 µs
- rate limit: <100 ns privacy decide: <50 ns
- bus tick (10 prim): <10 µs
Run with `cargo bench -p wifi-densepose-sensing-server --bench
mqtt_throughput --features mqtt`. Numbers will be captured into the
witness bundle in P10.
`criterion` 0.5 added as dev-dep. `[[bench]] required-features = ["mqtt"]`
so default `cargo bench --workspace` doesn't try to build it without
rumqttc.
Lib test count: **372 passed** (357 → 372, +15 security tests).
Refs #776.
Co-Authored-By: claude-flow <ruv@ruv.net>
Adds three integration tests (`v2/crates/wifi-densepose-sensing-server/
tests/mqtt_integration.rs`) that prove the publisher works against a
real broker, gated behind `--features mqtt` + `RUVIEW_RUN_INTEGRATION=1`:
1. `discovery_topics_appear_on_broker` — spawn the publisher, subscribe
`homeassistant/#` with rumqttc, drain for 6s, assert that presence/
heart_rate/fall discovery config topics all landed with the exact
JSON shape (device_class, payload_on/off, unique_id namespace).
2. `privacy_mode_suppresses_biometric_discovery` — with
`privacy_mode=true`, biometric topics (heart_rate, breathing_rate,
pose) must NEVER appear on the wire. Semantic primitives
(someone_sleeping, etc) MUST still appear — they're inferred
states, not biometric values, per ADR-115 §3.12.3.
3. `state_messages_published_on_snapshot_broadcast` — push a
VitalsSnapshot through the broadcast channel, assert ON/OFF state
messages reach the broker.
Plus `.github/workflows/mqtt-integration.yml` — spins up Mosquitto
2.0.18 as a GH Actions service container, waits for it via
`mosquitto_pub` health probe, runs both the lib unit suite under
`--features mqtt` and the integration suite. Dumps broker logs on
failure for debugging.
Tests are SKIPPED locally unless `RUVIEW_RUN_INTEGRATION=1` is set —
default `cargo test --workspace` stays fast for developers.
Fixed an unused-import warning in `semantic::bus` (gated `Reason`
behind `#[cfg(test)]`).
Lib test count now: 357 passed across the crate (cli 6 + mqtt 45 +
semantic 66 + everything else 240 — all green under
`cargo test --no-default-features --lib`).
Refs #776.
Co-Authored-By: claude-flow <ruv@ruv.net>
Lands the remaining six §3.12 v1 primitives:
- `distress` (PossibleDistress) — EWMA baseline HR + 1.5× multiplier
+ agitated motion + no-fall + 60 s dwell → ON. Refractory 5 min
after exit. Baseline only updates when NOT active AND NOT in
candidate-distress state (low motion, HR near baseline) so a
sustained elevated HR doesn't drift the baseline up before the
dwell completes — without this guard the test would never fire.
- `elderly_anomaly` (ElderlyInactivityAnomaly) — current idle stretch
> 2× longest-observed-idle baseline. Baseline floor at 30 min so
the first day doesn't fire spuriously. 24 h refractory per resident.
- `meeting` (MeetingInProgress) — n_persons ≥ 2 + low-amplitude motion
(1–20%) + 10 min dwell → ON. 2 min exit dwell on count drop.
- `fall_risk` (FallRiskElevated) — 0–100 continuous score from
near-fall count in trailing 24 h + recent motion variance. Emits
Scalar every tick; emits Event on upward threshold crossing
(default 70).
- `bed_exit` (BedExit) — edge-triggered event: was in bed_zone, now
not, between 22:00 and 06:00 local (wrap-around window honoured).
- `multi_room` (MultiRoomTransition) — edge-triggered event: zone
exit + different zone enter within 10 s gap. Reason payload carries
from/to zone tags so HA automations can route paths.
Bus wired to dispatch all 10 primitives; `SemanticKind` enum expanded
to match. `tick()` returns up to 10 events per snapshot.
32 new tests (66 semantic + 45 mqtt + 6 cli = **117 total**):
- distress (7): does-not-fire-with-normal-HR, fires-on-sustained-
elevated-HR-with-motion, does-not-fire-during-fall, exits-when-
motion-calms-and-HR-normalises, refractory-blocks-immediate-refire,
refire-allowed-after-refractory, baseline-does-not-track-during-
active.
- elderly_anomaly (5): fires-when-idle-exceeds-2x-baseline, does-not-
fire-before-threshold, motion-clears-active-state, baseline-grows-
to-observed-max, refractory-prevents-repeat-alerts.
- meeting (4): fires-after-dwell-with-2+, does-not-fire-with-1-
person, does-not-fire-with-high-motion, exits-after-2-min-of-low-
count.
- fall_risk (5): warmup-blocks, emits-scalar-when-active, score-
grows-with-falls, emits-event-when-crossing-threshold, fall-
history-evicts-after-24h.
- bed_exit (6): fires-on-bed-to-non-bed-overnight, does-not-fire-
during-day, does-not-fire-without-prior-in-bed, warmup-blocks,
does-not-fire-when-bed-zones-unconfigured, fires-just-after-
midnight-window-start.
- multi_room (5): fires-when-zone-changes-quickly, does-not-fire-
after-long-gap, does-not-fire-on-same-zone-re-entry, warmup-blocks,
handles-simultaneous-zone-swap.
ADR-115 §3.12 inference layer now complete. Each primitive has
warmup, hysteresis, explainability tags, configurable thresholds.
Adding a v2 primitive is one file + one bus entry.
Refs #776.
Co-Authored-By: claude-flow <ruv@ruv.net>
ADR-115 §3.12 keystone. Raw signals are not the product — customers want
first-class entities like `binary_sensor.bedroom_someone_sleeping`, not a
Node-RED flow that thresholds breathing rate at night. This commit lands
the inference layer that turns the broadcast channel into 10 v1 semantic
primitives, starting with the 4 highest-leverage ones.
Modules:
- `semantic::common` — `RawSnapshot` projection, `PrimitiveState`,
`PrimitiveConfig` (thresholds matching the v1
catalog in ADR §3.12), `in_window` for time-gated
primitives, `Reason` explainability struct.
- `semantic::sleeping` — SomeoneSleeping FSM: presence + motion<5%
+ BR ∈ [8,20] bpm + 5min dwell. Exit on
presence-drop (immediate) or motion>15%
for 30s.
- `semantic::room_active` — motion >10% in 30s window → ON. Exit on
presence-drop or 10min idle.
- `semantic::bathroom` — presence + zone tagged as bathroom. Safe
in privacy mode (no biometrics in the
derivation).
- `semantic::no_movement` — presence + motion<1% for 30min → ON.
Safety-check primitive for aging-in-place.
- `semantic::bus` — single dispatch that runs all primitives
on each `RawSnapshot`, returns a list of
`SemanticEvent`s for MQTT+Matter publish.
Every primitive has:
- Warmup suppression (60s default, §3.12.4)
- Hysteresis (enter + exit thresholds different)
- Explainability via `Reason::new(&["motion<5%", "br=12bpm", ...])`
- Configurable thresholds via `PrimitiveConfig`
Test coverage (34 tests, all passing under `--no-default-features`):
- common: in_window simple + wrap-around midnight, default thresholds
match ADR catalog, Reason struct.
- sleeping (7 tests): warmup blocks, fires after dwell, no-fire on high
motion, no-fire on BR out of range, exits on presence-drop immediately,
exits on sustained motion only after 30s, brief blip does not exit.
- room_active (6 tests): warmup, fires on high+presence, no-fire without
presence, no-fire below threshold, exits on presence-drop, exits on
extended idle.
- bathroom (5 tests): fires on zone match, ignores other zones, requires
presence, warmup blocks, emits OFF on zone exit.
- no_movement (4 tests): fires after dwell, no-fire with motion, brief
motion resets timer, exits on motion.
- bus (6 tests): empty during warmup, emits room_active, emits bathroom,
multiple simultaneous primitives, event carries node_id+ts, reason
populated for HA debug.
Total cargo test count now:
cli: 6 + mqtt: 45 + semantic: 34 = 85 tests passing
P4.5b (next iteration) lands the remaining 6 primitives: distress
(HR multiple over baseline), elderly_anomaly (long-window inactivity),
meeting (multi-person dwell), fall_risk (gait instability score),
bed_exit (sleeping → presence-out between 22:00-06:00),
multi_room (track_id continuous across zones).
Refs #776.
Co-Authored-By: claude-flow <ruv@ruv.net>
Adds `mqtt` and `matter` Cargo features (default off) plus 20+ new CLI
flags wired through cli.rs per ADR-115 §3.8 / §3.10 / §3.11 / §3.12:
- MQTT (HA-DISCO): --mqtt, --mqtt-host/--mqtt-port/--mqtt-username/
--mqtt-password-env/--mqtt-client-id/--mqtt-prefix, TLS controls
(--mqtt-tls/--mqtt-ca-file/--mqtt-client-cert/--mqtt-client-key),
rate controls (--mqtt-refresh-secs, --mqtt-rate-{vitals,motion,count,
rssi,pose}, --mqtt-publish-pose).
- Privacy (ADR-106): --privacy-mode strips HR/BR/pose pre-publish.
- Matter (HA-FABRIC): --matter, --matter-setup-file, --matter-reset,
--matter-vendor-id (dev VID 0xFFF1 per §9.9), --matter-product-id.
- Semantic (HA-MIND): --semantic (default ON), thresholds/zones files,
--semantic-baseline-window-days, --no-semantic <PRIMITIVE> repeatable.
rumqttc 0.24 added as optional dep with rustls (Windows-friendly parity
with ureq in this crate). matter-rs deferred to P7 spike per §9.10.
6 unit tests cover defaults, compound flag composition, and repeatable
--no-semantic. Tests pass:
cargo test -p wifi-densepose-sensing-server --no-default-features cli::tests
6 passed; 0 failed.
Branch coordination: this work is on feat/adr-115-ha-mqtt-matter off
main, parallel to ADR-110 work on adr-110-esp32c6 (no file overlap).
Refs #776 (ADR-115 implementation tracking issue).
Co-Authored-By: claude-flow <ruv@ruv.net>
Sets up docs/research/sota-2026-05-22/ as the autonomous-research
output dir, with PROGRESS.md as the canonical 15-vector research
agenda spanning spatial intelligence, RF features, RSSI-only, and
exotic/long-horizon verticals. Cron d6e5c473 (*/10 * * * *) picks
threads from this file and self-terminates at 2026-05-22 08:00 ET.
First concrete contribution this tick — R5 subcarrier saliency:
* examples/research-sota/r5_subcarrier_saliency.py: pure-numpy port
of the count cog's Conv1d encoder + count head, computes per-
subcarrier input×gradient saliency via central-difference. 128
samples × 56 subcarriers × 2 forward passes/subcarrier ≈ ~3 s on
CPU, no GPU or framework dependency.
* docs/research/sota-2026-05-22/R5-subcarrier-saliency.md: research
note with motivation, method, novelty argument, and the first
measured ranking. Top-8 subcarriers for cog-person-count v0.0.2:
[41, 52, 30, 31, 10, 35, 2, 38]. Max/mean ratio 2.85x.
* v2/crates/cog-person-count/cog/artifacts/saliency.json: machine-
readable per-subcarrier saliency + top-K lists, so future-tick
experiments (retrain at K=8/16/32) consume it without re-running.
Key insight from the first measurement: top-8 saliency is *band-
spread* (indices span 2-52), not concentrated. This directly raises
R8's (RSSI-only) feasibility ceiling, because RSSI is a band-
aggregate — it retains the integral of a band-spread signal. First-
order estimate: RSSI-only should hit ~60% of full-CSI accuracy for
the count task. R7 (adversarial defence) inherits a concrete defender-
priority list: corroborate these 8 subcarriers across nodes.
This commit is the first of many short, focused contributions over
the next ~12 hours. PROGRESS.md is the canonical pointer for the
next tick to pick up the next thread.
* chore: stage v0.0.2 artifacts + temperature scalar for build pipeline
Stages count_v1.{safetensors,onnx,temperature,train_results.json}
ahead of the build/sign/upload step. This commit is a momentary
side-effect — the next commit will refresh the per-arch manifests
with the new binary SHAs once ruvultra finishes the cross-build.
The .temperature file holds the calibration scalar from LBFGS over the
held-out conf logits. The Rust cog will read it post-load and divide
conf_logits by it before sigmoid, exactly matching the Python eval.
* feat(cog-person-count): v0.0.2 — K-fold validated, label smoothing + early stop + temp scale
The v0.0.1 "65.1% but class-1=0%" result was an unlucky temporal split
that let a degenerate "always predict 0" classifier hit eval acc =
class-0 fraction. 5-fold stratified random CV proved the architecture
actually learns ~57.1% class-1 accuracy under fair splits — a real,
modestly useful signal.
v0.0.2 ships a retrained model that:
* **Splits randomly (seed=42) 80/20** instead of temporally — eliminates
the trailing-window-class-imbalance cheat.
* **Class-balanced sampler** (multinomial with replacement, weighted by
inverse class frequency) — per-batch expected counts are equal
regardless of dataset distribution.
* **Label smoothing 0.1** on the cross-entropy — reduces confidence
saturation that drove v0.0.1's all-or-nothing predictions.
* **Early stopping** with patience=20 — stops at epoch 29 instead of
overfitting through 400.
* **Temperature scaling** of the conf head — LBFGS fits a scalar T on
held-out conf logits; ships as a count_v1.temperature sidecar so the
Rust cog can divide conf_logits by T before sigmoid.
Numbers on the same data:
| Metric | v0.0.1 | v0.0.2 | K-fold (5x100) |
|------------------|--------|--------|----------------|
| Overall acc | 65.1% | 62.3% | 62.2% ± 1.9% |
| Class 0 acc | 100% | 86.2% | 67.4% |
| Class 1 acc | 0% | 34.3% | 57.1% ✓ |
| MAE | 0.349 | 0.377 | 0.378 |
| Spearman | 0.023 | 0.013 | 0.160 |
Class-1 accuracy 0 → 34.3% is the headline win. Net acc moves slightly
because we stopped cheating on class 0. K-fold's 57% says there's
headroom remaining; reaching it needs more independent splits (== more
data), not more training tricks.
Confidence calibration didn't move. Temperature scaling alone can't fix
a confidence head trained against a noisy argmax==truth indicator over
a 62%-accurate classifier — the head's training signal is the issue,
not its post-hoc transform. The honest fix is multi-room data (#645),
not another calibration knob.
Live on cognitum-v0 at /var/lib/cognitum/apps/person-count/ — health
reports candle-cpu backend, count = 1 (was 0 in v0.0.1) on synthetic
zero input.
Files changed:
* scripts/train-count.py — adds --k-fold (no sklearn dep, hand-rolled
stratified splits with deterministic shuffle) and --v2 paths.
* v2/.../cog/artifacts/count_v1.safetensors (392 KB, new sha
32996433…) + count_v1.onnx (16 KB) + count_v1.temperature (0.9262
scalar) + count_train_results.json (full epoch trace).
* v2/.../cog/artifacts/manifests/{arm,x86_64}/manifest.json bumped to
version 0.0.2 with the new weights_sha256 + caveats.
* docs/benchmarks/person-count-cog.md — appends a v0.0.2 section
with the K-fold diagnostic table and honest-read paragraph.
GCS:
gs://cognitum-apps/cogs/arm/cog-person-count-count_v1.safetensors
refreshed (binaries unchanged — load weights via mmap at runtime).
The arm + x86_64 manifests committed in #696 referenced the binaries
built before #697 wired the `run` subcommand. Rebuilt + re-signed +
re-uploaded to GCS, and re-deployed to cognitum-v0:
arm sha 15c2fbac…7728ea5 (3,807,456 B, up from 2,168,816 — added Tokio runtime)
x86_64 sha 051614ce…cc8388b3 (4,502,960 B, up from 2,615,528)
Both re-signed Ed25519 with COGNITUM_OWNER_SIGNING_KEY. Manifests
now match the binaries published at gs://cognitum-apps/cogs/{arm,
x86_64}/cog-person-count-* and the binary installed at
/var/lib/cognitum/apps/person-count/ on cognitum-v0.
Phase 4 of ADR-103. Adds the long-running polling loop so the cog's
fourth verb (`run`) does real work, completing the ADR-100 runtime
contract end-to-end:
cog-person-count version → "person-count 0.3.0"
cog-person-count manifest → JSON skeleton
cog-person-count health → loads weights + 1-shot infer + emit
cog-person-count run --config → long-running per-frame emit ← THIS
What ships:
* src/runtime.rs (new) — `run_loop` polls sensing_url every poll_ms,
slides a [56, 20] CSI window, runs InferenceEngine::infer, emits
publisher::person_count events. Same shape as
cog-pose-estimation::runtime — fetch_frame extracts amplitudes
from `snapshot.nodes[0].amplitude[]`, fails open on connect errors
with a WARN log rather than crashing.
* src/lib.rs — registers the runtime module.
* src/main.rs — cmd_run now loads RunConfig from a JSON file, builds
the InferenceEngine (with weights if cfg.model_path is set,
otherwise auto-discover), emits a run.started event, and hands off
to the Tokio multi-thread runtime's block_on(run_loop). Single-node
fusion is a no-op for N=1 today; v0.2.0 will append predictions
from sibling nodes and call fusion::fuse_confidence_weighted before
emit.
Verified locally:
cargo check -p cog-person-count --no-default-features → clean
cargo test -p cog-person-count → 15/15 pass (no regressions)
cargo build -p cog-person-count --release → 2.36 MB unchanged
./cog-person-count run --config bad-config.json:
line 1: {"event":"run.started","fields":{"cog":"person-count",
"sensing_url":"http://127.0.0.1:9999/...",poll_ms:100,
"model_path":"(auto-discover)"}}
line 2: WARN sensing-server fetch failed
error=Connection Failed: Connect error: actively refused
(loop alive — exits cleanly on SIGTERM, no crash, no NaN)
Also adds a "Relationship to the in-process score_to_person_count
heuristic" section to cog/README.md explaining the dual-emitter
design (sensing-server keeps emitting the PR #491 slot heuristic;
the cog runs out-of-process and emits person.count events from the
learned model). Operators choose by installing the cog or not — no
sensing-server rebuild required.
ADR-103 §"Migration" status:
1. Land ADR + scaffold ........... done (#693, #694)
2. Train count_v1 ................ done (#695)
3. Cross-compile + sign + GCS .... done (#696)
4. Server-side wiring ............ done — out-of-process design
means no rewire needed; this
cog is the wiring.
5. v0.2.0 multi-room + LoRA ...... data-bound (#645)
Phase 3 of ADR-103. Cross-compiled aarch64 + x86_64 on ruvultra, signed
with COGNITUM_OWNER_SIGNING_KEY (Ed25519), uploaded to GCS, and live-
installed on the cognitum-v0 Pi 5 alongside cog-pose-estimation.
Real-hardware bench on cognitum-v0:
./cog-person-count-arm health
→ backend=candle-cpu, count=0, confidence=0.49, p95=[0,7]
30 sequential health invocations: 0.276 s → 9.2 ms/invocation cold
Compares to cog-pose-estimation's 8.4 ms — count cog is ~10% slower
because the dual-head (count softmax + confidence sigmoid) does ~2x
the work after the shared encoder.
GCS release artifacts (publicly downloadable, SHA-verified):
arm/cog-person-count-arm 2,168,816 B
sha: 36bc0bb0...0d47b507b3c3
sig: R/00xdzHriyr/2r...JK+a6k71NDg== (Ed25519)
x86_64/cog-person-count-x86_64 2,615,528 B
sha: 76cdd1ec...3923 7392b01db
sig: QB+8cnGSMQmu...ZtTNIQ2rDg== (Ed25519)
arm/cog-person-count-count_v1.safetensors 392,088 B
sha: dacb0551...e6e04ff56d15c3a65a9ff
Live install at /var/lib/cognitum/apps/person-count/ on cognitum-v0
matches the layout of every other installed cog (anomaly-detect,
seizure-detect, pose-estimation): cog-person-count-arm binary,
count_v1.safetensors weights, manifest.json, config.json.
Adds:
* v2/.../cog/artifacts/manifests/{arm,x86_64}/manifest.json — full
ADR-100 schema with all fields filled (sha + sig + size + URL +
build_metadata carrying the v0.0.1 honest training caveats).
* docs/benchmarks/person-count-cog.md — appends "Live appliance
install" and "Signed GCS release artifacts" sections to the
benchmark log.
Honest v0.0.1 caveat still applies (class-1 accuracy 0% on the held-
out tail of the single-session training data) — same data-bound
limit as pose_v1. The shipped artifact is the *vehicle*; production-
quality accuracy follows from multi-room paired data per ADR-103's
v0.2.0 plan + #645.
Phase 2 of ADR-103: trained count head on the existing 1,077 paired
samples (the same data that produced pose_v1 yesterday).
Honest result: 65.1% eval accuracy / 100% within ±1 / MAE 0.349 on
the held-out time-window. Per-class: 100% on "empty room" / 0% on
"1 person". The model overfit by epoch 100 (train_acc → 1.0,
eval_loss climbed 0.67 → 7.8) and the "best" checkpoint is the
snapshot that happened to predict the eval window's class
distribution (140/215 = 65.1%, matches eval_acc exactly). Confidence
head Spearman = 0.023 ⇒ uncalibrated. Same data-bound failure mode
as pose_v1 (#645), bounded by single-session training data; same
fix path (multi-room).
What v0.0.1 still validates end-to-end:
* PyTorch → safetensors → Candle Rust loads cleanly on first try.
`cog-person-count health` reports `backend: candle-cpu` and emits
real per-frame predictions instead of the stub backend's hard-coded
{1 person, 0 confidence}. Architecture parity between train-count.py
and src/inference.rs::CountNet is bit-exact.
* ONNX export bit-clean (16 KB, opset 18, dynamic batch axis).
* Training wall time: 5.6 s for 400 epochs on RTX 5080.
* Binary size unchanged (2.36 MB stripped), model loads via mmap at
runtime.
This commit ships:
* scripts/align-ground-truth.js: extended to emit n_persons_mode +
n_persons_max per window so the training pipeline has count
labels. Backwards-compatible (additive fields).
* scripts/train-count.py: new — mirrors CountNet architecture
exactly, loads paired.jsonl, trains 400 epochs with
CE+BCE+Brier loss, exports safetensors + ONNX + per-epoch JSON.
* v2/.../cog/artifacts/{count_v1.safetensors,count_v1.onnx,
count_train_results.json}: the trained artifacts.
* v2/.../cog/README.md: Status table updated with the v0.0.1 numbers
+ an Honest Caveat section explaining the data-bound result.
* docs/benchmarks/person-count-cog.md: new — full v0.0.1 benchmark
log mirroring the format docs/benchmarks/pose-estimation-cog.md
established. Includes comparison to ADR-103 v0.1.0 acceptance
gates and per-class breakdown.
Still pending:
* `run` subcommand wiring (long-running polling loop, same as pose)
* Cross-compile + sign + GCS upload (mirror of pose cog pipeline)
* Live install on cognitum-v0
* v0.2.0: re-train on multi-room data, LoRA per-room adapters,
Stoer-Wagner min-cut clip in fusion stage
First implementation PR for ADR-103. Same incremental shape that
ADR-101 used: scaffold the cog crate, ship a stub-backend release
that satisfies the runtime contract + 15 tests + measured cold-start,
then follow up with the trained count_v1.safetensors in a separate PR.
What ships:
* v2/crates/cog-person-count/ — new workspace member.
- Cargo.toml: candle-core/candle-nn 0.9 (cpu default, cuda feature
opt-in), safetensors, ureq, sha2 — same dep shape as the pose cog
but minus wifi-densepose-train (this cog has no training-side
consumer, so the dep tree is materially smaller → 2.36 MB
binary vs the pose cog's 4.5 MB).
- src/inference.rs: CountNet (Conv1d 56→64→128→128 encoder + count
head Linear(128→64→8)+softmax + confidence head
Linear(128→32→1)+sigmoid). Stub backend returns
`{1-person, 0-confidence}` honestly when no safetensors present.
- src/fusion.rs: fuse_confidence_weighted() — Bayesian product of
per-node distributions with confidence-weighted log-sum, plus
fuse_with_mincut_clip() hook for the v0.2.0 Stoer-Wagner
upper-bound (`ruvector-mincut` dep lands when min-cut graph
builder is ready). Confidences floored at 1e-3 and probs floored
at 1e-9 before logs — no NaN propagation.
- src/publisher.rs: emits {count, confidence, count_p95_low,
count_p95_high, n_nodes, probs} per ADR-103 §"Output".
- src/main.rs: full ADR-100 four-verb CLI (version|manifest|health
|run). The `run` subcommand explicitly returns "wiring pending
v0.0.1" so the in-process library API is the v0.0.1-clean
integration path.
- tests/smoke.rs (8 tests) + fusion::tests (7 tests, in-lib) — 15
total, all green. Cover stub-backend behaviour, wrong-shape
rejection, fusion math (empty / single / agreement / high-conf
override / normalisation), p95-range correctness, and min-cut
clip semantics.
- cog/{manifest.template.json, config.schema.json, README.md} +
cog/artifacts/ placeholder dir.
* v2/Cargo.toml: registers the new workspace member.
Verified locally:
cargo check -p cog-person-count --no-default-features → clean
cargo test -p cog-person-count --no-default-features → 8/8 pass
cargo test -p cog-person-count --lib → 7/7 pass
cargo build -p cog-person-count --release → 2.36 MB binary
./cog-person-count version → "person-count 0.3.0"
./cog-person-count manifest → JSON skeleton
./cog-person-count health → backend:stub,
count:1, conf:0,
p95:[1,1]
Cold-start: 30 sequential `health` invocations → 53.3 ms/invocation
(vs cog-pose-estimation's 76.2 ms — smaller dep tree)
cog/README.md adds:
* Security section — six-row threat table covering safetensor mmap
trust, non-finite outputs, sensing fetch failures, fusion
divide-by-zero / log-of-zero, min-cut degenerate cases, and stdout
spoofing.
* Performance / optimization section — binary size, release profile
(already opt-level=3 / lto=fat / codegen-units=1 / strip=true at
workspace level), cold-start comparison table, projected warm-path
latency budget.
Still pending (separate PRs, ADR-103 §"Migration"):
* Train count_v1.safetensors on the existing 1,077 paired samples
with `n_persons` labels (Candle on RTX 5080, same script that
produced pose_v1.safetensors yesterday).
* `run` subcommand wiring (long-running polling loop, same shape as
cog-pose-estimation::runtime).
* Cross-compile + sign + GCS upload (mirror of cog-pose-estimation
release pipeline).
* Server-side `csi.rs::score_to_person_count` call-site rewire to
consume this cog when installed; falls back to PR #491's heuristic
when not.
* feat(edge-registry): ADR-102 — surface Cognitum cog catalog via /api/v1/edge/registry
Adds a new sensing-server endpoint that fetches and caches the canonical
Cognitum app registry at
https://storage.googleapis.com/cognitum-apps/app-registry.json (105 cogs
across 11 categories as of v2.1.0). RuView previously had no live
awareness of the catalog — the README's capability table was hand-
curated and went stale as Cognitum shipped new cogs (the registry was
last updated 6 days ago).
ADR:
* docs/adr/ADR-102-edge-module-registry.md — full design, response
shape, configuration flags, failure modes, and a 12-row security
review covering SSRF, response inflation, ?refresh abuse, stale-serve
semantics, TLS, cache poisoning, JSON-panic resistance, etc.
Code:
* v2/.../edge_registry.rs — EdgeRegistry struct + UreqFetcher +
MockFetcher trait + 7 unit tests. RwLock<Option<CachedEntry>> with
stale-on-error fallback. MAX_PAYLOAD_BYTES=8 MiB, 10s wire timeout.
* v2/.../main.rs — constructs Option<Arc<EdgeRegistry>> at startup,
registers GET /api/v1/edge/registry handler, wires Extension layer.
Handler runs the blocking ureq fetch via tokio::task::spawn_blocking
so the async runtime stays free.
* v2/.../cli.rs / main.rs Args — three new flags (per user request to
"allow the registry to be disabled or changed"):
--edge-registry-url <URL> (env RUVIEW_EDGE_REGISTRY_URL)
--edge-registry-ttl-secs <N> (env RUVIEW_EDGE_REGISTRY_TTL_SECS)
--no-edge-registry (env RUVIEW_NO_EDGE_REGISTRY)
When --no-edge-registry is set or the URL is empty, the endpoint
returns 404.
Cargo.toml: adds ureq (rustls), sha2, thiserror as direct deps.
README:
* New collapsed "🧩 Edge Module Catalog" section with the full 105-cog
table generated from the registry, grouped by category with practical
one-line descriptions (e.g. "Spots irregular heartbeats and abnormal
heart rhythms", "Detects walking problems and scores fall risk").
Links to https://seed.cognitum.one/store and the local appliance
/cogs page. Sits between the HF model section and How It Works.
Tests (7/7 pass):
first_call_hits_upstream_and_caches
ttl_expiry_triggers_refetch
force_refresh_bypasses_fresh_cache
stale_serve_on_upstream_failure_after_cached_success
no_cache_no_upstream_returns_error
upstream_invalid_json_is_treated_as_error
upstream_sha256_is_deterministic
Security highlights (full review in ADR-102 §"Security review"):
- The registry is metadata-only; per-cog binary signatures (ADR-100)
remain the trust root for installs. A compromised registry can
mislead a human reader but cannot ship malicious binaries.
- 8 MiB cap + 10s timeout + Option<Arc<...>> via Extension layer means
the endpoint can't be used to exhaust memory or pin tokio threads.
- Stale-on-error responses carry an explicit `stale: true` field so
upstream outages are visible to consumers rather than silently
masked.
- Endpoint sits behind the existing RUVIEW_API_TOKEN bearer gate when
set, otherwise unauthenticated (registry contents are public anyway).
* chore: refresh Cargo.lock for ureq/sha2/thiserror deps added by ADR-102
Issue #640 (PCK gap follow-up) was deleted upstream after the cog v0.0.1
PRs landed today. Re-opened as #645 with the same context plus the
new measured v0.0.1 numbers (PCK@20 3.0%, PCK@50 18.5%, MPJPE 0.093).
This patch updates the three files in main that still pointed at the
dead #640 to point at #645 instead — ADR-101, the cog README, and the
benchmark log.
Adds the x86_64-unknown-linux-gnu binary uploaded to
gs://cognitum-apps/cogs/x86_64/, signed with the same Ed25519
COGNITUM_OWNER_SIGNING_KEY as the arm release. Together with the
already-shipped arm artifact, the cog now ships natively for both
target architectures the Cognitum fleet supports.
x86_64 release:
sha256: a434739a24415b34e1aff50e5e1c3c32e568db96af473bbb3e5ecc9b95fe71fa
signature: pNNuxhgM18PztN8BSZdfw5oAShG2pV3na5T/q2QdlJWX/5FJgo4QTiUCbcTAxI2Uiva8VURSOlRzMU3xoQPqCQ==
size: 4,548,856 bytes
cold-start: 5.4 ms / invocation on ruvultra (RTX 5080, NVMe)
Reorganizes manifests under cog/artifacts/manifests/{arm,x86_64}/
so each arch carries its own manifest with the matching binary_sha256
and signature — same layout the release pipeline will use for the
future hailo8 / hailo10 variants.
Updates docs/benchmarks/pose-estimation-cog.md with the cross-arch
cold-start table:
Windows (x86_64) 76.2 ms
ruvultra (x86_64) 5.4 ms <- this release
Pi 5 (aarch64) 8.4 ms
Verified via anonymous GCS download + SHA round-trip — identical to
local build.
Hailo HEF remains the only pending arch, still blocked on Hailo SDK
provisioning to a self-hosted runner.
* feat(cog-pose-estimation): scaffold first Cog from this repo (ADR-100 + ADR-101)
Adds the foundation for the pose-estimation Cog that ships from this
repo into Cognitum V0 appliances. Companion ADR-225 + crate land in
cognitum-one/v0-appliance.
ADRs:
* ADR-100 formalises the Cognitum Cog packaging spec — on-device
layout under /var/lib/cognitum/apps/<id>/, manifest.json schema
(incl. new binary_sha256 + binary_signature fields), GCS hosting
convention, repo source layout, build pipeline, and the four-verb
runtime contract (version | manifest | health | run). Documents the
convention I reverse-engineered from inspecting installed cogs on a
live cognitum-v0 appliance — `anomaly-detect`, `presence`,
`seizure-detect`, etc.
* ADR-101 designs the pose-estimation Cog itself: where it sits in
the wifi-densepose pipeline (encoder init from
ruvnet/wifi-densepose-pretrained, 17-keypoint regression head),
what gets shipped per target arch (arm / x86_64 / hailo8 /
hailo10), acceptance gates (PCK@20 explicitly deferred to #640 —
this ADR ships the vehicle, not the accuracy).
Crate v2/crates/cog-pose-estimation/:
* Cargo.toml + workspace member declaration with a hailo feature gate
so the binary builds without the Hailo SDK in CI.
* main.rs implements the four-verb CLI exactly per ADR-100.
* config.rs / manifest.rs / publisher.rs / inference.rs / runtime.rs —
small modules, each <100 lines.
* publisher.rs emits ADR-100 structured JSON events.
* inference.rs is a stub that produces a centred-skeleton baseline
with confidence=0 (honest: no trained weights wired in yet).
* runtime.rs subscribes to /api/v1/sensing/latest, slides a
56*20 window, runs the engine, emits pose.frame events.
* cog/manifest.template.json + cog/config.schema.json define the
release artifact + runtime config schemas.
* cog/Makefile holds build / sign / upload targets.
* tests/smoke.rs covers manifest roundtrip + engine I/O surface.
Verified locally:
* cargo check -p cog-pose-estimation: clean.
* cargo test -p cog-pose-estimation: 4/4 pass.
* ./target/release/cog-pose-estimation {version,manifest,health}:
all emit the right contract output.
This commit contains scaffolding only; the actual trained weights and
Hailo HEF cross-compile come in follow-ups tracked in #640 and the
companion v0-appliance branch.
* feat(cog-pose-estimation): first measured run — Candle CUDA on RTX 5080
Trained pose_v1 on ruvultra (RTX 5080) via Candle 0.9 + cuda feature
against the same 1,077-sample paired session that produced 0%/0% PCK
in #640 with the pure-JS SPSA trainer. First real numbers:
PCK@20 = 3.0% (up from 0.0%)
PCK@50 = 18.5% (up from 0.0%)
MPJPE = 0.093 (down from 0.66, ~7x improvement)
400 epochs in 2.1 s wall time, full-batch, ~5 ms/epoch. Loss curve
0.181 -> 0.014 over the run, eval 0.010. Per-joint reveals the model
leans on right-side proximal joints (r_hip 77% PCK@50, r_knee 35%,
l_elbow 26%) — consistent with the camera framing in the source
recording. Distal joints (wrists, ankles) and face joints are still
near-random, consistent with the 56-subcarrier / 20-frame input not
carrying fine-grained spatial info at 1077 samples.
This commit:
* Adds v2/crates/cog-pose-estimation/cog/artifacts/{pose_v1.safetensors,
train_results.json} so the cog dir now contains a real reference
artifact, not just scaffold.
* Updates cog/README.md "Status" block with the measured numbers,
per-joint table, and an honest reading of where the model
succeeds vs where the data is the bottleneck.
* Adds docs/benchmarks/pose-estimation-cog.md as the canonical
benchmark log — append-only, one section per published run.
* Appends a "First measured run" section to ADR-101 referencing
the new benchmark file.
Still pending in the follow-up:
* Wire pose_v1.safetensors into src/inference.rs (replace stub).
* ONNX export (Candle lacks a writer — needs external conversion).
* Hailo HEF cross-compile + cluster deploy.
The data-bound gap to PCK@20 >= 35% is tracked in #640.
* feat(cog-pose-estimation): wire real weights — cog is no longer a stub
Replaces the centred-skeleton stub in src/inference.rs with a real
Candle-based loader that reads cog/artifacts/pose_v1.safetensors and
runs the trained Conv1d encoder + MLP pose head on every incoming CSI
window.
What changes:
* src/inference.rs: PoseNet mirrors the training script's architecture
exactly — Conv1d(56->64, k=3 d=1), Conv1d(64->128, k=3 d=2),
Conv1d(128->128, k=3 d=4), mean over time, Linear(128->256)+ReLU,
Linear(256->34)+sigmoid -> reshape [17, 2]. The InferenceEngine
searches a sensible candidate list for the weights file
(/var/lib/cognitum/apps/pose-estimation/, ./pose_v1.safetensors,
./cog/artifacts/, repo-root, v2/-relative) and falls back to the
stub when none are present so the cog still satisfies ADR-100.
* Cargo.toml: adds candle-core 0.9 + candle-nn 0.9 (no-default-features,
CPU build by default) + safetensors 0.4. New `cuda` feature opt-in
for GPU inference on hosts that have it. Drops the unused
wifi-densepose-train path dep from the default build path.
* src/main.rs + src/publisher.rs: health.ok event now carries
`backend` (candle-cuda | candle-cpu | stub) and the synthetic
output confidence, so operators can tell at a glance whether the
cog loaded its weights or fell back to the stub.
* tests/smoke.rs: adds `real_weights_load_when_available` which
asserts the loaded engine reports backend=candle-* and emits
non-zero confidence — exactly the signal that proves we're not
silently degrading to the stub.
Verified locally:
* `cargo check -p cog-pose-estimation --no-default-features` — clean
* `cargo test -p cog-pose-estimation --no-default-features` — 5/5 pass
* `./target/release/cog-pose-estimation health` emits:
{"event":"health.ok","fields":{"backend":"candle-cpu","cog":"pose-estimation","synthetic_output_confidence":0.185}}
— 0.185 is the published PCK@50 from cog/artifacts/train_results.json,
emitted by the real Candle inference path (would be 0.0 if it had
fallen back to the stub).
The cog now runs the trained pose_v1 model end-to-end. Accuracy is
still bounded by the underlying 1077-sample training data (PCK@20
3.0%, PCK@50 18.5% per docs/benchmarks/pose-estimation-cog.md) — that
gap is data-bound and tracked in #640. ONNX export + Hailo HEF
cross-compile remain follow-ups.
* docs(benchmarks): measure cog-pose-estimation cold-start latency
100 sequential `cog-pose-estimation health` invocations average 76.2 ms
each on a Windows x86_64 host using the `candle-cpu` backend. Each
invocation re-loads pose_v1.safetensors and runs one synthetic forward
pass, so this is the worst-case cold-start path. Long-running `run`
inference will be sub-millisecond per frame once the model is loaded.
Updates the benchmarks doc accordingly.
* feat(cog-pose-estimation): ONNX export — pose_v1.onnx + scripts/export-onnx.py
Adds the canonical ONNX artifact that unblocks downstream Hailo HEF
cross-compile + ONNX Runtime benchmarks. Generated on ruvultra (torch
2.12.0 + CUDA), 12,059 bytes, opset 18, dynamic batch axis.
* scripts/export-onnx.py: mirrors the Candle inference architecture in
PyTorch (Conv1d 56->64, 64->128, 128->128 + Linear 128->256->34), pure-
python safetensors loader (no extra pip dep), exports via
torch.onnx.export, then verifies via onnx.checker.check_model and
numerical parity against the torch reference.
* Verified parity vs torch: max |torch - onnx| = 8.94e-8 (1e-5
threshold). Effectively bit-perfect.
* v2/crates/cog-pose-estimation/cog/artifacts/pose_v1.onnx — the
artifact itself, 12 KB.
* docs/benchmarks/pose-estimation-cog.md — adds an ONNX export
section with the verification numbers.
Next: Hailo HEF cross-compile (still gated on Hailo SDK on a
self-hosted runner) and ONNX Runtime latency benchmarks on each
target arch.
* feat(cog-pose-estimation): release v0.0.1 — signed aarch64 binary on GCS
End-to-end deploy: cross-compiled to aarch64-unknown-linux-gnu on
ruvultra, ran via qemu-aarch64-static, then smoke-tested on a real
cognitum-v0 Pi 5. Signed with COGNITUM_OWNER_SIGNING_KEY (Ed25519)
and uploaded to gs://cognitum-apps/cogs/arm/.
Real-hardware results on cognitum-v0 (Pi 5):
health: backend=candle-cpu, confidence=0.185, real weights loaded
30x sequential `health`: 0.251 s total -> 8.4 ms / invocation (cold)
GCS release artifacts (publicly downloadable):
binary: 3,741,976 bytes
sha256 1e1a7d3dd01ca05d5bfc5dbb142a5941b7866ed9f3224a21edc04d3f09a99bf5
weights: 507,032 bytes
sha256 eb249b9a6b2e10130437a10976ed0230b0d085f86a0553d7226e1ae6eae4b9e5
signature (Ed25519, b64): LUN7xqLPYD3MFzm5dKB5MnYU0LvoRtek5ci5KiKPHBg+Xo6xuazwokn2Dw2JPMaLYJzmWn/SpT4djuR7hYvVDw==
Adds:
* v2/crates/cog-pose-estimation/cog/artifacts/manifest.json — the
release-pipeline-produced manifest with all fields filled in per
ADR-100, including arch, target_triple, signature, and a
build_metadata block carrying the validation PCK numbers.
* docs/benchmarks/pose-estimation-cog.md — new sections covering
the real Pi 5 smoke (8.4 ms cold-start) and the signed GCS
release artifacts.
Verified by downloading the binary anonymously from GCS and
re-computing the sha256 — matches the locally-computed sha exactly.
Signature decoded to the expected 64-byte Ed25519 length.
Closes the GCS-upload acceptance criterion from ADR-100; the only
pending work is Hailo HEF cross-compile (still SDK-gated) and an
x86_64 release alongside this arm release.
* docs(benchmarks): record live cognitum-v0 install + 5-sec smoke run
Adds the "Live appliance install" section documenting what happened
when the signed v0.0.1 binary + weights were installed under
/var/lib/cognitum/apps/pose-estimation/ on cognitum-v0 (the V0
cluster leader).
* Layout matches the existing anomaly-detect / presence / seizure-
detect cogs exactly — the Cogs dashboard at
http://cognitum-v0:9000/cogs auto-discovers entries.
* `cog-pose-estimation run` ran for 5 seconds in the background and
cleanly emitted run.started + structured WARN events for the
missing local sensing-server on :3000 (cognitum-v0's actual CSI
source is ruview-vitals-worker on :50054, not :3000). No crashes,
no NaN, no leaks.
* Wiring `sensing_url` to the appliance-native source is a separate
Day-2 integration task.
#613 fixed adaptive_classifier.rs:94 (the IQR sort) and called the audit
done, but the grep used `partial_cmp(b).unwrap()` as a literal and missed
seven additional production sites that use comparator variants:
adaptive_classifier.rs:205 AdaptiveModel::classify() argmax over softmax
probs — same per-frame hot path as #611.
NaN flows through normalise → logits → softmax
and still reaches this site even after the
IQR fix.
adaptive_classifier.rs:480 train() argmax (training accuracy loop)
adaptive_classifier.rs:500 train() per-class argmax
main.rs:2446, 2449 count_persons_mincut variance source/sink select
csi.rs:602, 605 count_persons_mincut variance source/sink select
(duplicate of main.rs logic in csi.rs)
For the variance-select sites, note that the *outer* `unwrap_or((0, &0))`
only catches an empty iterator — it cannot rescue a panic raised inside
the comparator. A single NaN in `variances[]` still aborts the process.
Same fix as #613: swap `.unwrap()` for `.unwrap_or(std::cmp::Ordering::Equal)`
inside the comparator closure. Pure behavioural change, no API surface.
Re-audit of the remaining `partial_cmp(...).unwrap()` matches in v2/:
they are all inside `#[cfg(test)]` / `#[test]` blocks (spectrogram.rs:269,
depth.rs:234, connectivity.rs:477, vital_signs.rs:737) where inputs are
controlled and panic-on-NaN is acceptable.
Integrating @schwarztim's PR #491 into main on their behalf — their fork has
fallen too far behind for a clean rebase (the PR's commit graph dropped
silently during `git rebase origin/main`), so applying as a merge from the
fork head to preserve the diff cleanly.
What this lands:
- `RollingP95` adaptive normaliser for the person-count feature scaling.
Streaming P95 over a 600-sample / ~30 s sliding window. Cold-start
(<60 samples) falls back to the legacy denominators (variance/300,
motion_band_power/250, spectral_power/500) so day-0 behaviour is
preserved on every deployment.
- `RuntimeConfig` struct + `load_runtime_config` / `save_runtime_config`
persisted to `data/config.json`. Exposes `dedup_factor` via REST so
multi-node deployments can tune cluster-deduplication without a rebuild,
including an auto-tune endpoint that derives optimal dedup from a known
person count (calibration mode).
- `compute_person_score()` now takes &AppStateInner alongside &FeatureInfo
so the adaptive denominators are reachable. All 3 call sites updated.
- New `AppStateInner` fields: `p95_variance`, `p95_motion_band_power`,
`p95_spectral_power`, `dedup_factor`, `data_dir`.
Closes#491. Directly addresses:
- #499 (double skeletons, multi-node) — the slot-clustering problem this
PR's adaptive normaliser was designed to fix
- #519 Bug 1 (ghost person detection on edge-tier 1 & 2 multi-node)
- #496 (person count over-reporting on single-room single-person)
Verified locally:
- cargo check -p wifi-densepose-sensing-server --no-default-features: 1.0s
- cargo test -p wifi-densepose-sensing-server --no-default-features --lib:
233/233 passed in 25.0s
Co-authored-by: @schwarztim
Co-Authored-By: claude-flow <ruv@ruv.net>
Reported by @ArnonEnbar with a complete reproduction.
broadcast_tick_task() re-emits the cached `latest_update` every tick so
pose WS clients keep getting data even when ESP32 pauses between
frames. The `source` field of that cached update was set to "esp32" at
the moment a fresh ESP32 frame was last decoded (main.rs:3885, :4136).
After the ESP32 loses power or network, no fresh frame is decoded —
the cached `latest_update` is still re-broadcast every tick with the
stale source: "esp32" baked in. UI's "Sensing" tab keeps showing
"LIVE — ESP32 HARDWARE Connected" with frozen vitals/features/
classification re-broadcast indefinitely. REST `/health` correctly
reports source: "esp32:offline" (via effective_source(), which checks
last_esp32_frame elapsed time against ESP32_OFFLINE_TIMEOUT=5s) — but
the WS broadcast path was the one consumer that didn't call it.
Fix: clone the cached update per tick, overwrite source with
s.effective_source(), then serialize and broadcast. UI now switches to
"esp32:offline" on the same 5s budget as the REST surface.
cargo build -p wifi-densepose-sensing-server --no-default-features:
17s, no errors (1 pre-existing unused-import warning unchanged).
Reported by @bannned-bit. Five endpoints in
v2/crates/wifi-densepose-sensing-server embedded user-controlled
identifiers in format!() paths with no sanitization:
recording.rs POST /api/v1/recording/start (session_name)
recording.rs GET /api/v1/recording/download/:id (id)
recording.rs DELETE /api/v1/recording/delete/:id (id)
model_manager.rs POST /api/v1/models/load (model_id)
training_api.rs load_recording_frames (dataset_ids[])
Each unauthenticated caller could:
- READ arbitrary files via ../../etc/passwd, ../../.env, etc.
- WRITE attacker-controlled JSONL via recording/start
- LOAD attacker-controlled .rvf model files
- DELETE arbitrary files the server process can touch
New `path_safety` module exports `safe_id(&str) -> Result<&str, PathSafetyError>`
that enforces the rejection envelope BEFORE any user input reaches a
format!() that builds a path:
- Allowed character set: [A-Za-z0-9._-]
- Reject leading '.' (rules out '.', '..', '.env', hidden files)
- Reject empty strings
- Reject anything > 64 bytes
- Reject all whitespace, path separators, null bytes, non-ASCII
Applied at all 5 sites. Errors return 400 Bad Request (download) /
status:"error" JSON (others) — not panics.
9 unit tests in path_safety::tests cover:
- accepts simple alphanumeric / hyphen / underscore / dot
- rejects empty, leading dot, path separators ('/', '\'),
null byte, whitespace, shell specials, non-ASCII (including
fullwidth slash U+FF0F), too-long, boundary at MAX_ID_LEN
test result: ok. 9 passed; 0 failed
cargo build -p wifi-densepose-sensing-server --no-default-features: 33s
Fix-marker RuView#615 in scripts/fix-markers.json prevents removing the
guard at any of the 5 call sites. CHANGELOG entry under [Unreleased] /
Security documents the patched endpoints and the rejection envelope.
Severity: critical per reporter — five remotely-reachable paths to read,
write, or delete arbitrary files. Hot per-request paths, not edge cases.
Reported by @bannned-bit. v2/crates/wifi-densepose-sensing-server/src/
adaptive_classifier.rs:94 did:
sorted.sort_by(|a, b| a.partial_cmp(b).unwrap());
f64::partial_cmp returns None on NaN, so `.unwrap()` panics. CSI data
from real ESP32 hardware can produce NaN (silent DSP div-by-zero,
empty buffer, etc.), and this code path runs on every frame in the
classify() hot path — a single NaN frame kills the entire sensing
server process.
Fix swaps for unwrap_or(Ordering::Equal), matching the pattern the
same file already uses at lines 149-150 and 155 (those sites were
already NaN-safe; this site was an oversight).
Scoped audit: greped the v2/ tree for `partial_cmp(b).unwrap()`. The
other 3 hits are in #[cfg(test)] blocks (spectrogram.rs:269,
depth.rs:234, connectivity.rs:477) where panic-on-NaN is acceptable
because test inputs are controlled. Only adaptive_classifier.rs:94
was a production-path crash.
Severity: critical per reporter — runtime panic on real-world data.
Patch: 1-line behavioural change + comment.
Each of these crates was a single-line doc-comment placeholder:
v2/crates/wifi-densepose-api/src/lib.rs: //! WiFi-DensePose REST API (stub)
v2/crates/wifi-densepose-db/src/lib.rs: //! WiFi-DensePose database layer (stub)
v2/crates/wifi-densepose-config/src/lib.rs: //! WiFi-DensePose configuration (stub)
with empty [dependencies] in their Cargo.toml and zero references from any
source file or Cargo.toml in the workspace (verified by `grep -rln
wifi-densepose-api/-db/-config` across `v2/`). They were reserved early for
an envisioned REST/database/config split that never materialised.
The functionality these would have provided is covered today by:
- REST/WS: wifi-densepose-sensing-server (Axum)
- Config: per-crate config + CLI args in sensing-server and desktop
- DB: no persistent state; system is real-time
Removal prevents `cargo` from listing dead crates, shipping empty published
artifacts to crates.io, or wasting reviewer attention. If any of these names
is needed in the future, reintroduce them with a real implementation.
Per the issue reporter (@bannned-bit / Matad0r) #578 explicitly listed
"OR be removed from workspace members until implementation starts" as an
acceptable resolution.
Updated:
- `v2/Cargo.toml`: drop the three members (with inline comment explaining why)
- `v2/Cargo.lock`: regenerated by cargo check
- `CLAUDE.md`: drop the three rows from the crate table and the publishing
order list
- `CHANGELOG.md`: add an `[Unreleased] / Removed` entry
Verified:
- `cd v2 && cargo check --workspace --no-default-features` -> finished
in 48s, no errors (warnings unchanged)
* v2: pin Rust 1.89 for sensing-server dependency chain
ruvector-core 2.0.5, hnsw_rs 0.3.4, and mmap-rs 0.7 require newer Cargo/rustc
than 1.82 (edition2024 manifest, is_multiple_of, stable avx512f target_feature
on x86_64). Add v2/rust-toolchain.toml so cargo build -p
wifi-densepose-sensing-server picks a compatible toolchain.
Signed-off-by: Chaitanya Tata <chaitanya@dotstarconsulting.com>
Co-authored-by: Cursor <cursoragent@cursor.com>
* sensing-server: default UI path for cwd v2/ and coalesce fallbacks
The previous default ../../ui resolves to a non-existent directory when
the binary is run from v2/ (common), so /ui/* returned 404 and the
dashboard appeared broken. Default to ../ui and try ../ui, ./ui,
../../ui when the configured path is missing.
Signed-off-by: Chaitanya Tata <chaitanya@dotstarconsulting.com>
Co-authored-by: Cursor <cursoragent@cursor.com>
---------
Signed-off-by: Chaitanya Tata <chaitanya@dotstarconsulting.com>
Co-authored-by: Cursor <cursoragent@cursor.com>