Commit Graph

89 Commits

Author SHA1 Message Date
ruv 9c49ff1a38 feat(adr-110): fleet cardinality gauge wifi_densepose_mesh_node_total
Iter 37 — adds a fleet-summary gauge to the iter-36 Prometheus
exposition. Ops dashboards now answer "how many leaders / followers
/ no-sync nodes are there right now" in one scrape, without having
to scrape every per-node series and aggregate client-side.

  # HELP wifi_densepose_mesh_node_total Per-state node count across the fleet
  # TYPE wifi_densepose_mesh_node_total gauge
  wifi_densepose_mesh_node_total{state="leader"}   1
  wifi_densepose_mesh_node_total{state="follower"} 2
  wifi_densepose_mesh_node_total{state="no_sync"}  0

  - leader / follower split derived from snapshot.is_leader
  - no_sync = total_nodes_in_state - nodes_with_snapshot
    (so a node that has sent CSI frames but never a sync packet
     shows up here, which is what an operator wants to alert on)

Implementation factored as a free function `fleet_role_counts` so the
math is testable without spinning up the axum handler. Same pattern
iter 18 (update_csi_fps_ema) and iter 30 (sync_snapshot) used.

Test added (9/9 sync_snapshot_helper_tests now green):
  fleet_role_counts_classifies_correctly
    Three cases:
      - empty fleet → (0, 0)
      - 1 leader + 2 followers → (1, 2)
      - all-leaders edge case → (2, 0) (election prevents this in
        practice but the gauge math must still be consistent)

Useful Grafana queries this unlocks:
  - sum(wifi_densepose_mesh_node_total{state="follower"})
    → total reachable follower count
  - wifi_densepose_mesh_node_total{state="no_sync"} > 0
    → alert when any node has dropped off the mesh

Co-Authored-By: claude-flow <ruv@ruv.net>
2026-05-23 15:08:16 -04:00
ruv 74eb09f604 feat(adr-110): Prometheus exposition endpoint /api/v1/mesh/metrics
Iter 36 — Grafana / Home Assistant Prometheus integration / Cognitum
Seed observability stack can now scrape mesh state directly with no
JSON-to-metric translation layer.

Endpoint: GET /api/v1/mesh/metrics → text/plain (Prometheus exposition
format v0.0.4). Eight gauges, one per NodeSyncSnapshot field, labeled
by node:

  wifi_densepose_mesh_offset_us{node="N"}        <signed-int>
  wifi_densepose_mesh_is_leader{node="N"}        0|1
  wifi_densepose_mesh_is_valid{node="N"}         0|1
  wifi_densepose_mesh_smoothed{node="N"}         0|1
  wifi_densepose_mesh_sequence{node="N"}         <u32>
  wifi_densepose_mesh_csi_fps{node="N"}          <float>
  wifi_densepose_mesh_csi_fps_samples{node="N"}  <u32>
  wifi_densepose_mesh_staleness_ms{node="N"}     <u64>

Each metric carries the standard `# HELP` + `# TYPE` headers before
its series block, exactly the format Prometheus + most scrape-format
implementations expect.

Implementation reuses iter-30's `NodeState::sync_snapshot()` as the
single source of truth — same data the JSON endpoints emit, just
text-formatted with `{node=...}` labels. Nodes without a fresh sync
are absent (Prometheus handles missing series natively).

Test added (8/8 sync_snapshot_helper_tests now green):
  bool_metric_returns_zero_or_one_as_text
    Pins the Prometheus convention that boolean gauges emit "0" or "1"
    literally, never "false"/"true" — if anyone refactors the helper
    to format!("{b}"), Prometheus would 400-reject the scrape; this
    test catches that drift before production.

User-guide REST table updated with the new endpoint.

Grafana / HA scrape config:
  - job_name: wifi-densepose-mesh
    scrape_interval: 5s
    metrics_path: /api/v1/mesh/metrics
    static_configs:
      - targets: ['localhost:3000']

Co-Authored-By: claude-flow <ruv@ruv.net>
2026-05-23 15:03:51 -04:00
ruv 883765150c chore(sensing-server): drop unused tracker_bridge imports
Iter 35 — every cargo check / cargo test since iter 15 has emitted the
same warning:

  warning: unused imports: `KeypointState`, `PoseTrack`, and `self`
   --> crates/wifi-densepose-sensing-server/src/tracker_bridge.rs:10

The three unused names date from before the bridge was refactored
to use the `pose_tracker::PoseTracker` direct import on line 12.
Removing them clears the noise without changing any behavior — the
file's actual uses (`TrackLifecycleState`, `TrackId`, `NUM_KEYPOINTS`)
stay imported via the narrowed `use { ... }` list.

After this commit `cargo check -p wifi-densepose-sensing-server` shows
only the pre-existing `rvf_container.rs:128 associated function 'new'
is never used` warning, which is unrelated to ADR-110 and out of scope
for this loop.

Co-Authored-By: claude-flow <ruv@ruv.net>
2026-05-23 14:58:41 -04:00
ruv f6a85fe7db feat(adr-110): NodeSyncSnapshot.staleness_ms — sync age in milliseconds
Iter 34 — adds an optional `staleness_ms` field to the iter-23
NodeSyncSnapshot that exposes (Instant::now() - latest_sync_at).
Dashboards / Prometheus exporters / UI badges can now decay sync
freshness without re-deriving it from latest_sync_at on the host.

Wire compatibility: new field is `#[serde(skip_serializing_if =
"Option::is_none")]` so pre-iter-34 clients that strict-parse via
serde + deny_unknown_fields are unaffected (default serde_json
strategy is to ignore unknown fields anyway).

Sensing-server changes:
  + NodeSyncSnapshot.staleness_ms: Option<u64>
  + sync_snapshot() populates it via latest_sync_at.elapsed().as_millis()
  + iter-24 serialization tests now check 8 contract fields, not 7
  + new test `snapshot_staleness_ms_tracks_apply_time` pins
    latest_sync_at to a past Instant and asserts the snapshot reports
    ~750 ms staleness with ±500 ms tolerance for scheduler delay

User-guide updates:
  + REST/WebSocket field table grows a `staleness_ms` row with the
    UI-rendering thresholds (fade at 5 s, drop at 9 s to match the
    firmware's VALID_WINDOW_MS-derived gate).

Tests passing:
  sync_snapshot_helper_tests:           7/7
  node_sync_snapshot_serialization_tests: 3/3 (8-field assertion green)

Co-Authored-By: claude-flow <ruv@ruv.net>
2026-05-23 14:54:21 -04:00
ruv bea7edee1f test(adr-110): lock the 9-second staleness gate on mesh_aligned_us_for_csi_frame
Iter 33 — closes a real test-coverage gap. The iter 17 staleness gate
(returns None when latest_sync_at is older than 9 s = 3 × the firmware's
VALID_WINDOW_MS) was shipped but never directly tested. A future
careless edit changing `from_secs(9)` to e.g. `from_secs(90)` would
silently break ADR-029/030 multistatic fusion freshness guarantees.

Test (3 assertions, no sleep — uses `Instant::checked_sub` to set
latest_sync_at to past values directly):

  * 1  s old   → Some (fresh)
  * 8  s old   → Some (just inside the gate)
  * 10 s old   → None (just outside the gate)

If anyone widens or narrows the gate, exactly one of these assertions
fires and points at the off-by-one. Total time for the test < 1 ms.

sync_snapshot_helper_tests: 6/6 green.

Branch-coord clean — main.rs only.

Co-Authored-By: claude-flow <ruv@ruv.net>
2026-05-23 14:48:22 -04:00
ruv 8805c8ec0b test+refactor(adr-110): NodeState::apply_sync_packet + 2 tests for the receive-side dispatch
Iter 32 — completes the helper-extraction discipline started in iter 30.
The iter 15 inline `ns.latest_sync = Some(sync); ns.latest_sync_at = ...`
was the LAST untested receive-side mutation; now it's a named method
with 2 tests covering its full state-transition surface.

Refactor:
  Add `NodeState::apply_sync_packet(pkt, now)` taking an Instant so
  the test can pass deterministic timing.
  udp_receiver_task now calls the method instead of touching the
  fields inline — one less place to break the staleness gate.

Tests (2 new — sync_snapshot_helper_tests module now at 5 tests):

  apply_sync_packet_populates_a_fresh_node
    Mirrors udp_receiver_task's first-packet-from-unknown-node path:
    asserts latest_sync goes from None → Some, latest_sync_at matches
    the passed Instant exactly (no clock skew from real Instant::now()),
    and sync_snapshot() now returns Some (REST 200 OK path lit up).

  apply_sync_packet_overwrites_older_data
    Subsequent packets must replace, not accumulate. Asserts sequence,
    local_us advance, and the staleness clock resets. This is what
    keeps the §A0.10-smoothed offset tracking the latest beacon rather
    than drifting with stale state.

cargo test sync_snapshot_helper → 5/5 green.

Branch-coord clean — no Cargo.toml / cli.rs touched.

Co-Authored-By: claude-flow <ruv@ruv.net>
2026-05-23 14:44:25 -04:00
ruv a07deb9180 test+refactor(adr-110): NodeState::sync_snapshot + 3 helper tests, dedupe 4 call sites
Iter 30 — defends the iter 29 REST endpoints + iter 23 WebSocket
broadcast with tests, AND deduplicates the four call sites that all
built the same NodeSyncSnapshot inline.

Refactor:
  Add `NodeState::sync_snapshot() -> Option<NodeSyncSnapshot>` as the
  single source of truth. All four call sites simplified:
    1. node_sync_endpoint (REST /api/v1/nodes/:id/sync) — 12 → 5 lines
    2. mesh_endpoint (REST /api/v1/mesh)                — 11 → 3 lines
    3. WebSocket vitals-only NodeInfo (line 4284)        — 9  → 1 line
    4. WebSocket CSI-frame NodeInfo (line 4617)          — 9  → 1 line
  Net: -35 lines, single point of contact for any future schema change.

Tests (3 new, all green; brings binary suite to 95+):
  fresh_node_with_no_sync_returns_none
    Mirrors REST 404 "no_sync" + WebSocket sync omission paths.
  node_with_latest_sync_produces_correct_snapshot
    Mirrors REST 200 OK + WebSocket sync field paths.
    Asserts §A0.10's measured 1_163_565 µs offset survives the helper.
  snapshot_reflects_leader_state
    Leader-case shape: is_leader=true, offset≈0 (–7 µs call-stack).

These tests cover BOTH REST routes and BOTH WebSocket NodeInfo sites
through the shared helper — one test per behavioral path, no axum
state plumbing required. cargo check -p ...sensing-server → green.

Co-Authored-By: claude-flow <ruv@ruv.net>
2026-05-23 14:36:54 -04:00
ruv c6a0d5dbf5 feat(adr-110): REST endpoints /api/v1/nodes/:id/sync and /api/v1/mesh
Iter 29 — extends the iter 23 WebSocket NodeSyncSnapshot publication
with an HTTP surface so non-streaming clients (curl scripts, Home
Assistant REST sensors, Prometheus exporters, automation rule probes)
can poll mesh state without holding a WebSocket connection.

  GET /api/v1/nodes/:id/sync
    200 → Json(NodeSyncSnapshot) when latest_sync is present
    404 → {"error": "unknown_node" | "no_sync", "node_id": N}
           — "no_sync" includes a `hint` pointing operators at the
             "no mesh peer or not v0.6.9+" diagnostic.

  GET /api/v1/mesh
    200 → { "nodes": { "<id>": NodeSyncSnapshot, ... }, "total": N }
    Nodes without a recent sync are omitted; an empty `nodes` object
    means no mesh peers reachable.

Both handlers reuse the iter 23 NodeSyncSnapshot struct (same JSON
shape as the WebSocket broadcast — clients get one schema, two
delivery modes). The Path<u8> extractor returns 404 on overflow
automatically (axum), so /api/v1/nodes/256/sync gives a clean error.

cargo check -p wifi-densepose-sensing-server --no-default-features → green.

Curl quick-start (added to operator playbook material in a follow-up):
  curl http://localhost:3000/api/v1/mesh                  # full fleet
  curl http://localhost:3000/api/v1/nodes/9/sync          # one node

Co-Authored-By: claude-flow <ruv@ruv.net>
2026-05-23 14:30:14 -04:00
ruv e764504dc5 test(adr-110): lock NodeSyncSnapshot JSON wire contract (iter 24)
Iter 24 — ultra-opt for public-API stability. Iter 23 added a new JSON
field that UI clients (viz.html, future Tauri desktop, automation) now
depend on; this iter locks its exact shape so any future rename /
removal fails a named test instead of silently breaking consumers.

New module `node_sync_snapshot_serialization_tests` (3 tests, all green):

  * sync_present_serializes_all_seven_fields
      Builds NodeInfo with Some(sample_sync), serializes to serde_json::Value,
      asserts all 7 documented field names exist (offset_us, is_leader,
      is_valid, smoothed, sequence, csi_fps_ema, csi_fps_samples) and
      spot-checks numeric values.

  * sync_absent_omits_the_key_entirely
      Builds NodeInfo with sync = None, asserts the `sync` JSON key is
      DROPPED entirely (not emitted as `"sync": null`). This is the
      backwards-compat contract that lets pre-iter-23 UI clients ignore
      mesh-aware nodes silently.

  * sync_round_trips_through_serde
      to_string / from_str round-trip on a populated NodeInfo recovers
      every field of the sync sub-object byte-for-byte (modulo float tol).

Test infrastructure: pure pure serde_json — no network, no fixtures,
no I/O. Adds 92 lines, 0 runtime allocs in the steady path.

Branch-coord clean (no Cargo.toml or cli.rs touched).

Co-Authored-By: claude-flow <ruv@ruv.net>
2026-05-23 14:05:59 -04:00
ruv 41f28ae85e feat(adr-110): surface NodeSyncSnapshot in WebSocket sensing_update JSON
Iter 23 — converts the iter 1-21 firmware-side mesh substrate from
"works internally" to "visible to UI clients". WebSocket sensing_update
broadcasts now carry a per-node optional `sync` object exposing the
mesh state the iter 15-22 wire and storage capture:

  {
    "type": "sensing_update",
    ...
    "nodes": [
      {
        "node_id": 9,
        ...
        "sync": {
          "offset_us":      1163565,    // §A0.10's measured 1.16 s
          "is_leader":      false,
          "is_valid":       true,
          "smoothed":       true,       // EMA seeded
          "sequence":       20,         // §A0.12 pairing key
          "csi_fps_ema":    10.0,       // iter 18 measured rate
          "csi_fps_samples": 47         // ≥5 means trust csi_fps_ema
        }
      }
    ],
    ...
  }

`sync` is `Option<NodeSyncSnapshot>` with `#[serde(skip_serializing_if =
"Option::is_none")]` so non-mesh paths (multi-BSSID scan / synthetic RSSI
/ simulation) emit no `sync` key — preserves backwards compatibility
with existing UI clients.

Plumbed into all four NodeInfo construction sites:
  1. multi-BSSID scan path                     → sync: None
  2. synthetic-RSSI fallback                   → sync: None
  3. simulated frame path                      → sync: None
  4. real ESP32 CSI path (line 4528)           → sync: snapshot from NodeState
  5. ADR-039 vitals-only path (line 4207)      → sync: snapshot from NodeState

cargo check -p wifi-densepose-sensing-server --no-default-features → green.

UI clients (viz.html, future Tauri desktop, downstream automation) can
now render leader/follower badges, jitter histograms, and the §A0.10
clock-skew trajectory without any further firmware or aggregator work.

Co-Authored-By: claude-flow <ruv@ruv.net>
2026-05-23 14:03:22 -04:00
ruv 82be960de5 test(adr-110): cross-language wire-format conformance gate
Iter 21 — ultra-opt for protocol correctness across the two production
decoders. Pin the same 32-byte canonical hex in both Python and Rust
tests; if either decoder drifts from the wire, ONE of the tests starts
failing — and it's clear which side moved.

Canonical packet: COM9 sync-pkt #1 from §A0.12 live capture, expressed
as exact little-endian bytes:

  10a111c5 09 01 06 00                      magic + node + ver + flags + rsvd
  f26db70100000000                          local_us = 28_798_450
  c5aca50100000000                          epoch_us = 27_634_885
  1400000000000000                          sequence = 20 + reserved

Python test:
  archive/v1/tests/unit/test_esp32_binary_parser.py::TestSyncPacketParser
  ::test_canonical_wire_bytes_match_rust_decoder
  — decodes the pinned hex, asserts every field including the §A0.10
    1,163,565 µs offset.

Rust test:
  v2/crates/wifi-densepose-hardware/src/sync_packet.rs::tests
  ::canonical_wire_bytes_match_python_decoder
  — decodes the same bytes, asserts the same fields, then re-encodes
    via to_bytes() and asserts the round-trip produces the EXACT same
    32 bytes. So this also catches drift in the Rust encoder.

Test counts after this iter:
  Rust sync_packet: 15/15 green (was 14)
  Python SyncPacketParser: 7/7 green (was 6)

Branch contract: if a future PR changes the firmware wire format, BOTH
tests must be updated atomically with the new canonical hex. CI will
gate this naturally.

Co-Authored-By: claude-flow <ruv@ruv.net>
2026-05-23 13:52:44 -04:00
ruv 40bd6b81b8 test(adr-110): end-to-end sync decode → frame mesh recovery integration test
Iter 20 — defensive ultra-opt: one test that exercises the entire
iter 14→17 chain in a single assertion, so any future refactor that
breaks the contract surfaces as a single, named regression instead of
14 unit-test diffs to triangulate.

  1. firmware emits sync packet (bytes built here as a stand-in)
  2. host decodes via SyncPacket::from_bytes — assert round-trip
  3. a CSI frame arrives 100 sequences later (≈ 5 s @ 20 fps)
  4. mesh_aligned_us_for_sequence recovers the mesh timestamp
  5. cross-check: same value via raw apply_to_local

Asserts mesh_us == sync.epoch_us + 5_000_000 µs exactly, plus both
paths (sequence-interpolation + direct local→mesh) agree byte-for-byte.

Result: 14/14 sync_packet tests pass, full wifi-densepose-hardware
crate at 136/136 (no regression from iter 1-19). Contract for any
ADR-029/030 multistatic fusion consumer is now defended by a test that
fails loud if either piece of the chain drifts.

Co-Authored-By: claude-flow <ruv@ruv.net>
2026-05-23 13:47:14 -04:00
ruv 898a2d7d9f feat(adr-110): wire observe_csi_frame_arrival into CSI receive path
Iter 19 — without this call, iter 18's EMA fps tracking was dead code
because csi_fps_samples stayed 0 forever and mesh_aligned_us_for_csi_frame
always fell back to the 20 Hz constant.

In udp_receiver_task's parse_esp32_frame branch, replace the bare
last_frame_time assignment with NodeState::observe_csi_frame_arrival,
which computes dt vs last_frame_time, feeds update_csi_fps_ema (α=1/8),
bumps csi_fps_samples, and sets last_frame_time as a side effect (same
value the bare assignment did).

Effect: after ~5 CSI frames arrive from any node, mesh_aligned_us_for_csi_frame
returns interpolated timestamps using the node's actually-observed frame
rate instead of the 20 Hz default. Real bench rate was ~10 fps, so this
halves the per-frame timestamp error in §A0.12-style multistatic alignment.

cargo check -p wifi-densepose-sensing-server --no-default-features → green.

Co-Authored-By: claude-flow <ruv@ruv.net>
2026-05-23 13:44:59 -04:00
ruv 0dfa3d46aa feat(adr-115): P1 — Cargo features + CLI flags for MQTT/Matter/Semantic
Adds `mqtt` and `matter` Cargo features (default off) plus 20+ new CLI
flags wired through cli.rs per ADR-115 §3.8 / §3.10 / §3.11 / §3.12:

- MQTT (HA-DISCO): --mqtt, --mqtt-host/--mqtt-port/--mqtt-username/
  --mqtt-password-env/--mqtt-client-id/--mqtt-prefix, TLS controls
  (--mqtt-tls/--mqtt-ca-file/--mqtt-client-cert/--mqtt-client-key),
  rate controls (--mqtt-refresh-secs, --mqtt-rate-{vitals,motion,count,
  rssi,pose}, --mqtt-publish-pose).
- Privacy (ADR-106): --privacy-mode strips HR/BR/pose pre-publish.
- Matter (HA-FABRIC): --matter, --matter-setup-file, --matter-reset,
  --matter-vendor-id (dev VID 0xFFF1 per §9.9), --matter-product-id.
- Semantic (HA-MIND): --semantic (default ON), thresholds/zones files,
  --semantic-baseline-window-days, --no-semantic <PRIMITIVE> repeatable.

rumqttc 0.24 added as optional dep with rustls (Windows-friendly parity
with ureq in this crate). matter-rs deferred to P7 spike per §9.10.

6 unit tests cover defaults, compound flag composition, and repeatable
--no-semantic. Tests pass:

  cargo test -p wifi-densepose-sensing-server --no-default-features cli::tests
  6 passed; 0 failed.

Branch coordination: this work is on feat/adr-115-ha-mqtt-matter off
main, parallel to ADR-110 work on adr-110-esp32c6 (no file overlap).

Refs #776 (ADR-115 implementation tracking issue).

Co-Authored-By: claude-flow <ruv@ruv.net>
2026-05-23 13:41:38 -04:00
ruv ca2059b07f fix(branch-coord): revert ADR-115 Cargo.toml/cli.rs that slipped into iter 18
Iter 18's commit 2997165bc accidentally absorbed the ADR-115 agent's
uncommitted MQTT/Matter additions (Cargo.toml `rumqttc` dep + [features]
block, cli.rs --mqtt CLI flags) into the adr-110-esp32c6 branch during
the cross-branch checkout described in that commit's message.

The actual iter 18 EMA work in main.rs is correct and stays; this commit
restores Cargo.toml + cli.rs to their HEAD~1 (iter 17) state so the
ADR-115 agent's stashed `stash@adr115-pending-work` can be popped cleanly
back onto their feat/adr-115-ha-mqtt-matter branch without colliding.

Net effect on adr-110-esp32c6:
  - main.rs iter 18 EMA: kept ✓
  - 4 fps_ema_tests: still green
  - Cargo.toml: back to iter-17 state (wifi-densepose-hardware dep only)
  - cli.rs: back to iter-17 state (no MQTT flags)
  - Cargo.lock: synced to match

The ADR-115 agent can pop their stash on feat/adr-115-ha-mqtt-matter
and resume without merging an unrelated branch's ADR-110 work.

Co-Authored-By: claude-flow <ruv@ruv.net>
2026-05-23 13:31:58 -04:00
ruv 2997165bc1 feat(adr-110): per-node measured CSI fps + EMA for mesh-time interpolation
Iter 18 (after recovery from a cross-branch slip — see commit-history
context below). Replaces the hardcoded 20 Hz CSI_FPS_HZ constant in
mesh_aligned_us_for_csi_frame with a per-node EMA of observed
inter-frame intervals, falling back to 20 Hz until ≥5 samples land.

Real bench data (§A0.12 captures) showed the actual CSI rate at ~10 fps
because the firmware's CSI_MIN_SEND_INTERVAL_US gate combined with
ruv.net's traffic level paces it to that. Using 20 Hz against actual
10 fps inflates Δus 2× and shifts the recovered mesh timestamp by up
to the inter-sync interval / 2 = ~1 s. Measured fps fixes that.

State on NodeState:
  csi_fps_ema:     f64    — EMA (seeded at 20.0)
  csi_fps_samples: u32    — counts inter-frame deltas observed

API:
  NodeState::observe_csi_frame_arrival(now)  — call once per CSI frame
                                               from udp_receiver_task
  update_csi_fps_ema(prev_fps, dt_sec) -> Option<f64>  — free fn,
                                                          testable

mesh_aligned_us_for_csi_frame now uses the measured fps when samples ≥ 5,
falls back to 20 Hz otherwise.

4 unit tests (fps_ema_tests module, all passing on the binary):
  * steady_10hz_converges_toward_10  — 40 samples at 100 ms converge to ±0.1 Hz
  * steady_20hz_stays_near_20        — 20 samples at 50 ms stay within 0.05 Hz
  * nonpositive_dt_rejected          — dt ≤ 0 returns None
  * long_gap_rejected_as_implausible — dt > 1 s rejected (likely a dropout)

Branch-coordination note: this iter's working tree was briefly applied
to feat/adr-115-ha-mqtt-matter by a `git checkout` between iter 17 and
iter 18. Stashed the ADR-115 agent's MQTT/Matter Cargo.toml work
(`stash@adr115-pending-work`) before switching back here. No code lost.

Co-Authored-By: claude-flow <ruv@ruv.net>
2026-05-23 13:30:02 -04:00
ruv 0c311a202b feat(adr-110): SyncPacket::mesh_aligned_us_for_sequence (interpolation) + NodeState hook
Iter 17 — closes the per-frame mesh-time loop for ADR-018 CSI frames
that carry no per-frame local_us field (the v1 wire format reserves no
slot — see WITNESS-LOG-110 §A0.11).

Math: pair the frame's sequence number against the sync packet's
sequence high-water + an assumed CSI frame rate. Δframes × 1/fps
estimates the node-local delta from the sync, then apply_to_local
recovers the mesh epoch.

  SyncPacket::mesh_aligned_us_for_sequence(frame_seq: u32, fps_hz: f64) -> u64

3 new unit tests (13 total in sync_packet::tests, all green):
  * mesh_aligned_for_sequence_identity_at_sync_point — at sync.sequence
    returns sync.epoch_us exactly
  * mesh_aligned_for_sequence_extrapolates_forward — 20 frames @ 20 fps
    extrapolates by exactly 1 s
  * mesh_aligned_for_sequence_handles_seq_wraparound — u32 sequence
    wrap doesn't jump backward by 2^32 (wrapping_sub guards it)

NodeState hook:
  NodeState::mesh_aligned_us_for_csi_frame(frame_sequence: u32) -> Option<u64>
    Wraps the SyncPacket method, defaults fps_hz=20.0 (matches the
    firmware's CSI_MIN_SEND_INTERVAL_US-implied ceiling), enforces the
    same 9 s staleness gate as mesh_aligned_us.

cargo check -p wifi-densepose-sensing-server --no-default-features → green.
cargo test -p wifi-densepose-hardware sync_packet → 13/13, 122 filtered.

Downstream ADR-029/030 multistatic fusion code can now do:
  if frame.adr018_flags.ieee802154_sync_valid {
      if let Some(mesh_us) = ns.mesh_aligned_us_for_csi_frame(frame.sequence) {
          // pair this frame with frames from sibling nodes by mesh_us
      }
  }

Co-Authored-By: claude-flow <ruv@ruv.net>
2026-05-23 13:19:06 -04:00
ruv df95360e52 feat(adr-110 P10): apply_to_local + NodeState::mesh_aligned_us + full ADR rewrite
Iter 16 closes the math loop and updates ADR-110 to reflect the full
P1-P10 sprint outcome (per user request).

Code (the math layer that converts the iter 15 stored sync into a
per-frame mesh-aligned timestamp):

  wifi-densepose-hardware:
    SyncPacket::apply_to_local(local_at_frame_us: u64) -> u64
      Pure integer math: offset = epoch - local; mesh = local_at_frame + offset.
      3 new unit tests (10 total, all green):
      - apply_to_local_recovers_packet_epoch (identity at the packet's local_us)
      - apply_to_local_preserves_inter_frame_delta (Δlocal == Δmesh)
      - apply_to_local_on_leader_is_near_identity (leader offset ≈ 0)

  wifi-densepose-sensing-server:
    NodeState::mesh_aligned_us(local_at_frame_us: u64) -> Option<u64>
      Returns the recovered mesh timestamp using the most-recent sync
      packet, or None if no sync seen or last one older than 9 s
      (3× firmware VALID_WINDOW_MS = 9 s staleness gate).
      cargo check -p wifi-densepose-sensing-server --no-default-features
        → green

ADR-110 substantial rewrite (per user "update adr 110 with details"):

  - Status line: P1-P10 complete, firmware-side substrate closed at v0.7.0.
  - Front matter now lists all 4 firmware releases + witness link.
  - Phase table grows a P10 row capturing the v0.6.8 / v0.6.9 / v0.7.0
    arc (EMA smoother + sync packet + bit-4 wire-fix + host crates).
  - New §4.1 — /loop 5m SOTA sprint summary table (iters 1-16, 4 releases,
    17 commits, 13 unit tests, what shipped each iter).
  - New §4.2 — measured numbers table with 99.56% RX, 104.1 µs smoothed
    stdev, 3.95x suppression, 1.4 ppm crystal skew, etc — every cell
    backed by a witness §A0.x entry and a preserved bench log.
  - New §4.3 — host-side production surface listing (sync_packet.rs +
    sensing-server NodeState + Python parser, with file paths).
  - §5 open question on 802.15.4 channel resolved (Kconfig, default ch26
    not ch15, with the witness §D1 rationale).
  - New §6 — explicit scope of what's outside this ADR (multistatic fusion
    math in ADR-029/030, hardware-gated measurements needing INA / 11ax AP,
    IDF upstream fixes pending).

Co-Authored-By: claude-flow <ruv@ruv.net>
2026-05-23 13:16:11 -04:00
ruv 23fd8ac371 feat(sensing-server): consume ADR-110 §A0.12 sync packets, store per-node
Iter 15 — converts the iter 14 SyncPacket decoder from "shipped" to
"consumed" by wiring it into the sensing-server UDP receive loop.

Wiring:
- Cargo.toml gains wifi-densepose-hardware = path = "../wifi-densepose-hardware"
  to pull in the SyncPacket decoder + SYNC_PACKET_MAGIC dispatch constant.
- NodeState gains two new fields:
    latest_sync:    Option<SyncPacket>           — the parsed packet
    latest_sync_at: Option<std::time::Instant>   — staleness clock
- udp_receiver_task now magic-dispatches every incoming datagram against
  SYNC_PACKET_MAGIC (0xC511A110) before falling through to the existing
  ADR-039 vitals / ADR-040 WASM / ADR-018 CSI parsers. Same Option-returning
  pattern as the other parsers, so future packet types slot in cleanly.

When a sync packet arrives:
  * write-lock state, lookup-or-create NodeState by node_id
  * stash the SyncPacket + Instant::now() on the node
  * debug-log node, leader/valid/smoothed flags, sequence, offset_us
  * continue (don't fall through — we know it's not a CSI frame)

Downstream multistatic CSI fusion now has a documented landing pad: any
CSI frame with byte 19 bit 4 set looks up the matching NodeState, applies
ns.latest_sync.epoch_us + (now_local - ns.latest_sync.local_us) to get a
mesh-aligned timestamp. Implementation of that fusion math is the next
ADR-029/030 layer (wifi-densepose-signal).

Verification:
- cargo check -p wifi-densepose-sensing-server --no-default-features → green
- cargo test -p wifi-densepose-hardware sync_packet → 7/7 pass, 122 filtered
- Zero behavioral change for nodes that don't emit sync packets — the
  dispatch only fires on magic match.

Co-Authored-By: claude-flow <ruv@ruv.net>
2026-05-23 13:11:35 -04:00
ruv d72944f887 feat(hardware): Rust SyncPacket decoder + 7 unit tests (ADR-110 §A0.12)
Iter 14 — moves the v0.7.0 Python stub into the Rust production tree
so the sensing-server can decode incoming UDP datagrams by leading
magic and apply mesh-aligned timestamps to in-flight CSI frames.

Module: v2/crates/wifi-densepose-hardware/src/sync_packet.rs
Public surface (re-exported from the crate root):
  - SyncPacket — 32-byte decoded packet
  - SyncPacketFlags — bit0=leader, bit1=valid, bit2=smoothed
  - SYNC_PACKET_MAGIC = 0xC511A110, SYNC_PACKET_SIZE = 32

Tests (all 7 passing, plus 122 existing hardware-crate tests still pass):
  * follower_typical_packet_roundtrips — reproduces COM9 sync-pkt #1
    from §A0.12, including the 1,163,565 µs offset §A0.10 measured
  * leader_packet_has_local_close_to_epoch — COM12 leader case
    (flags=0x03, epoch ≈ local, offset = -7 µs call-stack only)
  * magic_mismatch_is_typed_error
  * short_packet_is_typed_error
  * all_flag_combinations_roundtrip — every (leader,valid,smoothed) triple
  * sync_and_csi_magics_differ — host can dispatch by leading u32
  * wire_size_constant_is_correct

Uses the existing ParseError variants (InvalidMagic, InsufficientData) so
the sensing-server's dispatch code can treat sync-packet decode failures
the same way it treats CSI frame decode failures.

Co-Authored-By: claude-flow <ruv@ruv.net>
2026-05-23 13:06:08 -04:00
ruv 3959fabf31 feat(rust): host-side decode for ADR-018 byte 18-19 (ADR-110 closure)
Parse the C6 firmware's HE PPDU type + bandwidth/flags from ADR-018
bytes 18-19 (previously discarded as _reserved). Adds two types to
CsiMetadata: ppdu_type (HtLegacy/HeSu/HeMu/HeTb/Unknown) and
adr018_flags (bw40/stbc/ldpc/ieee802154_sync_valid).

Pre-ADR-110 firmware sends zeros which round-trip as HtLegacy +
default flags — fully backwards compatible.

6 new deterministic unit tests:
- Pre-ADR-110 backwards compat
- HE-SU / HE-MU / HE-TB decode
- Unknown PPDU byte -> Unknown
- All-bits-set flags round-trip
- PpduType byte round-trip

Result: 122 wifi-densepose-hardware tests pass, 0 fail. Host decoder
now matches the firmware encoder bit-for-bit — HE-LTF metadata path
works end-to-end the moment an 11ax AP is in range.

Ref: ruvnet/RuView#762

Co-Authored-By: claude-flow <ruv@ruv.net>
2026-05-22 22:42:49 -04:00
rUv a85d4e31e4
research(sota): kick off SOTA research loop + first R5 saliency measurement (#702)
Sets up docs/research/sota-2026-05-22/ as the autonomous-research
output dir, with PROGRESS.md as the canonical 15-vector research
agenda spanning spatial intelligence, RF features, RSSI-only, and
exotic/long-horizon verticals. Cron d6e5c473 (*/10 * * * *) picks
threads from this file and self-terminates at 2026-05-22 08:00 ET.

First concrete contribution this tick — R5 subcarrier saliency:

* examples/research-sota/r5_subcarrier_saliency.py: pure-numpy port
  of the count cog's Conv1d encoder + count head, computes per-
  subcarrier input×gradient saliency via central-difference. 128
  samples × 56 subcarriers × 2 forward passes/subcarrier ≈ ~3 s on
  CPU, no GPU or framework dependency.
* docs/research/sota-2026-05-22/R5-subcarrier-saliency.md: research
  note with motivation, method, novelty argument, and the first
  measured ranking. Top-8 subcarriers for cog-person-count v0.0.2:
  [41, 52, 30, 31, 10, 35, 2, 38]. Max/mean ratio 2.85x.
* v2/crates/cog-person-count/cog/artifacts/saliency.json: machine-
  readable per-subcarrier saliency + top-K lists, so future-tick
  experiments (retrain at K=8/16/32) consume it without re-running.

Key insight from the first measurement: top-8 saliency is *band-
spread* (indices span 2-52), not concentrated. This directly raises
R8's (RSSI-only) feasibility ceiling, because RSSI is a band-
aggregate — it retains the integral of a band-spread signal. First-
order estimate: RSSI-only should hit ~60% of full-CSI accuracy for
the count task. R7 (adversarial defence) inherits a concrete defender-
priority list: corroborate these 8 subcarriers across nodes.

This commit is the first of many short, focused contributions over
the next ~12 hours. PROGRESS.md is the canonical pointer for the
next tick to pick up the next thread.
2026-05-21 23:05:55 -04:00
rUv b3a5012dbd
feat(cog-person-count): v0.0.2 — K-fold + label-smoothing + temperature-calibrated (#699)
* chore: stage v0.0.2 artifacts + temperature scalar for build pipeline

Stages count_v1.{safetensors,onnx,temperature,train_results.json}
ahead of the build/sign/upload step. This commit is a momentary
side-effect — the next commit will refresh the per-arch manifests
with the new binary SHAs once ruvultra finishes the cross-build.

The .temperature file holds the calibration scalar from LBFGS over the
held-out conf logits. The Rust cog will read it post-load and divide
conf_logits by it before sigmoid, exactly matching the Python eval.

* feat(cog-person-count): v0.0.2 — K-fold validated, label smoothing + early stop + temp scale

The v0.0.1 "65.1% but class-1=0%" result was an unlucky temporal split
that let a degenerate "always predict 0" classifier hit eval acc =
class-0 fraction. 5-fold stratified random CV proved the architecture
actually learns ~57.1% class-1 accuracy under fair splits — a real,
modestly useful signal.

v0.0.2 ships a retrained model that:

* **Splits randomly (seed=42) 80/20** instead of temporally — eliminates
  the trailing-window-class-imbalance cheat.
* **Class-balanced sampler** (multinomial with replacement, weighted by
  inverse class frequency) — per-batch expected counts are equal
  regardless of dataset distribution.
* **Label smoothing 0.1** on the cross-entropy — reduces confidence
  saturation that drove v0.0.1's all-or-nothing predictions.
* **Early stopping** with patience=20 — stops at epoch 29 instead of
  overfitting through 400.
* **Temperature scaling** of the conf head — LBFGS fits a scalar T on
  held-out conf logits; ships as a count_v1.temperature sidecar so the
  Rust cog can divide conf_logits by T before sigmoid.

Numbers on the same data:

  | Metric           | v0.0.1 | v0.0.2 | K-fold (5x100) |
  |------------------|--------|--------|----------------|
  | Overall acc      | 65.1%  | 62.3%  | 62.2% ± 1.9%   |
  | Class 0 acc      | 100%   | 86.2%  | 67.4%          |
  | Class 1 acc      |  0%    | 34.3%  | 57.1% ✓        |
  | MAE              | 0.349  | 0.377  | 0.378          |
  | Spearman         | 0.023  | 0.013  | 0.160          |

Class-1 accuracy 0 → 34.3% is the headline win. Net acc moves slightly
because we stopped cheating on class 0. K-fold's 57% says there's
headroom remaining; reaching it needs more independent splits (== more
data), not more training tricks.

Confidence calibration didn't move. Temperature scaling alone can't fix
a confidence head trained against a noisy argmax==truth indicator over
a 62%-accurate classifier — the head's training signal is the issue,
not its post-hoc transform. The honest fix is multi-room data (#645),
not another calibration knob.

Live on cognitum-v0 at /var/lib/cognitum/apps/person-count/ — health
reports candle-cpu backend, count = 1 (was 0 in v0.0.1) on synthetic
zero input.

Files changed:
* scripts/train-count.py — adds --k-fold (no sklearn dep, hand-rolled
  stratified splits with deterministic shuffle) and --v2 paths.
* v2/.../cog/artifacts/count_v1.safetensors (392 KB, new sha
  32996433…) + count_v1.onnx (16 KB) + count_v1.temperature (0.9262
  scalar) + count_train_results.json (full epoch trace).
* v2/.../cog/artifacts/manifests/{arm,x86_64}/manifest.json bumped to
  version 0.0.2 with the new weights_sha256 + caveats.
* docs/benchmarks/person-count-cog.md — appends a v0.0.2 section
  with the K-fold diagnostic table and honest-read paragraph.

GCS:
  gs://cognitum-apps/cogs/arm/cog-person-count-count_v1.safetensors
    refreshed (binaries unchanged — load weights via mmap at runtime).
2026-05-21 19:47:04 -04:00
rUv e6a5df36eb
chore(cog-person-count): refresh GCS manifests after run-wiring rebuild (#698)
The arm + x86_64 manifests committed in #696 referenced the binaries
built before #697 wired the `run` subcommand. Rebuilt + re-signed +
re-uploaded to GCS, and re-deployed to cognitum-v0:

  arm    sha 15c2fbac…7728ea5  (3,807,456 B, up from 2,168,816 — added Tokio runtime)
  x86_64 sha 051614ce…cc8388b3 (4,502,960 B, up from 2,615,528)

Both re-signed Ed25519 with COGNITUM_OWNER_SIGNING_KEY. Manifests
now match the binaries published at gs://cognitum-apps/cogs/{arm,
x86_64}/cog-person-count-* and the binary installed at
/var/lib/cognitum/apps/person-count/ on cognitum-v0.
2026-05-21 19:13:10 -04:00
rUv 5c914e63c7
feat(cog-person-count): wire `run` subcommand — v0.0.1 fully functional (#697)
Phase 4 of ADR-103. Adds the long-running polling loop so the cog's
fourth verb (`run`) does real work, completing the ADR-100 runtime
contract end-to-end:

  cog-person-count version    → "person-count 0.3.0"
  cog-person-count manifest   → JSON skeleton
  cog-person-count health     → loads weights + 1-shot infer + emit
  cog-person-count run --config  → long-running per-frame emit  ← THIS

What ships:

* src/runtime.rs (new) — `run_loop` polls sensing_url every poll_ms,
  slides a [56, 20] CSI window, runs InferenceEngine::infer, emits
  publisher::person_count events. Same shape as
  cog-pose-estimation::runtime — fetch_frame extracts amplitudes
  from `snapshot.nodes[0].amplitude[]`, fails open on connect errors
  with a WARN log rather than crashing.
* src/lib.rs — registers the runtime module.
* src/main.rs — cmd_run now loads RunConfig from a JSON file, builds
  the InferenceEngine (with weights if cfg.model_path is set,
  otherwise auto-discover), emits a run.started event, and hands off
  to the Tokio multi-thread runtime's block_on(run_loop). Single-node
  fusion is a no-op for N=1 today; v0.2.0 will append predictions
  from sibling nodes and call fusion::fuse_confidence_weighted before
  emit.

Verified locally:

  cargo check  -p cog-person-count --no-default-features   → clean
  cargo test   -p cog-person-count                          → 15/15 pass (no regressions)
  cargo build  -p cog-person-count --release                → 2.36 MB unchanged
  ./cog-person-count run --config bad-config.json:
    line 1: {"event":"run.started","fields":{"cog":"person-count",
             "sensing_url":"http://127.0.0.1:9999/...",poll_ms:100,
             "model_path":"(auto-discover)"}}
    line 2: WARN sensing-server fetch failed
            error=Connection Failed: Connect error: actively refused
    (loop alive — exits cleanly on SIGTERM, no crash, no NaN)

Also adds a "Relationship to the in-process score_to_person_count
heuristic" section to cog/README.md explaining the dual-emitter
design (sensing-server keeps emitting the PR #491 slot heuristic;
the cog runs out-of-process and emits person.count events from the
learned model). Operators choose by installing the cog or not — no
sensing-server rebuild required.

ADR-103 §"Migration" status:
  1. Land ADR + scaffold ........... done (#693, #694)
  2. Train count_v1 ................ done (#695)
  3. Cross-compile + sign + GCS .... done (#696)
  4. Server-side wiring ............ done — out-of-process design
                                      means no rewire needed; this
                                      cog is the wiring.
  5. v0.2.0 multi-room + LoRA ...... data-bound (#645)
2026-05-21 19:10:15 -04:00
rUv a5e99670f8
feat(cog-person-count): release v0.0.1 — signed binaries on GCS, live on cognitum-v0 (#696)
Phase 3 of ADR-103. Cross-compiled aarch64 + x86_64 on ruvultra, signed
with COGNITUM_OWNER_SIGNING_KEY (Ed25519), uploaded to GCS, and live-
installed on the cognitum-v0 Pi 5 alongside cog-pose-estimation.

Real-hardware bench on cognitum-v0:
  ./cog-person-count-arm health
  → backend=candle-cpu, count=0, confidence=0.49, p95=[0,7]
  30 sequential health invocations: 0.276 s → 9.2 ms/invocation cold

Compares to cog-pose-estimation's 8.4 ms — count cog is ~10% slower
because the dual-head (count softmax + confidence sigmoid) does ~2x
the work after the shared encoder.

GCS release artifacts (publicly downloadable, SHA-verified):
  arm/cog-person-count-arm                          2,168,816 B
    sha:  36bc0bb0...0d47b507b3c3
    sig:  R/00xdzHriyr/2r...JK+a6k71NDg==  (Ed25519)
  x86_64/cog-person-count-x86_64                    2,615,528 B
    sha:  76cdd1ec...3923 7392b01db
    sig:  QB+8cnGSMQmu...ZtTNIQ2rDg==  (Ed25519)
  arm/cog-person-count-count_v1.safetensors           392,088 B
    sha:  dacb0551...e6e04ff56d15c3a65a9ff

Live install at /var/lib/cognitum/apps/person-count/ on cognitum-v0
matches the layout of every other installed cog (anomaly-detect,
seizure-detect, pose-estimation): cog-person-count-arm binary,
count_v1.safetensors weights, manifest.json, config.json.

Adds:
* v2/.../cog/artifacts/manifests/{arm,x86_64}/manifest.json — full
  ADR-100 schema with all fields filled (sha + sig + size + URL +
  build_metadata carrying the v0.0.1 honest training caveats).
* docs/benchmarks/person-count-cog.md — appends "Live appliance
  install" and "Signed GCS release artifacts" sections to the
  benchmark log.

Honest v0.0.1 caveat still applies (class-1 accuracy 0% on the held-
out tail of the single-session training data) — same data-bound
limit as pose_v1. The shipped artifact is the *vehicle*; production-
quality accuracy follows from multi-room paired data per ADR-103's
v0.2.0 plan + #645.
2026-05-21 19:02:26 -04:00
rUv 6b4994e105
feat(cog-person-count): train count_v1.safetensors — honest v0.0.1 (ADR-103) (#695)
Phase 2 of ADR-103: trained count head on the existing 1,077 paired
samples (the same data that produced pose_v1 yesterday).

Honest result: 65.1% eval accuracy / 100% within ±1 / MAE 0.349 on
the held-out time-window. Per-class: 100% on "empty room" / 0% on
"1 person". The model overfit by epoch 100 (train_acc → 1.0,
eval_loss climbed 0.67 → 7.8) and the "best" checkpoint is the
snapshot that happened to predict the eval window's class
distribution (140/215 = 65.1%, matches eval_acc exactly). Confidence
head Spearman = 0.023 ⇒ uncalibrated. Same data-bound failure mode
as pose_v1 (#645), bounded by single-session training data; same
fix path (multi-room).

What v0.0.1 still validates end-to-end:
* PyTorch → safetensors → Candle Rust loads cleanly on first try.
  `cog-person-count health` reports `backend: candle-cpu` and emits
  real per-frame predictions instead of the stub backend's hard-coded
  {1 person, 0 confidence}. Architecture parity between train-count.py
  and src/inference.rs::CountNet is bit-exact.
* ONNX export bit-clean (16 KB, opset 18, dynamic batch axis).
* Training wall time: 5.6 s for 400 epochs on RTX 5080.
* Binary size unchanged (2.36 MB stripped), model loads via mmap at
  runtime.

This commit ships:

* scripts/align-ground-truth.js: extended to emit n_persons_mode +
  n_persons_max per window so the training pipeline has count
  labels. Backwards-compatible (additive fields).
* scripts/train-count.py: new — mirrors CountNet architecture
  exactly, loads paired.jsonl, trains 400 epochs with
  CE+BCE+Brier loss, exports safetensors + ONNX + per-epoch JSON.
* v2/.../cog/artifacts/{count_v1.safetensors,count_v1.onnx,
  count_train_results.json}: the trained artifacts.
* v2/.../cog/README.md: Status table updated with the v0.0.1 numbers
  + an Honest Caveat section explaining the data-bound result.
* docs/benchmarks/person-count-cog.md: new — full v0.0.1 benchmark
  log mirroring the format docs/benchmarks/pose-estimation-cog.md
  established. Includes comparison to ADR-103 v0.1.0 acceptance
  gates and per-class breakdown.

Still pending:
* `run` subcommand wiring (long-running polling loop, same as pose)
* Cross-compile + sign + GCS upload (mirror of pose cog pipeline)
* Live install on cognitum-v0
* v0.2.0: re-train on multi-room data, LoRA per-room adapters,
  Stoer-Wagner min-cut clip in fusion stage
2026-05-21 18:56:52 -04:00
rUv 6959a42312
feat(cog-person-count): v0.0.1 scaffold + tests + fusion math + bench (ADR-103) (#694)
First implementation PR for ADR-103. Same incremental shape that
ADR-101 used: scaffold the cog crate, ship a stub-backend release
that satisfies the runtime contract + 15 tests + measured cold-start,
then follow up with the trained count_v1.safetensors in a separate PR.

What ships:

* v2/crates/cog-person-count/ — new workspace member.
    - Cargo.toml: candle-core/candle-nn 0.9 (cpu default, cuda feature
      opt-in), safetensors, ureq, sha2 — same dep shape as the pose cog
      but minus wifi-densepose-train (this cog has no training-side
      consumer, so the dep tree is materially smaller → 2.36 MB
      binary vs the pose cog's 4.5 MB).
    - src/inference.rs: CountNet (Conv1d 56→64→128→128 encoder + count
      head Linear(128→64→8)+softmax + confidence head
      Linear(128→32→1)+sigmoid). Stub backend returns
      `{1-person, 0-confidence}` honestly when no safetensors present.
    - src/fusion.rs: fuse_confidence_weighted() — Bayesian product of
      per-node distributions with confidence-weighted log-sum, plus
      fuse_with_mincut_clip() hook for the v0.2.0 Stoer-Wagner
      upper-bound (`ruvector-mincut` dep lands when min-cut graph
      builder is ready). Confidences floored at 1e-3 and probs floored
      at 1e-9 before logs — no NaN propagation.
    - src/publisher.rs: emits {count, confidence, count_p95_low,
      count_p95_high, n_nodes, probs} per ADR-103 §"Output".
    - src/main.rs: full ADR-100 four-verb CLI (version|manifest|health
      |run). The `run` subcommand explicitly returns "wiring pending
      v0.0.1" so the in-process library API is the v0.0.1-clean
      integration path.
    - tests/smoke.rs (8 tests) + fusion::tests (7 tests, in-lib) — 15
      total, all green. Cover stub-backend behaviour, wrong-shape
      rejection, fusion math (empty / single / agreement / high-conf
      override / normalisation), p95-range correctness, and min-cut
      clip semantics.
    - cog/{manifest.template.json, config.schema.json, README.md} +
      cog/artifacts/ placeholder dir.

* v2/Cargo.toml: registers the new workspace member.

Verified locally:

  cargo check -p cog-person-count --no-default-features    → clean
  cargo test  -p cog-person-count --no-default-features    → 8/8 pass
  cargo test  -p cog-person-count --lib                    → 7/7 pass
  cargo build -p cog-person-count --release                → 2.36 MB binary
  ./cog-person-count version                               → "person-count 0.3.0"
  ./cog-person-count manifest                              → JSON skeleton
  ./cog-person-count health                                → backend:stub,
                                                              count:1, conf:0,
                                                              p95:[1,1]
  Cold-start: 30 sequential `health` invocations → 53.3 ms/invocation
              (vs cog-pose-estimation's 76.2 ms — smaller dep tree)

cog/README.md adds:

* Security section — six-row threat table covering safetensor mmap
  trust, non-finite outputs, sensing fetch failures, fusion
  divide-by-zero / log-of-zero, min-cut degenerate cases, and stdout
  spoofing.
* Performance / optimization section — binary size, release profile
  (already opt-level=3 / lto=fat / codegen-units=1 / strip=true at
  workspace level), cold-start comparison table, projected warm-path
  latency budget.

Still pending (separate PRs, ADR-103 §"Migration"):

* Train count_v1.safetensors on the existing 1,077 paired samples
  with `n_persons` labels (Candle on RTX 5080, same script that
  produced pose_v1.safetensors yesterday).
* `run` subcommand wiring (long-running polling loop, same shape as
  cog-pose-estimation::runtime).
* Cross-compile + sign + GCS upload (mirror of cog-pose-estimation
  release pipeline).
* Server-side `csi.rs::score_to_person_count` call-site rewire to
  consume this cog when installed; falls back to PR #491's heuristic
  when not.
2026-05-21 18:46:57 -04:00
rUv 67fec45e61
feat(edge-registry): ADR-102 — surface Cognitum cog catalog via /api/v1/edge/registry (#648)
* feat(edge-registry): ADR-102 — surface Cognitum cog catalog via /api/v1/edge/registry

Adds a new sensing-server endpoint that fetches and caches the canonical
Cognitum app registry at
https://storage.googleapis.com/cognitum-apps/app-registry.json (105 cogs
across 11 categories as of v2.1.0). RuView previously had no live
awareness of the catalog — the README's capability table was hand-
curated and went stale as Cognitum shipped new cogs (the registry was
last updated 6 days ago).

ADR:
* docs/adr/ADR-102-edge-module-registry.md — full design, response
  shape, configuration flags, failure modes, and a 12-row security
  review covering SSRF, response inflation, ?refresh abuse, stale-serve
  semantics, TLS, cache poisoning, JSON-panic resistance, etc.

Code:
* v2/.../edge_registry.rs — EdgeRegistry struct + UreqFetcher +
  MockFetcher trait + 7 unit tests. RwLock<Option<CachedEntry>> with
  stale-on-error fallback. MAX_PAYLOAD_BYTES=8 MiB, 10s wire timeout.
* v2/.../main.rs — constructs Option<Arc<EdgeRegistry>> at startup,
  registers GET /api/v1/edge/registry handler, wires Extension layer.
  Handler runs the blocking ureq fetch via tokio::task::spawn_blocking
  so the async runtime stays free.
* v2/.../cli.rs / main.rs Args — three new flags (per user request to
  "allow the registry to be disabled or changed"):
    --edge-registry-url <URL>       (env RUVIEW_EDGE_REGISTRY_URL)
    --edge-registry-ttl-secs <N>    (env RUVIEW_EDGE_REGISTRY_TTL_SECS)
    --no-edge-registry              (env RUVIEW_NO_EDGE_REGISTRY)
  When --no-edge-registry is set or the URL is empty, the endpoint
  returns 404.

Cargo.toml: adds ureq (rustls), sha2, thiserror as direct deps.

README:
* New collapsed "🧩 Edge Module Catalog" section with the full 105-cog
  table generated from the registry, grouped by category with practical
  one-line descriptions (e.g. "Spots irregular heartbeats and abnormal
  heart rhythms", "Detects walking problems and scores fall risk").
  Links to https://seed.cognitum.one/store and the local appliance
  /cogs page. Sits between the HF model section and How It Works.

Tests (7/7 pass):
  first_call_hits_upstream_and_caches
  ttl_expiry_triggers_refetch
  force_refresh_bypasses_fresh_cache
  stale_serve_on_upstream_failure_after_cached_success
  no_cache_no_upstream_returns_error
  upstream_invalid_json_is_treated_as_error
  upstream_sha256_is_deterministic

Security highlights (full review in ADR-102 §"Security review"):
- The registry is metadata-only; per-cog binary signatures (ADR-100)
  remain the trust root for installs. A compromised registry can
  mislead a human reader but cannot ship malicious binaries.
- 8 MiB cap + 10s timeout + Option<Arc<...>> via Extension layer means
  the endpoint can't be used to exhaust memory or pin tokio threads.
- Stale-on-error responses carry an explicit `stale: true` field so
  upstream outages are visible to consumers rather than silently
  masked.
- Endpoint sits behind the existing RUVIEW_API_TOKEN bearer gate when
  set, otherwise unauthenticated (registry contents are public anyway).

* chore: refresh Cargo.lock for ureq/sha2/thiserror deps added by ADR-102
2026-05-19 18:08:43 -04:00
rUv 4b1a835107
docs: repoint #640 references to #645 (original deleted, replaced) (#646)
Issue #640 (PCK gap follow-up) was deleted upstream after the cog v0.0.1
PRs landed today. Re-opened as #645 with the same context plus the
new measured v0.0.1 numbers (PCK@20 3.0%, PCK@50 18.5%, MPJPE 0.093).
This patch updates the three files in main that still pointed at the
dead #640 to point at #645 instead — ADR-101, the cog README, and the
benchmark log.
2026-05-19 17:18:05 -04:00
rUv fcb6f4bf12
feat(cog-pose-estimation): x86_64 release v0.0.1 — parallel to arm (#643)
Adds the x86_64-unknown-linux-gnu binary uploaded to
gs://cognitum-apps/cogs/x86_64/, signed with the same Ed25519
COGNITUM_OWNER_SIGNING_KEY as the arm release. Together with the
already-shipped arm artifact, the cog now ships natively for both
target architectures the Cognitum fleet supports.

x86_64 release:
  sha256:    a434739a24415b34e1aff50e5e1c3c32e568db96af473bbb3e5ecc9b95fe71fa
  signature: pNNuxhgM18PztN8BSZdfw5oAShG2pV3na5T/q2QdlJWX/5FJgo4QTiUCbcTAxI2Uiva8VURSOlRzMU3xoQPqCQ==
  size:      4,548,856 bytes
  cold-start: 5.4 ms / invocation on ruvultra (RTX 5080, NVMe)

Reorganizes manifests under cog/artifacts/manifests/{arm,x86_64}/
so each arch carries its own manifest with the matching binary_sha256
and signature — same layout the release pipeline will use for the
future hailo8 / hailo10 variants.

Updates docs/benchmarks/pose-estimation-cog.md with the cross-arch
cold-start table:

  Windows (x86_64)   76.2 ms
  ruvultra (x86_64)   5.4 ms   <- this release
  Pi 5 (aarch64)     8.4 ms

Verified via anonymous GCS download + SHA round-trip — identical to
local build.

Hailo HEF remains the only pending arch, still blocked on Hailo SDK
provisioning to a self-hosted runner.
2026-05-19 17:08:23 -04:00
rUv 3314c8db8d
feat(cog-pose-estimation): scaffold first Cog from this repo (ADR-100 + ADR-101) (#642)
* feat(cog-pose-estimation): scaffold first Cog from this repo (ADR-100 + ADR-101)

Adds the foundation for the pose-estimation Cog that ships from this
repo into Cognitum V0 appliances. Companion ADR-225 + crate land in
cognitum-one/v0-appliance.

ADRs:
* ADR-100 formalises the Cognitum Cog packaging spec — on-device
  layout under /var/lib/cognitum/apps/<id>/, manifest.json schema
  (incl. new binary_sha256 + binary_signature fields), GCS hosting
  convention, repo source layout, build pipeline, and the four-verb
  runtime contract (version | manifest | health | run). Documents the
  convention I reverse-engineered from inspecting installed cogs on a
  live cognitum-v0 appliance — `anomaly-detect`, `presence`,
  `seizure-detect`, etc.
* ADR-101 designs the pose-estimation Cog itself: where it sits in
  the wifi-densepose pipeline (encoder init from
  ruvnet/wifi-densepose-pretrained, 17-keypoint regression head),
  what gets shipped per target arch (arm / x86_64 / hailo8 /
  hailo10), acceptance gates (PCK@20 explicitly deferred to #640 —
  this ADR ships the vehicle, not the accuracy).

Crate v2/crates/cog-pose-estimation/:
* Cargo.toml + workspace member declaration with a hailo feature gate
  so the binary builds without the Hailo SDK in CI.
* main.rs implements the four-verb CLI exactly per ADR-100.
* config.rs / manifest.rs / publisher.rs / inference.rs / runtime.rs —
  small modules, each <100 lines.
* publisher.rs emits ADR-100 structured JSON events.
* inference.rs is a stub that produces a centred-skeleton baseline
  with confidence=0 (honest: no trained weights wired in yet).
* runtime.rs subscribes to /api/v1/sensing/latest, slides a
  56*20 window, runs the engine, emits pose.frame events.
* cog/manifest.template.json + cog/config.schema.json define the
  release artifact + runtime config schemas.
* cog/Makefile holds build / sign / upload targets.
* tests/smoke.rs covers manifest roundtrip + engine I/O surface.

Verified locally:
* cargo check -p cog-pose-estimation: clean.
* cargo test  -p cog-pose-estimation: 4/4 pass.
* ./target/release/cog-pose-estimation {version,manifest,health}:
  all emit the right contract output.

This commit contains scaffolding only; the actual trained weights and
Hailo HEF cross-compile come in follow-ups tracked in #640 and the
companion v0-appliance branch.

* feat(cog-pose-estimation): first measured run — Candle CUDA on RTX 5080

Trained pose_v1 on ruvultra (RTX 5080) via Candle 0.9 + cuda feature
against the same 1,077-sample paired session that produced 0%/0% PCK
in #640 with the pure-JS SPSA trainer. First real numbers:

  PCK@20 = 3.0%   (up from 0.0%)
  PCK@50 = 18.5%  (up from 0.0%)
  MPJPE  = 0.093  (down from 0.66, ~7x improvement)

400 epochs in 2.1 s wall time, full-batch, ~5 ms/epoch. Loss curve
0.181 -> 0.014 over the run, eval 0.010. Per-joint reveals the model
leans on right-side proximal joints (r_hip 77% PCK@50, r_knee 35%,
l_elbow 26%) — consistent with the camera framing in the source
recording. Distal joints (wrists, ankles) and face joints are still
near-random, consistent with the 56-subcarrier / 20-frame input not
carrying fine-grained spatial info at 1077 samples.

This commit:

* Adds v2/crates/cog-pose-estimation/cog/artifacts/{pose_v1.safetensors,
  train_results.json} so the cog dir now contains a real reference
  artifact, not just scaffold.
* Updates cog/README.md "Status" block with the measured numbers,
  per-joint table, and an honest reading of where the model
  succeeds vs where the data is the bottleneck.
* Adds docs/benchmarks/pose-estimation-cog.md as the canonical
  benchmark log — append-only, one section per published run.
* Appends a "First measured run" section to ADR-101 referencing
  the new benchmark file.

Still pending in the follow-up:
* Wire pose_v1.safetensors into src/inference.rs (replace stub).
* ONNX export (Candle lacks a writer — needs external conversion).
* Hailo HEF cross-compile + cluster deploy.

The data-bound gap to PCK@20 >= 35% is tracked in #640.

* feat(cog-pose-estimation): wire real weights — cog is no longer a stub

Replaces the centred-skeleton stub in src/inference.rs with a real
Candle-based loader that reads cog/artifacts/pose_v1.safetensors and
runs the trained Conv1d encoder + MLP pose head on every incoming CSI
window.

What changes:

* src/inference.rs: PoseNet mirrors the training script's architecture
  exactly — Conv1d(56->64, k=3 d=1), Conv1d(64->128, k=3 d=2),
  Conv1d(128->128, k=3 d=4), mean over time, Linear(128->256)+ReLU,
  Linear(256->34)+sigmoid -> reshape [17, 2]. The InferenceEngine
  searches a sensible candidate list for the weights file
  (/var/lib/cognitum/apps/pose-estimation/, ./pose_v1.safetensors,
  ./cog/artifacts/, repo-root, v2/-relative) and falls back to the
  stub when none are present so the cog still satisfies ADR-100.
* Cargo.toml: adds candle-core 0.9 + candle-nn 0.9 (no-default-features,
  CPU build by default) + safetensors 0.4. New `cuda` feature opt-in
  for GPU inference on hosts that have it. Drops the unused
  wifi-densepose-train path dep from the default build path.
* src/main.rs + src/publisher.rs: health.ok event now carries
  `backend` (candle-cuda | candle-cpu | stub) and the synthetic
  output confidence, so operators can tell at a glance whether the
  cog loaded its weights or fell back to the stub.
* tests/smoke.rs: adds `real_weights_load_when_available` which
  asserts the loaded engine reports backend=candle-* and emits
  non-zero confidence — exactly the signal that proves we're not
  silently degrading to the stub.

Verified locally:

* `cargo check -p cog-pose-estimation --no-default-features` — clean
* `cargo test  -p cog-pose-estimation --no-default-features` — 5/5 pass
* `./target/release/cog-pose-estimation health` emits:
  {"event":"health.ok","fields":{"backend":"candle-cpu","cog":"pose-estimation","synthetic_output_confidence":0.185}}
  — 0.185 is the published PCK@50 from cog/artifacts/train_results.json,
  emitted by the real Candle inference path (would be 0.0 if it had
  fallen back to the stub).

The cog now runs the trained pose_v1 model end-to-end. Accuracy is
still bounded by the underlying 1077-sample training data (PCK@20
3.0%, PCK@50 18.5% per docs/benchmarks/pose-estimation-cog.md) — that
gap is data-bound and tracked in #640. ONNX export + Hailo HEF
cross-compile remain follow-ups.

* docs(benchmarks): measure cog-pose-estimation cold-start latency

100 sequential `cog-pose-estimation health` invocations average 76.2 ms
each on a Windows x86_64 host using the `candle-cpu` backend. Each
invocation re-loads pose_v1.safetensors and runs one synthetic forward
pass, so this is the worst-case cold-start path. Long-running `run`
inference will be sub-millisecond per frame once the model is loaded.

Updates the benchmarks doc accordingly.

* feat(cog-pose-estimation): ONNX export — pose_v1.onnx + scripts/export-onnx.py

Adds the canonical ONNX artifact that unblocks downstream Hailo HEF
cross-compile + ONNX Runtime benchmarks. Generated on ruvultra (torch
2.12.0 + CUDA), 12,059 bytes, opset 18, dynamic batch axis.

* scripts/export-onnx.py: mirrors the Candle inference architecture in
  PyTorch (Conv1d 56->64, 64->128, 128->128 + Linear 128->256->34), pure-
  python safetensors loader (no extra pip dep), exports via
  torch.onnx.export, then verifies via onnx.checker.check_model and
  numerical parity against the torch reference.
* Verified parity vs torch: max |torch - onnx| = 8.94e-8 (1e-5
  threshold). Effectively bit-perfect.
* v2/crates/cog-pose-estimation/cog/artifacts/pose_v1.onnx — the
  artifact itself, 12 KB.
* docs/benchmarks/pose-estimation-cog.md — adds an ONNX export
  section with the verification numbers.

Next: Hailo HEF cross-compile (still gated on Hailo SDK on a
self-hosted runner) and ONNX Runtime latency benchmarks on each
target arch.

* feat(cog-pose-estimation): release v0.0.1 — signed aarch64 binary on GCS

End-to-end deploy: cross-compiled to aarch64-unknown-linux-gnu on
ruvultra, ran via qemu-aarch64-static, then smoke-tested on a real
cognitum-v0 Pi 5. Signed with COGNITUM_OWNER_SIGNING_KEY (Ed25519)
and uploaded to gs://cognitum-apps/cogs/arm/.

Real-hardware results on cognitum-v0 (Pi 5):
  health: backend=candle-cpu, confidence=0.185, real weights loaded
  30x sequential `health`: 0.251 s total -> 8.4 ms / invocation (cold)

GCS release artifacts (publicly downloadable):
  binary:  3,741,976 bytes
    sha256 1e1a7d3dd01ca05d5bfc5dbb142a5941b7866ed9f3224a21edc04d3f09a99bf5
  weights:   507,032 bytes
    sha256 eb249b9a6b2e10130437a10976ed0230b0d085f86a0553d7226e1ae6eae4b9e5
  signature (Ed25519, b64): LUN7xqLPYD3MFzm5dKB5MnYU0LvoRtek5ci5KiKPHBg+Xo6xuazwokn2Dw2JPMaLYJzmWn/SpT4djuR7hYvVDw==

Adds:
* v2/crates/cog-pose-estimation/cog/artifacts/manifest.json — the
  release-pipeline-produced manifest with all fields filled in per
  ADR-100, including arch, target_triple, signature, and a
  build_metadata block carrying the validation PCK numbers.
* docs/benchmarks/pose-estimation-cog.md — new sections covering
  the real Pi 5 smoke (8.4 ms cold-start) and the signed GCS
  release artifacts.

Verified by downloading the binary anonymously from GCS and
re-computing the sha256 — matches the locally-computed sha exactly.
Signature decoded to the expected 64-byte Ed25519 length.

Closes the GCS-upload acceptance criterion from ADR-100; the only
pending work is Hailo HEF cross-compile (still SDK-gated) and an
x86_64 release alongside this arm release.

* docs(benchmarks): record live cognitum-v0 install + 5-sec smoke run

Adds the "Live appliance install" section documenting what happened
when the signed v0.0.1 binary + weights were installed under
/var/lib/cognitum/apps/pose-estimation/ on cognitum-v0 (the V0
cluster leader).

* Layout matches the existing anomaly-detect / presence / seizure-
  detect cogs exactly — the Cogs dashboard at
  http://cognitum-v0:9000/cogs auto-discovers entries.
* `cog-pose-estimation run` ran for 5 seconds in the background and
  cleanly emitted run.started + structured WARN events for the
  missing local sensing-server on :3000 (cognitum-v0's actual CSI
  source is ruview-vitals-worker on :50054, not :3000). No crashes,
  no NaN, no leaks.
* Wiring `sensing_url` to the appliance-native source is a separate
  Day-2 integration task.
2026-05-19 17:03:09 -04:00
Rahul c00f45e296
fix(sensing): finish #611 NaN-panic audit — 7 more sites missed by #613 (#624)
#613 fixed adaptive_classifier.rs:94 (the IQR sort) and called the audit
done, but the grep used `partial_cmp(b).unwrap()` as a literal and missed
seven additional production sites that use comparator variants:

  adaptive_classifier.rs:205  AdaptiveModel::classify() argmax over softmax
                              probs — same per-frame hot path as #611.
                              NaN flows through normalise → logits → softmax
                              and still reaches this site even after the
                              IQR fix.
  adaptive_classifier.rs:480  train() argmax (training accuracy loop)
  adaptive_classifier.rs:500  train() per-class argmax
  main.rs:2446, 2449          count_persons_mincut variance source/sink select
  csi.rs:602, 605             count_persons_mincut variance source/sink select
                              (duplicate of main.rs logic in csi.rs)

For the variance-select sites, note that the *outer* `unwrap_or((0, &0))`
only catches an empty iterator — it cannot rescue a panic raised inside
the comparator. A single NaN in `variances[]` still aborts the process.

Same fix as #613: swap `.unwrap()` for `.unwrap_or(std::cmp::Ordering::Equal)`
inside the comparator closure. Pure behavioural change, no API surface.

Re-audit of the remaining `partial_cmp(...).unwrap()` matches in v2/:
they are all inside `#[cfg(test)]` / `#[test]` blocks (spectrogram.rs:269,
depth.rs:234, connectivity.rs:477, vital_signs.rs:737) where inputs are
controlled and panic-on-NaN is acceptable.
2026-05-19 10:02:08 -04:00
Winter Lau e964eaf14f
fix(deps): bump ndarray 0.15→0.17 and ndarray-npy 0.8→0.10 (closes #626) (#627) 2026-05-19 10:01:52 -04:00
ruv 79cc2d7b22 Merge #491: feat(sensing-server): adaptive person count — RollingP95 + dedup_factor runtime API
Integrating @schwarztim's PR #491 into main on their behalf — their fork has
fallen too far behind for a clean rebase (the PR's commit graph dropped
silently during `git rebase origin/main`), so applying as a merge from the
fork head to preserve the diff cleanly.

What this lands:
- `RollingP95` adaptive normaliser for the person-count feature scaling.
  Streaming P95 over a 600-sample / ~30 s sliding window. Cold-start
  (<60 samples) falls back to the legacy denominators (variance/300,
  motion_band_power/250, spectral_power/500) so day-0 behaviour is
  preserved on every deployment.
- `RuntimeConfig` struct + `load_runtime_config` / `save_runtime_config`
  persisted to `data/config.json`. Exposes `dedup_factor` via REST so
  multi-node deployments can tune cluster-deduplication without a rebuild,
  including an auto-tune endpoint that derives optimal dedup from a known
  person count (calibration mode).
- `compute_person_score()` now takes &AppStateInner alongside &FeatureInfo
  so the adaptive denominators are reachable. All 3 call sites updated.
- New `AppStateInner` fields: `p95_variance`, `p95_motion_band_power`,
  `p95_spectral_power`, `dedup_factor`, `data_dir`.

Closes #491. Directly addresses:
- #499 (double skeletons, multi-node) — the slot-clustering problem this
  PR's adaptive normaliser was designed to fix
- #519 Bug 1 (ghost person detection on edge-tier 1 & 2 multi-node)
- #496 (person count over-reporting on single-room single-person)

Verified locally:
- cargo check -p wifi-densepose-sensing-server --no-default-features: 1.0s
- cargo test -p wifi-densepose-sensing-server --no-default-features --lib:
  233/233 passed in 25.0s

Co-authored-by: @schwarztim
Co-Authored-By: claude-flow <ruv@ruv.net>
2026-05-19 08:25:47 -04:00
rUv b2e2e6d6fd
fix(sensing-server): WS broadcast emits effective_source() not hardcoded "esp32" (closes #618) (#621)
Reported by @ArnonEnbar with a complete reproduction.

broadcast_tick_task() re-emits the cached `latest_update` every tick so
pose WS clients keep getting data even when ESP32 pauses between
frames. The `source` field of that cached update was set to "esp32" at
the moment a fresh ESP32 frame was last decoded (main.rs:3885, :4136).

After the ESP32 loses power or network, no fresh frame is decoded —
the cached `latest_update` is still re-broadcast every tick with the
stale source: "esp32" baked in. UI's "Sensing" tab keeps showing
"LIVE — ESP32 HARDWARE Connected" with frozen vitals/features/
classification re-broadcast indefinitely. REST `/health` correctly
reports source: "esp32:offline" (via effective_source(), which checks
last_esp32_frame elapsed time against ESP32_OFFLINE_TIMEOUT=5s) — but
the WS broadcast path was the one consumer that didn't call it.

Fix: clone the cached update per tick, overwrite source with
s.effective_source(), then serialize and broadcast. UI now switches to
"esp32:offline" on the same 5s budget as the REST surface.

cargo build -p wifi-densepose-sensing-server --no-default-features:
17s, no errors (1 pre-existing unused-import warning unchanged).
2026-05-18 08:18:18 -04:00
rUv 72bbd256e7
fix(security): path-traversal guard on 5 sensing-server endpoints (closes #615) (#616)
Reported by @bannned-bit. Five endpoints in
v2/crates/wifi-densepose-sensing-server embedded user-controlled
identifiers in format!() paths with no sanitization:

  recording.rs       POST   /api/v1/recording/start       (session_name)
  recording.rs       GET    /api/v1/recording/download/:id (id)
  recording.rs       DELETE /api/v1/recording/delete/:id   (id)
  model_manager.rs   POST   /api/v1/models/load           (model_id)
  training_api.rs    load_recording_frames                (dataset_ids[])

Each unauthenticated caller could:
- READ arbitrary files via ../../etc/passwd, ../../.env, etc.
- WRITE attacker-controlled JSONL via recording/start
- LOAD attacker-controlled .rvf model files
- DELETE arbitrary files the server process can touch

New `path_safety` module exports `safe_id(&str) -> Result<&str, PathSafetyError>`
that enforces the rejection envelope BEFORE any user input reaches a
format!() that builds a path:

  - Allowed character set: [A-Za-z0-9._-]
  - Reject leading '.' (rules out '.', '..', '.env', hidden files)
  - Reject empty strings
  - Reject anything > 64 bytes
  - Reject all whitespace, path separators, null bytes, non-ASCII

Applied at all 5 sites. Errors return 400 Bad Request (download) /
status:"error" JSON (others) — not panics.

9 unit tests in path_safety::tests cover:
  - accepts simple alphanumeric / hyphen / underscore / dot
  - rejects empty, leading dot, path separators ('/', '\'),
    null byte, whitespace, shell specials, non-ASCII (including
    fullwidth slash U+FF0F), too-long, boundary at MAX_ID_LEN

  test result: ok. 9 passed; 0 failed
  cargo build -p wifi-densepose-sensing-server --no-default-features: 33s

Fix-marker RuView#615 in scripts/fix-markers.json prevents removing the
guard at any of the 5 call sites. CHANGELOG entry under [Unreleased] /
Security documents the patched endpoints and the rejection envelope.

Severity: critical per reporter — five remotely-reachable paths to read,
write, or delete arbitrary files. Hot per-request paths, not edge cases.
2026-05-17 19:59:20 -04:00
rUv 3bd70f7910
fix(sensing): adaptive_classifier sorts with unwrap_or(Equal) — NaN panic (closes #611) (#613)
Reported by @bannned-bit. v2/crates/wifi-densepose-sensing-server/src/
adaptive_classifier.rs:94 did:

    sorted.sort_by(|a, b| a.partial_cmp(b).unwrap());

f64::partial_cmp returns None on NaN, so `.unwrap()` panics. CSI data
from real ESP32 hardware can produce NaN (silent DSP div-by-zero,
empty buffer, etc.), and this code path runs on every frame in the
classify() hot path — a single NaN frame kills the entire sensing
server process.

Fix swaps for unwrap_or(Ordering::Equal), matching the pattern the
same file already uses at lines 149-150 and 155 (those sites were
already NaN-safe; this site was an oversight).

Scoped audit: greped the v2/ tree for `partial_cmp(b).unwrap()`. The
other 3 hits are in #[cfg(test)] blocks (spectrogram.rs:269,
depth.rs:234, connectivity.rs:477) where panic-on-NaN is acceptable
because test inputs are controlled. Only adaptive_classifier.rs:94
was a production-path crash.

Severity: critical per reporter — runtime panic on real-world data.
Patch: 1-line behavioural change + comment.
2026-05-17 19:29:07 -04:00
rUv 1b155ad027
chore: remove empty stub crates wifi-densepose-{api,db,config} (closes #578) (#608)
Each of these crates was a single-line doc-comment placeholder:

  v2/crates/wifi-densepose-api/src/lib.rs:    //! WiFi-DensePose REST API (stub)
  v2/crates/wifi-densepose-db/src/lib.rs:     //! WiFi-DensePose database layer (stub)
  v2/crates/wifi-densepose-config/src/lib.rs: //! WiFi-DensePose configuration (stub)

with empty [dependencies] in their Cargo.toml and zero references from any
source file or Cargo.toml in the workspace (verified by `grep -rln
wifi-densepose-api/-db/-config` across `v2/`). They were reserved early for
an envisioned REST/database/config split that never materialised.

The functionality these would have provided is covered today by:
- REST/WS:  wifi-densepose-sensing-server (Axum)
- Config:   per-crate config + CLI args in sensing-server and desktop
- DB:       no persistent state; system is real-time

Removal prevents `cargo` from listing dead crates, shipping empty published
artifacts to crates.io, or wasting reviewer attention. If any of these names
is needed in the future, reintroduce them with a real implementation.

Per the issue reporter (@bannned-bit / Matad0r) #578 explicitly listed
"OR be removed from workspace members until implementation starts" as an
acceptable resolution.

Updated:
- `v2/Cargo.toml`: drop the three members (with inline comment explaining why)
- `v2/Cargo.lock`: regenerated by cargo check
- `CLAUDE.md`: drop the three rows from the crate table and the publishing
  order list
- `CHANGELOG.md`: add an `[Unreleased] / Removed` entry

Verified:
- `cd v2 && cargo check --workspace --no-default-features` -> finished
  in 48s, no errors (warnings unchanged)
2026-05-17 18:50:57 -04:00
dependabot[bot] 0310b1fa9a
chore(deps): bump @tauri-apps/plugin-dialog (#462)
Bumps [@tauri-apps/plugin-dialog](https://github.com/tauri-apps/plugins-workspace) from 2.6.0 to 2.7.0.
- [Release notes](https://github.com/tauri-apps/plugins-workspace/releases)
- [Commits](https://github.com/tauri-apps/plugins-workspace/compare/log-v2.6.0...log-v2.7.0)

---
updated-dependencies:
- dependency-name: "@tauri-apps/plugin-dialog"
  dependency-version: 2.7.0
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2026-05-17 18:11:58 -04:00
dependabot[bot] 4ecc053a27
chore(deps-dev): bump typescript in /v2/crates/wifi-densepose-desktop/ui (#456)
Bumps [typescript](https://github.com/microsoft/TypeScript) from 5.9.3 to 6.0.3.
- [Release notes](https://github.com/microsoft/TypeScript/releases)
- [Commits](https://github.com/microsoft/TypeScript/compare/v5.9.3...v6.0.3)

---
updated-dependencies:
- dependency-name: typescript
  dependency-version: 6.0.3
  dependency-type: direct:development
  update-type: version-update:semver-major
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2026-05-17 18:11:41 -04:00
dependabot[bot] 4d45add824
chore(deps): bump react-dom and @types/react-dom (#451)
Bumps [react-dom](https://github.com/facebook/react/tree/HEAD/packages/react-dom) and [@types/react-dom](https://github.com/DefinitelyTyped/DefinitelyTyped/tree/HEAD/types/react-dom). These dependencies needed to be updated together.

Updates `react-dom` from 18.3.1 to 19.2.5
- [Release notes](https://github.com/facebook/react/releases)
- [Changelog](https://github.com/facebook/react/blob/main/CHANGELOG.md)
- [Commits](https://github.com/facebook/react/commits/v19.2.5/packages/react-dom)

Updates `@types/react-dom` from 18.3.7 to 19.2.3
- [Release notes](https://github.com/DefinitelyTyped/DefinitelyTyped/releases)
- [Commits](https://github.com/DefinitelyTyped/DefinitelyTyped/commits/HEAD/types/react-dom)

---
updated-dependencies:
- dependency-name: react-dom
  dependency-version: 19.2.5
  dependency-type: direct:production
  update-type: version-update:semver-major
- dependency-name: "@types/react-dom"
  dependency-version: 19.2.3
  dependency-type: direct:development
  update-type: version-update:semver-major
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2026-05-17 18:11:26 -04:00
dependabot[bot] a80617ee84
chore(deps): bump console from 0.15.11 to 0.16.3 in /v2 (#471)
Bumps [console](https://github.com/console-rs/console) from 0.15.11 to 0.16.3.
- [Release notes](https://github.com/console-rs/console/releases)
- [Changelog](https://github.com/console-rs/console/blob/main/CHANGELOG.md)
- [Commits](https://github.com/console-rs/console/compare/0.15.11...0.16.3)

---
updated-dependencies:
- dependency-name: console
  dependency-version: 0.16.3
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2026-05-17 18:10:01 -04:00
dependabot[bot] afc86c6fc4
chore(deps): bump thiserror from 1.0.69 to 2.0.18 in /v2 (#469)
Bumps [thiserror](https://github.com/dtolnay/thiserror) from 1.0.69 to 2.0.18.
- [Release notes](https://github.com/dtolnay/thiserror/releases)
- [Commits](https://github.com/dtolnay/thiserror/compare/1.0.69...2.0.18)

---
updated-dependencies:
- dependency-name: thiserror
  dependency-version: 2.0.18
  dependency-type: direct:production
  update-type: version-update:semver-major
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2026-05-17 18:09:54 -04:00
dependabot[bot] e6710e8988
chore(deps): bump ndarray-linalg from 0.16.0 to 0.18.1 in /v2 (#477)
Bumps [ndarray-linalg](https://github.com/rust-ndarray/ndarray-linalg) from 0.16.0 to 0.18.1.
- [Release notes](https://github.com/rust-ndarray/ndarray-linalg/releases)
- [Commits](https://github.com/rust-ndarray/ndarray-linalg/compare/ndarray-linalg-v0.16.0...ndarray-linalg-v0.18.1)

---
updated-dependencies:
- dependency-name: ndarray-linalg
  dependency-version: 0.18.1
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2026-05-17 18:08:08 -04:00
dependabot[bot] ab9799adc3
chore(deps): bump tower-http from 0.5.2 to 0.6.8 in /v2 (#483)
Bumps [tower-http](https://github.com/tower-rs/tower-http) from 0.5.2 to 0.6.8.
- [Release notes](https://github.com/tower-rs/tower-http/releases)
- [Commits](https://github.com/tower-rs/tower-http/compare/tower-http-0.5.2...tower-http-0.6.8)

---
updated-dependencies:
- dependency-name: tower-http
  dependency-version: 0.6.8
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2026-05-17 18:08:04 -04:00
dependabot[bot] bdb4484259
chore(deps): bump tch from 0.14.0 to 0.24.0 in /v2 (#482)
Bumps [tch](https://github.com/LaurentMazare/tch-rs) from 0.14.0 to 0.24.0.
- [Release notes](https://github.com/LaurentMazare/tch-rs/releases)
- [Changelog](https://github.com/LaurentMazare/tch-rs/blob/main/CHANGELOG.md)
- [Commits](https://github.com/LaurentMazare/tch-rs/commits)

---
updated-dependencies:
- dependency-name: tch
  dependency-version: 0.24.0
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2026-05-17 18:08:01 -04:00
dependabot[bot] ba370c7b08
chore(deps): bump tabled from 0.15.0 to 0.20.0 in /v2 (#481)
Bumps [tabled](https://github.com/zhiburt/tabled) from 0.15.0 to 0.20.0.
- [Changelog](https://github.com/zhiburt/tabled/blob/master/CHANGELOG.md)
- [Commits](https://github.com/zhiburt/tabled/commits)

---
updated-dependencies:
- dependency-name: tabled
  dependency-version: 0.20.0
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2026-05-17 18:07:57 -04:00
dependabot[bot] 3fdd310f89
chore(deps): bump tauri-plugin-dialog from 2.6.0 to 2.7.1 in /v2 (#480)
Bumps [tauri-plugin-dialog](https://github.com/tauri-apps/plugins-workspace) from 2.6.0 to 2.7.1.
- [Release notes](https://github.com/tauri-apps/plugins-workspace/releases)
- [Commits](https://github.com/tauri-apps/plugins-workspace/compare/log-v2.6.0...log-v2.7.1)

---
updated-dependencies:
- dependency-name: tauri-plugin-dialog
  dependency-version: 2.7.0
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2026-05-17 18:07:53 -04:00
dependabot[bot] 98e7eeda42
chore(deps): bump ruvector-core from 2.0.5 to 2.2.0 in /v2 (#479)
Bumps [ruvector-core](https://github.com/ruvnet/ruvector) from 2.0.5 to 2.2.0.
- [Release notes](https://github.com/ruvnet/ruvector/releases)
- [Changelog](https://github.com/ruvnet/RuVector/blob/main/CHANGELOG.md)
- [Commits](https://github.com/ruvnet/ruvector/compare/v2.0.5...v2.2.0)

---
updated-dependencies:
- dependency-name: ruvector-core
  dependency-version: 2.2.0
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2026-05-17 18:07:37 -04:00