Commit Graph

9 Commits

Author SHA1 Message Date
ruv 22ca3da48c chore(cogs): publish cog-person-count + cog-pose-estimation 0.3.0 to crates.io
- cog-person-count: no path deps, clean publish.
- cog-pose-estimation: added explicit version="0.3.1" to the
  wifi-densepose-train path dep (crates.io rejects path-only deps).
- cog-ha-matter: keeps publish=false; the published
  wifi-densepose-sensing-server@0.3.0 does not expose the `mqtt` feature
  this cog requires. Note added inline; republish sensing-server with the
  feature exposed before dropping the flag.

Co-Authored-By: claude-flow <ruv@ruv.net>
2026-05-25 10:52:47 -04:00
rUv 004a63e82d
fix(security): audit — fix RUSTSEC vulns, clippy warnings, dead code (#769)
- Upgrade openssl to 0.10.78 (CVE-2026-41676), jsonwebtoken to 9.4
- Suppress unmaintained-only/no-CVE advisories in .cargo/audit.toml
  with per-entry rationale
- Fix all `cargo clippy --all-targets -- -D warnings` errors across
  35 crates: derivable_impls, needless_range_loop, map_or→is_some_and/
  is_none_or, await_holding_lock (drop MutexGuard before .await),
  ptr_arg (&mut Vec→&mut [T]), useless_conversion, approximate_constant
  (2.718→E, 3.14→PI), field_reassign_with_default, manual_inspect,
  useless_vec, lines_filter_map_ok, print_literal, dead_code
- Apply `cargo fmt --all`
- Pre-existing test failure in wifi-densepose-signal
  (test_estimate_occupancy_noise_only) is not introduced by this PR
2026-05-23 05:36:13 -04:00
rUv a85d4e31e4
research(sota): kick off SOTA research loop + first R5 saliency measurement (#702)
Sets up docs/research/sota-2026-05-22/ as the autonomous-research
output dir, with PROGRESS.md as the canonical 15-vector research
agenda spanning spatial intelligence, RF features, RSSI-only, and
exotic/long-horizon verticals. Cron d6e5c473 (*/10 * * * *) picks
threads from this file and self-terminates at 2026-05-22 08:00 ET.

First concrete contribution this tick — R5 subcarrier saliency:

* examples/research-sota/r5_subcarrier_saliency.py: pure-numpy port
  of the count cog's Conv1d encoder + count head, computes per-
  subcarrier input×gradient saliency via central-difference. 128
  samples × 56 subcarriers × 2 forward passes/subcarrier ≈ ~3 s on
  CPU, no GPU or framework dependency.
* docs/research/sota-2026-05-22/R5-subcarrier-saliency.md: research
  note with motivation, method, novelty argument, and the first
  measured ranking. Top-8 subcarriers for cog-person-count v0.0.2:
  [41, 52, 30, 31, 10, 35, 2, 38]. Max/mean ratio 2.85x.
* v2/crates/cog-person-count/cog/artifacts/saliency.json: machine-
  readable per-subcarrier saliency + top-K lists, so future-tick
  experiments (retrain at K=8/16/32) consume it without re-running.

Key insight from the first measurement: top-8 saliency is *band-
spread* (indices span 2-52), not concentrated. This directly raises
R8's (RSSI-only) feasibility ceiling, because RSSI is a band-
aggregate — it retains the integral of a band-spread signal. First-
order estimate: RSSI-only should hit ~60% of full-CSI accuracy for
the count task. R7 (adversarial defence) inherits a concrete defender-
priority list: corroborate these 8 subcarriers across nodes.

This commit is the first of many short, focused contributions over
the next ~12 hours. PROGRESS.md is the canonical pointer for the
next tick to pick up the next thread.
2026-05-21 23:05:55 -04:00
rUv b3a5012dbd
feat(cog-person-count): v0.0.2 — K-fold + label-smoothing + temperature-calibrated (#699)
* chore: stage v0.0.2 artifacts + temperature scalar for build pipeline

Stages count_v1.{safetensors,onnx,temperature,train_results.json}
ahead of the build/sign/upload step. This commit is a momentary
side-effect — the next commit will refresh the per-arch manifests
with the new binary SHAs once ruvultra finishes the cross-build.

The .temperature file holds the calibration scalar from LBFGS over the
held-out conf logits. The Rust cog will read it post-load and divide
conf_logits by it before sigmoid, exactly matching the Python eval.

* feat(cog-person-count): v0.0.2 — K-fold validated, label smoothing + early stop + temp scale

The v0.0.1 "65.1% but class-1=0%" result was an unlucky temporal split
that let a degenerate "always predict 0" classifier hit eval acc =
class-0 fraction. 5-fold stratified random CV proved the architecture
actually learns ~57.1% class-1 accuracy under fair splits — a real,
modestly useful signal.

v0.0.2 ships a retrained model that:

* **Splits randomly (seed=42) 80/20** instead of temporally — eliminates
  the trailing-window-class-imbalance cheat.
* **Class-balanced sampler** (multinomial with replacement, weighted by
  inverse class frequency) — per-batch expected counts are equal
  regardless of dataset distribution.
* **Label smoothing 0.1** on the cross-entropy — reduces confidence
  saturation that drove v0.0.1's all-or-nothing predictions.
* **Early stopping** with patience=20 — stops at epoch 29 instead of
  overfitting through 400.
* **Temperature scaling** of the conf head — LBFGS fits a scalar T on
  held-out conf logits; ships as a count_v1.temperature sidecar so the
  Rust cog can divide conf_logits by T before sigmoid.

Numbers on the same data:

  | Metric           | v0.0.1 | v0.0.2 | K-fold (5x100) |
  |------------------|--------|--------|----------------|
  | Overall acc      | 65.1%  | 62.3%  | 62.2% ± 1.9%   |
  | Class 0 acc      | 100%   | 86.2%  | 67.4%          |
  | Class 1 acc      |  0%    | 34.3%  | 57.1% ✓        |
  | MAE              | 0.349  | 0.377  | 0.378          |
  | Spearman         | 0.023  | 0.013  | 0.160          |

Class-1 accuracy 0 → 34.3% is the headline win. Net acc moves slightly
because we stopped cheating on class 0. K-fold's 57% says there's
headroom remaining; reaching it needs more independent splits (== more
data), not more training tricks.

Confidence calibration didn't move. Temperature scaling alone can't fix
a confidence head trained against a noisy argmax==truth indicator over
a 62%-accurate classifier — the head's training signal is the issue,
not its post-hoc transform. The honest fix is multi-room data (#645),
not another calibration knob.

Live on cognitum-v0 at /var/lib/cognitum/apps/person-count/ — health
reports candle-cpu backend, count = 1 (was 0 in v0.0.1) on synthetic
zero input.

Files changed:
* scripts/train-count.py — adds --k-fold (no sklearn dep, hand-rolled
  stratified splits with deterministic shuffle) and --v2 paths.
* v2/.../cog/artifacts/count_v1.safetensors (392 KB, new sha
  32996433…) + count_v1.onnx (16 KB) + count_v1.temperature (0.9262
  scalar) + count_train_results.json (full epoch trace).
* v2/.../cog/artifacts/manifests/{arm,x86_64}/manifest.json bumped to
  version 0.0.2 with the new weights_sha256 + caveats.
* docs/benchmarks/person-count-cog.md — appends a v0.0.2 section
  with the K-fold diagnostic table and honest-read paragraph.

GCS:
  gs://cognitum-apps/cogs/arm/cog-person-count-count_v1.safetensors
    refreshed (binaries unchanged — load weights via mmap at runtime).
2026-05-21 19:47:04 -04:00
rUv e6a5df36eb
chore(cog-person-count): refresh GCS manifests after run-wiring rebuild (#698)
The arm + x86_64 manifests committed in #696 referenced the binaries
built before #697 wired the `run` subcommand. Rebuilt + re-signed +
re-uploaded to GCS, and re-deployed to cognitum-v0:

  arm    sha 15c2fbac…7728ea5  (3,807,456 B, up from 2,168,816 — added Tokio runtime)
  x86_64 sha 051614ce…cc8388b3 (4,502,960 B, up from 2,615,528)

Both re-signed Ed25519 with COGNITUM_OWNER_SIGNING_KEY. Manifests
now match the binaries published at gs://cognitum-apps/cogs/{arm,
x86_64}/cog-person-count-* and the binary installed at
/var/lib/cognitum/apps/person-count/ on cognitum-v0.
2026-05-21 19:13:10 -04:00
rUv 5c914e63c7
feat(cog-person-count): wire `run` subcommand — v0.0.1 fully functional (#697)
Phase 4 of ADR-103. Adds the long-running polling loop so the cog's
fourth verb (`run`) does real work, completing the ADR-100 runtime
contract end-to-end:

  cog-person-count version    → "person-count 0.3.0"
  cog-person-count manifest   → JSON skeleton
  cog-person-count health     → loads weights + 1-shot infer + emit
  cog-person-count run --config  → long-running per-frame emit  ← THIS

What ships:

* src/runtime.rs (new) — `run_loop` polls sensing_url every poll_ms,
  slides a [56, 20] CSI window, runs InferenceEngine::infer, emits
  publisher::person_count events. Same shape as
  cog-pose-estimation::runtime — fetch_frame extracts amplitudes
  from `snapshot.nodes[0].amplitude[]`, fails open on connect errors
  with a WARN log rather than crashing.
* src/lib.rs — registers the runtime module.
* src/main.rs — cmd_run now loads RunConfig from a JSON file, builds
  the InferenceEngine (with weights if cfg.model_path is set,
  otherwise auto-discover), emits a run.started event, and hands off
  to the Tokio multi-thread runtime's block_on(run_loop). Single-node
  fusion is a no-op for N=1 today; v0.2.0 will append predictions
  from sibling nodes and call fusion::fuse_confidence_weighted before
  emit.

Verified locally:

  cargo check  -p cog-person-count --no-default-features   → clean
  cargo test   -p cog-person-count                          → 15/15 pass (no regressions)
  cargo build  -p cog-person-count --release                → 2.36 MB unchanged
  ./cog-person-count run --config bad-config.json:
    line 1: {"event":"run.started","fields":{"cog":"person-count",
             "sensing_url":"http://127.0.0.1:9999/...",poll_ms:100,
             "model_path":"(auto-discover)"}}
    line 2: WARN sensing-server fetch failed
            error=Connection Failed: Connect error: actively refused
    (loop alive — exits cleanly on SIGTERM, no crash, no NaN)

Also adds a "Relationship to the in-process score_to_person_count
heuristic" section to cog/README.md explaining the dual-emitter
design (sensing-server keeps emitting the PR #491 slot heuristic;
the cog runs out-of-process and emits person.count events from the
learned model). Operators choose by installing the cog or not — no
sensing-server rebuild required.

ADR-103 §"Migration" status:
  1. Land ADR + scaffold ........... done (#693, #694)
  2. Train count_v1 ................ done (#695)
  3. Cross-compile + sign + GCS .... done (#696)
  4. Server-side wiring ............ done — out-of-process design
                                      means no rewire needed; this
                                      cog is the wiring.
  5. v0.2.0 multi-room + LoRA ...... data-bound (#645)
2026-05-21 19:10:15 -04:00
rUv a5e99670f8
feat(cog-person-count): release v0.0.1 — signed binaries on GCS, live on cognitum-v0 (#696)
Phase 3 of ADR-103. Cross-compiled aarch64 + x86_64 on ruvultra, signed
with COGNITUM_OWNER_SIGNING_KEY (Ed25519), uploaded to GCS, and live-
installed on the cognitum-v0 Pi 5 alongside cog-pose-estimation.

Real-hardware bench on cognitum-v0:
  ./cog-person-count-arm health
  → backend=candle-cpu, count=0, confidence=0.49, p95=[0,7]
  30 sequential health invocations: 0.276 s → 9.2 ms/invocation cold

Compares to cog-pose-estimation's 8.4 ms — count cog is ~10% slower
because the dual-head (count softmax + confidence sigmoid) does ~2x
the work after the shared encoder.

GCS release artifacts (publicly downloadable, SHA-verified):
  arm/cog-person-count-arm                          2,168,816 B
    sha:  36bc0bb0...0d47b507b3c3
    sig:  R/00xdzHriyr/2r...JK+a6k71NDg==  (Ed25519)
  x86_64/cog-person-count-x86_64                    2,615,528 B
    sha:  76cdd1ec...3923 7392b01db
    sig:  QB+8cnGSMQmu...ZtTNIQ2rDg==  (Ed25519)
  arm/cog-person-count-count_v1.safetensors           392,088 B
    sha:  dacb0551...e6e04ff56d15c3a65a9ff

Live install at /var/lib/cognitum/apps/person-count/ on cognitum-v0
matches the layout of every other installed cog (anomaly-detect,
seizure-detect, pose-estimation): cog-person-count-arm binary,
count_v1.safetensors weights, manifest.json, config.json.

Adds:
* v2/.../cog/artifacts/manifests/{arm,x86_64}/manifest.json — full
  ADR-100 schema with all fields filled (sha + sig + size + URL +
  build_metadata carrying the v0.0.1 honest training caveats).
* docs/benchmarks/person-count-cog.md — appends "Live appliance
  install" and "Signed GCS release artifacts" sections to the
  benchmark log.

Honest v0.0.1 caveat still applies (class-1 accuracy 0% on the held-
out tail of the single-session training data) — same data-bound
limit as pose_v1. The shipped artifact is the *vehicle*; production-
quality accuracy follows from multi-room paired data per ADR-103's
v0.2.0 plan + #645.
2026-05-21 19:02:26 -04:00
rUv 6b4994e105
feat(cog-person-count): train count_v1.safetensors — honest v0.0.1 (ADR-103) (#695)
Phase 2 of ADR-103: trained count head on the existing 1,077 paired
samples (the same data that produced pose_v1 yesterday).

Honest result: 65.1% eval accuracy / 100% within ±1 / MAE 0.349 on
the held-out time-window. Per-class: 100% on "empty room" / 0% on
"1 person". The model overfit by epoch 100 (train_acc → 1.0,
eval_loss climbed 0.67 → 7.8) and the "best" checkpoint is the
snapshot that happened to predict the eval window's class
distribution (140/215 = 65.1%, matches eval_acc exactly). Confidence
head Spearman = 0.023 ⇒ uncalibrated. Same data-bound failure mode
as pose_v1 (#645), bounded by single-session training data; same
fix path (multi-room).

What v0.0.1 still validates end-to-end:
* PyTorch → safetensors → Candle Rust loads cleanly on first try.
  `cog-person-count health` reports `backend: candle-cpu` and emits
  real per-frame predictions instead of the stub backend's hard-coded
  {1 person, 0 confidence}. Architecture parity between train-count.py
  and src/inference.rs::CountNet is bit-exact.
* ONNX export bit-clean (16 KB, opset 18, dynamic batch axis).
* Training wall time: 5.6 s for 400 epochs on RTX 5080.
* Binary size unchanged (2.36 MB stripped), model loads via mmap at
  runtime.

This commit ships:

* scripts/align-ground-truth.js: extended to emit n_persons_mode +
  n_persons_max per window so the training pipeline has count
  labels. Backwards-compatible (additive fields).
* scripts/train-count.py: new — mirrors CountNet architecture
  exactly, loads paired.jsonl, trains 400 epochs with
  CE+BCE+Brier loss, exports safetensors + ONNX + per-epoch JSON.
* v2/.../cog/artifacts/{count_v1.safetensors,count_v1.onnx,
  count_train_results.json}: the trained artifacts.
* v2/.../cog/README.md: Status table updated with the v0.0.1 numbers
  + an Honest Caveat section explaining the data-bound result.
* docs/benchmarks/person-count-cog.md: new — full v0.0.1 benchmark
  log mirroring the format docs/benchmarks/pose-estimation-cog.md
  established. Includes comparison to ADR-103 v0.1.0 acceptance
  gates and per-class breakdown.

Still pending:
* `run` subcommand wiring (long-running polling loop, same as pose)
* Cross-compile + sign + GCS upload (mirror of pose cog pipeline)
* Live install on cognitum-v0
* v0.2.0: re-train on multi-room data, LoRA per-room adapters,
  Stoer-Wagner min-cut clip in fusion stage
2026-05-21 18:56:52 -04:00
rUv 6959a42312
feat(cog-person-count): v0.0.1 scaffold + tests + fusion math + bench (ADR-103) (#694)
First implementation PR for ADR-103. Same incremental shape that
ADR-101 used: scaffold the cog crate, ship a stub-backend release
that satisfies the runtime contract + 15 tests + measured cold-start,
then follow up with the trained count_v1.safetensors in a separate PR.

What ships:

* v2/crates/cog-person-count/ — new workspace member.
    - Cargo.toml: candle-core/candle-nn 0.9 (cpu default, cuda feature
      opt-in), safetensors, ureq, sha2 — same dep shape as the pose cog
      but minus wifi-densepose-train (this cog has no training-side
      consumer, so the dep tree is materially smaller → 2.36 MB
      binary vs the pose cog's 4.5 MB).
    - src/inference.rs: CountNet (Conv1d 56→64→128→128 encoder + count
      head Linear(128→64→8)+softmax + confidence head
      Linear(128→32→1)+sigmoid). Stub backend returns
      `{1-person, 0-confidence}` honestly when no safetensors present.
    - src/fusion.rs: fuse_confidence_weighted() — Bayesian product of
      per-node distributions with confidence-weighted log-sum, plus
      fuse_with_mincut_clip() hook for the v0.2.0 Stoer-Wagner
      upper-bound (`ruvector-mincut` dep lands when min-cut graph
      builder is ready). Confidences floored at 1e-3 and probs floored
      at 1e-9 before logs — no NaN propagation.
    - src/publisher.rs: emits {count, confidence, count_p95_low,
      count_p95_high, n_nodes, probs} per ADR-103 §"Output".
    - src/main.rs: full ADR-100 four-verb CLI (version|manifest|health
      |run). The `run` subcommand explicitly returns "wiring pending
      v0.0.1" so the in-process library API is the v0.0.1-clean
      integration path.
    - tests/smoke.rs (8 tests) + fusion::tests (7 tests, in-lib) — 15
      total, all green. Cover stub-backend behaviour, wrong-shape
      rejection, fusion math (empty / single / agreement / high-conf
      override / normalisation), p95-range correctness, and min-cut
      clip semantics.
    - cog/{manifest.template.json, config.schema.json, README.md} +
      cog/artifacts/ placeholder dir.

* v2/Cargo.toml: registers the new workspace member.

Verified locally:

  cargo check -p cog-person-count --no-default-features    → clean
  cargo test  -p cog-person-count --no-default-features    → 8/8 pass
  cargo test  -p cog-person-count --lib                    → 7/7 pass
  cargo build -p cog-person-count --release                → 2.36 MB binary
  ./cog-person-count version                               → "person-count 0.3.0"
  ./cog-person-count manifest                              → JSON skeleton
  ./cog-person-count health                                → backend:stub,
                                                              count:1, conf:0,
                                                              p95:[1,1]
  Cold-start: 30 sequential `health` invocations → 53.3 ms/invocation
              (vs cog-pose-estimation's 76.2 ms — smaller dep tree)

cog/README.md adds:

* Security section — six-row threat table covering safetensor mmap
  trust, non-finite outputs, sensing fetch failures, fusion
  divide-by-zero / log-of-zero, min-cut degenerate cases, and stdout
  spoofing.
* Performance / optimization section — binary size, release profile
  (already opt-level=3 / lto=fat / codegen-units=1 / strip=true at
  workspace level), cold-start comparison table, projected warm-path
  latency budget.

Still pending (separate PRs, ADR-103 §"Migration"):

* Train count_v1.safetensors on the existing 1,077 paired samples
  with `n_persons` labels (Candle on RTX 5080, same script that
  produced pose_v1.safetensors yesterday).
* `run` subcommand wiring (long-running polling loop, same shape as
  cog-pose-estimation::runtime).
* Cross-compile + sign + GCS upload (mirror of cog-pose-estimation
  release pipeline).
* Server-side `csi.rs::score_to_person_count` call-site rewire to
  consume this cog when installed; falls back to PR #491's heuristic
  when not.
2026-05-21 18:46:57 -04:00