This commit is contained in:
ruvnet 2026-06-10 19:35:47 +00:00
parent 4bdb31b5cf
commit 5808507cc5
4 changed files with 598 additions and 0 deletions

View File

@ -0,0 +1,260 @@
# ADR-151: RuView Per-Room Calibration & Specialized Model Training System
| Field | Value |
|-------|-------|
| **Status** | Accepted — Stages 15 implemented (statistical specialists); HF-backbone distillation pending |
| **Date** | 2026-06-09 |
| **Deciders** | ruv |
| **Codebase target** | New `wifi-densepose-calibration` crate (orchestration); `wifi-densepose-train` (`rapid_adapt.rs`, `signal_features.rs`, `trainer.rs`); `wifi-densepose-ruvector` (RVF specialist storage); `wifi-densepose-signal/ruvsense/*` (feature extractors); `wifi-densepose-cli` (`enroll`, `train-room`, `room-status` subcommands) |
| **Relates to** | ADR-135 (Empty-Room Baseline Calibration), ADR-030 (Persistent Field Model), ADR-134 (CIR), ADR-024 (Contrastive CSI Embedding / AETHER), ADR-027 (Cross-Environment Domain Generalization / MERIDIAN), ADR-070 (Self-Supervised Pretraining), ADR-105 (Federated CSI Training), ADR-149 (AetherArena / Hugging Face), ADR-150 (RF Foundation Encoder) |
---
## 1. Context
### 1.1 The thesis — teach the room before you teach the model
RuView's deployment frontier is not a better generic model. ADR-150 documents the wall directly: an MM-Fi pose head scores **81.63% torso-PCK@20 in-domain but ~11.6% leakage-free cross-subject**, and bigger capacity *hurts* cross-subject (transformer 24.8% < conv 27.3%). A single oversized model that "understands the world" overfits the rooms and bodies it has seen. The lever is the opposite of scale: **a small model that understands *one* room and *one* person**, calibrated in minutes, run locally, and specialised per biological signal.
This positions RuView between the two incumbents in ambient sensing:
- **Wearables** — high fidelity, but people forget to wear them, and they only measure the wearer.
- **Cameras** — powerful, but invasive, store identifiable video, and fail in the dark / under covers.
RuView sits in the middle: it learns the *space*, learns the *person*, and tracks biological rhythm (breathing, heartbeat, restlessness, posture, presence) without seeing skin or storing video. Heartbeat and breathing are not visual problems — they are tiny, repeating disturbances in the RF field. Capturing them well is a *calibration* problem, not a *model-size* problem.
### 1.2 What already exists (and what is missing)
The pieces of a calibration→training pipeline exist as disconnected modules. There is no system that runs them end to end and emits a per-room model bank.
| Capability | Status today | Gap |
|------------|--------------|-----|
| Empty-room baseline (environmental fingerprint) | ADR-135 `BaselineCalibration` (Proposed): per-subcarrier amplitude + circular-phase stats, `ruvcal` NVS namespace | Captures the *room*, but there is no step that captures *guided human anchors* on top of it |
| Field eigenstructure | ADR-030 `field_model.rs` (SVD room eigenmodes) | Consumes calibration; not wired to a training trigger |
| Shared invariant backbone | ADR-150 RF Foundation Encoder (pose-preserving, subject/room/device-invariant) | Defined as a *foundation* embedding; nothing distills it into per-room specialists |
| Few-shot adaptation | `train/src/rapid_adapt.rs` — test-time training → LoRA weight deltas (MERIDIAN P5) | Produces a *single* pose-adaptation delta, not a bank of per-modality specialists |
| Feature extractors | `ruvsense/{bvp,longitudinal,intention,gesture,pose_tracker,adversarial}.rs`, `train/src/signal_features.rs` | Each emits a signal; none is packaged as a labelled training source for enrollment |
| Small-model storage | `wifi-densepose-ruvector` (RVF cognitive containers, HNSW, sketch) | No schema for "a bank of specialist models scoped to a room_id" |
| HF publishing | ADR-149 AetherArena (Hugging Face Space + signed scorer), `sensing-server` `from_pretrained` path | Publishes/評価s a *global* model; no notion of a published *base* + private *local* heads |
**The missing system is the connective tissue**: a guided enrollment protocol, a feature-extraction-to-label bridge, a specialist-bank trainer that reuses the frozen HF backbone, and a runtime that fuses the specialists with confidence gating. This ADR defines that system.
### 1.3 The four-step user model (and where each step lands)
The system is deliberately presented to operators as four plain steps. Each maps to existing or new code:
1. **Capture a quiet baseline** — no people, just room/router/reflections/noise/drift → the *environmental fingerprint*. → **Reuse ADR-135** `BaselineCalibration` + **ADR-030** field eigenmodes. No new capture code; the calibration crate calls it.
2. **Capture guided samples** — stand, sit, lie down, slow vs normal breathing, small movement, sleep posture. Clean anchors, not hours of data. → **NEW** `EnrollmentProtocol` (Section 2.2).
3. **Extract the useful signal** — CSI phase, amplitude, Doppler shift, micro-motion, periodicity, variance, timing. → **Reuse** `signal_features.rs` + ruvsense extractors, packaged as labelled `AnchorFeature` records (Section 2.3).
4. **Compress patterns into small ruVector models***specialised* per signal: breathing, heartbeat, sleep restlessness, posture, presence, anomaly. → **NEW** `SpecialistBank` trained via `rapid_adapt` LoRA heads over the frozen ADR-150 backbone, stored as RVF (Section 2.4).
---
## 2. Decision
**Build the RuView Per-Room Calibration & Specialized Model Training System: a four-stage, local-first pipeline (`baseline → enroll → extract → train`) that produces a versioned *bank of small specialised ruVector models* scoped to one `room_id`, each a lightweight head distilled/adapted from the frozen, Hugging-Face-published RF Foundation Encoder (ADR-150).** Big model understands the world; small ruVector models understand *your room*.
Two invariants govern every design choice below:
> **(A) Specialisation over scale.** One small model per biological signal, not one large model for all of them. Each specialist is faster, cheaper, more private, and — because it is calibrated to the room's actual fingerprint — often *more accurate* than a general model.
>
> **(B) Local-first, base-shared.** The frozen room/subject/device-invariant backbone is the only artifact published to Hugging Face. Per-room baselines and per-specialist heads never leave the device unless the operator opts into federation (ADR-105).
### 2.1 System architecture
```
HUGGING FACE HUB (public, room-agnostic)
┌───────────────────────────────────────┐
│ RF Foundation Encoder (ADR-150) │
│ pose-preserving · subject/room/device │
│ -invariant · frozen · safetensors │
└───────────────┬───────────────────────┘
│ from_pretrained() once, cached on device
STAGE 1 baseline STAGE 2 enroll STAGE 3 extract STAGE 4 train (per room_id)
┌──────────────┐ ┌──────────────┐ ┌────────────────┐ ┌─────────────────────────┐
│ ADR-135 │ │ Enrollment │ │ signal_features│ │ SpecialistBank │
│ Baseline- │──fp──► │ Protocol │─clip►│ + ruvsense │─AF──►│ frozen backbone │
│ Calibration │ │ guided │ │ extractors │ │ │ ┌────────────────┐ │
│ (env finger- │ │ anchors: │ │ → AnchorFeature│ │ ├─►│ breathing head │ │
│ print) │ │ stand/sit/ │ │ (phase, amp, │ │ ├─►│ heartbeat head │ │
│ ADR-030 │ │ lie/breathe/ │ │ doppler, │ │ ├─►│ restless head │ │
│ field eigen │ │ move/sleep │ │ micromotion, │ │ ├─►│ posture head │ │
└──────────────┘ └──────────────┘ │ periodicity, │ │ ├─►│ presence head │ │
│ │ variance, │ │ └─►│ anomaly head │ │
│ baseline drift > τ → invalidate bank │ timing) │ │ (LoRA / ruVector │
└───────────────────────────────────────┴────────────────┴──────┤ small models) │
└───────────┬─────────────┘
│ RVF container
RUNTIME: Mixture-of-Specialists
each head emits {value, confidence};
coherence_gate (ADR-135) + anomaly
head veto → fused RoomState
```
The shared backbone is loaded **once per device** and frozen. Every specialist is a small head over its embedding — so the marginal cost of a sixth specialist is kilobytes of LoRA weights, not another full model.
### 2.2 Stage 2 — the guided enrollment protocol (NEW)
`EnrollmentProtocol` is a CLI-driven state machine that walks the operator through a fixed sequence of labelled **anchors**. The design rule from the user vision is explicit: *clean anchors, not hours of data.* Each anchor is a short (default 20 s @ 20 Hz = 400 frames) labelled clip captured against the already-recorded baseline.
| Anchor | Label | Duration | Primary signal taught | Feature emphasis |
|--------|-------|----------|-----------------------|------------------|
| `empty` | presence=0 | (reuse ADR-135 baseline) | absence reference | amplitude variance floor |
| `stand_still` | posture=standing, presence=1 | 20 s | static human load | amplitude mean shift, eigenmode delta |
| `sit` | posture=sitting | 20 s | lower static load | amplitude profile |
| `lie_down` | posture=lying | 20 s | sleep-position load | amplitude profile, low Doppler |
| `breathe_slow` | resp≈0.10.15 Hz | 30 s | slow respiration | periodicity, micro-Doppler |
| `breathe_normal` | resp≈0.20.3 Hz | 30 s | normal respiration | periodicity, BVP phase |
| `small_move` | motion=1 | 20 s | limb micro-motion | Doppler spread, variance |
| `sleep_posture` | posture=lying, restless=0 | 30 s | quiescent sleep baseline | long-window variance, timing |
The protocol is **adaptive**: an anchor is only accepted when its captured features pass a quality gate (coherence ≥ threshold from `coherence_gate.rs`, sufficient SNR vs baseline, no saturation). A failed anchor is re-prompted rather than silently kept — bad anchors poison small models far more than large ones. Total guided enrollment is ~4 minutes of wall-clock, producing 8 clean anchors. This is intentionally far below the "hours of data" that a from-scratch model needs, because the backbone already carries world knowledge; enrollment only teaches *this* room's offsets.
Anchors are persisted as an append-only `EnrollmentSession` (event-sourced, per CLAUDE.md state rules) under `room_id`, so re-enrollment is incremental and auditable.
### 2.3 Stage 3 — feature extraction to labelled records (REUSE + bridge)
Each accepted anchor clip is run through the existing extractor stack, baseline-subtracted per ADR-135, and packaged into an `AnchorFeature` record. No new DSP is invented — this stage is a *bridge*, not a new algorithm.
| Feature group | Source module | Used by specialists |
|---------------|---------------|---------------------|
| CSI amplitude mean/variance | ADR-135 baseline subtraction + `signal_features.rs` | presence, posture |
| CSI phase (sanitised, LO-aligned) | `phase_sanitizer``phase_align` | posture, heartbeat |
| Doppler shift / micro-Doppler | `ruvsense/bvp.rs`, `breathing` path | breathing, small-move |
| Micro-motion / intention lead | `ruvsense/intention.rs` | restlessness, anomaly |
| Periodicity / spectral peaks | `bvp.rs` autocorrelation + FFT | breathing, heartbeat |
| Long-window variance / drift | `ruvsense/longitudinal.rs` (Welford) | restlessness, presence |
| Timing / inter-frame epoch | `c6_timesync` epoch, frame Δt | all (rhythm alignment) |
| Field eigenmode coefficients | ADR-030 `field_model.rs` | posture, presence |
`AnchorFeature` = `{ room_id, anchor_label, t_epoch_us, embedding: [f32; D] (backbone output), aux: { resp_hz?, doppler_spread, variance, periodicity_score, eigen_coeffs } }`. The backbone embedding is the *shared* representation; `aux` carries the cheap hand-features that let small heads specialise without re-learning DSP.
### 2.4 Stage 4 — the specialist bank (NEW, the core contribution)
A **`SpecialistBank`** is a versioned collection of small models scoped to one `room_id`, persisted as a single RVF cognitive container (`wifi-densepose-ruvector`). Each specialist is a *head* over the frozen backbone embedding, trained from the labelled `AnchorFeature` records via the existing `rapid_adapt.rs` LoRA machinery (test-time/few-shot training, contrastive + entropy losses), **not** a from-scratch network.
| Specialist | Model type | Params (typ.) | Label source | Output |
|------------|-----------|---------------|--------------|--------|
| **breathing** | 1-D temporal head + periodicity regressor | ~8 KB LoRA + aux | `breathe_slow`/`breathe_normal` | resp rate (Hz) + confidence |
| **heartbeat** | narrowband phase head (harmonic-aware) | ~12 KB | quiescent anchors + periodicity | HR (bpm) + confidence |
| **sleep restlessness** | variance/drift classifier | ~4 KB | `sleep_posture` vs `small_move` | restlessness score [0,1] |
| **posture** | k-way prototype classifier (HNSW NN) | prototypes only | `stand/sit/lie` anchors | posture class + margin |
| **presence** | binary energy/eigenmode gate | ~2 KB | `empty` vs occupied anchors | presence prob |
| **anomaly** | one-class / physically-impossible detector (`adversarial.rs`) | ~6 KB | baseline + all anchors (novelty) | anomaly score + veto flag |
Design properties that follow from invariant (A):
- **Independently versioned & swappable.** Re-enrolling breathing does not retrain posture. A specialist carries its own `{trained_at, anchor_set_hash, baseline_hash, backbone_rev}`.
- **HNSW prototype storage for the classifiers.** Posture and presence are nearest-prototype lookups in the RVF index — no inference engine, microsecond latency, and new postures are added by inserting a prototype, not retraining.
- **SONA online adaptation.** Each specialist may carry a SONA/MicroLoRA online-adaptation slot (`ruvllm_sona_*` / `microlora` primitives) so it tracks slow drift (furniture moved, seasonal RF change) between full re-enrollments, gated by ADR-135 baseline drift.
- **Teacherstudent distillation (optional, offline).** Where a labelled public corpus exists (MM-Fi, Wi-Pose), the ADR-150 backbone acts as teacher to pre-shape a head before per-room fine-tuning, improving cold-start. The *teacher* is global/HF; the *student head* is local.
**Invalidation contract.** The bank stores the `baseline_id` (the baseline UUID) it was trained against. **As implemented**, the runtime marks the bank `STALE` whenever the *current* baseline id differs from the trained one — a conservative trigger that catches re-calibration (room rearranged, AP moved, band changed) because any of those produces a new baseline. A finer **drift-threshold** trigger (mark STALE when ADR-135's per-subcarrier deviation exceeds τ *without* a full re-baseline) is a planned refinement (P6). Either way the runtime prompts re-enrollment rather than emitting silently wrong vitals — the calibration analogue of the #954 `DEGRADED` honesty rule: never report confident numbers from an invalid model.
### 2.5 Runtime — mixture of specialists with confidence gating
At inference, the frozen backbone embeds each CSI window once; every specialist consumes that shared embedding and emits `{value, confidence}`. Fusion rules:
- The **anomaly** specialist holds a **veto**: a high anomaly score (physically-impossible signal per `adversarial.rs`, or a coherence-gate `Reject`) suppresses positive vitals/posture output and raises a flag, rather than propagating a hallucinated reading.
- **presence=0** short-circuits breathing/heartbeat/posture to `null` (you cannot have a respiration rate in an empty room).
- Each emitted reading is tagged with the specialist's confidence and the `baseline_hash`/`backbone_rev` provenance, so downstream consumers (sensing-server, MQTT, Home Assistant) can gate on quality — consistent with ADR-135 coherence-gate semantics.
### 2.6 Crate & module layout
New bounded-context crate `wifi-densepose-calibration` (orchestration only; files < 500 lines, typed public APIs, event-sourced sessions per CLAUDE.md):
```
wifi-densepose-calibration/
src/
lib.rs # public API: CalibrationSystem facade
enrollment.rs # EnrollmentProtocol state machine (Stage 2)
anchor.rs # Anchor, EnrollmentSession (event-sourced)
extract.rs # AnchorFeature bridge over signal_features + ruvsense (Stage 3)
specialist.rs # Specialist trait, SpecialistKind enum
bank.rs # SpecialistBank (RVF container, versioning, invalidation)
runtime.rs # MixtureOfSpecialists fusion + veto (Stage 5)
backbone.rs # frozen ADR-150 encoder loader (hf_hub from_pretrained, cached)
error.rs
```
Dependencies (no duplication — orchestrates existing crates): `wifi-densepose-signal` (ruvsense extractors, ADR-135 baseline), `wifi-densepose-train` (`rapid_adapt`, `signal_features`, `trainer`), `wifi-densepose-ruvector` (RVF, HNSW), `wifi-densepose-nn` (backbone inference). The `wifi-densepose-cli` gains `enroll`, `train-room`, and `room-status` subcommands, sequenced after the existing ADR-135 `calibrate`.
### 2.7 CLI flow (operator-facing)
```bash
# Stage 1 — environmental fingerprint (ADR-135, existing)
wifi-densepose calibrate --room living-room --duration 60s # empty room
# Stage 2+3 — guided enrollment (NEW); prompts through 8 anchors, ~4 min
wifi-densepose enroll --room living-room
# → "Stand still in view of the sensor…" [✓ anchor accepted: coherence 0.91]
# → "Sit down…" [✗ low SNR, retrying]
# ...
# Stage 4 — train the specialist bank (NEW); reuses cached HF backbone
wifi-densepose train-room --room living-room \
--specialists breathing,heartbeat,restlessness,posture,presence,anomaly
# Status / invalidation
wifi-densepose room-status --room living-room
# baseline: fresh (drift 0.04 < 0.20) · backbone: rf-foundation@1.2.0
# breathing ✓ trained 2026-06-09 conf p50 0.88
# heartbeat ✓ trained 2026-06-09 conf p50 0.71
# posture ✓ 3 prototypes (stand/sit/lie)
# anomaly ✓ · presence ✓ · restlessness ✓
```
---
## 3. Consequences
### 3.1 Positive
- **Fidelity through specialisation.** Six small calibrated heads beat one oversized general model on the cross-room/cross-subject frontier that ADR-150 quantified — and each runs in microseconds-to-milliseconds, on-device.
- **Privacy by construction.** Only the room-agnostic backbone is public (HF). The environmental fingerprint and the person-specific heads stay local; no video, no skin, no cloud round-trip. This is the core differentiator vs cameras and the convenience differentiator vs wearables.
- **Minutes, not hours.** Because the backbone carries world knowledge, ~4 minutes of clean anchors calibrates a room. Re-enrollment is incremental.
- **Honest degradation.** The `baseline_hash` invalidation + anomaly veto mean an out-of-calibration room reports `STALE`/flagged rather than confidently wrong — the same honesty principle as the firmware `DEGRADED` flag.
- **Composable & cheap to extend.** A new biological signal = a new small head over the same embedding, not a new model.
### 3.2 Negative / risks
- **Backbone dependency.** Every specialist rides on ADR-150's encoder; its quality and revision compatibility (`backbone_rev`) are a single point of leverage. Mitigation: pin `backbone_rev` in each specialist; distillation cold-start reduces sensitivity.
- **Enrollment burden.** 4 minutes is small but non-zero, and anchor quality depends on the operator following prompts. Mitigation: adaptive re-prompting + quality gates; ship sane defaults so a partial bank (presence+posture) works after just the static anchors.
- **Heartbeat is hard.** Sub-mm chest displacement at HR frequencies is near the ESP32-S3 noise floor; the heartbeat specialist will have lower and more variable confidence than breathing. The confidence-gated runtime surfaces this rather than faking it.
- **Per-room storage proliferation.** A bank per room per person; needs a clear RVF lifecycle (list/prune/export) — handled by `bank.rs` versioning and the `room-status` CLI.
### 3.3 Alternatives considered
| Alternative | Verdict | Reason |
|-------------|---------|--------|
| One large general model for all signals | **Rejected** | The ADR-150 evidence: scale overfits rooms/subjects and collapses cross-domain; also slower, costlier, less private. Directly contradicts invariant (A). |
| Cloud training of per-room models | **Rejected** | Violates invariant (B): would ship raw CSI of a person's home/sleep to a server. Local-first is the privacy promise. Federation (ADR-105) is the *opt-in* path for shared improvement, exchanging gradients/deltas, never raw CSI. |
| Skip the backbone; train each specialist from scratch | **Rejected** | Reintroduces the "hours of data" requirement the user vision explicitly rejects, and loses cross-room priors. |
| Fold this into ADR-135 | **Rejected** | ADR-135 is *room* calibration (no humans). This ADR is *human-anchor* enrollment + model training on top of it. Distinct lifecycles, distinct invalidation; kept as separate bounded contexts. |
---
## 4. Implementation phases
| Phase | Scope | Exit criterion | Status |
|-------|-------|----------------|--------|
| **P1** | Scaffold `wifi-densepose-calibration` crate; `AnchorFeature` schema; (backbone via `hf_hub` deferred) | Crate + schema; unit tests | ✅ Done (crate + Stage-1 baseline via `calibrate`/`calibrate-serve`; HF backbone deferred) |
| **P2** | `EnrollmentProtocol` + `anchor.rs` (event-sourced sessions) + CLI `enroll` with quality gates | 8-anchor enrollment; bad anchors re-prompt | ✅ Done (`anchor.rs`, `enrollment.rs`, CLI `enroll`) |
| **P3** | `extract.rs` bridge → labelled records; baseline subtraction (ADR-135) | `AnchorFeature` records persisted per `room_id` | ✅ Done (`extract.rs`; autocorr periodicity + variance/motion) |
| **P4** | `SpecialistBank` + presence/posture (prototype) + breathing (periodicity); persistence + versioning | `train-room` produces a bank; `room-status` reads it back | ✅ Done (`specialist.rs`, `bank.rs`, CLI `train-room`/`room-status`; JSON persistence — RVF/HNSW = future) |
| **P5** | heartbeat + restlessness + anomaly specialists; `runtime.rs` mixture + veto + confidence gating | End-to-end RoomState on hardware; anomaly veto verified | ✅ Done (`runtime.rs`, CLI `room-watch`; breathing read live on COM8 ESP32) |
| **P6** | Baseline-drift `STALE` invalidation; SONA online adaptation; optional ADR-105 federation; HF teacherstudent distillation | Drift marks bank STALE; AetherArena entry | ◐ Partial (STALE done; SONA/federation/HF-backbone = follow-ups) |
**Current status (2026-06-10):** Stages 15 implemented with *statistical* specialists (threshold/prototype/autocorrelation). 55 tests (35 unit incl. multistatic + 1 full-loop integration + 19 CLI), all passing under qemu-aarch64. **Validation scope is precise:** baseline capture + HTTP API + auth are proven on real CSI (Pi-5 nexmon, 6,813 frames; and an ESP32-S3). The complete `baseline → enroll → train-room → infer` loop is now **proven in-process** on deterministic synthetic CSI (`tests/full_loop.rs`: clean baseline with zero motion flags, 8/8 anchors through the quality gate, 6 specialists trained, JSON bank round-trip, trained-bank inference 18±2 BPM positive / absent negative / foreign-baseline STALE; seed-robust). The one live runtime signal (breathing ~1631 BPM via `room-watch`) used the *stateless* breathing head, **not** a trained bank; the clean empty-room loop has **not** yet run on-target — the remaining gap is strictly the hardware session (empty room + operator anchors). The four behavioral findings from the full-loop test (z-band squeeze, variance-only presence, ungated hz embedding, heart-band lag-floor leakage) are FIXED and regression-guarded — see the integration doc §7. SOTA-intake decisions affecting this system (geometry conditioning, checkerboard alignment) are recorded in ADR-152. Open refinements: `--source-format adr018v6` (drive from the Pi's own nexmon), phase-based breathing carrier, RVF/HNSW storage, and the ADR-150 frozen HF backbone the specialists would distill from.
Validation per CLAUDE.md: `cargo test --workspace --no-default-features` green; hardware verification on the ESP32-S3 (currently COM8) before any release; witness bundle regenerated if the proof surface changes.
---
## 5. Summary
> Big models understand the world. Small ruVector models understand *your room*.
ADR-151 makes that operational: a local-first `baseline → enroll → extract → train` pipeline that turns ~4 minutes of clean human anchors — layered on ADR-135's empty-room fingerprint and ADR-150's Hugging-Face-published invariant backbone — into a versioned bank of tiny, specialised, privacy-preserving models for breathing, heartbeat, restlessness, posture, presence, and anomaly. Specialisation over scale; local heads over a shared base; honest `STALE` degradation over confident error.

View File

@ -0,0 +1,98 @@
# ADR-152: WiFi-Pose SOTA 2026 Intake — Geometry-Conditioned Calibration, External Benchmarks, and the Foundation-Encoder Training Recipe
| Field | Value |
|-------|-------|
| **Status** | Proposed |
| **Date** | 2026-06-10 |
| **Deciders** | ruv |
| **Codebase target** | `wifi-densepose-calibration` (geometry conditioning, ADR-151 Stage 2), `wifi-densepose-train` (camera-supervised path, MAE recipe), `wifi-densepose-cli` (benchmark harness), docs |
| **Relates to** | ADR-151 (Per-Room Calibration), ADR-150 (RF Foundation Encoder), ADR-135 (Empty-Room Baseline), ADR-079 (Camera-Supervised Pose), ADR-027 (MERIDIAN), ADR-024 (AETHER), ADR-149 (AetherArena), ADR-029 (Multistatic) |
| **Research provenance** | Deep-research run 2026-06-10: 22 sources fetched, 110 claims extracted, 25 adversarially verified (3-vote), 24 confirmed / 1 refuted. Evidence grades per source below. |
---
## 1. Context
A structured survey of the 20252026 WiFi human-sensing state of the art was run on 2026-06-10 to answer: *what should RuView integrate next, and does anything published invalidate our current direction?* Every claim below was verified against the primary source by independent adversarial reviewers; **evidence grades distinguish what the papers measured from what they merely claim**. Almost all performance numbers are author-self-reported preprint results — treated here as CLAIMED until reproduced on our hardware.
### 1.1 The five verified findings
**(F1) "Coordinate overfitting" is a named, diagnosed failure mode of camera-supervised WiFi pose — and our ADR-079 pipeline has the exact shape of it.**
PerceptAlign (arXiv [2601.12252](https://arxiv.org/abs/2601.12252), accepted ACM MobiCom 2026) shows that models regressing CSI directly to camera-frame coordinates memorize the deployment-specific transceiver layout; SOTA baselines degrade to >600 mm MPJPE in unseen scenes. Their fix is cheap: a <5-minute calibration using two checkerboards and a few photos to align WiFi and vision in one shared 3D frame, plus **fusing transceiver-position embeddings with CSI features**. Claimed: 12.3% in-domain error, 60%+ cross-domain error. They release the claimed-largest cross-domain 3D WiFi pose dataset (21 subjects, 5 scenes, 18 actions, **7 device layouts**). *Evidence: improvements CLAIMED (preprint w/ MobiCom acceptance); the failure mode itself is corroborated across the cross-domain literature and independently by our own ADR-150 data (81.63% in-domain vs ~11.6% leakage-free cross-subject torso-PCK).*
**(F2) An external model named "WiFlow" claims 97.25% PCK@20 with 2.23M params and ships everything.**
arXiv [2602.08661](https://arxiv.org/abs/2602.08661) (Apr 2026) — spatio-temporal-decoupled CSI pose, 97.25% PCK@20 / 99.48% PCK@50 / 0.007 m MPJPE, 2.23M parameters (~2.2 MB int8). Code, pretrained weights, and a 360k-sample CSI-pose dataset are public under Apache-2.0 ([repo](https://github.com/DY2434/WiFlow-WiFi-Pose-Estimation-with-Spatio-Temporal-Decoupling), Kaggle dataset). *Evidence: artifact availability MEASURED (verified by direct repo inspection); PCK numbers CLAIMED (5-subject, in-domain, self-collected dataset; hardware unspecified; 15 keypoints vs our 17).* ⚠️ **Name collision:** this is unrelated to RuView's internal WiFlow model. In all RuView docs the external model is referred to as **WiFlow-STD (DY2434)**.
**(F3) For CSI foundation encoders, data scale — not model capacity — is the bottleneck, and the tokenization recipe is now known.**
UNSW's MAE pretraining study (arXiv [2511.18792](https://arxiv.org/abs/2511.18792), Nov 2025) — the largest heterogeneous CSI pretraining run to date (1,320,892 samples, 14 public datasets incl. MM-Fi, Widar 3.0, Person-in-WiFi 3D; 4 devices; 2.4/5/6 GHz; 20160 MHz) — reports zero-shot cross-domain gains of 2.215.7% over supervised baselines, with unseen-domain performance scaling **log-linearly with pretraining data, unsaturated at 1.3M samples**, while ViT-Base adds only 0.40.9% over ViT-Small. Optimal recipe: **80% masking ratio, small (30,3) patches** (+4.7% over (40,5) by preserving fine temporal dynamics). *Evidence: MEASURED within-study (ablations verified in body text) but preprint; downstream tasks are classification, NOT pose — pose transfer is a hypothesis. Independently corroborates ADR-150's finding that capacity hurts cross-subject.*
**(F4) Hardware/standards: 802.11bf is finished; Espressif ships official sensing; Wi-Fi 6 AP CSI is reachable.**
- **IEEE 802.11bf-2025** published **2025-09-26** (verified against the IEEE SA record) — sensing standardization is complete for both sub-7 GHz and >45 GHz, with formal sensing setup/feedback procedures. No ESP32 silicon implements it yet. *Evidence: MEASURED (standards-body record).*
- **Espressif `esp_wifi_sensing`** (Apache-2.0, v0.1.x, ESP Component Registry): official CSI presence/motion FSM; esp-csi actively maintained (commit 2026-04-22, verified), CSI confirmed across ESP32/S2/C3/S3/C5/C6/C61. *Evidence: MEASURED (vendor pages + commit log).* ⚠️ A stronger "drop-in compatible with RuView nodes" claim was **REFUTED 0-3** — WiFi-6 parts use a different CSI acquisition config struct.
- **ZTECSITool** (arXiv [2506.16957](https://arxiv.org/abs/2506.16957), [code](https://github.com/WiFiZTE2025/ZTE_WiFi_Sensing)): CSI from commercial Wi-Fi 6 APs at up to 160 MHz / 512 subcarriers (~510× ESP32 subcarrier count; the gain is aperture, not per-Hz granularity). Firmware is gated behind a ZTE serial-number approval. *Evidence: capability CLAIMED by the vendor-authored tool paper; code artifact MEASURED.*
**(F5) Nothing in 20252026 does full DensePose UV regression from commodity WiFi.** Keypoint pose remains the field's frontier. Three "wireless foundation model" papers were screened out by full-text inspection (HeterCSI = simulated cellular channels only; the NeurIPS-2025 FMCW pilot = mmWave radar, presence-only; arXiv 2509.15258 = survey, no artifacts). *Evidence: MEASURED (absence verified by full-text inspection of the candidates that surfaced; absence of evidence across the whole literature is necessarily weaker).*
### 1.2 What this means for the ADR-151 calibration system
ADR-151's enrollment protocol captures guided human anchors but does **not** record or condition on transceiver geometry. F1 says that omission is precisely the thing that makes camera-supervised (and, plausibly, anchor-supervised) heads layout-brittle. ADR-151's per-room thesis ("teach the room before you teach the model") is *strengthened* by F1 — PerceptAlign is independent evidence that layout must be modeled explicitly — and the fix composes naturally with our Stage-2 enrollment.
ADR-150's masked-CSI-encoder design is *validated* by F3, which also hands us the hyperparameters and the priority call: **collect/aggregate more heterogeneous CSI before scaling the encoder.**
## 2. Decision
Adopt four changes, ordered by effort-vs-gain:
### 2.1 Geometry-condition the calibration system (extends ADR-151 Stage 2) — ACCEPTED
1. **Record transceiver geometry at enrollment.** `EnrollmentProtocol` gains an optional `NodeGeometry` record per node (position estimate, antenna orientation, inter-node distances where known). Stored alongside the room baseline in the bank; schema-versioned so existing banks remain readable.
2. **Fuse geometry embeddings into specialist training.** Where a specialist head consumes the (future, ADR-150) backbone embedding, concatenate a small learned embedding of `NodeGeometry` — the PerceptAlign mechanism, transplanted to our per-room banks. Statistical specialists (current) ignore it; LoRA heads (ADR-151 P6) consume it.
3. **Adopt the two-checkerboard alignment for the camera-supervised path (ADR-079).** When MediaPipe supervision is used, calibrate camera↔WiFi into one shared 3D frame before regression (<5 min, two checkerboards, a few photos). This is the direct defense against F1 for our 92.9%-PCK@20 pipeline.
4. **Evaluate on the PerceptAlign cross-domain dataset** (21 subjects / 7 layouts) as the MERIDIAN cross-layout benchmark — *gated on confirming its license and downloadability* (open question; repo per paper: github.com/Trymore-lab/PerceptAlign).
### 2.2 Benchmark against WiFlow-STD (DY2434) — ACCEPTED
Pull the Apache-2.0 weights + 360k-sample dataset; run three measurements: (a) their model on their data (reproduce 97.25% claim), (b) their model fine-tuned on our ESP32 17-keypoint eval set, (c) our internal WiFlow on their dataset (15-keypoint subset mapping). Until (a)(c) are measured, **no RuView doc may cite 97.25% as a comparable number** — different dataset, subjects, keypoints.
### 2.3 Apply the UNSW recipe to the ADR-150 encoder — ACCEPTED (amends ADR-150 §2.3)
- Pretraining corpus: start from the same 14 public datasets (1.3M samples) + our home/MM-Fi frames; data aggregation takes priority over architecture work.
- Tokenization: 80% masking, (30,3)-class small patches; encoder stays ViT-Small-class (~15M params) — F3 and our own DANN/transformer results agree that capacity does not pay.
- The published log-linear scaling (unsaturated) sets the expectation: more heterogeneous CSI in, better zero-shot out.
### 2.4 Hardware watch items — ACCEPTED (no code now)
- **802.11bf**: track silicon/certification; revisit when any commodity chipset exposes standardized sensing measurements. Our opportunistic CSI extraction remains the mechanism until then.
- **esp_wifi_sensing**: benchmark our presence pipeline against the vendor FSM (one afternoon; useful external baseline). Do **not** treat as drop-in (refuted claim).
- **ZTECSITool AP**: optional high-resolution anchor node for the ADR-029 multistatic mesh — procurement-gated; only pursue if a 160 MHz anchor materially helps tomography.
### 2.5 Explicitly NOT adopted
- No pivot toward "wireless foundation model" papers that don't ship WiFi-CSI artifacts (HeterCSI, FMCW pilot, surveys).
- No DensePose-UV work item: the field has not demonstrated UV regression from commodity WiFi; keypoints remain our supervised target (F5).
## 3. Consequences
**Positive:** the calibration system gains the one mechanism (geometry conditioning) the 2026 literature identifies as the difference between layout-brittle and layout-robust supervised WiFi pose; ADR-150 gets a measured training recipe instead of a guessed one; we acquire two external benchmarks (WiFlow-STD, PerceptAlign dataset) to keep our claims honest.
**Negative / risks:** geometry records add schema surface to banks (mitigated: optional + versioned); every adopted number is preprint-grade until our own benchmark runs land (mitigated by §2.2's no-citation rule); PerceptAlign dataset license is unconfirmed (gated); name collision risk in docs (mitigated: "WiFlow-STD (DY2434)" naming rule).
**Re-check by 2026-12:** 802.11bf silicon, esp_wifi_sensing maturity (v0.1.x today), and the preprint field (newest source Apr 2026).
## 4. Open questions (carried from the research run)
1. Does WiFlow-STD retain accuracy when fine-tuned on ESP32-S3/C6 CSI (fewer subcarriers, lower SNR), scored on our 17-keypoint set? (§2.2 answers this.)
2. Is the PerceptAlign dataset downloadable under a usable license, and does the two-checkerboard procedure work with ESP32 transceiver geometry? (§2.1.4 gate.)
3. Will esp_wifi_sensing evolve toward 802.11bf compliance, replacing opportunistic CSI extraction?
## 5. Source register (evidence-graded)
| Source | Type | Used for | Grade |
|---|---|---|---|
| arXiv 2601.12252 (PerceptAlign, MobiCom'26) | preprint+acceptance | F1, §2.1 | CLAIMED numbers; failure mode corroborated |
| arXiv 2602.08661 + DY2434 repo (WiFlow-STD) | preprint + code | F2, §2.2 | numbers CLAIMED; artifacts MEASURED |
| arXiv 2511.18792 (UNSW MAE) | preprint | F3, §2.3 | ablations MEASURED in-study; pose transfer hypothesis |
| IEEE SA 802.11bf-2025 record | standards body | F4, §2.4 | MEASURED |
| Espressif component registry + esp-csi repo | vendor | F4, §2.4 | MEASURED; "drop-in" REFUTED 0-3 |
| arXiv 2506.16957 + ZTE repo (ZTECSITool) | vendor preprint + code | F4, §2.4 | capability CLAIMED; code MEASURED |
| arXiv 2601.18200 (HeterCSI), OpenReview LMufK3vzE5 (FMCW pilot), arXiv 2509.15258 (survey) | preprints | F5, §2.5 (screened out) | MEASURED (full-text inspection) |

View File

@ -79,6 +79,10 @@ Statuses: **Proposed** (under discussion), **Accepted** (approved and/or impleme
| [ADR-023](ADR-023-trained-densepose-model-ruvector-pipeline.md) | Trained DensePose Model with RuVector Pipeline | Proposed |
| [ADR-024](ADR-024-contrastive-csi-embedding-model.md) | Project AETHER: Contrastive CSI Embeddings | Required |
| [ADR-027](ADR-027-cross-environment-domain-generalization.md) | Project MERIDIAN: Cross-Environment Generalization | Proposed |
| [ADR-149](ADR-149-public-community-leaderboard-huggingface.md) | AetherArena: public spatial-intelligence benchmark on Hugging Face | Proposed |
| [ADR-150](ADR-150-rf-foundation-encoder.md) | RF Foundation Encoder: pose-preserving, subject/room/device-invariant CSI embedding | Proposed |
| [ADR-151](ADR-151-room-calibration-specialist-training.md) | Per-Room Calibration & Specialized Model Training (room-first → bank of small ruVector specialists) | Proposed |
| [ADR-152](ADR-152-wifi-pose-sota-2026-intake.md) | WiFi-Pose SOTA 2026 Intake: geometry-conditioned calibration, external benchmarks, foundation-encoder recipe | Proposed |
### Platform and UI
@ -93,6 +97,8 @@ Statuses: **Proposed** (under discussion), **Accepted** (approved and/or impleme
| [ADR-036](ADR-036-rvf-training-pipeline-ui.md) | Training Pipeline UI Integration | Proposed |
| [ADR-043](ADR-043-sensing-server-ui-api-completion.md) | Sensing Server UI API Completion (14 endpoints) | Accepted |
| [ADR-115](ADR-115-home-assistant-integration.md) | Home Assistant integration via MQTT auto-discovery + Matter bridge (HA-DISCO + HA-FABRIC + HA-MIND) | Accepted (MQTT track) / Proposed (Matter SDK P8b) |
| [ADR-147](ADR-147-adam-mode-light-theme.md) | adam-mode — light theme toggle for the three.js realtime demo | Proposed |
| [ADR-148](ADR-148-yoga-mode-pose-system.md) | yoga-mode — yoga pose detection, classification, and scoring for the three.js realtime demo | Proposed |
### Architecture and infrastructure

View File

@ -0,0 +1,234 @@
# Per-Room Calibration — Integration Overview (for `cognitum-one/v0-appliance`)
**Audience:** integrators wiring the RuView per-room calibration system (ADR-151) into the
Cognitum V0 appliance (`cognitum-v0`, Pi 5 + Hailo). This document is the contract +
deployment spec: data formats, API surface, crate API, and the appliance integration plan.
**Source of truth:** crate `v2/crates/wifi-densepose-calibration` + CLI `v2/crates/wifi-densepose-cli`
(`calibrate`, `calibrate-serve`, `enroll`, `train-room`, `room-status`, `room-watch`) on this PR's branch.
---
## 1. What it is
"Teach the room before you teach the model." A local-first pipeline that turns a few minutes of
clean human anchors — layered on an empty-room baseline — into a versioned **bank of small,
room-calibrated specialists** for presence, posture, breathing, heartbeat, restlessness, and anomaly.
```
baseline (ADR-135) → enroll (anchors + quality gate) → extract (features) → train (specialist bank) → runtime (mixture + veto)
environmental stand/sit/lie/breathe/move periodicity/variance 6 small models RoomState per window
fingerprint (re-prompts bad captures) + STALE invalidation (+ multistatic fusion)
```
**Design invariants (carry these into the appliance):**
- **Specialisation over scale** — six tiny models (threshold / nearest-prototype / autocorrelation), not one big model. They run in microseconds on a Pi CPU; **they do not need the Hailo HAT**.
- **Local-first** — baselines + per-room banks stay on the device. Cross-room sharing is *model deltas* (federation, ADR-105), **never raw CSI**.
- **Honest degradation** — baseline drift marks a bank `STALE`; a physically-implausible window is vetoed rather than emitting a hallucinated reading.
---
## 2. Tiering on the Pi 5 + Hailo (what runs where)
| Tier | Runs on | What | Status |
|------|---------|------|--------|
| **CSI source** | ESP32-S3/C6 nodes (`edge_tier=0` raw CSI) | `0xC5110001` frames over UDP | shipping (v0.7.1-esp32) |
| **Calibration service** | **Pi 5 CPU** (aarch64) | this crate: baseline/enroll/train/runtime + HTTP API | **this PR** |
| **Shared backbone (optional)** | **Hailo HAT (HAILO10H)** | ADR-150 RF Foundation Encoder + neural pose head as HEF | future (ADR-150) |
> The appliance's WiFi (`wlan0`) is `managed` with no nexmon — **the Pi is a CSI *processor*, not a CSI radio.** CSI arrives from the ESP32 nodes (the existing `ruview-vitals-worker:50054` already receives it). Calibration *consumes* that stream; it does not sense directly.
---
## 3. Data contracts (the integration surface)
### 3.1 CSI ingest — ESP32 `0xC5110001` (UDP, little-endian)
```
Offset Size Field
0 4 magic = 0xC511_0001 (LE u32)
4 1 node_id (u8) ← group multistatic nodes by this
5 1 n_antennas (u8)
6 1 n_subcarriers (u8) ← 52/64 (HT20), 114 (HT40), 242 (HE20)
7 1 reserved
8 2 freq_mhz (LE u16)
10 4 sequence (LE u32)
14 1 rssi (i8)
15 1 noise_floor (i8)
16 4 reserved
20 2·n_antennas·n_subcarriers IQ pairs: i (i8), q (i8)
```
Parser reference: `wifi-densepose-cli/src/calibrate.rs::parse_csi_packet`. The appliance can reuse the
ESP32 stream the vitals worker already receives, or tee it to the calibration UDP port.
### 3.2 Baseline (ADR-135) — binary, magic `0xCA1B_0001`
```
Header (16 B LE): magic(4)=0xCA1B0001, version(1)=1, tier(1) {0=HT20,1=HT40,2=HE20,3=HE40},
reserved(2), captured_at_unix_s(8, i64)
Body: frame_count(8,u64), num_subcarriers(4,u32),
per subcarrier: amp_mean(f32), amp_variance(f32), phase_mean(f32), phase_dispersion(f32)
```
Produced by `calibrate` / `calibrate-serve`; `BaselineCalibration::{to_bytes,from_bytes}`. A baseline's
UUID (`calibration_uuid()`) is the `baseline_id` referenced by enrollments and banks for STALE checks.
### 3.3 Enrollment output — JSON (`enroll` → `train-room`)
```jsonc
{
"room_id": "living-room",
"baseline_id": "<uuid>",
"fs_hz": 15.0,
"anchors": [
{ "room_id": "living-room", "label": "stand_still",
"features": { "mean": f32, "variance": f32, "motion": f32,
"breathing_score": f32, "breathing_hz": f32,
"heart_score": f32, "heart_hz": f32 } }
],
"session": { "room_id": "...", "baseline_id": "...", "events": [ /* event-sourced audit log */ ] }
}
```
Anchor labels (fixed sequence, **JSON wire = snake_case**, test-enforced): `empty, stand_still, sit, lie_down, breathe_slow, breathe_normal, small_move, sleep_posture`.
### 3.4 Specialist bank — JSON (`train-room` → `room-watch` / runtime)
```jsonc
{
"room_id": "living-room",
"baseline_id": "<uuid>", // drift vs current → STALE
"trained_at_unix_s": 0,
"anchor_count": 6,
"presence": { "threshold": f32, "occupied_var": f32 } | null,
"posture": { "prototypes": [ ["Standing", [f32;5]], ... ] } | null,
"breathing": { "min_score": f32 },
"heartbeat": { "min_score": f32 },
"restlessness": { "calm_motion": f32, "active_motion": f32 } | null,
"anomaly": { "prototypes": [ [f32;5], ... ], "scale": f32 } | null
}
```
`SpecialistBank::{to_json,from_json}`. A *partial* bank is valid (missing-anchor specialists are `null`).
### 3.5 Runtime output — `RoomState` JSON (per window)
```jsonc
{
"presence": { "kind":"Presence", "value":0|1, "confidence":f32, "label":"present|absent" } | null,
"posture": { "kind":"Posture", "value":f32, "confidence":f32, "label":"standing|sitting|lying" } | null,
"breathing": { "kind":"Breathing", "value": <BPM>, "confidence":f32, "label":null } | null,
"heartbeat": { "kind":"Heartbeat", "value": <BPM>, "confidence":f32, "label":null } | null,
"restlessness": { "kind":"Restlessness", "value": 0.0..1.0, "confidence":f32 } | null,
"anomaly": { "kind":"Anomaly", "value": 0.0..1.0, "confidence":f32, "label":"normal|anomalous" } | null,
"vetoed": bool, // anomaly veto fired → vitals/posture suppressed
"stale": bool // bank trained against a different baseline
}
```
---
## 4. HTTP API — `calibrate-serve` (CORS-enabled; this is what a UI/appliance drives)
| Method | Path | Body / returns |
|--------|------|----------------|
| GET | `/api/v1/calibration/health` | `{ udp_port, frames_seen, last_frame_age_ms, streaming, default_tier, output_dir, session_active }` |
| POST | `/api/v1/calibration/start` | `{ tier?, duration_s?, room_id?, min_frames? }``202` session snapshot |
| GET | `/api/v1/calibration/status` | live `{ state, frames_recorded, target_frames, progress, z_median, eta_s, ... }` |
| POST | `/api/v1/calibration/stop` | finalize early → result summary |
| GET | `/api/v1/calibration/result` | last finalized baseline summary |
| GET | `/api/v1/calibration/baselines` | list persisted `.bin` baselines |
| GET | `/api/v1/room/state?bank=<name>` | **live RoomState** (mixture-of-specialists over the CSI window; bank resolved as a sanitized name under `output_dir`) |
| POST | `/api/v1/room/train` | `{ room_id, baseline_id, anchors[]? }` → train + persist a specialist bank as `<output_dir>/<room_id>.json` (anchors[] optional if enrolled via `/enroll/anchor`; read back via `/room/state?bank=<room_id>`) |
| POST | `/api/v1/enroll/anchor` | `{ room_id, baseline, label, duration_s? }` → capture one guided anchor against a baseline (blocks for the capture); returns the gate verdict + progress |
| GET | `/api/v1/enroll/status?room=<id>` | enrollment progress (accepted anchors, next, complete) |
A single background task owns the UDP socket + recorder (handlers talk to it over an mpsc channel +
shared status snapshot), so the API is non-blocking. **The full pipeline is now drivable over HTTP** — baseline (`start`/`stop`) → `enroll/anchor` (×8) → `room/train``room/state` — so the appliance UI needs no CLI. (The CLI `enroll`/`train-room`/`room-watch` remain for scripted/headless use.)
---
## 5. Public crate API (`wifi-densepose-calibration`)
```rust
// Stage 2 — enrollment
anchor::{AnchorLabel, Anchor, AnchorQuality, EnrollmentEvent, EnrollmentSession, Posture}
enrollment::{AnchorQualityGate, AnchorRecorder}
// Stage 3 — features
extract::{Features, AnchorFeature, autocorr_dominant}
// Stage 4 — specialists + bank
specialist::{Specialist, SpecialistKind, SpecialistReading,
PresenceSpecialist, PostureSpecialist, BreathingSpecialist,
HeartbeatSpecialist, RestlessnessSpecialist, AnomalySpecialist}
bank::SpecialistBank
// Stage 5 — runtime
runtime::{MixtureOfSpecialists, RoomState}
multistatic::MultiNodeMixture // fuse co-located nodes (ADR-029)
```
Pure Rust; deps are `wifi-densepose-core` + `wifi-densepose-signal` (default-features off) + serde/uuid.
**No GPU / no system BLAS** in the calibration path → builds cleanly on aarch64.
---
## 6. Appliance integration plan (`cognitum-one/v0-appliance`)
Verified on `cognitum-v0`: aarch64, `cargo 1.96.0`, Hailo `HAILO10H`, `ruview-vitals-worker:50054`.
**Step 1 — vendor / depend on the crate.** Add `wifi-densepose-calibration` (path or published crate)
to the appliance workspace. It builds natively on aarch64 — no BLAS/GPU, **and no ONNX/OpenSSL**:
the CLI's `mat`→`nn`→`ort`(ONNX)→`openssl-sys` chain is now feature-gated out of the calibration build.
```bash
# Pi/appliance calibration binary — cross-compiles clean (no ort/openssl):
cargo build -p wifi-densepose-cli --no-default-features --release
# (omit `--no-default-features` only if you also need the MAT subcommands)
```
Verified: `cargo tree -p wifi-densepose-cli --no-default-features` shows **0** `ort`/`openssl-sys` deps;
`cross test --target aarch64-unknown-linux-gnu` passes the calibration suite under qemu.
**Step 2 — wire the CSI source.** Two options:
- (a) Tee the ESP32 UDP stream the vitals worker already receives into the calibration ingest, or
- (b) point ESP32 nodes (`edge_tier=0`) at the appliance's calibration UDP port directly.
Reuse `parse_csi_packet` (or the rvCSI `CsiFrame` schema if you normalise upstream).
**Step 3 — run the calibration service.** Either embed the crate (call `CalibrationRecorder` /
`MixtureOfSpecialists` in-process from a worker like `ruview-vitals-worker`), or run the
`calibrate-serve` binary as a sidecar (systemd unit, bind `127.0.0.1` + reverse-proxy through the
appliance gateway on `:9000`). Persist baselines/banks under the appliance data dir, keyed by `room_id`.
**Step 4 — expose to the dashboard.** Surface the `/api/v1/calibration/*` endpoints (and add
`enroll`/`train`/`room-state` endpoints — small additive work) behind the appliance's bearer-token
auth + the existing `Seeds`/`Edge` nav. `RoomState` (§3.5) is the live readout payload.
**Step 5 — (optional) Hailo backbone tier.** Compile the ADR-150 RF Foundation Encoder + neural pose
head to Hailo HEF, serve via `ruvector-hailo-worker:50051`; the small specialists become heads over its
embedding. This is the ADR-150 follow-on — *not required* for the calibration service to run.
**Privacy / security:** keep baselines + banks local; if federating across appliances (ADR-105),
exchange bank/model deltas, never raw CSI. Hardening already in place:
- **`--token <T>`** (or `CALIBRATE_TOKEN` env) requires `Authorization: Bearer <T>` on every route; the
server warns loudly if bound to a non-loopback address without a token.
- **`room_id` is sanitized** to `[A-Za-z0-9_-]` (≤64 chars) before it touches the baseline write path —
no `../` / absolute-path traversal.
- CORS is permissive for dev — in production bind to loopback and reverse-proxy through the appliance
gateway (which already enforces bearer auth).
---
## 7. Status & validation
- **Implemented:** all 5 stages + multistatic fusion; CLI + Stage-1 HTTP API (auth + path-traversal hardened). **55 tests** (35 calibration unit + 1 full-loop integration + 19 CLI), all passing under qemu-aarch64.
**Precise validation matrix (don't overstate this — no clean full calibration has run on-target yet):**
| Stage | Pi-5 (real nexmon→`0xC5110001`, 6,813 frames) | ESP32-S3 (COM8, `edge_tier=0`) | qemu / unit / integration |
|---|---|---|---|
| baseline capture + HTTP API + **auth gate** | ✅ | ✅ (120-frame) | full-loop ✅ |
| **clean** empty-room baseline | ❌ `motion_flagged` (artifact) | ❌ (occupied) | full-loop ✅ (synthetic, zero motion flags) |
| enroll → train-room | ❌ | ❌ (needs operator poses) | full-loop ✅ (8/8 anchors, 6 specialists, JSON round-trip) |
| runtime infer | ❌ on-target | ◐ single-node breathing ~1631 BPM via the **stateless** head (not a trained bank) + node-id fusion | full-loop ✅ (trained bank: 18±2 BPM positive, absent negative, foreign-baseline STALE) |
The complete `baseline → enroll → train-room → infer` loop is now **proven in-process** on deterministic synthetic CSI (`wifi-densepose-calibration/tests/full_loop.rs` — drives the CLI's exact stage order through the public API, seed-robust across 5 seeds, runs with and without default features). Capture + API + auth are proven on real CSI (both boxes). What remains is strictly the **on-target** run: real CSI, a physically empty room for baseline, and an operator performing the 8 guided anchors — that hardware session is the last open item.
- **Known follow-ups (appliance backlog):** `--source-format adr018v6` to drive calibration from the Pi's own nexmon (no ESP32/transcoder); the on-target clean-room enroll→train→infer session (above); phase-based (vs mean-amplitude) breathing carrier; RVF/HNSW persistence (currently JSON); enroll/train HTTP endpoints (live `/room/state` already added); ADR-150 Hailo backbone; true 2-node multistatic; ADR-105 federation.
- **Behavioral findings from the full-loop test — all four FIXED pre-hardware-session:** (1) *z-band squeeze* — anchor motion is now measured from frame-to-frame deltas of the deviation series (`|Δz| > 0.5 |Δφ| > π/6`), not from the absolute `motion_flagged` (which conflated presence strength with motion); a strongly-reflecting still person (z = 3.0, every frame flagged by the old heuristic) now enrolls — regression-guarded in the full-loop test's `StandStill` anchor and `enrollment::tests`. (2) *Variance-only presence*`PresenceSpecialist` gained a mean-shift channel (|mean empty mean| vs a trained threshold); a motionless person is detected via the mean even at empty-level variance — regression-guarded in the full-loop motionless-person case; old persisted banks deserialize with the channel inert (variance-only behavior preserved). (3) *Ungated hz embedding*`Features::embedding()` zeroes `breathing_hz`/`heart_hz` below `EMBED_MIN_SCORE` (0.25), keeping noise-window random frequencies out of the prototype space. (4) *Heart-band leakage* (found while fixing 3): a strong breathing rhythm's autocorrelation leaks into the HR band as a high-score lag-floor edge value (e.g. score 0.67 at 3.33 Hz from a pure 0.30 Hz breath); `autocorr_dominant` now requires the winning lag to be an interior local maximum, rejecting band-edge leakage while preserving true in-band peaks.
**Reference:** ADR-151 (`docs/adr/ADR-151-room-calibration-specialist-training.md`), ADR-135 (baseline),
ADR-029 (multistatic), ADR-150 (RF Foundation Encoder), ADR-105 (federation), ADR-147 (OccWorld/Hailo).