15 KiB
Per-Room Calibration — Integration Overview (for cognitum-one/v0-appliance)
Audience: integrators wiring the RuView per-room calibration system (ADR-151) into the
Cognitum V0 appliance (cognitum-v0, Pi 5 + Hailo). This document is the contract +
deployment spec: data formats, API surface, crate API, and the appliance integration plan.
Source of truth: crate v2/crates/wifi-densepose-calibration + CLI v2/crates/wifi-densepose-cli
(calibrate, calibrate-serve, enroll, train-room, room-status, room-watch) on this PR's branch.
1. What it is
"Teach the room before you teach the model." A local-first pipeline that turns a few minutes of clean human anchors — layered on an empty-room baseline — into a versioned bank of small, room-calibrated specialists for presence, posture, breathing, heartbeat, restlessness, and anomaly.
baseline (ADR-135) → enroll (anchors + quality gate) → extract (features) → train (specialist bank) → runtime (mixture + veto)
environmental stand/sit/lie/breathe/move periodicity/variance 6 small models RoomState per window
fingerprint (re-prompts bad captures) + STALE invalidation (+ multistatic fusion)
Design invariants (carry these into the appliance):
- Specialisation over scale — six tiny models (threshold / nearest-prototype / autocorrelation), not one big model. They run in microseconds on a Pi CPU; they do not need the Hailo HAT.
- Local-first — baselines + per-room banks stay on the device. Cross-room sharing is model deltas (federation, ADR-105), never raw CSI.
- Honest degradation — baseline drift marks a bank
STALE; a physically-implausible window is vetoed rather than emitting a hallucinated reading.
2. Tiering on the Pi 5 + Hailo (what runs where)
| Tier | Runs on | What | Status |
|---|---|---|---|
| CSI source | ESP32-S3/C6 nodes (edge_tier=0 raw CSI) |
0xC5110001 frames over UDP |
shipping (v0.7.1-esp32) |
| Calibration service | Pi 5 CPU (aarch64) | this crate: baseline/enroll/train/runtime + HTTP API | this PR |
| Shared backbone (optional) | Hailo HAT (HAILO10H) | ADR-150 RF Foundation Encoder + neural pose head as HEF | future (ADR-150) |
The appliance's WiFi (
wlan0) ismanagedwith no nexmon — the Pi is a CSI processor, not a CSI radio. CSI arrives from the ESP32 nodes (the existingruview-vitals-worker:50054already receives it). Calibration consumes that stream; it does not sense directly.
3. Data contracts (the integration surface)
3.1 CSI ingest — ESP32 0xC5110001 (UDP, little-endian)
Offset Size Field
0 4 magic = 0xC511_0001 (LE u32)
4 1 node_id (u8) ← group multistatic nodes by this
5 1 n_antennas (u8)
6 1 n_subcarriers (u8) ← 52/64 (HT20), 114 (HT40), 242 (HE20)
7 1 reserved
8 2 freq_mhz (LE u16)
10 4 sequence (LE u32)
14 1 rssi (i8)
15 1 noise_floor (i8)
16 4 reserved
20 2·n_antennas·n_subcarriers IQ pairs: i (i8), q (i8)
Parser reference: wifi-densepose-cli/src/calibrate.rs::parse_csi_packet. The appliance can reuse the
ESP32 stream the vitals worker already receives, or tee it to the calibration UDP port.
3.2 Baseline (ADR-135) — binary, magic 0xCA1B_0001
Header (16 B LE): magic(4)=0xCA1B0001, version(1)=1, tier(1) {0=HT20,1=HT40,2=HE20,3=HE40},
reserved(2), captured_at_unix_s(8, i64)
Body: frame_count(8,u64), num_subcarriers(4,u32),
per subcarrier: amp_mean(f32), amp_variance(f32), phase_mean(f32), phase_dispersion(f32)
Produced by calibrate / calibrate-serve; BaselineCalibration::{to_bytes,from_bytes}. A baseline's
UUID (calibration_uuid()) is the baseline_id referenced by enrollments and banks for STALE checks.
3.3 Enrollment output — JSON (enroll → train-room)
{
"room_id": "living-room",
"baseline_id": "<uuid>",
"fs_hz": 15.0,
"anchors": [
{ "room_id": "living-room", "label": "stand_still",
"features": { "mean": f32, "variance": f32, "motion": f32,
"breathing_score": f32, "breathing_hz": f32,
"heart_score": f32, "heart_hz": f32 } }
],
"session": { "room_id": "...", "baseline_id": "...", "events": [ /* event-sourced audit log */ ] }
}
Anchor labels (fixed sequence, JSON wire = snake_case, test-enforced): empty, stand_still, sit, lie_down, breathe_slow, breathe_normal, small_move, sleep_posture.
3.4 Specialist bank — JSON (train-room → room-watch / runtime)
{
"room_id": "living-room",
"baseline_id": "<uuid>", // drift vs current → STALE
"trained_at_unix_s": 0,
"anchor_count": 6,
"presence": { "threshold": f32, "occupied_var": f32 } | null,
"posture": { "prototypes": [ ["Standing", [f32;5]], ... ] } | null,
"breathing": { "min_score": f32 },
"heartbeat": { "min_score": f32 },
"restlessness": { "calm_motion": f32, "active_motion": f32 } | null,
"anomaly": { "prototypes": [ [f32;5], ... ], "scale": f32 } | null
}
SpecialistBank::{to_json,from_json}. A partial bank is valid (missing-anchor specialists are null).
3.5 Runtime output — RoomState JSON (per window)
{
"presence": { "kind":"Presence", "value":0|1, "confidence":f32, "label":"present|absent" } | null,
"posture": { "kind":"Posture", "value":f32, "confidence":f32, "label":"standing|sitting|lying" } | null,
"breathing": { "kind":"Breathing", "value": <BPM>, "confidence":f32, "label":null } | null,
"heartbeat": { "kind":"Heartbeat", "value": <BPM>, "confidence":f32, "label":null } | null,
"restlessness": { "kind":"Restlessness", "value": 0.0..1.0, "confidence":f32 } | null,
"anomaly": { "kind":"Anomaly", "value": 0.0..1.0, "confidence":f32, "label":"normal|anomalous" } | null,
"vetoed": bool, // anomaly veto fired → vitals/posture suppressed
"stale": bool // bank trained against a different baseline
}
4. HTTP API — calibrate-serve (CORS-enabled; this is what a UI/appliance drives)
| Method | Path | Body / returns |
|---|---|---|
| GET | /api/v1/calibration/health |
{ udp_port, frames_seen, last_frame_age_ms, streaming, default_tier, output_dir, session_active } |
| POST | /api/v1/calibration/start |
{ tier?, duration_s?, room_id?, min_frames? } → 202 session snapshot |
| GET | /api/v1/calibration/status |
live { state, frames_recorded, target_frames, progress, z_median, eta_s, ... } |
| POST | /api/v1/calibration/stop |
finalize early → result summary |
| GET | /api/v1/calibration/result |
last finalized baseline summary |
| GET | /api/v1/calibration/baselines |
list persisted .bin baselines |
| GET | /api/v1/room/state?bank=<name> |
live RoomState (mixture-of-specialists over the CSI window; bank resolved as a sanitized name under output_dir) |
| POST | /api/v1/room/train |
{ room_id, baseline_id, anchors[]? } → train + persist a specialist bank as <output_dir>/<room_id>.json (anchors[] optional if enrolled via /enroll/anchor; read back via /room/state?bank=<room_id>) |
| POST | /api/v1/enroll/anchor |
{ room_id, baseline, label, duration_s? } → capture one guided anchor against a baseline (blocks for the capture); returns the gate verdict + progress |
| GET | /api/v1/enroll/status?room=<id> |
enrollment progress (accepted anchors, next, complete) |
A single background task owns the UDP socket + recorder (handlers talk to it over an mpsc channel +
shared status snapshot), so the API is non-blocking. The full pipeline is now drivable over HTTP — baseline (start/stop) → enroll/anchor (×8) → room/train → room/state — so the appliance UI needs no CLI. (The CLI enroll/train-room/room-watch remain for scripted/headless use.)
5. Public crate API (wifi-densepose-calibration)
// Stage 2 — enrollment
anchor::{AnchorLabel, Anchor, AnchorQuality, EnrollmentEvent, EnrollmentSession, Posture}
enrollment::{AnchorQualityGate, AnchorRecorder}
// Stage 3 — features
extract::{Features, AnchorFeature, autocorr_dominant}
// Stage 4 — specialists + bank
specialist::{Specialist, SpecialistKind, SpecialistReading,
PresenceSpecialist, PostureSpecialist, BreathingSpecialist,
HeartbeatSpecialist, RestlessnessSpecialist, AnomalySpecialist}
bank::SpecialistBank
// Stage 5 — runtime
runtime::{MixtureOfSpecialists, RoomState}
multistatic::MultiNodeMixture // fuse co-located nodes (ADR-029)
Pure Rust; deps are wifi-densepose-core + wifi-densepose-signal (default-features off) + serde/uuid.
No GPU / no system BLAS in the calibration path → builds cleanly on aarch64.
6. Appliance integration plan (cognitum-one/v0-appliance)
Verified on cognitum-v0: aarch64, cargo 1.96.0, Hailo HAILO10H, ruview-vitals-worker:50054.
Step 1 — vendor / depend on the crate. Add wifi-densepose-calibration (path or published crate)
to the appliance workspace. It builds natively on aarch64 — no BLAS/GPU, and no ONNX/OpenSSL:
the CLI's mat→nn→ort(ONNX)→openssl-sys chain is now feature-gated out of the calibration build.
# Pi/appliance calibration binary — cross-compiles clean (no ort/openssl):
cargo build -p wifi-densepose-cli --no-default-features --release
# (omit `--no-default-features` only if you also need the MAT subcommands)
Verified: cargo tree -p wifi-densepose-cli --no-default-features shows 0 ort/openssl-sys deps;
cross test --target aarch64-unknown-linux-gnu passes the calibration suite under qemu.
Step 2 — wire the CSI source. Two options:
- (a) Tee the ESP32 UDP stream the vitals worker already receives into the calibration ingest, or
- (b) point ESP32 nodes (
edge_tier=0) at the appliance's calibration UDP port directly. Reuseparse_csi_packet(or the rvCSICsiFrameschema if you normalise upstream).
Step 3 — run the calibration service. Either embed the crate (call CalibrationRecorder /
MixtureOfSpecialists in-process from a worker like ruview-vitals-worker), or run the
calibrate-serve binary as a sidecar (systemd unit, bind 127.0.0.1 + reverse-proxy through the
appliance gateway on :9000). Persist baselines/banks under the appliance data dir, keyed by room_id.
Step 4 — expose to the dashboard. Surface the /api/v1/calibration/* endpoints (and add
enroll/train/room-state endpoints — small additive work) behind the appliance's bearer-token
auth + the existing Seeds/Edge nav. RoomState (§3.5) is the live readout payload.
Step 5 — (optional) Hailo backbone tier. Compile the ADR-150 RF Foundation Encoder + neural pose
head to Hailo HEF, serve via ruvector-hailo-worker:50051; the small specialists become heads over its
embedding. This is the ADR-150 follow-on — not required for the calibration service to run.
Privacy / security: keep baselines + banks local; if federating across appliances (ADR-105), exchange bank/model deltas, never raw CSI. Hardening already in place:
--token <T>(orCALIBRATE_TOKENenv) requiresAuthorization: Bearer <T>on every route; the server warns loudly if bound to a non-loopback address without a token.room_idis sanitized to[A-Za-z0-9_-](≤64 chars) before it touches the baseline write path — no..// absolute-path traversal.- CORS is permissive for dev — in production bind to loopback and reverse-proxy through the appliance gateway (which already enforces bearer auth).
7. Status & validation
- Implemented: all 5 stages + multistatic fusion; CLI + Stage-1 HTTP API (auth + path-traversal hardened). 55 tests (35 calibration unit + 1 full-loop integration + 19 CLI), all passing under qemu-aarch64.
Precise validation matrix (don't overstate this — no clean full calibration has run on-target yet):
| Stage | Pi-5 (real nexmon→0xC5110001, 6,813 frames) |
ESP32-S3 (COM8, edge_tier=0) |
qemu / unit / integration |
|---|---|---|---|
| baseline capture + HTTP API + auth gate | ✅ | ✅ (120-frame) | full-loop ✅ |
| clean empty-room baseline | ❌ motion_flagged (artifact) |
❌ (occupied) | full-loop ✅ (synthetic, zero motion flags) |
| enroll → train-room | ❌ | ❌ (needs operator poses) | full-loop ✅ (8/8 anchors, 6 specialists, JSON round-trip) |
| runtime infer | ❌ on-target | ◐ single-node breathing ~16–31 BPM via the stateless head (not a trained bank) + node-id fusion | full-loop ✅ (trained bank: 18±2 BPM positive, absent negative, foreign-baseline STALE) |
The complete baseline → enroll → train-room → infer loop is now proven in-process on deterministic synthetic CSI (wifi-densepose-calibration/tests/full_loop.rs — drives the CLI's exact stage order through the public API, seed-robust across 5 seeds, runs with and without default features). Capture + API + auth are proven on real CSI (both boxes). What remains is strictly the on-target run: real CSI, a physically empty room for baseline, and an operator performing the 8 guided anchors — that hardware session is the last open item.
- Known follow-ups (appliance backlog):
--source-format adr018v6to drive calibration from the Pi's own nexmon (no ESP32/transcoder); the on-target clean-room enroll→train→infer session (above); phase-based (vs mean-amplitude) breathing carrier; RVF/HNSW persistence (currently JSON); enroll/train HTTP endpoints (live/room/statealready added); ADR-150 Hailo backbone; true 2-node multistatic; ADR-105 federation. - Behavioral findings from the full-loop test — all four FIXED pre-hardware-session: (1) z-band squeeze — anchor motion is now measured from frame-to-frame deltas of the deviation series (
|Δz| > 0.5 ∨ |Δφ| > π/6), not from the absolutemotion_flagged(which conflated presence strength with motion); a strongly-reflecting still person (z = 3.0, every frame flagged by the old heuristic) now enrolls — regression-guarded in the full-loop test'sStandStillanchor andenrollment::tests. (2) Variance-only presence —PresenceSpecialistgained a mean-shift channel (|mean − empty mean| vs a trained threshold); a motionless person is detected via the mean even at empty-level variance — regression-guarded in the full-loop motionless-person case; old persisted banks deserialize with the channel inert (variance-only behavior preserved). (3) Ungated hz embedding —Features::embedding()zeroesbreathing_hz/heart_hzbelowEMBED_MIN_SCORE(0.25), keeping noise-window random frequencies out of the prototype space. (4) Heart-band leakage (found while fixing 3): a strong breathing rhythm's autocorrelation leaks into the HR band as a high-score lag-floor edge value (e.g. score 0.67 at 3.33 Hz from a pure 0.30 Hz breath);autocorr_dominantnow requires the winning lag to be an interior local maximum, rejecting band-edge leakage while preserving true in-band peaks.
Reference: ADR-151 (docs/adr/ADR-151-room-calibration-specialist-training.md), ADR-135 (baseline),
ADR-029 (multistatic), ADR-150 (RF Foundation Encoder), ADR-105 (federation), ADR-147 (OccWorld/Hailo).