deploy: 2a307138f2

2026-06-10 19:35:47 +00:00 · 2026-06-10 19:35:47 +00:00 · 5808507cc5
parent 4bdb31b5cf
commit 5808507cc5
4 changed files with 598 additions and 0 deletions
--- a/api-docs/adr/ADR-151-room-calibration-specialist-training.md
+++ b/api-docs/adr/ADR-151-room-calibration-specialist-training.md
@@ -0,0 +1,260 @@
+# ADR-151: RuView Per-Room Calibration & Specialized Model Training System
+
+| Field | Value |
+|-------|-------|
+| **Status** | Accepted — Stages 1–5 implemented (statistical specialists); HF-backbone distillation pending |
+| **Date** | 2026-06-09 |
+| **Deciders** | ruv |
+| **Codebase target** | New `wifi-densepose-calibration` crate (orchestration); `wifi-densepose-train` (`rapid_adapt.rs`, `signal_features.rs`, `trainer.rs`); `wifi-densepose-ruvector` (RVF specialist storage); `wifi-densepose-signal/ruvsense/*` (feature extractors); `wifi-densepose-cli` (`enroll`, `train-room`, `room-status` subcommands) |
+| **Relates to** | ADR-135 (Empty-Room Baseline Calibration), ADR-030 (Persistent Field Model), ADR-134 (CIR), ADR-024 (Contrastive CSI Embedding / AETHER), ADR-027 (Cross-Environment Domain Generalization / MERIDIAN), ADR-070 (Self-Supervised Pretraining), ADR-105 (Federated CSI Training), ADR-149 (AetherArena / Hugging Face), ADR-150 (RF Foundation Encoder) |
+
+---
+
+## 1. Context
+
+### 1.1 The thesis — teach the room before you teach the model
+
+RuView's deployment frontier is not a better generic model. ADR-150 documents the wall directly: an MM-Fi pose head scores **81.63% torso-PCK@20 in-domain but ~11.6% leakage-free cross-subject**, and bigger capacity *hurts* cross-subject (transformer 24.8% < conv 27.3%). A single oversized model that "understands the world" overfits the rooms and bodies it has seen. The lever is the opposite of scale: **a small model that understands *one* room and *one* person**, calibrated in minutes, run locally, and specialised per biological signal.
+
+This positions RuView between the two incumbents in ambient sensing:
+
+- **Wearables** — high fidelity, but people forget to wear them, and they only measure the wearer.
+- **Cameras** — powerful, but invasive, store identifiable video, and fail in the dark / under covers.
+
+RuView sits in the middle: it learns the *space*, learns the *person*, and tracks biological rhythm (breathing, heartbeat, restlessness, posture, presence) without seeing skin or storing video. Heartbeat and breathing are not visual problems — they are tiny, repeating disturbances in the RF field. Capturing them well is a *calibration* problem, not a *model-size* problem.
+
+### 1.2 What already exists (and what is missing)
+
+The pieces of a calibration→training pipeline exist as disconnected modules. There is no system that runs them end to end and emits a per-room model bank.
+
+| Capability | Status today | Gap |
+|------------|--------------|-----|
+| Empty-room baseline (environmental fingerprint) | ADR-135 `BaselineCalibration` (Proposed): per-subcarrier amplitude + circular-phase stats, `ruvcal` NVS namespace | Captures the *room*, but there is no step that captures *guided human anchors* on top of it |
+| Field eigenstructure | ADR-030 `field_model.rs` (SVD room eigenmodes) | Consumes calibration; not wired to a training trigger |
+| Shared invariant backbone | ADR-150 RF Foundation Encoder (pose-preserving, subject/room/device-invariant) | Defined as a *foundation* embedding; nothing distills it into per-room specialists |
+| Few-shot adaptation | `train/src/rapid_adapt.rs` — test-time training → LoRA weight deltas (MERIDIAN P5) | Produces a *single* pose-adaptation delta, not a bank of per-modality specialists |
+| Feature extractors | `ruvsense/{bvp,longitudinal,intention,gesture,pose_tracker,adversarial}.rs`, `train/src/signal_features.rs` | Each emits a signal; none is packaged as a labelled training source for enrollment |
+| Small-model storage | `wifi-densepose-ruvector` (RVF cognitive containers, HNSW, sketch) | No schema for "a bank of specialist models scoped to a room_id" |
+| HF publishing | ADR-149 AetherArena (Hugging Face Space + signed scorer), `sensing-server` `from_pretrained` path | Publishes/evaluates a *global* model; no notion of a published *base* + private *local* heads |
+
+**The missing system is the connective tissue**: a guided enrollment protocol, a feature-extraction-to-label bridge, a specialist-bank trainer that reuses the frozen HF backbone, and a runtime that fuses the specialists with confidence gating. This ADR defines that system.
+
+### 1.3 The four-step user model (and where each step lands)
+
+The system is deliberately presented to operators as four plain steps. Each maps to existing or new code:
+
+1. **Capture a quiet baseline** — no people, just room/router/reflections/noise/drift → the *environmental fingerprint*. → **Reuse ADR-135** `BaselineCalibration` + **ADR-030** field eigenmodes. No new capture code; the calibration crate calls it.
+2. **Capture guided samples** — stand, sit, lie down, slow vs normal breathing, small movement, sleep posture. Clean anchors, not hours of data. → **NEW** `EnrollmentProtocol` (Section 2.2).
+3. **Extract the useful signal** — CSI phase, amplitude, Doppler shift, micro-motion, periodicity, variance, timing. → **Reuse** `signal_features.rs` + ruvsense extractors, packaged as labelled `AnchorFeature` records (Section 2.3).
+4. **Compress patterns into small ruVector models** — *specialised* per signal: breathing, heartbeat, sleep restlessness, posture, presence, anomaly. → **NEW** `SpecialistBank` trained via `rapid_adapt` LoRA heads over the frozen ADR-150 backbone, stored as RVF (Section 2.4).
+
+---
+
+## 2. Decision
+
+**Build the RuView Per-Room Calibration & Specialized Model Training System: a four-stage, local-first pipeline (`baseline → enroll → extract → train`) that produces a versioned *bank of small specialised ruVector models* scoped to one `room_id`, each a lightweight head distilled/adapted from the frozen, Hugging-Face-published RF Foundation Encoder (ADR-150).** Big model understands the world; small ruVector models understand *your room*.
+
+Two invariants govern every design choice below:
+
+> **(A) Specialisation over scale.** One small model per biological signal, not one large model for all of them. Each specialist is faster, cheaper, more private, and — because it is calibrated to the room's actual fingerprint — often *more accurate* than a general model.
+>
+> **(B) Local-first, base-shared.** The frozen room/subject/device-invariant backbone is the only artifact published to Hugging Face. Per-room baselines and per-specialist heads never leave the device unless the operator opts into federation (ADR-105).
+
+### 2.1 System architecture
+
+```
+                       HUGGING FACE HUB (public, room-agnostic)
+                       ┌───────────────────────────────────────┐
+                       │  RF Foundation Encoder (ADR-150)       │
+                       │  pose-preserving · subject/room/device │
+                       │  -invariant · frozen · safetensors     │
+                       └───────────────┬───────────────────────┘
+                                       │  from_pretrained() once, cached on device
+                                       ▼
+  STAGE 1 baseline        STAGE 2 enroll        STAGE 3 extract         STAGE 4 train (per room_id)
+  ┌──────────────┐        ┌──────────────┐      ┌────────────────┐      ┌─────────────────────────┐
+  │ ADR-135      │        │ Enrollment   │      │ signal_features│      │ SpecialistBank          │
+  │ Baseline-    │──fp──► │ Protocol     │─clip►│ + ruvsense     │─AF──►│  frozen backbone        │
+  │ Calibration  │        │ guided       │      │ extractors     │      │   │  ┌────────────────┐  │
+  │ (env finger- │        │ anchors:     │      │ → AnchorFeature│      │   ├─►│ breathing head │  │
+  │  print)      │        │ stand/sit/   │      │ (phase, amp,   │      │   ├─►│ heartbeat head │  │
+  │ ADR-030      │        │ lie/breathe/ │      │  doppler,      │      │   ├─►│ restless head  │  │
+  │ field eigen  │        │ move/sleep   │      │  micromotion,  │      │   ├─►│ posture head   │  │
+  └──────────────┘        └──────────────┘      │  periodicity,  │      │   ├─►│ presence head  │  │
+        │                                        │  variance,     │      │   └─►│ anomaly head   │  │
+        │  baseline drift > τ → invalidate bank  │  timing)       │      │     (LoRA / ruVector    │
+        └───────────────────────────────────────┴────────────────┴──────┤      small models)      │
+                                                                          └───────────┬─────────────┘
+                                                                                      │ RVF container
+                                                                                      ▼
+                                                              RUNTIME: Mixture-of-Specialists
+                                                              each head emits {value, confidence};
+                                                              coherence_gate (ADR-135) + anomaly
+                                                              head veto → fused RoomState
+```
+
+The shared backbone is loaded **once per device** and frozen. Every specialist is a small head over its embedding — so the marginal cost of a sixth specialist is kilobytes of LoRA weights, not another full model.
+
+### 2.2 Stage 2 — the guided enrollment protocol (NEW)
+
+`EnrollmentProtocol` is a CLI-driven state machine that walks the operator through a fixed sequence of labelled **anchors**. The design rule from the user vision is explicit: *clean anchors, not hours of data.* Each anchor is a short (default 20 s @ 20 Hz = 400 frames) labelled clip captured against the already-recorded baseline.
+
+| Anchor | Label | Duration | Primary signal taught | Feature emphasis |
+|--------|-------|----------|-----------------------|------------------|
+| `empty` | presence=0 | (reuse ADR-135 baseline) | absence reference | amplitude variance floor |
+| `stand_still` | posture=standing, presence=1 | 20 s | static human load | amplitude mean shift, eigenmode delta |
+| `sit` | posture=sitting | 20 s | lower static load | amplitude profile |
+| `lie_down` | posture=lying | 20 s | sleep-position load | amplitude profile, low Doppler |
+| `breathe_slow` | resp≈0.1–0.15 Hz | 30 s | slow respiration | periodicity, micro-Doppler |
+| `breathe_normal` | resp≈0.2–0.3 Hz | 30 s | normal respiration | periodicity, BVP phase |
+| `small_move` | motion=1 | 20 s | limb micro-motion | Doppler spread, variance |
+| `sleep_posture` | posture=lying, restless=0 | 30 s | quiescent sleep baseline | long-window variance, timing |
+
+The protocol is **adaptive**: an anchor is only accepted when its captured features pass a quality gate (coherence ≥ threshold from `coherence_gate.rs`, sufficient SNR vs baseline, no saturation). A failed anchor is re-prompted rather than silently kept, because bad anchors poison small models far more than large ones. Total guided enrollment is ~4 minutes of wall-clock (170 s of capture across the seven recorded anchors, plus prompts and any retries), producing 8 clean anchors. This is intentionally far below the "hours of data" that a from-scratch model needs, because the backbone already carries world knowledge; enrollment only teaches *this* room's offsets.
+
+Anchors are persisted as an append-only `EnrollmentSession` (event-sourced, per CLAUDE.md state rules) under `room_id`, so re-enrollment is incremental and auditable.
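
A minimal sketch of the anchor schedule and quality gate described in Section 2.2, assuming hypothetical names (`AnchorSpec`, `GateResult`, `quality_gate`) and placeholder thresholds; the ADR does not give the real gate values or the crate's actual types.

```rust
// Hypothetical sketch: type names and thresholds are illustrative only.

/// One guided anchor from the Section 2.2 table.
#[derive(Debug, Clone, Copy, PartialEq)]
struct AnchorSpec {
    name: &'static str,
    /// Capture duration in seconds; 0 means "reuse the ADR-135 baseline".
    duration_s: u32,
}

/// Sample rate behind the ADR's "20 s @ 20 Hz = 400 frames" default.
const SAMPLE_HZ: u32 = 20;

const ANCHORS: [AnchorSpec; 8] = [
    AnchorSpec { name: "empty", duration_s: 0 },
    AnchorSpec { name: "stand_still", duration_s: 20 },
    AnchorSpec { name: "sit", duration_s: 20 },
    AnchorSpec { name: "lie_down", duration_s: 20 },
    AnchorSpec { name: "breathe_slow", duration_s: 30 },
    AnchorSpec { name: "breathe_normal", duration_s: 30 },
    AnchorSpec { name: "small_move", duration_s: 20 },
    AnchorSpec { name: "sleep_posture", duration_s: 30 },
];

/// Outcome of the quality gate for one captured clip.
#[derive(Debug, PartialEq)]
enum GateResult {
    Accepted,
    /// The clip is discarded and the operator is prompted again.
    Retry(&'static str),
}

/// Assumed gate: both thresholds below are placeholders, not ADR values.
fn quality_gate(coherence: f32, snr_db: f32, saturated: bool) -> GateResult {
    const MIN_COHERENCE: f32 = 0.8; // assumption
    const MIN_SNR_DB: f32 = 6.0; // assumption
    if saturated {
        return GateResult::Retry("saturation");
    }
    if coherence < MIN_COHERENCE {
        return GateResult::Retry("low coherence");
    }
    if snr_db < MIN_SNR_DB {
        return GateResult::Retry("low SNR");
    }
    GateResult::Accepted
}

/// Frames captured for one anchor at the default sample rate.
fn frames(spec: &AnchorSpec) -> u32 {
    spec.duration_s * SAMPLE_HZ
}

/// Pure capture time, excluding prompts and retries.
fn total_capture_s() -> u32 {
    ANCHORS.iter().map(|a| a.duration_s).sum()
}
```

Summing the table this way gives 170 s of capture, so the ~4 minute figure leaves room for operator prompts and re-prompted anchors.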
+
+### 2.3 Stage 3 — feature extraction to labelled records (REUSE + bridge)
+
+Each accepted anchor clip is run through the existing extractor stack, baseline-subtracted per ADR-135, and packaged into an `AnchorFeature` record. No new DSP is invented — this stage is a *bridge*, not a new algorithm.
+
+| Feature group | Source module | Used by specialists |
+|---------------|---------------|---------------------|
+| CSI amplitude mean/variance | ADR-135 baseline subtraction + `signal_features.rs` | presence, posture |
+| CSI phase (sanitised, LO-aligned) | `phase_sanitizer` → `phase_align` | posture, heartbeat |
+| Doppler shift / micro-Doppler | `ruvsense/bvp.rs`, `breathing` path | breathing, small-move |
+| Micro-motion / intention lead | `ruvsense/intention.rs` | restlessness, anomaly |
+| Periodicity / spectral peaks | `bvp.rs` autocorrelation + FFT | breathing, heartbeat |
+| Long-window variance / drift | `ruvsense/longitudinal.rs` (Welford) | restlessness, presence |
+| Timing / inter-frame epoch | `c6_timesync` epoch, frame Δt | all (rhythm alignment) |
+| Field eigenmode coefficients | ADR-030 `field_model.rs` | posture, presence |
+
+`AnchorFeature` = `{ room_id, anchor_label, t_epoch_us, embedding: [f32; D] (backbone output), aux: { resp_hz?, doppler_spread, variance, periodicity_score, eigen_coeffs } }`. The backbone embedding is the *shared* representation; `aux` carries the cheap hand-features that let small heads specialise without re-learning DSP.
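
The record above could be laid out roughly as follows. This is a sketch under stated assumptions: the field names follow the ADR text, but `D = 8`, the `Aux` type name, and the `periodicity` helper (a plain normalised autocorrelation, standing in for the `bvp.rs` autocorrelation + FFT path) are illustrative and not taken from the crate.

```rust
// Hypothetical sketch: D, Aux and periodicity() are assumptions.

/// Embedding width; the real value depends on the ADR-150 backbone.
const D: usize = 8;

#[derive(Debug, Clone, PartialEq)]
struct Aux {
    /// Present only for anchors where a respiration rate is meaningful.
    resp_hz: Option<f32>,
    doppler_spread: f32,
    variance: f32,
    periodicity_score: f32,
    eigen_coeffs: Vec<f32>,
}

#[derive(Debug, Clone, PartialEq)]
struct AnchorFeature {
    room_id: String,
    anchor_label: String,
    t_epoch_us: u64,
    /// Backbone output: the shared representation.
    embedding: [f32; D],
    /// Cheap hand-features that let small heads specialise.
    aux: Aux,
}

/// Returns (dominant frequency in Hz, normalised autocorrelation at that lag),
/// searching only lags whose frequency lies in [min_hz, max_hz].
fn periodicity(signal: &[f32], sample_hz: f32, min_hz: f32, max_hz: f32) -> Option<(f32, f32)> {
    let n = signal.len();
    if n < 2 {
        return None;
    }
    // Remove the mean so a constant offset does not look periodic.
    let mean = signal.iter().sum::<f32>() / n as f32;
    let centred: Vec<f32> = signal.iter().map(|x| x - mean).collect();
    let energy: f32 = centred.iter().map(|x| x * x).sum();
    if energy <= f32::EPSILON {
        return None; // flat signal: no rhythm to report
    }
    let min_lag = (sample_hz / max_hz).floor().max(1.0) as usize;
    let max_lag = ((sample_hz / min_hz).ceil() as usize).min(n - 1);
    let mut best: Option<(usize, f32)> = None;
    for lag in min_lag..=max_lag {
        let r = centred[..n - lag]
            .iter()
            .zip(&centred[lag..])
            .map(|(a, b)| a * b)
            .sum::<f32>()
            / energy;
        if best.map_or(true, |(_, best_r)| r > best_r) {
            best = Some((lag, r));
        }
    }
    best.map(|(lag, r)| (sample_hz / lag as f32, r))
}
```

For a `breathe_normal` clip, the returned pair would populate `aux.resp_hz` and `aux.periodicity_score`; a flat or too-short clip yields `None`, which maps naturally onto the optional `resp_hz?` field.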
+
+### 2.4 Stage 4 — the specialist bank (NEW, the core contribution)
+
+A **`SpecialistBank`** is a versioned collection of small models scoped to one `room_id`, persisted as a single RVF cognitive container (`wifi-densepose-ruvector`). Each specialist is a *head* over the frozen backbone embedding, trained from the labelled `AnchorFeature` records via the existing `rapid_adapt.rs` LoRA machinery (test-time/few-shot training, contrastive + entropy losses), **not** a from-scratch network.
+
+| Specialist | Model type | Params (typ.) | Label source | Output |
+|------------|-----------|---------------|--------------|--------|
+| **breathing** | 1-D temporal head + periodicity regressor | ~8 KB LoRA + aux | `breathe_slow`/`breathe_normal` | resp rate (Hz) + confidence |
+| **heartbeat** | narrowband phase head (harmonic-aware) | ~12 KB | quiescent anchors + periodicity | HR (bpm) + confidence |
+| **sleep restlessness** | variance/drift classifier | ~4 KB | `sleep_posture` vs `small_move` | restlessness score [0,1] |
+| **posture** | k-way prototype classifier (HNSW NN) | prototypes only | `stand/sit/lie` anchors | posture class + margin |
+| **presence** | binary energy/eigenmode gate | ~2 KB | `empty` vs occupied anchors | presence prob |
+| **anomaly** | one-class / physically-impossible detector (`adversarial.rs`) | ~6 KB | baseline + all anchors (novelty) | anomaly score + veto flag |
+
+Design properties that follow from invariant (A):
+
+- **Independently versioned & swappable.** Re-enrolling breathing does not retrain posture. A specialist carries its own `{trained_at, anchor_set_hash, baseline_hash, backbone_rev}`.
+- **HNSW prototype storage for the classifiers.** Posture and presence are nearest-prototype lookups in the RVF index — no inference engine, microsecond latency, and new postures are added by inserting a prototype, not retraining.
+- **SONA online adaptation.** Each specialist may carry a SONA/MicroLoRA online-adaptation slot (`ruvllm_sona_*` / `microlora` primitives) so it tracks slow drift (furniture moved, seasonal RF change) between full re-enrollments, gated by ADR-135 baseline drift.
+- **Teacher–student distillation (optional, offline).** Where a labelled public corpus exists (MM-Fi, Wi-Pose), the ADR-150 backbone acts as teacher to pre-shape a head before per-room fine-tuning, improving cold-start. The *teacher* is global/HF; the *student head* is local.
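
The prototype-classifier property in the list above can be illustrated with a brute-force nearest-prototype lookup. This is a sketch only: it scans all prototypes linearly rather than using the HNSW index the ADR names, and `Prototype` and `classify` are hypothetical names.

```rust
// Hypothetical sketch: linear scan standing in for the RVF/HNSW index.

#[derive(Debug, Clone, PartialEq)]
struct Prototype {
    label: String,
    centroid: Vec<f32>,
}

/// Euclidean distance between two equal-length vectors.
fn dist(a: &[f32], b: &[f32]) -> f32 {
    a.iter()
        .zip(b)
        .map(|(x, y)| (x - y) * (x - y))
        .sum::<f32>()
        .sqrt()
}

/// Returns (label, margin), where margin is the distance to the runner-up
/// minus the distance to the winner. With one prototype the margin is infinite.
fn classify<'a>(protos: &'a [Prototype], embedding: &[f32]) -> Option<(&'a str, f32)> {
    let mut best: Option<(&Prototype, f32)> = None;
    let mut second = f32::INFINITY;
    for p in protos {
        let d = dist(&p.centroid, embedding);
        match best {
            // Not better than the current winner: it may still be the runner-up.
            Some((_, best_d)) if d >= best_d => second = second.min(d),
            // New winner: the previous winner becomes the runner-up.
            _ => {
                if let Some((_, best_d)) = best {
                    second = best_d;
                }
                best = Some((p, d));
            }
        }
    }
    best.map(|(p, d)| (p.label.as_str(), second - d))
}
```

Adding a posture is then a `push` of one more `Prototype`; nothing is retrained, which is the property the bullet on HNSW prototype storage describes.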
+
+**Invalidation contract.** The bank stores the `baseline_id` (the baseline UUID) it was trained against. **As implemented**, the runtime marks the bank `STALE` whenever the *current* baseline id differs from the trained one — a conservative trigger that catches re-calibration (room rearranged, AP moved, band changed) because any of those produces a new baseline. A finer **drift-threshold** trigger (mark STALE when ADR-135's per-subcarrier deviation exceeds τ *without* a full re-baseline) is a planned refinement (P6). Either way the runtime prompts re-enrollment rather than emitting silently wrong vitals — the calibration analogue of the #954 `DEGRADED` honesty rule: never report confident numbers from an invalid model.
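
A sketch of the invalidation contract, assuming hypothetical names (`SpecialistBank` fields, `BankStatus`, `gated`). The first method mirrors the as-implemented id comparison; the second is one possible shape for the planned P6 drift trigger, which the ADR has not specified.

```rust
// Hypothetical sketch: field and method names are illustrative only.

#[derive(Debug, Clone, PartialEq)]
struct SpecialistBank {
    room_id: String,
    /// Baseline UUID the bank was trained against.
    baseline_id: String,
    backbone_rev: String,
}

#[derive(Debug, Clone, Copy, PartialEq)]
enum BankStatus {
    Fresh,
    Stale,
}

impl SpecialistBank {
    /// As-implemented trigger: any change of baseline id marks the bank stale.
    fn status(&self, current_baseline_id: &str) -> BankStatus {
        if self.baseline_id == current_baseline_id {
            BankStatus::Fresh
        } else {
            BankStatus::Stale
        }
    }

    /// Assumed shape of the P6 refinement: the same baseline can still go
    /// stale when measured drift exceeds the threshold tau.
    fn status_with_drift(&self, current_baseline_id: &str, drift: f32, tau: f32) -> BankStatus {
        if self.status(current_baseline_id) == BankStatus::Stale || drift > tau {
            BankStatus::Stale
        } else {
            BankStatus::Fresh
        }
    }
}

/// Never emit a reading from a stale bank; the caller prompts re-enrollment.
fn gated<T>(status: BankStatus, reading: T) -> Option<T> {
    match status {
        BankStatus::Fresh => Some(reading),
        BankStatus::Stale => None,
    }
}
```

Returning `Option` rather than a number with a warning attached makes it hard for a downstream consumer to use a stale reading by accident.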
+
+### 2.5 Runtime — mixture of specialists with confidence gating
+
+At inference, the frozen backbone embeds each CSI window once; every specialist consumes that shared embedding and emits `{value, confidence}`. Fusion rules:
+
+- The **anomaly** specialist holds a **veto**: a high anomaly score (physically-impossible signal per `adversarial.rs`, or a coherence-gate `Reject`) suppresses positive vitals/posture output and raises a flag, rather than propagating a hallucinated reading.
+- **presence=0** short-circuits breathing/heartbeat/posture to `null` (you cannot have a respiration rate in an empty room).
+- Each emitted reading is tagged with the specialist's confidence and the `baseline_hash`/`backbone_rev` provenance, so downstream consumers (sensing-server, MQTT, Home Assistant) can gate on quality — consistent with ADR-135 coherence-gate semantics.
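
The fusion rules above can be sketched as a single pure function. All names (`HeadOutputs`, `RoomState`, `fuse`) and both thresholds are assumptions; posture and provenance tags are omitted for brevity, and leaving `present` reported during a veto is a choice of this sketch, not something the ADR states.

```rust
// Hypothetical sketch: types and thresholds are illustrative only.

#[derive(Debug, Clone, Copy, PartialEq)]
struct Reading {
    value: f32,
    confidence: f32,
}

/// Raw per-window outputs of the specialists.
#[derive(Debug, Clone, Copy, PartialEq)]
struct HeadOutputs {
    presence_prob: f32,
    breathing_hz: Reading,
    heart_bpm: Reading,
    anomaly_score: f32,
    /// True when the coherence gate returned `Reject` for this window.
    coherence_reject: bool,
}

#[derive(Debug, Clone, Copy, PartialEq)]
struct RoomState {
    present: bool,
    breathing_hz: Option<Reading>,
    heart_bpm: Option<Reading>,
    anomaly_flag: bool,
}

const ANOMALY_VETO: f32 = 0.8; // assumption
const PRESENCE_MIN: f32 = 0.5; // assumption

fn fuse(o: &HeadOutputs) -> RoomState {
    // Rule 1: the anomaly head (or a coherence Reject) vetoes vitals.
    let anomaly_flag = o.coherence_reject || o.anomaly_score >= ANOMALY_VETO;
    // Rule 2: an empty room short-circuits vitals to None.
    let present = o.presence_prob >= PRESENCE_MIN;
    let emit = present && !anomaly_flag;
    RoomState {
        present,
        breathing_hz: emit.then_some(o.breathing_hz),
        heart_bpm: emit.then_some(o.heart_bpm),
        anomaly_flag,
    }
}
```

Because each reading keeps its own `confidence`, a consumer can still apply a stricter threshold to heartbeat than to breathing after fusion.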
+
+### 2.6 Crate & module layout
+
+New bounded-context crate `wifi-densepose-calibration` (orchestration only; files < 500 lines, typed public APIs, event-sourced sessions — per CLAUDE.md):
+
+```
+wifi-densepose-calibration/
+  src/
+    lib.rs                 # public API: CalibrationSystem facade
+    enrollment.rs          # EnrollmentProtocol state machine (Stage 2)
+    anchor.rs              # Anchor, EnrollmentSession (event-sourced)
+    extract.rs             # AnchorFeature bridge over signal_features + ruvsense (Stage 3)
+    specialist.rs          # Specialist trait, SpecialistKind enum
+    bank.rs                # SpecialistBank (RVF container, versioning, invalidation)
+    runtime.rs             # MixtureOfSpecialists fusion + veto (Stage 5)
+    backbone.rs            # frozen ADR-150 encoder loader (hf_hub from_pretrained, cached)
+    error.rs
+```
+
+Dependencies (no duplication — orchestrates existing crates): `wifi-densepose-signal` (ruvsense extractors, ADR-135 baseline), `wifi-densepose-train` (`rapid_adapt`, `signal_features`, `trainer`), `wifi-densepose-ruvector` (RVF, HNSW), `wifi-densepose-nn` (backbone inference). The `wifi-densepose-cli` gains `enroll`, `train-room`, and `room-status` subcommands, sequenced after the existing ADR-135 `calibrate`.
+
+### 2.7 CLI flow (operator-facing)
+
+```bash
+# Stage 1 — environmental fingerprint (ADR-135, existing)
+wifi-densepose calibrate --room living-room --duration 60s     # empty room
+
+# Stage 2+3 — guided enrollment (NEW); prompts through 8 anchors, ~4 min
+wifi-densepose enroll --room living-room
+#   → "Stand still in view of the sensor…"  [✓ anchor accepted: coherence 0.91]
+#   → "Sit down…"                            [✗ low SNR, retrying]
+#   ...
+
+# Stage 4 — train the specialist bank (NEW); reuses cached HF backbone
+wifi-densepose train-room --room living-room \
+    --specialists breathing,heartbeat,restlessness,posture,presence,anomaly
+
+# Status / invalidation
+wifi-densepose room-status --room living-room
+#   baseline: fresh (drift 0.04 < 0.20) · backbone: rf-foundation@1.2.0
+#   breathing  ✓ trained 2026-06-09  conf p50 0.88
+#   heartbeat  ✓ trained 2026-06-09  conf p50 0.71
+#   posture    ✓ 3 prototypes (stand/sit/lie)
+#   anomaly    ✓  · presence ✓  · restlessness ✓
+```
+
+---
+
+## 3. Consequences
+
+### 3.1 Positive
+
+- **Fidelity through specialisation.** Six small calibrated heads are expected to beat one oversized general model on the cross-room/cross-subject frontier that ADR-150 quantified (a design expectation; the on-target loop described in Section 4 has not yet measured it), and each runs on-device in microseconds to milliseconds.
+- **Privacy by construction.** Only the room-agnostic backbone is public (HF). The environmental fingerprint and the person-specific heads stay local; no video, no skin, no cloud round-trip. This is the core differentiator vs cameras and the convenience differentiator vs wearables.
+- **Minutes, not hours.** Because the backbone carries world knowledge, ~4 minutes of clean anchors calibrates a room. Re-enrollment is incremental.
+- **Honest degradation.** The `baseline_hash` invalidation + anomaly veto mean an out-of-calibration room reports `STALE`/flagged rather than confidently wrong — the same honesty principle as the firmware `DEGRADED` flag.
+- **Composable & cheap to extend.** A new biological signal = a new small head over the same embedding, not a new model.
+
+### 3.2 Negative / risks
+
+- **Backbone dependency.** Every specialist rides on ADR-150's encoder; its quality and revision compatibility (`backbone_rev`) are a single point of leverage. Mitigation: pin `backbone_rev` in each specialist; distillation cold-start reduces sensitivity.
+- **Enrollment burden.** 4 minutes is small but non-zero, and anchor quality depends on the operator following prompts. Mitigation: adaptive re-prompting + quality gates; ship sane defaults so a partial bank (presence+posture) works after just the static anchors.
+- **Heartbeat is hard.** Sub-mm chest displacement at HR frequencies is near the ESP32-S3 noise floor; the heartbeat specialist will have lower and more variable confidence than breathing. The confidence-gated runtime surfaces this rather than faking it.
+- **Per-room storage proliferation.** A bank per room per person; needs a clear RVF lifecycle (list/prune/export) — handled by `bank.rs` versioning and the `room-status` CLI.
+
+### 3.3 Alternatives considered
+
+| Alternative | Verdict | Reason |
+|-------------|---------|--------|
+| One large general model for all signals | **Rejected** | The ADR-150 evidence: scale overfits rooms/subjects and collapses cross-domain; also slower, costlier, less private. Directly contradicts invariant (A). |
+| Cloud training of per-room models | **Rejected** | Violates invariant (B): would ship raw CSI of a person's home/sleep to a server. Local-first is the privacy promise. Federation (ADR-105) is the *opt-in* path for shared improvement, exchanging gradients/deltas, never raw CSI. |
+| Skip the backbone; train each specialist from scratch | **Rejected** | Reintroduces the "hours of data" requirement the user vision explicitly rejects, and loses cross-room priors. |
+| Fold this into ADR-135 | **Rejected** | ADR-135 is *room* calibration (no humans). This ADR is *human-anchor* enrollment + model training on top of it. Distinct lifecycles, distinct invalidation; kept as separate bounded contexts. |
+
+---
+
+## 4. Implementation phases
+
+| Phase | Scope | Exit criterion | Status |
+|-------|-------|----------------|--------|
+| **P1** | Scaffold `wifi-densepose-calibration` crate; `AnchorFeature` schema; (backbone via `hf_hub` deferred) | Crate + schema; unit tests | ✅ Done (crate + Stage-1 baseline via `calibrate`/`calibrate-serve`; HF backbone deferred) |
+| **P2** | `EnrollmentProtocol` + `anchor.rs` (event-sourced sessions) + CLI `enroll` with quality gates | 8-anchor enrollment; bad anchors re-prompt | ✅ Done (`anchor.rs`, `enrollment.rs`, CLI `enroll`) |
+| **P3** | `extract.rs` bridge → labelled records; baseline subtraction (ADR-135) | `AnchorFeature` records persisted per `room_id` | ✅ Done (`extract.rs`; autocorr periodicity + variance/motion) |
+| **P4** | `SpecialistBank` + presence/posture (prototype) + breathing (periodicity); persistence + versioning | `train-room` produces a bank; `room-status` reads it back | ✅ Done (`specialist.rs`, `bank.rs`, CLI `train-room`/`room-status`; JSON persistence — RVF/HNSW = future) |
+| **P5** | heartbeat + restlessness + anomaly specialists; `runtime.rs` mixture + veto + confidence gating | End-to-end RoomState on hardware; anomaly veto verified | ✅ Done (`runtime.rs`, CLI `room-watch`; breathing read live on COM8 ESP32) |
+| **P6** | Baseline-drift `STALE` invalidation; SONA online adaptation; optional ADR-105 federation; HF teacher–student distillation | Drift marks bank STALE; AetherArena entry | ◐ Partial (STALE done; SONA/federation/HF-backbone = follow-ups) |
+
+**Current status (2026-06-10):** Stages 1–5 implemented with *statistical* specialists (threshold/prototype/autocorrelation). 55 tests (35 unit incl. multistatic + 1 full-loop integration + 19 CLI), all passing under qemu-aarch64. **Validation scope is precise:** baseline capture + HTTP API + auth are proven on real CSI (Pi-5 nexmon, 6,813 frames; and an ESP32-S3). The complete `baseline → enroll → train-room → infer` loop is now **proven in-process** on deterministic synthetic CSI (`tests/full_loop.rs`: clean baseline with zero motion flags, 8/8 anchors through the quality gate, 6 specialists trained, JSON bank round-trip, trained-bank inference 18±2 BPM positive / absent negative / foreign-baseline STALE; seed-robust). The one live runtime signal (breathing ~16–31 BPM via `room-watch`) used the *stateless* breathing head, **not** a trained bank; the clean empty-room loop has **not** yet run on-target — the remaining gap is strictly the hardware session (empty room + operator anchors). The four behavioral findings from the full-loop test (z-band squeeze, variance-only presence, ungated hz embedding, heart-band lag-floor leakage) are FIXED and regression-guarded — see the integration doc §7. SOTA-intake decisions affecting this system (geometry conditioning, checkerboard alignment) are recorded in ADR-152. Open refinements: `--source-format adr018v6` (drive from the Pi's own nexmon), phase-based breathing carrier, RVF/HNSW storage, and the ADR-150 frozen HF backbone the specialists would distill from.
+
+Validation per CLAUDE.md: `cargo test --workspace --no-default-features` green; hardware verification on the ESP32-S3 (currently COM8) before any release; witness bundle regenerated if the proof surface changes.
+
+---
+
+## 5. Summary
+
+> Big models understand the world. Small ruVector models understand *your room*.
+
+ADR-151 makes that operational: a local-first `baseline → enroll → extract → train` pipeline that turns ~4 minutes of clean human anchors — layered on ADR-135's empty-room fingerprint and ADR-150's Hugging-Face-published invariant backbone — into a versioned bank of tiny, specialised, privacy-preserving models for breathing, heartbeat, restlessness, posture, presence, and anomaly. Specialisation over scale; local heads over a shared base; honest `STALE` degradation over confident error.
--- a/api-docs/adr/ADR-152-wifi-pose-sota-2026-intake.md
+++ b/api-docs/adr/ADR-152-wifi-pose-sota-2026-intake.md
@@ -0,0 +1,98 @@
+# ADR-152: WiFi-Pose SOTA 2026 Intake — Geometry-Conditioned Calibration, External Benchmarks, and the Foundation-Encoder Training Recipe
+
+| Field | Value |
+|-------|-------|
+| **Status** | Proposed |
+| **Date** | 2026-06-10 |
+| **Deciders** | ruv |
+| **Codebase target** | `wifi-densepose-calibration` (geometry conditioning, ADR-151 Stage 2), `wifi-densepose-train` (camera-supervised path, MAE recipe), `wifi-densepose-cli` (benchmark harness), docs |
+| **Relates to** | ADR-151 (Per-Room Calibration), ADR-150 (RF Foundation Encoder), ADR-135 (Empty-Room Baseline), ADR-079 (Camera-Supervised Pose), ADR-027 (MERIDIAN), ADR-024 (AETHER), ADR-149 (AetherArena), ADR-029 (Multistatic) |
+| **Research provenance** | Deep-research run 2026-06-10: 22 sources fetched, 110 claims extracted, 25 adversarially verified (3-vote), 24 confirmed / 1 refuted. Evidence grades per source below. |
+
+---
+
+## 1. Context
+
+A structured survey of the 2025–2026 WiFi human-sensing state of the art was run on 2026-06-10 to answer: *what should RuView integrate next, and does anything published invalidate our current direction?* Every claim below was verified against the primary source by independent adversarial reviewers; **evidence grades distinguish what the papers measured from what they merely claim**. Almost all performance numbers are author-self-reported preprint results — treated here as CLAIMED until reproduced on our hardware.
+
+### 1.1 The five verified findings
+
+**(F1) "Coordinate overfitting" is a named, diagnosed failure mode of camera-supervised WiFi pose — and our ADR-079 pipeline has the exact shape of it.**
+PerceptAlign (arXiv [2601.12252](https://arxiv.org/abs/2601.12252), accepted ACM MobiCom 2026) shows that models regressing CSI directly to camera-frame coordinates memorize the deployment-specific transceiver layout; SOTA baselines degrade to >600 mm MPJPE in unseen scenes. Their fix is cheap: a <5-minute calibration using two checkerboards and a few photos to align WiFi and vision in one shared 3D frame, plus **fusing transceiver-position embeddings with CSI features**. Claimed: −12.3% in-domain error, −60%+ cross-domain error. They release the claimed-largest cross-domain 3D WiFi pose dataset (21 subjects, 5 scenes, 18 actions, **7 device layouts**). *Evidence: improvements CLAIMED (preprint w/ MobiCom acceptance); the failure mode itself is corroborated across the cross-domain literature — and independently by our own ADR-150 data (81.63% in-domain vs ~11.6% leakage-free cross-subject torso-PCK).*
+
+**(F2) An external model named "WiFlow" claims 97.25% PCK@20 with 2.23M params and ships everything.**
+arXiv [2602.08661](https://arxiv.org/abs/2602.08661) (Apr 2026) — spatio-temporal-decoupled CSI pose, 97.25% PCK@20 / 99.48% PCK@50 / 0.007 m MPJPE, 2.23M parameters (~2.2 MB int8). Code, pretrained weights, and a 360k-sample CSI-pose dataset are public under Apache-2.0 ([repo](https://github.com/DY2434/WiFlow-WiFi-Pose-Estimation-with-Spatio-Temporal-Decoupling), Kaggle dataset). *Evidence: artifact availability MEASURED (verified by direct repo inspection); PCK numbers CLAIMED (5-subject, in-domain, self-collected dataset; hardware unspecified; 15 keypoints vs our 17).* ⚠️ **Name collision:** this is unrelated to RuView's internal WiFlow model. In all RuView docs the external model is referred to as **WiFlow-STD (DY2434)**.
+
+**(F3) For CSI foundation encoders, data scale — not model capacity — is the bottleneck, and the tokenization recipe is now known.**
+UNSW's MAE pretraining study (arXiv [2511.18792](https://arxiv.org/abs/2511.18792), Nov 2025) — the largest heterogeneous CSI pretraining run to date (1,320,892 samples, 14 public datasets incl. MM-Fi, Widar 3.0, Person-in-WiFi 3D; 4 devices; 2.4/5/6 GHz; 20–160 MHz) — reports zero-shot cross-domain gains of 2.2–15.7% over supervised baselines, with unseen-domain performance scaling **log-linearly with pretraining data, unsaturated at 1.3M samples**, while ViT-Base adds only 0.4–0.9% over ViT-Small. Optimal recipe: **80% masking ratio, small (30,3) patches** (+4.7% over (40,5) by preserving fine temporal dynamics). *Evidence: MEASURED within-study (ablations verified in body text) but preprint; downstream tasks are classification, NOT pose — pose transfer is a hypothesis. Independently corroborates ADR-150's finding that capacity hurts cross-subject.*
+
+**(F4) Hardware/standards: 802.11bf is finished; Espressif ships official sensing; Wi-Fi 6 AP CSI is reachable.**
+- **IEEE 802.11bf-2025** published **2025-09-26** (verified against the IEEE SA record) — sensing standardization is complete for both sub-7 GHz and >45 GHz, with formal sensing setup/feedback procedures. No ESP32 silicon implements it yet. *Evidence: MEASURED (standards-body record).*
+- **Espressif `esp_wifi_sensing`** (Apache-2.0, v0.1.x, ESP Component Registry): official CSI presence/motion FSM; esp-csi actively maintained (commit 2026-04-22, verified), CSI confirmed across ESP32/S2/C3/S3/C5/C6/C61. *Evidence: MEASURED (vendor pages + commit log).* ⚠️ A stronger "drop-in compatible with RuView nodes" claim was **REFUTED 0-3** — WiFi-6 parts use a different CSI acquisition config struct.
+- **ZTECSITool** (arXiv [2506.16957](https://arxiv.org/abs/2506.16957), [code](https://github.com/WiFiZTE2025/ZTE_WiFi_Sensing)): CSI from commercial Wi-Fi 6 APs at up to 160 MHz / 512 subcarriers (~5–10× ESP32 subcarrier count; the gain is aperture, not per-Hz granularity). Firmware is gated behind a ZTE serial-number approval. *Evidence: capability CLAIMED by the vendor-authored tool paper; code artifact MEASURED.*
+
+**(F5) Nothing in 2025–2026 does full DensePose UV regression from commodity WiFi.** Keypoint pose remains the field's frontier. Three "wireless foundation model" papers were screened out by full-text inspection (HeterCSI = simulated cellular channels only; the NeurIPS-2025 FMCW pilot = mmWave radar, presence-only; arXiv 2509.15258 = survey, no artifacts). *Evidence: MEASURED (absence verified by full-text inspection of the candidates that surfaced; absence of evidence across the whole literature is necessarily weaker).*
+
+### 1.2 What this means for the ADR-151 calibration system
+
+ADR-151's enrollment protocol captures guided human anchors but does **not** record or condition on transceiver geometry. F1 says that omission is precisely the thing that makes camera-supervised (and, plausibly, anchor-supervised) heads layout-brittle. ADR-151's per-room thesis ("teach the room before you teach the model") is *strengthened* by F1 — PerceptAlign is independent evidence that layout must be modeled explicitly — and the fix composes naturally with our Stage-2 enrollment.
+
+ADR-150's masked-CSI-encoder design is *validated* by F3, which also hands us the hyperparameters and the priority call: **collect/aggregate more heterogeneous CSI before scaling the encoder.**
+
+## 2. Decision
+
+Adopt four changes, ordered by effort-vs-gain:
+
+### 2.1 Geometry-condition the calibration system (extends ADR-151 Stage 2) — ACCEPTED
+
+1. **Record transceiver geometry at enrollment.** `EnrollmentProtocol` gains an optional `NodeGeometry` record per node (position estimate, antenna orientation, inter-node distances where known). Stored alongside the room baseline in the bank; schema-versioned so existing banks remain readable.
+2. **Fuse geometry embeddings into specialist training.** Where a specialist head consumes the (future, ADR-150) backbone embedding, concatenate a small learned embedding of `NodeGeometry` — the PerceptAlign mechanism, transplanted to our per-room banks. Statistical specialists (current) ignore it; LoRA heads (ADR-151 P6) consume it.
+3. **Adopt the two-checkerboard alignment for the camera-supervised path (ADR-079).** When MediaPipe supervision is used, calibrate camera↔WiFi into one shared 3D frame before regression (<5 min, two checkerboards, a few photos). This is the direct defense against F1 for our 92.9%-PCK@20 pipeline.
+4. **Evaluate on the PerceptAlign cross-domain dataset** (21 subjects / 7 layouts) as the MERIDIAN cross-layout benchmark — *gated on confirming its license and downloadability* (open question; repo per paper: github.com/Trymore-lab/PerceptAlign).
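
The optional geometry record from item 1 and the fusion step from item 2 could look roughly like the sketch below. It is an illustration under stated assumptions, not the crate's schema: `NodeGeometrySketch`, its field names, the 7-value layout and `fuse` are all hypothetical, and the "small learned embedding" is replaced by a fixed hand-built feature vector with presence masks so that a missing record degrades to zeros instead of an error.

```rust
/// Hypothetical per-node geometry record. Every field is optional so that
/// banks enrolled without geometry remain valid.
#[derive(Debug, Clone, Copy, Default, PartialEq)]
pub struct NodeGeometrySketch {
    /// Estimated node position in metres, in a room-fixed frame (assumed).
    pub position_m: Option<[f32; 3]>,
    /// Antenna yaw in radians about the vertical axis (assumed convention).
    pub antenna_yaw_rad: Option<f32>,
}

/// Euclidean distance between two nodes, available only if both positions are known.
pub fn inter_node_distance_m(a: &NodeGeometrySketch, b: &NodeGeometrySketch) -> Option<f32> {
    let (pa, pb) = (a.position_m?, b.position_m?);
    let sq: f32 = pa.iter().zip(pb.iter()).map(|(x, y)| (x - y) * (x - y)).sum();
    Some(sq.sqrt())
}

/// Flatten geometry into a fixed-length vector with presence masks.
/// Layout: [x, y, z, has_position, sin(yaw), cos(yaw), has_yaw].
pub fn geometry_features(g: &NodeGeometrySketch) -> [f32; 7] {
    let mut out = [0.0f32; 7];
    if let Some(p) = g.position_m {
        out[..3].copy_from_slice(&p);
        out[3] = 1.0;
    }
    if let Some(yaw) = g.antenna_yaw_rad {
        out[4] = yaw.sin();
        out[5] = yaw.cos();
        out[6] = 1.0;
    }
    out
}

/// Concatenate a backbone embedding with the geometry features.
pub fn fuse(embedding: &[f32], g: &NodeGeometrySketch) -> Vec<f32> {
    let mut fused = embedding.to_vec();
    fused.extend_from_slice(&geometry_features(g));
    fused
}
```

A head that consumes `fuse(...)` sees the same input width whether or not geometry was recorded, which is one way to keep the record optional without branching in the model.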
+
+### 2.2 Benchmark against WiFlow-STD (DY2434) — ACCEPTED
+
+Pull the Apache-2.0 weights + 360k-sample dataset; run three measurements: (a) their model on their data (reproduce 97.25% claim), (b) their model fine-tuned on our ESP32 17-keypoint eval set, (c) our internal WiFlow on their dataset (15-keypoint subset mapping). Until (a)–(c) are measured, **no RuView doc may cite 97.25% as a comparable number** — different dataset, subjects, keypoints.
+
+### 2.3 Apply the UNSW recipe to the ADR-150 encoder — ACCEPTED (amends ADR-150 §2.3)
+
+- Pretraining corpus: start from the same 14 public datasets (1.3M samples) + our home/MM-Fi frames; data aggregation takes priority over architecture work.
+- Tokenization: 80% masking, (30,3)-class small patches; encoder stays ViT-Small-class (~15M params) — F3 and our own DANN/transformer results agree that capacity does not pay.
+- The published log-linear scaling (unsaturated) sets the expectation: more heterogeneous CSI in, better zero-shot out.
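
To make the tokenization numbers concrete, the sketch below counts patches for one CSI window and picks the 80% to mask. The assumptions are ours, not the paper's or the crate's: the patch size (30,3) is read as 30 samples along the first axis by 3 along the second, the 300×114 window used in the note is only an example, and a small linear congruential generator stands in for a real RNG so the sketch stays std-only and reproducible.

```rust
/// Number of non-overlapping patches along each axis of a 2-D window.
/// Trailing samples that do not fill a whole patch are dropped.
pub fn patch_grid(shape: (usize, usize), patch: (usize, usize)) -> (usize, usize) {
    (shape.0 / patch.0, shape.1 / patch.1)
}

/// Choose which patch indices to mask, returned in ascending order.
/// Uses a partial Fisher-Yates shuffle driven by a 64-bit LCG.
pub fn mask_indices(n_patches: usize, ratio: f32, seed: u64) -> Vec<usize> {
    let n_mask = (((n_patches as f32) * ratio).round() as usize).min(n_patches);
    let mut idx: Vec<usize> = (0..n_patches).collect();
    let mut state = seed;
    for i in 0..n_mask {
        state = state
            .wrapping_mul(6364136223846793005)
            .wrapping_add(1442695040888963407);
        // Pick a position in the not-yet-fixed tail and swap it into slot i.
        let j = i + ((state >> 33) as usize) % (n_patches - i);
        idx.swap(i, j);
    }
    idx.truncate(n_mask);
    idx.sort_unstable();
    idx
}
```

For a hypothetical 300×114 window, `patch_grid((300, 114), (30, 3))` gives a 10×38 grid of 380 patches, of which 304 are masked and 76 stay visible to the encoder.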
+
+### 2.4 Hardware watch items — ACCEPTED (no code now)
+
+- **802.11bf**: track silicon/certification; revisit when any commodity chipset exposes standardized sensing measurements. Our opportunistic CSI extraction remains the mechanism until then.
+- **esp_wifi_sensing**: benchmark our presence pipeline against the vendor FSM (one afternoon; useful external baseline). Do **not** treat as drop-in (refuted claim).
+- **ZTECSITool AP**: optional high-resolution anchor node for the ADR-029 multistatic mesh — procurement-gated; only pursue if a 160 MHz anchor materially helps tomography.
+
+### 2.5 Explicitly NOT adopted
+
+- No pivot toward "wireless foundation model" papers that don't ship WiFi-CSI artifacts (HeterCSI, FMCW pilot, surveys).
+- No DensePose-UV work item: the field has not demonstrated UV regression from commodity WiFi; keypoints remain our supervised target (F5).
+
+## 3. Consequences
+
+**Positive:** the calibration system gains the one mechanism (geometry conditioning) the 2026 literature identifies as the difference between layout-brittle and layout-robust supervised WiFi pose; ADR-150 gets a measured training recipe instead of a guessed one; we acquire two external benchmarks (WiFlow-STD, PerceptAlign dataset) to keep our claims honest.
+
+**Negative / risks:** geometry records add schema surface to banks (mitigated: optional + versioned); every adopted number is preprint-grade until our own benchmark runs land (mitigated by §2.2's no-citation rule); PerceptAlign dataset license is unconfirmed (gated); name collision risk in docs (mitigated: "WiFlow-STD (DY2434)" naming rule).
+
+**Re-check by 2026-12:** 802.11bf silicon, esp_wifi_sensing maturity (v0.1.x today), and the preprint field (newest source Apr 2026).
+
+## 4. Open questions (carried from the research run)
+
+1. Does WiFlow-STD retain accuracy when fine-tuned on ESP32-S3/C6 CSI (fewer subcarriers, lower SNR), scored on our 17-keypoint set? (§2.2 answers this.)
+2. Is the PerceptAlign dataset downloadable under a usable license, and does the two-checkerboard procedure work with ESP32 transceiver geometry? (§2.1.4 gate.)
+3. Will esp_wifi_sensing evolve toward 802.11bf compliance, replacing opportunistic CSI extraction?
+
+## 5. Source register (evidence-graded)
+
+| Source | Type | Used for | Grade |
+|---|---|---|---|
+| arXiv 2601.12252 (PerceptAlign, MobiCom'26) | preprint+acceptance | F1, §2.1 | CLAIMED numbers; failure mode corroborated |
+| arXiv 2602.08661 + DY2434 repo (WiFlow-STD) | preprint + code | F2, §2.2 | numbers CLAIMED; artifacts MEASURED |
+| arXiv 2511.18792 (UNSW MAE) | preprint | F3, §2.3 | ablations MEASURED in-study; pose transfer hypothesis |
+| IEEE SA 802.11bf-2025 record | standards body | F4, §2.4 | MEASURED |
+| Espressif component registry + esp-csi repo | vendor | F4, §2.4 | MEASURED; "drop-in" REFUTED 0-3 |
+| arXiv 2506.16957 + ZTE repo (ZTECSITool) | vendor preprint + code | F4, §2.4 | capability CLAIMED; code MEASURED |
+| arXiv 2601.18200 (HeterCSI), OpenReview LMufK3vzE5 (FMCW pilot), arXiv 2509.15258 (survey) | preprints | F5, §2.5 (screened out) | MEASURED (full-text inspection) |
--- a/api-docs/adr/README.md
+++ b/api-docs/adr/README.md
@ -79,6 +79,10 @@ Statuses: **Proposed** (under discussion), **Accepted** (approved and/or impleme
 | [ADR-023](ADR-023-trained-densepose-model-ruvector-pipeline.md) | Trained DensePose Model with RuVector Pipeline | Proposed |
 | [ADR-024](ADR-024-contrastive-csi-embedding-model.md) | Project AETHER: Contrastive CSI Embeddings | Required |
 | [ADR-027](ADR-027-cross-environment-domain-generalization.md) | Project MERIDIAN: Cross-Environment Generalization | Proposed |
+| [ADR-149](ADR-149-public-community-leaderboard-huggingface.md) | AetherArena: public spatial-intelligence benchmark on Hugging Face | Proposed |
+| [ADR-150](ADR-150-rf-foundation-encoder.md) | RF Foundation Encoder: pose-preserving, subject/room/device-invariant CSI embedding | Proposed |
+| [ADR-151](ADR-151-room-calibration-specialist-training.md) | Per-Room Calibration & Specialized Model Training (room-first → bank of small ruVector specialists) | Accepted |
+| [ADR-152](ADR-152-wifi-pose-sota-2026-intake.md) | WiFi-Pose SOTA 2026 Intake: geometry-conditioned calibration, external benchmarks, foundation-encoder recipe | Proposed |

 ### Platform and UI

@ -93,6 +97,8 @@ Statuses: **Proposed** (under discussion), **Accepted** (approved and/or impleme
 | [ADR-036](ADR-036-rvf-training-pipeline-ui.md) | Training Pipeline UI Integration | Proposed |
 | [ADR-043](ADR-043-sensing-server-ui-api-completion.md) | Sensing Server UI API Completion (14 endpoints) | Accepted |
 | [ADR-115](ADR-115-home-assistant-integration.md) | Home Assistant integration via MQTT auto-discovery + Matter bridge (HA-DISCO + HA-FABRIC + HA-MIND) | Accepted (MQTT track) / Proposed (Matter SDK P8b) |
+| [ADR-147](ADR-147-adam-mode-light-theme.md) | adam-mode — light theme toggle for the three.js realtime demo | Proposed |
+| [ADR-148](ADR-148-yoga-mode-pose-system.md) | yoga-mode — yoga pose detection, classification, and scoring for the three.js realtime demo | Proposed |

 ### Architecture and infrastructure

--- a/api-docs/integration/calibration-appliance-integration.md
+++ b/api-docs/integration/calibration-appliance-integration.md
@ -0,0 +1,234 @@
+# Per-Room Calibration — Integration Overview (for `cognitum-one/v0-appliance`)
+
+**Audience:** integrators wiring the RuView per-room calibration system (ADR-151) into the
+Cognitum V0 appliance (`cognitum-v0`, Pi 5 + Hailo). This document is the contract +
+deployment spec: data formats, API surface, crate API, and the appliance integration plan.
+
+**Source of truth:** crate `v2/crates/wifi-densepose-calibration` + CLI `v2/crates/wifi-densepose-cli`
+(`calibrate`, `calibrate-serve`, `enroll`, `train-room`, `room-status`, `room-watch`) on this PR's branch.
+
+---
+
+## 1. What it is
+
+"Teach the room before you teach the model." A local-first pipeline that turns a few minutes of
+clean human anchors — layered on an empty-room baseline — into a versioned **bank of small,
+room-calibrated specialists** for presence, posture, breathing, heartbeat, restlessness, and anomaly.
+
+```
+baseline (ADR-135)  →  enroll (anchors + quality gate)  →  extract (features)  →  train (specialist bank)  →  runtime (mixture + veto)
+   environmental         stand/sit/lie/breathe/move        periodicity/variance     6 small models             RoomState per window
+   fingerprint           (re-prompts bad captures)                                  + STALE invalidation       (+ multistatic fusion)
+```
+
+**Design invariants (carry these into the appliance):**
+- **Specialisation over scale** — six tiny models (threshold / nearest-prototype / autocorrelation), not one big model. They run in microseconds on a Pi CPU; **they do not need the Hailo HAT**.
+- **Local-first** — baselines + per-room banks stay on the device. Cross-room sharing is *model deltas* (federation, ADR-105), **never raw CSI**.
+- **Honest degradation** — baseline drift marks a bank `STALE`; a physically-implausible window is vetoed rather than emitting a hallucinated reading.
+
+---
+
+## 2. Tiering on the Pi 5 + Hailo (what runs where)
+
+| Tier | Runs on | What | Status |
+|------|---------|------|--------|
+| **CSI source** | ESP32-S3/C6 nodes (`edge_tier=0` raw CSI) | `0xC5110001` frames over UDP | shipping (v0.7.1-esp32) |
+| **Calibration service** | **Pi 5 CPU** (aarch64) | this crate: baseline/enroll/train/runtime + HTTP API | **this PR** |
+| **Shared backbone (optional)** | **Hailo HAT (HAILO10H)** | ADR-150 RF Foundation Encoder + neural pose head as HEF | future (ADR-150) |
+
+> The appliance's WiFi (`wlan0`) is `managed` with no nexmon — **the Pi is a CSI *processor*, not a CSI radio.** CSI arrives from the ESP32 nodes (the existing `ruview-vitals-worker:50054` already receives it). Calibration *consumes* that stream; it does not sense directly.
+
+---
+
+## 3. Data contracts (the integration surface)
+
+### 3.1 CSI ingest — ESP32 `0xC5110001` (UDP, little-endian)
+
+```
+Offset  Size  Field
+ 0      4     magic = 0xC511_0001 (LE u32)
+ 4      1     node_id (u8)            ← group multistatic nodes by this
+ 5      1     n_antennas (u8)
+ 6      1     n_subcarriers (u8)      ← 52/64 (HT20), 114 (HT40), 242 (HE20)
+ 7      1     reserved
+ 8      2     freq_mhz (LE u16)
+10      4     sequence (LE u32)
+14      1     rssi (i8)
+15      1     noise_floor (i8)
+16      4     reserved
+20      2·n_antennas·n_subcarriers   IQ data (size in bytes): n_antennas·n_subcarriers pairs, each i (i8) then q (i8)
+```
+Parser reference: `wifi-densepose-cli/src/calibrate.rs::parse_csi_packet`. The appliance can reuse the
+ESP32 stream the vitals worker already receives, or tee it to the calibration UDP port.
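
The layout above can be decoded with a few lines of std-only Rust. This is a minimal sketch written from the offset table, not a copy of `parse_csi_packet`: the names `CsiFrameSketch`, `CsiParseError` and `parse_csi_frame` are hypothetical, and ignoring trailing bytes past the declared payload is an assumption.

```rust
pub const CSI_MAGIC: u32 = 0xC511_0001;
pub const CSI_HEADER_LEN: usize = 20;

#[derive(Debug, Clone, PartialEq)]
pub struct CsiFrameSketch {
    pub node_id: u8,
    pub n_antennas: u8,
    pub n_subcarriers: u8,
    pub freq_mhz: u16,
    pub sequence: u32,
    pub rssi: i8,
    pub noise_floor: i8,
    /// One (i, q) pair per antenna/subcarrier sample, in wire order.
    pub iq: Vec<(i8, i8)>,
}

#[derive(Debug, PartialEq)]
pub enum CsiParseError {
    TooShort,
    BadMagic,
    Truncated { expected: usize, got: usize },
}

pub fn parse_csi_frame(buf: &[u8]) -> Result<CsiFrameSketch, CsiParseError> {
    if buf.len() < CSI_HEADER_LEN {
        return Err(CsiParseError::TooShort);
    }
    if u32::from_le_bytes([buf[0], buf[1], buf[2], buf[3]]) != CSI_MAGIC {
        return Err(CsiParseError::BadMagic);
    }
    let (n_antennas, n_subcarriers) = (buf[5], buf[6]);
    // Two bytes (i, q) per antenna/subcarrier sample.
    let expected = CSI_HEADER_LEN + 2 * n_antennas as usize * n_subcarriers as usize;
    if buf.len() < expected {
        return Err(CsiParseError::Truncated { expected, got: buf.len() });
    }
    let iq = buf[CSI_HEADER_LEN..expected]
        .chunks_exact(2)
        .map(|p| (p[0] as i8, p[1] as i8))
        .collect();
    Ok(CsiFrameSketch {
        node_id: buf[4],
        n_antennas,
        n_subcarriers,
        freq_mhz: u16::from_le_bytes([buf[8], buf[9]]),
        sequence: u32::from_le_bytes([buf[10], buf[11], buf[12], buf[13]]),
        rssi: buf[14] as i8,
        noise_floor: buf[15] as i8,
        iq,
    })
}
```

Checking the declared payload length before slicing keeps a short or corrupted UDP datagram from panicking the ingest task.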
+
+### 3.2 Baseline (ADR-135) — binary, magic `0xCA1B_0001`
+
+```
+Header (16 B LE): magic(4)=0xCA1B0001, version(1)=1, tier(1) {0=HT20,1=HT40,2=HE20,3=HE40},
+                  reserved(2), captured_at_unix_s(8, i64)
+Body:             frame_count(8,u64), num_subcarriers(4,u32),
+                  per subcarrier: amp_mean(f32), amp_variance(f32), phase_mean(f32), phase_dispersion(f32)
+```
+Produced by `calibrate` / `calibrate-serve`; `BaselineCalibration::{to_bytes,from_bytes}`. A baseline's
+UUID (`calibration_uuid()`) is the `baseline_id` referenced by enrollments and banks for STALE checks.
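
For integrators who need to sanity-check a baseline file without linking the crate, the header can be read directly from the layout above. This is a sketch from the documented byte layout only: `BaselineHeaderSketch`, `parse_baseline_header` and `expected_baseline_len` are hypothetical names (the real entry points are `BaselineCalibration::{to_bytes,from_bytes}`), and the length formula assumes the body is packed with no padding.

```rust
pub const BASELINE_MAGIC: u32 = 0xCA1B_0001;
pub const BASELINE_HEADER_LEN: usize = 16;

#[derive(Debug, Clone, Copy, PartialEq)]
pub struct BaselineHeaderSketch {
    pub version: u8,
    pub tier: u8,
    pub captured_at_unix_s: i64,
}

/// Human-readable name for the tier byte, per the header table.
pub fn tier_name(tier: u8) -> Option<&'static str> {
    match tier {
        0 => Some("HT20"),
        1 => Some("HT40"),
        2 => Some("HE20"),
        3 => Some("HE40"),
        _ => None,
    }
}

/// Decode the 16-byte little-endian header; `None` on short input or wrong magic.
pub fn parse_baseline_header(buf: &[u8]) -> Option<BaselineHeaderSketch> {
    if buf.len() < BASELINE_HEADER_LEN {
        return None;
    }
    if u32::from_le_bytes([buf[0], buf[1], buf[2], buf[3]]) != BASELINE_MAGIC {
        return None;
    }
    // Bytes 6..8 are reserved; the timestamp occupies bytes 8..16.
    let mut ts = [0u8; 8];
    ts.copy_from_slice(&buf[8..16]);
    Some(BaselineHeaderSketch {
        version: buf[4],
        tier: buf[5],
        captured_at_unix_s: i64::from_le_bytes(ts),
    })
}

/// Header + frame_count (u64) + num_subcarriers (u32) + four f32 per subcarrier.
pub fn expected_baseline_len(num_subcarriers: u32) -> usize {
    BASELINE_HEADER_LEN + 8 + 4 + 16 * num_subcarriers as usize
}
```

Under the packed-body assumption, a 52-subcarrier HT20 baseline comes to 860 bytes, which gives a cheap size check before handing the buffer to `from_bytes`.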
+
+### 3.3 Enrollment output — JSON (`enroll` → `train-room`)
+
+```jsonc
+{
+  "room_id": "living-room",
+  "baseline_id": "<uuid>",
+  "fs_hz": 15.0,
+  "anchors": [
+    { "room_id": "living-room", "label": "stand_still",
+      "features": { "mean": f32, "variance": f32, "motion": f32,
+                    "breathing_score": f32, "breathing_hz": f32,
+                    "heart_score": f32, "heart_hz": f32 } }
+  ],
+  "session": { "room_id": "...", "baseline_id": "...", "events": [ /* event-sourced audit log */ ] }
+}
+```
+Anchor labels (fixed sequence, **JSON wire = snake_case**, test-enforced): `empty, stand_still, sit, lie_down, breathe_slow, breathe_normal, small_move, sleep_posture`.
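
The fixed sequence and its snake_case wire names can be mirrored on the consumer side as shown below. This is a hedged sketch: the crate exports `anchor::AnchorLabel`, but the enum here (`AnchorLabelSketch`), its variant names other than `StandStill`, and the helper `next_anchor` are illustrative assumptions.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum AnchorLabelSketch {
    Empty,
    StandStill,
    Sit,
    LieDown,
    BreatheSlow,
    BreatheNormal,
    SmallMove,
    SleepPosture,
}

/// Enrollment order, as listed in the contract.
pub const ANCHOR_SEQUENCE: [AnchorLabelSketch; 8] = [
    AnchorLabelSketch::Empty,
    AnchorLabelSketch::StandStill,
    AnchorLabelSketch::Sit,
    AnchorLabelSketch::LieDown,
    AnchorLabelSketch::BreatheSlow,
    AnchorLabelSketch::BreatheNormal,
    AnchorLabelSketch::SmallMove,
    AnchorLabelSketch::SleepPosture,
];

impl AnchorLabelSketch {
    /// snake_case name used on the JSON wire.
    pub fn wire_name(self) -> &'static str {
        match self {
            AnchorLabelSketch::Empty => "empty",
            AnchorLabelSketch::StandStill => "stand_still",
            AnchorLabelSketch::Sit => "sit",
            AnchorLabelSketch::LieDown => "lie_down",
            AnchorLabelSketch::BreatheSlow => "breathe_slow",
            AnchorLabelSketch::BreatheNormal => "breathe_normal",
            AnchorLabelSketch::SmallMove => "small_move",
            AnchorLabelSketch::SleepPosture => "sleep_posture",
        }
    }

    /// Parse a wire name; anything that is not exact snake_case is rejected.
    pub fn from_wire(s: &str) -> Option<Self> {
        ANCHOR_SEQUENCE.iter().copied().find(|l| l.wire_name() == s)
    }
}

/// First anchor in the fixed sequence that has not been accepted yet.
pub fn next_anchor(accepted: &[AnchorLabelSketch]) -> Option<AnchorLabelSketch> {
    ANCHOR_SEQUENCE.iter().copied().find(|l| !accepted.contains(l))
}
```

A UI can use something like `next_anchor` to decide which guided prompt to show, treating `None` as "enrollment complete".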
+
+### 3.4 Specialist bank — JSON (`train-room` → `room-watch` / runtime)
+
+```jsonc
+{
+  "room_id": "living-room",
+  "baseline_id": "<uuid>",            // drift vs current → STALE
+  "trained_at_unix_s": 0,
+  "anchor_count": 6,
+  "presence":     { "threshold": f32, "occupied_var": f32 } | null,
+  "posture":      { "prototypes": [ ["Standing", [f32;5]], ... ] } | null,
+  "breathing":    { "min_score": f32 },
+  "heartbeat":    { "min_score": f32 },
+  "restlessness": { "calm_motion": f32, "active_motion": f32 } | null,
+  "anomaly":      { "prototypes": [ [f32;5], ... ], "scale": f32 } | null
+}
+```
+`SpecialistBank::{to_json,from_json}`. A *partial* bank is valid (missing-anchor specialists are `null`).
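
The `posture` entry stores labelled 5-value prototypes, so inference reduces to a nearest-prototype lookup. The sketch below shows that idea only; `nearest_prototype` is a hypothetical helper, Euclidean distance is an assumed metric, and how `PostureSpecialist` actually turns distance into a `confidence` is not specified here.

```rust
/// Euclidean distance between two 5-value embeddings.
pub fn distance(a: &[f32; 5], b: &[f32; 5]) -> f32 {
    let sq: f32 = a.iter().zip(b.iter()).map(|(x, y)| (x - y) * (x - y)).sum();
    sq.sqrt()
}

/// Return the label of the closest prototype together with its distance,
/// or `None` when the bank holds no posture prototypes.
pub fn nearest_prototype<'a>(
    prototypes: &'a [(String, [f32; 5])],
    x: &[f32; 5],
) -> Option<(&'a str, f32)> {
    let mut best: Option<(&'a str, f32)> = None;
    for (label, proto) in prototypes {
        let d = distance(proto, x);
        // Keep the first prototype, then any strictly closer one.
        if best.map_or(true, |(_, best_d)| d < best_d) {
            best = Some((label.as_str(), d));
        }
    }
    best
}
```

Returning `None` for an empty prototype list lines up with the contract's partial-bank rule, where a specialist without its anchors is `null`.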
+
+### 3.5 Runtime output — `RoomState` JSON (per window)
+
+```jsonc
+{
+  "presence":     { "kind":"Presence", "value":0|1, "confidence":f32, "label":"present|absent" } | null,
+  "posture":      { "kind":"Posture", "value":f32, "confidence":f32, "label":"standing|sitting|lying" } | null,
+  "breathing":    { "kind":"Breathing", "value": <BPM>, "confidence":f32, "label":null } | null,
+  "heartbeat":    { "kind":"Heartbeat", "value": <BPM>, "confidence":f32, "label":null } | null,
+  "restlessness": { "kind":"Restlessness", "value": 0.0..1.0, "confidence":f32 } | null,
+  "anomaly":      { "kind":"Anomaly", "value": 0.0..1.0, "confidence":f32, "label":"normal|anomalous" } | null,
+  "vetoed": bool,   // anomaly veto fired → vitals/posture suppressed
+  "stale":  bool    // bank trained against a different baseline
+}
+```
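
The `vetoed` and `stale` flags can be read as a post-processing step over the per-specialist readings. The sketch below is one plausible reading of the comments in the payload, with loud assumptions: the types and `finalize` are hypothetical, the veto is modelled as a simple threshold on the anomaly value (the real trigger may differ), and `stale` only flags the state without suppressing anything.

```rust
#[derive(Debug, Clone, PartialEq)]
pub struct ReadingSketch {
    pub value: f32,
    pub confidence: f32,
}

#[derive(Debug, Clone, Default, PartialEq)]
pub struct RoomStateSketch {
    pub presence: Option<ReadingSketch>,
    pub posture: Option<ReadingSketch>,
    pub breathing: Option<ReadingSketch>,
    pub heartbeat: Option<ReadingSketch>,
    pub restlessness: Option<ReadingSketch>,
    pub anomaly: Option<ReadingSketch>,
    pub vetoed: bool,
    pub stale: bool,
}

/// Apply the STALE check and the anomaly veto to a raw window result.
pub fn finalize(
    mut state: RoomStateSketch,
    veto_threshold: f32, // assumed tunable; not part of the documented contract
    bank_baseline_id: &str,
    current_baseline_id: &str,
) -> RoomStateSketch {
    // A bank trained against a different baseline is flagged, not discarded.
    state.stale = bank_baseline_id != current_baseline_id;
    state.vetoed = state
        .anomaly
        .as_ref()
        .map_or(false, |a| a.value >= veto_threshold);
    if state.vetoed {
        // Suppress vitals and posture rather than emit an implausible reading.
        state.posture = None;
        state.breathing = None;
        state.heartbeat = None;
    }
    state
}
```

Consumers should therefore treat a `null` breathing or heartbeat value together with `vetoed: true` as "withheld", not as "no person breathing".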
+
+---
+
+## 4. HTTP API — `calibrate-serve` (CORS-enabled; this is what a UI/appliance drives)
+
+| Method | Path | Body / returns |
+|--------|------|----------------|
+| GET | `/api/v1/calibration/health` | `{ udp_port, frames_seen, last_frame_age_ms, streaming, default_tier, output_dir, session_active }` |
+| POST | `/api/v1/calibration/start` | `{ tier?, duration_s?, room_id?, min_frames? }` → `202` session snapshot |
+| GET | `/api/v1/calibration/status` | live `{ state, frames_recorded, target_frames, progress, z_median, eta_s, ... }` |
+| POST | `/api/v1/calibration/stop` | finalize early → result summary |
+| GET | `/api/v1/calibration/result` | last finalized baseline summary |
+| GET | `/api/v1/calibration/baselines` | list persisted `.bin` baselines |
+| GET | `/api/v1/room/state?bank=<name>` | **live RoomState** (mixture-of-specialists over the CSI window; bank resolved as a sanitized name under `output_dir`) |
+| POST | `/api/v1/room/train` | `{ room_id, baseline_id, anchors[]? }` → train + persist a specialist bank as `<output_dir>/<room_id>.json` (anchors[] optional if enrolled via `/enroll/anchor`; read back via `/room/state?bank=<room_id>`) |
+| POST | `/api/v1/enroll/anchor` | `{ room_id, baseline, label, duration_s? }` → capture one guided anchor against a baseline (blocks for the capture); returns the gate verdict + progress |
+| GET | `/api/v1/enroll/status?room=<id>` | enrollment progress (accepted anchors, next, complete) |
+
+A single background task owns the UDP socket + recorder (handlers talk to it over an mpsc channel +
+shared status snapshot), so the API stays non-blocking apart from `/enroll/anchor`, which holds its request open for the duration of the capture. **The full pipeline is now drivable over HTTP** — baseline (`start`/`stop`) → `enroll/anchor` (×8) → `room/train` → `room/state` — so the appliance UI needs no CLI. (The CLI `enroll`/`train-room`/`room-watch` remain for scripted/headless use.)
+
+---
+
+## 5. Public crate API (`wifi-densepose-calibration`)
+
+```rust
+// Stage 2 — enrollment
+anchor::{AnchorLabel, Anchor, AnchorQuality, EnrollmentEvent, EnrollmentSession, Posture}
+enrollment::{AnchorQualityGate, AnchorRecorder}
+// Stage 3 — features
+extract::{Features, AnchorFeature, autocorr_dominant}
+// Stage 4 — specialists + bank
+specialist::{Specialist, SpecialistKind, SpecialistReading,
+             PresenceSpecialist, PostureSpecialist, BreathingSpecialist,
+             HeartbeatSpecialist, RestlessnessSpecialist, AnomalySpecialist}
+bank::SpecialistBank
+// Stage 5 — runtime
+runtime::{MixtureOfSpecialists, RoomState}
+multistatic::MultiNodeMixture            // fuse co-located nodes (ADR-029)
+```
+Pure Rust; deps are `wifi-densepose-core` + `wifi-densepose-signal` (default-features off) + serde/uuid.
+**No GPU / no system BLAS** in the calibration path → builds cleanly on aarch64.
+
+---
+
+## 6. Appliance integration plan (`cognitum-one/v0-appliance`)
+
+Verified on `cognitum-v0`: aarch64, `cargo 1.96.0`, Hailo `HAILO10H`, `ruview-vitals-worker:50054`.
+
+**Step 1 — vendor / depend on the crate.** Add `wifi-densepose-calibration` (path or published crate)
+to the appliance workspace. It builds natively on aarch64 — no BLAS/GPU, **and no ONNX/OpenSSL**:
+the CLI's `mat`→`nn`→`ort`(ONNX)→`openssl-sys` chain is now feature-gated out of the calibration build.
+
+```bash
+# Pi/appliance calibration binary — cross-compiles clean (no ort/openssl):
+cargo build -p wifi-densepose-cli --no-default-features --release
+#   (omit `--no-default-features` only if you also need the MAT subcommands)
+```
+Verified: `cargo tree -p wifi-densepose-cli --no-default-features` shows **0** `ort`/`openssl-sys` deps;
+`cross test --target aarch64-unknown-linux-gnu` passes the calibration suite under qemu.
+
+**Step 2 — wire the CSI source.** Two options:
+  - (a) Tee the ESP32 UDP stream the vitals worker already receives into the calibration ingest, or
+  - (b) point ESP32 nodes (`edge_tier=0`) at the appliance's calibration UDP port directly.
+  Reuse `parse_csi_packet` (or the rvCSI `CsiFrame` schema if you normalise upstream).
+
+**Step 3 — run the calibration service.** Either embed the crate (call `CalibrationRecorder` /
+`MixtureOfSpecialists` in-process from a worker like `ruview-vitals-worker`), or run the
+`calibrate-serve` binary as a sidecar (systemd unit, bind `127.0.0.1` + reverse-proxy through the
+appliance gateway on `:9000`). Persist baselines/banks under the appliance data dir, keyed by `room_id`.
+
+**Step 4 — expose to the dashboard.** Surface the `/api/v1/calibration/*`, `/api/v1/enroll/*`, and
+`/api/v1/room/*` endpoints (§4) behind the appliance's bearer-token
+auth + the existing `Seeds`/`Edge` nav. `RoomState` (§3.5) is the live readout payload.
+
+**Step 5 — (optional) Hailo backbone tier.** Compile the ADR-150 RF Foundation Encoder + neural pose
+head to Hailo HEF, serve via `ruvector-hailo-worker:50051`; the small specialists become heads over its
+embedding. This is the ADR-150 follow-on — *not required* for the calibration service to run.
+
+**Privacy / security:** keep baselines + banks local; if federating across appliances (ADR-105),
+exchange bank/model deltas, never raw CSI. Hardening already in place:
+- **`--token <T>`** (or `CALIBRATE_TOKEN` env) requires `Authorization: Bearer <T>` on every route; the
+  server warns loudly if bound to a non-loopback address without a token.
+- **`room_id` is sanitized** to `[A-Za-z0-9_-]` (≤64 chars) before it touches the baseline write path —
+  no `../` / absolute-path traversal.
+- CORS is permissive for dev — in production bind to loopback and reverse-proxy through the appliance
+  gateway (which already enforces bearer auth).
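
The `room_id` rule above is small enough to state as code. This sketch assumes the server rejects a non-conforming id outright instead of stripping characters, and that an empty id is invalid; both are assumptions, and `sanitize_room_id` and `bank_path` are hypothetical names, not the CLI's functions.

```rust
use std::path::{Path, PathBuf};

/// Accept only 1..=64 characters from [A-Za-z0-9_-]; reject everything else.
pub fn sanitize_room_id(raw: &str) -> Option<&str> {
    let ok = !raw.is_empty()
        && raw.len() <= 64
        && raw
            .bytes()
            .all(|b| b.is_ascii_alphanumeric() || b == b'_' || b == b'-');
    if ok {
        Some(raw)
    } else {
        None
    }
}

/// Resolve `<output_dir>/<room_id>.json` only for ids that pass the check.
pub fn bank_path(output_dir: &Path, room_id: &str) -> Option<PathBuf> {
    sanitize_room_id(room_id).map(|id| output_dir.join(format!("{}.json", id)))
}
```

Because `.` and `/` are outside the allowed set, `../` sequences and absolute paths cannot reach the filesystem join at all.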
+
+---
+
+## 7. Status & validation
+
+- **Implemented:** all 5 stages + multistatic fusion; CLI + Stage-1 HTTP API (auth + path-traversal hardened). **55 tests** (35 calibration unit + 1 full-loop integration + 19 CLI), all passing under qemu-aarch64.
+
+**Precise validation matrix (don't overstate this — no clean full calibration has run on-target yet):**
+
+| Stage | Pi-5 (real nexmon→`0xC5110001`, 6,813 frames) | ESP32-S3 (COM8, `edge_tier=0`) | qemu / unit / integration |
+|---|---|---|---|
+| baseline capture + HTTP API + **auth gate** | ✅ | ✅ (120-frame) | full-loop ✅ |
+| **clean** empty-room baseline | ❌ `motion_flagged` (artifact) | ❌ (occupied) | full-loop ✅ (synthetic, zero motion flags) |
+| enroll → train-room | ❌ | ❌ (needs operator poses) | full-loop ✅ (8/8 anchors, 6 specialists, JSON round-trip) |
+| runtime infer | ❌ on-target | ◐ single-node breathing ~16–31 BPM via the **stateless** head (not a trained bank) + node-id fusion | full-loop ✅ (trained bank: 18±2 BPM positive, absent negative, foreign-baseline STALE) |
+
+The complete `baseline → enroll → train-room → infer` loop is now **proven in-process** on deterministic synthetic CSI (`wifi-densepose-calibration/tests/full_loop.rs` — drives the CLI's exact stage order through the public API, seed-robust across 5 seeds, runs with and without default features). Capture + API + auth are proven on real CSI (both boxes). What remains is strictly the **on-target** run: real CSI, a physically empty room for baseline, and an operator performing the 8 guided anchors — that hardware session is the last open item.
+
+- **Known follow-ups (appliance backlog):** `--source-format adr018v6` to drive calibration from the Pi's own nexmon (no ESP32/transcoder); the on-target clean-room enroll→train→infer session (above); phase-based (vs mean-amplitude) breathing carrier; RVF/HNSW persistence (currently JSON); ADR-150 Hailo backbone; true 2-node multistatic; ADR-105 federation.
+- **Behavioral findings from the full-loop test — all four FIXED pre-hardware-session:**
+  1. *z-band squeeze* — anchor motion is now measured from frame-to-frame deltas of the deviation series (`|Δz| > 0.5 ∨ |Δφ| > π/6`), not from the absolute `motion_flagged` (which conflated presence strength with motion); a strongly-reflecting still person (z = 3.0, every frame flagged by the old heuristic) now enrolls — regression-guarded in the full-loop test's `StandStill` anchor and `enrollment::tests`.
+  2. *Variance-only presence* — `PresenceSpecialist` gained a mean-shift channel (|mean − empty mean| vs a trained threshold); a motionless person is detected via the mean even at empty-level variance — regression-guarded in the full-loop motionless-person case; old persisted banks deserialize with the channel inert (variance-only behavior preserved).
+  3. *Ungated hz embedding* — `Features::embedding()` zeroes `breathing_hz`/`heart_hz` below `EMBED_MIN_SCORE` (0.25), keeping noise-window random frequencies out of the prototype space.
+  4. *Heart-band leakage* (found while fixing 3): a strong breathing rhythm's autocorrelation leaks into the HR band as a high-score lag-floor edge value (e.g. score 0.67 at 3.33 Hz from a pure 0.30 Hz breath); `autocorr_dominant` now requires the winning lag to be an interior local maximum, rejecting band-edge leakage while preserving true in-band peaks.
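
Two of the fixes above, the delta-based motion test and the interior-local-maximum rule, are easy to illustrate in isolation. The code is a sketch of those two ideas, not the crate's `autocorr_dominant`: the function names, the biased autocorrelation estimate, the band limits used in the note, and the lack of phase unwrapping in the motion test are all assumptions.

```rust
/// Fraction of frame-to-frame steps that count as motion:
/// |Δz| > 0.5 or |Δφ| > π/6 (no phase unwrapping in this sketch).
pub fn motion_fraction(z: &[f32], phi: &[f32]) -> f32 {
    let n = z.len().min(phi.len());
    if n < 2 {
        return 0.0;
    }
    let limit = std::f32::consts::PI / 6.0;
    let moving = (1..n)
        .filter(|&i| (z[i] - z[i - 1]).abs() > 0.5 || (phi[i] - phi[i - 1]).abs() > limit)
        .count();
    moving as f32 / (n - 1) as f32
}

/// Normalised autocorrelation of the mean-removed series for each lag in range.
fn autocorr_scores(x: &[f32], lag_min: usize, lag_max: usize) -> Vec<f32> {
    let mean = x.iter().sum::<f32>() / x.len() as f32;
    let c: Vec<f32> = x.iter().map(|v| v - mean).collect();
    let energy: f32 = c.iter().map(|v| v * v).sum();
    (lag_min..=lag_max)
        .map(|lag| {
            let dot: f32 = c[..c.len() - lag].iter().zip(&c[lag..]).map(|(a, b)| a * b).sum();
            if energy > 0.0 { dot / energy } else { 0.0 }
        })
        .collect()
}

/// Strongest periodicity in [lo_hz, hi_hz] as (hz, score). The winning lag
/// must be an interior local maximum, so a band-edge value never wins.
pub fn dominant_interior_peak(x: &[f32], fs_hz: f32, lo_hz: f32, hi_hz: f32) -> Option<(f32, f32)> {
    let lag_min = (fs_hz / hi_hz).ceil() as usize;
    let lag_max = (fs_hz / lo_hz).floor() as usize;
    if lag_min == 0 || lag_max < lag_min + 2 || lag_max >= x.len() {
        return None;
    }
    let r = autocorr_scores(x, lag_min, lag_max);
    let mut best: Option<(usize, f32)> = None;
    for i in 1..r.len() - 1 {
        let is_peak = r[i] > r[i - 1] && r[i] >= r[i + 1];
        if is_peak && best.map_or(true, |(_, s)| r[i] > s) {
            best = Some((lag_min + i, r[i]));
        }
    }
    best.map(|(lag, score)| (fs_hz / lag as f32, score))
}
```

On a pure 0.30 Hz sine sampled at 15 Hz, a 0.1 to 0.5 Hz band peaks at lag 50 (0.30 Hz, 18 BPM), while an assumed 0.8 to 3.0 Hz heart band returns `None` because the autocorrelation only falls monotonically across that lag range; a constant series with z = 3.0 has a motion fraction of zero.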
+
+**Reference:** ADR-151 (`api-docs/adr/ADR-151-room-calibration-specialist-training.md`), ADR-135 (baseline),
+ADR-029 (multistatic), ADR-150 (RF Foundation Encoder), ADR-105 (federation), ADR-147 (OccWorld/Hailo).