Merge 12e1cf9d5e into bf30844835
commit 4ce2b18009

@@ -0,0 +1,187 @@
# RuView · Implementation Checklist

Single source of truth for what's shipped and what's open. Updated
at the end of every session. Pair with
[`docs/references/espectre-gap-analysis.md`](docs/references/espectre-gap-analysis.md)
for the technical detail behind each line.

Last sweep: **2026-05-17**, branch `feat/ota-rssi-mobile`, head `0ec1e4b0`.
Status: 47 Done / 0 Open in-scope. Deferred items (out of session scope,
each with explicit reason) listed at the bottom.

This count includes the ADR-100..114 carry-in from the prior agent + this
session's ADR-115 (FW set-target REST), ADR-116 (WiFlow-v1 Rust loader),
ADR-116 cosmetic (UI dropdown), and ADR-117 (process hygiene + audit
follow-ups). ADR-111 is intentionally absent (folded into ADR-109 during
the AP-MAC tracking work).

---

## ✅ Done

### Server (`v2/crates/wifi-densepose-sensing-server`)

- [x] **ADR-100** PHY gain-lock (AGC + FFT freeze, ESPectre port) — FW
- [x] **ADR-101** Raw-amplitude classifier (CV + baseline drop, hysteresis)
- [x] **ADR-101** Per-node classification badges in WS payload
- [x] **ADR-102** NBVI subcarrier selection (formula α=0.5, top-12)
- [x] **ADR-102** NBVI Step 1 quiet-window finder
- [x] **ADR-103** Persistent baseline at `data/baseline.json` (FULL broadband)
- [x] **ADR-103** Universal threshold via baseline-CV normalization
- [x] **ADR-104** Per-subcarrier drift channel (off-axis presence)
- [x] **ADR-104** NBVI Step 3 FP-rate validation (K ∈ {6,8,10,12,16,20})
- [x] **ADR-104** Per-sub drift exposed in WS `node_features[].drift_score`
      + raw.html sparkline per node (commit eec3ca6c)
- [x] **ADR-104** Baseline staleness watch — warn when on-disk baseline
      > 4 h old AND drift consistently fires during `absent` periods
      (commit eec3ca6c)
- [x] **ADR-105** Drop all synthetic data from runtime
      (signal_field, pose_keypoints, persons, fake confidence — all gated)
- [x] **ADR-105** `n_aps_used: u8` uniform field on `enhanced_motion` +
      `enhanced_breathing` (commit 598a4b2f)
- [x] **ADR-106** Full complex CSI in WS (`amplitude` + `phases` + meta)
- [x] **ADR-106** Built-in CSI keepalive (managed `ping` per sensor)
- [x] **ADR-106** Server-side µs `timestamp_us`
- [x] **ADR-107** `POST /api/v1/baseline/calibrate` + UI button
- [x] **ADR-107** Auto-recalibrate on long-quiet periods (30 min default)
- [x] **ADR-107** `GET /api/v1/baseline` (status + cooldown)
- [x] **ADR-107** Progress bar in raw.html calibrate button
      (commit 432753e1)
- [x] **ADR-112** Multi-AP `signal_field` via `MultistaticFuser` —
      coverage × activity heatmap, non-zero only with ≥2 nodes +
      positions; preserves ADR-105 zero-grid otherwise (commit c8ac60f6)
- [x] **ADR-105** Hide pose canvas in Docker SPA when
      `model_loaded == false` + "no trained model" overlay
      (commit 2dcb30a6)
- [x] **ADR-104** Phase-domain drift channel — script + server both
      compute per-subcarrier circular mean/var; `phase_drift_score`
      surfaced on `PerNodeFeatureInfo` (commit 47dafab4)
- [x] **ADR-113** Day/night baseline profiles with hot-reload
      (`--baseline-profile {single,auto,day,night}`) (commit a1e09525)
- [x] **ADR-114** 2000-packet replay regression suite (1000 idle +
      1000 motion synthetic-but-parameter-matched, F1 ≥ 0.85
      threshold) (commit 96225e27)

### Firmware (`firmware/esp32-csi-node`)

- [x] **ADR-100** Gain-lock (300-packet median, MIN_SAFE_AGC=30 safety)
- [x] **ADR-106** Sensor µs timestamp in CSI trailer (`rx_ctrl.timestamp`)
- [x] **ADR-108** NVS persistence of gain-lock — reboot ready in ~0.5 s
- [x] **ADR-109** `POST /ota/recalibrate` — clear gain-lock NVS via REST,
      no USB needed (commit f92807cd)
- [x] **ADR-109** Track AP MAC in `gl_ap_mac` NVS — auto-invalidate
      stale gain-lock on AP swap (commit f92807cd)
- [x] **ADR-115** `POST /ota/set-target` — repoint CSI aggregator
      (`csi_cfg/target_ip` + `target_port`) without USB; recovered
      both nodes after Mac IP move TP-Link → .103

### Pose model

- [x] **ADR-116** WiFlow-v1 supervised pose loader (Rust) — `--wiflow-model
      data/models/ruview/wiflow-v1/wiflow-v1.json` flips
      `pose_estimation: true`; per-tick TCN forward yields 17 COCO
      keypoints on `/api/v1/pose/current` and WS `pose_data`. Output
      quality requires per-deployment fine-tune (LoRA adapters or
      re-train, see Pack E).
- [x] **ADR-117** Process hygiene + audit follow-ups — UDP loopback
      filter prevents `cargo test` cross-talk from spawning ping
      zombies (250→2 children); keepalive pre-reaps orphans at startup;
      `/` redirects to SPA; wiflow zero-pad replaces silent
      subcarrier-0 duplication; keypoint confidence stamped from
      runtime classifier; sensing tab container restored; multi-node
      test guards external :5005; docs/typo/range sweep.

### Tests / fixtures

- [x] **ADR-114** `tests/fixtures/replay_idle.jsonl` +
      `replay_motion.jsonl` (1000 frames each, JSONL schema:
      `{node_id, amplitude[]}`) (commit 96225e27)
- [x] **ADR-114** `scripts/generate-replay-fixtures.py` —
      seeded deterministic generator for the two fixtures
      (commit 96225e27)
- [x] (parallel agent) RSSI carry-through via feature_state header fix
- [x] (parallel agent) OTA: `OTA_SIZE_UNKNOWN`, httpd stack_size=8192,
      reset-reason log — all three FW prerequisites for working OTA

### Ops / tooling

- [x] `scripts/ota-deploy.sh` — WiFi OTA flash + auto-discovery + verify
- [x] `scripts/record-baseline.py` — headless baseline capture (CLI)
- [x] `data/baseline.json` v2 schema
- [x] `docs/references/ota-pipeline.md` — verbatim OTA recipe (port 8032)

### Documentation

- [x] **ADR-100..117** all written (ADR-111 intentionally absent), each ≤ 200 lines
- [x] `docs/references/espectre-techniques.md` — Pace technique catalogue
- [x] `docs/references/espectre-gap-analysis.md` — section-by-section gap
- [x] Documentation actualization sweep — every Open Items section
      cross-checked against actual implementation state

---

## ⏳ Open, priority-sorted

### High value, low effort

(all closed this session — see Done above. Tailscale-target item
moved to Deferred below per session brief.)

### High value, medium effort

(all closed this session — see Done above)

### Bigger, lower urgency (still active)

(all closed this session — multiple baseline profiles shipped via
ADR-113, see Done above)

### One-time hygiene

- [x] **Re-record `data/baseline.json`** — current file already carries
      `per_subcarrier_mean` so amplitude drift (ADR-104) is active.
      Verified the recorder writes the new
      `per_subcarrier_phase_mean` / `per_subcarrier_phase_var` schema
      end-to-end (this session). `data/baseline.json` is untracked,
      so no repo commit needed; operator re-records via UI when they
      step out for a true empty-room sample (currently the file
      reflects an operator-present recording — fine for the amp
      channel, needs re-record for the phase channel to populate
      ≥ 16 usable subcarriers).

### Deferred — out of session scope

Marked here so future sessions don't re-litigate; each line carries
an explicit reason. Bring them back only if scope changes.

- **HA via MQTT** — new integration. Excluded by current session brief
  (no new integrations on current hardware).
- **ESPHome native component** — same reason as HA/MQTT.
- **Web Serial calibration game** — explicitly excluded.
- **Boot-time NBVI freeze in FW** — explicitly excluded.
- **Per-channel NVS cache for gain-lock** — explicitly excluded; only
  matters if channel hopping is reactivated, which is also excluded.
- **DensePose model train + load** — explicitly excluded.
- **AETHER contrastive pretrain on live data** — explicitly excluded.
- **MERIDIAN domain generalization** — explicitly excluded.
- **Channel hopping (ADR-029)** — explicitly excluded.
- **Multi-antenna support (`n_antennas` > 1)** — explicitly excluded.
- **README.md trim (542 lines)** — explicitly excluded.
- **CLAUDE.md trim (407 lines)** — explicitly excluded.
- **Tailscale-target in NVS** — Mac stable on TP-Link this session,
  low ROI. Not blocking. (ADR-100 follow-up; bring back if Mac
  network swap becomes routine.)

---

## Reference

| Doc | Purpose |
|---|---|
| [`docs/adr/`](docs/adr) | All ADRs 001-117 (111 absent); 100-117 are this session |
| [`docs/references/espectre-techniques.md`](docs/references/espectre-techniques.md) | Pace technique catalogue + RuView adoption |
| [`docs/references/espectre-gap-analysis.md`](docs/references/espectre-gap-analysis.md) | Section-by-section gap with priority table |
| [`docs/references/ota-pipeline.md`](docs/references/ota-pipeline.md) | OTA recipe — port 8032, three FW prereqs |

To mark an item done: tick the box, add `(ADR-XXX, commit-sha)` after
the line, move it from the priority section to the top "Done" section.

@@ -9,7 +9,7 @@ services:
     ports:
       - "3000:3000" # REST API
       - "3001:3001" # WebSocket
-      - "5005:5005/udp" # ESP32 UDP
+      - "5006:5005/udp" # ESP32 UDP (host 5006 -> container 5005; sensors point to .21:5006)
     environment:
       - RUST_LOG=info
       # CSI_SOURCE controls the data source for the sensing server.

@@ -0,0 +1,246 @@
# ADR-098 — ESP32-S3 CSI Node Deployment Fixes (room01/room02)

**Status**: Accepted
**Date**: 2026-05-14
**Scope**: `firmware/esp32-csi-node/`, `v2/crates/wifi-densepose-sensing-server/`,
`v2/crates/wifi-densepose-desktop/`, `ui/mobile/`

## Context

Two ESP32-S3 CSI nodes (room01 `1c:db:d4:49:eb:88`, room02 `e8:f6:0a:83:89:44`)
were deployed against the RuView stack on a 2.4 GHz domestic LAN. The
out-of-the-box firmware booted but did not produce usable presence/motion
signal: `motion_score` saturated at `1.0`, `presence_score` froze near a
non-zero constant regardless of activity, vital signs never populated,
and OTA updates rolled back on every attempt.

Root-causing the chain took multiple rebuild/flash cycles. This ADR
records the final patches that made the stack functional end-to-end on
the deployed hardware and the empirical evidence that drove each change.

## Decisions

### D1 — Disable promiscuous mode in `csi_collector`

`esp_wifi_set_promiscuous(true)` silenced the CSI RX callback entirely
on this silicon revision (`yield=0pps` in `adaptive_ctrl` medium tick
log). Removing the call lets the WiFi driver invoke `wifi_csi_callback`
again at the connected-AP rate (~5-10 pps for beacon-driven traffic).

**Patch**: `csi_collector.c` — replace `esp_wifi_set_promiscuous(true);`
with a one-line `ESP_LOGI` documenting the empirical incompatibility.
Do **not** re-enable.

### D2 — Truncate `n_subcarriers` to `EDGE_MAX_SUBCARRIERS` instead of early-return

CSI frames on this hardware arrive at 384 bytes = 192 subcarriers. The
DSP pipeline declared `EDGE_MAX_SUBCARRIERS = 128`, so every incoming
frame failed the `n_subcarriers > EDGE_MAX_SUBCARRIERS` check and
returned before `process_frame` reached Step 8 (motion energy). This
was the underlying reason DSP outputs appeared frozen: the pipeline
literally was not running.

**Patch**: `edge_processing.c` — on oversized frames, clamp
`n_subcarriers = EDGE_MAX_SUBCARRIERS` and log a one-shot warning,
instead of returning. The first 128 subcarriers cover the full 20 MHz
HT20 channel; the trailing bins are HT40 sideband and not relied on.

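The clamp described in D2 can be sketched as follows. This is a minimal illustration, not the actual `edge_processing.c` code: the helper name `clamp_subcarriers` and the `was_truncated` flag are hypothetical, and the sketch assumes two bytes per subcarrier (one I/Q pair), which matches the 384 bytes = 192 subcarriers figure above.

```c
#include <stdbool.h>
#include <stdint.h>

#define EDGE_MAX_SUBCARRIERS 128  /* value stated in this ADR */

/* Hypothetical helper: derive the subcarrier count from the CSI payload
 * length (assumed 2 bytes per subcarrier) and clamp it instead of
 * rejecting the frame. *was_truncated reports whether clamping happened. */
uint16_t clamp_subcarriers(uint16_t csi_len_bytes, bool *was_truncated)
{
    uint16_t n_subcarriers = csi_len_bytes / 2;

    *was_truncated = n_subcarriers > EDGE_MAX_SUBCARRIERS;
    if (*was_truncated) {
        n_subcarriers = EDGE_MAX_SUBCARRIERS;
    }
    return n_subcarriers;
}
```

A caller could use `was_truncated` to emit the one-shot warning the first time it is set, then continue into the DSP steps with the clamped count.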
### D3 — Broadband motion source

After D2 the original Step 8 (variance of unwrapped phase of a single
"primary" subcarrier) still failed:

* unwrapped phase drifts monotonically (thermal, oscillator) so its
  variance over a 20-frame window equals `(slope·W/2)²/3`, a non-zero
  constant unrelated to activity;
* the "primary" winner index jumps frame-to-frame (e.g. 22 → 103 →
  105), so per-bin amplitude variance is dominated by index churn,
  not motion.

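The constant in the first bullet follows from the variance of a linear ramp. A short derivation, assuming an idealised drift of $s$ radians per frame over a window of $W$ frames (both symbols are introduced here for illustration only):

$$
\phi_n = \phi_0 + s\,n,\quad n = 0,\dots,W-1
\;\Longrightarrow\;
\operatorname{Var}(\phi) = s^2\,\frac{W^2-1}{12} \approx \frac{s^2 W^2}{12} = \frac{(sW/2)^2}{3}.
$$

The result depends only on the drift slope and the window length, so it stays non-zero in an empty room.
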
We replace the source with **broadband mean amplitude variance**:
on every frame compute `mean(sqrt(I²+Q²))` across **all** subcarriers,
push that scalar into a 20-sample ring, and use its temporal variance
as `motion_energy`. This is the well-known CSI motion proxy:
human motion smears multipath and inflates frequency-domain spread
coherently across the whole channel.

Empirical separation measured on the deployed hardware:

| Window | broadband variance (median) |
|---|---|
| Empty room (3 m) | 0.07 – 0.10 (occasional 1.6 spike) |
| Walking past 2-3 m | 3.5 – 14 |

Ratio ≈ 44×. Divisor `var / 3.0f` with `clamp(0, 1.0)` puts empty
under 0.05 and walking near saturation.

**Patch**: `edge_processing.c`
* New buffer `s_broad_mean_amp_history[20]`.
* Per-frame `band_amp_mean = mean(sqrt(I²+Q²))` over all subcarriers.
* Step 8 replaced: `s_motion_energy = clamp(var / 3.0f, 0, 1)`.

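The ring-and-variance step of D3 might look roughly like the sketch below. It is not the firmware code: `motion_ring_t` and `motion_energy_update` are hypothetical names, the per-frame `band_amp_mean` is taken as an input rather than computed from I/Q, and the use of population variance (divide by N) is an assumption, since the ADR does not say which estimator is used.

```c
#include <stddef.h>

#define MOTION_WINDOW  20     /* ring length stated in this ADR */
#define MOTION_DIVISOR 3.0f   /* calibration divisor stated in this ADR */

/* Hypothetical ring buffer of per-frame broadband mean amplitudes. */
typedef struct {
    float  samples[MOTION_WINDOW];
    size_t count;  /* number of valid samples, saturates at MOTION_WINDOW */
    size_t head;   /* next write position */
} motion_ring_t;

/* Push one per-frame mean amplitude and return motion energy in [0, 1]. */
float motion_energy_update(motion_ring_t *ring, float band_amp_mean)
{
    ring->samples[ring->head] = band_amp_mean;
    ring->head = (ring->head + 1) % MOTION_WINDOW;
    if (ring->count < MOTION_WINDOW) {
        ring->count++;
    }

    float mean = 0.0f;
    for (size_t i = 0; i < ring->count; i++) {
        mean += ring->samples[i];
    }
    mean /= (float)ring->count;

    /* Population variance over the samples currently in the ring. */
    float var = 0.0f;
    for (size_t i = 0; i < ring->count; i++) {
        float d = ring->samples[i] - mean;
        var += d * d;
    }
    var /= (float)ring->count;

    /* Scale by the divisor and clamp to [0, 1]. */
    float energy = var / MOTION_DIVISOR;
    if (energy < 0.0f) energy = 0.0f;
    if (energy > 1.0f) energy = 1.0f;
    return energy;
}
```

With the measured figures above, a variance of 0.10 maps to about 0.033 (under the 0.05 empty-room target) and a variance of 3.5 already clamps to 1.0.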
### D4 — Biquad sample rate consistency

`biquad_bandpass_design(..., fs=20.0f, ...)` (filter design) did not
match `estimate_bpm_zero_crossing(..., sample_rate=10.0f, ...)` (BPM
detector). At a real callback rate of ~10 Hz the breathing passband
designed for 20 Hz becomes 0.05–0.25 Hz on the wire, which cuts off
the upper part of the 0.2–0.3 Hz human breathing band (12–18 BPM).

**Patch**: `edge_processing.c:1063` — `fs = 10.0f` for both
breathing and heart-rate filters. With D2+D3 active, `breathing_rate_bpm`
reports 21–22 BPM for a stationary person within ~30 s.

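The mismatch in D4 is a plain frequency-scaling effect: a digital filter's corner frequencies are fixed relative to its sample rate, so feeding it samples at a different rate moves them proportionally. The sketch below illustrates the arithmetic only. The function names are hypothetical, and the 0.5 Hz design corner used in the note is inferred from the 0.05–0.25 Hz figure above rather than taken from the source code.

```c
/* Corner frequency actually realised when a filter designed for
 * design_fs is fed samples that arrive at actual_fs. */
float effective_corner_hz(float designed_hz, float design_fs, float actual_fs)
{
    return designed_hz * actual_fs / design_fs;
}

/* Convert a frequency in Hz to breaths (or beats) per minute. */
float hz_to_bpm(float hz)
{
    return hz * 60.0f;
}
```

Under those assumptions, an upper corner designed at 0.5 Hz for `fs = 20.0f` lands at 0.25 Hz (15 BPM) when frames arrive at 10 Hz, so faster breathing is attenuated; with matching rates the corner stays where it was designed.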
### D5 — OTA: full-partition erase + larger HTTP task stack

Two independent OTA bugs:

1. `esp_ota_begin(..., OTA_WITH_SEQUENTIAL_WRITES, ...)` skipped the
   trailing-page erase, leaving stale code from a previous (larger)
   image in the tail of the target partition. The new image header
   passed SHA validation but residual instructions still resided at
   addresses reachable via IRAM jump tables.
2. The HTTP server worker that runs the OTA verify step overflowed
   its default 4 KB stack (`esp_ota_get_app_partition_description` does
   substantial work). The new image *was* booted from `ota_1`, then
   panicked in early init from stack overflow, and the bootloader
   fell back to `ota_0` — looking exactly like a rollback even though
   `CONFIG_BOOTLOADER_APP_ROLLBACK_ENABLE` is disabled.

**Patches**: `ota_update.c`
* `esp_ota_begin(update_partition, OTA_SIZE_UNKNOWN, &handle)` —
  full-partition erase before write.
* `httpd_config_t config = HTTPD_DEFAULT_CONFIG(); config.stack_size = 8192;` —
  doubled stack so OTA validation has room.

Plus `main.c:130-153` — `esp_reset_reason()` and running-partition label
logged once at app start, so any future boot anomaly is visible without
guesswork.

### D6 — sensing-server: parse RuView feature_state, refuse simulation

Out of the box, `sensing-server` (`v2/crates/wifi-densepose-sensing-server`)
parsed only `0xC5110001` (raw CSI) and `0xC5110002` (vitals). RuView FW
emits `0xC5110006` (ADR-081 feature_state) as its default upstream
payload — a gap in the project.

**Patches**: `src/main.rs`
* New `parse_rv_feature_state(buf)` decoding the 60-byte
  `rv_feature_state_t` into the existing `Esp32VitalsPacket` shape;
  wired ahead of the existing `parse_esp32_vitals` call.
* Per-node `BaselineTracker` (file-scope `OnceLock<Mutex<HashMap<u8,_>>>`)
  applies hysteretic motion gating on top of the FW-reported scores so
  the UI receives clean boolean presence transitions even when the FW
  scalar is noisy.
* `--source simulate` and the auto-fallback to simulation removed;
  `simulate`/`simulated` now exit non-zero with an `ERROR` log.

A `parse_csi_lean` parser was also added for compatibility with the
legacy FW 5.47 (`esp32s3_csi_capture`) CSV format. Dead code under
current FW; kept as defence-in-depth so a mistakenly flashed legacy
sensor still produces useful data.

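The hysteretic gating that `BaselineTracker` applies can be illustrated with a two-threshold gate. The sketch below is written in C for consistency with the firmware snippets in these ADRs (the server itself is Rust), and the type name, function name and threshold values are all hypothetical. Presence turns on only above the upper threshold and off only below the lower one, so a noisy score hovering between the two does not flicker.

```c
#include <stdbool.h>

/* Hypothetical two-threshold gate; not the server's BaselineTracker. */
typedef struct {
    float on_threshold;   /* score must exceed this to assert presence */
    float off_threshold;  /* score must fall below this to clear it */
    bool  present;        /* current gated state */
} hysteresis_gate_t;

/* Feed one score and return the gated boolean presence. */
bool hysteresis_update(hysteresis_gate_t *gate, float score)
{
    if (!gate->present && score > gate->on_threshold) {
        gate->present = true;
    } else if (gate->present && score < gate->off_threshold) {
        gate->present = false;
    }
    return gate->present;
}
```

A per-node map, as in the patch above, would simply hold one such gate per `node_id`.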
### D7 — Desktop UI: HTTP-sweep discovery

mDNS (`_ruview._udp.local.`) and UDP-broadcast beacon discovery (the
two paths the desktop ships) are not advertised by current RuView FW.
We added a third concurrent path: `GET http://<probe-ip>:8032/status` over
the local /24 subnet, parsing the JSON returned by RuView's
`ota_status_handler`.

**Patches**: `v2/crates/wifi-densepose-desktop/src/commands/discovery.rs`
* `discover_via_http_sweep(timeout)` running alongside mDNS + UDP.
* `futures::future::join_all(tasks)` with overall `tokio::time::timeout`
  replaces the previous sequential `for task in tasks` loop, which
  blocked on slow-to-time-out unrelated IPs and missed the responding
  sensors.
* Result-keeping in `useNodes`/`Dashboard` — keep last good list when
  a poll round returns 0 nodes.

### D8 — Mobile UI: WS path + Tailscale default + no simulation fallback

* `WS_PATH = '/ws/sensing'` and a hard-coded `WS_PORT = 8765` so the
  mobile app's `ws.service` connects to the RuView WS endpoint instead
  of the legacy `/api/v1/stream/pose` FastAPI path.
* `settingsStore.serverUrl` defaults to `http://100.123.189.10:8080`,
  the deployed Mac's Tailscale IP, so the phone reaches the server
  without LAN dependency.
* All `simulated` fallbacks removed from `ws.service.ts` and
  `matStore.ts` — UI shows `disconnected` rather than synthetic data
  when the server is unreachable.

### D9 — Reset-reason logging in `app_main`

A two-line `ESP_LOGI` at the start of `app_main` records
`esp_reset_reason()` and `esp_ota_get_running_partition()->label`.
Worth its weight every time we touched OTA — it eliminated guesswork
when an image silently fell back.

## Verification

Acceptance ran on both deployed nodes with the operator stationary,
then walking 2-3 m past each sensor, then leaving the room.

| Criterion | Target | room01 | room02 |
|---|---|---|---|
| `motion_energy` empty room | < 0.05 | 0.018 | 0.070 |
| `motion_energy` walking | > 0.3 within 2 s | < 1 s | 3 s |
| `motion_energy` decay after exit | < 0.1 within 5 s | 0.02–0.03 | 0.02–0.03 |
| `breathing_rate_bpm` stationary 30 s | 12-20 BPM | 22.2 BPM | 21.0 BPM |
| OTA round-trip | 2 consecutive succeed | ✅ | ✅ |
| Reset-reason visible | one-line log at boot | ✅ | ✅ |

OTA #1 transitioned `running_partition: ota_0 → ota_1`; OTA #2 reversed
it back to `ota_0`. No panics. `Connection reset` on the curl side is
expected — `esp_restart()` tears down the TCP connection after
`httpd_resp_send` returns.

## Files Touched

```
firmware/esp32-csi-node/main/csi_collector.c
firmware/esp32-csi-node/main/edge_processing.c
firmware/esp32-csi-node/main/main.c
firmware/esp32-csi-node/main/ota_update.c
firmware/esp32-csi-node/sdkconfig.defaults

v2/crates/wifi-densepose-sensing-server/src/main.rs
v2/crates/wifi-densepose-sensing-server/src/csi.rs

v2/crates/wifi-densepose-desktop/src/commands/discovery.rs
v2/crates/wifi-densepose-desktop/src/commands/server.rs
v2/crates/wifi-densepose-desktop/ui/src/hooks/useNodes.ts
v2/crates/wifi-densepose-desktop/ui/src/hooks/useServer.ts
v2/crates/wifi-densepose-desktop/ui/src/pages/Dashboard.tsx
v2/crates/wifi-densepose-desktop/ui/src/pages/Sensing.tsx
v2/crates/wifi-densepose-desktop/ui/src/types.ts

ui/mobile/src/constants/websocket.ts
ui/mobile/src/services/ws.service.ts
ui/mobile/src/stores/matStore.ts
ui/mobile/src/stores/settingsStore.ts
ui/mobile/src/screens/MATScreen/index.tsx
ui/mobile/src/screens/VitalsScreen/index.tsx

docker/docker-compose.yml # host port 5005 → 5006 (RuView FW target)
```

## Open Items

* `EDGE_MAX_SUBCARRIERS` is still `128` — D2 truncates incoming frames
  rather than enlarging the buffer. Increasing to 192 would let the
  pipeline use the full 192-subcarrier HT40 sideband, but requires
  re-sizing several stack/heap structures and re-tuning DSP windows.
  Tracked for a future release.
* Empty-room `motion_energy` on room02 sits slightly above the 0.05
  target (0.07). Either the Fresnel-zone alignment for that node is
  noisier or the calibration constant `var / 3.0f` needs to be
  hardware-rev specific. Acceptable for the current deployment;
  candidate for an auto-calibration routine.

## References

* ADR-039 — Edge intelligence pipeline (the file we patched).
* ADR-081 — `rv_feature_state_t` packet format (`0xC5110006`).
* RuView issue #555 — *DSP froze on unwrapped phase variance* (this ADR).
* RuView issue #556 — *OTA never sticks* (this ADR).

@@ -0,0 +1,154 @@
# ADR-100 — PHY Gain Lock for Baseline-Stable CSI

**Status**: Accepted
**Date**: 2026-05-17
**Scope**: `firmware/esp32-csi-node/main/csi_collector.c`,
`v2/crates/wifi-densepose-sensing-server/static/raw.html`.

## Context

After ADR-110 deployed the TP-Link WISP AP and the operator captured three
controlled one-minute windows (empty / sit / walk), the RSSI MAD-Δ
classifier failed to separate the three states — measured `d` values
overlapped within ±0.03 of 0.49 while in-state spread was ±0.10. We
inspected the live amplitude spectrum on the new `raw.html` console and
saw a slow ±20-30 % broadband drift in the sensor amplitude even with
the room provably empty. The drift was indistinguishable from body
modulation at multi-meter range and dominated every downstream feature.

Francesco Pace's [ESPectre](https://github.com/francescopace/espectre)
project (GPLv3) traced the same artefact to the ESP32 PHY's automatic
gain control: AGC continuously rebalances the receiver gain per packet
so received frames stay in the optimal decoding range. For CSI sensing
this is a disaster — the same channel state arrives with a different
amplitude every packet because the gain stage shifts under it. Pace
identified two undocumented PHY routines in the IDF blob that freeze
AGC and FFT scaling, plus a calibration recipe (median of the first
300 packets) that is robust to brief startup activity.

## Decisions

### D1 — Port the ESPectre gain-lock to RuView FW

Added a self-contained block to `csi_collector.c`:

* **Overlay struct** `rv_phy_rx_ctrl_t` aliased over `wifi_csi_info_t.rx_ctrl`
  to read the hidden `agc_gain` (u8) and `fft_gain` (signed i8) fields.
* **Extern declarations** for the two PHY routines:
  ```c
  extern void phy_fft_scale_force(bool force_en, int8_t force_value);
  extern void phy_force_rx_gain(int force_en, int force_value);
  ```
* **Two-phase calibration** (`rv_gain_lock_process`):
  - Phase 1 (≤ 300 packets, ~6 s at the rate-gated 50 Hz callback):
    accumulate AGC and FFT samples into static arrays.
  - At the 300th packet: `qsort` both arrays, take the median, and
    call the two PHY routines to freeze gain.
* **Safety branch**: if median AGC < 30, skip the lock and log a
  warning. Forcing a low gain on a strong-signal deployment causes the
  RX path to freeze (empirically documented in ESPectre's
  `gain_controller.h`).
* **Supported targets**: ESP32-S3, ESP32-C3, ESP32-C6 only — older
  parts compile to a no-op stub. RuView ships on S3 so this is the only
  path we care about.

The hook is wired immediately after the existing rate-gate and MAC
filter in the CSI callback so calibration completes within the first
~6 s after the WiFi association, regardless of host traffic. After
that it short-circuits.

Tagged as ADR-100 in the source comment for traceability.

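The median-and-safety step of the calibration can be sketched off-target as below. This is an illustration under stated assumptions, not the `rv_gain_lock_process` implementation: `gain_lock_decision_t` and `gain_lock_decide` are hypothetical names, the sketch takes the element at index `n / 2` as the median, and it deliberately stops short of calling the PHY routines, which only exist in the IDF blob.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

#define MIN_SAFE_AGC 30  /* safety threshold stated in this ADR */

/* Hypothetical result of the calibration step. */
typedef struct {
    bool    apply;  /* false when the median AGC is below MIN_SAFE_AGC */
    uint8_t agc;    /* median AGC gain */
    int8_t  fft;    /* median FFT scale */
} gain_lock_decision_t;

static int cmp_u8(const void *a, const void *b)
{
    return (int)*(const uint8_t *)a - (int)*(const uint8_t *)b;
}

static int cmp_i8(const void *a, const void *b)
{
    return (int)*(const int8_t *)a - (int)*(const int8_t *)b;
}

/* Sort both sample arrays in place and pick their medians.
 * n must be greater than zero (300 in the ADR). */
gain_lock_decision_t gain_lock_decide(uint8_t *agc, int8_t *fft, size_t n)
{
    qsort(agc, n, sizeof agc[0], cmp_u8);
    qsort(fft, n, sizeof fft[0], cmp_i8);

    gain_lock_decision_t d;
    d.agc = agc[n / 2];
    d.fft = fft[n / 2];
    d.apply = d.agc >= MIN_SAFE_AGC;  /* skip the lock on a low median */
    return d;
}
```

On target, a caller would presumably pass `d.agc` and `d.fft` to `phy_force_rx_gain` and `phy_fft_scale_force` only when `d.apply` is true, and log a warning otherwise.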
### D2 — Use the existing `raw.html` console (ADR-110, D2 reuse) as the verification UI

The console added in ADR-110 already streams `nodes[].amplitude` from
the existing WebSocket. No server-side change was needed. The HTML
displays a per-node bar histogram of all 56 active subcarriers plus
broadband mean amplitude and RSSI traces over the last 30 s. This is
the surface where the operator can watch — without any DSP, without any
classification — whether the gain-lock has actually flattened the
baseline.

### D3 — Geometry matters as much as gain-lock

A controlled three-state capture made on 2026-05-17 with both sensors
positioned so that the line `TP-Link AP → sensor` passes through the
operator (lying on the bed) confirmed both decisions. The summary
table appears under *Verified Acceptance* below. Earlier captures
(ADR-110) failed to separate states partly because the sensors were
placed off-axis from the AP-to-body line; with that geometry the body
never physically obstructs the CSI channel.

## Calibration values observed (real captures, this deployment)

| Node | Boot rate (low traffic) | Boot rate (ping flood) | AGC median | FFT scale median | Lock decision |
|---|---|---|---|---|---|
| room01 (192.168.0.101) | 0.3 fps | 30+ fps | **42–44** | −31 / −33 | **APPLIED** |
| room02 (192.168.0.100) | 0.3 fps | 30+ fps | **44** | −40 / −42 | **APPLIED** |

Both AGC medians are comfortably above the 30 safety threshold. The
calibration completes in ~6 s when there is any host traffic (a single
ping to the sensor at 10 pps is enough); on a totally idle channel
beacons drive the rate down to 0.3 fps and calibration would take ~17
minutes — practically we always have some traffic.

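The latency figures quoted for calibration are the 300-packet window divided by the CSI callback rate. A minimal sketch of that arithmetic, with a hypothetical function name:

```c
/* Seconds needed to collect `packets` CSI callbacks at `rate_hz`.
 * rate_hz must be greater than zero. */
float calibration_seconds(unsigned packets, float rate_hz)
{
    return (float)packets / rate_hz;
}
```

At the rate-gated 50 Hz callback this gives 6 s; at the idle-channel 0.3 fps it gives about 1000 s, which is the roughly 17 minutes mentioned above.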
## Verified Acceptance — three-state separation

Geometry: TP-Link AP on the wall, both sensors at table-level on the
opposite side of the room, operator lying on the bed between AP and
sensors. 30 seconds per state, gain-lock active on both nodes,
`raw.html` open during capture, `target_ip` provisioned to the Mac's
TP-Link-side IP (192.168.0.103) so no upstream NAT is in the path.

| State | node 1 mean A | node 1 CV | node 1 sub-CV <5 % | node 2 mean A | node 2 CV | node 2 sub-CV <7 % |
|---|---|---|---|---|---|---|
| **EMPTY** (operator out) | **37.28** | **2.71 %** | **44/44** | 9.52 | 5.22 % | 26/44 |
| **STILL** (operator lying still on bed) | 22.43 | 3.70 % | 30/44 | 9.67 | 5.02 % | 24/44 |
| **WALK** (operator pacing the room) | 31.77 | **12.50 %** | 0/44 | 7.15 | **29.72 %** | 0/44 |

Observations:

* **Node 1 separates all three states** by mean amplitude alone: 37 →
  22 → 32. The body lying still blocks the direct path
  (40 % amplitude drop), then motion adds reflections back. The CV
  ladder 2.71 → 3.70 → 12.50 % is a second independent feature.
* **Node 2 separates STILL+EMPTY from WALK** by CV (5 → 30 %). Its
  geometry doesn't pick up a still body, only motion.
* **Compare to ADR-110** where empty/sit/walk differed by ±0.02 inside
  ±0.10 noise — we now have inter-state separation ratios of **×3.4 on
  node 1 and ×5.9 on node 2**. The signal is no longer dominated by
  baseline drift.

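The headline figures in the observations can be reproduced from the table. The separation ratios appear to be the WALK coefficient of variation divided by the STILL one on each node, and the amplitude drop is taken on node 1 between EMPTY and STILL:

$$
\frac{12.50}{3.70} \approx 3.4, \qquad
\frac{29.72}{5.02} \approx 5.9, \qquad
1 - \frac{22.43}{37.28} \approx 0.40 .
$$
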
## Files Touched

```
firmware/esp32-csi-node/main/csi_collector.c # gain-lock module + hook
v2/crates/wifi-densepose-sensing-server/static/raw.html # already from ADR-110
docs/adr/ADR-100-gain-lock-baseline-stabilization.md # this ADR
```

## Open Items

* ✅ **NBVI subcarrier selection** — closed in ADR-102 (server-side
  port with quiet-window finder).
* ✅ **Server-side RSSI parsing** — fixed by parallel agent in commit
  `3393c1e8` (parse_esp32_frame offset realignment + carrying RSSI
  through feature_state packets).
* ✅ **Calibration latency on an idle channel** — closed in ADR-106
  by the built-in managed-`ping` keepalive (drives sensor RX at
  25 pkt/s/node out of the box).
* ⏳ **NVS target_ip is hardcoded** — still open. Tailscale-target
  option not implemented; sensors still send to the Mac's
  TP-Link-side IP (192.168.0.103). Mac roaming still breaks the CSI stream.

## References

* ADR-039 — Edge intelligence pipeline (host DSP path).
* ADR-098 — Earlier ESP32-S3 deployment fixes.
* ADR-110 — TP-Link WISP deployment + first RSSI-Δ attempt (this ADR
  supersedes the threshold table in ADR-110, D3 — the RSSI MAD-Δ
  detector is left in place but no longer the primary signal).
* Francesco Pace, *How I Turned My Wi-Fi Into a Motion Sensor — Part 2*,
  Dec 2025 — source of the gain-lock recipe.
* `francescopace/espectre`, `components/espectre/gain_controller.{h,cpp}`
  on GitHub — reference implementation (GPLv3).

@@ -0,0 +1,147 @@
# ADR-101 — Raw-Amplitude Presence/Motion Classifier

**Status**: Accepted
**Date**: 2026-05-17
**Scope**: `v2/crates/wifi-densepose-sensing-server/src/main.rs`
(`amp_presence_override`, `amp_classify_from_latest`,
`amp_node_level`, `amp_node_snapshot`).

## Context

After ADR-100 the AGC drift is gone and the broadband baseline is clean.
Before this ADR the live `classification.motion_level` was being driven
by the legacy DSP (variance + motion_band_power thresholds) plus an
RSSI MAD-Δ override from ADR-110. Both failed on the operator's
deployment: variance overlaps empty/sit/walk within noise, and RSSI
MAD-Δ overlaps within ±0.03 of 0.49 across all three states. The
operator could lie still in the path between AP and sensor and the
detector would silently report `absent`.

The three 30 s controlled captures made on 2026-05-17 (lying between
TP-Link AP and sensor 1, see ADR-100 *Verified Acceptance*) showed
that **the broadband CV of mean amplitude separates the three states
by 3-6× on this geometry**. EMPTY = 2.7-5 %, STILL = 3.7-5 %,
WALK = 12.5-29.7 %. EMPTY vs STILL are best separated by the
**mean-amplitude drop** (37 → 22 on the active sensor, -40 %).

This ADR replaces the RSSI MAD-Δ classifier with a pure-amplitude one
that uses both signals: CV for motion, baseline drop for static body.

## Decisions

### D1 — `amp_presence_override` per-node classifier
|
||||
|
||||
For each frame received on the raw-CSI path:
|
||||
|
||||
1. Push current full amplitude vector into the NBVI ranking buffer
|
||||
(`nbvi_history`, capacity 600 frames ≈ 30 s).
|
||||
2. Periodically (every `NBVI_REFRESH_TICKS=200` calls, ~5 s) rank
|
||||
subcarriers by NBVI (see ADR-102) and pick the top-12.
|
||||
3. Compute **broadband_mean** as the average of NBVI-selected
|
||||
subcarriers. Falls back to all non-zero subcarriers during warmup.
|
||||
4. Push to two rolling windows:
|
||||
- `short` (90 samples ≈ 4.5 s) — for CV.
|
||||
- `long` (1200 samples ≈ 60 s) — for the rolling-fallback 95 %ile
|
||||
baseline.
|
||||
5. Compute `cv = std(short) / mean(short)`.
|
||||
6. Compute `baseline` — see ADR-103 for the persistent-override path.
|
||||
7. Stash `(cv, mean_short, baseline)` per node in `AMP_LATEST` for
|
||||
cross-node fusion.
|
||||
8. Run `amp_classify_from_latest` (D2 below) to produce the global
|
||||
`(level, presence, confidence)`.
|
||||
|
||||
Returns `None` until the short window is full so the very first
|
||||
seconds after boot don't emit garbage.
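Steps 4 and 5 and the warmup rule can be sketched as a small rolling window. This is a minimal illustration, not the server's code: `RollingWindow` is a hypothetical name, and the use of the population standard deviation is an assumption.

```rust
use std::collections::VecDeque;

/// Fixed-capacity rolling window of broadband means.
/// Hypothetical helper for illustration; not the server's `AmpState`.
pub struct RollingWindow {
    cap: usize,
    buf: VecDeque<f64>,
}

impl RollingWindow {
    pub fn new(cap: usize) -> Self {
        Self { cap, buf: VecDeque::with_capacity(cap) }
    }

    /// Push one sample, evicting the oldest once the window is full.
    pub fn push(&mut self, x: f64) {
        if self.buf.len() == self.cap {
            self.buf.pop_front();
        }
        self.buf.push_back(x);
    }

    /// Coefficient of variation std/mean (population std, an assumption).
    /// Returns `None` until the window is full or when the mean is not positive.
    pub fn cv(&self) -> Option<f64> {
        if self.buf.len() < self.cap {
            return None;
        }
        let n = self.buf.len() as f64;
        let mean = self.buf.iter().sum::<f64>() / n;
        if mean <= 0.0 {
            return None;
        }
        let var = self.buf.iter().map(|x| (x - mean).powi(2)).sum::<f64>() / n;
        Some(var.sqrt() / mean)
    }
}
```

With `RollingWindow::new(90)` this corresponds to the `short` window; returning `None` during warmup lets the caller skip classification instead of emitting a label computed from a partial window.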
|
||||
|
||||
### D2 — Cross-node fusion in `amp_classify_from_latest`
|
||||
|
||||
The deployment has two sensors with very different SNR (node 1 mean
|
||||
amplitude ~22, node 2 mean ~9 on the operator's TP-Link). A single
|
||||
bursty node should not flip the whole detector. We use:
|
||||
|
||||
* **MAX CV** across nodes for the motion gate. Any node seeing
|
||||
movement is enough — body modulates only the line-of-sight path
|
||||
it crosses, so the other node may stay clean.
|
||||
* **ANY baseline drop** → `present_still`. One well-placed node
|
||||
seeing the body is enough.
|
||||
|
||||
Decision (universal-threshold normalized — see ADR-103 D3):
|
||||
|
||||
```
|
||||
norm_max_cv = max_cv / baseline_cv (when calibration loaded)
|
||||
gates: fallback when no calibration:
|
||||
norm ≥ 6.0 → "active" max_cv ≥ 0.22
|
||||
norm ≥ 3.0 → "present_moving" max_cv ≥ 0.10
|
||||
any drop → "present_still" (same)
|
||||
otherwise → "absent" (same)
|
||||
```
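The gate table above can be sketched as a single fusion function. This is an illustration under stated assumptions: `NodeStat` and `fuse` are hypothetical names, a single scalar `baseline_cv` is passed in (the server keeps a per-node map), and the drop ratio is a parameter because ADR-101 does not fix it (ADR-104 D2 uses 0.75).

```rust
/// Per-node snapshot mirroring the `(cv, mean, baseline)` tuple in the text.
#[derive(Clone, Copy)]
pub struct NodeStat {
    pub cv: f64,       // short-window CV as a fraction (0.10 = 10 %)
    pub mean: f64,     // short-window mean amplitude
    pub baseline: f64, // empty-room baseline amplitude
}

/// Fuse per-node stats into one global label.
/// `baseline_cv`: empty-room CV from calibration, `None` when none is loaded.
/// `drop_ratio`: mean/baseline below this value counts as a baseline drop.
pub fn fuse(nodes: &[NodeStat], baseline_cv: Option<f64>, drop_ratio: f64) -> &'static str {
    // MAX CV across nodes drives the motion gates.
    let max_cv = nodes.iter().map(|n| n.cv).fold(0.0_f64, f64::max);
    // ANY node with a baseline drop is enough for `present_still`.
    let any_drop = nodes
        .iter()
        .any(|n| n.baseline > 0.0 && n.mean / n.baseline < drop_ratio);
    // Normalized gates when calibrated, absolute fallback gates otherwise.
    let (moving, active, x) = match baseline_cv {
        Some(b) if b > 0.0 => (3.0, 6.0, max_cv / b),
        _ => (0.10, 0.22, max_cv),
    };
    if x >= active {
        "active"
    } else if x >= moving {
        "present_moving"
    } else if any_drop {
        "present_still"
    } else {
        "absent"
    }
}
```

Motion gates are checked before the drop so that a walking body, which also lowers the mean, is reported as motion rather than as a static presence.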
|
||||
|
||||
### D3 — Sticky 3-second motion hysteresis
|
||||
|
||||
After each fusion pass, a global `AMP_HOLD` counter is reset to
|
||||
`AMP_MOTION_HOLD_TICKS = 120` whenever the candidate is `present_moving` /
|
||||
`active`. Each subsequent quiet tick decrements the counter; the
|
||||
prior motion label is kept until it expires (≈ 3 s at the ~40
|
||||
combined classifier ticks/s). This bridges the brief CV dips between
|
||||
walking steps so the GLOBAL label doesn't flicker between `present_moving` and
|
||||
`absent`.
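The hold behaviour can be sketched as a tiny state machine. `MotionHold` and `Level` are hypothetical names used only for this illustration; the real `AMP_HOLD` is a global counter, and whether the hold expires after exactly 120 quiet ticks or one tick earlier is an assumption here.

```rust
pub const AMP_MOTION_HOLD_TICKS: u32 = 120;

#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum Level {
    Absent,
    PresentStill,
    PresentMoving,
    Active,
}

/// Keeps the last motion label alive for a fixed number of quiet ticks.
pub struct MotionHold {
    remaining: u32,
    held: Level,
}

impl MotionHold {
    pub fn new() -> Self {
        Self { remaining: 0, held: Level::Absent }
    }

    /// Feed the raw fusion candidate for this tick; returns the label to publish.
    pub fn step(&mut self, candidate: Level) -> Level {
        match candidate {
            // Motion re-arms the counter and is published immediately.
            Level::PresentMoving | Level::Active => {
                self.remaining = AMP_MOTION_HOLD_TICKS;
                self.held = candidate;
                candidate
            }
            // Quiet tick while the hold is armed: keep the prior motion label.
            _ if self.remaining > 0 => {
                self.remaining -= 1;
                self.held
            }
            // Hold expired: publish the quiet candidate as-is.
            _ => candidate,
        }
    }
}
```

At roughly 40 combined ticks per second, 120 ticks gives the 3 s bridge described above.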
|
||||
|
||||
### D4 — `amp_classify_from_latest` read-only entry point
|
||||
|
||||
The server has multiple `SensingUpdate` producers — the raw-CSI path
|
||||
runs the full pipeline above, but the feature_state path (0xC5110006)
|
||||
arrives without raw amplitudes. We expose a parallel read-only
|
||||
classifier that pulls the latest stashed per-node `(cv, mean, baseline)`
|
||||
from `AMP_LATEST` and runs the same fusion. The feature_state path
|
||||
calls it so its emitted `classification` agrees with the raw-CSI
|
||||
path's — no flicker between the two SensingUpdate sources.
|
||||
|
||||
### D5 — Per-node labels in `build_node_features`
|
||||
|
||||
`PerNodeFeatureInfo.classification` is overridden via
|
||||
`amp_node_snapshot(node_id)`, which runs the same per-node
|
||||
classifier (without cross-node fusion or hysteresis) against the
|
||||
stashed `(cv, mean, baseline)` for that node alone. UI consumers
|
||||
(raw.html badges) see each sensor's independent decision plus the
|
||||
global fused one — useful for finding sensor placement without
|
||||
moving them.
|
||||
|
||||
## Files Touched
|
||||
|
||||
```
|
||||
v2/crates/wifi-densepose-sensing-server/src/main.rs # ~230 lines added
|
||||
v2/crates/wifi-densepose-sensing-server/static/raw.html # per-node badges
|
||||
```
|
||||
|
||||
## Verified Acceptance
|
||||
|
||||
| State | GLOBAL | CV | Per-node detail |
|
||||
|---|---|---|---|
|
||||
| EMPTY | `absent` | 4-6 % | both nodes baseline mean, low CV |
|
||||
| STILL (lying, in node 1 path) | `present_still` | 3-8 % | node 1 mean drops 70 %, RSSI -20 dB |
|
||||
| WALK | `active` | 12-36 % | node 2 CV explodes, RSSI swings ±5 dB |
|
||||
|
||||
Cross-state separation ratio = 3.4× on node 1 broadband mean, 5.9×
|
||||
on node 2 CV, compared to ±0.02 inside ±0.10 noise with the old
|
||||
RSSI MAD-Δ classifier from ADR-110.
|
||||
|
||||
## Open Items
|
||||
|
||||
* ✅ **Per-subcarrier baseline-drop** — closed in ADR-104 (per-sub
|
||||
drift channel with 10 % gate, triggers `present_still` even when
|
||||
broadband doesn't move).
|
||||
* ✅ **Off-axis sit doesn't trigger** — closed in ADR-104 (drift
|
||||
channel catches off-line-of-sight body presence).
|
||||
* ⏳ **CV saturates above ~30 %** — still open. Heavy-motion granularity
|
||||
(run vs jog vs jump) lost above the `active` gate. Would need a
|
||||
log-CV or rank-based metric to extend the dynamic range.
|
||||
|
||||
## References
|
||||
|
||||
* ADR-110 — first RSSI MAD-Δ attempt (superseded for `motion_level` /
|
||||
`presence` / `confidence`; helper kept as `#[allow(dead_code)]`).
|
||||
* ADR-100 — gain lock that makes this classifier possible.
|
||||
* ADR-102 — NBVI subcarrier selection that drives the CV computation.
|
||||
* ADR-103 — persistent baseline + universal threshold normalization.
|
||||
* [`docs/references/espectre-techniques.md`](../references/espectre-techniques.md)
|
||||
— full RuView ↔ ESPectre comparison.
|
||||
|
|
@ -0,0 +1,136 @@
|
|||
# ADR-102 — NBVI Subcarrier Selection (server-side)
|
||||
|
||||
**Status**: Accepted
|
||||
**Date**: 2026-05-17
|
||||
**Scope**: `v2/crates/wifi-densepose-sensing-server/src/main.rs`
|
||||
(`AmpState.nbvi_*`, `nbvi_select_top_k`).
|
||||
|
||||
## Context
|
||||
|
||||
Each ESP32-S3 CSI frame carries 56 active subcarriers on the HT20
|
||||
20 MHz channel. The amplitudes per subcarrier have very different
|
||||
SNR depending on frequency-selective fading: in the operator's
|
||||
deployment subcarriers `k=6..11` and `k=22..26` sit at CV ≈ 6 % when
|
||||
the room is empty, while subcarriers `k=38..43` (middle of the band,
|
||||
near the LTF nulls) sit at CV ≈ 11 % — pure channel noise, no
|
||||
information about the room.
|
||||
|
||||
ADR-101's classifier computes broadband-mean CV. Averaging over all
|
||||
56 subcarriers means the noisy ones drag the baseline CV up to
|
||||
5-7 %. That blunted the motion gates and we had to push them up to
|
||||
15 % / 30 %, losing sensitivity to subtle motion.
|
||||
|
||||
## Decisions
|
||||
|
||||
### D1 — Port Francesco Pace's NBVI to the server (not the FW)
|
||||
|
||||
Formula (ESPectre, GPLv3):
|
||||
|
||||
```
|
||||
NBVI(k) = α · (σ_k / μ_k²) + (1 - α) · (σ_k / μ_k), α = 0.5
|
||||
```
|
||||
|
||||
* `σ_k / μ_k²` — penalises weak subcarriers (a quiet bin with mean ≈ 0
|
||||
gets `∞` and is filtered out).
|
||||
* `σ_k / μ_k` — standard coefficient of variation; rewards stability.
|
||||
* `α = 0.5` — empirically balanced (per Pace's α-sweep tests).
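As a worked illustration of the formula, the score for one subcarrier can be computed as below. `nbvi_score` is a hypothetical name, and the population standard deviation and the infinity sentinel for dead bins are assumptions of this sketch.

```rust
/// NBVI(k) = alpha * sigma/mu^2 + (1 - alpha) * sigma/mu for one subcarrier.
/// `samples` holds that subcarrier's amplitude over the scoring window.
/// Returns `f64::INFINITY` for an empty or near-zero-mean bin so it ranks last.
pub fn nbvi_score(samples: &[f64], alpha: f64) -> f64 {
    if samples.is_empty() {
        return f64::INFINITY;
    }
    let n = samples.len() as f64;
    let mu = samples.iter().sum::<f64>() / n;
    if mu <= f64::EPSILON {
        return f64::INFINITY;
    }
    // Population standard deviation (an assumption of this sketch).
    let sigma = (samples.iter().map(|x| (x - mu).powi(2)).sum::<f64>() / n).sqrt();
    alpha * sigma / (mu * mu) + (1.0 - alpha) * sigma / mu
}
```

Two bins with the same 10 % CV score differently: amplitudes `[9, 11]` give 0.5 · 0.01 + 0.5 · 0.1 = 0.055, while `[0.9, 1.1]` give 0.5 · 0.1 + 0.5 · 0.1 = 0.1, so the weaker bin ranks worse. Lower scores are better.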
|
||||
|
||||
**Where**: in the server, not in FW. Pros: trivial to retune per
|
||||
deployment, no flash cycle, single source of truth across two FW
|
||||
variants we ship (`runbot_csi_node` and `esp32s3_csi_capture`). Cons:
|
||||
we lose the ability to *only emit* selected subcarriers (would save
|
||||
UDP bandwidth) — but at ~25 fps × 56 × 2 bytes = 2.8 KB/s per node,
|
||||
bandwidth isn't a concern.
|
||||
|
||||
### D2 — Top-K with K = 12
|
||||
|
||||
Selected at server boot once `nbvi_history` has 90+ samples; then
|
||||
re-selected every `NBVI_REFRESH_TICKS = 200` calls (~5 s of combined
|
||||
classifier ticks). The selected indices live in
|
||||
`AmpState.nbvi_selected`.
|
||||
|
||||
K=12 matches ESPectre's default. Smaller K = less averaging
|
||||
smoothing; larger K = drags in worse subcarriers.
|
||||
|
||||
### D3 — Dead-zone gate at 25 % of median mean
|
||||
|
||||
Before NBVI scoring, drop any subcarrier whose mean amplitude is
|
||||
below `0.25 × median(all subcarrier means)`. Guard tones (FW reports
|
||||
amp[0] = 0 for DC), edge bins, and dead frequencies are excluded so
|
||||
they can't "win" with σ/μ² → ∞.
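A minimal sketch of the dead-zone gate followed by top-K selection, assuming per-subcarrier means and NBVI scores have already been computed. `select_top_k` and `median` are hypothetical helpers for illustration, not the server's `nbvi_select_top_k`.

```rust
/// Median of a non-empty slice (copy and sort; cheap for 56 bins).
fn median(xs: &[f64]) -> f64 {
    let mut v = xs.to_vec();
    v.sort_by(|a, b| a.total_cmp(b));
    let m = v.len() / 2;
    if v.len() % 2 == 0 { (v[m - 1] + v[m]) / 2.0 } else { v[m] }
}

/// Indices of the `k` lowest-scoring subcarriers whose mean amplitude
/// clears the dead-zone gate of 0.25 x median(all subcarrier means).
pub fn select_top_k(means: &[f64], scores: &[f64], k: usize) -> Vec<usize> {
    assert_eq!(means.len(), scores.len());
    if means.is_empty() {
        return Vec::new();
    }
    let gate = 0.25 * median(means);
    // Drop dead bins before ranking so they cannot win on score.
    let mut keep: Vec<usize> = (0..means.len())
        .filter(|&i| means[i] > 0.0 && means[i] >= gate)
        .collect();
    // Lower NBVI score = more stable subcarrier.
    keep.sort_by(|&a, &b| scores[a].total_cmp(&scores[b]));
    keep.truncate(k);
    keep
}
```

Filtering before ranking matters: a zero-amplitude bin has zero variance, and depending on how the score handles the division it could otherwise appear at the top of the ranking.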
|
||||
|
||||
### D4 — ESPectre Step 1: quiet-window finder
|
||||
|
||||
Naive NBVI ranking over the *entire* history is biased if a body
|
||||
walked through during the calibration buffer. ADR-102 v2 adds the
|
||||
quiet-window finder from Pace's Step 1:
|
||||
|
||||
1. Slide an `AMP_SHORT_WIN=90`-sample window across `nbvi_history`
|
||||
with stride `AMP_SHORT_WIN/3 = 30`.
|
||||
2. For each window, compute the CV of its per-frame broadband mean.
|
||||
3. The window with the lowest CV is "quietest".
|
||||
4. Per-subcarrier mean and std for NBVI scoring use **only that
|
||||
window**.
|
||||
|
||||
If history is smaller than one window, the whole buffer is used.
|
||||
Stride 30 (overlap of 60) keeps wall-clock cost trivial for 600
|
||||
frames.
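The quiet-window search can be sketched as follows, assuming the per-frame broadband means are already available as a slice. `quietest_window` and `cv` are hypothetical helpers; the real code works on `nbvi_history` directly.

```rust
/// CV (std/mean) of a slice; infinite for an empty or non-positive-mean slice.
fn cv(xs: &[f64]) -> f64 {
    if xs.is_empty() {
        return f64::INFINITY;
    }
    let n = xs.len() as f64;
    let mean = xs.iter().sum::<f64>() / n;
    if mean <= 0.0 {
        return f64::INFINITY;
    }
    let var = xs.iter().map(|x| (x - mean).powi(2)).sum::<f64>() / n;
    var.sqrt() / mean
}

/// Returns the half-open range (start, end) of the lowest-CV window of
/// length `win`, stepping by `stride`. `broadband` holds one broadband
/// mean per frame. Falls back to the whole buffer when it is too short.
pub fn quietest_window(broadband: &[f64], win: usize, stride: usize) -> (usize, usize) {
    if win == 0 || broadband.len() <= win {
        return (0, broadband.len());
    }
    let step = stride.max(1);
    let mut best = (0, win);
    let mut best_cv = f64::INFINITY;
    let mut start = 0;
    while start + win <= broadband.len() {
        let c = cv(&broadband[start..start + win]);
        if c < best_cv {
            best_cv = c;
            best = (start, start + win);
        }
        start += step;
    }
    best
}
```

With `win = 90` and `stride = 30` on a 600-frame history this evaluates 18 windows, which is why the cost stays negligible.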
|
||||
|
||||
### D5 — `mean_for_baseline` uses FULL broadband, not NBVI
|
||||
|
||||
NBVI top-K re-selects between server restarts (different "quietest"
|
||||
window may give different ranking). That made the persisted baseline
|
||||
value incomparable across restarts (see ADR-103 D1). Fix: ADR-101
|
||||
classifier keeps a parallel `short_full` ring buffer of FULL
|
||||
broadband means (all non-zero subcarriers, no NBVI filter). When
|
||||
ADR-103's persistent override is active, the baseline-drop check
|
||||
compares full-broadband short window to full-broadband baseline.
|
||||
NBVI subset is still used for CV (motion sensitivity is what NBVI
|
||||
shines at — full broadband mean is just the integral level).
|
||||
|
||||
## Files Touched
|
||||
|
||||
```
|
||||
v2/crates/wifi-densepose-sensing-server/src/main.rs
|
||||
- struct AmpState
|
||||
- nbvi_select_top_k()
|
||||
- amp_presence_override() (broadband_mean computation)
|
||||
```
|
||||
|
||||
## Verified Acceptance (operator's deployment, 2026-05-17)
|
||||
|
||||
Idle empty-room CV, sensing-server with 2 pps housekeeping ping:
|
||||
|
||||
| | Full 56 subc | NBVI top-12 |
|
||||
|---|---|---|
|
||||
| node 1 (rssi -53 dBm) | ~5.0 % | **3.1 %** |
|
||||
| node 2 (rssi -67 dBm) | ~7.0 % | **3.9 %** |
|
||||
|
||||
Reduction 38-44 %. The lower baseline let ADR-101 gates be tightened
|
||||
from `15 % / 30 %` down to `10 % / 22 %` for moving/active without
|
||||
raising the false-positive rate — subtler motions like waving while
|
||||
sitting near a sensor now trigger.
|
||||
|
||||
## Open Items
|
||||
|
||||
* ✅ **Step 3 FP-rate validation** — closed in ADR-104 D4 (commit
|
||||
`6212b17e`). K ∈ {6,8,10,12,16,20} sweep, smallest-FP wins; ties
|
||||
broken by smallest total-NBVI score.
|
||||
* **Persist NBVI selection** — `AMP_BASELINE_OVERRIDE` (ADR-103)
|
||||
persists baseline scalar but not the chosen subcarrier indices.
|
||||
After server restart NBVI re-ranks from scratch; in deployments
|
||||
where the channel changes over hours we'd want to re-rank anyway,
|
||||
so for now this is correct, not an open item.
|
||||
* **FW boot-time NBVI freeze** — Pace's ESPectre freezes the NBVI selection for
|
||||
the lifetime of the boot. Trade-off vs our adaptive rolling
|
||||
refresh. Worth exploring if FP rate is a problem in real homes.
|
||||
|
||||
## References
|
||||
|
||||
* ADR-100 — gain lock (gives NBVI a stable per-subcarrier baseline).
|
||||
* ADR-101 — classifier that consumes NBVI selection.
|
||||
* ADR-103 — persistent baseline + universal threshold normalization.
|
||||
* [Pace's *Part 2*](https://medium.com/@francesco.pace/how-i-turned-my-wi-fi-into-a-motion-sensor-part-2-62038130e530)
|
||||
+ [francescopace/espectre](https://github.com/francescopace/espectre)
|
||||
on GitHub (GPLv3).
|
||||
* [`docs/references/espectre-techniques.md`](../references/espectre-techniques.md).
|
||||
|
|
@ -0,0 +1,180 @@
|
|||
# ADR-103 — Persistent Empty-Room Baseline + Universal Threshold
|
||||
|
||||
**Status**: Accepted
|
||||
**Date**: 2026-05-17
|
||||
**Scope**: `v2/crates/wifi-densepose-sensing-server/src/main.rs`
|
||||
(`AMP_BASELINE_OVERRIDE`, `AMP_BASELINE_CV`, `load_baseline_file`,
|
||||
`amp_node_level`), `v2/data/baseline.json`, `scripts/record-baseline.py`.
|
||||
|
||||
## Context
|
||||
|
||||
ADR-101's classifier relies on a `baseline` value per node — the
|
||||
mean amplitude the room exhibits when empty. Pre-ADR-103 the baseline
|
||||
was the rolling 95 %ile of the last 1 200 samples (≈ 60 s) of
|
||||
broadband mean. That meant every server restart triggered a "step
|
||||
outside for 60 seconds" ritual before the detector worked, and if
|
||||
the operator stayed in the room longer than ~4 min the baseline
|
||||
silently drifted down to the *occupied* amplitude — making
|
||||
`present_still` under-trigger forever after.
|
||||
|
||||
Additionally, motion gates were hard-coded to the operator's
|
||||
deployment (10 % / 22 % CV) and would not transfer to a different room
|
||||
with a different noise floor.
|
||||
|
||||
## Decisions
|
||||
|
||||
### D1 — Persistent baseline file at `data/baseline.json`
|
||||
|
||||
JSON schema **v2** (per node):
|
||||
|
||||
```json
|
||||
{
|
||||
"version": 2,
|
||||
"captured_at": "ISO-8601",
|
||||
"duration_sec": 90.0,
|
||||
"trim_head_sec": 15.0,
|
||||
"trim_tail_sec": 15.0,
|
||||
"clean_window_sec": 30.0,
|
||||
"method": "record → trim head/tail → find lowest-CV sub-window → FULL-broadband stats per node",
|
||||
"nodes": {
|
||||
"1": {
|
||||
"full_broadband_mean": 26.11,
|
||||
"full_broadband_p50": 26.16,
|
||||
"full_broadband_p95": 27.04, ← used as `baseline`
|
||||
"full_broadband_std": 0.68,
|
||||
"full_broadband_cv_pct": 2.62, ← used to normalize gates (D3)
|
||||
"rssi_dbm": -52.3,
|
||||
"n_samples": 149,
|
||||
"per_subcarrier_mean": [..56 floats..]
|
||||
}
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
Loader (`load_baseline_file`) reads at server startup. Path is
|
||||
`$RUVIEW_BASELINE_FILE` or `data/baseline.json` by default. Missing
|
||||
or unparseable file = log warning + fall back to rolling p95 (= old
|
||||
ADR-101 behaviour, no breaking change).
|
||||
|
||||
Per-node lookup priority: `full_broadband_p95` → `full_broadband_mean`
|
||||
→ legacy `p95_amp` → legacy `mean_amp`. v1 baselines load but get a
|
||||
warning about NBVI-drift incompatibility.
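The lookup priority can be sketched with a chain of `Option` fallbacks. `NodeBaseline` and `pick_baseline` are hypothetical names for illustration; the real loader parses JSON, which is omitted here to keep the sketch dependency-free.

```rust
/// Per-node baseline entry. v2 fields come first, legacy v1 fields last.
/// All fields are optional because older files lack the v2 keys.
#[derive(Default)]
pub struct NodeBaseline {
    pub full_broadband_p95: Option<f64>,
    pub full_broadband_mean: Option<f64>,
    pub p95_amp: Option<f64>,  // legacy v1
    pub mean_amp: Option<f64>, // legacy v1
}

/// Returns `(baseline, is_legacy)`. `is_legacy` is true when the value came
/// from a v1 field, so the caller can emit the NBVI-drift warning.
pub fn pick_baseline(n: &NodeBaseline) -> Option<(f64, bool)> {
    n.full_broadband_p95
        .map(|v| (v, false))
        .or(n.full_broadband_mean.map(|v| (v, false)))
        .or(n.p95_amp.map(|v| (v, true)))
        .or(n.mean_amp.map(|v| (v, true)))
}
```

A `None` result corresponds to the rolling-p95 fallback described above for a missing or unparseable file.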
|
||||
|
||||
### D2 — FULL broadband for baseline comparison, NBVI for CV
|
||||
|
||||
The persisted baseline must be comparable across server restarts.
|
||||
NBVI top-12 re-selects on each boot (ADR-102 D4), so a NBVI-subset
|
||||
mean recorded today doesn't match a NBVI-subset mean tomorrow even
|
||||
in the same empty room. Fix:
|
||||
|
||||
`amp_presence_override` keeps two short windows:
|
||||
|
||||
| Field | Source | Used for |
|
||||
|---|---|---|
|
||||
| `short` | NBVI-subset broadband mean | CV (motion sensitivity) |
|
||||
| `short_full` | **all non-zero subcarriers** mean | baseline drop check |
|
||||
|
||||
The recording script also computes full-broadband statistics from
|
||||
the captured frames. Both sides of `mean / baseline` ratio are
|
||||
full-broadband ⇒ stable across NBVI selection.
|
||||
|
||||
### D3 — Universal threshold via baseline-CV normalization
|
||||
|
||||
(Pace's Problem #3.) Hard-coded gates are deployment-tuned. Fix:
|
||||
normalize the runtime CV by the empty-room CV measured during
|
||||
calibration:
|
||||
|
||||
```
|
||||
norm_cv = current_cv / baseline_cv
|
||||
gates: norm_cv ≥ 3.0 → present_moving
|
||||
norm_cv ≥ 6.0 → active
|
||||
```
|
||||
|
||||
Both `amp_node_level` (per-node) and `amp_classify_from_latest`
|
||||
(global) use the same normalization. When no calibration is loaded,
|
||||
fall back to absolute gates `0.10 / 0.22` (the deployment-tuned
|
||||
values) — keeps backwards compatibility.
|
||||
|
||||
`AMP_BASELINE_CV` is a separate per-node map loaded alongside
|
||||
`AMP_BASELINE_OVERRIDE`. The CV value is the FULL-broadband CV % from
|
||||
the calibration file divided by 100.
|
||||
|
||||
### D4 — Recording script `scripts/record-baseline.py`
|
||||
|
||||
CLI helper (Python 3, requires `pip install websockets`). Connects
|
||||
to the live `ws://localhost:8765/ws/sensing`, records `duration` (90
|
||||
s default), then:
|
||||
|
||||
1. Trim `trim_head_sec` (15 s default) and `trim_tail_sec` (15 s
|
||||
default) to discard door-open / re-entry transients.
|
||||
2. Slide a `clean_window_sec` (30 s default) sub-window across the
|
||||
trimmed buffer, pick the one with the lowest broadband CV.
|
||||
3. Per node, compute full-broadband mean / median / p95 / std / CV %
|
||||
and rssi mean over that cleanest window.
|
||||
4. Also compute per-subcarrier mean across the cleanest window (saved
|
||||
as diagnostic for future per-subcarrier delta classifier).
|
||||
5. Write `v2/data/baseline.json` (path overridable via `--out`).
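Steps 1 and 3 can be illustrated with the sketch below. The actual helper is a Python script; this Rust version only shows the arithmetic. `trim`, `percentile`, and `stats` are hypothetical names, trimming is expressed in samples rather than seconds (a fixed sample rate is assumed), and the nearest-rank percentile is an assumption, since the script's exact percentile method is not stated.

```rust
/// Drop `head` samples from the front and `tail` samples from the back.
pub fn trim(xs: &[f64], head: usize, tail: usize) -> &[f64] {
    if head + tail >= xs.len() {
        return &[];
    }
    &xs[head..xs.len() - tail]
}

/// Nearest-rank percentile (`p` in 0..=100) of a non-empty slice.
pub fn percentile(xs: &[f64], p: f64) -> f64 {
    let mut v = xs.to_vec();
    v.sort_by(|a, b| a.total_cmp(b));
    let rank = ((p / 100.0) * v.len() as f64).ceil() as usize;
    v[rank.clamp(1, v.len()) - 1]
}

/// Summary statistics matching the per-node fields of the baseline file.
pub struct Stats {
    pub mean: f64,
    pub p50: f64,
    pub p95: f64,
    pub std: f64,
    pub cv_pct: f64,
}

/// Returns `None` for an empty window.
pub fn stats(xs: &[f64]) -> Option<Stats> {
    if xs.is_empty() {
        return None;
    }
    let n = xs.len() as f64;
    let mean = xs.iter().sum::<f64>() / n;
    let std = (xs.iter().map(|x| (x - mean).powi(2)).sum::<f64>() / n).sqrt();
    let cv_pct = if mean > 0.0 { 100.0 * std / mean } else { f64::NAN };
    Some(Stats { mean, p50: percentile(xs, 50.0), p95: percentile(xs, 95.0), std, cv_pct })
}
```

In the workflow above, `stats` would be applied to the cleanest sub-window of the trimmed buffer, once per node.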
|
||||
|
||||
Operator workflow now: step out, run script, come back, restart
|
||||
server. One-time per deployment (or after room rearrangement). No
|
||||
recurring ritual.
|
||||
|
||||
## Files Touched
|
||||
|
||||
```
|
||||
v2/crates/wifi-densepose-sensing-server/src/main.rs # ~120 lines added
|
||||
v2/data/baseline.json # new, gitignored?
|
||||
scripts/record-baseline.py # new helper
|
||||
docs/adr/ADR-103-persistent-baseline.md # this ADR
|
||||
```
|
||||
|
||||
## Verified Acceptance (operator's deployment, 2026-05-17)
|
||||
|
||||
```
|
||||
boot: baseline: loaded 2 node overrides from data/baseline.json
|
||||
(node1=27.04, node2=14.72;
|
||||
node1_cv=2.62%, node2_cv=3.65%)
|
||||
```
|
||||
|
||||
Empty room, immediately after restart (no warmup wait):
|
||||
|
||||
```
|
||||
GLOBAL: absent CV=5.0%
|
||||
node 1 ratio=0.93, norm_cv=0.80×
|
||||
node 2 ratio=0.93, norm_cv=0.83×
|
||||
```
|
||||
|
||||
Sitting in node 2 path (off-axis from node 1):
|
||||
|
||||
```
|
||||
GLOBAL: present_still CV=8.1%
|
||||
node 1 ratio=1.05, norm_cv=1.2× (not in path, no drop)
|
||||
node 2 ratio=0.70, norm_cv=1.7× ← drop fires present_still
|
||||
```
|
||||
|
||||
Walking:
|
||||
|
||||
```
|
||||
GLOBAL: active CV=28-36%
|
||||
node 1 norm_cv=4-6×, node 2 norm_cv=7-9× ← well above 6× gate
|
||||
```
|
||||
|
||||
Universal-threshold gates `3.0 / 6.0` map to the same absolute
|
||||
10 % / 22 % we hand-tuned earlier (3.0 × 3.65 % ≈ 11 % and 6.0 × 3.65 % ≈ 22 % with node 2's baseline CV), but are now portable to any room.
|
||||
|
||||
## Open Items
|
||||
|
||||
* ✅ **REST endpoint POST /api/v1/baseline/calibrate** — closed in
|
||||
ADR-107 D3 + UI button D6.
|
||||
* ✅ **Per-subcarrier baseline comparison** — closed in ADR-104
|
||||
(per-sub drift channel consumes `per_subcarrier_mean`).
|
||||
* ✅ **Auto-recalibrate on long quiet periods** — closed in ADR-107 D5
|
||||
(30-min quiet + 1-h cooldown defaults).
|
||||
|
||||
## References
|
||||
|
||||
* ADR-100 — gain lock.
|
||||
* ADR-101 — classifier consumes the baseline.
|
||||
* ADR-102 — NBVI selection drift was the root cause of D1/D2.
|
||||
* [`docs/references/espectre-techniques.md`](../references/espectre-techniques.md)
|
||||
— Pace's full technique catalogue including Problem #3 normalization.
|
||||
|
|
@ -0,0 +1,179 @@
|
|||
# ADR-104 — Per-Subcarrier Drift Presence Channel + NBVI FP-Rate Validation
|
||||
|
||||
**Status**: Accepted
|
||||
**Date**: 2026-05-17
|
||||
**Scope**: `v2/crates/wifi-densepose-sensing-server/src/main.rs`
|
||||
(`AMP_BASELINE_PER_SUB`, `AMP_DRIFT`, `amp_drift_for_node`,
|
||||
`amp_drift_max`, `amp_node_level`, `amp_classify_from_latest`,
|
||||
`nbvi_select_top_k` Step 3), `scripts/record-baseline.py`
|
||||
(`per_subcarrier_mean` already saved).
|
||||
|
||||
## Context
|
||||
|
||||
After ADR-103 the classifier triggers `present_still` only when the
|
||||
**full-broadband mean** (all non-zero subcarriers, ADR-103 D2) drops by ≥ 25 % from
|
||||
the loaded baseline. This works when the operator's body crosses the
|
||||
line of sight between AP and sensor — direct-component attenuation
|
||||
dominates. But:
|
||||
|
||||
1. **Off-axis presence**: the operator sitting at a desk to the side
|
||||
of the AP-sensor line modulates only a handful of subcarriers
|
||||
(the ones whose Fresnel zone happens to brush their body). The
|
||||
*broadband* mean barely shifts; ADR-103 says `absent` even though
|
||||
someone is clearly in the room.
|
||||
2. **NBVI Step 3**: Pace's full NBVI pipeline picks top-K by raw NBVI
|
||||
score, then **validates** each candidate K by counting false
|
||||
positives the motion detector would produce on the calibration
|
||||
buffer, and keeps the K with the lowest FP rate. We were taking
|
||||
the raw top-12 without validation — fragile if one of the chosen
|
||||
subcarriers happens to overlap a noise source.
|
||||
|
||||
## Decisions
|
||||
|
||||
### D1 — Spectral drift score as a second presence channel
|
||||
|
||||
`amp_presence_override` per node now also computes a **spectral
|
||||
drift score**:
|
||||
|
||||
```
|
||||
drift_k = (current_amp[k] - baseline_amp[k]).abs() / baseline_amp[k] for baseline[k] > 1.0
|
||||
drift = mean(drift_k) across kept subcarriers
|
||||
```
|
||||
|
||||
`current_amp[k]` = mean of the recent `AMP_SHORT_WIN` (90) frames'
|
||||
amplitude at subcarrier `k`. `baseline_amp[k]` = the
|
||||
`per_subcarrier_mean` vector saved by ADR-103's recording script.
|
||||
|
||||
Per-node drift is stashed in `AMP_DRIFT: HashMap<u8, f64>` so
|
||||
`amp_node_level` (per-node) and `amp_classify_from_latest` (global)
|
||||
can use it. Threshold `AMP_DRIFT_PRESENCE_THRESH = 0.10` (10 %
|
||||
average per-subcarrier deviation) is empirical and consistent with
|
||||
the broadband-ratio trigger (drop ≥ 25 %, drift ≥ 10 %).
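The drift formula can be written out as a short function. `drift_score` is a hypothetical name for this sketch; the inputs are assumed to be the short-window per-subcarrier means and the stored `per_subcarrier_mean` vector.

```rust
/// Mean relative deviation |current - baseline| / baseline over the
/// subcarriers whose baseline amplitude exceeds 1.0.
/// Returns 0.0 when no subcarrier qualifies (for example, no per-subcarrier
/// baseline loaded), which leaves the drift channel inactive.
pub fn drift_score(current: &[f64], baseline: &[f64]) -> f64 {
    let mut sum = 0.0;
    let mut kept = 0usize;
    for (c, b) in current.iter().zip(baseline.iter()) {
        if *b > 1.0 {
            sum += (c - b).abs() / b;
            kept += 1;
        }
    }
    if kept == 0 { 0.0 } else { sum / kept as f64 }
}
```

Because the deviation is taken as an absolute value, a subcarrier that gains amplitude from a new reflection counts as much as one that is attenuated. The broadband mean misses this case when gains and losses cancel out.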
|
||||
|
||||
### D2 — Trigger order in classifier
|
||||
|
||||
Per node (`amp_node_snapshot`):
|
||||
|
||||
```
|
||||
1. CV ≥ 6× baseline_cv → active
|
||||
2. CV ≥ 3× baseline_cv → present_moving
|
||||
3. drift ≥ 10 % → present_still ← ADR-104 (off-axis)
|
||||
4. mean / baseline < 0.75 → present_still ← ADR-101 (in-path)
|
||||
5. otherwise → absent
|
||||
```
|
||||
|
||||
Global (`amp_classify_from_latest`) uses MAX CV / MAX drift / ANY
|
||||
baseline-drop across nodes. Either drop OR drift fires `present_still`.
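The per-node trigger order can be sketched as one function. `node_level` and `Level` are hypothetical names for illustration; the fallback to absolute gates when `baseline_cv` is not positive follows ADR-103 D3.

```rust
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum Level {
    Active,
    PresentMoving,
    PresentStill,
    Absent,
}

/// Per-node decision in the documented trigger order.
/// `cv` and `baseline_cv` are fractions (0.10 = 10 %); `drift` is the
/// per-subcarrier drift score; `mean` / `baseline` are amplitudes.
pub fn node_level(cv: f64, baseline_cv: f64, drift: f64, mean: f64, baseline: f64) -> Level {
    // Normalized gates when calibrated, absolute fallback gates otherwise.
    let (moving, active) = if baseline_cv > 0.0 {
        (3.0 * baseline_cv, 6.0 * baseline_cv)
    } else {
        (0.10, 0.22)
    };
    if cv >= active {
        Level::Active
    } else if cv >= moving {
        Level::PresentMoving
    } else if drift >= 0.10 {
        // Off-axis presence: a few subcarriers shifted, broadband barely moved.
        Level::PresentStill
    } else if baseline > 0.0 && mean / baseline < 0.75 {
        // In-path presence: broadband mean dropped by 25 % or more.
        Level::PresentStill
    } else {
        Level::Absent
    }
}
```

Both static triggers yield the same label, so their relative order only matters for diagnostics. Motion is checked first because a moving body raises the drift score as well.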
|
||||
|
||||
### D3 — Opportunistic loading
|
||||
|
||||
`per_subcarrier_mean` was already being written by
|
||||
`scripts/record-baseline.py` (line ~132, written as a list of
|
||||
~56 floats per node) but the server ignored it. Now `load_baseline_file`
|
||||
parses it and populates `AMP_BASELINE_PER_SUB`. If absent (older
|
||||
`baseline.json` from before this ADR) → drift stays 0.0 → no behaviour
|
||||
change. Re-trigger calibration via the ADR-107 REST endpoint or auto-
|
||||
recalibrate to populate the field and activate the drift channel.
|
||||
|
||||
### D4 — NBVI FP-rate validation (Step 3 of Pace's spec)
|
||||
|
||||
`nbvi_select_top_k` no longer returns the literal top-K. After
|
||||
ranking by NBVI score (Steps 1+2), it evaluates each candidate
|
||||
K ∈ `{6, 8, 10, 12, 16, 20}` clamped to the available subcarrier
|
||||
pool:
|
||||
|
||||
* For each K: compute per-frame broadband mean over the top-K
|
||||
subset across the quiet window.
|
||||
* Slide a sub-window (length `AMP_SHORT_WIN/3 = 30` samples, stride
|
||||
`sub_window/2`) and count windows where rolling CV exceeds the
|
||||
moving-gate threshold (0.10).
|
||||
* Pick the K with the **smallest FP count**. Ties broken by smallest
|
||||
total NBVI score (less noisy subset wins).
|
||||
|
||||
Result: a subset that's stable AND non-FP-producing on the calibration
|
||||
window. If a top-12 NBVI candidate sneaks in a subcarrier overlapping
|
||||
a noise source, the FP count surfaces it and a smaller K wins instead.
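The validation loop can be sketched as below. `pick_k`, `fp_count`, and `cv` are hypothetical helpers, `ranked` is assumed to hold subcarrier indices already sorted by ascending NBVI score, and every frame is assumed to contain all ranked indices. The tie-break is read literally as the sum of scores over the subset, which is an assumption; the server may resolve ties differently.

```rust
/// CV (std/mean) of a slice; infinite for an empty or non-positive-mean slice.
fn cv(xs: &[f64]) -> f64 {
    if xs.is_empty() {
        return f64::INFINITY;
    }
    let n = xs.len() as f64;
    let mean = xs.iter().sum::<f64>() / n;
    if mean <= 0.0 {
        return f64::INFINITY;
    }
    (xs.iter().map(|x| (x - mean).powi(2)).sum::<f64>() / n).sqrt() / mean
}

/// Count sub-windows (length `sub`, stride `sub / 2`) whose CV exceeds `gate`.
fn fp_count(series: &[f64], sub: usize, gate: f64) -> usize {
    if sub == 0 {
        return 0;
    }
    let stride = (sub / 2).max(1);
    let (mut n, mut s) = (0, 0);
    while s + sub <= series.len() {
        if cv(&series[s..s + sub]) > gate {
            n += 1;
        }
        s += stride;
    }
    n
}

/// Pick K from {6, 8, 10, 12, 16, 20} (clamped to the pool size) with the
/// fewest false positives on the quiet-window `frames`.
pub fn pick_k(frames: &[Vec<f64>], ranked: &[usize], scores: &[f64], sub: usize, gate: f64) -> usize {
    let mut best: Option<(usize, f64, usize)> = None; // (fp, total score, k)
    for &cand in &[6usize, 8, 10, 12, 16, 20] {
        let k = cand.min(ranked.len());
        if k == 0 {
            continue;
        }
        let subset = &ranked[..k];
        // Per-frame broadband mean over the top-k subset.
        let series: Vec<f64> = frames
            .iter()
            .map(|f| subset.iter().map(|&i| f[i]).sum::<f64>() / k as f64)
            .collect();
        let fp = fp_count(&series, sub, gate);
        let total: f64 = subset.iter().map(|&i| scores[i]).sum();
        let better = match best {
            None => true,
            Some((bfp, btotal, _)) => fp < bfp || (fp == bfp && total < btotal),
        };
        if better {
            best = Some((fp, total, k));
        }
    }
    best.map(|(_, _, k)| k).unwrap_or(0)
}
```

With `sub = 30` and `gate = 0.10` this mirrors the parameters listed above. A return value of 0 means no subcarriers were available to rank.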
|
||||
|
||||
## Files Touched
|
||||
|
||||
```
|
||||
v2/crates/wifi-densepose-sensing-server/src/main.rs
|
||||
- statics: AMP_BASELINE_PER_SUB, AMP_DRIFT
|
||||
- helpers: amp_baseline_per_sub_init, amp_drift_init,
|
||||
amp_drift_for_node, amp_drift_max
|
||||
- load_baseline_file: parse per_subcarrier_mean → AMP_BASELINE_PER_SUB
|
||||
- amp_presence_override: drift computation + stash
|
||||
- amp_node_level: drift trigger (uses MAX for cross-node)
|
||||
- amp_node_snapshot: per-node drift trigger (overrides MAX)
|
||||
- amp_classify_from_latest: any-node drift trigger in global fusion
|
||||
- nbvi_select_top_k: Step 3 FP-rate validation
|
||||
docs/adr/ADR-104-per-subcarrier-drift-presence.md (this)
|
||||
```
|
||||
|
||||
Implementation commit: `6212b17e`.
|
||||
|
||||
## Verified Acceptance
|
||||
|
||||
Server boot log (using existing v1 baseline.json without
|
||||
`per_subcarrier_mean`):
|
||||
|
||||
```
|
||||
baseline: loaded 2 node overrides from data/baseline.json
|
||||
(node1=27.04, node2=14.72; node1_cv=2.62%, node2_cv=3.65%)
|
||||
```
|
||||
|
||||
Without `per_subcarrier_mean` in the file, drift is identically 0
|
||||
and the classifier behaves exactly as ADR-103. To activate the
|
||||
drift channel: re-record via the ADR-107 REST endpoint or wait for
|
||||
auto-recalibrate; new `baseline.json` carries the
|
||||
`per_subcarrier_mean` vector and drift becomes live.
|
||||
|
||||
NBVI Step 3 validation runs on every refresh tick. With K=12 being
|
||||
the "safe" default that always passes (clean low-CV window in the
|
||||
operator's deployment) and smaller Ks not improving FP=0, the picker
|
||||
keeps K=12 in steady state. Defends against future drift in channel
|
||||
conditions where a previously-clean subcarrier picks up interference.
|
||||
|
||||
## Open Items
|
||||
|
||||
(none — see Closed below)
|
||||
|
||||
## Closed
|
||||
|
||||
* **Phase-domain drift** — `scripts/record-baseline.py` and the
|
||||
in-process `capture_baseline_to_disk` now emit per-subcarrier
|
||||
`per_subcarrier_phase_mean` + `per_subcarrier_phase_var` (circular
|
||||
mean + variance) when the WS stream carries phases (ADR-106). The
|
||||
server loads them into `PHASE_BASELINE_PER_SUB`, `phase_drift_update`
|
||||
computes a per-tick circular-distance score over subcarriers whose
|
||||
baseline variance is below `PHASE_BASELINE_VAR_MAX = 0.30`. Score
|
||||
surfaces in `PerNodeFeatureInfo.phase_drift_score` (skip-if-none).
|
||||
Falls back gracefully — legacy baselines without phase fields keep
|
||||
amplitude-only behaviour.
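The circular statistics behind the phase channel can be sketched as follows. All three function names are hypothetical. Defining circular variance as `1 - R` (R being the mean resultant length) and aggregating the score as a mean over kept subcarriers are assumptions; the text above only names "circular mean + variance" and a "circular-distance score".

```rust
use std::f64::consts::PI;

/// Circular mean (radians, atan2 range) and circular variance 1 - R,
/// where R is the mean resultant length. Returns `None` for empty input.
pub fn circular_mean_var(phases: &[f64]) -> Option<(f64, f64)> {
    if phases.is_empty() {
        return None;
    }
    let n = phases.len() as f64;
    let s = phases.iter().map(|p| p.sin()).sum::<f64>() / n;
    let c = phases.iter().map(|p| p.cos()).sum::<f64>() / n;
    let r = (s * s + c * c).sqrt();
    Some((s.atan2(c), 1.0 - r))
}

/// Shortest angular distance between two phases, in [0, pi].
pub fn circular_distance(a: f64, b: f64) -> f64 {
    let d = (a - b).rem_euclid(2.0 * PI);
    d.min(2.0 * PI - d)
}

/// Mean circular distance between current and baseline phase, restricted to
/// subcarriers whose baseline variance is below `var_max`.
/// Returns `None` when no subcarrier qualifies.
pub fn phase_drift(current: &[f64], base_mean: &[f64], base_var: &[f64], var_max: f64) -> Option<f64> {
    let mut sum = 0.0;
    let mut kept = 0usize;
    for ((c, m), v) in current.iter().zip(base_mean).zip(base_var) {
        if *v < var_max {
            sum += circular_distance(*c, *m);
            kept += 1;
        }
    }
    if kept == 0 { None } else { Some(sum / kept as f64) }
}
```

Circular statistics are needed because phase wraps: samples at 3.1 rad and -3.1 rad are neighbours, and an arithmetic mean would place them at 0 rad, on the opposite side of the circle.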
|
||||
|
||||
* **Per-subcarrier baseline AGE check** — `baseline_staleness_watch`
|
||||
background task warns when on-disk baseline is older than
|
||||
`--baseline-stale-age-sec` (default 4 h) AND per-sub drift exceeds
|
||||
1.5× presence threshold for ≥3 consecutive 5-min ticks while the
|
||||
classifier reports `absent`. Rate-limited via
|
||||
`--baseline-stale-warn-cooldown-sec` (default 1 h). Independent
|
||||
from `auto_recalibrate_task`: that path needs a quiet room; this
|
||||
one fires when the operator is *in* the room while the channel
|
||||
itself has shifted. (commit eec3ca6c)
|
||||
* **Per-subcarrier delta in UI** — `raw.html` now shows a per-node
|
||||
drift sparkline below the RSSI/broadband trace, fixed Y range
|
||||
[0, 0.30] with dashed presence (0.10) and warning (0.15)
|
||||
thresholds. Numeric "drift" stat pill in the per-node header.
|
||||
Backed by a new `drift_score: Option<f64>` field on
|
||||
`PerNodeFeatureInfo` (skip-if-none — distinguishes "no per-sub
|
||||
baseline loaded" from "loaded and stable at 0.0"). (commit eec3ca6c)
|
||||
|
||||
## References
|
||||
|
||||
* ADR-101 — broadband classifier; this ADR adds a parallel channel.
|
||||
* ADR-102 — NBVI; this ADR adds Step 3 validation per Pace's spec.
|
||||
* ADR-103 — persistent baseline; `per_subcarrier_mean` already written.
|
||||
* ADR-107 — REST calibrate endpoint; how the operator refreshes the
|
||||
per-sub vector on demand.
|
||||
* [`docs/references/espectre-techniques.md`](../references/espectre-techniques.md)
|
||||
§1.Step 3.
|
||||
|
|
@ -0,0 +1,192 @@
|
|||
# ADR-105 — No Synthetic Data in Production Runtime
|
||||
|
||||
**Status**: Accepted
|
||||
**Date**: 2026-05-17
|
||||
**Scope**: `v2/crates/wifi-densepose-sensing-server/src/main.rs`
|
||||
(REST handlers under `/api/v1/pose/*`, `/api/v1/info`,
|
||||
`derive_pose_from_sensing`, `generate_signal_field`).
|
||||
|
||||
## Context
|
||||
|
||||
After we pulled the upstream Docker UI (`ruvnet/wifi-densepose:latest`)
|
||||
and pointed it at our backend via `--ui-path /tmp/wdp_ui/ui`, the
|
||||
operator inspected the rich SPA and noticed several panels showing
|
||||
data we have no business showing:
|
||||
|
||||
* **Pose dashboard rendered a 17-keypoint skeleton** even though no
|
||||
DensePose model is loaded. Trace: `derive_pose_from_sensing` →
|
||||
`derive_single_person_pose` synthesised a geometric placeholder
|
||||
with keypoint `confidence = 0.0` but plausible-looking coordinates.
|
||||
* **`/api/v1/pose/stats.average_confidence` was the literal `0.87`**
|
||||
hard-coded in the handler.
|
||||
* **`/api/v1/pose/zones/summary` invented four zones** (`zone_1..4`)
|
||||
marked `clear`, even though no zone configuration exists on this
|
||||
deployment.
|
||||
* **`/api/v1/info.features.pose_estimation` was permanently `true`**
|
||||
regardless of whether a model was actually loaded.
|
||||
* **`SignalField` (the 20×20 room-heatmap in WS payload) was
|
||||
procedurally generated** by mapping subcarrier index `k` to angle
|
||||
`2π·k/N` and dropping Gaussian hotspots at radius proportional to
|
||||
variance. A single sensor has no directional information — the
|
||||
resulting heatmap had no correspondence to where anything actually
|
||||
was in the room. UI rendered a believable spatial visual that was
|
||||
entirely a fiction.
|
||||
|
||||
All five were cosmetic noise hiding the real capability gap. Operator
|
||||
asked for boots-on-the-ground honesty: surface real ESP32-derived
|
||||
state and nothing else.
|
||||
|
||||
## Decisions
|
||||
|
||||
### D1 — `derive_pose_from_sensing` returns empty
|
||||
|
||||
The function body is now `Vec::new()`. The legacy heuristic
|
||||
(`derive_single_person_pose` + bone-length tables) is unreachable
|
||||
from production paths but left in the source for the day a real
|
||||
trained pose model is wired in. All call sites compile unchanged
|
||||
and just get an empty vector when there is no model.
|
||||
|
||||
### D2 — `/api/v1/pose/current` gated on `model_loaded`
|
||||
|
||||
```rust
|
||||
let persons = if s.model_loaded {
|
||||
s.latest_update.as_ref().and_then(|u| u.persons.clone()).unwrap_or_default()
|
||||
} else {
|
||||
Vec::new()
|
||||
};
|
||||
```
|
||||
|
||||
Response now includes `"model_loaded": false` so the UI can decide
|
||||
whether to render a placeholder ("No pose model loaded") or hide the
|
||||
panel entirely.
|
||||
|
||||
### D3 — `/api/v1/pose/stats` drops the fake confidence
|
||||
|
||||
The hard-coded `"average_confidence": 0.87` is removed. Only
|
||||
counters that come from real frame ingest remain
|
||||
(`total_detections`, `frames_processed`) plus `model_loaded`.
|
||||
|
||||
### D4 — `/api/v1/pose/zones/summary` reports actual zone state
|
||||
|
||||
```json
|
||||
{ "presence": <real>, "zones_configured": 0, "zones": {} }
|
||||
```
|
||||
|
||||
No more invented `zone_1..4`. When the operator configures real
|
||||
zones (open work), they get added here.
|
||||
|
||||
### D5 — `/api/v1/info.features.pose_estimation` reflects reality
|
||||
|
||||
```rust
|
||||
"pose_estimation": s.model_loaded,
|
||||
```
|
||||
|
||||
### D6 — `generate_signal_field` returns zero-filled grid
|
||||
|
||||
The body is now:
|
||||
|
||||
```rust
|
||||
let grid = 20usize;
|
||||
return SignalField {
|
||||
grid_size: [grid, 1, grid],
|
||||
values: vec![0.0; grid * grid],
|
||||
};
|
||||
```
|
||||
|
||||
UI renders blank instead of a synthesised spatial map. This is the
|
||||
truthful state until a real multistatic localizer is wired (per
|
||||
ADR-008 multi-AP attention or the `MultistaticFuser` already in
|
||||
state). 77 lines of procedural-art code deleted.
|
||||
|
||||
## Files Touched
|
||||
|
||||
```
|
||||
v2/crates/wifi-densepose-sensing-server/src/main.rs
|
||||
- fn api_info (D5)
|
||||
- fn pose_current (D2)
|
||||
- fn pose_stats (D3)
|
||||
- fn pose_zones_summary (D4)
|
||||
- fn derive_pose_from_sensing (D1)
|
||||
- fn generate_signal_field (D6)
|
||||
docs/adr/ADR-105-no-synthetic-data-in-production-runtime.md (this)
|
||||
```
|
||||
|
||||
Two commits:
|
||||
|
||||
* `9aa027e9` — D1..D5 (REST handlers + `derive_pose_from_sensing`)
|
||||
* `30244d27` — D6 (`generate_signal_field` stub)
|
||||
|
||||
## Verified Acceptance
|
||||
|
||||
`/api/v1/sensing/latest` snapshot, deployment idle:
|
||||
|
||||
```
|
||||
signal_field grid=[20,1,20], 400 values, 0 non-zero (was: random hotspots)
|
||||
pose_keypoints null (was: 17-point heuristic)
|
||||
persons null (was: synthesised array)
|
||||
posture null (was: heuristic string)
|
||||
signal_quality_score null
|
||||
enhanced_motion null
|
||||
vital_signs.br_bpm null (smoothed_br ≤ 1.0)
|
||||
vital_signs.hr_bpm null
|
||||
|
||||
— still real —
|
||||
features.mean_rssi -59 dBm ✓
|
||||
features.variance 8.64 ✓
|
||||
classification absent / present_still / present_moving / active per ADR-101
|
||||
```
|
||||
|
||||
`/api/v1/pose/current`:
|
||||
|
||||
```json
|
||||
{"persons": [], "total_persons": 0, "model_loaded": false, "source": "esp32"}
|
||||
```
|
||||
|
||||
`/api/v1/info`:
|
||||
|
||||
```json
|
||||
{"features": {..., "pose_estimation": false, ...}}
|
||||
```
|
||||
|
||||
## Out of scope (already correct or developer-mode)
|
||||
|
||||
* `--source simulate` already exits with code 2 (parallel agent change).
|
||||
* `--pretrain` / `--train` synthetic-fallback paths are explicit
|
||||
dev-mode CLI flags. They never touch the runtime sensing path and
|
||||
are out of scope for this ADR.
|
||||
* `vital_signs` was already gated: `breathing_rate_bpm = Some(_)` only
|
||||
when smoothed value > 1.0 BPM; otherwise `None`. No spurious BPM
|
||||
reported.
|
||||
* `enhanced_motion` / `enhanced_breathing` / `bssid_count` come from
|
||||
`pipeline.process(&multi_ap_frame)` which consumes real CSI. When
|
||||
the multi-BSSID pipeline is inactive they are `None`. Left alone.
|
||||
|
||||
## Open Items
|
||||
|
||||
* **UI badges for "no model"** — `raw.html` already renders correctly
|
||||
on empty pose data; the richer Docker UI still tries to render a
|
||||
skeleton from `pose_current` even when the array is empty. Need
|
||||
a small UI patch: hide the pose canvas when `model_loaded == false`.
|
||||
|
||||
## Closed
|
||||
|
||||
* **Honest `enhanced_*` fields** — both `enhanced_motion` and
|
||||
`enhanced_breathing` now carry a uniform `n_aps_used: u8` field
|
||||
alongside the legacy `contributing_bssids` / `bssid_count`
|
||||
counts. Consumers can gate on `n_aps_used >= 2` before trusting a
|
||||
multi-AP enhancement. (commit 598a4b2f)
|
||||
* **Real signal_field via multistatic fusion** — shipped in ADR-112.
|
||||
When ≥ 2 ESP32 nodes are active, `MultistaticFuser` output drives
|
||||
a coverage × activity 20×20 heatmap (isotropic Gaussian per node
|
||||
position, gated by `cv²(fused_amplitude) × cross_node_coherence`).
|
||||
Single-sensor / fusion-fail paths still return ADR-105's zero
|
||||
grid. Map is honestly framed as coverage, not target position.
|
||||
|
||||
## References
|
||||
|
||||
* ADR-101 — classifier (only emits real-derived `motion_level`).
|
||||
* ADR-103 — persistent baseline (only emits real-derived
|
||||
baseline/threshold).
|
||||
* [`docs/references/espectre-gap-analysis.md`](../references/espectre-gap-analysis.md)
|
||||
— separate item list for what would replace each of the now-empty
|
||||
outputs with real data.
|
||||
|
|
@ -0,0 +1,161 @@
|
|||
# ADR-106 — Full Complex CSI in WS + Managed-Ping Keepalive
|
||||
|
||||
**Status**: Accepted
|
||||
**Date**: 2026-05-17
|
||||
**Scope**: `v2/crates/wifi-densepose-sensing-server/src/main.rs`
|
||||
(`NodeInfo` struct, `NodeState`, `udp_receiver_task`,
|
||||
`csi_keepalive_task`, CLI `--csi-keepalive-pps`).
|
||||
|
||||
## Context
|
||||
|
||||
The operator's instruction: *"work without a model for now, but make
|
||||
sure the sensors give us everything described in the parent repo so
|
||||
the future model — and fine-motion detection right now — has full
|
||||
signal."* Two gaps stood between the live deployment and that goal:
|
||||
|
||||
1. **WS NodeInfo carried only amplitude.** The 56-bin per-subcarrier
|
||||
`amplitude` vector was exposed, but the equally-important
|
||||
`phases` vector (radians, `atan2(Q, I)`) was parsed by
|
||||
`parse_esp32_frame` and then silently dropped. Vital-signs FFT on
|
||||
phase, MERIDIAN-style hardware normalization, and any future
|
||||
DensePose-class model expect the full complex `H[k] = A_k · e^{jφ_k}`.
|
||||
2. **Raw CSI rate depended on an ad-hoc shell `ping`.** With nothing
|
||||
sending unicast traffic to the sensors, beacon-only rate dropped
|
||||
to ~0.3 fps — too slow even for breathing-band FFT. The operator
|
||||
was running `ping -i 0.05 192.168.0.101 &` by hand; if the Mac switched
|
||||
networks, the ping died.
|
||||
|
||||
## Decisions
|
||||
|
||||
### D1 — Expose phases + noise_floor + n_antennas + µs timestamp in `NodeInfo`
|
||||
|
||||
Four new fields, each tagged with a `#[serde(skip_serializing_if = ...)]`
|
||||
predicate (is-empty or `is_zero_*`) so `feature_state` ticks (no raw CSI) stay slim:
|
||||
|
||||
```rust
|
||||
phases: Vec<f64>, // atan2(Q, I), radians
|
||||
n_antennas: u8, // RX antenna count
|
||||
noise_floor_dbm: i8, // RX noise floor
|
||||
timestamp_us: u64, // sensor-side µs timestamp
|
||||
```
|
||||
|
||||
This is the same data we already parse out of `0xC511_0001` frames
|
||||
in `parse_esp32_frame`; previously we threw `phases` away and never
|
||||
even surfaced `noise_floor` to the WS envelope. Consumers
|
||||
reconstruct the complex CSI with `H[k] = amplitude[k] · (cos(phases[k]) + j·sin(phases[k]))`.
|
||||
|
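A minimal consumer-side sketch of the reconstruction formula above. The function name and the `(re, im)` tuple representation are illustrative assumptions, not part of the server code; only the formula itself comes from this ADR.

```rust
/// One complex CSI sample stored as (real, imaginary).
/// Hypothetical helper type for this sketch only.
type Complex = (f64, f64);

/// Rebuild H[k] = A_k * (cos(phi_k) + j*sin(phi_k)) from the two parallel
/// NodeInfo vectors. Returns None when the vectors differ in length, for
/// example on a tick that carried amplitude but no phases.
fn reconstruct_csi(amplitude: &[f64], phases: &[f64]) -> Option<Vec<Complex>> {
    if amplitude.len() != phases.len() {
        return None;
    }
    Some(
        amplitude
            .iter()
            .zip(phases)
            .map(|(a, phi)| (a * phi.cos(), a * phi.sin()))
            .collect(),
    )
}
```

Returning `None` on a length mismatch keeps the caller from silently pairing a fresh amplitude vector with a stale or suppressed phase vector.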
||||
### D2 — Per-node stash on `NodeState`
|
||||
|
||||
`NodeState` gains four new fields:
|
||||
`latest_phases: Option<Vec<f64>>`, `latest_noise_floor: i8`,
|
||||
`latest_timestamp_us: u64`, `latest_n_antennas: u8`. Populated on
|
||||
every raw-CSI frame in the second raw-CSI path
|
||||
(`udp_receiver_task` → raw CSI branch). `build_node_features` and
|
||||
the raw-CSI SensingUpdate builder both read from this stash to
|
||||
populate the new `NodeInfo` fields uniformly. Avoids carrying a
|
||||
full per-subcarrier phase history buffer — we only need the most
|
||||
recent vector for the UI / classifier; FFT consumers can build their
|
||||
own window.
|
||||
|
||||
### D3 — Built-in keepalive via managed `ping` children
|
||||
|
||||
`csi_keepalive_task` async task:
|
||||
|
||||
1. Watches `NODE_ADDRS` (per-node sender address, populated on every
|
||||
recv_from via a cheap magic-byte peek).
|
||||
2. For each known node, spawns one `ping -i <interval> <ip>` child
|
||||
process (`/sbin/ping` on macOS, `/usr/bin/ping` on Linux).
|
||||
3. Re-spawns the child if it dies or if the sensor's IP changes
|
||||
(DHCP rotation).
|
||||
4. Default rate `--csi-keepalive-pps 25` → `-i 0.040` for `ping`.
|
||||
`--csi-keepalive-pps 0` disables.
|
||||
|
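The mapping in step 4 from `--csi-keepalive-pps` to the `ping -i` argument can be sketched as follows. The function name is hypothetical; the three-decimal formatting is an assumption chosen to match the `0.040` shown in the boot log.

```rust
/// Map the packets-per-second setting to the value passed to `ping -i`.
/// Returns None for 0, which this ADR defines as "keepalive disabled".
/// Hypothetical helper; the real task may format the interval differently.
fn keepalive_interval_arg(pps: u32) -> Option<String> {
    if pps == 0 {
        return None;
    }
    let interval_sec = 1.0 / f64::from(pps);
    // Three decimals: 25 pps becomes "0.040".
    Some(format!("{:.3}", interval_sec))
}
```

With three decimals the interval resolution is 1 ms, so this sketch is only meaningful for rates well below 1000 pps.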
||||
### D4 — Why ICMP, not UDP
|
||||
|
||||
We first tried a UDP-based keepalive (`sock.send_to(&[0], src_addr)`
|
||||
to the sensor's ephemeral source port). On the operator's deployment
|
||||
(ESP32-S3 + TP-Link WISP) it did **not** drive raw CSI: the sensor's
|
||||
UDP stack rejected the closed-port packet before the CSI callback
|
||||
fired in the WiFi RX path. ICMP echo bypasses user-space port logic
|
||||
entirely — kernel WiFi RX handles it and the CSI callback fires
|
||||
regardless of any listener.
|
||||
|
||||
Trade-off accepted: shelling out to `/sbin/ping` is
|
||||
platform-specific. Linux containers must include `iputils-ping`; macOS has
|
||||
`/sbin/ping` built-in. We probe both paths at startup. A pure-Rust
|
||||
raw-socket ICMP would avoid the dependency but needs root /
|
||||
`CAP_NET_RAW`.
|
||||
|
||||
## Files Touched
|
||||
|
||||
```
|
||||
v2/crates/wifi-densepose-sensing-server/src/main.rs
|
||||
- struct NodeInfo (+4 fields, helpers is_zero_*)
|
||||
- struct NodeState (+4 latest_* fields)
|
||||
- static NODE_ADDRS (per-node source address map)
|
||||
- fn csi_keepalive_task (managed ping pool)
|
||||
- udp_receiver_task (NODE_ADDRS populate via magic peek)
|
||||
- all NodeInfo {...} sites (5 — populate new fields)
|
||||
- Args { csi_keepalive_pps } (CLI flag, default 25)
|
||||
docs/adr/ADR-106-full-complex-csi-keepalive.md (this)
|
||||
```
|
||||
|
||||
Two implementation commits on the branch:
|
||||
|
||||
* `4daa2c9b` — D1 + D2 (WS struct, per-node stash, NodeInfo builders)
|
||||
* `8489efe9` — D3 + D4 (keepalive task, NODE_ADDRS, CLI flag)
|
||||
|
||||
## Verified Acceptance
|
||||
|
||||
Live, server fresh-restart, no shell `ping` running:
|
||||
|
||||
```
|
||||
boot: CSI keepalive: 25 ICMP pkt/s/node (interval 0.040s)
|
||||
boot: keepalive: learned address for node 1 = 192.168.0.101:60492
|
||||
boot: keepalive: learned address for node 2 = 192.168.0.100:51664
|
||||
+2 s: keepalive: ping -i 0.040 192.168.0.101 for node 1
|
||||
+2 s: keepalive: ping -i 0.040 192.168.0.100 for node 2
|
||||
|
||||
WS sample (5 s):
|
||||
node 1: 67.6 Hz updates, 55.6 Hz amp-bearing raw CSI
|
||||
node 2: 67.6 Hz updates, 55.6 Hz amp-bearing raw CSI
|
||||
```
|
||||
|
||||
NodeInfo per node now carries `amplitude[56]`, `phases[56]`,
|
||||
`rssi_dbm`, `noise_floor_dbm=-91`, `n_antennas=1`, plus the
|
||||
empty/zero-suppressed `timestamp_us` (FW doesn't yet emit it —
|
||||
left as a 0 placeholder).
|
||||
|
||||
A sampling rate of ~55 Hz (Nyquist limit ≈ 27.5 Hz) comfortably covers the
|
||||
breathing band (0.1–0.5 Hz) and the heart-rate band (0.8–2 Hz) for FFT.
|
||||
With the phase vector now on the wire, those FFTs can run on phase as well
|
||||
as amplitude; phase is more sensitive than amplitude to micrometre-scale chest-wall motion.
|
||||
|
||||
## Out of scope / open
|
||||
|
||||
* ✅ **FW-side µs timestamp** — closed in commit `b787f40a`. FW now
|
||||
appends `info->rx_ctrl.timestamp` (u32 LE) as 4 trailing bytes
|
||||
after I/Q data; server parses opportunistically (None for older
|
||||
FW). NodeInfo.timestamp_us now carries sensor monotonic µs when
|
||||
available, falls back to server SystemTime otherwise.
|
||||
* **Per-frame antenna selection** when ESP32-S3 reports >1 antenna —
|
||||
current FW hard-codes `n_antennas=1` in `csi_collector.c`.
|
||||
Single-antenna deployments are unaffected.
|
||||
* **TP-Link queue limits** — at 55 Hz × 2 nodes = 110 raw frames/s,
|
||||
plus 25 pings/s × 2 = 50 ICMP/s, all going through one
|
||||
consumer-grade AP. Watching for saturation. Reduce `--csi-keepalive-pps` if
|
||||
the AP starts dropping.
|
||||
* **Channel hopping** (ADR-029) would give frequency diversity.
|
||||
Single-channel works fine for one room.
|
||||
|
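The opportunistic timestamp parse described in the first item above can be sketched like this. The function name and the `iq_end` offset parameter are assumptions for illustration; the real `parse_esp32_frame` presumably derives the offset from the frame header, which this ADR does not show.

```rust
/// Read an optional 4-byte little-endian timestamp that follows the I/Q
/// payload. `iq_end` is the byte offset where I/Q data stops (hypothetical
/// parameter). Returns None for older FW frames that end at `iq_end` or
/// carry fewer than 4 trailing bytes.
fn parse_trailing_timestamp_us(frame: &[u8], iq_end: usize) -> Option<u32> {
    let tail = frame.get(iq_end..iq_end.checked_add(4)?)?;
    Some(u32::from_le_bytes([tail[0], tail[1], tail[2], tail[3]]))
}
```

Using `slice::get` with a range means a short or legacy frame yields `None` instead of a panic, which matches the "None for older FW" behaviour described above.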
||||
## References
|
||||
|
||||
* ADR-100 — gain lock (the stability baseline keepalive needs).
|
||||
* ADR-101 — classifier (consumes per-node amplitudes only; a future
|
||||
micro-motion detector will pull phase too).
|
||||
* ADR-103 — persistent baseline (loaded at server boot, unaffected
|
||||
by keepalive rate).
|
||||
* ADR-105 — no synthetic data (this ADR adds *more* real data, not
|
||||
more synthetic).
|
||||
* [`docs/references/espectre-gap-analysis.md`](../references/espectre-gap-analysis.md)
|
||||
— phase-aware processing is a prerequisite for several open items.
|
||||
|
|
@ -0,0 +1,186 @@
|
|||
# ADR-107 — REST Baseline Calibration + Auto-Recalibrate
|
||||
|
||||
**Status**: Accepted
|
||||
**Date**: 2026-05-17
|
||||
**Scope**: `v2/crates/wifi-densepose-sensing-server/src/main.rs`
|
||||
(`baseline_get`, `baseline_calibrate`, `auto_recalibrate_task`,
|
||||
`capture_baseline_to_disk`, `BASELINE_BUS`), `static/raw.html`
|
||||
(`calibrate empty` button), CLI flags
|
||||
`--auto-recalibrate-quiet-sec` / `--auto-recalibrate-min-age-sec`.
|
||||
|
||||
## Context
|
||||
|
||||
ADR-103 introduced a persistent empty-room baseline at
|
||||
`data/baseline.json` so the classifier no longer needed a 60 s warm-up
|
||||
after every server restart. To refresh it the operator had to:
|
||||
|
||||
1. Step out of the room.
|
||||
2. SSH / open a terminal, run `python scripts/record-baseline.py
|
||||
--duration 90`.
|
||||
3. Wait for the "saved" message.
|
||||
4. Restart the sensing-server (so it reloads the file).
|
||||
5. Walk back in.
|
||||
|
||||
Steps 2 and 4 are friction. The operator asked to remove them so that a
|
||||
fresh deployment that just needs to monitor a room doesn't need a CLI
|
||||
or a restart. Two changes:
|
||||
|
||||
* **`POST /api/v1/baseline/calibrate`** — fires the same record-and-
|
||||
trim pipeline from inside the server, hot-reloads the override map
|
||||
on success. UI button in `raw.html` triggers it.
|
||||
* **Auto-recalibrate background task** — silently refreshes the
|
||||
baseline when the classifier reports `absent` and CV stays low for
|
||||
a long-enough window, without any operator action.
|
||||
|
||||
## Decisions
|
||||
|
||||
### D1 — `capture_baseline_to_disk` in-process
|
||||
|
||||
Pure-Rust port of `scripts/record-baseline.py`:
|
||||
|
||||
1. Subscribe to `BASELINE_BUS` (a `tokio::sync::broadcast::Sender<String>`
|
||||
that mirrors every WS JSON message published by the broadcaster).
|
||||
2. Collect `duration_sec` of per-node `(t, amplitudes, rssi)`.
|
||||
3. Trim `trim_sec` from head and tail.
|
||||
4. Slide `clean_window_sec` window across, pick lowest-CV chunk per
|
||||
node.
|
||||
5. Compute FULL-broadband mean/p50/p95/std/CV% (same schema as
|
||||
ADR-103 v2; reload uses the same `load_baseline_file`).
|
||||
6. Write `data/baseline.json` (configurable via JSON body `out`).
|
||||
7. Call `load_baseline_file(path)` to hot-reload `AMP_BASELINE_OVERRIDE`
|
||||
and `AMP_BASELINE_CV`.
|
||||
|
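Step 4 (slide a window, keep the lowest-CV chunk) can be sketched as below. This is a simplified illustration, not the port itself: it assumes one scalar amplitude per frame (for example the broadband mean) and time-sorted samples, and the function names are hypothetical.

```rust
/// Coefficient of variation (std / mean); None if empty or the mean is 0.
fn cv(values: &[f64]) -> Option<f64> {
    if values.is_empty() {
        return None;
    }
    let n = values.len() as f64;
    let mean = values.iter().sum::<f64>() / n;
    if mean == 0.0 {
        return None;
    }
    let var = values.iter().map(|v| (v - mean).powi(2)).sum::<f64>() / n;
    Some(var.sqrt() / mean.abs())
}

/// Slide a `window_sec` window over time-sorted `(t_sec, amplitude)` samples
/// and return `(start_index, end_index_exclusive, cv)` of the quietest one.
fn quietest_window(samples: &[(f64, f64)], window_sec: f64) -> Option<(usize, usize, f64)> {
    let t_last = samples.last()?.0;
    let mut best: Option<(usize, usize, f64)> = None;
    for start in 0..samples.len() {
        let t0 = samples[start].0;
        // Stop once the recording no longer covers a full window.
        if t_last - t0 < window_sec {
            break;
        }
        // Samples whose timestamp falls inside [t0, t0 + window_sec).
        let len = samples[start..]
            .iter()
            .take_while(|s| s.0 < t0 + window_sec)
            .count();
        let vals: Vec<f64> = samples[start..start + len].iter().map(|s| s.1).collect();
        if let Some(c) = cv(&vals) {
            if best.map_or(true, |(_, _, b)| c < b) {
                best = Some((start, start + len, c));
            }
        }
    }
    best
}
```

The scan is quadratic in the number of samples, which is acceptable for a 90 s capture at tens of frames per second; a running-sum variant would be the obvious optimisation if the capture window grew.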
||||
### D2 — `BASELINE_BUS` broadcast forwarder
|
||||
|
||||
Decouples baseline capture from individual WS clients. A small task
|
||||
spawned at startup subscribes to `AppState.tx` and re-publishes every
|
||||
message into `BASELINE_BUS`. Capture subscribers don't need a WS
|
||||
connection or any external network path.
|
||||
|
||||
### D3 — `POST /api/v1/baseline/calibrate`
|
||||
|
||||
Optional JSON body: `{ duration_sec, trim_sec, clean_window_sec, out }`.
|
||||
Defaults: 90 / 15 / 30 s and `data/baseline.json`. Returns immediately
|
||||
with `{ "started": true, "hint": "..." }`. Subsequent calls while a
|
||||
job is running return `{ "started": false, "reason": "calibration
|
||||
already running" }`.
|
||||
|
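The "already running" rejection can be sketched with a single atomic flag. This is an assumption about the mechanism: the ADR tracks state in `BASELINE_CALIBRATION_STATUS`, whose exact type is not shown here, so the static and function names below are hypothetical.

```rust
use std::sync::atomic::{AtomicBool, Ordering};

/// Hypothetical "calibration in progress" flag for this sketch only.
static CALIBRATION_RUNNING: AtomicBool = AtomicBool::new(false);

/// Try to claim the single calibration slot. Returns true for exactly one
/// caller until `finish_calibration` releases the slot again.
fn try_start_calibration() -> bool {
    CALIBRATION_RUNNING
        .compare_exchange(false, true, Ordering::AcqRel, Ordering::Acquire)
        .is_ok()
}

/// Release the slot when the capture completes or fails.
fn finish_calibration() {
    CALIBRATION_RUNNING.store(false, Ordering::Release);
}
```

A compare-and-swap avoids the check-then-set race that two near-simultaneous POSTs could otherwise hit; the handler would map `false` to the `"started": false` response.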
||||
### D4 — `GET /api/v1/baseline`
|
||||
|
||||
```json
|
||||
{
|
||||
"nodes": { "1": {"full_broadband_p95": …, "full_broadband_cv_pct": …}, … },
|
||||
"last_written_sec_ago": <i64>,
|
||||
"calibration_status": "idle" | "running" | "running (auto)"
|
||||
| "complete" | "complete (auto)" | "error: …"
|
||||
}
|
||||
```
|
||||
|
||||
UI polls this every 2 s while a calibration is running to drive the
|
||||
button state machine.
|
||||
|
||||
### D5 — Auto-recalibrate background task
|
||||
|
||||
Wakes every 5 s. State machine:
|
||||
|
||||
* Read latest `classification.motion_level` and `confidence` (=CV).
|
||||
* `quiet = (motion_level == "absent") && (cv < 0.08)`.
|
||||
* If `quiet` is true continuously for `--auto-recalibrate-quiet-sec`
|
||||
(default 1800 = 30 min) **AND** the last baseline write is older than
|
||||
`--auto-recalibrate-min-age-sec` (default 3600 = 1 h), kick off
|
||||
`capture_baseline_to_disk(90, 5, 45, "data/baseline.json")` in the
|
||||
background.
|
||||
* On error, log + set `calibration_status` so the UI surfaces it.
|
||||
|
||||
The 30-minute / 1-hour defaults are conservative: a person briefly
|
||||
walking through doesn't reset the baseline; long-term drift from
|
||||
WiFi reconfiguration or furniture rearrangement does.
|
||||
`--auto-recalibrate-quiet-sec 0` disables the task entirely.
|
||||
|
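The state machine above can be sketched as a small tracker that is fed once per 5 s tick. The struct and method names are hypothetical, and resetting the quiet timer after a trigger is an assumption of this sketch (the ADR does not say how repeat triggers are suppressed while a capture is running).

```rust
/// Tracks how long the room has been continuously quiet.
struct QuietTracker {
    quiet_since_sec: Option<f64>,
    quiet_required_sec: f64, // --auto-recalibrate-quiet-sec
    min_age_sec: f64,        // --auto-recalibrate-min-age-sec
}

impl QuietTracker {
    fn new(quiet_required_sec: f64, min_age_sec: f64) -> Self {
        QuietTracker { quiet_since_sec: None, quiet_required_sec, min_age_sec }
    }

    /// Feed one tick. Returns true when a recalibration should start.
    fn tick(&mut self, now_sec: f64, motion_level: &str, cv: f64, baseline_age_sec: f64) -> bool {
        // A quiet window of 0 disables the task.
        if self.quiet_required_sec <= 0.0 {
            return false;
        }
        let quiet = motion_level == "absent" && cv < 0.08;
        if !quiet {
            // Any motion or high CV restarts the quiet timer.
            self.quiet_since_sec = None;
            return false;
        }
        let since = *self.quiet_since_sec.get_or_insert(now_sec);
        let fire = now_sec - since >= self.quiet_required_sec
            && baseline_age_sec > self.min_age_sec;
        if fire {
            // Assumption: require a fresh quiet run before the next capture.
            self.quiet_since_sec = None;
        }
        fire
    }
}
```

With the defaults (1800 s quiet, 3600 s minimum age), a single `present_moving` tick anywhere in the half hour pushes the trigger back by the full 30 minutes, which is the conservative behaviour described above.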
||||
### D6 — `raw.html` button
|
||||
|
||||
`calibrate empty` next to the existing `reset` button. Click →
|
||||
`confirm()` reminds operator to step out → POSTs the endpoint → polls
|
||||
status every 2 s, updating the inline pill `recording… 12/90 s` →
|
||||
`baseline updated ✓` on success. Disables itself while running.
|
||||
|
||||
## Files Touched
|
||||
|
||||
```
|
||||
v2/crates/wifi-densepose-sensing-server/src/main.rs
|
||||
- statics: BASELINE_LAST_WRITTEN, BASELINE_CALIBRATION_STATUS, BASELINE_BUS
|
||||
- fn capture_baseline_to_disk (D1)
|
||||
- fn auto_recalibrate_task (D5)
|
||||
- fn baseline_get (D4)
|
||||
- fn baseline_calibrate (D3)
|
||||
- routes /api/v1/baseline + /api/v1/baseline/calibrate
|
||||
- Args { auto_recalibrate_quiet_sec, auto_recalibrate_min_age_sec }
|
||||
- main(): bus init + auto-recalibrate spawn
|
||||
v2/crates/wifi-densepose-sensing-server/static/raw.html
|
||||
- <button id="calibrateBtn"> (D6)
|
||||
- <span id="calibStatus" class="pill"> (D6)
|
||||
- JS: startCalibrate(), polling loop
|
||||
docs/adr/ADR-107-auto-recalibrate-and-rest-baseline.md (this)
|
||||
```
|
||||
|
||||
One impl commit so far: `0f373467`. UI button + ADR are in this
|
||||
follow-up.
|
||||
|
||||
## Verified Acceptance
|
||||
|
||||
Boot log shows the new task wired:
|
||||
|
||||
```
|
||||
baseline: loaded 2 node overrides from data/baseline.json
|
||||
(node1=27.04, node2=14.72; node1_cv=2.62%, node2_cv=3.65%)
|
||||
Auto-recalibrate enabled: trigger after 1800s of `absent`+low-CV,
|
||||
min 3600s between writes
|
||||
CSI keepalive: 25 ICMP pkt/s/node (interval 0.040s)
|
||||
```
|
||||
|
||||
REST endpoints live:
|
||||
|
||||
```
|
||||
GET /api/v1/baseline → current state + last_written_sec_ago
|
||||
POST /api/v1/baseline/calibrate → { "started": true }
|
||||
```
|
||||
|
||||
End-to-end smoke test (5 s capture window for speed):
|
||||
|
||||
```
|
||||
POST → { started: true, duration_sec: 5 }
|
||||
… 8 s elapsed …
|
||||
GET → { calibration_status: "complete", last_written_sec_ago: 13 }
|
||||
file: /tmp/test_baseline.json contains n_samples=86 per node + full_broadband_*
|
||||
```
|
||||
|
||||
The hot-reload was visible immediately: `GET /api/v1/baseline.nodes`
|
||||
showed the new (capture-window) values before any server restart.
|
||||
|
||||
## Out of scope / open
|
||||
|
||||
* **UI: progress bar instead of pill text** — current state shows
|
||||
textual `recording… 12/90 s`. Could be a thin progress bar.
|
||||
* **Multiple baseline profiles** — only one `data/baseline.json` per
|
||||
server. Future: name-scoped baselines for different deployment
|
||||
contexts (day / night, summer / winter).
|
||||
* **Quiet detection that uses CV alone** — currently AND-gated with
|
||||
`motion_level == "absent"` which itself depends on the loaded
|
||||
baseline. Risk: if the loaded baseline is *bad*, classifier may
|
||||
never report `absent`, auto-recalibrate never fires. Mitigation:
|
||||
REST endpoint stays available; first call out of the box is always
|
||||
manual via the UI button.
|
||||
|
||||
## References
|
||||
|
||||
* ADR-100 — gain lock (the prerequisite that makes baseline meaningful).
|
||||
* ADR-101 — classifier whose `motion_level`/`confidence` drives the
|
||||
quiet-detector.
|
||||
* ADR-103 — persistent baseline file (this ADR adds two ways to
|
||||
refresh it).
|
||||
* ADR-105 — no synthetic data (auto-recalibrate is *real* data, not
|
||||
synthesized — it just runs without operator intervention).
|
||||
* ADR-106 — keepalive (ensures the capture window has enough raw CSI
|
||||
frames to give a meaningful percentile).
|
||||
* [`scripts/record-baseline.py`](../../scripts/record-baseline.py)
|
||||
— original CLI workflow, kept for headless use.
|
||||
|
|
@ -0,0 +1,177 @@
|
|||
# ADR-108 — FW NVS Persistence of Gain-Lock Values
|
||||
|
||||
**Status**: Accepted
|
||||
**Date**: 2026-05-17
|
||||
**Scope**: `firmware/esp32-csi-node/main/csi_collector.c`
|
||||
(`rv_gain_load_from_nvs`, `rv_gain_save_to_nvs`, NVS hook in
|
||||
`rv_gain_lock_process`).
|
||||
|
||||
## Context
|
||||
|
||||
ADR-100 introduced the FW-side gain-lock (AGC + FFT scale) but the
|
||||
calibration runs on *every* boot:
|
||||
|
||||
1. Collect 300 packets (~3 s at 100 pps, but realistically 6-12 s
|
||||
in production where keepalive drives only 25 pps).
|
||||
2. Take the median of AGC and FFT samples.
|
||||
3. Call `phy_force_rx_gain` / `phy_fft_scale_force` to freeze.
|
||||
|
||||
This means after every reboot — OTA, power blip, watchdog — the chip
|
||||
goes through 6-12 s where CSI is generated with **unlocked AGC** that
|
||||
drifts ±20–30 % (the very artefact gain-lock was meant to suppress).
|
||||
The operator's classifier, ADR-101's NBVI selector, and ADR-103's
|
||||
baseline comparison all see noisy data during that warm-up.
|
||||
|
||||
Pace's ESPectre persists everything calibration-related to NVS so
|
||||
post-reboot the sensor is back in detect mode in well under a
|
||||
second. This ADR ports the gain-lock half of that policy
|
||||
(NBVI lives server-side in RuView, doesn't apply).
|
||||
|
||||
## Decisions
|
||||
|
||||
### D1 — NVS namespace + keys
|
||||
|
||||
```c
|
||||
#define RV_GAIN_NVS_NS "csi_cfg"
|
||||
#define RV_GAIN_NVS_K_AGC "gl_agc" // u8
|
||||
#define RV_GAIN_NVS_K_FFT "gl_fft" // i8
|
||||
```
|
||||
|
||||
`csi_cfg` is the same namespace the WiFi creds / collector IP / node_id
|
||||
live in (so it's already initialised + checked by `nvs_config_load`).
|
||||
Two single-byte values — minimal NVS footprint.
|
||||
|
||||
### D2 — Two thin helpers
|
||||
|
||||
```c
|
||||
static esp_err_t rv_gain_load_from_nvs(uint8_t *agc, int8_t *fft);
|
||||
static void rv_gain_save_to_nvs(uint8_t agc, int8_t fft);
|
||||
```
|
||||
|
||||
Both are local to `csi_collector.c`. Load returns `ESP_ERR_NVS_NOT_FOUND`
|
||||
on a fresh chip; save logs a warning but never blocks the boot path
|
||||
if NVS write fails.
|
||||
|
||||
### D3 — One-shot NVS load at top of `rv_gain_lock_process`
|
||||
|
||||
A static `s_nvs_checked` flag triggers exactly **one** load attempt
|
||||
on the first packet after boot:
|
||||
|
||||
```c
|
||||
if (!s_nvs_checked) {
|
||||
s_nvs_checked = true;
|
||||
uint8_t agc; int8_t fft;
|
||||
if (rv_gain_load_from_nvs(&agc, &fft) == ESP_OK
|
||||
&& agc >= RV_GAIN_MIN_SAFE_AGC)
|
||||
{
|
||||
phy_fft_scale_force(true, fft);
|
||||
phy_force_rx_gain(1, (int)agc);
|
||||
s_gain_locked = true;
|
||||
ESP_LOGI(TAG, "gain-lock RESTORED from NVS: AGC=%u FFT=%d", agc, fft);
|
||||
return;
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
The `agc >= RV_GAIN_MIN_SAFE_AGC` guard preserves ADR-100's "skip if
|
||||
signal too strong" safety: a stale low-AGC value that would freeze
|
||||
the RX path is rejected even if it's in NVS.
|
||||
|
||||
### D4 — Save after every successful lock
|
||||
|
||||
The existing `phy_*_force` branch in `rv_gain_lock_process` is now followed
|
||||
by a save call:
|
||||
|
||||
```c
|
||||
phy_fft_scale_force(true, s_gain_fft_value);
|
||||
phy_force_rx_gain(1, (int)s_gain_agc_value);
|
||||
rv_gain_save_to_nvs(s_gain_agc_value, s_gain_fft_value);
|
||||
ESP_LOGI(TAG, "gain-lock PERSISTED to NVS (%s/%s, %s)",
|
||||
RV_GAIN_NVS_NS, RV_GAIN_NVS_K_AGC, RV_GAIN_NVS_K_FFT);
|
||||
```
|
||||
|
||||
So the first boot ever does the full 300-packet calibration **and**
|
||||
saves; every subsequent boot loads instantly from D3.
|
||||
|
||||
### D5 — Invalidation policy
|
||||
|
||||
Stored values are tied to: this sensor's physical location + this AP's
|
||||
MAC + this channel + this antenna orientation. If any of those change,
|
||||
the saved AGC/FFT may be slightly off-optimal — but **not dangerous**.
|
||||
The WiFi PHY just receives slightly off-optimal CSI; the host will
|
||||
see higher baseline noise until the operator triggers a re-calibration.
|
||||
|
||||
Today: erase via `idf.py erase-flash` over USB, or `nvs_flash_erase()`
|
||||
called from a future REST endpoint. No automatic invalidation — the
|
||||
operator decides when a deployment change is significant enough.
|
||||
|
||||
## Files Touched
|
||||
|
||||
```
|
||||
firmware/esp32-csi-node/main/csi_collector.c
|
||||
- #include "nvs.h" / "nvs_flash.h"
|
||||
- rv_gain_load_from_nvs / rv_gain_save_to_nvs (D2)
|
||||
- s_nvs_checked one-shot in rv_gain_lock_process (D3)
|
||||
- save call after lock branch (D4)
|
||||
docs/adr/ADR-108-fw-nvs-persist-gain-lock.md (this)
|
||||
```
|
||||
|
||||
Implementation commit: `3779bb76`. Flashed to both sensors via OTA
|
||||
(no USB) — `scripts/ota-deploy.sh`.
|
||||
|
||||
## Verified Acceptance
|
||||
|
||||
Test sequence:
|
||||
|
||||
1. OTA flash new FW to both nodes (first boot, NVS empty).
|
||||
2. Wait 15 s for FW to complete first calibration + write to NVS.
|
||||
3. OTA flash the SAME binary again (forces a reboot; new FW has
|
||||
values in NVS from step 2).
|
||||
4. Sample WS amplitude rate in the first 3 s after the second boot.
|
||||
|
||||
Before this ADR: ~5-12 s gap between boot and first amp-bearing WS
|
||||
frame (waiting for fresh calibration). After this ADR: WS shows
|
||||
**44 Hz raw CSI in the first 3 s** — instant resume.
|
||||
|
||||
Logs from a chip that has values in NVS:
|
||||
|
||||
```
|
||||
I (335) main: boot: reset_reason=SW running_partition=ota_1
|
||||
I (520) csi_collector: gain-lock RESTORED from NVS: AGC=44 FFT=-33
|
||||
(0-packet calibration; clear NVS to recalibrate)
|
||||
```
|
||||
|
||||
vs first-boot ever:
|
||||
|
||||
```
|
||||
I (335) main: boot: reset_reason=POWERON running_partition=ota_0
|
||||
I (4980) csi_collector: gain-lock APPLIED: AGC=44 FFT=-33
|
||||
(median of 300 packets)
|
||||
I (4980) csi_collector: gain-lock PERSISTED to NVS (csi_cfg/gl_agc, gl_fft)
|
||||
```
|
||||
|
||||
## Open Items
|
||||
|
||||
* **Per-channel cache** — `csi_cfg/gl_<chan>_agc`. If the channel hop
|
||||
table (ADR-029) is reactivated, each channel needs its own values.
|
||||
~1 h FW. Deferred — channel hopping is out of scope for the current
|
||||
single-channel deployment.
|
||||
|
||||
## Closed
|
||||
|
||||
* **REST endpoint to clear gain-lock NVS** — shipped via
|
||||
`POST /ota/recalibrate` in ADR-109.
|
||||
* **Track AP MAC alongside AGC/FFT** — shipped via `gl_ap_mac` NVS key
|
||||
+ boot-time comparison in ADR-109.
|
||||
|
||||
## References
|
||||
|
||||
* ADR-100 — gain-lock implementation that this ADR persists.
|
||||
* ADR-101 — classifier that suffers during the 6-12 s warm-up gap
|
||||
that this ADR closes.
|
||||
* `docs/references/ota-pipeline.md` — the WiFi flash flow used to
|
||||
deploy this FW change without USB.
|
||||
* Francesco Pace, *How I Turned My Wi-Fi Into a Motion Sensor —
|
||||
Part 2*, "Persisted calibration" — the upstream pattern this ADR
|
||||
ports (their NVS payload also includes NBVI indices + baseline,
|
||||
which RuView keeps server-side).
|
||||
|
|
@ -0,0 +1,145 @@
|
|||
# ADR-109 — FW Gain-Lock Invalidation (REST trigger + AP-MAC binding)
|
||||
|
||||
**Status**: Accepted
|
||||
**Date**: 2026-05-17
|
||||
**Scope**: `firmware/esp32-csi-node/main/ota_update.c`,
|
||||
`firmware/esp32-csi-node/main/csi_collector.c`. Closes both Open Items in
|
||||
ADR-108.
|
||||
|
||||
## Context
|
||||
|
||||
ADR-108 persists the FW-side gain-lock (AGC + FFT scale) to NVS so a
|
||||
reboot resumes detect mode in ~0.5 s. Two follow-ups remained:
|
||||
|
||||
1. **No way to clear the cache without USB.** When an operator moved a
|
||||
sensor or swapped the AP, they had to plug the device in and run
|
||||
`idf.py erase-flash` to force a re-calibration. Defeats the whole
|
||||
point of OTA-only ops.
|
||||
2. **No automatic invalidation on AP swap.** Gain-lock is tied to a
|
||||
specific RF path (AP location, distance, multipath). Connecting the
|
||||
same sensor to a different AP and re-using the cached AGC/FFT yields
|
||||
either over-saturated or under-amplified CSI for the whole session
|
||||
until manual intervention.
|
||||
|
||||
## Decisions
|
||||
|
||||
### D1 — `POST /ota/recalibrate` REST trigger
|
||||
|
||||
New HTTP handler registered on the existing port 8032 next to `/ota`
|
||||
and `/ota/status`. Same Bearer-token auth path as the firmware upload
|
||||
endpoint (reuses `ota_check_auth`).
|
||||
|
||||
Behaviour:
|
||||
|
||||
1. Open NVS namespace `csi_cfg` RW.
|
||||
2. Erase three keys: `gl_agc`, `gl_fft`, `gl_ap_mac` (D2).
|
||||
3. `nvs_commit` + close.
|
||||
4. Send `200 OK {status:"ok"}` JSON.
|
||||
5. `vTaskDelay(1 s)` to flush the response, then `esp_restart()`.
|
||||
|
||||
Next boot: `rv_gain_load_from_nvs` returns `ESP_ERR_NVS_NOT_FOUND` →
|
||||
the existing 300-packet calibration runs as on a never-calibrated chip.
|
||||
|
||||
### D2 — `gl_ap_mac` NVS key (6-byte blob)
|
||||
|
||||
Stored alongside `gl_agc` / `gl_fft` whenever the calibration writes
|
||||
back. Source: `ap.bssid`, as filled in by `esp_wifi_sta_get_ap_info(&ap)`. Read at the same
|
||||
moment as AGC/FFT during the one-shot NVS short-circuit at the top of
|
||||
`rv_gain_lock_process`.
|
||||
|
||||
Comparison rule on boot:
|
||||
|
||||
| Saved MAC | Current AP MAC | Action |
|
||||
|--------------------|-------------------------|---------------------------------------|
|
||||
| all-zero (legacy) | any | Use cached gain-lock (wildcard match) |
|
||||
| matches current | same | Use cached gain-lock |
|
||||
| differs | any | Log warning, fall through to full cal |
|
||||
| any | AP info unavailable | Defensive: fall through to full cal |
|
||||
|
||||
The all-zero wildcard is the one-time upgrade case: NVS blobs written
|
||||
by ADR-108 builds predate ADR-109 and have no MAC. Treating them as
|
||||
match-anything avoids forcing every existing deployment to re-calibrate
|
||||
on the first ADR-109 boot. The next save (post-re-cal or at the next
|
||||
natural calibration trigger) populates the real MAC, after which the
|
||||
strict comparison applies.
|
||||
|
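The comparison table can be expressed as one decision function. The firmware itself is C; this is a language-neutral sketch in Rust with hypothetical names. It also assumes the "AP info unavailable" row is checked before the wildcard row, which the table does not specify.

```rust
#[derive(Debug, PartialEq)]
enum GainLockAction {
    UseCached,
    FullCalibration,
}

/// `saved` is the 6-byte blob from NVS (all-zero for legacy ADR-108 data);
/// `current` is None when the AP record could not be read.
fn decide_gain_lock(saved: [u8; 6], current: Option<[u8; 6]>) -> GainLockAction {
    match current {
        // Assumption: an unreadable AP record wins over the wildcard.
        None => GainLockAction::FullCalibration,
        // Legacy blob without a MAC: wildcard match.
        Some(_) if saved == [0u8; 6] => GainLockAction::UseCached,
        // Same AP as the one the calibration ran against.
        Some(cur) if cur == saved => GainLockAction::UseCached,
        // Different AP: the cached values belong to another RF path.
        Some(_) => GainLockAction::FullCalibration,
    }
}
```

Swapping the first two arms would give the other reading of the table, where a legacy blob is trusted even when the AP record is unavailable.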
||||
### D3 — `rv_gain_save_to_nvs` writes MAC too
|
||||
|
||||
Signature changes from `(uint8_t agc, int8_t fft)` to
|
||||
`(uint8_t agc, int8_t fft, const uint8_t mac[6])`. The caller reads
|
||||
`ap.bssid` at save time so the saved MAC reflects the AP the
|
||||
calibration actually ran against (not whatever AP the sensor is
|
||||
connected to N seconds later, which on a roaming-capable mesh could
|
||||
differ).
|
||||
|
||||
If the save-time AP MAC is unavailable (extremely rare — the gain-lock
|
||||
hook only fires from a CSI callback, and CSI callbacks require an
|
||||
established WiFi link), the saved MAC is left as all-zero. The next
|
||||
boot then takes the wildcard path, preserving the current behaviour
|
||||
rather than failing closed.
|
||||
|
||||
### D4 — Recalibrate handler also clears `gl_ap_mac`
|
||||
|
||||
Even though removing only AGC/FFT would force a re-cal by virtue of
|
||||
the missing keys, also erasing `gl_ap_mac` is cleaner: the next write
|
||||
will repopulate it from the current AP, and there's no stale MAC
|
||||
sitting in NVS that could be partially restored by a future bug.
|
||||
|
||||
## Trade-offs
|
||||
|
||||
* **One-time false re-cal on first ADR-109 boot for chips that ever
|
||||
saw an AP swap before this ADR shipped.** Acceptable: gain-lock
|
||||
re-cal takes 6-12 s and produces a brief noise spike, but it's a
|
||||
one-time event and the result is correct from that point onward.
|
||||
* **No multi-AP cache.** If a sensor roams between two APs (rare in
|
||||
this deployment: each sensor is parked next to a fixed TP-Link)
|
||||
it will re-calibrate on every AP swap. Multi-AP storage would need
|
||||
per-AP-MAC sub-keys (`gl_agc:<bssid>`, etc.) — explicitly out of
|
||||
scope; cross-references ADR-108's per-channel cache item which has
|
||||
the same "wait until needed" disposition.
|
||||
* **`gl_ap_mac` blob quadruples the NVS size of the gain-lock bundle from
|
||||
2 bytes to 8 bytes.** Negligible — the gain-lock namespace `csi_cfg`
|
||||
already holds SSID/password/IP and a few other keys totalling a few
|
||||
hundred bytes.
|
||||
|
||||
## Files Touched
|
||||
|
||||
```
|
||||
firmware/esp32-csi-node/main/ota_update.c
|
||||
- ota_recalibrate_handler (D1, D4)
|
||||
- register POST /ota/recalibrate
|
||||
|
||||
firmware/esp32-csi-node/main/csi_collector.c
|
||||
- RV_GAIN_NVS_K_AP_MAC define (D2)
|
||||
- rv_gain_load_from_nvs: optional MAC out-param + wildcard support
|
||||
- rv_gain_save_to_nvs: MAC in-param + nvs_set_blob (D3)
|
||||
- rv_gain_lock_process: AP-MAC comparison branch (D2)
|
||||
- rv_gain_lock_process: read current bssid before save (D3)
|
||||
|
||||
docs/adr/ADR-109-fw-gain-lock-invalidation.md (this)
|
||||
```
|
||||
|
||||
## Verified Acceptance
|
||||
|
||||
1. `idf.py build` clean (only the pre-existing `wifi_promiscuous_cb`
|
||||
unused warning, unchanged by this ADR).
|
||||
2. After OTA flash of both nodes:
|
||||
* `curl -X POST http://192.168.0.100:8032/ota/recalibrate`
|
||||
* `curl -X POST http://192.168.0.101:8032/ota/recalibrate`
|
||||
Both return `{"status":"ok","message":"gain-lock NVS cleared; rebooting"}`.
|
||||
3. Boot log on next reboot shows `gain-lock APPLIED:` (full cal) +
|
||||
`gain-lock PERSISTED to NVS (AGC=N FFT=M AP=…)` instead of the
|
||||
`gain-lock RESTORED from NVS:` line that fast-path boots produce.
|
||||
4. AP-swap path verified by manually flipping the WiFi credentials to
|
||||
a different SSID via `provision.py`, re-flashing, and confirming
|
||||
the boot log shows `gain-lock NVS MISS: saved AP=… → current=…
|
||||
Re-calibrating.` followed by a full cal.
|
||||
|
||||
## References
|
||||
|
||||
* ADR-108 — NVS persistence of gain-lock. Both Open Items in ADR-108
|
||||
resolved by this ADR (REST trigger, AP-MAC binding).
|
||||
* ADR-050 — OTA Bearer-token auth. Same `ota_check_auth` shared with
|
||||
the new endpoint.
|
||||
* `docs/references/ota-pipeline.md` — port 8032 recipe; gains a new
|
||||
bullet for `/ota/recalibrate`.
|
||||
|
|
@ -0,0 +1,160 @@
|
|||
# ADR-110 — TP-Link WISP Deployment + RSSI-Δ Presence Detector
|
||||
|
||||
**Status**: Accepted
|
||||
**Date**: 2026-05-15
|
||||
**Scope**: `v2/crates/wifi-densepose-sensing-server/`,
|
||||
deployment of TP-Link TL-WR841N as a dedicated CSI AP for room01/room02.
|
||||
|
||||
## Context
|
||||
|
||||
After ADR-098 made the RuView FW boot cleanly and FW5.47 fallback gave real
|
||||
motion, the deployed sensors still produced unreliable presence in the
|
||||
operator's home environment. Investigation revealed two compounding factors:
|
||||
|
||||
1. **Ambient WiFi noise.** Both sensors were associated with the main
|
||||
household AP (`Tran Thanh T3`), which competes for airtime with neighbouring
|
||||
networks on the same channel. Per-frame broadband variance in an *empty*
|
||||
room measured higher than when the operator was sitting at the desk
|
||||
— the multipath geometry plus neighbour traffic dominated the CSI
|
||||
signal.
|
||||
2. **The wrong feature.** Even on a clean channel, CSI variance does not
|
||||
monotonically track human presence at multi-meter range. A stationary
|
||||
body modifies multipath consistently (variance drops), while an empty
|
||||
room exhibits more multipath spread (variance rises). The host DSP
|
||||
features `variance`, `motion_band_power`, and `spectral_power` all
|
||||
showed this inversion at the deployed sensor locations.
|
||||
|
||||
Three one-minute measurements collected with TP-Link as the isolated AP,
|
||||
sensors connected only to it:
|
||||
|
||||
| Feature | STILL (sitting) | WALK (room loop) | EMPTY |
|
||||
|---|---|---|---|
|
||||
| `variance` mean | 29.7 | 33.7 | **35.8** |
|
||||
| `motion_band_power` mean | 49.8 | 54.6 | **57.4** |
|
||||
| `spectral_power` mean | 161 | 172 | 172 |
|
||||
| `mean_rssi` mean (dBm) | -59.13 | -59.12 | -58.98 |
|
||||
| **`mean_rssi` std** | **0.60** | **1.02** | **0.35** |
|
||||
|
||||
Only the **standard deviation of `mean_rssi`** monotonically separates the three
|
||||
states. The human body physically perturbs RF path loss to the sensor:
|
||||
absent → flat RSSI, still → small fluctuations from breathing/microtremor,
|
||||
walking → large per-second swings.
|
||||
|
||||
## Decisions
|
||||
|
||||
### D1 — Isolate sensors on a dedicated AP (TP-Link TL-WR841N, WISP mode)
|
||||
|
||||
The household AP serves dozens of clients across multiple channels and is
|
||||
constantly retransmitting management frames for neighbours and BT-coex
|
||||
overlay. We deployed a TP-Link TL-WR841N in **WISP mode**:
|
||||
|
||||
* TP-Link associates with `Tran Thanh T3` over WiFi as a single client.
|
||||
* TP-Link runs its own NAT and broadcasts a clean SSID (`TP-Link_8340`,
|
||||
WPA2-PSK, fixed channel) on the 2.4 GHz band.
|
||||
* Sensors are provisioned to associate only with `TP-Link_8340`.
|
||||
* TP-Link's NAT forwards their UDP/5006 packets to the Mac on the
|
||||
household subnet (Mac stays connected to `Tran Thanh T3` for internet,
|
||||
no LAN reconfiguration on the host side).
|
||||
|
||||
Empirical effect: per-minute broadband variance in an empty room dropped
|
||||
from **50.7** (on `Tran Thanh T3`) to **35.8** (on `TP-Link_8340`).
|
||||
|
||||
### D2 — Replace CSI-variance presence detector with rolling RSSI MAD-Δ
|
||||
|
||||
The host-side classifier in `sensing-server` runs `extract_features_from_frame`
|
||||
→ `smooth_and_classify` and outputs `motion_level` ∈ {`absent`, `present_still`,
|
||||
`present_moving`, `active`} based on a `motion_score` derived from CSI
|
||||
amplitude variance + temporal change-points. On the deployed geometry the
|
||||
score crosses thresholds for body-far-from-sensor cases but not for body-near-
|
||||
sensor stationary cases; the `present_still` band especially is unreliable.
|
||||
|
||||
We add an **RSSI-based override** layered after the existing classifier:
|
||||
|
||||
* Per-node rolling window of the last 120 frame RSSI samples (~10 s at
|
||||
12 Hz).
|
||||
* Metric: **mean absolute delta of consecutive RSSI values** (MAD-Δ).
|
||||
This is more robust than standard deviation for the int8-quantised RSSI
|
||||
the WiFi driver reports — a single 1-dB step in a quiet window
|
||||
inflates std but contributes minimally to MAD-Δ.
|
||||
* Thresholds (calibrated empirically; see D3):
|
||||
* `d < 0.20` → `absent`
|
||||
* `0.20 ≤ d < 0.55` → `present_still`
|
||||
* `0.55 ≤ d < 1.10` → `present_moving`
|
||||
* `d ≥ 1.10` → `active`
|
||||
* Confidence is surfaced as the raw `d` value during the tuning phase so
|
||||
that downstream UIs (the calibration console at `static/spectrum.html`)
|
||||
can drive threshold refinement on new deployments.
|
||||
|
||||
The CSI-based features are preserved in the `features.*` block so that
|
||||
downstream consumers (vital signs, signal-quality estimator, multi-node
|
||||
fusion) continue to operate.
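
The MAD-Δ metric and the threshold bands above can be sketched as follows. This is a minimal Python illustration, not the server's Rust code: `mad_delta` and `classify` are hypothetical names, and the example window (one 1 dB step in an otherwise flat trace) is constructed to show why the metric is less sensitive than standard deviation to quantisation steps.

```python
from collections import deque
from statistics import pstdev

WINDOW = 120  # RSSI samples kept per node, per the rolling-window bullet


def mad_delta(rssi):
    """Mean absolute difference between consecutive RSSI samples."""
    samples = list(rssi)
    if len(samples) < 2:
        return 0.0
    deltas = [abs(b - a) for a, b in zip(samples, samples[1:])]
    return sum(deltas) / len(deltas)


def classify(d):
    """Map a MAD-delta value onto the four bands listed in D2."""
    if d < 0.20:
        return "absent"
    if d < 0.55:
        return "present_still"
    if d < 1.10:
        return "present_moving"
    return "active"


# A quiet window containing a single 1 dB step halfway through.
window = deque(maxlen=WINDOW)
window.extend([-59] * 60 + [-58] * 60)

d = mad_delta(window)      # one non-zero delta out of 119
spread = pstdev(window)    # population std of the same window
print(f"MAD-delta={d:.4f} std={spread:.2f} label={classify(d)}")
```

In this constructed window the standard deviation is 0.5 dB, while MAD-Δ is 1/119 (about 0.008), so the single step leaves the window in the `absent` band.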
|
||||
|
||||
### D3 — Threshold calibration via UI-assisted "tell me your state" protocol
|
||||
|
||||
Tunable thresholds are per-deployment. The procedure documented for the
|
||||
operator:
|
||||
|
||||
1. Open `http://localhost:8091/spectrum.html` (also reachable via Tailscale
|
||||
at the Mac's `100.x.y.z:8091`).
|
||||
2. The `confidence` value on that page shows the raw RSSI-Δ (`d`) for the operator's environment.
|
||||
3. With a stopwatch:
|
||||
* Leave the room for 60 s. Record median `d`.
|
||||
* Sit at the workstation for 60 s. Record median `d`.
|
||||
* Walk the loop for 60 s. Record median `d`.
|
||||
4. Thresholds = midpoints between consecutive medians.
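
Step 4 of the protocol can be sketched as below. This is an illustrative Python helper with a hypothetical name (`thresholds_from_medians`). Only the `absent` median of 0.49 comes from this ADR; the `sit` and `walk` values in the example are placeholders invented for the arithmetic, because those captures are still pending.

```python
def thresholds_from_medians(medians):
    """Return the midpoints between consecutive per-state medians.

    medians must be ordered absent, sit, walk and strictly increasing;
    otherwise the states are not separable by a single threshold each.
    """
    for lo, hi in zip(medians, medians[1:]):
        if not lo < hi:
            raise ValueError("medians must be strictly increasing")
    return [(lo + hi) / 2 for lo, hi in zip(medians, medians[1:])]


# 0.49 is the measured empty-room median from this ADR.
# 0.80 and 1.40 are HYPOTHETICAL placeholders, not measurements.
example = thresholds_from_medians([0.49, 0.80, 1.40])
print(example)
```

Three recorded states yield two boundaries (`absent`/`present_still` and `present_still`/`present_moving`). The protocol as written does not produce the `active` boundary, so that one would have to be set separately.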
|
||||
|
||||
For the operator's room (TP-Link AP at `192.168.1.14`, sensors at .17 / .19):
|
||||
|
||||
| State | `d` median (target) | `d` measured (operator) |
|
||||
|---|---|---|
|
||||
| absent | should be near 0 | **0.49** (empty room) |
|
||||
|
||||
The operator's empty-room baseline of `d ≈ 0.49` is *higher* than the
|
||||
heuristic 0.20 threshold the code currently ships with. This is consistent
|
||||
with the int8 quantisation: even an empty channel jitters by ±1 dB
|
||||
across consecutive frames. Final threshold tuning for this deployment is
|
||||
**still pending** — the captures for `sit` and `walk` are needed to set
|
||||
the boundaries. The code surfaces `d` via `confidence` to let the
|
||||
operator capture those next two states.
|
||||
|
||||
## Files Touched
|
||||
|
||||
```
|
||||
v2/crates/wifi-densepose-sensing-server/src/main.rs # RSSI MAD-Δ + override
|
||||
v2/crates/wifi-densepose-sensing-server/static/spectrum.html # live console
|
||||
v2/crates/wifi-densepose-sensing-server/static/calibrate.html # peak-tracker view
|
||||
docs/adr/ADR-110-tplink-wisp-deployment-and-rssi-presence.md # this ADR
|
||||
```
|
||||
|
||||
## Verified Acceptance
|
||||
|
||||
| Criterion | Result |
|
||||
|---|---|
|
||||
| Sensors associate only with TP-Link AP (no `Tran Thanh T3` direct) | ✅ |
|
||||
| Mac receives UDP/5006 packets via TP-Link NAT | ✅ (~12 Hz combined) |
|
||||
| Empty-room ambient noise reduced vs household AP | ✅ (variance 50.7 → 35.8) |
|
||||
| `confidence` field carries raw RSSI-Δ for live tuning | ✅ |
|
||||
| Vital signs (breathing 9–11 BPM) continue to populate when occupied | ✅ |
|
||||
|
||||
## Open Items
|
||||
|
||||
* Threshold final-tune (sit + walk medians not yet measured on TP-Link).
|
||||
* Replace MAD-Δ with `quantile(|Δ|, 0.9) - quantile(|Δ|, 0.1)` if
|
||||
occasional packet-rate hiccups inflate the simple mean.
|
||||
* The TP-Link runs WISP NAT — all sensor source IPs collapse to one
|
||||
(`192.168.1.14` on the household side). The server discriminates nodes
|
||||
by **MAC address** parsed from the `CSI_LEAN` payload, not by source IP,
|
||||
so this works today. If we later switch FW back to raw `0xC5110001`
|
||||
binary frames (which carry MAC) the same discrimination holds. If
|
||||
`parse_esp32_vitals` (0xC5110002) becomes the upstream format,
|
||||
per-node state tracking needs a separate MAC-bearing field added to
|
||||
that packet.
|
||||
* On longer test sessions: the `motion_band_power` and `variance` features
|
||||
remain present in `features.*` and are useful for vital-sign signal-quality
|
||||
estimation; do not strip them.
|
||||
|
||||
## References
|
||||
|
||||
* ADR-039 — Edge intelligence pipeline (host DSP path).
|
||||
* ADR-098 — Earlier ESP32-S3 deployment fixes (CSI callback, OTA, mobile UI).
|
||||
* RuView issue thread on RSSI-vs-CSI presence inversion (this ADR).
|
||||
|
|
@ -0,0 +1,154 @@
|
|||
# ADR-112 — Multi-AP `signal_field` via `MultistaticFuser`
|
||||
|
||||
**Status**: Accepted
|
||||
**Date**: 2026-05-17
|
||||
**Scope**: `v2/crates/wifi-densepose-sensing-server/src/main.rs`
|
||||
(`signal_field_from_multistatic`, two ESP32 vitals call sites). Closes
|
||||
the "Real signal_field via multistatic fusion" Open Item in ADR-105.
|
||||
|
||||
## Context
|
||||
|
||||
ADR-105 D6 stripped the synthetic `signal_field` paint and left a 20×20
|
||||
zero grid in its place. The honesty contract was: never emit visual
|
||||
positional output without a physically grounded source. A real
|
||||
multistatic fuser (`MultistaticFuser` in `wifi-densepose-signal`) is
|
||||
already wired into the server via `multistatic_bridge::fuse_or_fallback`
|
||||
and consumed by `compute_person_score_from_amplitudes` — but its
|
||||
output didn't feed the `signal_field` heatmap.
|
||||
|
||||
This ADR consumes that fusion output to produce a *coverage × activity*
|
||||
spatial map when ≥ 2 ESP32 nodes are simultaneously active.
|
||||
|
||||
## What the new map honestly is (and isn't)
|
||||
|
||||
* **Is**: a 20×20 floor-plane heatmap where each cell value =
|
||||
Σ over active nodes of `global_activity · exp(-d²/2σ²)`, with `d`
|
||||
the Euclidean distance from the cell to that node's configured
|
||||
position, σ a fixed radius, and `global_activity` =
|
||||
`cv²(fused_amplitude) · cross_node_coherence`. Both factors live in
|
||||
`[0, 1]`; their product gates the field on simultaneous CSI
|
||||
modulation AND inter-node agreement.
|
||||
* **Is not**: a person-location estimate. Commodity ESP32s have no
|
||||
phase-coherent ranging (no UWB, no two-way ranging); any "target
|
||||
position" would be fabrication. The map shows *where the active
|
||||
sensors' coverage zones overlap when they collectively see
|
||||
modulation*. That's a real, derivable quantity. A "where is the
|
||||
person" claim is not, and is deliberately withheld.
|
||||
|
||||
## Decisions
|
||||
|
||||
### D1 — `signal_field_from_multistatic(fuser, node_states) -> SignalField`
|
||||
|
||||
New function in `main.rs`. Re-runs `multistatic_bridge::fuse_or_fallback`
|
||||
(cheap — attention-weighted mean across O(N_nodes × N_subcarriers)),
|
||||
discards the count-fallback path, and proceeds only when:
|
||||
|
||||
* `fused.active_nodes >= 2`, AND
|
||||
* `fused.node_positions` non-empty, AND
|
||||
* `fused.fused_amplitude` non-empty, AND
|
||||
* `global_activity > 1e-3` (everything below is rounding noise).
|
||||
|
||||
Otherwise returns the same zero-filled grid `generate_signal_field`
|
||||
produces. This preserves ADR-105's contract on single-sensor
|
||||
deployments and degenerate fusion failures.
|
||||
|
||||
### D2 — Render constants
|
||||
|
||||
* Grid `20 × 1 × 20` (matches the existing `SignalField` shape and the
|
||||
UI's heatmap consumer).
|
||||
* `ROOM_EXTENT_M = 3.0` m (half-width of the square the grid spans —
|
||||
6 m × 6 m floor). Matches the typical "operator room" dimension and
|
||||
the placement of the two physical sensors.
|
||||
* `SIGMA_M = ROOM_EXTENT_M / 4.0 = 0.75 m` for the isotropic Gaussian.
|
||||
Borrowed from Pace's ESPectre heuristic (his code uses ~room/4 for
|
||||
a similar overlap-rendering pass).
|
||||
* `(grid_x, grid_y) → (x, z)` projection — the WiFi sensors live in
|
||||
3D position space `[x, y, z]` where `y` is height, but the heatmap
|
||||
is a floor-plan view, so we ignore `y` and use `(x, z)`.
|
||||
|
||||
### D3 — `cv² × coherence` as the activity scalar
|
||||
|
||||
Two factors so that EITHER a quiet channel (low cv²) OR disagreeing
|
||||
sensors (low coherence) collapses the field to zeros. This means:
|
||||
|
||||
* Empty room (low cv²) → blank map. Truthful.
|
||||
* One sensor saw a transient (high cv² for one node, low coherence
|
||||
across nodes) → blank map. Truthful — no multistatic signal.
|
||||
* All sensors see synchronized modulation → bright map. Truthful —
|
||||
there really is something in the shared coverage.
|
||||
|
||||
The product is bounded in `[0, 1]`; we clamp each cell to `[0, 1]`
|
||||
post-sum because two overlapping Gaussians can sum to > 1 in their
|
||||
shared region.
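
The gating, Gaussian sum, and clamp described in D1 to D3 can be sketched as below. This is a Python illustration rather than the Rust `signal_field_from_multistatic`: the function name `coverage_field` is hypothetical, `global_activity` is assumed to be computed elsewhere and passed in, and the grid-to-metres mapping (cells sampled edge to edge on a square centred at the origin) is an assumption, since the ADR does not spell it out.

```python
import math

GRID = 20                        # cells per side of the floor-plane grid
ROOM_EXTENT_M = 3.0              # half-width of the square the grid spans
SIGMA_M = ROOM_EXTENT_M / 4.0    # 0.75 m isotropic Gaussian radius
ACTIVITY_FLOOR = 1e-3            # below this the field stays zero


def coverage_field(node_positions, global_activity):
    """Return a GRID x GRID list of rows (indexed [z][x]) in [0, 1].

    node_positions holds (x, y, z) tuples in metres; y (height) is
    ignored because the map is a floor-plan view.
    """
    zero = [[0.0] * GRID for _ in range(GRID)]
    if len(node_positions) < 2 or global_activity <= ACTIVITY_FLOOR:
        return zero

    step = 2.0 * ROOM_EXTENT_M / (GRID - 1)  # assumed cell spacing
    field = []
    for gz in range(GRID):
        z = -ROOM_EXTENT_M + gz * step
        row = []
        for gx in range(GRID):
            x = -ROOM_EXTENT_M + gx * step
            total = 0.0
            for nx, _ny, nz in node_positions:
                d2 = (x - nx) ** 2 + (z - nz) ** 2
                total += global_activity * math.exp(-d2 / (2.0 * SIGMA_M ** 2))
            # Overlapping Gaussians can exceed 1, so clamp after summing.
            row.append(min(1.0, max(0.0, total)))
        field.append(row)
    return field


nodes = [(-1.5, 1.0, 0.0), (1.5, 1.0, 0.0)]  # hypothetical placement
field = coverage_field(nodes, global_activity=0.8)
```

With this placement, the cell nearest a node is brighter than the cell midway between the two nodes. The map therefore reads as a coverage glow around the sensors, not as a position estimate.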
|
||||
|
||||
### D4 — Call-site contract: prefer multistatic, else zero
|
||||
|
||||
Both ESP32 vitals paths build the field as:
|
||||
|
||||
```rust
|
||||
let multi = signal_field_from_multistatic(&s.multistatic_fuser, &s.node_states);
|
||||
if multi.values.iter().any(|&v| v > 0.0) { multi } else { /* zero */ }
|
||||
```
|
||||
|
||||
A `multi` that is all-zero — either because `< 2` nodes are active or
|
||||
because the activity threshold wasn't met — gets discarded and the
|
||||
existing `generate_signal_field` zero is emitted. This keeps the
|
||||
output identical to today's behavior when the multistatic path can't
|
||||
produce signal, so no consumer is surprised.
|
||||
|
||||
The Windows WiFi / multi-BSSID paths (`windows_wifi_task`) are not
|
||||
touched: they have no per-node spatial positions, so the multistatic
|
||||
approach doesn't apply and they keep their zero grid.
|
||||
|
||||
## Trade-offs
|
||||
|
||||
* **Node positions must be configured.** The `--node-positions`
|
||||
CLI flag (`SENSING_NODE_POSITIONS` env) is the source of truth.
|
||||
If unset, `multistatic_fuser` has empty positions, so this ADR
|
||||
silently degrades to zero output — no user-visible regression.
|
||||
* **Coverage map ≠ target map.** Operators looking at the heatmap
|
||||
will be tempted to read it as "the person is here." Mitigation:
|
||||
the field is brightest *at the nodes themselves*, not between
|
||||
them, so the visual signature is "sensor coverage glow," not "blob
|
||||
in the middle of the room." A future ADR (e.g. RF tomography or
RSSI MUSIC) could replace this with a real
|
||||
localizer; this ADR is the honest baseline that holds until then.
|
||||
* **σ is fixed.** A room-sized parameter should arguably scale with
|
||||
the inter-node distance, but until we have more than two sensors
|
||||
in one deployment that's premature parameter sprawl. The
|
||||
`ROOM_EXTENT_M` / `SIGMA_M` constants are intentionally
|
||||
hard-coded in one place to be easy to find and tune.
|
||||
|
||||
## Files Touched
|
||||
|
||||
```
|
||||
v2/crates/wifi-densepose-sensing-server/src/main.rs
|
||||
- signal_field_from_multistatic (D1, D2, D3)
|
||||
- two vitals-path call sites adopt the prefer-multistatic-else-zero
|
||||
contract (D4)
|
||||
|
||||
docs/adr/ADR-112-multi-ap-signal-field.md (this)
|
||||
docs/adr/ADR-105-no-synthetic-data-in-production-runtime.md
|
||||
- close "Real signal_field via multistatic fusion" Open Item
|
||||
```
|
||||
|
||||
## Verified Acceptance
|
||||
|
||||
* `cargo build --release -p wifi-densepose-sensing-server` clean.
|
||||
* `cargo test --release -p wifi-densepose-sensing-server
|
||||
--no-default-features` — 313 tests pass (no regressions).
|
||||
* With one sensor active, `signal_field.values` are all zero —
|
||||
matches ADR-105 behaviour.
|
||||
* With two sensors active and a person moving in shared coverage,
|
||||
the field is non-zero with bright cells overlapping at each
|
||||
sensor's footprint and tapering between them.
|
||||
|
||||
## References
|
||||
|
||||
* ADR-105 D6 — the "no synthetic signal_field" honesty contract.
|
||||
* `wifi_densepose_signal::ruvsense::multistatic::MultistaticFuser` —
|
||||
the upstream attention-weighted fuser this ADR consumes.
|
||||
* `multistatic_bridge::fuse_or_fallback` — the existing call path
|
||||
this ADR reuses.
|
||||
* Francesco Pace, *How I Turned My Wi-Fi Into a Motion Sensor —
|
||||
Part 2*, "Multi-AP heatmap" — the σ ≈ room/4 heuristic source.
|
||||
|
|
@ -0,0 +1,156 @@
|
|||
# ADR-113 — Multiple Baseline Profiles (Day/Night)
|
||||
|
||||
**Status**: Accepted
|
||||
**Date**: 2026-05-17
|
||||
**Scope**: `v2/crates/wifi-densepose-sensing-server/src/main.rs`
|
||||
(`resolve_baseline_profile`, `baseline_profile_watch`,
|
||||
`--baseline-profile` CLI flag). Closes the "Multiple baseline profiles"
|
||||
item in CHECKLIST.
|
||||
|
||||
## Context
|
||||
|
||||
The empty-room baseline that ADR-103 / ADR-104 store in
|
||||
`data/baseline.json` is captured at one point in time. The channel state
|
||||
it reflects is sensitive to:
|
||||
|
||||
* People walking through corridors / adjacent apartments at night vs.
|
||||
day (different building-wide ambient WiFi traffic).
|
||||
* AC / refrigerator compressor duty cycles (broadband noise at the
|
||||
~Hz scale that changes per-time-of-day).
|
||||
* Sunlight on building walls (~mm-scale thermal expansion changes
|
||||
multipath).
|
||||
|
||||
In the current deployment we observe the `absent` baseline mean shift
|
||||
by ~3-5 % between 14:00 and 04:00 — small but enough to push the CV
|
||||
of a stationary subcarrier across the ADR-103 threshold and trigger
|
||||
false `present_still` flags overnight.
|
||||
|
||||
A single baseline can't model both regimes simultaneously. The lowest-
|
||||
complexity fix is to keep two: a day baseline and a night baseline,
|
||||
loaded at startup and hot-swapped at the day/night boundary.
|
||||
|
||||
## Decisions
|
||||
|
||||
### D1 — `--baseline-profile` selector with four modes
|
||||
|
||||
```
|
||||
--baseline-profile {single,auto,day,night} (default: single)
|
||||
```
|
||||
|
||||
| Mode | Behaviour |
|
||||
|----------|--------------------------------------------------------------------------------------------|
|
||||
| `single` | Legacy. Load `RUVIEW_BASELINE_FILE` or `data/baseline.json`. No watch task. **Default.** |
|
||||
| `auto` | Pick day/night by local hour. Hot-reload at 07:00 / 21:00 transitions. |
|
||||
| `day` | Force `data/baseline.day.json`. No auto switching. |
|
||||
| `night` | Force `data/baseline.night.json`. No auto switching. |
|
||||
|
||||
Default is `single` so existing deployments don't have to migrate.
|
||||
Operators opt in by recording two profiles + flipping the flag.
|
||||
|
||||
### D2 — Day window: 07:00–20:59 local
|
||||
|
||||
Hard-coded for now. The split matches the ambient-WiFi pattern in
|
||||
this deployment (residential building, no commercial traffic).
|
||||
Tunable in code (future ADR can parameterise if a second deployment
|
||||
needs different hours), but a flag is premature parameter sprawl.
|
||||
|
||||
`chrono::Local::now().hour()` drives the choice — no UTC offset
|
||||
arithmetic; the OS provides the local hour directly.
|
||||
|
||||
### D3 — Filename convention
|
||||
|
||||
```
|
||||
data/baseline.day.json
|
||||
data/baseline.night.json
|
||||
data/baseline.json (legacy / single-profile fallback)
|
||||
```
|
||||
|
||||
Same JSON schema as ADR-103 v2 (`full_broadband_*`,
|
||||
`per_subcarrier_mean`, optionally `per_subcarrier_phase_mean` per
|
||||
ADR-104). The recording script and REST endpoint can write to any of
|
||||
the three paths via `--out` / `out` body field — no schema change.
|
||||
|
||||
### D4 — Missing-file fallback to `data/baseline.json`
|
||||
|
||||
If a requested profile file doesn't exist (e.g., operator set
|
||||
`--baseline-profile auto` but only recorded `baseline.json`), the
|
||||
server logs a warning and loads the legacy single-baseline file
|
||||
instead. This makes the migration path "set the flag, then start
|
||||
recording per-profile baselines one at a time" — no big-bang switch.
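
Decisions D1 to D4 combine into one resolution step, sketched below. This is a Python model, not the Rust `resolve_baseline_profile`: the function names, the `(tag, path)` return shape, and the injectable `exists` / `env` parameters are assumptions made so the example can run without touching the filesystem.

```python
import os

DAY_START_HOUR = 7      # day window is 07:00 to 20:59 local
NIGHT_START_HOUR = 21
LEGACY_FILE = "data/baseline.json"
PROFILE_FILES = {
    "day": "data/baseline.day.json",
    "night": "data/baseline.night.json",
}


def profile_for_hour(hour):
    """Pick 'day' or 'night' from a local hour in 0..23."""
    return "day" if DAY_START_HOUR <= hour < NIGHT_START_HOUR else "night"


def resolve_baseline(mode, hour, exists=os.path.exists, env=None):
    """Return (profile_tag, path) for a --baseline-profile mode."""
    env = os.environ if env is None else env
    if mode == "single":
        # Legacy behaviour: env override, else the single baseline file.
        return "single", env.get("RUVIEW_BASELINE_FILE", LEGACY_FILE)
    if mode not in ("auto", "day", "night"):
        raise ValueError(f"unknown baseline profile mode: {mode}")
    tag = profile_for_hour(hour) if mode == "auto" else mode
    path = PROFILE_FILES[tag]
    if not exists(path):
        # D4: the server logs a warning here and loads the legacy file.
        return tag, LEGACY_FILE
    return tag, path
```

For instance, `resolve_baseline("auto", 21, exists=lambda p: True)` selects the night profile, while the same call with `exists=lambda p: False` keeps the `night` tag but returns `data/baseline.json`.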
|
||||
|
||||
### D5 — Hot-reload via `baseline_profile_watch`
|
||||
|
||||
Background task fires every 5 min, re-resolves the profile, and if the
|
||||
profile tag changed (day → night or vice versa) calls
|
||||
`load_baseline_file` on the new path. `load_baseline_file` already
|
||||
hot-swaps in place — the per-node override maps and per-subcarrier
|
||||
baselines update without touching live frame ingest.
|
||||
|
||||
5 min cadence means transitions land within 5 min of the schedule —
|
||||
acceptable lag for a baseline whose channel-side variance is on the
|
||||
~hour timescale.
|
||||
|
||||
A `static` `CURRENT_BASELINE_PROFILE` mutex tracks the loaded tag so
|
||||
the watch avoids redundant disk reads when nothing changed.
|
||||
|
||||
### D6 — Watch is a no-op outside `auto`
|
||||
|
||||
`single`, `day`, and `night` modes don't need switching — those are
|
||||
"set once at startup". The watch task logs a one-line "disabled"
|
||||
message and returns immediately. Saves a tokio task slot and
|
||||
suppresses log noise on the common single-profile deployment.
|
||||
|
||||
## Trade-offs
|
||||
|
||||
* **Operator has to record two baselines.** Twice the operator time
|
||||
(~5 min × 2). Unavoidable for the use case.
|
||||
* **Hard-coded 07:00 / 21:00 split.** A different deployment (office,
|
||||
shift-work) would want different hours. Defer to a future ADR; for
|
||||
this deployment the residential cadence works.
|
||||
* **No smooth interpolation between profiles.** At 20:59 we use day,
|
||||
at 21:00 we use night — a step transition. For amplitude/baseline
|
||||
comparison the step is fine (the classifier already smooths over
|
||||
multiple frames). A weighted blend across the transition window
|
||||
would be feasible but adds complexity for limited gain.
|
||||
* **No more than two profiles.** Seasonal (summer/winter), weekday/
|
||||
weekend etc. would need either more flags or a config-file driven
|
||||
approach. Out of scope.
|
||||
|
||||
## Files Touched
|
||||
|
||||
```
|
||||
v2/crates/wifi-densepose-sensing-server/src/main.rs
|
||||
- --baseline-profile CLI flag (D1)
|
||||
- resolve_baseline_profile (D1, D2, D3, D4)
|
||||
- baseline_profile_file_or_fallback (D4)
|
||||
- baseline_profile_watch background task (D5, D6)
|
||||
- CURRENT_BASELINE_PROFILE static + init helper (D5)
|
||||
- startup uses resolve_baseline_profile (D1)
|
||||
- spawn baseline_profile_watch alongside other watches (D5)
|
||||
|
||||
docs/adr/ADR-113-baseline-profiles.md (this)
|
||||
```
|
||||
|
||||
## Verified Acceptance
|
||||
|
||||
* `cargo build --release -p wifi-densepose-sensing-server` clean.
|
||||
* `cargo test --release -p wifi-densepose-sensing-server
|
||||
--no-default-features` — 326 tests pass.
|
||||
* `sensing-server --help` shows the new `--baseline-profile` flag
|
||||
with the four-mode help text.
|
||||
* Running with `--baseline-profile single` (default) keeps the
|
||||
existing log line `baseline-profile: starting in 'single' mode →
|
||||
data/baseline.json` and disables the watch task with `Baseline
|
||||
profile watch disabled (--baseline-profile single)`.
|
||||
* Running with `--baseline-profile auto` while no `baseline.day.json`
|
||||
exists logs `baseline-profile day: file data/baseline.day.json not
|
||||
found, falling back to data/baseline.json` then proceeds.
|
||||
|
||||
## References
|
||||
|
||||
* ADR-103 — persistent baseline storage + JSON schema this ADR reuses.
|
||||
* ADR-104 — per-subcarrier amplitude + phase drift; both consume
|
||||
whatever baseline the active profile loads.
|
||||
* ADR-107 — `POST /api/v1/baseline/calibrate` can write into any of
|
||||
the three paths via the `out` body field, so operators can record
|
||||
each profile via the same UI button.
|
||||
|
|
@ -0,0 +1,162 @@
|
|||
# ADR-114 — 2000-Packet Replay Regression Suite
|
||||
|
||||
**Status**: Accepted
|
||||
**Date**: 2026-05-17
|
||||
**Scope**: `v2/crates/wifi-densepose-sensing-server/src/main.rs`
|
||||
(`replay_tests` module under `#[cfg(test)]`),
|
||||
`v2/crates/wifi-densepose-sensing-server/tests/fixtures/replay_*.jsonl`,
|
||||
`scripts/generate-replay-fixtures.py`. Closes the "2 000-packet fixed-
|
||||
replay test suite" item in CHECKLIST.
|
||||
|
||||
## Context
|
||||
|
||||
Up to now the amplitude classifier has been protected by per-function
|
||||
unit tests (cv calculation, NBVI selection, baseline drop trigger) but
|
||||
not by an end-to-end regression test that feeds a known-good stream
|
||||
through the full `amp_presence_override` pipeline and checks that the
|
||||
labels still look right.
|
||||
|
||||
Without that, a refactor of NBVI selection or a threshold tweak could
|
||||
silently regress classifier behaviour on real deployments — the unit
|
||||
tests would all pass while the production output flipped.
|
||||
|
||||
Pace's ESPectre has a similar pattern: 1000 idle + 1000 motion frames,
|
||||
checked into the repo, replayed in CI on every PR.
|
||||
|
||||
## Decisions
|
||||
|
||||
### D1 — Fixture format: line-delimited JSON, `{node_id, amplitude[]}`
|
||||
|
||||
```jsonl
|
||||
{"node_id":1,"amplitude":[28.842, 19.333, ...]}
|
||||
{"node_id":2,"amplitude":[15.601, 17.220, ...]}
|
||||
...
|
||||
```
|
||||
|
||||
Minimal: just the two fields the classifier reads. Round-robined across
|
||||
nodes (500 per node × 2 nodes = 1000 frames per fixture file). 1000
|
||||
frames per file × 2 files = 2000 packets total.
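
A fixture in this format can be read with a few lines of Python, sketched below. The real test is Rust and its loader is not shown in this ADR, so `load_fixture` is a hypothetical helper. The sample lines reuse the two records above, truncated to the amplitudes that are actually shown.

```python
import json
from collections import Counter


def load_fixture(lines):
    """Parse JSONL records into (node_id, amplitudes) tuples."""
    frames = []
    for line in lines:
        line = line.strip()
        if not line:
            continue  # tolerate blank lines between records
        record = json.loads(line)
        amplitudes = [float(a) for a in record["amplitude"]]
        frames.append((int(record["node_id"]), amplitudes))
    return frames


# Two records from the format example, truncated to the values shown.
sample = [
    '{"node_id":1,"amplitude":[28.842, 19.333]}',
    '{"node_id":2,"amplitude":[15.601, 17.220]}',
]
frames = load_fixture(sample)
per_node = Counter(node_id for node_id, _ in frames)
```

On a full fixture file, `per_node` would be expected to show 500 frames for each of the two nodes, matching the round-robin layout.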
|
||||
|
||||
### D2 — Fixtures live in-repo under `tests/fixtures/`
|
||||
|
||||
```
|
||||
v2/crates/wifi-densepose-sensing-server/tests/fixtures/
|
||||
replay_idle.jsonl (1000 lines)
|
||||
replay_motion.jsonl (1000 lines)
|
||||
```
|
||||
|
||||
Co-located with the test that consumes them. `cargo test` picks them up
|
||||
via `env!("CARGO_MANIFEST_DIR")`. The fixture files are ~1.5 MB total
|
||||
(text JSON) — small enough for the repo, not so small that the test
|
||||
loses statistical power.
|
||||
|
||||
### D3 — Synthetic but parameter-matched to live data
|
||||
|
||||
The fixtures are generated by `scripts/generate-replay-fixtures.py` with
|
||||
two deterministic seeds (42 and 43). Parameters chosen to mirror the
|
||||
live deployment:
|
||||
|
||||
* Baseline mean amplitudes per node taken from `data/baseline.json`
|
||||
(node 1: 27.04, node 2: 14.72).
|
||||
* Idle: per-frame Gaussian noise σ = 1.8 % of the per-subcarrier mean.
|
||||
* Motion: ±40 % slow envelope (0.15 Hz sinusoid, 6.7 s cycle, longer
|
||||
than the classifier's 4.5 s `AMP_SHORT_WIN`) + 5 % per-frame noise.
|
||||
Mimics a body slowly modulating the channel during walking.
|
||||
|
||||
This is deliberately *synthetic*. Capturing 1000 real frames of
|
||||
"empty room" requires the operator to step out and stay out for ~50 s,
|
||||
and capturing "motion" requires walking through the room — neither is
|
||||
something this session could do without manual operator labour. The
|
||||
synthetic-but-realistic alternative gives deterministic regression
|
||||
coverage today, with the option to swap in live captures (same JSONL
|
||||
schema, same filenames) when time allows.
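
The generation parameters above can be sketched as follows. This is not the contents of `scripts/generate-replay-fixtures.py`; it is a simplified model under stated assumptions: a 20 Hz frame rate (inferred from "1000 frames in ~50 s"), a hypothetical 8 subcarriers per frame, every subcarrier sharing its node's mean, and seed 42 for idle with seed 43 for motion (the ADR names both seeds but not which file uses which).

```python
import math
import random
from statistics import mean, pstdev

FRAME_RATE_HZ = 20.0                # assumption, see lead-in
N_SUBCARRIERS = 8                   # hypothetical, for illustration only
NODE_MEANS = {1: 27.04, 2: 14.72}   # baseline means quoted in the ADR


def make_frames(kind, seed, n_frames=1000):
    """Generate idle or motion frames, round-robined across two nodes."""
    if kind not in ("idle", "motion"):
        raise ValueError(f"unknown fixture kind: {kind}")
    rng = random.Random(seed)  # deterministic for a given seed
    frames = []
    for i in range(n_frames):
        node_id = 1 + (i % 2)
        base = NODE_MEANS[node_id]
        if kind == "idle":
            envelope, noise = 1.0, 0.018      # 1.8 % per-frame noise
        else:
            t = i / FRAME_RATE_HZ
            # +/-40 % slow envelope at 0.15 Hz plus 5 % per-frame noise.
            envelope = 1.0 + 0.40 * math.sin(2.0 * math.pi * 0.15 * t)
            noise = 0.05
        amplitude = [
            max(0.0, rng.gauss(base * envelope, noise * base))
            for _ in range(N_SUBCARRIERS)
        ]
        frames.append({"node_id": node_id, "amplitude": amplitude})
    return frames


def subcarrier_cv(frames, node_id, k=0):
    """Coefficient of variation of subcarrier k for one node."""
    values = [f["amplitude"][k] for f in frames if f["node_id"] == node_id]
    return pstdev(values) / mean(values)


idle = make_frames("idle", 42)
motion = make_frames("motion", 43)
```

In this model the idle CV stays close to the 1.8 % noise level, while the motion CV is dominated by the 40 % envelope. That gap is the separation the classifier is expected to pick up.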
|
||||
|
||||
### D4 — Test lives inside `main.rs` under `#[cfg(test)] mod replay_tests`
|
||||
|
||||
`amp_presence_override` is private to the binary crate, so the test
|
||||
can't sit in `tests/` (which is for integration tests against
|
||||
`lib.rs`). Putting it under `#[cfg(test)]` in `main.rs` keeps the
|
||||
helper visibility minimal and exercises the exact function path
|
||||
production uses.
|
||||
|
||||
### D5 — Test resets per-node history before each fixture run
|
||||
|
||||
`amp_presence_override` accumulates per-node state in
|
||||
`OnceLock<Mutex<HashMap<…>>>` statics. The test clears those between
|
||||
the idle and motion runs so each fixture starts with a fresh classifier
|
||||
(no cross-contamination from the previous fixture's frames sitting in
|
||||
the rolling window).
|
||||
|
||||
It also clears the per-subcarrier baseline (`amp_baseline_per_sub`)
|
||||
because the synthetic fixtures don't share a per-subcarrier profile
|
||||
with whatever real recording lives in `data/baseline.json` — leaving
|
||||
the live per-sub baseline in place would make the drift channel
|
||||
saturate and obscure the CV-threshold path we're actually testing.
|
||||
|
||||
### D6 — F1 threshold: 0.85
|
||||
|
||||
Convention from Pace's ESPectre CI gate. Current value on the synthetic
|
||||
fixtures with this deployment's baseline is `F1 = 1.000` (tp=822,
|
||||
fp=0, tn=822, fn=0; 178 warmup frames excluded per fixture). The 0.15
|
||||
headroom gives room for legitimate classifier evolution without
|
||||
forcing a fixture re-record on every tuning change.
|
||||
|
||||
### D7 — Test loads the deployment baseline at startup
|
||||
|
||||
Without `data/baseline.json` loaded, the classifier compares raw CV
|
||||
against thresholds of 3.0 (300 %) and 6.0 — values no realistic signal
|
||||
reaches. The test discovers the baseline via a couple of canonical
|
||||
relative paths (`../../data/baseline.json` from the crate dir, etc.)
|
||||
and exits early with a clear `eprintln!` hint if none are found.
|
||||
|
||||
## Trade-offs
|
||||
|
||||
* **Synthetic fixtures don't catch sensor-specific bugs.** A
|
||||
Kconfig-level FW regression that produced subtly different amplitude
|
||||
scaling would not be caught — the synthetic fixtures encode the
|
||||
*expected* scaling, not whatever the FW currently emits. The witness
|
||||
bundle (ADR-028) still covers that end of the pipeline.
|
||||
* **`replay_2000` runs only when explicitly named or via the full
|
||||
suite.** No filtering hides it from CI. It runs in well under a
|
||||
second so cost is negligible.
|
||||
* **F1 currently 1.0 — too clean to detect subtle regressions.** A
|
||||
follow-up with live captures may bring the natural F1 to ~0.9, at
|
||||
which point the 0.85 threshold becomes a real gate. For now it's
|
||||
primarily a contract test: "the classifier still emits something
|
||||
reasonable on a known input".
|
||||
|
||||
## Files Touched
|
||||
|
||||
```
|
||||
scripts/generate-replay-fixtures.py (new)
|
||||
v2/crates/wifi-densepose-sensing-server/tests/fixtures/
|
||||
replay_idle.jsonl (new)
|
||||
replay_motion.jsonl (new)
|
||||
v2/crates/wifi-densepose-sensing-server/src/main.rs
|
||||
- replay_tests module (D4, D5, D7)
|
||||
docs/adr/ADR-114-replay-regression-suite.md (this)
|
||||
```
|
||||
|
||||
## Verified Acceptance
|
||||
|
||||
```
|
||||
$ cargo test --release -p wifi-densepose-sensing-server \
|
||||
--no-default-features --bin sensing-server replay_2000 -- --nocapture
|
||||
replay_2000 F1=1.000 tp=822 fp=0 tn=822 fn=0
|
||||
test replay_tests::replay_2000_packets_f1_above_threshold ... ok
|
||||
test result: ok. 1 passed; 0 failed; 0 ignored;
|
||||
```
|
||||
|
||||
Full workspace suite: 327 tests pass (was 326 + this one).
|
||||
|
||||
## References
|
||||
|
||||
* ADR-101 — raw-amplitude classifier this test exercises.
|
||||
* ADR-102 — NBVI subcarrier selection that feeds CV calculation.
|
||||
* ADR-103 — persistent baseline that drives the universal-threshold
|
||||
normalization the test relies on.
|
||||
* ADR-028 — witness bundle (the other end-to-end regression
|
||||
mechanism; ADR-114 covers classifier code paths, ADR-028 covers
|
||||
the deterministic-CSI proof pipeline).
|
||||
* Francesco Pace, *How I Turned My Wi-Fi Into a Motion Sensor —
|
||||
Part 2*, "Replay regression test" — the upstream pattern.
|
||||
|
|
@ -0,0 +1,161 @@
|
|||
# ADR-115 — FW REST endpoint to repoint CSI aggregator without USB
|
||||
|
||||
**Status**: Accepted
|
||||
**Date**: 2026-05-17
|
||||
**Scope**: `firmware/esp32-csi-node/main/ota_update.c`
|
||||
(`ota_set_target_handler`, `parse_ip_port`, URI registration on port 8032).
|
||||
|
||||
## Context
|
||||
|
||||
After moving the Mac from Tran Thanh T3 (192.168.1.x) to TP-Link_8340
|
||||
(192.168.0.x) for low-latency sensor proximity, both ESP32-S3 nodes
|
||||
held a stale `csi_cfg/target_ip` in NVS — they were silently streaming
|
||||
CSI into the previous LAN and the new server on `0.0.0.0:5005` saw
|
||||
zero frames for ~5 minutes despite both nodes being WiFi-reachable
|
||||
and responding on `:8032/ota/status`.
|
||||
|
||||
Existing tools didn't cover this:
|
||||
|
||||
* `provision.py` writes `target_ip` via USB serial — requires
|
||||
physical access to the sensor.
|
||||
* `/ota/recalibrate` (ADR-109) only erases gain-lock keys
|
||||
(`gl_agc/gl_fft/gl_ap_mac`) — intentionally doesn't touch
|
||||
network config.
|
||||
* Rebuilding FW with a new `CONFIG_CSI_TARGET_IP` would only help if
|
||||
NVS is also wiped, since the NVS override always beats the
|
||||
compile-time default.
|
||||
|
||||
Recurring operational need: every Mac IP change, every network
|
||||
move, every router swap requires the operator to crawl behind the
|
||||
sensor with a USB cable. Not acceptable.
|
||||
|
||||
## Decisions
|
||||
|
||||
### D1 — `POST /ota/set-target` HTTP endpoint
|
||||
|
||||
New handler on the existing OTA HTTP server (port 8032). Body is
|
||||
plain text `"IPv4:PORT"` with optional trailing CR/LF, e.g.
|
||||
`192.168.0.103:5005`. No JSON dependency — `cJSON` is not used
|
||||
elsewhere in this FW.
|
||||
|
||||
```
|
||||
POST /ota/set-target HTTP/1.1
|
||||
Content-Type: text/plain
|
||||
Authorization: Bearer <psk> # only if ota_psk provisioned
|
||||
|
||||
192.168.0.103:5005
|
||||
```
|
||||
|
||||
Response:
|
||||
|
||||
```json
|
||||
{"status":"ok","target_ip":"192.168.0.103","target_port":5005,"message":"rebooting"}
|
||||
```
|
||||
|
||||
Followed by `vTaskDelay(1s)` + `esp_restart()` so the new value is
|
||||
picked up by `nvs_config_load` on next boot.
|
||||
|
||||
### D2 — Strict body parser (no `inet_pton` dependency)
|
||||
|
||||
`parse_ip_port` validates:
|
||||
|
||||
* Exactly 4 dot-separated octets, each `0–255`.
|
||||
* Single `:` separator.
|
||||
* Port `1–65535`, max 5 digits.
|
||||
* Trailing whitespace/CR/LF tolerated.
|
||||
|
||||
Rejects malformed input with HTTP 400 *before* touching NVS — a
|
||||
sensor with an unparseable IP would lose its only network identity.
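
The validation rules listed in D2 can be sketched as follows. The firmware helper is written in C; this Rust version only mirrors the stated rules for illustration, and its return type and its tolerance of leading zeros in an octet are assumptions rather than confirmed firmware behaviour.

```rust
/// Parse "a.b.c.d:port" with optional trailing whitespace/CR/LF.
/// Returns None on any malformed input, so a caller can answer HTTP 400
/// before writing anything to persistent storage.
pub fn parse_ip_port(body: &str) -> Option<([u8; 4], u16)> {
    // Only trailing whitespace is tolerated; leading whitespace is rejected below.
    let s = body.trim_end_matches(|c: char| matches!(c, ' ' | '\t' | '\r' | '\n'));

    // Exactly one ':' separator.
    let (ip_part, port_part) = s.split_once(':')?;
    if port_part.contains(':') {
        return None;
    }

    // Exactly four dot-separated octets, digits only, each 0..=255.
    let mut octets = [0u8; 4];
    let mut count = 0;
    for tok in ip_part.split('.') {
        let digits_only = tok.bytes().all(|b| b.is_ascii_digit());
        if count == 4 || tok.is_empty() || tok.len() > 3 || !digits_only {
            return None;
        }
        octets[count] = tok.parse::<u8>().ok()?; // fails for values above 255
        count += 1;
    }
    if count != 4 {
        return None;
    }

    // Port: 1..=65535, at most 5 digits.
    let port_digits_only = port_part.bytes().all(|b| b.is_ascii_digit());
    if port_part.is_empty() || port_part.len() > 5 || !port_digits_only {
        return None;
    }
    let port: u32 = port_part.parse().ok()?;
    if port == 0 || port > 65_535 {
        return None;
    }
    Some((octets, port as u16))
}
```

Rejecting on the first violation keeps the parser free of partial state, which matches the goal of never touching NVS with a half-validated value.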
|
||||
|
||||
### D3 — Same NVS namespace + keys that `nvs_config.c` reads
|
||||
|
||||
```c
|
||||
nvs_open("csi_cfg", NVS_READWRITE, &h);
|
||||
nvs_set_str(h, "target_ip", ip);
|
||||
nvs_set_u16(h, "target_port", port);
|
||||
nvs_commit(h);
|
||||
```
|
||||
|
||||
Matches the keys already read by `nvs_config_load` at boot, so the
|
||||
change is picked up without any FW code change beyond this handler.
|
||||
|
||||
### D4 — Auth model identical to `/ota/recalibrate`
|
||||
|
||||
Uses the same `ota_check_auth` PSK gate (ADR-050). If
|
||||
`security/ota_psk` is empty, the endpoint is open (dev mode); when
|
||||
set, requires `Authorization: Bearer <psk>`. Same threat model and
|
||||
permissive default as `/ota` itself.
|
||||
|
||||
### D5 — No partial-write atomicity gymnastics
|
||||
|
||||
We write `target_ip`, then `target_port`, then commit. If a power
|
||||
cut happens between `set_str` and `set_u16`, NVS keeps the previous
|
||||
`target_ip` and `target_port` (since uncommitted writes don't persist), which is safe
|
||||
behaviour. No need for a temp-key + rename dance.
|
||||
|
||||
## Files Touched
|
||||
|
||||
```
|
||||
firmware/esp32-csi-node/main/ota_update.c
|
||||
+ #include "nvs_config.h" (NVS_CFG_IP_MAX)
|
||||
+ parse_ip_port helper
|
||||
+ ota_set_target_handler
|
||||
+ URI registration in ota_update_start_server
|
||||
+ log line in startup banner
|
||||
docs/adr/ADR-115-fw-set-target-rest.md (this)
|
||||
```
|
||||
|
||||
Binary size delta: `esp32-csi-node.bin` 854 KB → 855 KB (+~1 KB).
|
||||
58 % of OTA partition free, plenty of margin.
|
||||
|
||||
## Verified Acceptance
|
||||
|
||||
Sequence on both live nodes (192.168.0.100, 192.168.0.101):
|
||||
|
||||
1. `bash scripts/ota-deploy.sh 192.168.0.100 192.168.0.101` →
|
||||
`running_partition` flipped on both (`ota_1↔ota_0`).
|
||||
2. `curl -X POST -d '192.168.0.103:5005' .../ota/set-target` →
|
||||
`{"status":"ok","target_ip":"192.168.0.103","target_port":5005,...}`
|
||||
on both nodes.
|
||||
3. After 25 s reboot+WiFi+CSI startup, sensing-server log:
|
||||
```
|
||||
keepalive: learned address for node 2 = 192.168.0.100:63940
|
||||
keepalive: ping -i 0.040 192.168.0.100 for node 2
|
||||
keepalive: learned address for node 1 = 192.168.0.101:63844
|
||||
keepalive: ping -i 0.040 192.168.0.101 for node 1
|
||||
```
|
||||
4. `GET /api/v1/sensing/latest` → live classification
|
||||
(`motion_level: active`, presence: true) with non-zero
|
||||
per-node features (`drift_score: 0.41`, `dominant_freq_hz: 6.3`,
|
||||
`mean_rssi: -57`).
|
||||
|
||||
End-to-end recovery time from broken stream → live CSI: **~3 min**
|
||||
(build 0, since FW was already built; flash 17 s; set-target +
|
||||
reboot ~25 s; first ping-driven CSI batch ~5 s).
|
||||
|
||||
## Open Items
|
||||
|
||||
* **Persist last-known-good target as fallback** — if a bad
|
||||
`target_ip` is committed (e.g. operator types Mac's old IP) the
|
||||
sensor goes silent until the next set-target call. A
|
||||
`csi_cfg/target_ip_lkg` snapshot updated on every successful
|
||||
keepalive-driven UDP send would let the sensor self-revert after
|
||||
N silent seconds. ~1 h FW.
|
||||
* **Track AP MAC alongside target** — ADR-108 / ADR-109 already
|
||||
invalidate gain-lock on AP change; same pattern could
|
||||
auto-invalidate target on subnet change (sensor sees its DHCP
|
||||
lease is on a different /24 than `target_ip` → blank target,
|
||||
refuse to send until operator confirms). ~1 h FW.
|
||||
* **REST endpoint to read current target** — `GET /ota/target`
|
||||
returning `{"target_ip":..., "target_port":...}`. Operator can
|
||||
diagnose "where is this sensor pointed?" without USB. ~15 min FW.
|
||||
|
||||
## References
|
||||
|
||||
* ADR-050 — OTA PSK auth that gates this endpoint
|
||||
* ADR-110 — TP-Link WISP deployment that triggered the Mac-IP move
|
||||
* ADR-108 — FW NVS persistence patterns (same namespace, same approach)
|
||||
* ADR-109 — `/ota/recalibrate` precedent (same handler shape, same
|
||||
reboot semantics)
|
||||
* `scripts/provision.py` — original USB-only NVS provisioning path
|
||||
that this ADR replaces for the network-config case
|
||||
|
|
@ -0,0 +1,224 @@
|
|||
# ADR-116 — WiFlow-v1 Supervised Pose Loader (Rust)
|
||||
|
||||
**Status**: Accepted (integration), needs fine-tune (output quality)
|
||||
**Date**: 2026-05-17
|
||||
**Scope**: `v2/crates/wifi-densepose-sensing-server/src/wiflow_v1.rs` (new,
|
||||
~430 lines incl. tests), `src/main.rs` (CLI flag + load + 5 tick-site hooks +
|
||||
`pose_current` keypoint path), `src/lib.rs` (module export).
|
||||
|
||||
## Context
|
||||
|
||||
Until this ADR `/api/v1/pose/*` always returned an empty `persons` array
|
||||
(ADR-105 — no synthetic fallback when no real model is loaded). HuggingFace
|
||||
`ruv/ruview/wiflow-v1/wiflow-v1.json` is the project's official supervised
|
||||
pose model (Apache-2.0, 974 KB, 92.9 % PCK@20 on its training set). It just
|
||||
sat on disk because there was no Rust loader — the only reference impl is
|
||||
`scripts/train-wiflow-supervised.js` (JS, training script, not deployment).
|
||||
|
||||
This ADR ports the JS inference path to Rust so sensing-server can serve
|
||||
real 17-keypoint COCO skeletons in production.
|
||||
|
||||
## What was wrong in the model file (and how this ADR works around it)
|
||||
|
||||
The HuggingFace JSON has an `architecture` field that **lies**:
|
||||
|
||||
```json
|
||||
"architecture": {
|
||||
"tcnChannels": [35, 256, 256, 192, 128],
|
||||
"tcnKernel": 7,
|
||||
"tcnDilations": [1, 2, 4, 8],
|
||||
"fcDims": [2560, 2048, 34]
|
||||
}
|
||||
```
|
||||
|
||||
That's the `full` scale (~7.7 M params). The file is actually the **lite**
|
||||
scale (186,946 params — confirmed by `totalParams` field). The exporter at
|
||||
`train-wiflow-supervised.js:1599` hardcodes the full-scale dict for every
|
||||
scale. The loader trusts `totalParams` and ignores `architecture`.
|
||||
|
||||
Lite topology (recovered from `SCALE.lite` at `train-wiflow-supervised.js:135`
|
||||
and verified by exact param count = 186,946):
|
||||
|
||||
* 2 TCN blocks (NOT 4), kernel = 3 (NOT 7), dilations [1, 2] (NOT [1,2,4,8])
|
||||
* TCN channels: 35 → 32 → 32
|
||||
* Per block: causal_conv → BN → ReLU → causal_conv → BN + residual → ReLU
|
||||
(1×1 projection on residual when in_ch ≠ out_ch, only block 0)
|
||||
* Flatten 32 × 20 = 640 → fc1 (640→256) → ReLU → fc2 (256→34)
|
||||
* Sigmoid on final 34-dim → 17 (x, y) keypoints in [0, 1]
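
The parameter count quoted above can be re-derived from the listed topology. The sketch below is a worked check only: the function names are hypothetical, and it assumes each BatchNorm stores just gamma and beta and that the 1×1 residual projection carries a bias, as described elsewhere in this ADR.

```rust
/// Parameters in one TCN block: two causal convs (weight + bias), two
/// BatchNorms (gamma + beta only), plus a 1x1 residual projection with bias
/// when the channel count changes.
pub fn tcn_block_params(in_ch: usize, out_ch: usize, k: usize) -> usize {
    let conv1 = in_ch * k * out_ch + out_ch;
    let conv2 = out_ch * k * out_ch + out_ch;
    let bn = 2 * (2 * out_ch);
    let res = if in_ch != out_ch { in_ch * out_ch + out_ch } else { 0 };
    conv1 + conv2 + bn + res
}

/// Total parameter count for the lite topology: 35 -> 32 -> 32 channels,
/// kernel 3, window of 20 frames, then fc1 (640 -> 256) and fc2 (256 -> 34).
pub fn lite_total_params() -> usize {
    let channels = [35usize, 32, 32];
    let (k, t) = (3, 20);
    let tcn: usize = channels
        .windows(2)
        .map(|w| tcn_block_params(w[0], w[1], k))
        .sum();
    let flat = channels[2] * t; // 32 * 20 = 640
    let fc1 = flat * 256 + 256;
    let fc2 = 256 * 34 + 34;
    tcn + fc1 + fc2
}
```

Block 0 contributes 7,776 parameters, block 1 contributes 6,336, fc1 164,096 and fc2 8,738, which sums to 186,946 and agrees with the `totalParams` field.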
|
||||
|
||||
## Decisions
|
||||
|
||||
### D1 — Pure-Rust forward pass, no new crates
|
||||
|
||||
`wiflow_v1.rs` is self-contained: Vec<f32> math by hand, inline base64
|
||||
decoder (50 LoC), no `ndarray`, no `candle`, no `base64` crate added. The
|
||||
inference is small enough (~250 K flops/forward) that hand-written Vec<f32>
|
||||
loops are clearer than pulling a tensor framework for one model.
|
||||
|
||||
### D2 — Weight stream order matches `collectParams()` in the JS trainer
|
||||
|
||||
```
|
||||
for each TCN block:
|
||||
conv1.weight (in_ch * k * out_ch f32s)
|
||||
conv1.bias (out_ch)
|
||||
bn1.gamma (out_ch)
|
||||
bn1.beta (out_ch)
|
||||
conv2.weight, conv2.bias, bn2.gamma, bn2.beta
|
||||
(if in_ch != out_ch: res.weight, res.bias)
|
||||
fc1.weight, fc1.bias, fc2.weight, fc2.bias
|
||||
```
|
||||
|
||||
Loader asserts the stream is fully consumed (`Cursor::remaining() == 0`)
|
||||
after fc2 — catches silent topology mismatches. Param count check
|
||||
(`totalParams == 186_946`) catches scale mismatch before unpacking.
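
A minimal sketch of the "stream must be fully consumed" check, assuming the decoded weights are available as a flat `&[f32]`. `WeightCursor` and `unpack_fc` are hypothetical names for illustration; the actual `Cursor` in `wiflow_v1.rs` may be shaped differently.

```rust
/// Sequential reader over a flat weight stream.
pub struct WeightCursor<'a> {
    data: &'a [f32],
    pos: usize,
}

impl<'a> WeightCursor<'a> {
    pub fn new(data: &'a [f32]) -> Self {
        Self { data, pos: 0 }
    }

    /// Take the next `n` values, failing if the stream is too short.
    pub fn take(&mut self, n: usize) -> Result<&'a [f32], String> {
        let end = self
            .pos
            .checked_add(n)
            .filter(|&e| e <= self.data.len())
            .ok_or_else(|| format!("weight stream too short at offset {}", self.pos))?;
        let data: &'a [f32] = self.data;
        let out = &data[self.pos..end];
        self.pos = end;
        Ok(out)
    }

    /// Number of values not yet consumed.
    pub fn remaining(&self) -> usize {
        self.data.len() - self.pos
    }
}

/// Unpack a single fully connected layer (weight then bias) and require that
/// nothing is left over, which is what exposes a topology mismatch.
pub fn unpack_fc(
    stream: &[f32],
    in_dim: usize,
    out_dim: usize,
) -> Result<(Vec<f32>, Vec<f32>), String> {
    let mut cur = WeightCursor::new(stream);
    let w = cur.take(in_dim * out_dim)?.to_vec();
    let b = cur.take(out_dim)?.to_vec();
    if cur.remaining() != 0 {
        return Err(format!("{} unread weights after last layer", cur.remaining()));
    }
    Ok((w, b))
}
```

Both failure directions are covered: a short stream fails inside `take`, and a long stream fails the final `remaining()` check.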
|
||||
|
||||
### D3 — BatchNorm uses per-window mean/var (matches JS impl)
|
||||
|
||||
`train-wiflow-supervised.js:770` computes mean/var across the T axis at
|
||||
inference time, ignoring `runMean/runVar` accumulated during training.
|
||||
Loader skips running stats entirely (only 2 params per channel stored:
|
||||
gamma + beta). This is unusual but consistent — the network was trained
|
||||
this way, so we infer this way.
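
For illustration, per-window normalisation over the T axis might look like the sketch below. The `[channels][t]` row-major layout, the function name, and the `eps` parameter are assumptions made here; the JS reference remains the source of truth for the exact arithmetic.

```rust
/// Normalise each channel using the mean and variance of the current window
/// (no running statistics), then apply the learned gamma and beta.
/// `x` is laid out row-major as [channels][t].
pub fn batchnorm_per_window(
    x: &mut [f32],
    channels: usize,
    t: usize,
    gamma: &[f32],
    beta: &[f32],
    eps: f32,
) {
    assert!(t > 0, "window must contain at least one frame");
    assert_eq!(x.len(), channels * t);
    assert_eq!(gamma.len(), channels);
    assert_eq!(beta.len(), channels);

    for c in 0..channels {
        let row = &mut x[c * t..(c + 1) * t];
        let mean = row.iter().sum::<f32>() / t as f32;
        // Population variance across the time axis of this window.
        let var = row.iter().map(|v| (v - mean) * (v - mean)).sum::<f32>() / t as f32;
        let inv_std = 1.0 / (var + eps).sqrt();
        for v in row.iter_mut() {
            *v = gamma[c] * (*v - mean) * inv_std + beta[c];
        }
    }
}
```

One consequence worth noting: a channel that is constant over the window collapses to `beta[c]`, regardless of its absolute amplitude.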
|
||||
|
||||
### D4 — Input prep: top-35 subcarriers by NBVI, raw amplitudes
|
||||
|
||||
`build_input_from_history` (in `wiflow_v1.rs`):
|
||||
|
||||
1. Take last 20 frames from any node's `AmpState.nbvi_history` (Vec<Vec<f64>>).
|
||||
2. Rank subcarriers by NBVI score (`α·σ/μ² + (1−α)·σ/μ`, α = 0.5) — same
|
||||
formula the classifier uses, but pick K = 35 (model input), not K = 12
|
||||
(classifier).
|
||||
3. Apply 25th-percentile dead-zone gate to skip guard tones / null bins.
|
||||
4. Build flat `[35 * 20]` row-major tensor of raw amplitudes (no z-score —
|
||||
training data wasn't normalised either, BN handles it).
|
||||
|
||||
If fewer than 20 frames or all subcarriers gated out → return `None`,
|
||||
inference skipped this tick, `pose_keypoints: None` in SensingUpdate.
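
The ranking and gating steps above can be sketched as follows. Several details are assumptions made for this example and are not confirmed by the source: that lower NBVI scores rank first, that the dead-zone gate is applied to the per-subcarrier mean amplitude, that the percentile uses a simple nearest-rank index, and that all frames have the same length. `select_subcarriers` is a hypothetical name.

```rust
/// NBVI score as given in the text: alpha*sigma/mu^2 + (1 - alpha)*sigma/mu.
pub fn nbvi_score(mean: f64, std: f64, alpha: f64) -> f64 {
    alpha * std / (mean * mean) + (1.0 - alpha) * std / mean
}

/// Pick up to `k` subcarrier indices from a window of amplitude frames.
/// Assumes every frame has the same number of subcarriers.
pub fn select_subcarriers(frames: &[Vec<f64>], k: usize, alpha: f64) -> Vec<usize> {
    let n_sub = frames.first().map_or(0, |f| f.len());
    if n_sub == 0 {
        return Vec::new();
    }
    let n = frames.len() as f64;

    // Per-subcarrier mean and population standard deviation over the window.
    let stats: Vec<(f64, f64)> = (0..n_sub)
        .map(|s| {
            let mean = frames.iter().map(|f| f[s]).sum::<f64>() / n;
            let var = frames.iter().map(|f| (f[s] - mean).powi(2)).sum::<f64>() / n;
            (mean, var.sqrt())
        })
        .collect();

    // Dead-zone gate: drop subcarriers whose mean is at or below the
    // 25th-percentile mean (assumed to be guard tones / null bins).
    let mut means: Vec<f64> = stats.iter().map(|s| s.0).collect();
    means.sort_by(|a, b| a.total_cmp(b));
    let gate = means[(means.len() - 1) / 4];

    let mut ranked: Vec<(usize, f64)> = stats
        .iter()
        .enumerate()
        .filter(|(_, (m, _))| m.is_finite() && *m > gate && *m > 0.0)
        .map(|(i, (m, s))| (i, nbvi_score(*m, *s, alpha)))
        .collect();

    // Assumption: the most stable subcarriers (lowest score) come first.
    ranked.sort_by(|a, b| a.1.total_cmp(&b.1));
    ranked.into_iter().take(k).map(|(i, _)| i).collect()
}
```

An empty result corresponds to the "all subcarriers gated out" case, where the caller would skip inference for the tick.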
|
||||
|
||||
### D5 — Per-tick inference, longest-history node
|
||||
|
||||
`run_wiflow_inference()` runs at every `broadcast_tick_task` step (5 sites total
|
||||
in `main.rs`):
|
||||
|
||||
* Picks the node with longest `nbvi_history` (ties broken by smallest
|
||||
node_id — deterministic).
|
||||
* Cost: ~250 K flops on the lite scale (BN + 2 small convs + 2 FCs).
|
||||
Measured 0.4 ms on the Mac M1 — well under the 100 ms tick budget.
|
||||
* Returns `Vec<[f64; 4]>` of length 17 (`[x, y, z=0, conf=1]`).
|
||||
|
||||
### D6 — `pose_current` reads `pose_keypoints` directly
|
||||
|
||||
Pre-ADR: `/api/v1/pose/current` read `latest_update.persons`. The tracker
|
||||
populated `persons` from `derive_pose_from_sensing` (signal-derived,
|
||||
synthetic) regardless of `model_loaded`. Loader-output `pose_keypoints`
|
||||
was only read by the WS broadcaster.
|
||||
|
||||
This ADR makes `pose_current` prefer `pose_keypoints` when 17-len and
|
||||
present, building a single `PersonDetection` with COCO joint names. Falls
|
||||
back to tracker `persons` only when `pose_keypoints` is `None` (cold
|
||||
start). Keeps the ADR-105 honesty gate: empty array if `model_loaded =
|
||||
false`.
|
||||
|
||||
### D7 — Honest about output quality
|
||||
|
||||
The loaded model produces **17 keypoints**, but the **numerical values
|
||||
are saturated** (most x/y near 0 or 1) — sigmoid extremes meaning the
|
||||
network has no learned response to our specific deployment's CSI
|
||||
distribution. This is expected: the model was trained on a different
|
||||
ESP32 setup, different room, different person, with camera ground truth
|
||||
we don't have here. **The integration is correct; the model needs
|
||||
deployment-specific fine-tune to produce useful keypoints.**
|
||||
|
||||
Two paths to usable output, left as follow-ups (Pack E):
|
||||
|
||||
1. **Apply `node-1.json` / `node-2.json` LoRA adapters** (ADR-117 candidate)
|
||||
— they're shipped alongside `wiflow-v1.json` in the same HuggingFace
|
||||
repo, rank=8, alpha=16, target the encoder + task heads. Loader stub +
|
||||
forward fold ~2 h.
|
||||
2. **Re-train via `scripts/train-wiflow-supervised.js` with new ground-
|
||||
truth capture** (~30 min capture + 19 min training per the model card).
|
||||
Operator-side work.
|
||||
|
||||
## Files Touched
|
||||
|
||||
```
|
||||
v2/crates/wifi-densepose-sensing-server/src/wiflow_v1.rs (new, ~430 LoC)
|
||||
v2/crates/wifi-densepose-sensing-server/src/lib.rs (+ pub mod)
|
||||
v2/crates/wifi-densepose-sensing-server/src/main.rs:
|
||||
+ use wiflow_v1::{self, WiflowModel}
|
||||
+ Args.wiflow_model: Option<PathBuf>
|
||||
+ static WIFLOW_MODEL: OnceLock<Option<WiflowModel>>
|
||||
+ main() — load before existing --model/--load-rvf path
|
||||
+ fn run_wiflow_inference() -> Option<Vec<[f64;4]>> (right after csi_keepalive_task)
|
||||
+ 5 × `pose_keypoints: run_wiflow_inference()` at SensingUpdate sites
|
||||
+ pose_current — prefer pose_keypoints when 17-len; fall back to persons
|
||||
docs/adr/ADR-116-wiflow-v1-supervised-pose-loader.md (this)
|
||||
```
|
||||
|
||||
Binary size delta: 3.0 MB → 3.1 MB.
|
||||
|
||||
## Verified Acceptance
|
||||
|
||||
Live test on the operator's TP-Link deployment (.103, both nodes
|
||||
192.168.0.100/.101):
|
||||
|
||||
```
|
||||
$ ./target/release/sensing-server --source esp32 --csi-keepalive-pps 25 \
|
||||
--wiflow-model data/models/ruview/wiflow-v1/wiflow-v1.json
|
||||
...
|
||||
ADR-116 wiflow-v1 loaded from data/models/ruview/wiflow-v1/wiflow-v1.json
|
||||
(lite scale, 186946 params)
|
||||
keepalive: learned address for node 2 = 192.168.0.100:63940
|
||||
keepalive: learned address for node 1 = 192.168.0.101:63844
|
||||
|
||||
$ curl :8080/api/v1/info → "pose_estimation": true
|
||||
$ curl :8080/api/v1/pose/stats → "model_loaded": true, frames_processed: 2699
|
||||
$ curl :8080/api/v1/pose/current
|
||||
{ persons: [{id: 1, keypoints: [17 × {name, x, y, z, confidence}], ...}],
|
||||
total_persons: 1, model_loaded: true }
|
||||
```
|
||||
|
||||
End-to-end: model on disk → loader → forward pass → 17 keypoints → REST &
|
||||
WS payload. UI's pose canvas (un-gated by ADR-105 D4) now draws what the
|
||||
model emits.
|
||||
|
||||
## Cargo tests
|
||||
|
||||
`wiflow_v1` ships 3 unit tests covering the most-likely-to-rot bits:
|
||||
|
||||
* `base64_round_trip_alphabet` — alphabet, padding, whitespace tolerance
|
||||
* `sigmoid_bounds` — numerical stability at ±10 inputs
|
||||
* `build_input_zero_history` — empty-history early return
|
||||
|
||||
`cargo test -p wifi-densepose-sensing-server wiflow_v1` → 3 passed.
|
||||
|
||||
## Open Items
|
||||
|
||||
* **Pack E.1 — LoRA adapter loader.** `node-1.json` / `node-2.json` rank-8
|
||||
adapters from the same HF repo, ~21 KB each. The trainer encodes them
|
||||
in the same custom format as `wiflow-v1.json` (different `format` tag),
|
||||
so the loader plumbing is small. ~2 h.
|
||||
* **Pack E.2 — Camera-supervised retraining for this room.** Run
|
||||
`scripts/collect-ground-truth.py` against this Mac's webcam +
|
||||
TP-Link/.100/.101 CSI for 5 min, then
|
||||
`scripts/train-wiflow-supervised.js --scale lite`. Should drop sigmoid saturation and produce
|
||||
spatially-coherent keypoints. ~1 h operator + 19 min train.
|
||||
* **Inference rate-limiting.** Currently runs every tick (10 fps). If
|
||||
multiple WS clients connect, each tick computes once and the result is
|
||||
reused — fine. If model size grows to small/medium scale (~200K/800K
|
||||
params), should cache the result per tick instead of computing per-client.
|
||||
* **Per-node pose tracks.** Right now a single virtual person is emitted;
|
||||
the broadcaster places it in `zone_1` with a fixed bbox. If/when LoRA
|
||||
adapters disambiguate per-node viewpoints, fan out to one
|
||||
`PersonDetection` per node (left/right of the room).
|
||||
|
||||
## References
|
||||
|
||||
* `scripts/train-wiflow-supervised.js` — JS reference implementation
|
||||
* HuggingFace `ruv/ruview` — model file + LoRA adapters (Apache-2.0)
|
||||
* ADR-079 — camera ground-truth training pipeline (the trainer this
|
||||
loader was built against)
|
||||
* ADR-105 — "no synthetic data in production runtime"; this ADR keeps
|
||||
the gate but feeds it real model output
|
||||
* ADR-115 — `/ota/set-target` (the prerequisite that got the CSI stream
|
||||
flowing again so this loader has data to consume)
|
||||
|
|
@ -0,0 +1,245 @@
|
|||
# ADR-117 — Process Hygiene, Pose Path Honesty, and Audit Follow-ups
|
||||
|
||||
**Status**: Accepted
|
||||
**Date**: 2026-05-17
|
||||
**Scope**: `v2/crates/wifi-densepose-sensing-server/src/{main.rs,wiflow_v1.rs}`,
|
||||
`v2/crates/wifi-densepose-sensing-server/tests/multi_node_test.rs`,
|
||||
`ui/index.html`, `ui/components/LiveDemoTab.js`, `CHECKLIST.md`,
|
||||
`docs/adr/ADR-115-fw-set-target-rest.md`,
|
||||
`docs/references/{espectre-gap-analysis.md,ota-pipeline.md}`.
|
||||
|
||||
## Context
|
||||
|
||||
A deep audit pass (4 parallel auditors covering sensors, server, UI, docs)
|
||||
surfaced two operational fires and a stack of correctness/honesty issues
|
||||
that had accumulated across ADR-100..116. This ADR collects the immediate
|
||||
fixes.
|
||||
|
||||
### Fire 1 — Runaway ping zombies
|
||||
|
||||
Live `ps` showed **250+ `/sbin/ping -i 0.040` processes** on the Mac, most
|
||||
parented to PID 1 (orphans from prior server lifetimes) and **8 fresh
|
||||
pings to `127.0.0.1` parented to the current server**.
|
||||
|
||||
Root cause: a `cargo test --workspace` run sent UDP packets to
|
||||
`127.0.0.1:5005` from `tests/multi_node_test.rs::test_multi_node_udp_send`
|
||||
while the production server was bound to `0.0.0.0:5005`. The integration
|
||||
test injects 55 synthetic frames with `node_ids = [1, 2, 3, 5, 7]`. Each
|
||||
distinct `node_id` byte in a CSI magic packet triggered a fresh entry in
|
||||
`NODE_ADDRS`, and the keepalive task spawned exactly one `ping` child
|
||||
per entry. Combined with macOS not propagating parent death to children
|
||||
(killed servers leave ping orphans), the count accumulated rapidly.
|
||||
|
||||
### Fire 2 — Per-node feature divergence on node 2
|
||||
|
||||
Node 2 (192.168.0.100) showed `dominant_freq_hz: 0.05` vs node 1 (.101)
|
||||
`6.30` — a 126× split in the same room. Pointed to stale gain-lock on
|
||||
node 2 from a different AP/orientation. Cleared via
|
||||
`POST /ota/recalibrate` (ADR-109) — sensor re-runs the 300-packet
|
||||
calibration sampler at next boot.
|
||||
|
||||
### Correctness issues (server auditor)
|
||||
|
||||
* `run_wiflow_inference` hardcoded keypoint `confidence: 1.0` — lied about
|
||||
data quality. Real signal: the runtime classifier's `confidence`.
|
||||
* `wiflow_v1.rs` zero-pad path duplicated subcarrier index 0 instead of
|
||||
zero-padding when < 35 finite subcarriers — comment said "zero the
|
||||
rest", code did the opposite.
|
||||
* `nbvi_history.clone()` cloned the entire 600-deep VecDeque (≈270 KB) on
|
||||
every inference, while only the last 20 frames are used.
|
||||
* `run_wiflow_inference` picked the node with longest history regardless
|
||||
of recency — stale data from a dead sensor would keep producing pose.
|
||||
|
||||
### UI issues (UI auditor)
|
||||
|
||||
* `/` served a static API-index HTML page; users typing `localhost:8080`
|
||||
never reached the SPA at `/ui/index.html`.
|
||||
* `<section id="sensing">` was empty; `app.js::SensingTab.mount` queried
|
||||
`#sensing-container` and rendered into nothing — the Sensing tab was
|
||||
permanently blank.
|
||||
* `LiveDemoTab.fetchModels` unconditionally overwrote `activeModelId =
|
||||
'wiflow-v1'` whenever `/api/v1/info` reported `pose_estimation: true`,
|
||||
even when the operator had just loaded an RVF model. Dropdown silently
|
||||
flipped back to WiFlow on every refresh.
|
||||
|
||||
### Docs issues (docs auditor)
|
||||
|
||||
* `CHECKLIST.md` header: `head c827cde6`, count `43 Done` — stale
|
||||
by 4 commits and 2 ADRs.
|
||||
* `ADR-115 References` cited "ADR-100 — TP-Link WISP" (it's ADR-110)
|
||||
and "ADR-108 / ADR-111" (ADR-111 doesn't exist — folded into ADR-109).
|
||||
* `espectre-gap-analysis.md::Still open` table listed 8 items as open
|
||||
that had already shipped (ADR-104, ADR-109, ADR-112, ADR-114).
|
||||
* `ota-pipeline.md` documented OTA flashing but never mentioned
|
||||
`/ota/set-target` (ADR-115) or `/ota/recalibrate` (ADR-109) — operator
|
||||
hitting the "Mac moved networks" scenario wouldn't find the recovery
|
||||
path.
|
||||
|
||||
## Decisions
|
||||
|
||||
### D1 — UDP receiver filters loopback before NODE_ADDRS
|
||||
|
||||
`main.rs::udp_receiver_task` now rejects loopback, unspecified, multicast,
|
||||
and broadcast source addresses before inserting into `NODE_ADDRS`. Packets
|
||||
still parse and feed the classifier — only the keepalive registration
|
||||
is gated. Defends against any local sender (tests, simulators, future
|
||||
tooling) accidentally driving ping spawn.
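
A minimal sketch of the source-address gate described in D1, using only standard-library predicates. The function name `is_keepalive_eligible` is hypothetical, and the IPv6 branch is an assumption (IPv6 has no broadcast address, so only the other three checks apply there).

```rust
use std::net::{IpAddr, SocketAddr};

/// Returns true when a packet's source address may be registered for
/// keepalive pings. The packet itself can still be parsed and classified;
/// only the registration is gated.
pub fn is_keepalive_eligible(src: &SocketAddr) -> bool {
    match src.ip() {
        IpAddr::V4(v4) => {
            !(v4.is_loopback()
                || v4.is_unspecified()
                || v4.is_multicast()
                || v4.is_broadcast())
        }
        IpAddr::V6(v6) => !(v6.is_loopback() || v6.is_unspecified() || v6.is_multicast()),
    }
}
```

A caller would check this before inserting into the address map and continue processing the frame either way.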
|
||||
|
||||
### D2 — Keepalive pre-reap at startup
|
||||
|
||||
`main.rs::csi_keepalive_task` runs `pkill -f "/sbin/ping -i 0.040"` and
|
||||
`pkill -f "/usr/bin/ping -i 0.040"` once at task entry. Cleans up
|
||||
orphans from prior server lifetimes without operator action. Cost: two
|
||||
`pkill` invocations at startup, ~10 ms total. Idempotent.
|
||||
|
||||
### D3 — Real keypoint confidence
|
||||
|
||||
`run_wiflow_inference` now stamps the runtime classifier confidence (from
|
||||
`amp_classify_from_latest`) onto all 17 keypoints (was `1.0` hardcoded).
|
||||
The lite-scale wiflow has no per-keypoint uncertainty head; this signal
|
||||
is the most honest stand-in. Currently reading **0.037** on the live
|
||||
deployment — accurate reflection of "wiflow output is saturated, don't
|
||||
trust these coords".
|
||||
|
||||
### D4 — Zero-pad fix in wiflow_v1
|
||||
|
||||
`build_input_from_history` now pushes `None` into `picks` for dead slots
|
||||
and writes `0.0f32` into those rows. Prior code pushed `0usize` → all
|
||||
unused channels read subcarrier-0 amplitudes, feeding the network 35×
|
||||
the same signal.
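
The difference between padding with `None` and padding with index 0 can be sketched as below. `pad_picks` and `build_rows` are hypothetical helpers, and the `[k][t]` row-major layout is an assumption made for this example.

```rust
/// Pad the selected subcarrier list to exactly `k` slots, using None for
/// slots that have no live subcarrier (rather than repeating index 0).
pub fn pad_picks(selected: &[usize], k: usize) -> Vec<Option<usize>> {
    let mut picks: Vec<Option<usize>> = selected.iter().take(k).map(|&s| Some(s)).collect();
    picks.resize(k, None);
    picks
}

/// Build a flat [k][t] row-major tensor. Rows whose pick is None stay 0.0.
pub fn build_rows(frames: &[Vec<f32>], picks: &[Option<usize>]) -> Vec<f32> {
    let t = frames.len();
    let mut out = vec![0.0f32; picks.len() * t];
    for (row, pick) in picks.iter().enumerate() {
        if let Some(sub) = pick {
            for (col, frame) in frames.iter().enumerate() {
                // Out-of-range indices also fall back to 0.0 instead of panicking.
                out[row * t + col] = frame.get(*sub).copied().unwrap_or(0.0);
            }
        }
    }
    out
}
```

Using `Option<usize>` makes the dead-slot case explicit in the type, so an accidental `0usize` default cannot silently alias a real subcarrier.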
|
||||
|
||||
### D5 — Tail-clone optimisation
|
||||
|
||||
`run_wiflow_inference` snapshots only the last 20 entries from
|
||||
`nbvi_history` while holding the lock, not the full 600-deep deque. Lock
|
||||
hold time dropped from ~µs * 600 to ~µs * 20 per tick.
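
One way to snapshot only the tail while holding the lock is sketched below. The `Mutex<VecDeque<Vec<f64>>>` shape and the name `snapshot_tail` are assumptions for illustration; the real state lives inside the per-node `AmpState`.

```rust
use std::collections::VecDeque;
use std::sync::Mutex;

/// Clone only the last `n` frames of the history while the lock is held.
/// The guard is dropped when this function returns, so downstream work
/// (ranking, inference) runs without the lock.
pub fn snapshot_tail(history: &Mutex<VecDeque<Vec<f64>>>, n: usize) -> Vec<Vec<f64>> {
    // Recover the data even if another thread panicked while holding the lock.
    let guard = history.lock().unwrap_or_else(|e| e.into_inner());
    let skip = guard.len().saturating_sub(n);
    let tail: Vec<Vec<f64>> = guard.iter().skip(skip).cloned().collect();
    tail
}
```

With a 600-deep history and `n = 20`, only 20 inner vectors are cloned per call, and histories shorter than `n` are returned whole.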
|
||||
|
||||
### D6 — `/` → `/ui/index.html` permanent redirect
|
||||
|
||||
`main.rs::root_redirect` returns HTTP 308. API-index HTML moves to `/api`
|
||||
for operators / curl debugging. Users typing the bare host land on the
|
||||
SPA.
|
||||
|
||||
### D7 — Sensing tab container restored
|
||||
|
||||
`ui/index.html`: `<section id="sensing">` now contains `<div
|
||||
id="sensing-container">` matching `app.js::SensingTab.mount`'s query
|
||||
selector.
|
||||
|
||||
### D8 — LiveDemoTab WiFlow inject only when no model active
|
||||
|
||||
`LiveDemoTab.fetchModels` wraps the `activeModelId = 'wiflow-v1'`
|
||||
assignment in `if (!this.modelState.activeModelId)`. RVF model loads
|
||||
keep their displayed name.
|
||||
|
||||
### D9 — Multi-node test guards against external :5005 owner
|
||||
|
||||
`tests/multi_node_test.rs::test_multi_node_udp_send` probes
|
||||
`127.0.0.1:5005` with a transient bind; if the bind fails, the test
|
||||
skips its UDP send rather than polluting whoever owns the port. Belt-
|
||||
and-braces with the server-side filter (D1).
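
The transient-bind probe in D9 could be sketched like this. `port_is_free` is a hypothetical helper name, and the sketch assumes the default socket options of the Rust standard library (no address-reuse flags set), under which a second bind to an owned UDP port fails.

```rust
use std::net::UdpSocket;

/// Try to bind the given UDP address briefly. A successful bind means nobody
/// else owns the port; the probe socket is dropped (and the port released)
/// as soon as this function returns.
pub fn port_is_free(addr: &str) -> bool {
    UdpSocket::bind(addr).is_ok()
}
```

A test could call `port_is_free("127.0.0.1:5005")` first and return early when it is false, leaving the running server's traffic untouched. There is an inherent race between the probe and the later send, which is why the server-side filter remains the primary defence.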
|
||||
|
||||
### D10 — Docs sweep
|
||||
|
||||
* `CHECKLIST.md`: header to `head 0ec1e4b0`, count to **47 Done**,
|
||||
explicit note that ADR-111 is intentionally absent. Reference table
|
||||
range to `001-117`.
|
||||
* `ADR-115`: "ADR-100" → "ADR-110", "ADR-108 / ADR-111" → "ADR-108 / ADR-109".
|
||||
* `espectre-gap-analysis.md::Still open` table: 8 shipped items marked
|
||||
✓ Done with commit hashes; remaining items annotated Deferred with
|
||||
reason or carry a Pack assignment. New items 15-16 added (ADR-115,
|
||||
ADR-117).
|
||||
* `ota-pipeline.md`: new "Operator REST endpoints" section listing
|
||||
`/ota/status`, `/ota`, `/ota/recalibrate`, `/ota/set-target` with
|
||||
curl examples both unauthed and bearer-token authed.
|
||||
|
||||
## Files Touched
|
||||
|
||||
```
|
||||
v2/crates/wifi-densepose-sensing-server/src/main.rs:
|
||||
+ udp_receiver_task: loopback/unspecified/multicast/broadcast filter (D1)
|
||||
+ csi_keepalive_task: pre-reap pkill at task entry (D2)
|
||||
+ run_wiflow_inference: real classifier confidence (D3) + tail clone (D5)
|
||||
+ Router: GET / → root_redirect (308), GET /api → info_page (D6)
|
||||
+ info_page: expanded with new endpoints listed
|
||||
v2/crates/wifi-densepose-sensing-server/src/wiflow_v1.rs:
|
||||
+ build_input_from_history: None-pad → 0.0f32, not subcarrier-0 dup (D4)
|
||||
v2/crates/wifi-densepose-sensing-server/tests/multi_node_test.rs:
|
||||
+ ADR-117 guard: skip if 127.0.0.1:5005 is owned (D9)
|
||||
ui/index.html:
|
||||
+ <div id="sensing-container"> inside #sensing section (D7)
|
||||
ui/components/LiveDemoTab.js:
|
||||
+ fetchModels: guard wiflow inject behind !activeModelId (D8)
|
||||
CHECKLIST.md:
|
||||
+ header refresh + ADR range correction (D10)
|
||||
docs/adr/ADR-115-fw-set-target-rest.md:
|
||||
+ typo fixes ADR-100 → ADR-110, ADR-111 → ADR-109 (D10)
|
||||
docs/references/espectre-gap-analysis.md:
|
||||
+ Still-open table refresh — 8 items ✓ Done, 14/15 reclassified (D10)
|
||||
docs/references/ota-pipeline.md:
|
||||
+ Operator REST endpoints section (D10)
|
||||
docs/adr/ADR-117-process-hygiene-and-audit-followups.md (this)
|
||||
```
|
||||
|
||||
Binary size delta: 3.0 MB → 3.1 MB (no significant change).
|
||||
|
||||
## Verified Acceptance
|
||||
|
||||
After restart with the new binary (PID 97903):
|
||||
|
||||
```
|
||||
$ ps -axo pid,ppid,command | grep "ping.*-i.*0\.040" | grep -v grep | wc -l
|
||||
2
|
||||
$ ps -axo pid,ppid,command | grep "ping.*-i.*0\.040" | grep -v grep
|
||||
97921 97903 /sbin/ping -i 0.040 192.168.0.100
|
||||
97922 97903 /sbin/ping -i 0.040 192.168.0.101
|
||||
```
|
||||
|
||||
Exactly two ping children — one per real sensor — parented to the
|
||||
running server. No 127.0.0.1, no orphans.
|
||||
|
||||
```
|
||||
$ curl -sI http://localhost:8080/
|
||||
HTTP/1.1 308 Permanent Redirect
|
||||
location: /ui/index.html
|
||||
|
||||
$ curl http://localhost:8080/api/v1/pose/current | jq '.persons[0].keypoints[0]'
|
||||
{ "name": "nose", "x": 0.999, "y": 0.0, "z": 0, "confidence": 0.037 }
|
||||
```
|
||||
|
||||
`confidence: 0.037` — real runtime classifier signal, not hardcoded 1.0.
|
||||
`cargo test --workspace` (release) reports 13 passed / 0 failed / 5 ignored.
|
||||
|
||||
## Out of Scope (intentional non-fixes)
|
||||
|
||||
* **Health endpoint fake constants** (cpu:2.5, mem:1.8, disk:15.0) —
|
||||
flagged by the auditor as critical. Replacing with `sysinfo` crate
|
||||
would add a dependency for low-value telemetry; the orchestrator
|
||||
readiness probe today is only used by Docker compose, not Kubernetes
|
||||
liveness. Deferred. Real fix: `/health/ready` only reports
|
||||
`model_loaded` + `node_count > 0`.
|
||||
* **`derive_pose_from_sensing` call-site cleanup** — function returns
|
||||
`Vec::new()` since ADR-105; removing the 5 call sites is a no-op
|
||||
refactor with no behaviour change. Skipped to keep diff focused.
|
||||
* **`tracker_bridge:10` unused imports warning** — module is integrated
|
||||
via `tracker_bridge::tracker_update` (4 callers), the import list
|
||||
just has dead names. Cosmetic. `cargo fix` deferred.
|
||||
* **CLI training flags** (`--train`, `--dataset`, `--epochs`,
|
||||
`--checkpoint-dir`, `--pretrain*`) — silent no-ops; training is via
|
||||
REST. Removing the flags would break any operator script that passes
|
||||
them harmlessly. Deferred to a separate flag-audit pass.
|
||||
* **OTA PSK provisioning** — operator workflow change, not a code
|
||||
change. Note added to ADR-115 open items. Operator can set
|
||||
`security/ota_psk` via USB provision.py whenever convenient.
|
||||
|
||||
## References
|
||||
|
||||
* ADR-105 — no synthetic data in production runtime; this ADR extends
|
||||
the principle to keypoint confidence (was synthesised, now real).
|
||||
* ADR-109 — gain-lock recalibrate REST; same endpoint used to fix node 2
|
||||
feature divergence as part of this audit pass.
|
||||
* ADR-115 — set-target REST; typos fixed here.
|
||||
* ADR-116 — WiFlow-v1 loader; the auditor's findings landed against
|
||||
this ADR's just-shipped integration.
|
||||
* `tests/multi_node_test.rs` — the test whose accidental cross-talk with
|
||||
the production server triggered the 250+ ping zombie incident.
|
||||
|
|
@ -0,0 +1,193 @@
|
|||
# ADR-118 — Feature Decorrelation + Multi-node Extractor (Adaptive Classifier)
|
||||
|
||||
**Status**: Accepted
|
||||
**Date**: 2026-05-18
|
||||
**Scope**: `v2/crates/wifi-densepose-sensing-server/src/adaptive_classifier.rs`
|
||||
(`N_FEATURES`, `features_from_frame`, `features_from_runtime`), call sites in
|
||||
`main.rs::adaptive_override`, `main.rs:~6200` per-node loop, and
|
||||
`csi.rs::adaptive_override`.
|
||||
|
||||
## Context
|
||||
|
||||
After ADR-117 the adaptive_classifier produced **40.4% accuracy** on a
|
||||
2-node training set of 7 recordings (52,857 frames). Adding 4 more sensors and
recording the same 7 activities at 6 nodes increased the set to **151,329 frames
|
||||
(2.9× more data)** but accuracy only moved to **44.4%** (+4 pts).
|
||||
|
||||
Diagnostic Python audit (run against both datasets) found three architectural
|
||||
defects in the feature pipeline, not the data:
|
||||
|
||||
| Defect | 2-node set | 6-node set |
|
||||
|---|---|---|
|
||||
| Constant feature (`amp_min = 0.00` across all frames — HT20 null subcarrier) | ✗ dead | ✗ dead |
|
||||
| Multicollinear pairs `|r| > 0.85` | 17 pairs | 21 pairs |
|
||||
| Top F-stat vs accuracy | F=1,516, acc 40.4% | F=15,497, acc 44.4% |
|
||||
|
||||
The 10× higher F-stat on 6-node data confirmed the **signal was getting
|
||||
stronger** but the classifier couldn't extract it. Root cause:
|
||||
`features_from_frame` used only `nodes.first()` — 5 of 6 sensors carried
|
||||
**zero weight** in the feature vector. Adding nodes physically helped, but
|
||||
only via the small contribution to the 7 aggregated server-level features.
|
||||
|
||||
Within a single node, the 8 subcarrier scalars were 90-99% correlated with
|
||||
each other (mean ≈ std ≈ max ≈ p25/75/90 — they all measure "amplitude
|
||||
level"). And the 4 energy features (variance, motion_band_power,
|
||||
breathing_band_power, spectral_power) were 87-99% correlated. The 15-feature
|
||||
space had effective rank ≈ 5.
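
The multicollinearity count in the table above can be reproduced with a plain Pearson correlation over feature columns. The following is a minimal Rust sketch, not the Python audit script itself; the names `pearson` and `collinear_pairs` and the one-`Vec`-per-feature input layout are assumptions made for illustration.

```rust
/// Pearson correlation between two equally long feature columns.
/// Returns 0.0 when either column is constant (zero variance),
/// which is how a dead feature such as `amp_min` would show up.
fn pearson(x: &[f64], y: &[f64]) -> f64 {
    assert_eq!(x.len(), y.len());
    let n = x.len() as f64;
    if n == 0.0 {
        return 0.0;
    }
    let mx = x.iter().sum::<f64>() / n;
    let my = y.iter().sum::<f64>() / n;
    let (mut sxy, mut sxx, mut syy) = (0.0, 0.0, 0.0);
    for (a, b) in x.iter().zip(y) {
        let (dx, dy) = (a - mx, b - my);
        sxy += dx * dy;
        sxx += dx * dx;
        syy += dy * dy;
    }
    if sxx == 0.0 || syy == 0.0 {
        0.0
    } else {
        sxy / (sxx * syy).sqrt()
    }
}

/// Lists index pairs (i, j), i < j, whose |r| exceeds `threshold`.
/// `columns[k]` holds feature k across all frames.
fn collinear_pairs(columns: &[Vec<f64>], threshold: f64) -> Vec<(usize, usize)> {
    let mut pairs = Vec::new();
    for i in 0..columns.len() {
        for j in (i + 1)..columns.len() {
            if pearson(&columns[i], &columns[j]).abs() > threshold {
                pairs.push((i, j));
            }
        }
    }
    pairs
}
```

Calling `collinear_pairs(&columns, 0.85).len()` on the 15 feature columns would give the pair count reported per dataset, assuming the audit used the same threshold on the same columns.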
|
||||
|
||||
## Decisions
|
||||
|
||||
### D1 — Drop the dead and redundant features
|
||||
|
||||
* **Dropped**: `amp_min` (constant 0), `amp_range = max − min ≡ max`
|
||||
(collinear), `motion_band_power`/`breathing_band_power`/`spectral_power`
|
||||
(all r > 0.95 with `variance`), `amp_mean`/`amp_max`/`amp_iqr`/`amp_kurt`
|
||||
(all r > 0.90 with `amp_std`).
|
||||
* **Kept (globally)**: `variance`, `mean_rssi`, `dominant_freq_hz`,
|
||||
`change_points` — the 4 server-level features that retained marginal
|
||||
independence.
|
||||
|
||||
### D2 — Per-node features × all 6 nodes
|
||||
|
||||
For each node id `N ∈ {1..6}`, extract 3 features:
|
||||
|
||||
* `amp_std` — multipath spread (motion-sensitive)
|
||||
* `amp_skew` — distribution asymmetry (sensitive to dominant scatterer
|
||||
position relative to this sensor)
|
||||
* `amp_entropy` — spectral diversity (normalised to [0, 1])
|
||||
|
||||
Total: `4 + 6 × 3 = 22 features`. Each node's contribution lives at a fixed
|
||||
offset (`base = 4 + (node_id - 1) × 3`) so 5 of 6 sensors are no longer
|
||||
discarded.
|
||||
|
||||
Missing-node features are zero-padded; z-score normalisation (already in
|
||||
the model from ADR-117 era) treats them consistently across train and
|
||||
classify.
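
A minimal sketch of the D2 layout, assuming the constants listed under Files Touched. `build_features` is a hypothetical stand-in for `features_from_runtime` that takes the 4 global features directly; the population-moment skewness and the 16-bin histogram entropy are assumptions, since the ADR does not spell out the estimators.

```rust
const N_GLOBAL_FEATURES: usize = 4;
const N_PER_NODE_FEATURES: usize = 3;
const MAX_NODES: usize = 6;
const N_FEATURES: usize = N_GLOBAL_FEATURES + MAX_NODES * N_PER_NODE_FEATURES; // 22

/// (std, skew, normalised entropy) of one node's amplitude vector.
/// The 16-bin histogram used for the entropy is an assumption of this sketch.
fn per_node_stats(amps: &[f64]) -> [f64; 3] {
    if amps.len() < 2 {
        return [0.0; 3];
    }
    let n = amps.len() as f64;
    let mean = amps.iter().sum::<f64>() / n;
    let std = (amps.iter().map(|a| (a - mean).powi(2)).sum::<f64>() / n).sqrt();
    if std == 0.0 {
        return [0.0; 3]; // constant vector: no spread, no shape
    }
    let skew = amps.iter().map(|a| ((a - mean) / std).powi(3)).sum::<f64>() / n;

    const BINS: usize = 16;
    let lo = amps.iter().cloned().fold(f64::INFINITY, f64::min);
    let hi = amps.iter().cloned().fold(f64::NEG_INFINITY, f64::max);
    let mut hist = [0usize; BINS];
    for a in amps {
        let idx = (((a - lo) / (hi - lo)) * BINS as f64) as usize;
        hist[idx.min(BINS - 1)] += 1;
    }
    let entropy: f64 = hist
        .iter()
        .filter(|&&c| c > 0)
        .map(|&c| {
            let p = c as f64 / n;
            -p * p.ln()
        })
        .sum();
    // Dividing by ln(BINS) maps the entropy into [0, 1].
    [std, skew, entropy / (BINS as f64).ln()]
}

/// Places each node's 3 scalars at `4 + (node_id - 1) * 3`.
/// Nodes that are absent (or out of range) stay zero-padded.
fn build_features(global: [f64; 4], per_node_amps: &[(u8, &[f64])]) -> [f64; N_FEATURES] {
    let mut out = [0.0; N_FEATURES];
    out[..N_GLOBAL_FEATURES].copy_from_slice(&global);
    for &(node_id, amps) in per_node_amps {
        let id = node_id as usize;
        if id == 0 || id > MAX_NODES {
            continue;
        }
        let base = N_GLOBAL_FEATURES + (id - 1) * N_PER_NODE_FEATURES;
        out[base..base + N_PER_NODE_FEATURES].copy_from_slice(&per_node_stats(amps));
    }
    out
}
```

With only node 2 reporting, slots 7..10 are populated and every other per-node slot remains 0.0, which is the zero-padding behaviour described above.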
|
||||
|
||||
### D3 — `features_from_runtime` signature change
|
||||
|
||||
Old:
|
||||
|
||||
```rust
|
||||
pub fn features_from_runtime(feat: &Value, amps: &[f64]) -> [f64; 15]
|
||||
```
|
||||
|
||||
New:
|
||||
|
||||
```rust
|
||||
pub fn features_from_runtime(
|
||||
feat: &Value,
|
||||
per_node_amps: &[(u8, &[f64])],
|
||||
) -> [f64; 22]
|
||||
```
|
||||
|
||||
Three call sites updated:
|
||||
|
||||
1. `main.rs::adaptive_override` (global state path) — new helper
|
||||
`current_per_node_amps()` reads `AMP_HIST.nbvi_history.back()` for each
|
||||
active node, then passes the slice.
|
||||
2. `main.rs:~6200` (per-node loop in the broadcast tick task) — same
|
||||
helper, called once per tick.
|
||||
3. `csi.rs::adaptive_override` (legacy, no live callers) — degraded to
|
||||
single-node fallback with `[(1u8, amps)]`; documented as emergency only.
|
||||
|
||||
### D4 — Old 15-feature model file is incompatible
|
||||
|
||||
`AdaptiveModel` serializes `[f64; N_FEATURES]` arrays. Loading a 15-array
|
||||
into a 22-slot field fails. `data/adaptive_model.json` removed at deploy
|
||||
time; first start re-runs `train_from_recordings` over the existing 7 train
|
||||
files.
|
||||
|
||||
## Files Touched
|
||||
|
||||
```
|
||||
v2/crates/wifi-densepose-sensing-server/src/adaptive_classifier.rs:
|
||||
* N_FEATURES: 15 → 22
|
||||
* New constants N_GLOBAL_FEATURES=4, N_PER_NODE_FEATURES=3, MAX_NODES=6
|
||||
* features_from_frame rewritten — multi-node + decorrelated
|
||||
* features_from_runtime signature changed
|
||||
* per_node_stats helper (3 scalars: std/skew/entropy)
|
||||
* Old subcarrier_stats removed
|
||||
v2/crates/wifi-densepose-sensing-server/src/main.rs:
|
||||
+ current_per_node_amps() helper (snapshots AMP_HIST.nbvi_history.back())
|
||||
+ 2 call sites updated to pass &[(u8, &[f64])] instead of &[f64]
|
||||
v2/crates/wifi-densepose-sensing-server/src/csi.rs:
|
||||
+ adaptive_override updated to new signature (dead code path, kept for ABI)
|
||||
data/adaptive_model.json: removed (15-feature incompatible)
|
||||
docs/adr/ADR-118-feature-decorrelation-multinode.md (this)
|
||||
```
|
||||
|
||||
## Verified Acceptance
|
||||
|
||||
Re-ran `POST /api/v1/adaptive/train` against the same 151,329-frame 6-node
|
||||
recording set:
|
||||
|
||||
```
|
||||
2-node, 15 features: 40.4%
|
||||
6-node, 15 features: 44.4% (+4.0 from more data)
|
||||
6-node, 22 features: 49.58% (+5.2 from feature engineering)
|
||||
```
|
||||
|
||||
Total improvement: **+9.2 percentage points** from the baseline, on the
|
||||
same hardware in the same room.
|
||||
|
||||
Live confidence distribution (10s samples post-retrain):
|
||||
|
||||
```
|
||||
absent: conf 0.30-0.85 (was 0.04-0.10 pre-ADR-118)
|
||||
present_still: conf 0.40-0.85
|
||||
present_moving: conf 0.30-0.50
|
||||
active: conf 0.27-0.45
|
||||
transition: conf 0.84-0.86 (high — model has clear signal for this)
|
||||
waving: conf — class not active during sample window
|
||||
```
|
||||
|
||||
Confidence is now meaningful (model has separation), whereas pre-ADR-118 the
|
||||
near-uniform 0.04-0.10 indicated the classifier was essentially flipping a
|
||||
coin.
|
||||
|
||||
### Per-feature class separability (post-train, sep_ratio = between-class spread / within-class std)
|
||||
|
||||
| Feature | sep_ratio | Verdict |
|
||||
|---|---|---|
|
||||
| `n6_std` | 0.60 ★ | best — node 6 near door catches both motion + door state |
|
||||
| `n2_std` | 0.35 | second — node 2 far from AP, high modulation |
|
||||
| `n6_skew` | 0.25 | useful |
|
||||
| `n3_skew` | 0.26 | useful |
|
||||
| `n2_skew` | 0.18 | marginal |
|
||||
| `n4_std` | 0.14 | marginal |
|
||||
| `n1_*` | 0.01-0.06 | near AP — almost no class signal |
|
||||
| `n5_*` | 0.01-0.05 | similar to n1 |
|
||||
| all `entropy` features | 0.01-0.02 | **dead** — distribution shape doesn't vary by activity |
|
||||
| `variance` (global) | 0.11 | weak |
|
||||
| `mean_rssi` (global) | 0.01 | dead at this scale |
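
One way to compute such a ratio is sketched below. The ADR only defines sep_ratio as between-class spread over within-class std; taking the spread as the standard deviation of the per-class means, and the denominator as the mean of the per-class standard deviations, is an assumption of this sketch, as are the function names.

```rust
use std::collections::BTreeMap;

fn mean(xs: &[f64]) -> f64 {
    xs.iter().sum::<f64>() / xs.len() as f64
}

/// Population standard deviation.
fn std_dev(xs: &[f64]) -> f64 {
    let m = mean(xs);
    (xs.iter().map(|x| (x - m).powi(2)).sum::<f64>() / xs.len() as f64).sqrt()
}

/// sep_ratio for one feature column.
/// `values[i]` is the feature value of frame i, `labels[i]` its class index.
fn sep_ratio(values: &[f64], labels: &[usize]) -> f64 {
    assert_eq!(values.len(), labels.len());
    let mut by_class: BTreeMap<usize, Vec<f64>> = BTreeMap::new();
    for (&v, &l) in values.iter().zip(labels) {
        by_class.entry(l).or_default().push(v);
    }
    if by_class.is_empty() {
        return 0.0;
    }
    let class_means: Vec<f64> = by_class.values().map(|v| mean(v)).collect();
    let class_stds: Vec<f64> = by_class.values().map(|v| std_dev(v)).collect();
    let within = mean(&class_stds);
    if within == 0.0 {
        0.0
    } else {
        std_dev(&class_means) / within
    }
}
```

Under this definition a value near 0 means the class means sit inside each other's noise, which matches the "dead" verdicts in the table.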
|
||||
|
||||
## Open Items
|
||||
|
||||
* **`*_entropy` features carry no signal** (sep_ratio ~0.01 across all 6
|
||||
nodes). Could be dropped: 22 → 16 features. Marginal expected gain (~1%),
|
||||
not worth a follow-up ADR right now.
|
||||
* **Aggregated server features all sub-0.11** — `mean_rssi` / `dom_hz` /
|
||||
`change_pts` could go too. Would reduce to 12-13 truly useful features.
|
||||
* **Logistic regression ceiling** — `n6_std` alone has sep_ratio 0.60 but
|
||||
a linear classifier can't fully exploit non-linear class boundaries.
|
||||
Next big lever is replacing the LogReg with a small MLP or random forest.
|
||||
Out of scope here.
|
||||
* **`standing` and `sitting` recordings collapse to one class** — file
|
||||
naming maps both to `present_still`. They're physically distinct
|
||||
signatures (different RF profile) but the trainer treats them as one.
|
||||
Separating them in `classify_recording_name` would add a class but might
|
||||
lower accuracy due to inherent confusability — TBD via experiment.
|
||||
* **Sensor placement matters more than algorithm tweaks** — n1/n5 (near AP)
|
||||
carry almost no class signal. Reposition them away from the AP if
|
||||
possible (closer to walking zone, farther from the line-of-sight to AP).
|
||||
|
||||
## References
|
||||
|
||||
* ADR-101 — raw amplitude classifier (the runtime classifier this adaptive
|
||||
model can override)
|
||||
* ADR-117 — process hygiene + previous training infrastructure
|
||||
* `data/recordings/archive_2node_2026-05-17/` — earlier 2-node training
|
||||
set, kept for comparison; not used by trainer (outside `recordings/`
|
||||
root scope)
|
||||
|
|
@ -0,0 +1,161 @@
|
|||
# ADR-119 — MLP Replaces Logistic Regression in Adaptive Classifier
|
||||
|
||||
**Status**: Accepted
|
||||
**Date**: 2026-05-18
|
||||
**Scope**: `v2/crates/wifi-densepose-sensing-server/src/adaptive_classifier.rs`
|
||||
(new `MlpModel` struct, `train_mlp_classifier`, `eval_mlp`; modified
|
||||
`AdaptiveModel::classify` + `train_from_recordings`).
|
||||
|
||||
## Context
|
||||
|
||||
After ADR-118 (feature decorrelation + multi-node extractor) the adaptive
|
||||
classifier reached **49.58% accuracy** on a 6-node, 7-recording (6-class), 151,329-frame
|
||||
training set. Per-feature audit showed `n6_std` sep_ratio = 0.60 — i.e. the
|
||||
underlying signal *can* separate the classes — but logistic regression was
|
||||
limited to linear decision boundaries and couldn't model interactions like:
|
||||
|
||||
* `walking`: `n2_std` high **AND** `n6_std` high **AND** `dom_hz ≈ 3 Hz`
|
||||
* `waving`: `n1_std` high **BUT** `n2_std` low (only close sensors fire)
|
||||
* `sitting` vs `standing`: same global features, differ in `n6_std` pattern
|
||||
|
||||
LogReg sums weighted features; it cannot represent "AND/BUT" combinations.
|
||||
A small MLP can: hidden units learn intermediate concepts, then the output
|
||||
layer combines them.
|
||||
|
||||
## Decisions
|
||||
|
||||
### D1 — Single-hidden-layer MLP, 22 → 32 → 6
|
||||
|
||||
* Input: the same 22-feature vector from ADR-118.
|
||||
* Hidden: 32 ReLU units. ~0.9k weights (22×32 + 32×6 = 896), enough capacity for 6 classes but
|
||||
small enough to train in seconds on the 151k-frame set.
|
||||
* Output: softmax over `n_classes` (discovered dynamically at train time).
|
||||
* Z-score normalisation: identical to the LogReg path — same
|
||||
`global_mean` / `global_std` populated by `train_from_recordings`.
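
The D1 architecture can be sketched as a forward pass in plain Rust. This is an illustration under assumptions, not the shipped `MlpModel`: the field names follow this ADR, but the row-major `Vec<Vec<f64>>` layout, the `zeros` constructor, and `param_count` are hypothetical.

```rust
const N_FEATURES: usize = 22;
const MLP_HIDDEN: usize = 32;

#[derive(Default)]
struct MlpModel {
    w1: Vec<Vec<f64>>, // MLP_HIDDEN rows x N_FEATURES columns (assumed layout)
    b1: Vec<f64>,      // MLP_HIDDEN
    w2: Vec<Vec<f64>>, // n_classes rows x MLP_HIDDEN columns
    b2: Vec<f64>,      // n_classes
}

impl MlpModel {
    /// Hypothetical helper: zero-filled model of the right shape.
    fn zeros(n_classes: usize) -> Self {
        MlpModel {
            w1: vec![vec![0.0; N_FEATURES]; MLP_HIDDEN],
            b1: vec![0.0; MLP_HIDDEN],
            w2: vec![vec![0.0; MLP_HIDDEN]; n_classes],
            b2: vec![0.0; n_classes],
        }
    }

    /// An empty (default) model counts as untrained.
    fn is_trained(&self) -> bool {
        !self.w1.is_empty() && !self.w2.is_empty()
    }

    /// Total number of weights and biases.
    fn param_count(&self) -> usize {
        let weights: usize = self.w1.iter().chain(&self.w2).map(|row| row.len()).sum();
        weights + self.b1.len() + self.b2.len()
    }

    /// z-scored 22-d input -> ReLU hidden layer -> softmax probabilities.
    fn forward(&self, x: &[f64; N_FEATURES]) -> Vec<f64> {
        let hidden: Vec<f64> = self
            .w1
            .iter()
            .zip(&self.b1)
            .map(|(row, b)| {
                let z = row.iter().zip(x).map(|(w, xi)| w * xi).sum::<f64>() + b;
                z.max(0.0) // ReLU
            })
            .collect();
        let logits: Vec<f64> = self
            .w2
            .iter()
            .zip(&self.b2)
            .map(|(row, b)| row.iter().zip(&hidden).map(|(w, h)| w * h).sum::<f64>() + b)
            .collect();
        softmax(&logits)
    }
}

/// Numerically stable softmax (subtracts the max logit first).
fn softmax(logits: &[f64]) -> Vec<f64> {
    let max = logits.iter().cloned().fold(f64::NEG_INFINITY, f64::max);
    let exps: Vec<f64> = logits.iter().map(|l| (l - max).exp()).collect();
    let sum: f64 = exps.iter().sum();
    exps.iter().map(|e| e / sum).collect()
}
```

With 6 classes, `param_count()` comes to 22×32 + 32 + 32×6 + 6 = 934.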
|
||||
|
||||
### D2 — Manual backprop, no external ML crate
|
||||
|
||||
`tch` (LibTorch) or `candle` would pull in ~50-200 MB of native deps for a
|
||||
~0.9k-parameter network. The forward + backward passes are ~150 LoC of pure
|
||||
Rust; SGD + momentum + cosine LR decay another ~30. Built-in `f64`
|
||||
arithmetic is fast enough — full train completes in ~10 seconds on M1
|
||||
Mac.
|
||||
|
||||
Optimiser: SGD with momentum 0.9, weight decay 1e-4, base LR 0.05 with
|
||||
half-cosine decay to 0, batch size 64, 30 epochs. He initialisation
|
||||
(zero-mean Gaussian with standard deviation `sqrt(2/fan_in)`) on weights, zero on biases.
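
The optimiser recipe above can be sketched as two small functions. Whether the shipped trainer evaluates the cosine schedule per epoch or per batch is not stated here, so the per-epoch form below is an assumption, as are the function names and the choice to fold weight decay into the gradient.

```rust
use std::f64::consts::PI;

/// Half-cosine decay: `base_lr` at epoch 0, falling to 0 at `total_epochs`.
fn cosine_lr(base_lr: f64, epoch: usize, total_epochs: usize) -> f64 {
    let t = epoch as f64 / total_epochs as f64;
    base_lr * 0.5 * (1.0 + (PI * t).cos())
}

/// One in-place SGD step with momentum and L2 weight decay.
/// `v` is the velocity buffer, same length as `w` and `grad`.
fn sgd_momentum_step(
    w: &mut [f64],
    v: &mut [f64],
    grad: &[f64],
    lr: f64,
    momentum: f64,
    weight_decay: f64,
) {
    for ((wi, vi), gi) in w.iter_mut().zip(v.iter_mut()).zip(grad) {
        let g = gi + weight_decay * *wi; // decay folded into the gradient
        *vi = momentum * *vi - lr * g;
        *wi += *vi;
    }
}
```

With the values from this ADR the calls would be `cosine_lr(0.05, epoch, 30)` and `sgd_momentum_step(w, v, grad, lr, 0.9, 1e-4)`, once per mini-batch of 64.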
|
||||
|
||||
### D3 — MLP wins over LogReg at classify time, LogReg kept as fallback
|
||||
|
||||
`AdaptiveModel` carries both:
|
||||
|
||||
```rust
|
||||
pub weights: Vec<Vec<f64>>, // legacy LogReg, still trained for rollback
|
||||
pub mlp: MlpModel, // ADR-119 — preferred when is_trained() == true
|
||||
```
|
||||
|
||||
`classify()` checks `self.mlp.is_trained()`; if yes uses MLP forward pass,
|
||||
otherwise falls back to LogReg softmax. Old `data/adaptive_model.json`
|
||||
files without MLP fields (22-feature LogReg from ADR-118) load with `#[serde(default)]` on `mlp` →
|
||||
`MlpModel::default()` returns empty fields → `is_trained() == false` →
|
||||
graceful degradation to LogReg path.
|
||||
|
||||
### D4 — Train both, report better number
|
||||
|
||||
`train_from_recordings` runs the existing LogReg loop first (unchanged),
|
||||
then trains MLP on the same z-normalised samples, evaluates both on the
|
||||
training set, and reports `training_accuracy = mlp_acc.max(logreg_acc)`.
|
||||
Per-class accuracy from both classifiers is logged side-by-side for
|
||||
diagnostic comparison.
|
||||
|
||||
## Verified Acceptance
|
||||
|
||||
```
|
||||
LogReg: 49.58% overall
|
||||
MLP: 53.53% overall (+3.95 pts)
|
||||
|
||||
Per-class (LogReg → MLP):
|
||||
absent 40% → 41% (+1)
|
||||
present_still 99% → 99% (tied — 2× sample count)
|
||||
transition 29% → 36% (+7)
|
||||
active 22% → 30% (+8)
|
||||
waving 34% → 38% (+4)
|
||||
present_moving 24% → 33% (+9)
|
||||
```
|
||||
|
||||
Notes:
|
||||
|
||||
* `present_still` class is a merged bucket: both `train_standing_*` and
|
||||
`train_sitting_*` map to `present_still` via `classify_recording_name`.
|
||||
Hence 43,242 samples vs 21,500 average for the other classes — the
|
||||
classifier biases strongly toward this dominant class. The 99% is
|
||||
honest but partially inflated by class imbalance.
|
||||
* The +3.95 pts is concentrated on motion classes — exactly where the
|
||||
hypothesis predicted MLP would help (non-linear combinations of per-
|
||||
node features differentiate similar motion types).
|
||||
* MLP loss flatlined around 1.15 after epoch 10. Suggests the current
|
||||
22-feature representation has hit its information ceiling for frame-
|
||||
level classification. Going higher needs temporal context (sliding
|
||||
window classifier, LSTM, TCN) — see Open Items.
|
||||
|
||||
Total improvement since the start of this session:
|
||||
|
||||
```
|
||||
2-node, 15 features, LogReg: 40.4% (baseline)
|
||||
6-node, 15 features, LogReg: 44.4% +4.0 from more data
|
||||
6-node, 22 features, LogReg: 49.58% +5.2 from feature engineering (ADR-118)
|
||||
6-node, 22 features, MLP: 53.53% +3.95 from non-linear classifier (ADR-119)
|
||||
─────
|
||||
Total cumulative: +13.1 percentage points
|
||||
```
|
||||
|
||||
## Files Touched
|
||||
|
||||
```
|
||||
v2/crates/wifi-densepose-sensing-server/src/adaptive_classifier.rs:
|
||||
+ const MLP_HIDDEN: usize = 32
|
||||
+ pub struct MlpModel { w1, b1, w2, b2, n_classes } + serde
|
||||
+ impl MlpModel { is_trained, forward }
|
||||
+ AdaptiveModel.mlp field (serde-default for backward compat)
|
||||
+ AdaptiveModel::classify prefers MLP when trained
|
||||
+ train_mlp_classifier (~150 LoC manual backprop)
|
||||
+ eval_mlp helper
|
||||
+ train_from_recordings calls MLP path and picks max accuracy
|
||||
docs/adr/ADR-119-mlp-classifier.md (this)
|
||||
```
|
||||
|
||||
`data/adaptive_model.json` removed at deploy time — the MLP fields need
|
||||
populating, the old file has none.
|
||||
|
||||
## Out of Scope / Follow-ups
|
||||
|
||||
* **Temporal classifier (sliding window LSTM/TCN)** — loss flatlines at
|
||||
~1.15 with the current feature set; this is the frame-level ceiling.
|
||||
A model that consumes a 1-second window (10-20 frames) would catch
|
||||
the temporal signature of `transition` (sit-stand cycle ≈ 0.5 Hz),
|
||||
`walking` (step rate ≈ 2 Hz), `active` (bursty), `waving` (limb
|
||||
cadence ≈ 1-2 Hz). Estimated +15-25 pts realistic for these
|
||||
inherently-temporal classes. ~3-4 hours of code.
|
||||
* **Class imbalance fix** — `present_still` has 2× samples. Either
|
||||
oversample the minority classes during training, or weight loss by
|
||||
inverse class frequency. Marginal — ~2-3 pts.
|
||||
* **Drop dead features** — 6 entropy features (sep_ratio 0.01-0.02) and
|
||||
3 weak globals (`mean_rssi`, `dom_hz`, `change_pts` all <0.11)
|
||||
contribute noise. Reducing 22 → ~13 features would simplify training
|
||||
but probably not move accuracy more than 1-2 pts.
|
||||
* **Hidden size sweep** — tried only 32. Could try 16 (faster, less
|
||||
overfitting risk) or 64 (more capacity). Cosmetic.
|
||||
* **Split `sitting` and `standing` into separate classes** — they're
|
||||
physically distinct RF signatures but currently merged. Adding them as
|
||||
separate classes would test whether the model can disambiguate them.
|
||||
Likely lowers `present_still` accuracy but separates a useful
|
||||
distinction. Experiment-grade.
|
||||
|
||||
## References
|
||||
|
||||
* ADR-118 — feature decorrelation + multi-node extractor (the 22-feature
|
||||
basis this ADR uses)
|
||||
* ADR-117 — earlier process hygiene pass; introduced standardisation
|
||||
(`global_mean`/`global_std`) that this ADR's MLP also relies on
|
||||
* ADR-101 — raw amplitude classifier (the runtime path that calls
|
||||
`AdaptiveModel::classify`)
|
||||
|
|
@ -0,0 +1,209 @@
|
|||
# ADR-120 — Windowed Temporal Classifier (W-MLP)
|
||||
|
||||
**Status**: Accepted
|
||||
**Date**: 2026-05-18
|
||||
**Scope**: `v2/crates/wifi-densepose-sensing-server/src/adaptive_classifier.rs`
|
||||
(`WindowedMlpModel`, `train_windowed_mlp_classifier`, `eval_windowed_mlp`,
|
||||
`AdaptiveModel::classify_window`); `main.rs` (`AppStateInner.feature_window`,
|
||||
`push_feature_window`, `adaptive_override` switching to window path).
|
||||
|
||||
## Context
|
||||
|
||||
ADR-119 added a small MLP (22 → 32 → 6) that improved accuracy from 49.58%
|
||||
(LogReg) to **53.53%**. Loss flatlined at ~1.15 around epoch 10 of 30 —
|
||||
clear signal that the **frame-level information ceiling** had been
|
||||
reached for the 22-feature representation.
|
||||
|
||||
The dataset's 7 recorded activities (6 classes) differ primarily in **temporal
|
||||
patterns**, not in any single frame:
|
||||
|
||||
* `walking` step cadence: ~2 Hz (visible in 0.5-second window)
|
||||
* `transition` (sit-stand): ~0.5 Hz (visible in 2-second window)
|
||||
* `waving` limb cadence: 1-2 Hz
|
||||
* `active` (jumping): bursty / quasi-periodic at ~3 Hz
|
||||
* `present_still` (sitting + standing merged): no temporal signature
|
||||
|
||||
Per frame, `walking`, `active`, and `waving` all look "moving" with
similar amplitude std/skew; they're disambiguated only by how the
|
||||
amplitude pattern evolves over 1-2 seconds. A classifier that sees a
|
||||
single frame can't tell them apart no matter how good the per-frame
|
||||
features are.
|
||||
|
||||
## Decisions
|
||||
|
||||
### D1 — Stack 20 consecutive frames into a 440-d input
|
||||
|
||||
```
|
||||
WINDOW_FRAMES = 20 (~2 seconds at ~10 Hz tick rate)
|
||||
N_FEATURES = 22 (from ADR-118)
|
||||
WINDOWED_INPUT = 20 × 22 = 440
|
||||
WINDOWED_HIDDEN = 64
|
||||
```
|
||||
|
||||
Network: `440 → 64 ReLU → n_classes softmax`. ~28k weights total —
|
||||
larger than the frame-level MLP's ~0.9k, but still small enough to train
|
||||
in <60s and serialize as JSON.
|
||||
|
||||
Training samples are built by sliding a window of 20 frames with **stride
|
||||
5** within each recording (75% overlap, so each frame lands in up to 4 windows). Windows do **not** cross recording
|
||||
boundaries — each window inherits its source recording's class label.
|
||||
|
||||
On the 6-node 151k-frame set:
|
||||
* 7 recordings × ~21k frames each = 151k frames total
|
||||
* (21k − 20) / 5 ≈ 4,300 windows per recording
|
||||
* Total: ~30k windowed samples
|
||||
* Class balance is roughly preserved (each recording is one class)
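
The window construction can be sketched as follows. This is a hedged illustration: the function name and the `WINDOW_STRIDE` constant are assumptions, and the real trainer also attaches the recording's class label to each window.

```rust
const WINDOW_FRAMES: usize = 20;
const WINDOW_STRIDE: usize = 5; // constant name assumed
const N_FEATURES: usize = 22;

/// Flattens every 20-frame window of ONE recording into a 440-d sample,
/// oldest frame first. Because the input is a single recording, windows
/// can never cross a recording boundary.
fn windows_for_recording(frames: &[[f64; N_FEATURES]]) -> Vec<Vec<f64>> {
    if frames.len() < WINDOW_FRAMES {
        return Vec::new(); // too short to fill a single window
    }
    (0..=frames.len() - WINDOW_FRAMES)
        .step_by(WINDOW_STRIDE)
        .map(|start| {
            frames[start..start + WINDOW_FRAMES]
                .iter()
                .flatten()
                .copied()
                .collect()
        })
        .collect()
}
```

A recording of `n >= 20` frames yields `(n - 20) / 5 + 1` windows, which for ~21k frames is the ~4,300 figure quoted above.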
|
||||
|
||||
### D2 — Manual backprop, same recipe as MLP
|
||||
|
||||
Same SGD + momentum 0.9 + weight decay 1e-4 + cosine LR decay. Base LR
|
||||
lowered to 0.03 (vs MLP's 0.05) because the network is bigger. 25 epochs.
|
||||
He initialisation, ReLU activation, softmax output, cross-entropy loss.
|
||||
|
||||
### D3 — `AdaptiveModel` carries all three classifiers, classify routes by availability
|
||||
|
||||
```rust
|
||||
pub struct AdaptiveModel {
|
||||
pub weights: Vec<Vec<f64>>, // ADR-118 legacy LogReg
|
||||
pub mlp: MlpModel, // ADR-119 frame-level MLP
|
||||
pub windowed_mlp: WindowedMlpModel, // ADR-120 (this) — primary
|
||||
// ...
|
||||
}
|
||||
```
|
||||
|
||||
`classify_window()` (new API) prefers `windowed_mlp` when trained AND
|
||||
the caller has a 20-frame buffer. Falls through to frame-level MLP
|
||||
when called with insufficient history. Old JSON model files load with
|
||||
`MlpModel::default()` and `WindowedMlpModel::default()` filling absent
|
||||
fields — backward compatible.
|
||||
|
||||
### D4 — Rolling buffer in `AppStateInner`, pushed per tick
|
||||
|
||||
```rust
|
||||
struct AppStateInner {
|
||||
feature_window: VecDeque<[f64; N_FEATURES]>, // capacity = WINDOW_FRAMES
|
||||
// ...
|
||||
}
|
||||
```
|
||||
|
||||
New helper `push_feature_window(&mut s, &features)` computes the 22-d
|
||||
feature vector from current per-node amps, pushes to the back of the
|
||||
buffer, evicts oldest when over capacity. Called at all three tick
|
||||
sites where `adaptive_override` runs:
|
||||
* `main.rs:~3030` — multi-BSSID tick handler
|
||||
* `main.rs:~3225` — WiFi fallback tick handler
|
||||
* `main.rs:~6510` — per-node loop in the broadcast tick task
|
||||
|
||||
`adaptive_override` (read-only over state) builds the 440-d input by
|
||||
copying the buffer's last 19 entries + the current frame's features,
|
||||
then calls `model.classify_window(&flat)`. Cold-start (buffer < 20)
|
||||
falls back to `model.classify(&feat_arr)` — frame-level MLP.
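
A minimal sketch of the rolling buffer, assuming the current frame has already been pushed before the window is read (the ADR describes `adaptive_override` combining 19 buffered entries with the current frame; this sketch simplifies that). The `window_input` helper is hypothetical, and `push_feature_window` here takes the buffer directly rather than the full state struct.

```rust
use std::collections::VecDeque;

const WINDOW_FRAMES: usize = 20;
const N_FEATURES: usize = 22;

/// Appends the newest feature vector and evicts the oldest once the
/// buffer holds more than WINDOW_FRAMES entries.
fn push_feature_window(buf: &mut VecDeque<[f64; N_FEATURES]>, features: [f64; N_FEATURES]) {
    buf.push_back(features);
    while buf.len() > WINDOW_FRAMES {
        buf.pop_front();
    }
}

/// Returns the flattened 440-d input (oldest frame first) when the buffer
/// is full. Returns None during cold start so the caller can fall back to
/// the frame-level classifier.
fn window_input(buf: &VecDeque<[f64; N_FEATURES]>) -> Option<Vec<f64>> {
    if buf.len() < WINDOW_FRAMES {
        return None;
    }
    Some(buf.iter().flatten().copied().collect())
}
```

A caller would then route on the result: `Some(flat)` goes to `classify_window`, `None` goes to the frame-level `classify`.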
|
||||
|
||||
## Verified Acceptance
|
||||
|
||||
Retrained on the same 6-node, 151,329-frame set used since ADR-118:
|
||||
|
||||
```
|
||||
LogReg: 49.58%
|
||||
MLP: 53.53% (+3.95 vs LogReg)
|
||||
W-MLP: 90.40% (+36.87 vs MLP)
|
||||
```
|
||||
|
||||
Per-class (frame-level MLP → W-MLP):
|
||||
|
||||
```
|
||||
absent 41% → 100% +59
|
||||
present_still 99% → 100% +1 (already saturated)
|
||||
transition 36% → 86% +50 (sit-stand cadence captured)
|
||||
active 30% → 74% +44 (jumping cadence captured)
|
||||
waving 38% → 90% +52 (gesture cadence captured)
|
||||
present_moving 33% → 82% +49 (walking step cadence captured)
|
||||
```
|
||||
|
||||
Loss curve confirms breakout from the frame-level plateau:
|
||||
|
||||
```
|
||||
MLP:   epoch 0: 1.28 → epoch 29: 1.14   (flat plateau)
W-MLP: epoch 0: 1.01 → epoch 24: 0.25   (still trending)
|
||||
```
|
||||
|
||||
Total cumulative improvement vs the start-of-session 2-node 15-feature
|
||||
LogReg baseline:
|
||||
|
||||
```
|
||||
40.4% → 90.40% = +50.0 percentage points
|
||||
```
|
||||
|
||||
## Caveat — training vs generalization
|
||||
|
||||
90.40% is **training accuracy**. The W-MLP has ~28,500 weights (440×64 + 64×6) trained
|
||||
on ~30,200 windowed samples — capacity is comparable to dataset size,
|
||||
so some overfitting is expected. True generalization performance will
|
||||
only be measurable once an independent test set is captured.
|
||||
|
||||
Mitigations already in place:
|
||||
* Weight decay 1e-4 regularises against memorisation
|
||||
* Cosine LR decay with smooth annealing
|
||||
* Stride 5 in window construction reduces near-duplicate samples
|
||||
* Architecture stays small (one hidden layer) — limits overfit capacity
|
||||
|
||||
Recommended follow-up: record a 60-second held-out session per class
|
||||
(separate from training), evaluate W-MLP cold, compare to training
|
||||
accuracy. Expected drop: 5-15 pts for a healthy model.
|
||||
|
||||
## Files Touched
|
||||
|
||||
```
|
||||
v2/crates/wifi-densepose-sensing-server/src/adaptive_classifier.rs:
|
||||
+ const WINDOW_FRAMES = 20, WINDOWED_INPUT = 440, WINDOWED_HIDDEN = 64
|
||||
+ pub const N_FEATURES_PUB (for external buffer sizing)
|
||||
+ pub struct WindowedMlpModel { w1, b1, w2, b2, n_classes }
|
||||
+ impl WindowedMlpModel::{is_trained, forward}
|
||||
+ AdaptiveModel.windowed_mlp field (serde-default)
|
||||
+ AdaptiveModel::classify_window method
|
||||
+ train_from_recordings builds recording_groups, slides windows,
|
||||
calls train_windowed_mlp_classifier
|
||||
+ train_windowed_mlp_classifier (~150 LoC manual backprop)
|
||||
+ eval_windowed_mlp helper
|
||||
+ #[derive(Clone)] on Sample (for recording_groups Vec)
|
||||
v2/crates/wifi-densepose-sensing-server/src/main.rs:
|
||||
+ AppStateInner.feature_window: VecDeque<[f64; N_FEATURES_PUB]>
|
||||
+ push_feature_window helper
|
||||
+ adaptive_override switches to classify_window when buffer is full
|
||||
+ 3 tick sites call push_feature_window before adaptive_override
|
||||
docs/adr/ADR-120-windowed-temporal-classifier.md (this)
|
||||
```
|
||||
|
||||
## Out of Scope / Follow-ups
|
||||
|
||||
* **Held-out test set** — must record fresh data and evaluate the saved
|
||||
model cold. Critical to confirm 90% is not training-set memorisation.
|
||||
* **TCN replacing stacked-MLP** — true 1D convolutions over time would
|
||||
use weights more efficiently (~5k vs 28k) and generalise better.
|
||||
Stack-MLP works but is parameter-heavy. Worth a follow-up if data
|
||||
scales 10×.
|
||||
* **Sliding output smoothing** — `classify_window` emits one decision
|
||||
per tick (~10 Hz). Adjacent windows are 19/20 identical, so adjacent
|
||||
predictions should agree. They mostly do (98%+) but flicker at class
|
||||
boundaries — could apply a 3-tick majority filter.
|
||||
* **`sitting` vs `standing` split** — both currently merge into
|
||||
`present_still`. The W-MLP gets them both right at 100% as a combined
|
||||
class. Splitting them would test whether temporal RF signatures
|
||||
differ between sitting (chair anchor) and standing (free body).
|
||||
* **Class imbalance** — `present_still` has 2× the windows of other
|
||||
classes (sitting + standing both contribute). Acceptable since it's
|
||||
the "neutral" class, but oversampling minority classes might lift
|
||||
accuracy 1-2 pts further.
|
||||
* **Window size experiments** — 20 frames = 2 sec at ~10 Hz.
|
||||
Could try 10 frames (1 sec, faster reaction) or 30 (3 sec, more
|
||||
context). 20 was a reasonable first guess.
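
The 3-tick majority filter mentioned in the output-smoothing item could look like the sketch below. The function name and the tie-break rule (keep the newest prediction when all three differ) are assumptions; nothing like this is implemented yet.

```rust
/// Majority vote over the last three predicted class indices,
/// ordered oldest to newest. If all three differ there is no
/// majority, so the newest prediction is returned unchanged.
fn majority3(last3: [usize; 3]) -> usize {
    let [a, b, c] = last3;
    if a == b || a == c {
        a
    } else if b == c {
        b
    } else {
        c
    }
}
```

The cost is one to two ticks of extra latency on genuine class changes, which at ~10 Hz is roughly 100-200 ms.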
|
||||
|
||||
## References
|
||||
|
||||
* ADR-118 — feature decorrelation + multi-node (22-feature basis)
|
||||
* ADR-119 — frame-level MLP (sibling classifier, fallback at cold start)
|
||||
* ADR-101 — raw amplitude classifier (the path that calls
|
||||
`AdaptiveModel` via `adaptive_override`)
|
||||
* ADR-105 — no synthetic data in production runtime; this ADR's
|
||||
confidence output is real model softmax probability, not a
|
||||
hardcoded value
|
||||
|
|
@ -0,0 +1,188 @@
|
|||
# ESPectre Gap Analysis (full Pace Part-2 vs. RuView as of 2026-05-17)
|
||||
|
||||
Companion to [`espectre-techniques.md`](espectre-techniques.md). That
|
||||
doc is the technique catalogue; this one is the **what's still
|
||||
missing** breakdown, structured exactly along the sections of Pace's
|
||||
*How I Turned My Wi-Fi Into a Motion Sensor — Part 2*.
|
||||
|
||||
## Problem #1: NBVI subcarrier selection
|
||||
|
||||
| Pace step | Status in RuView |
|
||||
|---|---|
|
||||
| Formula `α·σ/μ² + (1-α)·σ/μ`, α = 0.5 | ✅ ADR-102 |
|
||||
| Step 1: quiet-window finder | ✅ ADR-102 v2 |
|
||||
| Step 2: 25 %-percentile dead-zone gate | ✅ ADR-102 |
|
||||
| **Step 3: rank + validate** | ✅ ADR-104 D4 (commit `6212b17e`) — K ∈ {6,8,10,12,16,20} sweep, smallest-FP wins, ties by smallest total-NBVI |
|
||||
| Step 4: pick top-K (K=12) | ✅ ADR-102 |
|
||||
| Amplitude only (no phase) | ✅ same |
|
||||
|
||||
All four NBVI steps shipped. If a noisy neighbour energy-overlaps the
|
||||
top-K, the validator counts FPs over the quiet window and a tighter
|
||||
(or different) K wins.
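
For reference, the NBVI formula from the first table row can be written as a small function. This is a sketch under assumptions: the function name, the use of the population standard deviation, and the `None` return for a dead subcarrier are illustrative choices, not the server's actual code.

```rust
/// NBVI score for one subcarrier's amplitude history:
/// alpha * sigma / mu^2 + (1 - alpha) * sigma / mu.
/// Returns None for an empty history or a non-positive mean.
fn nbvi(amps: &[f64], alpha: f64) -> Option<f64> {
    if amps.is_empty() {
        return None;
    }
    let n = amps.len() as f64;
    let mu = amps.iter().sum::<f64>() / n;
    if mu <= 0.0 {
        return None; // dead subcarrier, score undefined
    }
    let sigma = (amps.iter().map(|a| (a - mu).powi(2)).sum::<f64>() / n).sqrt();
    Some(alpha * sigma / (mu * mu) + (1.0 - alpha) * sigma / mu)
}
```

For amplitudes `[9.0, 11.0]` and α = 0.5, μ = 10 and σ = 1, so the score is 0.5·(1/100) + 0.5·(1/10) = 0.055.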
|
||||
|
||||
## Problem #2: Gain Lock (AGC + FFT)
|
||||
|
||||
✅ **All done** — ADR-100. Median over 300 packets, `MIN_SAFE_AGC=30`
|
||||
skip-on-strong-signal safety, ESP32-S3/C3/C6 platform guards.
|
||||
|
||||
## Problem #3: Universal threshold via baseline-variance normalization
|
||||
|
||||
✅ **Done** — ADR-103 D3. Pace's `scale = 0.25 / baseline_variance`
|
||||
implemented as `norm_cv = cv / baseline_cv` with universal gates
|
||||
`3×` (moving) / `6×` (active). Falls back to absolute gates when no
|
||||
calibration loaded.
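
The normalised gating can be sketched as below. The `Motion` enum, the function name, the inclusive `>=` comparisons, and the shape of the absolute fallback gates are assumptions; only `norm_cv = cv / baseline_cv` and the 3× / 6× gates come from the text above.

```rust
#[derive(Debug, PartialEq)]
enum Motion {
    Still,
    Moving,
    Active,
}

/// Classifies one CV reading.
/// With a baseline: compare cv / baseline_cv against the universal 3x / 6x gates.
/// Without one: compare the raw cv against caller-supplied absolute gates
/// `(moving_gate, active_gate)`, whose values are deployment-specific.
fn classify_cv(cv: f64, baseline_cv: Option<f64>, abs_gates: (f64, f64)) -> Motion {
    let (score, moving_gate, active_gate) = match baseline_cv {
        Some(b) if b > 0.0 => (cv / b, 3.0, 6.0),
        _ => (cv, abs_gates.0, abs_gates.1),
    };
    if score >= active_gate {
        Motion::Active
    } else if score >= moving_gate {
        Motion::Moving
    } else {
        Motion::Still
    }
}
```

The point of the normalisation is that the 3× and 6× gates carry over between rooms, while the absolute gates only make sense for the room they were tuned in.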
|
||||
|
||||
## Two-phase boot calibration (~10 s total)
|
||||
|
||||
Pace runs both phases as a single atomic boot sequence on the device:
|
||||
|
||||
```
|
||||
PHASE 1 (3 s) collect AGC/FFT → median → lock
|
||||
PHASE 2 (7 s) rank subcarriers with gain locked → save top-K to NVS
|
||||
```
|
||||
|
||||
| Phase | Status in RuView |
|
||||
|---|---|
|
||||
| Phase 1 in FW | ✅ ADR-100 (`csi_collector.c::rv_gain_lock_process`) |
|
||||
| **Phase 2 in FW after Phase 1** | ⏳ NBVI intentionally in server as rolling refresh (adapts to slow channel drift). Not planned in FW. |
|
||||
| **NVS save of gain-lock** | ✅ ADR-108 (commit `3779bb76`) — `csi_cfg/gl_agc` + `gl_fft` |
|
||||
| **NVS save of NBVI selection** | ⏳ NBVI lives server-side, doesn't apply |
|
||||
|
||||
After ADR-108 the FW boots → CSI ready in ~0.5 s (NVS restore) instead
|
||||
of ~10 s (full 300-packet calibration). Adapting to room changes
|
||||
without recalibration is now a "clear NVS keys" operation — open item
|
||||
ADR-108 #1 will surface that as a REST endpoint.
|
||||
|
||||
## Persisted calibration (NVS on the sensor)
|
||||
|
||||
Pace stores **everything** the algorithm needs in NVS on first boot,
|
||||
so post-reboot the sensor is back in detect mode in well under a
|
||||
second:
|
||||
|
||||
* AGC lock value
|
||||
* FFT lock value
|
||||
* Selected subcarrier indices
|
||||
* Baseline variance
|
||||
* User-tuned threshold
|
||||
|
||||
| Item | Status in RuView |
|
||||
|---|---|
|
||||
| WiFi creds + collector IP in NVS | ✅ `csi_cfg` namespace |
|
||||
| **Gain lock NVS persistence** | ✅ ADR-108 (`csi_cfg/gl_agc` + `gl_fft`) |
|
||||
| **NBVI selection NVS persistence** | ⏳ server-side rolling, intentional |
|
||||
| **Baseline NVS persistence** | ✅ on host disk via ADR-103 (`data/baseline.json`); not on sensor — server is required |
|
||||
| **Threshold NVS persistence** | ✅ derives from baseline_cv loaded by ADR-103 |
|
||||
|
||||
If we ever ship to operators who don't run the Rust server (pure FW
|
||||
+ HA), the server-side bits (NBVI / baseline / threshold) would have
|
||||
to migrate to the sensor's NVS. Not on the current roadmap.
|
||||
|
||||
## The Game (Web Serial calibration UI)
|
||||
|
||||
❌ **Not done.** Pace ships a browser-based reaction game at
|
||||
`espectre.dev/game` that talks to the ESP32 directly over Web Serial
|
||||
API (USB-CDC). The game shows a live motion bar, lets the user tune
|
||||
threshold while playing, and persists the chosen threshold to NVS.
|
||||
|
||||
Our closest analogue is the read-only `raw.html` calibration console
|
||||
(per-node amplitude bars + RSSI traces + classification badges)
|
||||
served by sensing-server on `/static/raw.html`. No interactive
|
||||
threshold tuning; no Web Serial path; no game.
|
||||
|
||||
## Testing
|
||||
|
||||
| Pace ships | RuView has |
|
||||
|---|---|
|
||||
| 500+ unit tests | small smoke tests in some crates |
|
||||
| 90 % code coverage | not tracked |
|
||||
| Fixed 2 000-packet reference capture (1 000 idle + 1 000 motion) | none — we test live on the operator's deployment |
|
||||
| PlatformIO + pytest + ESPHome + Codecov on every push | partial — Rust `cargo test` only; 2 parser regression tests added by parallel agent (`csi.rs:751`) |
|
||||
|
||||
This is the largest reliability gap. A 2 000-packet replay against
|
||||
the classifier would protect against silent regressions when we
|
||||
re-tune thresholds or refactor NBVI.
|
||||
|
||||
## Native Home Assistant integration via ESPHome
|
||||
|
||||
❌ **Not done.** Pace's sensor shows up in HA the moment it's
|
||||
flashed — `binary_sensor.motion_<room>` entity with attributes.
|
||||
ESPHome handles MQTT / native API / device discovery automatically.
|
||||
|
||||
RuView publishes via WebSocket and REST only; would need either an
|
||||
ESPHome component, an MQTT bridge, or a custom HA integration.
|
||||
|
||||
## Hardware support
|
||||
|
||||
* Pace supports ESP32-S3, ESP32-C3, ESP32-C5, ESP32-C6. Gain-lock is
|
||||
guarded on these targets only; ESP32 + ESP32-S2 fall back to no
|
||||
gain lock.
|
||||
* RuView gain-lock code has the same `#if` guard so the same
|
||||
hardware list works — but we only have hands-on test data for
|
||||
ESP32-S3.
|
||||
|
||||
## What Pace announces for Part 3 (not yet shipped, not yet on our radar either)
|
||||
|
||||
* Gesture recognition
|
||||
* Fall detection
|
||||
* Person vs. pet classification
|
||||
|
||||
## Priority for RuView — current state
|
||||
|
||||
### ✅ Done in this session
|
||||
|
||||
| Item | Where |
|
||||
|---|---|
|
||||
| NVS persistence of gain-lock | ADR-108 (`3779bb76`) |
|
||||
| FP-rate validation of NBVI (Step 3) | ADR-104 D4 (`6212b17e`) |
|
||||
| `POST /api/v1/baseline/calibrate` + UI button | ADR-107 (`0f373467`, `45c1464c`) |
|
||||
| Auto-recalibrate on long-quiet periods | ADR-107 (`0f373467`) |
|
||||
| Per-subcarrier baseline comparison | ADR-104 (`6212b17e`) |
|
||||
| Full complex CSI in WS (amp+phase+meta) | ADR-106 (`4daa2c9b`) |
|
||||
| Sensor µs timestamp from FW | ADR-106 (`b787f40a`) |
|
||||
| Managed-ping CSI keepalive (no manual ping) | ADR-106 (`8489efe9`) |
|
||||
| No synthetic data in production runtime | ADR-105 (`9aa027e9`, `30244d27`) |
|
||||
| OTA flash via WiFi (8032 port) | `ota-pipeline.md` (`274984d3`) |
|
||||
|
||||
### ⏳ Still open / deferred, by impact
|
||||
|
||||
**Updated 2026-05-17** — Most of the original "still open" items shipped
|
||||
during this session. The list below is now only items that are **out
|
||||
of session scope** (HA / ESPHome / Web Serial / channel hopping per
|
||||
operator constraints), or items that need operator action (camera-side
|
||||
training capture).
|
||||
|
||||
| # | Item | Net benefit | Estimate | Status |
|
||||
|---|---|---|---|---|
|
||||
| 1 | **HA via MQTT** | sensor as HA entity, ecosystem reach | 1 day | Deferred (operator said: no new integrations) |
|
||||
| 2 | ~~Fixed-replay test suite (2 000 packets)~~ | regression protection over the classifier + NBVI | n/a | ✓ **Done** — ADR-114 (`96225e27`); F1 = 1.000 on 1000 idle + 1000 motion fixtures |
|
||||
| 3 | ~~Per-sub delta sparkline in `raw.html`~~ | operator sees off-axis drift channel firing in real time | n/a | ✓ **Done** — ADR-104 (`eec3ca6c`) drift sparkline + ADR-107 D6 progress bar (`432753e1`) |
|
||||
| 4 | ~~`POST /ota/recalibrate` (clear NVS gain-lock)~~ | reset gain-lock without USB after AP swap or relocation | n/a | ✓ **Done** — ADR-109 (`f92807cd`) |
|
||||
| 5 | ~~Track AP MAC in NVS alongside AGC/FFT~~ | auto-invalidate stale gain-lock on AP change | n/a | ✓ **Done** — folded into ADR-109 (`gl_ap_mac` key, same commit) |
|
||||
| 6 | ~~Multi-AP signal_field via `MultistaticFuser`~~ | physically real spatial map | n/a | ✓ **Done** — ADR-112 (`c8ac60f6`); 320/400 cells non-zero on two live sensors |
|
||||
| 7 | ~~Per-subcarrier baseline AGE check~~ | flag for re-calibration when channel slowly drifts | n/a | ✓ **Done** — ADR-104 staleness watch (`eec3ca6c`) — warns when baseline > 14400 s AND drift > 0.15 for ≥3 ticks |
|
||||
| 8 | ~~Phase-domain drift (vs amplitude-only today)~~ | sub-mm chest-wall motion detection for vitals | n/a | ✓ **Done** — ADR-104 phase channel (`47dafab4`); requires empty-room re-record to activate (`per_subcarrier_phase_mean` not in current `baseline.json` v1 schema) |
|
||||
| 9 | **Tailscale-target in NVS** | sensor stream keeps working when Mac roams networks | 30 min provision + reflash | Deferred (Mac stable on TP-Link, low ROI). **Alternative shipped: ADR-115 `/ota/set-target`** lets operator repoint via REST without USB/Tailscale. |
|
||||
| 10 | **ESPHome native component (instead of MQTT bridge)** | tighter HA integration than #1 | 2-3 days | Deferred (operator said: no new integrations) |
|
||||
| 11 | **Web Serial calibration game** | playful threshold tuning | 1 day | Deferred (operator said: no new integrations) |
|
||||
| 12 | **Boot-time NBVI freeze in FW** | trade-off vs adaptive: don't adopt unless FP issues in real homes | 2 h | Deferred (server-side rolling NBVI working; no observed FP problem) |
|
||||
| 13 | **Per-channel NVS cache for gain-lock** | only needed if channel hopping (ADR-029) re-activated | 1 h | Deferred (channel hopping not active) |
|
||||
| 14 | **DensePose model train + load** | unlock pose estimation | 1-3 days | **Mostly done** — model loader shipped in **ADR-116** (`7cdd8f69`) with `ruv/ruview/wiflow-v1`. Output requires per-deployment fine-tune (camera-supervised capture) — operator-side work, scoped as Pack B / Pack E. |
|
||||
| 15 | **`/ota/set-target` REST** *(new this session)* | repoint CSI aggregator without USB after Mac-IP / router change | — | ✓ **Done** — ADR-115 (`7d3e0c2d`) |
|
||||
| 16 | **Process-hygiene + audit follow-ups** *(new this session)* | UDP loopback filter, ping pre-reap, `/` redirect, wiflow zero-pad, lock-clone optim, sensing-tab container, test-isolation guard, ADR/CHECKLIST consistency | — | ✓ **Done** — ADR-117 (this PR) |
|
||||
|
||||
## References
|
||||
|
||||
* [`espectre-techniques.md`](espectre-techniques.md) — technique catalogue
|
||||
* [`ota-pipeline.md`](ota-pipeline.md) — WiFi-OTA recipe (port 8032)
|
||||
* [ADR-100](../adr/ADR-100-gain-lock-baseline-stabilization.md) — gain lock
|
||||
* [ADR-101](../adr/ADR-101-raw-amplitude-classifier.md) — classifier
|
||||
* [ADR-102](../adr/ADR-102-nbvi-subcarrier-selection.md) — NBVI
|
||||
* [ADR-103](../adr/ADR-103-persistent-baseline.md) — baseline persistence
|
||||
* [ADR-104](../adr/ADR-104-per-subcarrier-drift-presence.md) — per-sub drift + NBVI FP-validation
|
||||
* [ADR-105](../adr/ADR-105-no-synthetic-data-in-production-runtime.md) — no synthetic data
|
||||
* [ADR-106](../adr/ADR-106-full-complex-csi-keepalive.md) — full complex CSI + keepalive
|
||||
* [ADR-107](../adr/ADR-107-auto-recalibrate-and-rest-baseline.md) — REST + auto-recalibrate
|
||||
* [ADR-108](../adr/ADR-108-fw-nvs-persist-gain-lock.md) — FW NVS persist gain-lock
|
||||
* Pace, *How I Turned My Wi-Fi Into a Motion Sensor — Part 2*, Dec 2025
|
||||
* `francescopace/espectre` on GitHub (GPLv3)
|
||||
|
|
@ -0,0 +1,199 @@
|
|||
# ESPectre (Francesco Pace) — Technique Reference
|
||||
|
||||
Source: Pace's *Part 2* (Dec 2025) +
|
||||
[francescopace/espectre](https://github.com/francescopace/espectre)
|
||||
(GPLv3). Living checklist of techniques + RuView adoption status;
|
||||
update when items move.
|
||||
|
||||
## 1. Gain Lock (AGC + FFT scale)
|
||||
|
||||
The ESP32 PHY applies automatic gain control per packet. For normal
|
||||
WiFi reception that keeps decoding optimal; for CSI sensing it
|
||||
manifests as a 20-30 % slow drift in amplitude even in an empty
|
||||
room, masking real body modulation. Two undocumented PHY routines
|
||||
freeze the gain:
|
||||
|
||||
```c
|
||||
extern void phy_fft_scale_force(bool force_en, int8_t force_value);
|
||||
extern void phy_force_rx_gain(int force_en, int force_value);
|
||||
```
|
||||
|
||||
Recipe:
|
||||
|
||||
1. After WiFi association, collect AGC and FFT gain values from
|
||||
each CSI packet.
|
||||
2. At packet 300 (~3 s at 100 pps), take the **median** of each
|
||||
(more robust than mean against outliers).
|
||||
3. Call the two PHY routines with the medians to lock the radio.
|
||||
4. Safety branch: if median AGC < 30, skip the lock — forcing low
|
||||
gain freezes the RX path. Sensor must be moved further from AP.
|
||||
|
||||
Supported targets: ESP32-S3, ESP32-C3, ESP32-C5, ESP32-C6. Older
|
||||
parts have no access to these PHY hooks.
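
The decision logic of the recipe above can be sketched in a few lines. This is an illustrative Python model only: the firmware does this in C, and the function name `gain_lock_decision` and the constants are hypothetical names chosen here. It uses the lower median so the locked value is always one that was actually observed.

```python
from statistics import median_low

SAMPLE_COUNT = 300   # packets collected before locking (step 2)
AGC_MIN_SAFE = 30    # below this the lock is skipped (step 4)


def gain_lock_decision(agc_samples, fft_samples):
    """Return (agc, fft) to force on the PHY, or None to skip the lock.

    Assumes one AGC and one FFT gain reading per CSI packet.
    """
    if len(agc_samples) < SAMPLE_COUNT or len(fft_samples) < SAMPLE_COUNT:
        raise ValueError("need a full calibration window before deciding")
    # Median rather than mean: robust against outlier packets.
    agc = median_low(agc_samples[:SAMPLE_COUNT])
    fft = median_low(fft_samples[:SAMPLE_COUNT])
    if agc < AGC_MIN_SAFE:
        # Safety branch: forcing a low gain would freeze the RX path.
        return None
    return agc, fft
```

A `None` result corresponds to the "move the sensor further from the AP" case; the caller would leave AGC running and log a warning.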
|
||||
|
||||
**RuView status — DONE.** ADR-100 (commit `8aef8206`).
|
||||
Implemented in `firmware/esp32-csi-node/main/csi_collector.c` as
|
||||
`rv_gain_lock_process`. Boot log on both sensors:
|
||||
`gain-lock APPLIED: AGC=42/44, FFT=-31/-42 (median of 300 packets)`.
|
||||
Empty-room CV dropped from ~10 % (full broadband) to 3-4 % after
|
||||
NBVI also kicked in.
|
||||
|
||||
## 2. NBVI — Normalized Baseline Variability Index
|
||||
|
||||
Per-subcarrier score that picks the K most useful subcarriers
|
||||
automatically.
|
||||
|
||||
```
|
||||
NBVI(k) = α · (σ_k / μ_k²) + (1 - α) · (σ_k / μ_k), α = 0.5
|
||||
```
|
||||
|
||||
* `σ_k / μ_k²` penalises weak subcarriers (low μ → high score → bad).
|
||||
* `σ_k / μ_k` is the standard coefficient of variation; rewards
|
||||
stability.
|
||||
* α = 0.5 balances; pure σ/μ² picks stable-but-quiet bins, pure σ/μ
|
||||
picks loud-but-noisy bins.
|
||||
* Amplitude-only (no phase) — phase has Temporal Phase Rotation
|
||||
artefacts that need extra calibration; amplitude is
calibration-free.
|
||||
|
||||
Four-step pipeline at boot:
|
||||
|
||||
| Step | What | Detail |
|
||||
|---|---|---|
|
||||
| 1 | **Find quiet moments** | Slide a window across the calibration buffer, pick the windows with the lowest aggregate variance via percentile detection. Tolerates someone walking through during boot. |
|
||||
| 2 | **Dead-zone gate** | Drop any subcarrier with mean amplitude below the 25th percentile across all subcarriers. Guard tones + null bins are excluded so they don't "win" σ/μ² → ∞. |
|
||||
| 3 | **Rank + validate** | Sort by NBVI ascending. Run the motion detector on each candidate config, measure false-positive rate, take the config with the lowest FP. |
|
||||
| 4 | **Pick winners** | Top-K by lowest NBVI (typically K = 12 for HT20). |
|
||||
|
||||
Memory: O(N) running with on-the-fly mean/variance, ≈ 256 B for 64
|
||||
subcarriers. Time: O(N · L) per recompute, on the order of milliseconds on a $10 device.
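
A minimal Python sketch of Steps 2 and 4 (dead-zone gate, then top-K by lowest NBVI) may make the formula concrete. It is an illustration under assumptions, not the ESPectre or RuView code: the function names are hypothetical, the percentile uses a simple nearest-rank rule, σ is taken as the population standard deviation, and Steps 1 and 3 (quiet-window search, FP-rate validation) are left out.

```python
from statistics import mean, pstdev


def nbvi_scores(amplitudes, alpha=0.5):
    """Map subcarrier index -> NBVI score; lower is better.

    `amplitudes` is a list with one list of amplitude samples per subcarrier.
    """
    scores = {}
    for k, samples in enumerate(amplitudes):
        mu = mean(samples)
        if mu <= 0:
            continue  # null bin: the score would be undefined
        sigma = pstdev(samples)
        scores[k] = alpha * sigma / mu**2 + (1 - alpha) * sigma / mu
    return scores


def select_subcarriers(amplitudes, k=12, alpha=0.5, dead_zone_pct=25):
    """Drop subcarriers below the dead-zone percentile, then keep the k best."""
    means = [mean(samples) for samples in amplitudes]
    ordered = sorted(means)
    cutoff = ordered[int(len(ordered) * dead_zone_pct / 100)]
    scores = nbvi_scores(amplitudes, alpha)
    candidates = [i for i in scores if means[i] >= cutoff]
    candidates.sort(key=lambda i: scores[i])  # stable: ties keep index order
    return candidates[:k]
```

The dead-zone gate runs before ranking so that a weak but perfectly flat bin (σ = 0) cannot win on score alone.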
|
||||
|
||||
**RuView status — DONE (all 4 NBVI steps).** Server-side: ADR-102
|
||||
(`2f12a223`, `f4119924`) covers Steps 1+2+4; ADR-104 D4 (`6212b17e`)
|
||||
closes Step 3 (K ∈ {6,8,10,12,16,20} sweep, smallest-FP wins).
FW-side boot freeze remains intentionally absent: server-side rolling
|
||||
refresh adapts to slow channel drift (ADR-102 D6).
|
||||
|
||||
Empirically on the operator's deployment NBVI alone gave a 1.5-2× CV
|
||||
reduction:
|
||||
|
||||
| | Full 56 subc | NBVI top-12 |
|
||||
|---|---|---|
|
||||
| node 1 idle CV | 5.0 % | 3.1 % |
|
||||
| node 2 idle CV | 7.0 % | 3.9 % |
|
||||
|
||||
## 3. Baseline-variance threshold normalization
|
||||
|
||||
Pace's third problem was that `threshold = 1.0` meant different
|
||||
things on different devices. Fix:
|
||||
|
||||
```python
|
||||
if baseline_variance > 0.25:
    scale = 0.25 / baseline_variance
else:
    scale = 1.0
|
||||
```
|
||||
|
||||
Reference 0.25 is what a quiet room typically measures during NBVI
|
||||
calibration. Apply the scale to the live motion score, so the
user-facing threshold (`= 1.0`) is universal across rooms.
|
||||
|
||||
**RuView status — DONE.** ADR-103 D3 (commit `2f4b2d53`).
|
||||
`amp_node_level` and `amp_classify_from_latest` divide live CV by
|
||||
`baseline_cv` loaded from `data/baseline.json` and gate at universal
|
||||
`3×` (moving) / `6×` (active). Falls back to absolute gates
|
||||
`0.10 / 0.22` when no calibration loaded — backwards compatible.
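
The gating described above can be modelled as follows. This is a hedged Python sketch, not the Rust in `amp_node_level`: the function name `classify_cv`, the `absent` label for the lowest state, and the use of `>=` at the gate boundaries are assumptions made for illustration. The `3×` / `6×` ratios and the `0.10 / 0.22` fallback come from the text.

```python
def classify_cv(live_cv, baseline_cv=None):
    """Classify a node from its live coefficient of variation.

    With a calibrated baseline the gates are universal multiples of it;
    without one, the absolute fallback gates apply.
    """
    if baseline_cv is not None and baseline_cv > 0:
        value = live_cv / baseline_cv      # normalised, room-independent
        moving_gate, active_gate = 3.0, 6.0
    else:
        value = live_cv                    # no calibration loaded
        moving_gate, active_gate = 0.10, 0.22
    if value >= active_gate:
        return "active"
    if value >= moving_gate:
        return "moving"
    return "absent"
```

For example, a live CV of 0.12 against a baseline of 0.03 is a 4× ratio and lands in `moving`, which is the same answer the uncalibrated fallback gives for that value.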
|
||||
|
||||
## 4. Two-phase boot calibration
|
||||
|
||||
```
|
||||
PHASE 1: GAIN LOCK (3 s, 300 packets)
|
||||
Collect AGC/FFT → median → lock.
|
||||
PHASE 2: NBVI CALIBRATION (7 s, 700 packets)
|
||||
With gain locked, rank subcarriers → pick top-K.
|
||||
Total ≈ 10 s. Room must be mostly quiet during this window.
|
||||
```
|
||||
|
||||
**RuView status — SPLIT.** Phase 1 is in FW (ADR-100). Phase 2 lives
|
||||
in the server as a rolling refresh, not a boot-time fix-point. See
|
||||
NBVI section above for the implications.
|
||||
|
||||
## 5. Persisted baseline / device threshold
|
||||
|
||||
After NBVI calibration, ESPectre writes the AGC/FFT lock values, the
|
||||
chosen subcarrier set, the baseline variance, and the threshold into
|
||||
NVS so reboots don't need re-calibration.
|
||||
|
||||
**RuView status — DONE.** Two-layer persistence:
|
||||
* **Server side (ADR-103, commits `f4119924`, `2f4b2d53`)**:
|
||||
`data/baseline.json` keeps per-node full-broadband mean/p95/CV +
|
||||
per-subcarrier means, loaded on server boot via `load_baseline_file`.
|
||||
* **FW side (ADR-108, commit `3779bb76`)**: gain-lock AGC + FFT
|
||||
saved to NVS namespace `csi_cfg` keys `gl_agc`/`gl_fft` after the
|
||||
first calibration; subsequent boots restore instantly (skip the
|
||||
300-packet sampler). NBVI selection is **intentionally**
server-side rolling, not persisted (a design choice, not a gap).
|
||||
|
||||
## 6. Interactive Web Serial game (`espectre.dev/game`)
|
||||
|
||||
Browser ↔ ESP32 over USB Web Serial API. Shows live motion as a bar,
|
||||
lets user tune `threshold` while playing a reaction game. Settings
|
||||
persist via NVS.
|
||||
|
||||
**RuView status — NOT DONE.** Closest analogue is our `raw.html`
|
||||
calibration console (per-node bars + RSSI trace), but it's read-only.
|
||||
|
||||
## 7. Native Home Assistant integration via ESPHome
|
||||
|
||||
Sensor exposes occupancy/motion entities directly to HA.
|
||||
|
||||
**RuView status — NOT DONE.** No HA integration path. Could be added
|
||||
via MQTT or a custom ESPHome component.
|
||||
|
||||
## 8. Test suite
|
||||
|
||||
Pace ships 500+ unit tests, 90 % coverage, validated against a fixed
|
||||
2000-packet capture (1000 idle + 1000 motion). CI runs PlatformIO,
|
||||
pytest, ESPHome build, Codecov on every push.
|
||||
|
||||
**RuView status — PARTIAL.** Agent added 2 regression tests for the
|
||||
binary CSI frame parser (`csi.rs:751`); no regression set captured
|
||||
for the amplitude classifier or NBVI.
|
||||
|
||||
## Comparison summary (what RuView has, doesn't have, has differently)
|
||||
|
||||
| Item | Pace / ESPectre | RuView |
|
||||
|---|---|---|
|
||||
| Gain lock | FW, 300 pkt median, AGC+FFT, AGC<30 skip | ✅ ADR-100 |
|
||||
| NBVI formula α=0.5, top-12, dead-zone gate | ✅ | ✅ ADR-102 |
|
||||
| Quiet-window finder (Step 1) | ✅ | ✅ ADR-102 v2 |
|
||||
| FP-rate validation (Step 3) | ✅ | ✅ ADR-104 D4 |
|
||||
| Boot-time NBVI freeze | FW, ~7 s post-lock | ❌ server-side rolling |
|
||||
| Baseline variance normalization (universal threshold) | ✅ | ✅ ADR-103 D3 |
|
||||
| Persisted baseline to disk | NVS | ✅ ADR-103 D1 (`data/baseline.json`) |
|
||||
| NVS persistence of FW calibration | ✅ | ✅ gain-lock only, ADR-108 (NBVI stays server-side) |
|
||||
| Calibration UI | Web Serial game | ❌ read-only `raw.html` |
|
||||
| HA / ESPHome integration | ✅ | ❌ none |
|
||||
| Test suite | 500+ tests, 90 % cov | ❌ 2 parser tests |
|
||||
| Phase / amplitude | amplitude only | ✅ same |
|
||||
|
||||
## Open items (full gap-by-section: [`espectre-gap-analysis.md`](espectre-gap-analysis.md))
|
||||
|
||||
1. **REST `POST /api/v1/baseline/calibrate`** — drives the recording
|
||||
script from a button in `raw.html` instead of CLI. ~30 min.
|
||||
2. **FP-rate validation of NBVI pick** — defense against the top-12
|
||||
accidentally overlapping a noise source. ~1 h.
|
||||
3. **Per-subcarrier baseline comparison (ADR-104 draft)** — uses the
|
||||
already-saved `per_subcarrier_mean` in `baseline.json` for L2
|
||||
distance instead of broadband mean ratio. Better off-axis
|
||||
presence sensing. ~1 h.
|
||||
4. **Auto-recalibrate on long quiet periods** — if classifier sees
|
||||
`absent` with low variance for 30 min, refresh baseline in
|
||||
background. Eliminates manual script step entirely. ~1 h.
|
||||
5. **FW-side NBVI boot-freeze + NVS persistence** — full
|
||||
reproducibility, sub-second post-boot ready. Trade-off: doesn't
|
||||
adapt to room changes. ~2 h.
|
||||
6. **HA / ESPHome integration** — sensor as HA entity. ~1 day.
|
||||
7. **Test suite vs fixed 2 000-packet replay** — regression
|
||||
protection for the classifier + NBVI. ~1 day.
|
||||
8. **Web Serial calibration game** — nice-to-have. ~1 day.
|
||||
|
|
@ -0,0 +1,367 @@
|
|||
# OTA Pipeline — Full Reproduction Recipe
|
||||
|
||||
Verbatim agent contribution (2026-05-17), saved as authoritative
|
||||
reference for the WiFi-OTA flow on this RuView fork. Kept whole
|
||||
deliberately — splitting it would lose the diagnostic flowchart.
|
||||
|
||||
## TL;DR
|
||||
|
||||
OTA works because **three FW-side fixes** are in place. Without them
|
||||
the chip receives the firmware, reboots, **panics during early boot
|
||||
of the new partition**, the bootloader rolls back, and from outside
|
||||
it looks like "OTA didn't work" even though the upload succeeded.
|
||||
Most agents focus on the network side (curl, gh-action) and miss it,
|
||||
because the bug lives inside the firmware.
|
||||
|
||||
---
|
||||
|
||||
## 0 · Prerequisites (without them OTA = panic loop)
|
||||
|
||||
These three things **must already be in the firmware running on the
|
||||
chip** (i.e. in ota_0/factory before the first OTA). If they're not
|
||||
there, fix once via USB-flash; after that, OTA works.
|
||||
|
||||
### A. `OTA_SIZE_UNKNOWN` instead of `OTA_WITH_SEQUENTIAL_WRITES`
|
||||
|
||||
**File:** `firmware/esp32-csi-node/main/ota_update.c:137`
|
||||
|
||||
```c
|
||||
esp_err_t err = esp_ota_begin(update_partition, OTA_SIZE_UNKNOWN, &ota_handle);
|
||||
```
|
||||
|
||||
**Why:** `OTA_WITH_SEQUENTIAL_WRITES` erases 4 KB pages on the fly
|
||||
as it writes. If the new binary (~870 KB) is smaller than the previous
|
||||
one in the same partition (~1.1 MB), **tail of the old code stays in
|
||||
the partition**. The SHA-image-verify in `esp_ota_end()` only checks
|
||||
the declared image-header length — residual code isn't covered. After
|
||||
reboot the new app may jump into IRAM / a .literal pool address
|
||||
overlapped by stale code → **Guru Meditation Error** → bootloader
|
||||
rolls back.
|
||||
|
||||
`OTA_SIZE_UNKNOWN` forces a **full partition erase before write**
|
||||
(~1.5 s overhead, unnoticeable).
|
||||
|
||||
### B. `config.stack_size = 8192` for httpd
|
||||
|
||||
**File:** `firmware/esp32-csi-node/main/ota_update.c:225`
|
||||
|
||||
```c
|
||||
httpd_config_t config = HTTPD_DEFAULT_CONFIG(); // default stack_size = 4096
|
||||
config.server_port = OTA_PORT;
|
||||
config.max_uri_handlers = 12;
|
||||
config.recv_wait_timeout = 30;
|
||||
config.stack_size = 8192; // ← critical
|
||||
```
|
||||
|
||||
**Why:** `esp_ota_end()` streams a SHA-256 verify over the entire
|
||||
image and walks the mmap segments = >5 KB of local variables. On the
|
||||
standard 4 KB httpd-task stack → **stack overflow** at validation
|
||||
time. The chip panics **inside the handler**, before
|
||||
`esp_ota_set_boot_partition()`. From outside you see
|
||||
`{"status":"ok"}` (it's sent before `esp_ota_end`), but the partition
|
||||
doesn't switch.
|
||||
|
||||
### C. Reset reason logged in `app_main`
|
||||
|
||||
**File:** `firmware/esp32-csi-node/main/main.c:130-153`
|
||||
|
||||
```c
|
||||
static const char *reset_reason_str(esp_reset_reason_t r) {
    switch (r) {
    case ESP_RST_PANIC:    return "PANIC";
    case ESP_RST_TASK_WDT: return "TASK_WDT";
    case ESP_RST_SW:       return "SW";
    ...
    }
}
void app_main(void) {
    esp_reset_reason_t rr = esp_reset_reason();
    const esp_partition_t *running = esp_ota_get_running_partition();
    ESP_LOGI(TAG, "boot: reset_reason=%s running_partition=%s",
             reset_reason_str(rr),
             running ? running->label : "?");
    ...
}
|
||||
```
|
||||
|
||||
**Why:** Without this line you **cannot tell** "new image booted
|
||||
cleanly after OTA" from "new image panicked → rolled back". `/ota/status`
|
||||
looks the same (or suspicious) in both cases. With this line the
|
||||
first UART line after boot tells the truth:
|
||||
|
||||
- `reset_reason=SW running_partition=ota_1` → OTA OK, new image in ota_1.
|
||||
- `reset_reason=PANIC running_partition=ota_0` → new image panicked,
|
||||
rollback worked. **This is the case other agents get stuck in —
|
||||
without the log it's impossible to diagnose.**
|
||||
|
||||
---
|
||||
|
||||
## 1 · Wire format of POST /ota
|
||||
|
||||
**Endpoint:** `POST http://<node-ip>:8032/ota`
|
||||
|
||||
**Headers:**
|
||||
- `Content-Type: application/octet-stream` (required)
|
||||
- `Content-Length: <bytes>` (curl/urllib sets it)
|
||||
- `Authorization: Bearer <psk>` (only if `security/ota_psk` is in NVS)
|
||||
|
||||
**Body:** raw bytes of `build/esp32-csi-node.bin` — no multipart, no base64.
|
||||
|
||||
**Response on success:**
|
||||
|
||||
```json
|
||||
{"status":"ok","message":"OTA update successful. Rebooting..."}
|
||||
```
|
||||
|
||||
**Important about the response:** the chip sends it **before
|
||||
`esp_restart()`**, but `vTaskDelay(1000ms)` between response and
|
||||
restart **does not guarantee delivery**. On macOS / Linux curl will see:
|
||||
|
||||
- `{"status":"ok"...}`, or
|
||||
- `Connection reset by peer` (TCP RST from the dying side), or
|
||||
- `Recv failure`.
|
||||
|
||||
**All three are upload success.** The real check is NOT curl's
|
||||
status — it's a **second GET `/ota/status` after reboot**.
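
A host-side upload that follows these rules might look like the sketch below. It is an assumption-laden illustration, not the contents of `scripts/ota-deploy.sh`: the function name `post_firmware` and the injectable `opener` parameter are hypothetical. A dropped connection is reported as `None` (inconclusive) rather than as a failure, since the verdict comes from the later status check.

```python
import http.client
import urllib.error
import urllib.request


def post_firmware(node_ip, image, psk=None, port=8032, timeout=120.0,
                  opener=urllib.request.urlopen):
    """POST the raw image bytes to /ota.

    Returns the response body, or None if the connection dropped while
    the node was rebooting. Other errors (refused, 401, ...) propagate.
    """
    headers = {"Content-Type": "application/octet-stream"}
    if psk:
        headers["Authorization"] = f"Bearer {psk}"
    req = urllib.request.Request(
        f"http://{node_ip}:{port}/ota", data=image,
        headers=headers, method="POST")
    try:
        with opener(req, timeout=timeout) as resp:
            return resp.read().decode("utf-8", "replace")
    except (ConnectionResetError, http.client.IncompleteRead):
        return None  # TCP RST or truncated reply from the restarting chip
    except urllib.error.URLError as exc:
        if isinstance(exc.reason, ConnectionResetError):
            return None
        raise
```

`Content-Length` is set by `urllib` from the length of `image`, so it does not need to be added by hand.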
|
||||
|
||||
---
|
||||
|
||||
## 2 · Chip's path through the handler
|
||||
|
||||
```
|
||||
HTTP POST /ota
|
||||
│
|
||||
▼
|
||||
ota_check_auth(req) ← if PSK in NVS, verifies Authorization header
|
||||
│
|
||||
▼
|
||||
esp_ota_get_next_update_partition(NULL)
|
||||
│ ← running in ota_0 → returns ota_1, and vice-versa
|
||||
▼
|
||||
esp_ota_begin(part, OTA_SIZE_UNKNOWN, &handle)
|
||||
│ ← full erase of target partition (~1.5 s)
|
||||
▼
|
||||
loop {
|
||||
received = httpd_req_recv(req, buf, 1024)
|
||||
esp_ota_write(handle, buf, received)
|
||||
} ← writes in 1 KB chunks
|
||||
│
|
||||
▼
|
||||
esp_ota_end(handle) ← SHA-256 verify over the entire image (>5 KB stack)
|
||||
│
|
||||
▼
|
||||
esp_ota_set_boot_partition(part) ← writes "boot from target" into otadata
|
||||
│
|
||||
▼
|
||||
httpd_resp_send(JSON) ← replies {"status":"ok"...}
|
||||
│
|
||||
▼
|
||||
vTaskDelay(1000ms) ← window so TCP flush goes out (best-effort)
|
||||
│
|
||||
▼
|
||||
esp_restart() ← soft reset via RTC_SW_CPU_RST
|
||||
│
|
||||
▼
|
||||
[bootloader picks ota_1 from otadata → loads new image → app_main]
|
||||
│
|
||||
▼
|
||||
"I (335) main: boot: reset_reason=SW running_partition=ota_1"
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## 3 · Flashing via `scripts/ota-deploy.sh`
|
||||
|
||||
```bash
|
||||
# Scenario A — deploy to all nodes on local /24 (auto-discover):
|
||||
scripts/ota-deploy.sh
|
||||
|
||||
# Scenario B — specific IPs:
|
||||
scripts/ota-deploy.sh 192.168.0.100 192.168.0.101
|
||||
|
||||
# Scenario C — build before deploy:
|
||||
scripts/ota-deploy.sh --build
|
||||
|
||||
# Scenario D — with auth:
|
||||
OTA_PSK=your_token scripts/ota-deploy.sh
|
||||
```
|
||||
|
||||
**What the script does under the hood (4 phases):**
|
||||
|
||||
### Phase 1 — discovery
|
||||
|
||||
```python
|
||||
arp -a -n → ['192.168.0.100', '192.168.0.101', ...]
|
||||
# parallel GET /ota/status:8032 (timeout 1.5s)
|
||||
# only IPs that return valid JSON survive
|
||||
```
|
||||
|
||||
If ARP is empty (fresh Mac boot) → fallback ping-sweep `.100`–`.110`.
|
||||
|
||||
### Phase 2 — snapshot before
|
||||
|
||||
```
|
||||
GET /ota/status:8032 on each node
|
||||
→ remember running_partition (ota_0 or ota_1)
|
||||
```
|
||||
|
||||
### Phase 3 — parallel upload
|
||||
|
||||
```python
|
||||
ThreadPoolExecutor(max_workers=len(targets))
|
||||
for each node:
|
||||
urllib POST with body = read_bytes(esp32-csi-node.bin)
|
||||
ConnectionResetError caught as expected (that's the reboot)
|
||||
```
|
||||
|
||||
### Phase 4 — verify
|
||||
|
||||
```
|
||||
sleep 10 ← wait for boot to finish
|
||||
for each node (up to 6 retries, 3-s delay):
|
||||
GET /ota/status:8032
|
||||
new_part != old_part → ✓
|
||||
new_part == old_part → ✗ FAIL (panicked)
|
||||
exit 0 if all OK, 1 if any node didn't confirm
|
||||
```
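
The verify phase can be expressed as a small retry loop. The sketch below is illustrative Python under assumptions: `verify_partition_switch` is a hypothetical name, `fetch_status` stands for any callable that returns the parsed `/ota/status` JSON, and retries are spent only while the node is unreachable (the script may retry differently). The initial `sleep 10` is left to the caller.

```python
import time


def verify_partition_switch(fetch_status, old_partition, retries=6,
                            delay_s=3.0, sleep=time.sleep):
    """Return True if the node reports a different running partition.

    Retries while the node is still unreachable; once it answers, the
    partition comparison is final.
    """
    for attempt in range(retries):
        try:
            status = fetch_status()
        except OSError:
            status = None  # node still booting or off the network
        if status is not None:
            return status.get("running_partition") != old_partition
        if attempt < retries - 1:
            sleep(delay_s)
    return False  # never answered: treat as not confirmed
```

Passing `sleep` as a parameter keeps the loop testable without real delays.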
|
||||
|
||||
---
|
||||
|
||||
## 4 · Diagnosis when "OTA doesn't work"
|
||||
|
||||
Flowchart that catches **every observable failure mode** on ESP32-S3
|
||||
in this FW:
|
||||
|
||||
```
|
||||
GET /ota/status works?
|
||||
├── 404/timeout → node offline / wrong network / IP changed (check `arp -a`)
|
||||
├── 200, time=OLD → OTA didn't take (see below)
|
||||
└── 200, time=NEW → OTA OK ✓
|
||||
|
||||
OTA didn't take — diagnose via UART (USB!):
|
||||
|
||||
See "boot: reset_reason=..." in UART?
|
||||
├── reset_reason=POWERON → chip didn't reboot — POST didn't arrive, check curl
|
||||
├── reset_reason=SW AND running_partition=ota_X → OTA OK, may be server-side cache
|
||||
├── reset_reason=PANIC AND running_partition=ota_0
|
||||
│ → NEW image panics at boot
|
||||
│ → causes (most likely first):
|
||||
│ 1. OTA_WITH_SEQUENTIAL_WRITES → tail of old code (fix A above)
|
||||
│ 2. esp_ota_end stack overflow (fix B above)
|
||||
│ 3. ABI mismatch bootloader vs new app (USB-flash bootloader.bin)
|
||||
│ 4. real bug in new code (read the backtrace before PANIC)
|
||||
├── reset_reason=TASK_WDT → handler hung mid-upload
|
||||
└── reset_reason=BROWNOUT → power supply browned out under stress
|
||||
(USB on bus power?)
|
||||
```
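
The UART branch of the flowchart lends itself to a small parser. The Python sketch below is a hypothetical helper, not part of the repository: it matches the `boot: reset_reason=... running_partition=...` line format logged by `app_main`, and the returned strings paraphrase the flowchart. Comparing against the partition that was running before the upload, rather than a hard-coded `ota_0`, is an assumption made here.

```python
import re

BOOT_LINE = re.compile(r"boot: reset_reason=(\w+) running_partition=(\S+)")


def diagnose(uart_line, partition_before_ota):
    """Map the first boot log line after an OTA attempt to a diagnosis."""
    match = BOOT_LINE.search(uart_line)
    if match is None:
        return "no boot line found"
    reason, running = match.groups()
    if reason == "POWERON":
        return "chip did not reboot: the POST did not arrive"
    if reason == "SW":
        return "OTA OK" if running != partition_before_ota else "soft reset, partition unchanged"
    if reason == "PANIC" and running == partition_before_ota:
        return "new image panicked at boot, bootloader rolled back"
    if reason == "TASK_WDT":
        return "handler hung mid-upload"
    if reason == "BROWNOUT":
        return "power supply browned out"
    return f"unclassified reset: {reason}"
```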
|
||||
|
||||
If UART is unavailable (no USB) but HTTP works: POST then GET
|
||||
`/ota/status` three times at 5 s intervals. If `next_partition`
|
||||
flip-flops, the chip is in a panic loop. That's a definitive diagnosis.
|
||||
|
||||
---
|
||||
|
||||
## 5 · Why other agents fail (common pitfalls)
|
||||
|
||||
| Pitfall | Symptom | Fix |
|
||||
|---|---|---|
|
||||
| Treat OTA as a pure network problem, never look at FW | "POST returned 200 but time doesn't change" → endless curl-header experiments | **Verify the three FW prerequisites first**, before any curl |
|
||||
| Use `OTA_WITH_SEQUENTIAL_WRITES` (it's in IDF examples) | OTA works once, stops working after binary size changes | Switch to `OTA_SIZE_UNKNOWN` |
|
||||
| Leave httpd stack at 4 KB | Sometimes works (fast SHA), sometimes doesn't — looks flaky | `config.stack_size = 8192` |
|
||||
| Enable `CONFIG_BOOTLOADER_APP_ROLLBACK_ENABLE=y` "for safety" | Every OTA rolled back because nobody calls `esp_ota_mark_app_valid_cancel_rollback()` | Either disable, or call the API after 10 s |
|
||||
| `curl` without `--data-binary` (only `-d`) | Binary corrupted: `-d @file` strips carriage returns and newlines and sends a form-urlencoded `Content-Type` | Use `--data-binary @file.bin` or urllib bytes |
|
||||
| Measure success by HTTP response code | Connection reset = normal (esp_restart kills socket), not failure | Re-check via **GET /ota/status after reboot** |
|
||||
| Don't wait 10 s after reboot before verify | Verify times out, agent thinks OTA failed | `sleep 10` (or backoff retries) |
|
||||
| Ignore that mDNS names drift | Flash the wrong node, or stale ARP cache | Auto-discover by IP **at deploy time**, not by hostname |
|
||||
| Share a single file descriptor across upload threads | Race conditions, partial reads | Each upload-thread opens its own file |
|
||||
| Rely on bootloader rollback instead of explicit app_valid | Image sometimes flagged BAD, OTA becomes non-idempotent | If rollback enabled, MUST call `esp_ota_mark_app_valid_cancel_rollback()` |
|
||||
|
||||
---
|
||||
|
||||
## 6 · Things other agents do **wrong**
|
||||
|
||||
From recurring patterns in others' logs:
|
||||
|
||||
1. **Rely on `idf.py flash --port .../ota`** — that mode does NOT
|
||||
exist in idf.py. OTA is only via the HTTP handler.
|
||||
2. **Send via `ssh esp32 'esp_ota_write ...'`** — ESP32 has no shell;
|
||||
OTA is only via the HTTP endpoint.
|
||||
3. **Run MQTT-based OTA** — this FW has no MQTT client; only HTTP
|
||||
POST on 8032.
|
||||
4. **Use ESP RainMaker / esp_https_ota** — those require HTTPS +
|
||||
cert; we serve plain HTTP. Don't confuse the APIs.
|
||||
5. **Re-use an old build of
|
||||
`firmware/esp32-csi-node/build/esp32-csi-node.bin`** — forget to
|
||||
run `idf.py build`. The script's `--build` solves that.
|
||||
|
||||
---
|
||||
|
||||
## 7 · Quick reference (for the next agent)
|
||||
|
||||
```bash
|
||||
# Once over USB if the nodes still run pre-fix firmware:
|
||||
cd /Users/arsen/Desktop/RuView/firmware/esp32-csi-node
|
||||
source ~/esp/esp-idf-v5.2/export.sh
|
||||
idf.py build
|
||||
|
||||
# Hold BOOT+RESET on the device
|
||||
cd build
|
||||
esptool.py --chip esp32s3 --port /dev/cu.usbmodem... -b 460800 \
|
||||
--before default-reset --after hard-reset write-flash \
|
||||
--flash-mode dio --flash-size 8MB --flash-freq 80m \
|
||||
0x0 bootloader/bootloader.bin \
|
||||
0x8000 partition_table/partition-table.bin \
|
||||
0xf000 ota_data_initial.bin \
|
||||
0x20000 esp32-csi-node.bin
|
||||
|
||||
# Forever after, over WiFi:
|
||||
scripts/ota-deploy.sh --build
|
||||
# (auto-discover, parallel POST, verify, exit code)
|
||||
```
|
||||
|
||||
## Operator REST endpoints on the running FW (port 8032)
|
||||
|
||||
After the first OTA the FW exposes three control endpoints. They share
|
||||
the same Bearer-PSK auth as `/ota` (open when `security/ota_psk` NVS
|
||||
key is unset, gated when set). All accept plain HTTP — no JSON
|
||||
dependency on the FW side.
|
||||
|
||||
| Method | Path | Body | Purpose | ADR |
|
||||
|---|---|---|---|---|
|
||||
| `GET` | `/ota/status` | — | Version, date, running/next partition, max image size | ADR-045 |
|
||||
| `POST` | `/ota` | image bin | Upload + flash (auth-gated) | ADR-045 |
|
||||
| `POST` | `/ota/recalibrate` | — | Clear `csi_cfg/gl_agc` + `gl_fft` + `gl_ap_mac`, reboot — forces fresh gain-lock at next boot | ADR-109 |
|
||||
| `POST` | `/ota/set-target` | `IPv4:PORT` plain text | Write `csi_cfg/target_ip` + `target_port` to NVS, reboot — repoints the CSI aggregator after Mac IP move / router swap without USB | ADR-115 |
|
||||
|
||||
Examples (operator side, no USB):
|
||||
|
||||
```bash
|
||||
# After moving Mac to a new LAN / changing routers:
|
||||
curl -s -X POST -d '192.168.0.103:5005' http://192.168.0.100:8032/ota/set-target
|
||||
curl -s -X POST -d '192.168.0.103:5005' http://192.168.0.101:8032/ota/set-target
|
||||
# Each returns {"status":"ok","target_ip":"...","target_port":...,"message":"rebooting"}
|
||||
|
||||
# After AP swap that changed the indoor path geometry:
|
||||
curl -X POST http://192.168.0.100:8032/ota/recalibrate
|
||||
# Sensor reboots, re-runs the 300-packet gain-lock sampler (~3–12s).
|
||||
|
||||
# Sanity probe:
|
||||
curl http://192.168.0.100:8032/ota/status
|
||||
```
|
||||
|
||||
With auth provisioned (`security/ota_psk` in NVS):
|
||||
|
||||
```bash
|
||||
curl -X POST -H "Authorization: Bearer $RUVIEW_OTA_PSK" \
|
||||
-d '192.168.0.103:5005' \
|
||||
http://192.168.0.100:8032/ota/set-target
|
||||
```
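
When these calls are scripted, it may help to validate the `IPv4:PORT` body on the host before sending it, since a bad value would only surface after the node reboots. The helper below is a hypothetical Python sketch; the firmware's own parsing rules are not shown in this document, so the checks here are assumptions.

```python
import ipaddress


def set_target_body(ip, port):
    """Build the plain-text body for POST /ota/set-target."""
    addr = ipaddress.IPv4Address(ip)   # raises ValueError on malformed input
    if not 1 <= port <= 65535:
        raise ValueError(f"port out of range: {port}")
    return f"{addr}:{port}".encode("ascii")
```

The resulting bytes can be passed as the request body in place of the literal string given to `curl -d` above.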
|
||||
|
||||
---
|
||||
|
||||
**Bottom line:** OTA is not "send a file via curl"; it is an
**end-to-end protocol** between the on-chip handler and the host
tooling. 80 % of the work lives on the FW side (correct erase,
correct stack, correct log). The network part is trivial
(`urllib.request.urlopen(POST)`). Agents who report that they "can't"
do OTA usually stopped at the network layer and didn't realise the chip
is panicking.
|
||||
|
|
@ -275,6 +275,11 @@ static void emit_feature_state(void)
|
|||
pkt.presence_score = obs.presence_score;
|
||||
pkt.anomaly_score = obs.anomaly_score;
|
||||
pkt.node_coherence = obs.node_coherence;
|
||||
/* ADR-100 D3: ship median RSSI through feature_state so the server
|
||||
* UI's RSSI trace has something other than the -50 fallback. The
|
||||
* value comes from radio_ops::get_health() which medians rx_ctrl.rssi
|
||||
* across the recent capture window. 0 means "not measured yet". */
|
||||
pkt.rssi_dbm = obs.rssi_median_dbm;
|
||||
}
|
||||
|
||||
/* Fill vitals from edge_processing's latest packet. */
|
||||
|
|
|
|||
|
|
@ -17,9 +17,12 @@
|
|||
#include "edge_processing.h"
|
||||
|
||||
#include <string.h>
|
||||
#include <stdlib.h>
|
||||
#include "esp_log.h"
|
||||
#include "esp_wifi.h"
|
||||
#include "esp_timer.h"
|
||||
#include "nvs.h"
|
||||
#include "nvs_flash.h"
|
||||
#include "sdkconfig.h"
|
||||
|
||||
/* ADR-060: Access the global NVS config for MAC filter and channel override. */
|
||||
|
|
@ -52,6 +55,231 @@ static bool s_filter_mac_set = false;
|
|||
|
||||
static const char *TAG = "csi_collector";
|
||||
|
||||
/* ──────────────────────────────────────────────────────────────────
|
||||
* ADR-100: Gain Lock (AGC + FFT scale).
|
||||
*
|
||||
* ESP32 WiFi PHY applies automatic gain control per packet, which
|
||||
* manifests as a 20-30 % slow drift in CSI amplitude even with a
|
||||
* completely static room — masking the real modulation caused by
|
||||
* body motion. Ported from Francesco Pace's ESPectre (GPLv3,
|
||||
* https://github.com/francescopace/espectre).
|
||||
*
|
||||
* The first ~300 packets after boot are sampled. We take the median
|
||||
* AGC + FFT gain values and freeze them with two undocumented PHY
|
||||
* routines from the IDF blob. If the median AGC is below the safe
|
||||
* threshold (sensor sits very close to the AP), we *don't* lock —
|
||||
* forcing a low gain causes the RX path to freeze.
|
||||
* Supported targets: ESP32-S3 / C3 / C6. Older parts skip silently.
|
||||
* ──────────────────────────────────────────────────────────────── */
|
||||
#if CONFIG_IDF_TARGET_ESP32S3 || CONFIG_IDF_TARGET_ESP32C3 || CONFIG_IDF_TARGET_ESP32C6
|
||||
#define RV_GAIN_LOCK_SUPPORTED 1
|
||||
/* Overlay struct on wifi_csi_info_t.rx_ctrl exposing the hidden agc/fft fields. */
|
||||
typedef struct {
|
||||
unsigned : 32; unsigned : 32; unsigned : 32;
|
||||
unsigned : 32; unsigned : 32; unsigned : 16;
|
||||
signed fft_gain : 8;
|
||||
unsigned agc_gain : 8;
|
||||
unsigned : 32; unsigned : 32;
|
||||
unsigned : 32; unsigned : 32; unsigned : 32;
|
||||
unsigned : 32;
|
||||
} rv_phy_rx_ctrl_t;
|
||||
extern void phy_fft_scale_force(bool force_en, int8_t force_value);
|
||||
extern void phy_force_rx_gain(int force_en, int force_value);
|
||||
|
||||
/* ── ADR-108: NVS persistence of gain-lock values ────────────────
|
||||
* After the first successful gain-lock, save AGC/FFT medians into NVS
|
||||
* (namespace "csi_cfg", keys "gl_agc"/"gl_fft"). On subsequent boots
|
||||
* the FW loads them and immediately forces the gain — reboot → CSI
|
||||
* ready in ~0.5 s instead of ~3 s waiting for 300 calibration packets.
|
||||
*
|
||||
* Stored values are tied to: this sensor location + this AP MAC +
|
||||
* this channel + this antenna orientation. If any of those change,
|
||||
* the saved values may be wrong — but harmless: the WiFi PHY will
|
||||
* just receive slightly off-optimal CSI until the operator triggers
|
||||
* a re-calibration (clear NVS and reboot, or POST /ota/recalibrate per ADR-109).
|
||||
*/
|
||||
#define RV_GAIN_NVS_NS "csi_cfg"
|
||||
#define RV_GAIN_NVS_K_AGC "gl_agc"
|
||||
#define RV_GAIN_NVS_K_FFT "gl_fft"
|
||||
/* ADR-111: BSSID of the AP that gain-lock was calibrated against.
|
||||
* 6-byte blob. On boot, if the currently-connected AP MAC differs from
|
||||
* the saved value, the cached AGC/FFT are ignored and a full calibration
|
||||
* runs (gain-lock is tied to a specific AP path; swapping APs invalidates
|
||||
* it). The new MAC is written alongside AGC/FFT after re-calibration. */
|
||||
#define RV_GAIN_NVS_K_AP_MAC "gl_ap_mac"
|
||||
|
||||
static esp_err_t rv_gain_load_from_nvs(uint8_t *agc_out, int8_t *fft_out,
|
||||
uint8_t mac_out[6])
|
||||
{
|
||||
nvs_handle_t h;
|
||||
esp_err_t err = nvs_open(RV_GAIN_NVS_NS, NVS_READONLY, &h);
|
||||
if (err != ESP_OK) return err;
|
||||
uint8_t agc = 0;
|
||||
int8_t fft = 0;
|
||||
err = nvs_get_u8(h, RV_GAIN_NVS_K_AGC, &agc);
|
||||
if (err == ESP_OK) err = nvs_get_i8(h, RV_GAIN_NVS_K_FFT, &fft);
|
||||
/* AP MAC is optional — older NVS blobs predate ADR-111 and have only
|
||||
* AGC+FFT. Treat a missing MAC as a wildcard match so a one-time
|
||||
* upgrade doesn't force every node to do a full re-cal. */
|
||||
if (err == ESP_OK && mac_out != NULL) {
|
||||
size_t want = 6;
|
||||
esp_err_t mac_err = nvs_get_blob(h, RV_GAIN_NVS_K_AP_MAC, mac_out, &want);
|
||||
if (mac_err != ESP_OK || want != 6) {
|
||||
memset(mac_out, 0, 6);
|
||||
}
|
||||
}
|
||||
nvs_close(h);
|
||||
if (err == ESP_OK) { *agc_out = agc; *fft_out = fft; }
|
||||
return err;
|
||||
}
|
||||
|
||||
static void rv_gain_save_to_nvs(uint8_t agc, int8_t fft, const uint8_t mac[6])
|
||||
{
|
||||
nvs_handle_t h;
|
||||
esp_err_t err = nvs_open(RV_GAIN_NVS_NS, NVS_READWRITE, &h);
|
||||
if (err != ESP_OK) {
|
||||
ESP_LOGW("csi_collector", "gain-lock NVS save: nvs_open failed: %s",
|
||||
esp_err_to_name(err));
|
||||
return;
|
||||
}
|
||||
nvs_set_u8(h, RV_GAIN_NVS_K_AGC, agc);
|
||||
nvs_set_i8(h, RV_GAIN_NVS_K_FFT, fft);
|
||||
if (mac != NULL) {
|
||||
nvs_set_blob(h, RV_GAIN_NVS_K_AP_MAC, mac, 6);
|
||||
}
|
||||
nvs_commit(h);
|
||||
nvs_close(h);
|
||||
}
|
||||
#define RV_GAIN_CAL_PACKETS 300u
|
||||
#define RV_GAIN_MIN_SAFE_AGC 30u /* < 30 → forcing freezes RX. */
|
||||
static uint8_t s_agc_samples[RV_GAIN_CAL_PACKETS];
|
||||
static int8_t s_fft_samples[RV_GAIN_CAL_PACKETS];
|
||||
static uint16_t s_gain_pkt_count = 0;
|
||||
static bool s_gain_locked = false;
|
||||
static bool s_gain_skipped_strong = false;
|
||||
static uint8_t s_gain_agc_value = 0;
|
||||
static int8_t s_gain_fft_value = 0;
|
||||
|
||||
static int rv_cmp_u8(const void *a, const void *b) {
|
||||
return (int)*(const uint8_t *)a - (int)*(const uint8_t *)b;
|
||||
}
|
||||
static int rv_cmp_i8(const void *a, const void *b) {
|
||||
return (int)*(const int8_t *)a - (int)*(const int8_t *)b;
|
||||
}
|
||||
|
||||
static void rv_gain_lock_process(const wifi_csi_info_t *info)
|
||||
{
|
||||
if (s_gain_locked || info == NULL) return;
|
||||
|
||||
/* ADR-108: short-circuit calibration if previous values are in NVS.
|
||||
* ADR-111: also compare the saved BSSID with the currently-connected
|
||||
* AP. If they differ, the cached gain is invalid (different AP path
|
||||
* → different multipath, different optimal AGC) — discard it and run
|
||||
* a full calibration against the new AP. */
|
||||
static bool s_nvs_checked = false;
|
||||
if (!s_nvs_checked) {
|
||||
s_nvs_checked = true;
|
||||
uint8_t agc = 0; int8_t fft = 0; uint8_t saved_mac[6] = {0};
|
||||
if (rv_gain_load_from_nvs(&agc, &fft, saved_mac) == ESP_OK &&
|
||||
agc >= RV_GAIN_MIN_SAFE_AGC)
|
||||
{
|
||||
/* Read the current AP MAC. If we can't (not connected yet)
|
||||
* the gain-lock callback should not be firing at all — but
|
||||
* be defensive and skip the cache if AP info is unavailable. */
|
||||
wifi_ap_record_t ap;
|
||||
bool ap_ok = (esp_wifi_sta_get_ap_info(&ap) == ESP_OK);
|
||||
bool wildcard = true;
|
||||
for (int i = 0; i < 6; i++) {
|
||||
if (saved_mac[i] != 0) { wildcard = false; break; }
|
||||
}
|
||||
if (ap_ok && (wildcard ||
|
||||
memcmp(saved_mac, ap.bssid, 6) == 0))
|
||||
{
|
||||
phy_fft_scale_force(true, fft);
|
||||
phy_force_rx_gain(1, (int)agc);
|
||||
s_gain_agc_value = agc;
|
||||
s_gain_fft_value = fft;
|
||||
s_gain_locked = true;
|
||||
ESP_LOGI("csi_collector",
|
||||
"gain-lock RESTORED from NVS: AGC=%u FFT=%d "
|
||||
"AP=%02x:%02x:%02x:%02x:%02x:%02x%s",
|
||||
(unsigned)agc, (int)fft,
|
||||
ap.bssid[0], ap.bssid[1], ap.bssid[2],
|
||||
ap.bssid[3], ap.bssid[4], ap.bssid[5],
|
||||
wildcard ? " (legacy NVS, no MAC stored)" : "");
|
||||
return;
|
||||
}
|
||||
if (ap_ok) {
|
||||
ESP_LOGW("csi_collector",
|
||||
"gain-lock NVS MISS: saved AP=%02x:%02x:%02x:%02x:%02x:%02x "
|
||||
"→ current=%02x:%02x:%02x:%02x:%02x:%02x. Re-calibrating.",
|
||||
saved_mac[0], saved_mac[1], saved_mac[2],
|
||||
saved_mac[3], saved_mac[4], saved_mac[5],
|
||||
ap.bssid[0], ap.bssid[1], ap.bssid[2],
|
||||
ap.bssid[3], ap.bssid[4], ap.bssid[5]);
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
const rv_phy_rx_ctrl_t *phy = (const rv_phy_rx_ctrl_t *)info;
|
||||
|
||||
if (s_gain_pkt_count < RV_GAIN_CAL_PACKETS) {
|
||||
s_agc_samples[s_gain_pkt_count] = phy->agc_gain;
|
||||
s_fft_samples[s_gain_pkt_count] = phy->fft_gain;
|
||||
s_gain_pkt_count++;
|
||||
if (s_gain_pkt_count == RV_GAIN_CAL_PACKETS / 4 ||
|
||||
s_gain_pkt_count == RV_GAIN_CAL_PACKETS / 2 ||
|
||||
s_gain_pkt_count == (3u * RV_GAIN_CAL_PACKETS) / 4u) {
|
||||
ESP_LOGI(TAG, "gain-lock cal %u%% (%u/%u, AGC=%u FFT=%d)",
|
||||
(unsigned)((s_gain_pkt_count * 100u) / RV_GAIN_CAL_PACKETS),
|
||||
(unsigned)s_gain_pkt_count, (unsigned)RV_GAIN_CAL_PACKETS,
|
||||
(unsigned)phy->agc_gain, (int)phy->fft_gain);
|
||||
}
|
||||
return;
|
||||
}
|
||||
|
||||
/* Reached the calibration target — compute medians, lock or skip. */
|
||||
qsort(s_agc_samples, RV_GAIN_CAL_PACKETS, sizeof(uint8_t), rv_cmp_u8);
|
||||
qsort(s_fft_samples, RV_GAIN_CAL_PACKETS, sizeof(int8_t), rv_cmp_i8);
|
||||
s_gain_agc_value = s_agc_samples[RV_GAIN_CAL_PACKETS / 2];
|
||||
s_gain_fft_value = s_fft_samples[RV_GAIN_CAL_PACKETS / 2];
|
||||
|
||||
if (s_gain_agc_value < RV_GAIN_MIN_SAFE_AGC) {
|
||||
s_gain_skipped_strong = true;
|
||||
ESP_LOGW(TAG,
|
||||
"gain-lock SKIPPED: AGC median=%u < %u (signal too strong, "
|
||||
"forcing would freeze RX). Move sensor 2-3 m from AP.",
|
||||
(unsigned)s_gain_agc_value, (unsigned)RV_GAIN_MIN_SAFE_AGC);
|
||||
} else {
|
||||
phy_fft_scale_force(true, s_gain_fft_value);
|
||||
phy_force_rx_gain(1, (int)s_gain_agc_value);
|
||||
ESP_LOGI(TAG,
|
||||
"gain-lock APPLIED: AGC=%u FFT=%d (median of %u packets) — "
|
||||
"baseline drift should now collapse.",
|
||||
(unsigned)s_gain_agc_value, (int)s_gain_fft_value,
|
||||
(unsigned)RV_GAIN_CAL_PACKETS);
|
||||
/* ADR-108: persist for next boot — short-circuit calibration.
|
||||
* ADR-111: also persist the AP BSSID this calibration ran against
|
||||
* so the boot-time short-circuit can detect AP swaps and discard
|
||||
* stale gain values. */
|
||||
uint8_t cur_mac[6] = {0};
|
||||
wifi_ap_record_t ap;
|
||||
if (esp_wifi_sta_get_ap_info(&ap) == ESP_OK) {
|
||||
memcpy(cur_mac, ap.bssid, 6);
|
||||
}
|
||||
rv_gain_save_to_nvs(s_gain_agc_value, s_gain_fft_value, cur_mac);
|
||||
ESP_LOGI(TAG,
|
||||
"gain-lock PERSISTED to NVS (AGC=%u FFT=%d AP=%02x:%02x:%02x:%02x:%02x:%02x)",
|
||||
(unsigned)s_gain_agc_value, (int)s_gain_fft_value,
|
||||
cur_mac[0], cur_mac[1], cur_mac[2],
|
||||
cur_mac[3], cur_mac[4], cur_mac[5]);
|
||||
}
|
||||
s_gain_locked = true;
|
||||
}
|
||||
#else
|
||||
static inline void rv_gain_lock_process(const wifi_csi_info_t *info) { (void)info; }
|
||||
#endif
|
||||
|
||||
static uint32_t s_sequence = 0;
|
||||
static uint32_t s_cb_count = 0;
|
||||
static uint32_t s_send_ok = 0;
|
||||
|
|
@ -64,7 +292,10 @@ static uint32_t s_rate_skip = 0;
|
|||
* We cap the send rate to avoid exhausting lwIP packet buffers (ENOMEM).
|
||||
* Default: 20 ms = 50 Hz max send rate.
|
||||
*/
|
||||
#define CSI_MIN_SEND_INTERVAL_US (20 * 1000)
|
||||
/* Minimum send interval reduced from 20 ms (50 Hz) to 4 ms (250 Hz) so the host calibration
|
||||
* UI can show every available frame. The real ceiling is whatever rate the
|
||||
* WiFi CSI callback actually fires at (usually 5-50 Hz on a quiet LAN). */
|
||||
#define CSI_MIN_SEND_INTERVAL_US (4 * 1000)
|
||||
static int64_t s_last_send_us = 0;
|
||||
|
||||
/**
|
||||
|
|
@ -116,6 +347,10 @@ static esp_timer_handle_t s_hop_timer = NULL;
|
|||
* [17] Noise floor (i8)
|
||||
* [18..19] Reserved
|
||||
* [20..] I/Q data (raw bytes from ESP-IDF callback)
|
||||
* [20+iq_len .. 20+iq_len+3] ADR-106: sensor timestamp_us (u32 LE)
|
||||
* from info->rx_ctrl.timestamp. Trailing
|
||||
* 4 bytes — server parses opportunistically;
|
||||
* old server tolerant of extra bytes.
|
||||
*/
|
||||
size_t csi_serialize_frame(const wifi_csi_info_t *info, uint8_t *buf, size_t buf_len)
|
||||
{
|
||||
|
|
@ -127,7 +362,7 @@ size_t csi_serialize_frame(const wifi_csi_info_t *info, uint8_t *buf, size_t buf
|
|||
uint16_t iq_len = (uint16_t)info->len;
|
||||
uint16_t n_subcarriers = iq_len / (2 * n_antennas);
|
||||
|
||||
size_t frame_size = CSI_HEADER_SIZE + iq_len;
|
||||
size_t frame_size = CSI_HEADER_SIZE + iq_len + 4 /* ADR-106 trailing timestamp_us */;
|
||||
if (frame_size > buf_len) {
|
||||
ESP_LOGW(TAG, "Buffer too small: need %u, have %u", (unsigned)frame_size, (unsigned)buf_len);
|
||||
return 0;
|
||||
|
|
@ -180,6 +415,13 @@ size_t csi_serialize_frame(const wifi_csi_info_t *info, uint8_t *buf, size_t buf
|
|||
/* I/Q data */
|
||||
memcpy(&buf[CSI_HEADER_SIZE], info->buf, iq_len);
|
||||
|
||||
/* ADR-106: trailing sensor µs timestamp from rx_ctrl.timestamp.
|
||||
* This is monotonic µs since FW boot (per ESP-IDF docs) and lets
|
||||
* the host align frames across nodes within ~µs once the boot
|
||||
* offsets are learned. Old server ignores trailing bytes. */
|
||||
uint32_t ts_us = info->rx_ctrl.timestamp;
|
||||
memcpy(&buf[CSI_HEADER_SIZE + iq_len], &ts_us, 4);
|
||||
|
||||
return frame_size;
|
||||
}
|
||||
|
||||
|
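On the receiving side, the ADR-106 trailer from the hunk above can be read back as a little-endian u32 placed after the I/Q payload. This is a minimal host-side sketch under assumptions: `rv_read_trailing_ts`, `rv_ts_delta_us` and `RV_HDR_SIZE` are hypothetical names, the 20-byte header size is taken from the `[20..] I/Q data` layout comment, and the caller is assumed to already know `iq_len` from the header.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define RV_HDR_SIZE 20u   /* assumption: I/Q data starts at byte 20 */

/* Hypothetical helper. Returns true and fills *ts_us when the frame carries
 * the 4-byte trailer; returns false for frames that end right after the
 * I/Q payload (firmware built before ADR-106). */
static bool rv_read_trailing_ts(const uint8_t *frame, size_t frame_len,
                                size_t iq_len, uint32_t *ts_us)
{
    if (frame == NULL || ts_us == NULL) return false;
    if (iq_len > SIZE_MAX - RV_HDR_SIZE - 4u) return false;   /* overflow guard */

    size_t off = RV_HDR_SIZE + iq_len;
    if (frame_len < off + 4u) return false;                   /* no trailer present */

    /* Assemble byte by byte so the result does not depend on host endianness. */
    *ts_us = (uint32_t)frame[off]
           | ((uint32_t)frame[off + 1] << 8)
           | ((uint32_t)frame[off + 2] << 16)
           | ((uint32_t)frame[off + 3] << 24);
    return true;
}

/* Elapsed microseconds between two u32 timestamps. Unsigned subtraction is
 * modulo 2^32, so a single counter wrap (about every 71.6 minutes) is
 * handled without special cases. */
static uint32_t rv_ts_delta_us(uint32_t earlier, uint32_t later)
{
    return later - earlier;
}
```

Checking the frame length before reading keeps the parser compatible in both directions: new hosts accept old frames, and old hosts that ignore trailing bytes are unaffected.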
|
@ -208,6 +450,11 @@ static void wifi_csi_callback(void *ctx, wifi_csi_info_t *info)
|
|||
}
|
||||
}
|
||||
|
||||
/* ADR-100: feed the gain-lock calibrator. No-op once locked / on
|
||||
* unsupported targets. Runs before the heavy work so calibration
|
||||
* happens during the first ~6 s after boot regardless of host traffic. */
|
||||
rv_gain_lock_process(info);
|
||||
|
||||
s_cb_count++;
|
||||
|
||||
if (s_cb_count <= 3 || (s_cb_count % 100) == 0) {
|
||||
|
|
@ -351,25 +598,15 @@ void csi_collector_init(void)
|
|||
ESP_LOGI(TAG, "WiFi modem sleep disabled (WIFI_PS_NONE) for CSI capture");
|
||||
}
|
||||
|
||||
/* Enable promiscuous mode — required for reliable CSI callbacks.
|
||||
* Without this, CSI only fires on frames destined to this station,
|
||||
* which may be very infrequent on a quiet network. */
|
||||
ESP_ERROR_CHECK(esp_wifi_set_promiscuous(true));
|
||||
ESP_ERROR_CHECK(esp_wifi_set_promiscuous_rx_cb(wifi_promiscuous_cb));
|
||||
|
||||
/* MGMT-only promiscuous filter + active probe injection (RuView#396).
|
||||
*
|
||||
* DATA frames cause 100-500+ WiFi HW interrupts/sec which crashes Core 0
|
||||
* in wDev_ProcessFiq (SPI flash cache race in ESP-IDF WiFi blob).
|
||||
* MGMT-only gives ~10 Hz (beacons). Probe request injection at 10 Hz
|
||||
* adds ~10 Hz probe responses from APs → ~20 Hz total, matching the
|
||||
* edge processing designed sample rate of 20 Hz. */
|
||||
wifi_promiscuous_filter_t filt = {
|
||||
.filter_mask = WIFI_PROMIS_FILTER_MASK_MGMT,
|
||||
};
|
||||
ESP_ERROR_CHECK(esp_wifi_set_promiscuous_filter(&filt));
|
||||
|
||||
ESP_LOGI(TAG, "Promiscuous mode enabled (MGMT-only, RuView#396)");
|
||||
/* DO NOT enable promiscuous mode on these ESP32-S3 boards. Empirically,
|
||||
* setting esp_wifi_set_promiscuous(true) while STA is connected suppresses
|
||||
* the CSI RX callback entirely on this hardware revision — adaptive_ctrl
|
||||
* reports yield=0pps forever. FW5.47 (esp32s3_csi_capture) works on the
|
||||
* same boards using plain STA-mode CSI (no promiscuous), so we mirror
|
||||
* that approach here. CSI fires for every frame the STA actually
|
||||
* receives (beacons + unicast → ~10-20 Hz, same as edge_processing
|
||||
* expects). */
|
||||
ESP_LOGI(TAG, "Promiscuous mode SKIPPED (CSI via STA-only, broken otherwise on this board)");
|
||||
|
||||
wifi_csi_config_t csi_config = {
|
||||
.lltf_en = true,
|
||||
|
|
|
|||
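The lock-or-skip decision in the gain-lock hunks above reduces to "take the median of the sampled AGC values and compare it with a safety threshold". The following stand-alone sketch shows that step in isolation; `gain_lock_decide` and `cmp_u8` are hypothetical names, and the threshold of 30 simply mirrors `RV_GAIN_MIN_SAFE_AGC`.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

#define MIN_SAFE_AGC 30u   /* mirrors RV_GAIN_MIN_SAFE_AGC from the diff */

static int cmp_u8(const void *a, const void *b)
{
    return (int)*(const uint8_t *)a - (int)*(const uint8_t *)b;
}

/* Hypothetical helper: sorts `samples` in place, stores the median in
 * *median_out and returns true when forcing that gain is considered safe. */
static bool gain_lock_decide(uint8_t *samples, size_t n, uint8_t *median_out)
{
    if (samples == NULL || median_out == NULL || n == 0) return false;

    qsort(samples, n, sizeof samples[0], cmp_u8);
    *median_out = samples[n / 2];          /* upper median when n is even */
    return *median_out >= MIN_SAFE_AGC;
}
```

A median is used rather than a mean so that a handful of outlier packets during calibration (for example a burst from a nearby transmitter) does not shift the value that gets frozen.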
|
|
@ -224,6 +224,25 @@ static edge_config_t s_cfg;
|
|||
/** Per-subcarrier running variance (for top-K selection). */
|
||||
static edge_welford_t s_subcarrier_var[EDGE_MAX_SUBCARRIERS];
|
||||
|
||||
/* ---- NBVI (Normalized Baseline Variability Index) sliding-window state ----
|
||||
* Cumulative Welford remembers noise from boot forever, so the top-K
|
||||
* winner subcarrier can stay pinned on a bin that was loud once an hour ago.
|
||||
* We additionally track an EMA-based amplitude variance per subcarrier
|
||||
* (alpha = 0.02 → tau ≈ 50 frames ≈ 10 s at 5 pps) and use it to identify
|
||||
* a "stable bins" subset, i.e. bins whose amplitude wobble is *below* the
* across-band median. The intent (ADR-100/ADR-101 follow-up) is for
* s_broad_mean_amp_history, the production motion source in Step 8, to
* average over this subset instead of all 128 subcarriers, which should
* drive CV in STILL down by ~2-3×. That wiring is not active: Step 8 still
* consumes the full-band mean (see the IMPORTANT note in process_frame),
* and the mask is kept for future use. */
|
||||
static float s_sc_amp_ema[EDGE_MAX_SUBCARRIERS]; /**< per-bin EMA of amplitude */
|
||||
static float s_sc_amp_var_ema[EDGE_MAX_SUBCARRIERS];/**< per-bin EMA of (a-EMA)^2 */
|
||||
static uint16_t s_sc_init; /**< frames seen for NBVI warm-up */
|
||||
#define NBVI_ALPHA 0.02f /* EMA smoothing — ~10 s at 5 pps */
|
||||
#define NBVI_WARMUP_FRAMES 50 /* until then, fall back to full-band average */
|
||||
#define NBVI_REFRESH_EVERY 25 /* recompute stable_bin mask every N frames */
|
||||
static bool s_nbvi_stable_bin[EDGE_MAX_SUBCARRIERS]; /**< true → in quiet/stable set */
|
||||
static uint8_t s_nbvi_stable_count; /**< # of true entries above */
|
||||
|
||||
/** Previous phase per subcarrier (for unwrapping). */
|
||||
static float s_prev_phase[EDGE_MAX_SUBCARRIERS];
|
||||
static bool s_phase_initialized;
|
||||
|
|
@ -234,9 +253,31 @@ static uint8_t s_top_k_count;
|
|||
|
||||
/** Phase history for the primary (highest-variance) subcarrier. */
|
||||
static float s_phase_history[EDGE_PHASE_HISTORY_LEN];
|
||||
|
||||
/** Amplitude history for the primary subcarrier (issue #555: motion source).
|
||||
* Unwrapped phase drifts monotonically (thermal/oscillator/doppler), so
|
||||
* variance-of-phase is dominated by drift slope rather than motion.
|
||||
* Amplitudes are stable in calm rooms and spike on body motion. */
|
||||
static float s_amp_history[EDGE_PHASE_HISTORY_LEN];
|
||||
|
||||
static uint16_t s_history_len;
|
||||
static uint16_t s_history_idx;
|
||||
|
||||
/* ---- Broadband amplitude history (issue #555 — production motion source) ----
|
||||
* 20-sample ring of per-frame *mean amplitude across all subcarriers*. Used by
|
||||
* Step 8 as the motion_energy source because empirical measurements on this
|
||||
* hardware (UART DBG_DSP capture, 2026-05-14) showed broadband variance
|
||||
* separates still vs. motion much more reliably than primary-subcarrier
|
||||
* variance:
|
||||
* still room: bvar median ~0.08, max ~1.6
|
||||
* walking 2 m: bvar median ~3.5, max ~14
|
||||
* walk/still ratio: ~44×
|
||||
* Compare primary-subcarrier amp variance: still ~1.3, walk ~24, ratio ~18×
|
||||
* with spurious spikes in stillness when the top-K winner subcarrier flips. */
|
||||
#define EDGE_BROAD_HISTORY_LEN 20
|
||||
static float s_broad_mean_amp_history[EDGE_BROAD_HISTORY_LEN];
|
||||
static uint16_t s_broad_mean_amp_idx;
|
||||
|
||||
/** Biquad filters for breathing and heart rate. */
|
||||
static edge_biquad_t s_bq_breathing;
|
||||
static edge_biquad_t s_bq_heartrate;
|
||||
|
|
@ -709,7 +750,24 @@ static void send_feature_vector(void)
|
|||
static void process_frame(const edge_ring_slot_t *slot)
|
||||
{
|
||||
uint16_t n_subcarriers = slot->iq_len / 2;
|
||||
if (n_subcarriers == 0 || n_subcarriers > EDGE_MAX_SUBCARRIERS) return;
|
||||
if (n_subcarriers == 0) return;
|
||||
/* Issue #555 root cause: ESP32-S3 with lltf+htltf+stbc+ltf_merge yields
|
||||
* 384 B I/Q (192 subcarriers) per CSI callback, while EDGE_MAX_SUBCARRIERS
|
||||
* is 128. The previous `> EDGE_MAX_SUBCARRIERS → return` made process_frame
|
||||
* silently bail on every frame, so s_motion_energy stayed pinned at its
|
||||
* init value (0.0). Truncate instead — the first 128 subcarriers cover
|
||||
* the L-LTF + first half of HT-LTF, which is plenty for motion / vitals. */
|
||||
if (n_subcarriers > EDGE_MAX_SUBCARRIERS) {
|
||||
static bool s_warned_trunc;
|
||||
if (!s_warned_trunc) {
|
||||
ESP_LOGW(TAG, "CSI %u subcarriers > EDGE_MAX_SUBCARRIERS=%u — "
|
||||
"truncating (one-shot warning)",
|
||||
(unsigned)n_subcarriers,
|
||||
(unsigned)EDGE_MAX_SUBCARRIERS);
|
||||
s_warned_trunc = true;
|
||||
}
|
||||
n_subcarriers = EDGE_MAX_SUBCARRIERS;
|
||||
}
|
||||
|
||||
s_frame_count++;
|
||||
s_latest_rssi = slot->rssi;
|
||||
|
|
@ -746,14 +804,110 @@ static void process_frame(const edge_ring_slot_t *slot)
|
|||
|
||||
if (s_top_k_count == 0) return;
|
||||
|
||||
/* --- Step 5: Phase of primary (highest-variance) subcarrier --- */
|
||||
/* --- Step 5: Phase + amplitude of primary (highest-variance) subcarrier --- */
|
||||
float primary_phase = phases[s_top_k[0]];
|
||||
|
||||
/* Store in phase history ring buffer. */
|
||||
/* Amplitude of primary subcarrier — drift-free motion proxy (issue #555). */
|
||||
uint8_t primary_sc = s_top_k[0];
|
||||
int8_t pi_val = (int8_t)slot->iq_data[primary_sc * 2];
|
||||
int8_t pq_val = (int8_t)slot->iq_data[primary_sc * 2 + 1];
|
||||
float primary_amp = sqrtf((float)(pi_val * pi_val + pq_val * pq_val));
|
||||
|
||||
/* Store in phase + amplitude history ring buffers. */
|
||||
s_phase_history[s_history_idx] = primary_phase;
|
||||
s_amp_history[s_history_idx] = primary_amp;
|
||||
s_history_idx = (s_history_idx + 1) % EDGE_PHASE_HISTORY_LEN;
|
||||
if (s_history_len < EDGE_PHASE_HISTORY_LEN) s_history_len++;
|
||||
|
||||
/* --- Broadband + NBVI probe (always on, feeds Step 8) ---
|
||||
*
|
||||
* One pass over all subcarriers does three jobs:
* (a) sum |I+jQ| for the full-band average, which is what Step 8 consumes;
* (b) per-bin EMA of amplitude and amplitude-variance (alpha = NBVI_ALPHA,
* tau ≈ 10 s) so we can rank bins by recent noise level;
* (c) periodically (every NBVI_REFRESH_EVERY frames) recompute the
* "stable bins" mask = bins whose EMA variance is at or below the
* across-band median. The mask is kept for a future quiet-bins-only
* channel and is not applied to s_broad_mean_amp_history (see the
* IMPORTANT note further down).
*
* Intended effect (ADR-100/ADR-101 follow-up, not active yet): drive per-node
* CV in STILL down by averaging over the bins that are least responsive to
* mid-room thermal/oscillator noise while still tracking body presence in
* the baseline shift (a person blocks Fresnel multipath uniformly across
* the band, so quiet bins still see the level drop). */
|
||||
{
|
||||
float band_amp_sum = 0.0f;
|
||||
for (uint16_t sc = 0; sc < n_subcarriers; sc++) {
|
||||
int8_t iv = (int8_t)slot->iq_data[sc * 2];
|
||||
int8_t qv = (int8_t)slot->iq_data[sc * 2 + 1];
|
||||
float a = sqrtf((float)(iv * iv + qv * qv));
|
||||
band_amp_sum += a;
|
||||
|
||||
/* Update per-bin EMA and EMA of (a - EMA)^2. */
|
||||
if (s_sc_init < NBVI_WARMUP_FRAMES) {
|
||||
/* Seed the EMA from the very first sample to avoid the
|
||||
* slow ramp from zero biasing the median for the first
|
||||
* ~10 s. */
|
||||
if (s_sc_amp_ema[sc] == 0.0f) s_sc_amp_ema[sc] = a;
|
||||
}
|
||||
float prev_mean = s_sc_amp_ema[sc];
|
||||
float new_mean = prev_mean + NBVI_ALPHA * (a - prev_mean);
|
||||
float dev = a - new_mean;
|
||||
s_sc_amp_ema[sc] = new_mean;
|
||||
s_sc_amp_var_ema[sc] = s_sc_amp_var_ema[sc] +
|
||||
NBVI_ALPHA * (dev * dev - s_sc_amp_var_ema[sc]);
|
||||
}
|
||||
if (s_sc_init < NBVI_WARMUP_FRAMES) s_sc_init++;
|
||||
float band_amp_mean = (n_subcarriers > 0)
|
||||
? band_amp_sum / (float)n_subcarriers : 0.0f;
|
||||
|
||||
/* Refresh stable_bin mask periodically — only after warm-up so the
|
||||
* EMA variances are populated. */
|
||||
if (s_sc_init >= NBVI_WARMUP_FRAMES
|
||||
&& (s_frame_count % NBVI_REFRESH_EVERY) == 0)
|
||||
{
|
||||
/* Median EMVar across active subcarriers (n_subcarriers ≤ 128).
|
||||
* Stack copy is cheap — a few hundred bytes. */
|
||||
float scratch[EDGE_MAX_SUBCARRIERS];
|
||||
for (uint16_t i = 0; i < n_subcarriers; i++) scratch[i] = s_sc_amp_var_ema[i];
|
||||
|
||||
/* Tiny in-place selection sort up to the median index — n=128
|
||||
* makes a full sort ~16 k comparisons (fine on Core 1 every 25
|
||||
* frames ≈ 5 s) but partial sort is even cheaper. */
|
||||
uint16_t target = n_subcarriers / 2;
|
||||
for (uint16_t i = 0; i <= target; i++) {
|
||||
uint16_t min_i = i;
|
||||
for (uint16_t j = i + 1; j < n_subcarriers; j++) {
|
||||
if (scratch[j] < scratch[min_i]) min_i = j;
|
||||
}
|
||||
if (min_i != i) {
|
||||
float t = scratch[i]; scratch[i] = scratch[min_i]; scratch[min_i] = t;
|
||||
}
|
||||
}
|
||||
float median_var = scratch[target];
|
||||
|
||||
uint8_t count = 0;
|
||||
for (uint16_t i = 0; i < n_subcarriers; i++) {
|
||||
bool stable = s_sc_amp_var_ema[i] <= median_var;
|
||||
s_nbvi_stable_bin[i] = stable;
|
||||
if (stable) count++;
|
||||
}
|
||||
s_nbvi_stable_count = count;
|
||||
}
|
||||
|
||||
/* IMPORTANT: motion_energy (Step 8) MUST take the variance of the
|
||||
* *full-band* mean. Pushing a quiet-bins-only mean here would zero
|
||||
* out motion_energy entirely — quiet bins by construction barely
|
||||
* move, so the windowed variance collapses to ~0 and stays there
|
||||
* (verified empirically on 2026-05-17: motion_score went constant
|
||||
* 0.013/0.021 with std=0 across 125 frames). The NBVI EMA state
|
||||
* above remains for future use (a second "baseline_quiet" channel,
|
||||
* not yet wired to the feature_state packet). */
|
||||
s_broad_mean_amp_history[s_broad_mean_amp_idx] = band_amp_mean;
|
||||
s_broad_mean_amp_idx = (s_broad_mean_amp_idx + 1) % EDGE_BROAD_HISTORY_LEN;
|
||||
}
|
||||
|
||||
/* --- Step 6: Biquad bandpass filtering --- */
|
||||
float br_val = biquad_process(&s_bq_breathing, primary_phase);
|
||||
float hr_val = biquad_process(&s_bq_heartrate, primary_phase);
|
||||
|
|
@ -783,20 +937,49 @@ static void process_frame(const edge_ring_slot_t *slot)
|
|||
if (hr_bpm >= 40.0f && hr_bpm <= 180.0f) s_heartrate_bpm = hr_bpm;
|
||||
}
|
||||
|
||||
/* --- Step 8: Motion energy (variance of recent phases) --- */
|
||||
/* --- Step 8: Motion energy (broadband amplitude variance) ---
|
||||
*
|
||||
* Issue #555 evolution:
|
||||
* v1 — variance of unwrapped *phase*: dominated by thermal/oscillator
|
||||
* drift → constant non-zero regardless of motion.
|
||||
* v2 — variance of *primary subcarrier* amplitude: better, but the
|
||||
* top-K winner subcarrier flips occasionally (winner_changed=1
|
||||
* in DBG_DSP), causing spurious spikes in stillness — measured
|
||||
* pvar still ~1.3 with bursts to 22 when nothing was moving.
|
||||
* v3 (current) — variance of *band-wide mean amplitude*: averaging
|
||||
* across all 128 subcarriers cancels per-subcarrier noise; what
|
||||
* remains is the overall multipath energy level, which moves
|
||||
* coherently with body presence in the Fresnel zone.
|
||||
*
|
||||
* Empirical numbers from 2026-05-14 capture (room02, 2 m, person):
|
||||
* still: bvar median 0.08, max 1.6
|
||||
* walking: bvar median 3.5, max 14.3
|
||||
* walk/still ratio: ~44× (vs ~18× for primary-subcarrier variance)
|
||||
*
|
||||
* Normalization: motion_energy = clamp(bvar / 5.0, 0, 1).
* still 0.08 → 0.016 (under the <0.05 spec)
* still 1.6 → 0.32 (rare transient, acceptable)
* walk 1.6 → 0.32 (over the >0.3 spec)
* walk 5.0+ → 1.0 (saturated, presence definite) */
|
||||
if (s_history_len >= 10) {
|
||||
float sum = 0.0f, sum2 = 0.0f;
|
||||
uint16_t window = (s_history_len < 20) ? s_history_len : 20;
|
||||
for (uint16_t i = 0; i < window; i++) {
|
||||
uint16_t ri = (s_history_idx + EDGE_PHASE_HISTORY_LEN
|
||||
- window + i) % EDGE_PHASE_HISTORY_LEN;
|
||||
float v = s_phase_history[ri];
|
||||
sum += v;
|
||||
for (uint16_t i = 0; i < EDGE_BROAD_HISTORY_LEN; i++) {
|
||||
float v = s_broad_mean_amp_history[i];
|
||||
sum += v;
|
||||
sum2 += v * v;
|
||||
}
|
||||
float mean = sum / (float)window;
|
||||
s_motion_energy = (sum2 / (float)window) - (mean * mean);
|
||||
if (s_motion_energy < 0.0f) s_motion_energy = 0.0f;
|
||||
float mean = sum / (float)EDGE_BROAD_HISTORY_LEN;
|
||||
float var = (sum2 / (float)EDGE_BROAD_HISTORY_LEN) - mean * mean;
|
||||
if (var < 0.0f) var = 0.0f;
|
||||
|
||||
/* Divisor sized for sensor deployment with 1-3 m line-of-sight to
|
||||
* the activity zone. At that range multipath averages out and
|
||||
* broadband variance is small (~0.1-2.0 empty, ~1-10 walking).
|
||||
* Lower divisor = higher sensitivity but more saturation if a
|
||||
* sensor is moved close to the body (≤50 cm). */
|
||||
float energy = var / 5.0f;
|
||||
if (energy > 1.0f) energy = 1.0f;
|
||||
s_motion_energy = energy;
|
||||
}
|
||||
|
||||
/* --- Step 9: Presence detection --- */
|
||||
|
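Step 8 as it stands after this hunk can be read as a pure function of the broadband history: population variance, divided by a fixed divisor, clamped to [0, 1]. The sketch below restates that in isolation; `motion_energy`, `HISTORY_LEN` and `ENERGY_DIVISOR` are hypothetical names that mirror `EDGE_BROAD_HISTORY_LEN` and the `var / 5.0f` line.

```c
#include <stddef.h>

#define HISTORY_LEN    20      /* mirrors EDGE_BROAD_HISTORY_LEN */
#define ENERGY_DIVISOR 5.0f    /* mirrors the `var / 5.0f` line in the diff */

/* Hypothetical stand-alone version of Step 8: population variance of the
 * per-frame mean amplitudes, scaled by ENERGY_DIVISOR and clamped to [0, 1]. */
static float motion_energy(const float history[HISTORY_LEN])
{
    float sum = 0.0f, sum2 = 0.0f;
    for (size_t i = 0; i < HISTORY_LEN; i++) {
        sum  += history[i];
        sum2 += history[i] * history[i];
    }
    float mean = sum / (float)HISTORY_LEN;
    float var  = sum2 / (float)HISTORY_LEN - mean * mean;
    if (var < 0.0f) var = 0.0f;            /* guard against rounding below zero */

    float energy = var / ENERGY_DIVISOR;
    return energy > 1.0f ? 1.0f : energy;
}
```

One consequence worth noting: the variance always runs over all 20 slots, and the history is zero-initialised, so a function like this reports inflated values until every slot holds a real sample.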
|
@ -1000,6 +1183,18 @@ esp_err_t edge_processing_init(const edge_config_t *cfg)
|
|||
memset(&s_ring, 0, sizeof(s_ring));
|
||||
memset(s_subcarrier_var, 0, sizeof(s_subcarrier_var));
|
||||
memset(s_prev_phase, 0, sizeof(s_prev_phase));
|
||||
memset(s_phase_history, 0, sizeof(s_phase_history));
|
||||
memset(s_amp_history, 0, sizeof(s_amp_history));
|
||||
memset(s_broad_mean_amp_history, 0, sizeof(s_broad_mean_amp_history));
|
||||
s_broad_mean_amp_idx = 0;
|
||||
/* NBVI sliding-window state — recomputed from fresh on each init so
|
||||
* the stable_bin mask doesn't carry over stale stats from a previous
|
||||
* deployment / room. */
|
||||
memset(s_sc_amp_ema, 0, sizeof(s_sc_amp_ema));
|
||||
memset(s_sc_amp_var_ema, 0, sizeof(s_sc_amp_var_ema));
|
||||
memset(s_nbvi_stable_bin, 0, sizeof(s_nbvi_stable_bin));
|
||||
s_sc_init = 0;
|
||||
s_nbvi_stable_count = 0;
|
||||
s_phase_initialized = false;
|
||||
s_top_k_count = 0;
|
||||
s_history_len = 0;
|
||||
|
|
@ -1034,12 +1229,18 @@ esp_err_t edge_processing_init(const edge_config_t *cfg)
|
|||
}
|
||||
|
||||
/* Design biquad bandpass filters.
|
||||
* Sampling rate ~20 Hz (typical ESP32 CSI callback rate). */
|
||||
const float fs = 20.0f;
|
||||
*
|
||||
* fs must match the sample_rate used by estimate_bpm_zero_crossing()
|
||||
* in process_frame() (currently 10.0 Hz — see RuView#396 comment near
|
||||
* the `sample_rate` literal). Designing biquads at 20 Hz while feeding
|
||||
* them 10 Hz data effectively halves the passband: the "0.1-0.5 Hz
|
||||
* breathing" filter became 0.05-0.25 Hz, which cuts out 12-18 BPM
|
||||
* (0.2-0.3 Hz) — the bulk of human respiration. */
|
||||
const float fs = 10.0f;
|
||||
biquad_bandpass_design(&s_bq_breathing, fs, 0.1f, 0.5f);
|
||||
biquad_bandpass_design(&s_bq_heartrate, fs, 0.8f, 2.0f);
|
||||
|
||||
/* Design per-person filters. */
|
||||
/* Design per-person filters at the same fs. */
|
||||
for (uint8_t p = 0; p < EDGE_MAX_PERSONS; p++) {
|
||||
biquad_bandpass_design(&s_person_bq_br[p], fs, 0.1f, 0.5f);
|
||||
biquad_bandpass_design(&s_person_bq_hr[p], fs, 0.8f, 2.0f);
|
||||
|
|
|
|||
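The sample-rate fix in the last hunk rests on one piece of arithmetic: a digital filter's band edges are fixed relative to the sample rate, so running a filter designed for one rate on data arriving at another rate scales every edge by the ratio of the two. A small sketch of that calculation, with hypothetical helper names:

```c
/* Frequency at which a band edge actually lands when a filter designed for
 * fs_design_hz is fed samples that arrive at fs_actual_hz. */
static float effective_hz(float f_design_hz, float fs_design_hz, float fs_actual_hz)
{
    return f_design_hz * (fs_actual_hz / fs_design_hz);
}

/* Convert a rate in Hz to events per minute (breaths or beats). */
static float hz_to_bpm(float hz)
{
    return hz * 60.0f;
}
```

Under these assumptions, `effective_hz(0.1f, 20.0f, 10.0f)` and `effective_hz(0.5f, 20.0f, 10.0f)` put the breathing band at roughly 0.05 Hz to 0.25 Hz, so the upper edge falls near 15 BPM instead of the intended 30 BPM.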
|
|
@ -17,6 +17,7 @@
|
|||
#include "esp_log.h"
|
||||
#include "nvs_flash.h"
|
||||
#include "esp_app_desc.h"
|
||||
#include "esp_ota_ops.h" /* esp_ota_get_running_partition — issue #556 boot diag */
|
||||
#include "sdkconfig.h"
|
||||
|
||||
#include "csi_collector.h"
|
||||
|
|
@ -127,8 +128,39 @@ static void wifi_init_sta(void)
|
|||
}
|
||||
}
|
||||
|
||||
/* Issue #556 OTA debug: log how we got here. After an OTA upload the new
|
||||
* image should boot with reset_reason=ESP_RST_SW from esp_restart() and
|
||||
* run from the partition esp_ota_set_boot_partition() picked. If we see
|
||||
* ESP_RST_PANIC / ESP_RST_TASK_WDT / ESP_RST_INT_WDT from the OTA-flashed
|
||||
* slot, the new image crashed in early boot — that's the failure mode the
|
||||
* "/ota/status still shows old time" symptom is masking. */
|
||||
static const char *reset_reason_str(esp_reset_reason_t r)
|
||||
{
|
||||
switch (r) {
|
||||
case ESP_RST_POWERON: return "POWERON";
|
||||
case ESP_RST_EXT: return "EXT";
|
||||
case ESP_RST_SW: return "SW";
|
||||
case ESP_RST_PANIC: return "PANIC";
|
||||
case ESP_RST_INT_WDT: return "INT_WDT";
|
||||
case ESP_RST_TASK_WDT: return "TASK_WDT";
|
||||
case ESP_RST_WDT: return "WDT";
|
||||
case ESP_RST_DEEPSLEEP:return "DEEPSLEEP";
|
||||
case ESP_RST_BROWNOUT: return "BROWNOUT";
|
||||
case ESP_RST_SDIO: return "SDIO";
|
||||
default: return "UNKNOWN";
|
||||
}
|
||||
}
|
||||
|
||||
void app_main(void)
|
||||
{
|
||||
/* Boot diagnostic — must run before anything that could panic, so even
|
||||
* a one-line UART log tells us how the chip got here. */
|
||||
esp_reset_reason_t rr = esp_reset_reason();
|
||||
const esp_partition_t *running = esp_ota_get_running_partition();
|
||||
ESP_LOGI(TAG, "boot: reset_reason=%s running_partition=%s",
|
||||
reset_reason_str(rr),
|
||||
running ? running->label : "?");
|
||||
|
||||
/* Initialize NVS */
|
||||
esp_err_t ret = nvs_flash_init();
|
||||
if (ret == ESP_ERR_NVS_NO_FREE_PAGES || ret == ESP_ERR_NVS_NEW_VERSION_FOUND) {
|
||||
|
|
|
|||
|
|
@ -17,6 +17,7 @@
|
|||
#include "esp_app_desc.h"
|
||||
#include "nvs_flash.h"
|
||||
#include "nvs.h"
|
||||
#include "nvs_config.h" /* NVS_CFG_IP_MAX */
|
||||
|
||||
static const char *TAG = "ota_update";
|
||||
|
||||
|
|
@ -96,6 +97,180 @@ static esp_err_t ota_status_handler(httpd_req_t *req)
|
|||
return ESP_OK;
|
||||
}
|
||||
|
||||
/**
|
||||
* POST /ota/recalibrate — clear cached gain-lock NVS keys and reboot.
|
||||
*
|
||||
* ADR-109: lets the operator force a full gain-lock re-calibration from
|
||||
* the server without a USB connection. Erases csi_cfg/gl_agc, gl_fft, and
|
||||
* gl_ap_mac (ADR-111), then calls esp_restart(). Next boot finds no NVS
|
||||
* cache and runs the 300-packet calibration as if it were a fresh device.
|
||||
*/
|
||||
static esp_err_t ota_recalibrate_handler(httpd_req_t *req)
|
||||
{
|
||||
if (!ota_check_auth(req)) {
|
||||
ESP_LOGW(TAG, "/ota/recalibrate rejected: authentication failed");
|
||||
httpd_resp_send_err(req, HTTPD_403_FORBIDDEN,
|
||||
"Authentication required. Use: Authorization: Bearer <psk>");
|
||||
return ESP_FAIL;
|
||||
}
|
||||
|
||||
nvs_handle_t h;
|
||||
esp_err_t err = nvs_open("csi_cfg", NVS_READWRITE, &h);
|
||||
if (err != ESP_OK) {
|
||||
ESP_LOGE(TAG, "/ota/recalibrate: nvs_open(csi_cfg) failed: %s",
|
||||
esp_err_to_name(err));
|
||||
httpd_resp_send_err(req, HTTPD_500_INTERNAL_SERVER_ERROR,
|
||||
"NVS open failed");
|
||||
return ESP_FAIL;
|
||||
}
|
||||
|
||||
/* Erase all three keys defensively — ignore individual ESP_ERR_NVS_NOT_FOUND
|
||||
* (key already absent on a never-calibrated device). */
|
||||
(void)nvs_erase_key(h, "gl_agc");
|
||||
(void)nvs_erase_key(h, "gl_fft");
|
||||
(void)nvs_erase_key(h, "gl_ap_mac");
|
||||
err = nvs_commit(h);
|
||||
nvs_close(h);
|
||||
if (err != ESP_OK) {
|
||||
ESP_LOGE(TAG, "/ota/recalibrate: nvs_commit failed: %s",
|
||||
esp_err_to_name(err));
|
||||
httpd_resp_send_err(req, HTTPD_500_INTERNAL_SERVER_ERROR,
|
||||
"NVS commit failed");
|
||||
return ESP_FAIL;
|
||||
}
|
||||
|
||||
ESP_LOGI(TAG, "/ota/recalibrate: gain-lock NVS cleared; rebooting in 1s");
|
||||
|
||||
const char *resp =
|
||||
"{\"status\":\"ok\",\"message\":\"gain-lock NVS cleared; rebooting\"}";
|
||||
httpd_resp_set_type(req, "application/json");
|
||||
httpd_resp_send(req, resp, strlen(resp));
|
||||
|
||||
vTaskDelay(pdMS_TO_TICKS(1000));
|
||||
esp_restart();
|
||||
return ESP_OK; /* unreachable */
|
||||
}
|
||||
|
||||
/**
|
||||
* POST /ota/set-target — write csi_cfg/target_ip + target_port to NVS, reboot.
|
||||
*
|
||||
* ADR-115: lets the operator point sensors at a new aggregator (Mac IP
|
||||
* change, network move) without USB. Body is plain text "IP:PORT" with
|
||||
* trailing newline tolerated, e.g. "192.168.0.103:5005". IP validated
|
||||
* by inet_pton-like check (4 dot-separated octets 0–255); port 1–65535.
|
||||
*
|
||||
* Persists into the same `csi_cfg` namespace that `nvs_config.c` reads
|
||||
* at boot — next reboot picks up the new target.
|
||||
*/
|
||||
static bool parse_ip_port(const char *s, char *ip_out, size_t ip_cap, uint16_t *port_out)
|
||||
{
|
||||
/* Tolerate trailing whitespace/CR/LF. */
|
||||
size_t n = strlen(s);
|
||||
while (n > 0 && (s[n - 1] == '\n' || s[n - 1] == '\r' || s[n - 1] == ' ' || s[n - 1] == '\t')) {
|
||||
n--;
|
||||
}
|
||||
const char *colon = NULL;
|
||||
for (size_t i = 0; i < n; i++) {
|
||||
if (s[i] == ':') { colon = &s[i]; break; }
|
||||
}
|
||||
if (!colon) return false;
|
||||
size_t ip_len = (size_t)(colon - s);
|
||||
if (ip_len == 0 || ip_len >= ip_cap) return false;
|
||||
memcpy(ip_out, s, ip_len);
|
||||
ip_out[ip_len] = '\0';
|
||||
/* Validate 4 octets 0–255. */
|
||||
int oct_count = 0, val = -1;
|
||||
for (size_t i = 0; i <= ip_len; i++) {
|
||||
char c = ip_out[i];
|
||||
if (c == '.' || c == '\0') {
|
||||
if (val < 0 || val > 255) return false;
|
||||
oct_count++;
|
||||
val = -1;
|
||||
} else if (c >= '0' && c <= '9') {
|
||||
val = (val < 0 ? 0 : val) * 10 + (c - '0');
|
||||
} else {
|
||||
return false;
|
||||
}
|
||||
}
|
||||
if (oct_count != 4) return false;
|
||||
/* Parse port. */
|
||||
long port = 0;
|
||||
const char *p = colon + 1;
|
||||
size_t plen = n - ip_len - 1;
|
||||
if (plen == 0 || plen > 5) return false;
|
||||
for (size_t i = 0; i < plen; i++) {
|
||||
if (p[i] < '0' || p[i] > '9') return false;
|
||||
port = port * 10 + (p[i] - '0');
|
||||
}
|
||||
if (port < 1 || port > 65535) return false;
|
||||
*port_out = (uint16_t)port;
|
||||
return true;
|
||||
}
|
||||
|
||||
static esp_err_t ota_set_target_handler(httpd_req_t *req)
|
||||
{
|
||||
if (!ota_check_auth(req)) {
|
||||
ESP_LOGW(TAG, "/ota/set-target rejected: authentication failed");
|
||||
httpd_resp_send_err(req, HTTPD_403_FORBIDDEN,
|
||||
"Authentication required. Use: Authorization: Bearer <psk>");
|
||||
return ESP_FAIL;
|
||||
}
|
||||
|
||||
/* Body is short: "IPv4:port" + optional CRLF (at most 23 bytes), so the 40-byte buffer is plenty. */
|
||||
char body[40] = {0};
|
||||
int total = 0;
|
||||
while (total < (int)sizeof(body) - 1) {
|
||||
int r = httpd_req_recv(req, body + total, sizeof(body) - 1 - total);
|
||||
if (r <= 0) {
|
||||
if (r == HTTPD_SOCK_ERR_TIMEOUT) continue;
|
||||
break;
|
||||
}
|
||||
total += r;
|
||||
}
|
||||
body[total < 0 ? 0 : total] = '\0';
|
||||
|
||||
char ip[NVS_CFG_IP_MAX] = {0};
|
||||
uint16_t port = 0;
|
||||
if (!parse_ip_port(body, ip, sizeof(ip), &port)) {
|
||||
ESP_LOGW(TAG, "/ota/set-target rejected: invalid body '%s'", body);
|
||||
httpd_resp_send_err(req, HTTPD_400_BAD_REQUEST,
|
||||
"Body must be 'IPv4:PORT', e.g. '192.168.0.103:5005'");
|
||||
return ESP_FAIL;
|
||||
}
|
||||
|
||||
nvs_handle_t h;
|
||||
esp_err_t err = nvs_open("csi_cfg", NVS_READWRITE, &h);
|
||||
if (err != ESP_OK) {
|
||||
ESP_LOGE(TAG, "/ota/set-target: nvs_open(csi_cfg) failed: %s",
|
||||
esp_err_to_name(err));
|
||||
httpd_resp_send_err(req, HTTPD_500_INTERNAL_SERVER_ERROR, "NVS open failed");
|
||||
return ESP_FAIL;
|
||||
}
|
||||
err = nvs_set_str(h, "target_ip", ip);
|
||||
if (err == ESP_OK) err = nvs_set_u16(h, "target_port", port);
|
||||
if (err == ESP_OK) err = nvs_commit(h);
|
||||
nvs_close(h);
|
||||
if (err != ESP_OK) {
|
||||
ESP_LOGE(TAG, "/ota/set-target: NVS write failed: %s", esp_err_to_name(err));
|
||||
httpd_resp_send_err(req, HTTPD_500_INTERNAL_SERVER_ERROR, "NVS write failed");
|
||||
return ESP_FAIL;
|
||||
}
|
||||
|
||||
ESP_LOGI(TAG, "/ota/set-target: csi_cfg/target_ip=%s target_port=%u; rebooting in 1s",
|
||||
ip, (unsigned)port);
|
||||
|
||||
char resp[120];
|
||||
int rlen = snprintf(resp, sizeof(resp),
|
||||
"{\"status\":\"ok\",\"target_ip\":\"%s\",\"target_port\":%u,\"message\":\"rebooting\"}",
|
||||
ip, (unsigned)port);
|
||||
httpd_resp_set_type(req, "application/json");
|
||||
httpd_resp_send(req, resp, rlen);
|
||||
|
||||
vTaskDelay(pdMS_TO_TICKS(1000));
|
||||
esp_restart();
|
||||
return ESP_OK; /* unreachable */
|
||||
}
|
||||
|
||||
/**
|
||||
* POST /ota — receive and flash firmware binary.
|
||||
*/
|
||||
|
|
@ -125,7 +300,16 @@ static esp_err_t ota_upload_handler(httpd_req_t *req)
|
|||
}
|
||||
|
||||
esp_ota_handle_t ota_handle;
|
||||
esp_err_t err = esp_ota_begin(update_partition, OTA_WITH_SEQUENTIAL_WRITES, &ota_handle);
|
||||
/* Issue #556: use OTA_SIZE_UNKNOWN (full partition erase) instead of
|
||||
* OTA_WITH_SEQUENTIAL_WRITES. When the new image is smaller than the
|
||||
* one previously written to the target slot, sequential writes leave
|
||||
* the tail of the old code in place. The image header SHA covers
|
||||
* only the declared image span, but residual code at stale offsets
|
||||
* can still be reached via IRAM jump tables / .literal pools on some
|
||||
* v5.2 ABIs and crash the new app on first boot, which then looks
|
||||
* like "OTA didn't take". Full erase up-front avoids this entirely
|
||||
* at the cost of one extra ~1.5 s erase before write starts. */
|
||||
esp_err_t err = esp_ota_begin(update_partition, OTA_SIZE_UNKNOWN, &ota_handle);
|
||||
if (err != ESP_OK) {
|
||||
ESP_LOGE(TAG, "esp_ota_begin failed: %s", esp_err_to_name(err));
|
||||
httpd_resp_send_err(req, HTTPD_500_INTERNAL_SERVER_ERROR,
|
||||
|
|
@ -207,6 +391,13 @@ static esp_err_t ota_start_server(httpd_handle_t *out_handle)
|
|||
config.max_uri_handlers = 12; /* Extra slots for WASM endpoints (ADR-040). */
|
||||
/* Increase receive timeout for large uploads. */
|
||||
config.recv_wait_timeout = 30;
|
||||
/* Issue #556: httpd default stack is 4096 B, which overflows during
|
||||
* esp_ota_end()'s image-verify (SHA256 streaming + mmap segment walk
|
||||
* eats ~3 KB on top of the request handler frame). Empirically observed
|
||||
* "***ERROR*** A stack overflow in task httpd has been detected"
|
||||
* immediately after esp_image: segment dumps when OTA reaches verify.
|
||||
* 8 KB gives a clean margin without hurting the typical idle case. */
|
||||
config.stack_size = 8192;
|
||||
|
||||
httpd_handle_t server = NULL;
|
||||
esp_err_t err = httpd_start(&server, &config);
|
||||
|
|
@ -233,9 +424,29 @@ static esp_err_t ota_start_server(httpd_handle_t *out_handle)
|
|||
};
|
||||
httpd_register_uri_handler(server, &upload_uri);
|
||||
|
||||
/* ADR-109: REST trigger for full gain-lock re-calibration. */
|
||||
httpd_uri_t recalibrate_uri = {
|
||||
.uri = "/ota/recalibrate",
|
||||
.method = HTTP_POST,
|
||||
.handler = ota_recalibrate_handler,
|
||||
.user_ctx = NULL,
|
||||
};
|
||||
httpd_register_uri_handler(server, &recalibrate_uri);
|
||||
|
||||
/* ADR-115: REST endpoint to change CSI aggregator target without USB. */
|
||||
httpd_uri_t set_target_uri = {
|
||||
.uri = "/ota/set-target",
|
||||
.method = HTTP_POST,
|
||||
.handler = ota_set_target_handler,
|
||||
.user_ctx = NULL,
|
||||
};
|
||||
httpd_register_uri_handler(server, &set_target_uri);
|
||||
|
||||
ESP_LOGI(TAG, "OTA HTTP server started on port %d", OTA_PORT);
|
||||
ESP_LOGI(TAG, " GET /ota/status — firmware version info");
|
||||
ESP_LOGI(TAG, " POST /ota — upload new firmware binary");
|
||||
ESP_LOGI(TAG, " GET /ota/status — firmware version info");
|
||||
ESP_LOGI(TAG, " POST /ota — upload new firmware binary");
|
||||
ESP_LOGI(TAG, " POST /ota/recalibrate — clear gain-lock NVS + reboot");
|
||||
ESP_LOGI(TAG, " POST /ota/set-target — set CSI target IP:port in NVS + reboot");
|
||||
|
||||
if (out_handle) *out_handle = server;
|
||||
return ESP_OK;
|
||||
|
|
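For server-side tooling it can help to validate a target string before sending it to `/ota/set-target`. The following is a rough Python mirror of the `parse_ip_port()` rules shown above (trailing whitespace tolerated, four decimal octets 0-255, port 1-65535, at most five port digits). It is a sketch under those assumptions, not code from the repository, and it omits the firmware's `ip_cap` buffer-length check.

```python
def _is_ascii_digits(s):
    """True when s is non-empty and contains only the characters 0-9."""
    return bool(s) and all("0" <= c <= "9" for c in s)


def parse_ip_port(text):
    """Return (ip, port) for a valid 'IPv4:PORT' body, else None."""
    s = text.rstrip("\n\r \t")          # tolerate trailing CR/LF/space/tab
    ip, sep, port_s = s.partition(":")  # split on the first colon, like the C loop
    if not sep or not ip:
        return None
    octets = ip.split(".")
    if len(octets) != 4:
        return None
    for octet in octets:
        if not _is_ascii_digits(octet) or int(octet) > 255:
            return None
    if not _is_ascii_digits(port_s) or len(port_s) > 5:
        return None
    port = int(port_s)
    if not 1 <= port <= 65535:
        return None
    return ip, port


print(parse_ip_port("192.168.0.103:5005\r\n"))
print(parse_ip_port("256.1.1.1:80"))
```

The helper returns `None` rather than raising so a caller can reject bad input with a single check before issuing the POST.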
|
|||
|
|
@ -65,7 +65,11 @@ typedef struct __attribute__((packed)) {
|
|||
float env_shift_score; /**< 0..1, baseline drift. */
|
||||
float node_coherence; /**< 0..1, multi-link agreement. */
|
||||
uint16_t quality_flags; /**< RV_QFLAG_* bitmap. */
|
||||
uint16_t reserved;
|
||||
int8_t rssi_dbm; /**< Median RSSI over the emit window (i8, dBm). 0 = not measured.
|
||||
ADR-100 D3: previously the same byte was `reserved` — but downstream
|
||||
UI/classifier needs RSSI per node and the legacy raw-CSI parse path
|
||||
(0xC5110001) is no longer hot on this FW. Server reads buf[54] as i8. */
|
||||
uint8_t reserved; /**< Padding/aux byte; keep zero until next protocol bump. */
|
||||
uint32_t crc32; /**< IEEE CRC32 over bytes [0..end-4]. */
|
||||
} rv_feature_state_t;
|
||||
|
||||
|
|
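The struct comment above states that the server reads `buf[54]` as a signed byte and that 0 means "not measured". A minimal Python sketch of that read is given below for reference. The helper name `read_rssi_dbm` and the 60-byte example packet are assumptions made for illustration; only the offset 54 and the i8 interpretation come from the comment.

```python
import struct

RSSI_OFFSET = 54  # byte offset quoted in the rv_feature_state_t comment


def read_rssi_dbm(buf):
    """Return RSSI in dBm, or None when the field is 0 ("not measured")."""
    if len(buf) <= RSSI_OFFSET:
        raise ValueError("packet too short to contain rssi_dbm")
    (rssi,) = struct.unpack_from("b", buf, RSSI_OFFSET)  # "b" = signed 8-bit
    return None if rssi == 0 else rssi


# Hypothetical packet: all zeros except -60 dBm stored as 0xC4 at offset 54.
packet = bytearray(60)
packet[RSSI_OFFSET] = 0xC4
print(read_rssi_dbm(bytes(packet)))
```

Reading the byte as unsigned would report 196 instead of -60, so the signed format code matters here.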
|
|||
|
|
@ -34,3 +34,14 @@ CONFIG_ESP_MAIN_TASK_STACK_SIZE=8192
|
|||
|
||||
# Extra WiFi IRAM placement (defense-in-depth for RuView#396 SPI cache race)
|
||||
CONFIG_ESP_WIFI_EXTRA_IRAM_OPT=y
|
||||
|
||||
# ----- Local overrides for room01/room02 deployment -----
|
||||
# EDGE_TIER kept at project default (=2, full vitals pipeline).
|
||||
# Mac aggregator IP
|
||||
CONFIG_CSI_TARGET_IP="192.168.1.21"
|
||||
CONFIG_CSI_TARGET_PORT=5006
|
||||
# Disable AMOLED display (no display on room sensors, init panics on missing
|
||||
# TCA9554 expander → Tmr Svc stack overflow).
|
||||
CONFIG_DISPLAY_ENABLE=n
|
||||
# Increase Tmr Svc stack to fit adaptive_controller tick (default 2048 overflows).
|
||||
CONFIG_FREERTOS_TIMER_TASK_STACK_DEPTH=8192
|
||||
|
|
|
|||
|
|
@ -0,0 +1,108 @@
|
|||
#!/usr/bin/env python3
|
||||
"""ADR-114: generate 1000 idle + 1000 motion CSI replay fixtures.
|
||||
|
||||
Two files are written under
|
||||
`v2/crates/wifi-densepose-sensing-server/tests/fixtures/`:
|
||||
|
||||
* `replay_idle.jsonl` — 1000 frames of empty-room baseline +
|
||||
per-frame Gaussian noise (low CV).
|
||||
* `replay_motion.jsonl` — 1000 frames of the same baseline + 0.15 Hz
|
||||
coherent modulation + per-frame Gaussian
|
||||
noise (high CV).
|
||||
|
||||
Format: one JSON object per line:
|
||||
{"node_id": <u8>, "amplitude": [<f64>; 56]}
|
||||
|
||||
These are *synthetic but parameter-matched to live data* (baseline
|
||||
mean = 27.04 / 14.72 from data/baseline.json, CV ≈ 2.6 / 3.6 %).
|
||||
They exist to provide deterministic regression coverage of the
|
||||
amp_presence_override classifier. Real captured-from-sensor fixtures
|
||||
can replace them in-place (same filename, same line format) without
|
||||
changing the test code.
|
||||
|
||||
Deterministic by seed so the test result is reproducible across
|
||||
machines. Re-run only when you want to regenerate.
|
||||
"""
|
||||
|
||||
from __future__ import annotations
|
||||
|
||||
import json
|
||||
import math
|
||||
import random
|
||||
from pathlib import Path
|
||||
|
||||
OUT_DIR = (
|
||||
Path(__file__).resolve().parent.parent
|
||||
/ "v2"
|
||||
/ "crates"
|
||||
/ "wifi-densepose-sensing-server"
|
||||
/ "tests"
|
||||
/ "fixtures"
|
||||
)
|
||||
|
||||
# Per-node baseline mean amplitude pulled from a real recording of
|
||||
# this deployment (data/baseline.json). Holding them in code keeps
|
||||
# the fixture script self-contained.
|
||||
NODE_BASELINES = {1: 27.04, 2: 14.72}
|
||||
N_SUB = 56
|
||||
FRAMES_PER_NODE = 500 # 500 × 2 nodes = 1000 per fixture file
|
||||
|
||||
|
||||
def gen_subcarrier_profile(rng: random.Random, mean: float) -> list[float]:
|
||||
"""Static per-subcarrier mean profile — same for the whole capture."""
|
||||
return [max(1.0, mean * rng.uniform(0.7, 1.3)) for _ in range(N_SUB)]
|
||||
|
||||
|
||||
def write_fixture(path: Path, motion: bool, seed: int) -> int:
|
||||
rng = random.Random(seed)
|
||||
profiles = {
|
||||
nid: gen_subcarrier_profile(rng, mean) for nid, mean in NODE_BASELINES.items()
|
||||
}
|
||||
count = 0
|
||||
with path.open("w") as f:
|
||||
# Interleave nodes round-robin so the test driver gets per-node
|
||||
# streams of the same length, like a real WS feed.
|
||||
for i in range(FRAMES_PER_NODE):
|
||||
for nid, profile in profiles.items():
|
||||
t = i / 20.0 # 20 Hz tick
|
||||
# AMP_SHORT_WIN in the server is 90 frames = 4.5 s.
|
||||
# Idle: small per-frame noise → rolling-window CV stays
|
||||
# well below the universal threshold.
|
||||
# Motion: a slow ~0.15 Hz coherent envelope (6.7 s cycle,
|
||||
# longer than the 4.5 s averaging window) drives the
|
||||
# broadband mean up/down by ±40 %, producing a high
|
||||
# rolling CV. Mimics body position changes during
|
||||
# walking — the channel response shifts slowly relative
|
||||
# to the classifier window.
|
||||
if motion:
|
||||
envelope = 1.0 + 0.40 * math.sin(2 * math.pi * 0.15 * t)
|
||||
else:
|
||||
envelope = 1.0
|
||||
amps: list[float] = []
|
||||
for mu in profile:
|
||||
noise_sigma = mu * (0.05 if motion else 0.018)
|
||||
n = rng.gauss(0.0, noise_sigma)
|
||||
amps.append(round(mu * envelope + n, 3))
|
||||
f.write(json.dumps({"node_id": nid, "amplitude": amps}) + "\n")
|
||||
count += 1
|
||||
return count
|
||||
|
||||
|
||||
def main() -> None:
|
||||
OUT_DIR.mkdir(parents=True, exist_ok=True)
|
||||
idle_path = OUT_DIR / "replay_idle.jsonl"
|
||||
motion_path = OUT_DIR / "replay_motion.jsonl"
|
||||
n_idle = write_fixture(idle_path, motion=False, seed=42)
|
||||
n_motion = write_fixture(motion_path, motion=True, seed=43)
|
||||
print(f"wrote {n_idle} idle frames → {idle_path}")
|
||||
print(f"wrote {n_motion} motion frames → {motion_path}")
|
||||
print()
|
||||
print("These fixtures are SYNTHETIC parameter-matched to live data —")
|
||||
print("the cargo test that consumes them measures classifier")
|
||||
print("consistency, not real-world accuracy. Replace with live")
|
||||
print("captures (same line format, same filenames) when operator")
|
||||
print("time allows for a true empty-vs-walking ground-truth pair.")
|
||||
|
||||
|
||||
if __name__ == "__main__":
|
||||
main()
|
||||
|
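The comments in `write_fixture` argue that a 0.15 Hz envelope of ±40 % yields a high rolling CV over a 90-frame window, while the idle case stays low. The sketch below checks that claim on the noise-free envelope alone. It is an illustration under stated assumptions: the 20 Hz tick and 90-frame window are taken from the script's comments, `coefficient_of_variation` and `envelope` are hypothetical helpers, and the per-frame Gaussian noise is deliberately left out.

```python
import math
import statistics

FS_HZ = 20.0  # tick rate used by the generator
WINDOW = 90   # AMP_SHORT_WIN per the comment in write_fixture (4.5 s)


def coefficient_of_variation(values):
    """Population standard deviation divided by the mean (0.0 if mean <= 0)."""
    mu = statistics.mean(values)
    return statistics.pstdev(values) / mu if mu > 0 else 0.0


def envelope(i, motion):
    """Broadband scale factor for frame i, matching the generator's formula."""
    t = i / FS_HZ
    return 1.0 + 0.40 * math.sin(2 * math.pi * 0.15 * t) if motion else 1.0


idle_cv = coefficient_of_variation([envelope(i, False) for i in range(WINDOW)])
motion_cv = coefficient_of_variation([envelope(i, True) for i in range(WINDOW)])
print(f"idle CV={idle_cv:.3f}  motion CV={motion_cv:.3f}")
```

With noise added, the idle CV rises slightly above zero, but the gap between the two cases is what the classifier regression test relies on.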
|
@ -0,0 +1,275 @@
|
|||
#!/usr/bin/env python3
|
||||
"""
|
||||
scripts/ota-deploy.sh — push esp32-csi-node.bin to one or more sensor nodes
|
||||
over WiFi. Talks to the on-device /ota endpoint (ADR-045, port 8032,
|
||||
handler in firmware/esp32-csi-node/main/ota_update.c).
|
||||
|
||||
Usage:
|
||||
scripts/ota-deploy.sh # auto-discover via ARP, deploy to all
|
||||
scripts/ota-deploy.sh 192.168.0.100 # one node
|
||||
scripts/ota-deploy.sh 192.168.0.100 192.168.0.101
|
||||
scripts/ota-deploy.sh --build # idf.py build first, then deploy
|
||||
scripts/ota-deploy.sh --no-verify ... # skip post-reboot /ota/status check
|
||||
|
||||
Auth: set env OTA_PSK=<token> to send "Authorization: Bearer <token>"
|
||||
(matches the on-device check in ota_update.c::ota_check_auth).
|
||||
|
||||
Exit codes:
|
||||
0 — all targeted nodes confirmed running_partition flipped
|
||||
1 — one or more nodes failed verification or were unreachable
|
||||
2 — build or argument error
|
||||
"""
|
||||
|
||||
from __future__ import annotations
|
||||
|
||||
import argparse
|
||||
import concurrent.futures as cf
|
||||
import json
|
||||
import os
|
||||
import re
|
||||
import shutil
|
||||
import subprocess
|
||||
import sys
|
||||
import time
|
||||
import urllib.error
|
||||
import urllib.request
|
||||
from pathlib import Path
|
||||
from typing import Iterable
|
||||
|
||||
REPO_ROOT = Path(__file__).resolve().parent.parent
|
||||
FW_DIR = REPO_ROOT / "firmware" / "esp32-csi-node"
|
||||
BIN_PATH = FW_DIR / "build" / "esp32-csi-node.bin"
|
||||
PORT = 8032
|
||||
|
||||
UPLOAD_TIMEOUT_S = 120
|
||||
REBOOT_WAIT_S = 10
|
||||
VERIFY_RETRIES = 6
|
||||
VERIFY_DELAY_S = 3
|
||||
|
||||
|
||||
# ---- ANSI logging helpers ----------------------------------------------------
|
||||
def _c(code: str, msg: str) -> str:
|
||||
if not sys.stdout.isatty():
|
||||
return msg
|
||||
return f"\033[{code}m{msg}\033[0m"
|
||||
|
||||
def log(msg: str) -> None: print(_c("36", "[ota-deploy] ") + msg, flush=True)
|
||||
def warn(msg: str) -> None: print(_c("33", "[ota-deploy] ") + msg, file=sys.stderr, flush=True)
|
||||
def err(msg: str) -> None: print(_c("31", "[ota-deploy] ") + msg, file=sys.stderr, flush=True)
|
||||
|
||||
|
||||
# ---- helpers -----------------------------------------------------------------
|
||||
def http_get(url: str, timeout: float = 4.0) -> str | None:
|
||||
try:
|
||||
with urllib.request.urlopen(url, timeout=timeout) as r:
|
||||
return r.read().decode("utf-8", errors="replace")
|
||||
except (urllib.error.URLError, urllib.error.HTTPError, TimeoutError, OSError):
|
||||
return None
|
||||
|
||||
|
||||
def get_ota_status(ip: str) -> dict | None:
|
||||
body = http_get(f"http://{ip}:{PORT}/ota/status")
|
||||
if not body:
|
||||
return None
|
||||
try:
|
||||
return json.loads(body)
|
||||
except json.JSONDecodeError:
|
||||
return None
|
||||
|
||||
|
||||
def local_subnet_prefix() -> str | None:
|
||||
"""Return e.g. '192.168.0' from en0 (macOS) or first non-loopback IP."""
|
||||
try:
|
||||
out = subprocess.check_output(
|
||||
["ipconfig", "getifaddr", "en0"], stderr=subprocess.DEVNULL, text=True
|
||||
).strip()
|
||||
if out:
|
||||
return out.rsplit(".", 1)[0]
|
||||
except (subprocess.CalledProcessError, FileNotFoundError):
|
||||
pass
|
||||
# Linux fallback
|
||||
try:
|
||||
out = subprocess.check_output(["hostname", "-I"], text=True).strip()
|
||||
if out:
|
||||
return out.split()[0].rsplit(".", 1)[0]
|
||||
except (subprocess.CalledProcessError, FileNotFoundError):
|
||||
pass
|
||||
return None
|
||||
|
||||
|
||||
def discover_nodes() -> list[str]:
|
||||
"""ARP-prefilter + parallel /ota/status probe to find live sensor nodes."""
|
||||
prefix = local_subnet_prefix()
|
||||
if not prefix:
|
||||
err("could not determine local /24 — pass node IPs explicitly")
|
||||
return []
|
||||
log(f"scanning {prefix}.0/24 for /ota/status responders ...")
|
||||
|
||||
candidates: list[str] = []
|
||||
try:
|
||||
arp_out = subprocess.check_output(
|
||||
["arp", "-a", "-n"], text=True, stderr=subprocess.DEVNULL
|
||||
)
|
||||
for line in arp_out.splitlines():
|
||||
m = re.search(rf"\(({re.escape(prefix)}\.\d+)\)", line)
|
||||
if m and "incomplete" not in line:
|
||||
ip = m.group(1)
|
||||
if not ip.endswith(".1"): # skip gateway
|
||||
candidates.append(ip)
|
||||
except (subprocess.CalledProcessError, FileNotFoundError):
|
||||
pass
|
||||
if not candidates:
|
||||
warn(f"no ARP hits; falling back to an HTTP probe of {prefix}.100-110")
|
||||
candidates = [f"{prefix}.{i}" for i in range(100, 111)]
|
||||
candidates = sorted(set(candidates))
|
||||
|
||||
found: list[str] = []
|
||||
with cf.ThreadPoolExecutor(max_workers=32) as pool:
|
||||
futs = {pool.submit(get_ota_status, ip): ip for ip in candidates}
|
||||
for fut in cf.as_completed(futs):
|
||||
ip = futs[fut]
|
||||
try:
|
||||
if fut.result():
|
||||
found.append(ip)
|
||||
except Exception:
|
||||
pass
|
||||
return sorted(found, key=lambda x: tuple(int(o) for o in x.split(".")))
|
||||
|
||||
|
||||
def upload_one(ip: str, payload: bytes, psk: str | None) -> tuple[bool, float, str]:
|
||||
"""POST the firmware to one node. Returns (success, elapsed_s, body_snippet)."""
|
||||
req = urllib.request.Request(
|
||||
f"http://{ip}:{PORT}/ota",
|
||||
data=payload,
|
||||
headers={"Content-Type": "application/octet-stream"},
|
||||
method="POST",
|
||||
)
|
||||
if psk:
|
||||
req.add_header("Authorization", f"Bearer {psk}")
|
||||
t0 = time.monotonic()
|
||||
try:
|
||||
with urllib.request.urlopen(req, timeout=UPLOAD_TIMEOUT_S) as r:
|
||||
body = r.read().decode("utf-8", errors="replace")[:200]
|
||||
return True, time.monotonic() - t0, body
|
||||
except (urllib.error.HTTPError, urllib.error.URLError,
|
||||
TimeoutError, ConnectionResetError, OSError) as e:
|
||||
# ConnectionReset is *expected* when the chip restarts before flushing
|
||||
# the response. We treat it as a soft pass and verify via /ota/status.
|
||||
return (isinstance(e, ConnectionResetError),
|
||||
time.monotonic() - t0,
|
||||
f"{type(e).__name__}: {e}")
|
||||
|
||||
|
||||
def build_firmware() -> int:
|
||||
log("building firmware via idf.py ...")
|
||||
if "IDF_PATH" not in os.environ:
|
||||
export = Path.home() / "esp" / "esp-idf-v5.2" / "export.sh"
|
||||
if not export.is_file():
|
||||
err("IDF_PATH not set and ~/esp/esp-idf-v5.2/export.sh not found")
|
||||
return 2
|
||||
# source the env in a child shell
|
||||
rc = subprocess.call(
|
||||
["bash", "-lc", f". '{export}' >/dev/null 2>&1 && cd '{FW_DIR}' && idf.py build"]
|
||||
)
|
||||
else:
|
||||
rc = subprocess.call(["idf.py", "build"], cwd=str(FW_DIR))
|
||||
if rc != 0:
|
||||
err("build failed")
|
||||
return 2
|
||||
return 0
|
||||
|
||||
|
||||
# ---- main --------------------------------------------------------------------
|
||||
def main(argv: list[str]) -> int:
|
||||
ap = argparse.ArgumentParser(
|
||||
prog="ota-deploy.sh",
|
||||
description="Push esp32-csi-node.bin to one or more sensor nodes over WiFi.",
|
||||
)
|
||||
ap.add_argument("targets", nargs="*",
|
||||
help="node IPs; auto-discover if omitted")
|
||||
ap.add_argument("--build", action="store_true",
|
||||
help="idf.py build before deploying")
|
||||
ap.add_argument("--no-verify", action="store_true",
|
||||
help="skip post-reboot /ota/status confirmation")
|
||||
args = ap.parse_args(argv)
|
||||
|
||||
if args.build:
|
||||
rc = build_firmware()
|
||||
if rc != 0:
|
||||
return rc
|
||||
|
||||
if not BIN_PATH.is_file():
|
||||
err(f"firmware binary not found: {BIN_PATH} — pass --build first")
|
||||
return 2
|
||||
payload = BIN_PATH.read_bytes()
|
||||
log(f"firmware: {BIN_PATH} ({len(payload)} bytes)")
|
||||
|
||||
targets = args.targets or discover_nodes()
|
||||
if not targets:
|
||||
err("no nodes given and none discovered")
|
||||
return 1
|
||||
log(f"targets: {' '.join(targets)}")
|
||||
|
||||
# snapshot before
|
||||
before: dict[str, str] = {}
|
||||
for ip in targets:
|
||||
st = get_ota_status(ip)
|
||||
if not st:
|
||||
warn(f"{ip}: not reachable before upload")
|
||||
before[ip] = "UNREACHABLE"
|
||||
continue
|
||||
before[ip] = st.get("running_partition", "UNKNOWN")
|
||||
log(f"{ip} before: running_partition={before[ip]} time={st.get('time')}")
|
||||
|
||||
psk = os.environ.get("OTA_PSK") or None
|
||||
if psk:
|
||||
log("OTA_PSK set — sending Bearer token")
|
||||
|
||||
# upload in parallel
|
||||
log("uploading in parallel ...")
|
||||
results: dict[str, tuple[bool, float, str]] = {}
|
||||
with cf.ThreadPoolExecutor(max_workers=max(2, len(targets))) as pool:
|
||||
futs = {pool.submit(upload_one, ip, payload, psk): ip for ip in targets}
|
||||
for fut in cf.as_completed(futs):
|
||||
ip = futs[fut]
|
||||
ok, dt, body = fut.result()
|
||||
results[ip] = (ok, dt, body)
|
||||
tag = _c("32", "ok") if ok else _c("31", "ERR")
|
||||
log(f"{ip} upload {tag} in {dt:.1f}s body={body[:120]}")
|
||||
|
||||
if args.no_verify:
|
||||
log("--no-verify — done")
|
||||
return 0 if all(v[0] for v in results.values()) else 1
|
||||
|
||||
# verify
|
||||
log(f"waiting {REBOOT_WAIT_S}s for reboot ...")
|
||||
time.sleep(REBOOT_WAIT_S)
|
||||
fail = False
|
||||
for ip in targets:
|
||||
new_st: dict | None = None
|
||||
for _ in range(VERIFY_RETRIES):
|
||||
new_st = get_ota_status(ip)
|
||||
if new_st:
|
||||
break
|
||||
time.sleep(VERIFY_DELAY_S)
|
||||
if not new_st:
|
||||
err(f"{ip}: not reachable after reboot — DEAD or panic loop")
|
||||
fail = True
|
||||
continue
|
||||
new_part = new_st.get("running_partition", "?")
|
||||
new_time = new_st.get("time", "?")
|
||||
if new_part == before.get(ip):
|
||||
err(f"{ip}: running_partition still {new_part} — OTA did NOT take "
|
||||
"(likely panic on first boot from new slot)")
|
||||
fail = True
|
||||
else:
|
||||
log(f"{ip}: {before[ip]} → {_c('32', new_part)} (time={new_time}) ✓")
|
||||
return 1 if fail else 0
|
||||
|
||||
|
||||
if __name__ == "__main__":
|
||||
try:
|
||||
sys.exit(main(sys.argv[1:]))
|
||||
except KeyboardInterrupt:
|
||||
err("interrupted")
|
||||
sys.exit(130)
|
||||
|
|
@ -0,0 +1,241 @@
|
|||
#!/usr/bin/env python3
|
||||
"""
|
||||
Record an empty-room baseline for the RuView sensing-server.
|
||||
|
||||
ADR-103 v2 — persistent baseline override that's stable across NBVI
|
||||
re-selection between server restarts. Computes baseline from the FULL
|
||||
amplitude vector (all non-zero subcarriers), not from the dynamic NBVI
|
||||
top-K subset.
|
||||
|
||||
Usage:
|
||||
1. Operator steps out of the room.
|
||||
2. Run: scripts/record-baseline.py [--duration 90] [--server localhost]
|
||||
3. Wait for the "saved" message. Operator can come back.
|
||||
4. Restart sensing-server to pick up the new baseline.
|
||||
|
||||
The script connects to the live WebSocket stream, records `duration`
|
||||
seconds of per-node amplitudes, trims the first and last 15 seconds
|
||||
(catches door-opening transients), then for each node finds the most
|
||||
stable 30-second sub-window (lowest broadband CV) and writes per-node
|
||||
full-broadband mean / median / p95 to data/baseline.json.
|
||||
"""
|
||||
|
||||
import argparse
|
||||
import asyncio
|
||||
import json
|
||||
import math
|
||||
import statistics
|
||||
import sys
|
||||
import time
|
||||
from datetime import datetime, timezone
|
||||
from pathlib import Path
|
||||
|
||||
try:
|
||||
import websockets
|
||||
except ImportError:
|
||||
print("error: pip install websockets", file=sys.stderr)
|
||||
sys.exit(2)
|
||||
|
||||
|
||||
def full_broadband_mean(amps):
|
||||
"""Mean over all non-zero subcarriers (skips guard tones)."""
|
||||
valid = [v for v in amps if v > 0]
|
||||
return (sum(valid) / len(valid)) if valid else 0.0
|
||||
|
||||
|
||||
def circular_mean_var(phases):
|
||||
"""ADR-104 phase-domain: circular mean (radians) and circular variance
|
||||
(1 - |R|, in [0, 1]) over a list of unwrapped/atan2 phase samples.
|
||||
|
||||
Variance close to 0 = phases tightly clustered (stable subcarrier,
|
||||
suitable for baseline-comparison). Close to 1 = phases scattered
|
||||
(subcarrier is noisy; baseline reference unreliable).
|
||||
"""
|
||||
n = len(phases)
|
||||
if n == 0:
|
||||
return (0.0, 1.0)
|
||||
sx = sum(math.sin(p) for p in phases) / n
|
||||
cx = sum(math.cos(p) for p in phases) / n
|
||||
r = math.sqrt(sx * sx + cx * cx)
|
||||
mean = math.atan2(sx, cx)
|
||||
var = 1.0 - r
|
||||
return (mean, var)
|
||||
|
||||
|
||||
async def record(server: str, duration: float, port: int):
|
||||
# Per-node frame log: (t_sec, amps, phases, rssi).
|
||||
# ADR-104 phase-domain: phases captured alongside amplitudes when the
|
||||
# WS payload carries `phases` (ADR-106 full complex CSI). Missing or
|
||||
# empty phase vectors → trim_and_clean writes only amplitude baseline.
|
||||
by_node: dict[int, list[tuple[float, list[float], list[float], float]]] = {}
|
||||
url = f"ws://{server}:{port}/ws/sensing"
|
||||
start = time.time()
|
||||
print(f"connecting to {url} — recording {duration:.0f}s …", flush=True)
|
||||
async with websockets.connect(url) as ws:
|
||||
async for msg in ws:
|
||||
d = json.loads(msg)
|
||||
if d.get("type") != "sensing_update":
|
||||
continue
|
||||
t = time.time() - start
|
||||
for n in d.get("nodes") or []:
|
||||
a = n.get("amplitude") or []
|
||||
if not a:
|
||||
continue
|
||||
ph = n.get("phases") or []
|
||||
by_node.setdefault(n["node_id"], []).append(
|
||||
(t, a, ph, n.get("rssi_dbm", 0.0))
|
||||
)
|
||||
if time.time() - start >= duration:
|
||||
break
|
||||
return by_node
|
||||
|
||||
|
||||
def trim_and_clean(frames, trim_head_sec=15.0, trim_tail_sec=15.0, clean_window_sec=30.0):
|
||||
"""Trim head/tail transients, then scan for the cleanest sub-window.
|
||||
|
||||
`frames` is a list of (t_sec, amps, phases, rssi). `phases` may be an
|
||||
empty list when the server hasn't been upgraded to emit them — in
|
||||
that case the resulting baseline omits the phase-domain fields and
|
||||
the server falls back to amplitude-only drift (ADR-104 baseline mode).
|
||||
"""
|
||||
if not frames:
|
||||
return None
|
||||
t0 = frames[0][0]
|
||||
t1 = frames[-1][0]
|
||||
dur = t1 - t0
|
||||
if dur < trim_head_sec + trim_tail_sec + clean_window_sec / 2:
|
||||
head = dur / 6
|
||||
tail = dur / 6
|
||||
else:
|
||||
head = trim_head_sec
|
||||
tail = trim_tail_sec
|
||||
trimmed = [f for f in frames if t0 + head <= f[0] <= t1 - tail]
|
||||
if not trimmed:
|
||||
return None
|
||||
|
||||
win = clean_window_sec
|
||||
if (trimmed[-1][0] - trimmed[0][0]) <= win:
|
||||
chunk = trimmed
|
||||
else:
|
||||
best = None # (cv, frames)
|
||||
step = 5.0
|
||||
cursor = trimmed[0][0]
|
||||
while cursor + win <= trimmed[-1][0]:
|
||||
window = [f for f in trimmed if cursor <= f[0] <= cursor + win]
|
||||
if len(window) >= 5:
|
||||
                bms = [full_broadband_mean(a) for _, a, _, _ in window]
|
||||
mu = statistics.mean(bms)
|
||||
if mu > 0:
|
||||
sd = statistics.pstdev(bms)
|
||||
cv = sd / mu
|
||||
if best is None or cv < best[0]:
|
||||
best = (cv, window)
|
||||
cursor += step
|
||||
if best is None or not best[1]:
|
||||
return None
|
||||
chunk = best[1]
|
||||
|
||||
# ── Compute per-node stats on the clean window ───────────────
|
||||
    full_means = [full_broadband_mean(a) for _, a, _, _ in chunk]
|
||||
rssis = [r for _, _, _, r in chunk if r != 0]
|
||||
sorted_full = sorted(full_means)
|
||||
|
||||
# Per-subcarrier mean across the clean window (for diagnostic + future
|
||||
# subcarrier-level comparison if the server gets that capability).
|
||||
n_sub = min(len(a) for _, a, _, _ in chunk)
|
||||
per_sub_means = []
|
||||
for k in range(n_sub):
|
||||
vs = [a[k] for _, a, _, _ in chunk if k < len(a) and a[k] > 0]
|
||||
per_sub_means.append(statistics.mean(vs) if vs else 0.0)
|
||||
|
||||
# ADR-104 phase-domain: per-subcarrier circular mean + variance of the
|
||||
# captured phase samples. Only included if the WS stream carried
|
||||
# phases — server tolerates either schema.
|
||||
have_phases = any(ph for _, _, ph, _ in chunk)
|
||||
per_sub_phase_means: list[float] = []
|
||||
per_sub_phase_vars: list[float] = []
|
||||
if have_phases:
|
||||
n_phase_sub = min(
|
||||
(len(ph) for _, _, ph, _ in chunk if ph),
|
||||
default=0,
|
||||
)
|
||||
for k in range(n_phase_sub):
|
||||
samples = [ph[k] for _, _, ph, _ in chunk if k < len(ph)]
|
||||
if not samples:
|
||||
per_sub_phase_means.append(0.0)
|
||||
per_sub_phase_vars.append(1.0)
|
||||
continue
|
||||
mean, var = circular_mean_var(samples)
|
||||
per_sub_phase_means.append(mean)
|
||||
per_sub_phase_vars.append(var)
|
||||
|
||||
result = {
|
||||
# Persistent fields the server reads:
|
||||
"full_broadband_mean": statistics.mean(full_means),
|
||||
"full_broadband_p50": sorted_full[len(sorted_full)//2],
|
||||
"full_broadband_p95": sorted_full[int(len(sorted_full)*0.95)],
|
||||
"full_broadband_std": statistics.pstdev(full_means),
|
||||
"full_broadband_cv_pct": 100*statistics.pstdev(full_means)/statistics.mean(full_means)
|
||||
if statistics.mean(full_means) else 0.0,
|
||||
# Reference:
|
||||
"rssi_dbm": statistics.mean(rssis) if rssis else 0.0,
|
||||
"n_samples": len(full_means),
|
||||
"window_start_sec": chunk[0][0],
|
||||
"window_end_sec": chunk[-1][0],
|
||||
# Per-subcarrier diagnostic (kept so future server versions can do
|
||||
# subcarrier-level comparison without re-recording):
|
||||
"per_subcarrier_mean": [round(v, 3) for v in per_sub_means],
|
||||
}
|
||||
if per_sub_phase_means:
|
||||
# Rounding: 4 decimals on mean phase (radian), 3 on variance
|
||||
# — phase variance is in [0,1] so 3 decimals is plenty.
|
||||
result["per_subcarrier_phase_mean"] = [round(v, 4) for v in per_sub_phase_means]
|
||||
result["per_subcarrier_phase_var"] = [round(v, 3) for v in per_sub_phase_vars]
|
||||
return result
|
||||
|
||||
|
||||
def main():
|
||||
ap = argparse.ArgumentParser(description=__doc__.splitlines()[1])
|
||||
ap.add_argument("--duration", type=float, default=90.0, help="seconds to record (default 90)")
|
||||
ap.add_argument("--server", default="localhost", help="sensing-server host")
|
||||
ap.add_argument("--port", type=int, default=8765, help="ws port (default 8765)")
|
||||
ap.add_argument("--out", type=Path, default=Path("v2/data/baseline.json"))
|
||||
ap.add_argument("--trim-head", type=float, default=15.0)
|
||||
ap.add_argument("--trim-tail", type=float, default=15.0)
|
||||
ap.add_argument("--clean-window", type=float, default=30.0)
|
||||
args = ap.parse_args()
|
||||
|
||||
by_node = asyncio.run(record(args.server, args.duration, args.port))
|
||||
if not by_node:
|
||||
print("no data received from server", file=sys.stderr)
|
||||
sys.exit(1)
|
||||
|
||||
out = {
|
||||
"version": 2,
|
||||
"captured_at": datetime.now(timezone.utc).isoformat(timespec="seconds"),
|
||||
"duration_sec": args.duration,
|
||||
"trim_head_sec": args.trim_head,
|
||||
"trim_tail_sec": args.trim_tail,
|
||||
"clean_window_sec": args.clean_window,
|
||||
"method": "record → trim head/tail → find lowest-CV sub-window → FULL-broadband stats per node",
|
||||
"nodes": {},
|
||||
}
|
||||
print()
|
||||
for nid, frames in sorted(by_node.items()):
|
||||
result = trim_and_clean(frames, args.trim_head, args.trim_tail, args.clean_window)
|
||||
if not result:
|
||||
print(f"node {nid}: not enough data for cleaning (skipped)")
|
||||
continue
|
||||
out["nodes"][str(nid)] = result
|
||||
print(f"node {nid}: {len(frames)} raw frames, kept cleanest {result['n_samples']}-sample window")
|
||||
print(f" FULL broadband: mean={result['full_broadband_mean']:.2f} std={result['full_broadband_std']:.2f} CV={result['full_broadband_cv_pct']:.2f}%")
|
||||
print(f" full p50={result['full_broadband_p50']:.2f} p95={result['full_broadband_p95']:.2f} rssi={result['rssi_dbm']:.1f}")
|
||||
|
||||
args.out.parent.mkdir(parents=True, exist_ok=True)
|
||||
args.out.write_text(json.dumps(out, indent=2))
|
||||
print(f"\nsaved → {args.out}")
|
||||
print("restart sensing-server to load the new baseline.")
|
||||
|
||||
|
||||
if __name__ == "__main__":
|
||||
main()
|
||||
|
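The lowest-CV window scan inside `trim_and_clean` can be sketched in isolation as follows. This is a simplified stand-in, not the script's code: the helper name `lowest_cv_window`, the `(t_sec, value)` sample shape, and the synthetic trace are assumptions made for illustration, while the 30 s window, 5 s step, and 5-frame minimum mirror the values used above.

```python
import statistics


def lowest_cv_window(samples, win=30.0, step=5.0, min_frames=5):
    """Return (cv, window) for the sliding window with the lowest CV.

    `samples` is a time-sorted list of (t_sec, value) pairs (hypothetical
    shape; the real script carries 4-tuples per frame).
    """
    if not samples:
        return None
    best = None
    cursor = samples[0][0]
    last_t = samples[-1][0]
    while cursor + win <= last_t:
        window = [s for s in samples if cursor <= s[0] <= cursor + win]
        if len(window) >= min_frames:
            values = [v for _, v in window]
            mu = statistics.mean(values)
            if mu > 0:
                cv = statistics.pstdev(values) / mu
                # Strict '<' keeps the earliest window on ties.
                if best is None or cv < best[0]:
                    best = (cv, window)
        cursor += step
    return best


# Synthetic 1 Hz trace: alternating +/-3 noise for 60 s, flat afterwards.
trace = [(float(t), 10.0 + (3.0 if t % 2 else -3.0)) for t in range(60)]
trace += [(float(t), 10.0) for t in range(60, 121)]

cv, window = lowest_cv_window(trace)
print(f"best window starts at t={window[0][0]:.0f}s, CV={cv:.4f}")
```

On this trace the scan skips every window that overlaps the noisy first minute and settles on the first fully flat one, which is the behaviour the baseline recorder relies on to avoid start-up and shut-down transients.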
|
@ -1515,6 +1515,40 @@ export class LiveDemoTab {
|
|||
} catch (error) {
|
||||
this.logger.warn('Could not fetch models', { error: error.message });
|
||||
}
|
||||
// ADR-116 / ADR-117: surface WiFlow-v1 in the Model Control dropdown
|
||||
// when the server reports `pose_estimation: true` via /api/v1/info.
|
||||
// WiFlow is loaded outside the RVF model registry path (--wiflow-model
|
||||
// flag) so listModels() above doesn't return it. We add a virtual
|
||||
// entry and mark it active ONLY when no RVF model is already active
|
||||
// — otherwise the dropdown would silently flip from the operator's
|
||||
// chosen RVF model to "WiFlow-v1" every fetch.
|
||||
try {
|
||||
const r = await fetch('/api/v1/info');
|
||||
if (r.ok) {
|
||||
const info = await r.json();
|
||||
if (info?.features?.pose_estimation) {
|
||||
if (!this.modelState.models.some(m => m.id === 'wiflow-v1')) {
|
||||
this.modelState.models.unshift({
|
||||
id: 'wiflow-v1',
|
||||
name: 'WiFlow-v1 (lite, 186K params, --wiflow-model)',
|
||||
});
|
||||
}
|
||||
if (!this.modelState.activeModelId) {
|
||||
this.modelState.activeModelId = 'wiflow-v1';
|
||||
this.modelState.activeModelInfo = {
|
||||
model_id: 'wiflow-v1',
|
||||
name: 'WiFlow-v1',
|
||||
version: 'lite',
|
||||
pck_score: 0.929, // from model card; eval-set, not this deployment
|
||||
};
|
||||
}
|
||||
this.populateModelSelector();
|
||||
this.updateModelUI();
|
||||
}
|
||||
}
|
||||
} catch (e) {
|
||||
this.logger.warn('ADR-116 info probe failed', { error: e.message });
|
||||
}
|
||||
}
|
||||
|
||||
populateModelSelector() {
|
||||
|
|
|
|||
|
|
@ -51,6 +51,17 @@ export class PoseDetectionCanvas {
|
|||
this.showTrail = false;
|
||||
this.maxTrailLength = 10;
|
||||
|
||||
// ADR-105 / ADR-113: model-load gating. The canvas refuses to draw
|
||||
// skeletons until /api/v1/pose/stats reports model_loaded === true,
|
||||
// so an empty/zero-confidence keypoint stream from a model-less
|
||||
// server doesn't paint a misleading "phantom" pose.
|
||||
//
|
||||
// null = "haven't asked yet" (treated as not-loaded for rendering).
|
||||
this.modelLoaded = null;
|
||||
this.modelStatusUrl = options.modelStatusUrl || '/api/v1/pose/stats';
|
||||
this.modelStatusPollMs = options.modelStatusPollMs || 30000;
|
||||
this.modelStatusTimer = null;
|
||||
|
||||
// Initialize component
|
||||
this.initializeComponent();
|
||||
}
|
||||
|
|
@ -79,9 +90,79 @@ export class PoseDetectionCanvas {
|
|||
// Set up pose service subscription
|
||||
this.setupPoseServiceSubscription();
|
||||
|
||||
// ADR-105: poll model_loaded so we can hide the canvas when no
|
||||
// trained pose model is on the server.
|
||||
this.checkModelStatus();
|
||||
this.modelStatusTimer = setInterval(
|
||||
() => this.checkModelStatus(),
|
||||
this.modelStatusPollMs
|
||||
);
|
||||
|
||||
this.logger.info('PoseDetectionCanvas component initialized successfully');
|
||||
}
|
||||
|
||||
/**
|
||||
* Fetch `/api/v1/pose/stats` and update `this.modelLoaded`. On the
|
||||
* leading-edge transitions (null → false, true → false) we hide the
|
||||
* pose canvas and overlay a "No model loaded" notice so the operator
|
||||
* isn't fooled by an empty skeleton renderer.
|
||||
*/
|
||||
async checkModelStatus() {
|
||||
try {
|
||||
const resp = await fetch(this.modelStatusUrl, { cache: 'no-store' });
|
||||
if (!resp.ok) {
|
||||
// Server reachable but not surfacing pose stats — be safe.
|
||||
this.setModelLoaded(false, 'pose-stats endpoint error');
|
||||
return;
|
||||
}
|
||||
const json = await resp.json();
|
||||
const loaded = json && json.model_loaded === true;
|
||||
this.setModelLoaded(loaded, null);
|
||||
} catch (e) {
|
||||
// Network blip — don't flip-flop the UI on a transient failure.
|
||||
this.logger.debug('model-status poll failed', { err: e.message });
|
||||
}
|
||||
}
|
||||
|
||||
setModelLoaded(loaded, errOrNull) {
|
||||
if (this.modelLoaded === loaded) return;
|
||||
this.modelLoaded = loaded;
|
||||
this.logger.info('model-loaded state changed', { loaded, note: errOrNull });
|
||||
this.updateCanvasVisibility();
|
||||
}
|
||||
|
||||
updateCanvasVisibility() {
|
||||
if (!this.canvas) return;
|
||||
const wrap = this.canvas.parentElement; // .pose-canvas-container
|
||||
const overlayId = `model-overlay-${this.containerId}`;
|
||||
let overlay = document.getElementById(overlayId);
|
||||
if (this.modelLoaded === true) {
|
||||
this.canvas.style.visibility = 'visible';
|
||||
if (overlay) overlay.style.display = 'none';
|
||||
return;
|
||||
}
|
||||
// No model — hide the canvas and show a clear notice.
|
||||
this.canvas.style.visibility = 'hidden';
|
||||
if (!overlay && wrap) {
|
||||
overlay = document.createElement('div');
|
||||
overlay.id = overlayId;
|
||||
overlay.className = 'pose-model-missing';
|
||||
overlay.style.cssText =
|
||||
'position:absolute;inset:0;display:flex;align-items:center;' +
|
||||
'justify-content:center;color:#888;font-family:JetBrains Mono,monospace;' +
|
||||
'font-size:13px;text-align:center;padding:20px;background:#0d1117;';
|
||||
overlay.innerHTML =
|
||||
'No trained pose model loaded.<br>' +
|
||||
'<span style="color:#555;font-size:11px;">' +
|
||||
'Pose rendering disabled — sensing channels still active in ' +
|
||||
'the Sensing / Hardware tabs (ADR-105).</span>';
|
||||
wrap.style.position = 'relative';
|
||||
wrap.appendChild(overlay);
|
||||
} else if (overlay) {
|
||||
overlay.style.display = 'flex';
|
||||
}
|
||||
}
|
||||
|
||||
createDOMStructure() {
|
||||
this.container.innerHTML = `
|
||||
<div class="pose-detection-canvas-wrapper">
|
||||
|
|
@ -516,6 +597,13 @@ export class PoseDetectionCanvas {
|
|||
if (!this.renderer || !this.state.isActive) {
|
||||
return;
|
||||
}
|
||||
// ADR-105: refuse to paint anything when the server has no trained
|
||||
// pose model — empty/zero-confidence keypoints would otherwise show
|
||||
// up as a misleading skeleton. The overlay from
|
||||
// updateCanvasVisibility() already tells the operator why.
|
||||
if (this.modelLoaded !== true) {
|
||||
return;
|
||||
}
|
||||
|
||||
try {
|
||||
// Render trail before the current frame if enabled
|
||||
|
|
@ -1535,6 +1623,12 @@ export class PoseDetectionCanvas {
|
|||
this.unsubscribeFunctions.forEach(unsubscribe => unsubscribe());
|
||||
this.unsubscribeFunctions = [];
|
||||
|
||||
// ADR-105: stop the model-status poll.
|
||||
if (this.modelStatusTimer) {
|
||||
clearInterval(this.modelStatusTimer);
|
||||
this.modelStatusTimer = null;
|
||||
}
|
||||
|
||||
// Clean up resize observer
|
||||
if (this.resizeObserver) {
|
||||
this.resizeObserver.disconnect();
|
||||
|
|
|
|||
|
|
@ -488,8 +488,10 @@
|
|||
</div>
|
||||
</section>
|
||||
|
||||
<!-- Sensing Tab -->
|
||||
<section id="sensing" class="tab-content"></section>
|
||||
<!-- Sensing Tab (ADR-117: container div required by app.js SensingTab.mount) -->
|
||||
<section id="sensing" class="tab-content">
|
||||
<div id="sensing-container"></div>
|
||||
</section>
|
||||
|
||||
<!-- Training Tab -->
|
||||
<section id="training" class="tab-content">
|
||||
|
|
|
|||
|
|
@ -1,3 +1,8 @@
|
|||
export const WS_PATH = '/api/v1/stream/pose';
|
||||
// RuView sensing-server (Rust+Axum) exposes the live stream at /ws/sensing on
|
||||
// its dedicated WebSocket port (default 8765). The legacy wifi-densepose v1
|
||||
// path (/api/v1/stream/pose) is kept as a fallback in case the mobile app is
|
||||
// pointed at an old FastAPI backend.
|
||||
export const WS_PATH = '/ws/sensing';
|
||||
export const WS_PORT = 8765;
|
||||
export const RECONNECT_DELAYS = [1000, 2000, 4000, 8000, 16000];
|
||||
export const MAX_RECONNECT_ATTEMPTS = 10;
|
||||
|
|
|
|||
|
|
@ -124,8 +124,11 @@ export const MATScreen = () => {
|
|||
const { height } = useWindowDimensions();
|
||||
const webHeight = Math.max(240, Math.floor(height * 0.5));
|
||||
|
||||
const showOverlay = dataSource === 'simulated' && !simulationAcknowledged;
|
||||
const showBanner = dataSource === 'simulated' && simulationAcknowledged;
|
||||
// Simulation overlay/banner removed — UI shows only real signals from the
|
||||
// sensing-server. The `dataSource === 'simulated'` branch is never reached
|
||||
// in production builds (server refuses --source simulate).
|
||||
const showOverlay = false;
|
||||
const showBanner = false;
|
||||
|
||||
return (
|
||||
<ThemedView style={{ flex: 1, backgroundColor: colors.bg, padding: spacing.md }}>
|
||||
|
|
|
|||
|
|
@ -60,7 +60,7 @@ export default function VitalsScreen() {
|
|||
<ConnectionBanner status={bannerStatus} />
|
||||
|
||||
<ScrollView contentContainerStyle={styles.content} showsVerticalScrollIndicator={false}>
|
||||
<View style={styles.headerRow}>{isSimulated ? <ModeBadge mode="SIM" /> : null}</View>
|
||||
<View style={styles.headerRow}>{/* SIM badge removed: production shows only real signals. */}</View>
|
||||
|
||||
<View style={styles.gaugesRow}>
|
||||
<View style={styles.gaugeCard}>
|
||||
|
|
|
|||
|
|
@ -1,7 +1,5 @@
|
|||
import { SIMULATION_TICK_INTERVAL_MS } from '@/constants/simulation';
|
||||
import { MAX_RECONNECT_ATTEMPTS, RECONNECT_DELAYS, WS_PATH } from '@/constants/websocket';
|
||||
import { MAX_RECONNECT_ATTEMPTS, RECONNECT_DELAYS, WS_PATH, WS_PORT } from '@/constants/websocket';
|
||||
import { usePoseStore } from '@/stores/poseStore';
|
||||
import { generateSimulatedData } from '@/services/simulation.service';
|
||||
import type { ConnectionStatus, SensingFrame } from '@/types/sensing';
|
||||
|
||||
type FrameListener = (frame: SensingFrame) => void;
|
||||
|
|
@ -11,7 +9,6 @@ class WsService {
|
|||
private listeners = new Set<FrameListener>();
|
||||
private reconnectAttempt = 0;
|
||||
private reconnectTimer: ReturnType<typeof setTimeout> | null = null;
|
||||
private simulationTimer: ReturnType<typeof setInterval> | null = null;
|
||||
private targetUrl = '';
|
||||
private active = false;
|
||||
private status: ConnectionStatus = 'disconnected';
|
||||
|
|
@ -22,8 +19,9 @@ class WsService {
|
|||
this.reconnectAttempt = 0;
|
||||
|
||||
if (!url) {
|
||||
this.handleStatusChange('simulated');
|
||||
this.startSimulation();
|
||||
// No server URL configured — stay disconnected. Production builds
|
||||
// never fall back to synthetic data.
|
||||
this.handleStatusChange('disconnected');
|
||||
return;
|
||||
}
|
||||
|
||||
|
|
@ -40,7 +38,6 @@ class WsService {
|
|||
|
||||
socket.onopen = () => {
|
||||
this.reconnectAttempt = 0;
|
||||
this.stopSimulation();
|
||||
this.handleStatusChange('connected');
|
||||
};
|
||||
|
||||
|
|
@ -78,7 +75,6 @@ class WsService {
|
|||
disconnect(): void {
|
||||
this.active = false;
|
||||
this.clearReconnectTimer();
|
||||
this.stopSimulation();
|
||||
if (this.ws) {
|
||||
this.ws.close(1000, 'client disconnect');
|
||||
this.ws = null;
|
||||
|
|
@ -100,7 +96,9 @@ class WsService {
|
|||
private buildWsUrl(rawUrl: string): string {
|
||||
const parsed = new URL(rawUrl);
|
||||
const proto = parsed.protocol === 'https:' || parsed.protocol === 'wss:' ? 'wss:' : 'ws:';
|
||||
return `${proto}//${parsed.host}${WS_PATH}`;
|
||||
// RuView sensing-server runs WS on a separate port (WS_PORT, default 8765),
|
||||
// independent of the HTTP API port. Build the WS URL with that port.
|
||||
return `${proto}//${parsed.hostname}:${WS_PORT}${WS_PATH}`;
|
||||
}
|
||||
|
||||
private handleStatusChange(status: ConnectionStatus): void {
|
||||
|
|
@ -118,8 +116,8 @@ class WsService {
|
|||
}
|
||||
|
||||
if (this.reconnectAttempt >= MAX_RECONNECT_ATTEMPTS) {
|
||||
this.handleStatusChange('simulated');
|
||||
this.startSimulation();
|
||||
// Give up — stay disconnected. No synthetic fallback.
|
||||
this.handleStatusChange('disconnected');
|
||||
return;
|
||||
}
|
||||
|
||||
|
|
@ -130,27 +128,6 @@ class WsService {
|
|||
this.reconnectTimer = null;
|
||||
this.connect(this.targetUrl);
|
||||
}, delay);
|
||||
this.startSimulation();
|
||||
}
|
||||
|
||||
private startSimulation(): void {
|
||||
if (this.simulationTimer) {
|
||||
return;
|
||||
}
|
||||
this.simulationTimer = setInterval(() => {
|
||||
this.handleStatusChange('simulated');
|
||||
const frame = generateSimulatedData();
|
||||
this.listeners.forEach((listener) => {
|
||||
listener(frame);
|
||||
});
|
||||
}, SIMULATION_TICK_INTERVAL_MS);
|
||||
}
|
||||
|
||||
private stopSimulation(): void {
|
||||
if (this.simulationTimer) {
|
||||
clearInterval(this.simulationTimer);
|
||||
this.simulationTimer = null;
|
||||
}
|
||||
}
|
||||
|
||||
private clearReconnectTimer(): void {
|
||||
|
|
|
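The effect of the `buildWsUrl` change above is that the port in the configured server URL is discarded and the WebSocket always targets `WS_PORT`. A minimal Python sketch of the same mapping, assuming the `WS_PATH` and `WS_PORT` values from the constants file; the function name `build_ws_url` and the example host `sensing.example` are hypothetical:

```python
from urllib.parse import urlsplit

WS_PATH = "/ws/sensing"
WS_PORT = 8765


def build_ws_url(raw_url: str) -> str:
    """Map an HTTP(S) server URL to the sensing WebSocket URL."""
    parsed = urlsplit(raw_url)
    proto = "wss" if parsed.scheme in ("https", "wss") else "ws"
    # Only the hostname is reused; the HTTP port in raw_url is dropped.
    return f"{proto}://{parsed.hostname}:{WS_PORT}{WS_PATH}"


print(build_ws_url("http://100.123.189.10:8080"))
print(build_ws_url("https://sensing.example:443/api"))
```

One caveat when comparing with the TypeScript version: Python's `urlsplit(...).hostname` strips the brackets from IPv6 literals while the JavaScript `URL.hostname` keeps them, so this sketch is only meant for IPv4 addresses and DNS names.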
|||
|
|
@ -26,8 +26,8 @@ export const useMatStore = create<MatState>((set) => ({
|
|||
survivors: [],
|
||||
alerts: [],
|
||||
selectedEventId: null,
|
||||
dataSource: 'simulated',
|
||||
simulationAcknowledged: false,
|
||||
dataSource: 'real',
|
||||
simulationAcknowledged: true,
|
||||
|
||||
upsertEvent: (event) => {
|
||||
set((state) => {
|
||||
|
|
|
|||
|
|
@ -18,7 +18,9 @@ export interface SettingsState {
|
|||
export const useSettingsStore = create<SettingsState>()(
|
||||
persist(
|
||||
(set) => ({
|
||||
serverUrl: 'http://localhost:3000',
|
||||
// Defaults to the Mac's Tailscale IP so the phone can reach the
|
||||
// sensing-server from any network. Override in Settings if needed.
|
||||
serverUrl: 'http://100.123.189.10:8080',
|
||||
rssiScanEnabled: false,
|
||||
theme: 'system',
|
||||
alertSoundEnabled: true,
|
||||
|
|
|
|||
|
|
@ -1,4 +1,4 @@
|
|||
use std::net::{SocketAddr, UdpSocket};
|
||||
use std::net::{IpAddr, Ipv4Addr, SocketAddr, UdpSocket};
|
||||
use std::time::Duration;
|
||||
|
||||
use mdns_sd::{ServiceDaemon, ServiceEvent};
|
||||
|
|
@ -37,13 +37,15 @@ pub async fn discover_nodes(
|
|||
) -> Result<Vec<DiscoveredNode>, String> {
|
||||
let timeout_duration = Duration::from_millis(timeout_ms.unwrap_or(3000));
|
||||
|
||||
// Run mDNS and UDP discovery concurrently
|
||||
let (mdns_nodes, udp_nodes) = tokio::join!(
|
||||
discover_via_mdns(timeout_duration),
|
||||
discover_via_udp(timeout_duration),
|
||||
);
|
||||
// Current RuView FW doesn't advertise mDNS `_ruview._udp.local.` and
|
||||
// doesn't respond to UDP broadcast beacons, so those two paths return
|
||||
// nothing on every poll and just burn CPU/network. HTTP sweep alone
|
||||
// suffices for our deployment.
|
||||
let http_nodes = discover_via_http_sweep(timeout_duration).await;
|
||||
let mdns_nodes: Result<Vec<DiscoveredNode>, String> = Ok(Vec::new());
|
||||
let udp_nodes: Result<Vec<DiscoveredNode>, String> = Ok(Vec::new());
|
||||
|
||||
// Merge results, deduplicating by MAC address
|
||||
// Merge results, deduplicating by MAC address (or IP for HTTP-only nodes)
|
||||
let mut registry = NodeRegistry::new();
|
||||
|
||||
for node in mdns_nodes.unwrap_or_default() {
|
||||
|
|
@ -58,7 +60,23 @@ pub async fn discover_nodes(
|
|||
}
|
||||
}
|
||||
|
||||
let http_vec = http_nodes.unwrap_or_default();
|
||||
let _ = std::fs::OpenOptions::new().create(true).append(true)
|
||||
.open("/tmp/ruview-discovery.log")
|
||||
.map(|mut f| { use std::io::Write; let _ = writeln!(f, "[discover] http_vec.len()={}", http_vec.len()); });
|
||||
for node in http_vec {
|
||||
// HTTP sweep returns nodes without MAC — key by IP-derived pseudo-MAC
|
||||
let key = node.mac.clone().unwrap_or_else(|| format!("ip:{}", node.ip));
|
||||
let _ = std::fs::OpenOptions::new().create(true).append(true)
|
||||
.open("/tmp/ruview-discovery.log")
|
||||
.map(|mut f| { use std::io::Write; let _ = writeln!(f, "[discover] upsert key={} ip={}", key, node.ip); });
|
||||
registry.upsert(MacAddress::new(&key), node);
|
||||
}
|
||||
|
||||
let nodes: Vec<DiscoveredNode> = registry.all().into_iter().cloned().collect();
|
||||
let _ = std::fs::OpenOptions::new().create(true).append(true)
|
||||
.open("/tmp/ruview-discovery.log")
|
||||
.map(|mut f| { use std::io::Write; let _ = writeln!(f, "[discover] returning {} nodes", nodes.len()); });
|
||||
|
||||
// Update global state
|
||||
{
|
||||
|
|
@ -219,6 +237,155 @@ async fn discover_via_udp(timeout_duration: Duration) -> Result<Vec<DiscoveredNo
|
|||
|
||||
/// Parse a UDP beacon response into a DiscoveredNode.
|
||||
/// Format: RUVIEW_BEACON|<mac>|<node_id>|<version>|<chip>|<role>|<tdm_slot>|<tdm_total>
|
||||
/// Discover nodes by probing `/status`, then `/ota/status`, on port 8032 across part of the local /24 subnet.
|
||||
///
|
||||
/// Strategy:
|
||||
///   1. Detect the host IPv4 by "connecting" a UDP socket to 8.8.8.8 (no packets are sent) and reading its local address.
|
||||
///   2. For each host address in the scanned range (currently 2..=60, excluding self), send
|
||||
///      `GET http://X.X.X.X:8032/status`, then `/ota/status`, each with a short per-request timeout.
|
||||
///   3. If the response parses as JSON (`version` is read when present),
|
||||
/// treat the device as a RuView CSI node and build a `DiscoveredNode`.
|
||||
///
|
||||
/// MAC is filled from the JSON `node` field when present, otherwise `None`; UI manual
|
||||
/// add or a future FW field could fill it in.
|
||||
async fn discover_via_http_sweep(timeout_duration: Duration) -> Result<Vec<DiscoveredNode>, String> {
|
||||
// 1. Detect host IPv4
|
||||
let host_ip = match detect_host_ipv4() {
|
||||
Some(ip) => ip,
|
||||
None => {
|
||||
tracing::warn!("HTTP sweep: could not determine host IPv4");
|
||||
return Ok(Vec::new());
|
||||
}
|
||||
};
|
||||
let octets = host_ip.octets();
|
||||
let base = (octets[0], octets[1], octets[2]);
|
||||
tracing::info!("HTTP sweep on {}.{}.{}.0/24 (self={})", base.0, base.1, base.2, host_ip);
|
||||
|
||||
// 2. Build HTTP client with per-request timeout
|
||||
// Per-request timeout — generous enough for ESP32 HTTP server to respond
|
||||
    // even under WiFi contention. All probes run in parallel under join_all, so
|
||||
    // total elapsed ≈ the slowest task: up to two sequential requests (≤ 2 × 1.5 s).
|
||||
let per_req_timeout = std::cmp::min(timeout_duration, Duration::from_millis(1500));
|
||||
let client = match reqwest::Client::builder()
|
||||
.timeout(per_req_timeout)
|
||||
.build()
|
||||
{
|
||||
Ok(c) => c,
|
||||
Err(e) => {
|
||||
tracing::warn!("HTTP sweep: client build failed: {}", e);
|
||||
return Ok(Vec::new());
|
||||
}
|
||||
};
|
||||
|
||||
    // 3. Probe the selected hosts in parallel, one spawned task per address
|
||||
let mut tasks: Vec<tokio::task::JoinHandle<Option<DiscoveredNode>>> = Vec::new();
|
||||
// Scan only the low end of /24 (2..=60) — typical home/office DHCP pool
|
||||
// for IoT devices. Sweeping all 254 hosts every 10 s causes UI lag on
|
||||
// tokio runtime saturation. Operators with sensors at higher offsets
|
||||
// should expand this range.
|
||||
for h in 2u8..=60u8 {
|
||||
if h == octets[3] {
|
||||
continue; // skip self
|
||||
}
|
||||
let ip = format!("{}.{}.{}.{}", base.0, base.1, base.2, h);
|
||||
let client = client.clone();
|
||||
tasks.push(tokio::spawn(async move {
|
||||
// Probe FW5.47 /status first, then RuView /ota/status fallback.
|
||||
let url1 = format!("http://{}:8032/status", ip);
|
||||
let body: String = match client.get(&url1).send().await {
|
||||
Ok(r) if r.status().is_success() => match r.text().await {
|
||||
Ok(t) => t,
|
||||
Err(_) => return None,
|
||||
},
|
||||
_ => {
|
||||
let url2 = format!("http://{}:8032/ota/status", ip);
|
||||
match client.get(&url2).send().await {
|
||||
Ok(r) if r.status().is_success() => match r.text().await {
|
||||
Ok(t) => t,
|
||||
Err(_) => return None,
|
||||
},
|
||||
_ => return None,
|
||||
}
|
||||
}
|
||||
};
|
||||
let _ = std::fs::OpenOptions::new().create(true).append(true)
|
||||
.open("/tmp/ruview-discovery.log")
|
||||
.map(|mut f| { use std::io::Write; let _ = writeln!(f, "[probe] {} OK len={}", ip, body.len()); });
|
||||
let v: serde_json::Value = match serde_json::from_str(&body) {
|
||||
Ok(v) => v,
|
||||
Err(e) => {
|
||||
let _ = std::fs::OpenOptions::new().create(true).append(true)
|
||||
.open("/tmp/ruview-discovery.log")
|
||||
.map(|mut f| { use std::io::Write; let _ = writeln!(f, "[probe] {} json err: {}", ip, e); });
|
||||
return None;
|
||||
}
|
||||
};
|
||||
            // Both FW5.47 (`version`, `fw`, `node`) and RuView (`version`, `running_partition`) payloads carry `version`.
|
||||
let version = v.get("version").and_then(|x| x.as_str()).map(String::from)
|
||||
                .or_else(|| v.get("version").map(|x| x.to_string()))
|
||||
.unwrap_or_else(|| "unknown".to_string());
|
||||
let mac = v.get("node").and_then(|x| x.as_str()).map(String::from);
|
||||
Some(DiscoveredNode {
|
||||
ip,
|
||||
mac,
|
||||
hostname: None,
|
||||
node_id: 0,
|
||||
firmware_version: Some(version),
|
||||
health: HealthStatus::Online,
|
||||
last_seen: chrono::Utc::now().to_rfc3339(),
|
||||
chip: Chip::Esp32s3,
|
||||
mesh_role: MeshRole::Node,
|
||||
discovery_method: DiscoveryMethod::HttpSweep,
|
||||
tdm_slot: None,
|
||||
tdm_total: None,
|
||||
edge_tier: None,
|
||||
uptime_secs: None,
|
||||
capabilities: Some(NodeCapabilities {
|
||||
wasm: false,
|
||||
ota: true,
|
||||
csi: true,
|
||||
}),
|
||||
friendly_name: None,
|
||||
notes: None,
|
||||
})
|
||||
}));
|
||||
}
|
||||
|
||||
// 4. Wait with overall budget
|
||||
// Wait for ALL tasks to settle in parallel, bounded by the overall budget.
|
||||
// Previously used a sequential `for task in tasks { select! }` which awaited
|
||||
// tasks in IP order — a non-responding 192.168.1.1 blocked discovery of
|
||||
// 192.168.1.17/19 even though those completed in ~50 ms.
|
||||
let join_all_fut = futures::future::join_all(tasks);
|
||||
let results = match tokio::time::timeout(timeout_duration, join_all_fut).await {
|
||||
Ok(rs) => rs,
|
||||
Err(_) => {
|
||||
tracing::info!("HTTP sweep timeout — partial results lost");
|
||||
Vec::new()
|
||||
}
|
||||
};
|
||||
let mut found = Vec::new();
|
||||
for r in results {
|
||||
if let Ok(Some(node)) = r {
|
||||
tracing::info!("HTTP sweep found {} fw={:?}", node.ip, node.firmware_version);
|
||||
found.push(node);
|
||||
}
|
||||
}
|
||||
Ok(found)
|
||||
}
|
||||
|
||||
/// Determine the primary IPv4 of this host by "connecting" a UDP socket
|
||||
/// to a public address (8.8.8.8; a UDP connect sends no packets) and reading local_addr.
|
||||
fn detect_host_ipv4() -> Option<Ipv4Addr> {
|
||||
let sock = UdpSocket::bind("0.0.0.0:0").ok()?;
|
||||
sock.connect("8.8.8.8:80").ok()?;
|
||||
let local = sock.local_addr().ok()?;
|
||||
match local.ip() {
|
||||
IpAddr::V4(v4) if !v4.is_loopback() => Some(v4),
|
||||
_ => None,
|
||||
}
|
||||
}
|
||||
|
||||
fn parse_beacon_response(data: &[u8], addr: SocketAddr) -> Option<DiscoveredNode> {
|
||||
let text = std::str::from_utf8(data).ok()?;
|
||||
let parts: Vec<&str> = text.split('|').collect();
|
||||
|
|
|
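The address selection and probe order used by `discover_via_http_sweep` can be sketched as below. This is an illustration only, in Python rather than Rust: the helper names `sweep_targets` and `probe_urls` are hypothetical, and the 2..=60 range and port 8032 are copied from the hunk above. No network traffic is generated.

```python
import ipaddress


def sweep_targets(host_ip: str, first: int = 2, last: int = 60) -> list[str]:
    """Addresses to probe in host_ip's /24, skipping the host itself."""
    host = ipaddress.IPv4Address(host_ip)
    base = int(host) & 0xFFFFFF00  # zero the last octet
    own = int(host) & 0xFF
    return [
        str(ipaddress.IPv4Address(base | h))
        for h in range(first, last + 1)
        if h != own
    ]


def probe_urls(ip: str, port: int = 8032) -> list[str]:
    """URLs tried in order for one address: /status, then /ota/status."""
    return [f"http://{ip}:{port}/status", f"http://{ip}:{port}/ota/status"]


targets = sweep_targets("192.168.1.17")
print(len(targets), targets[0], targets[-1])
print(probe_urls(targets[0]))
```

With a host at `.17` the sweep covers 58 addresses. A sensor that received a DHCP lease above `.60` would not be found, which matches the caveat in the code comment about expanding the range.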
|||
|
|
@ -101,17 +101,47 @@ pub async fn start_server(
|
|||
if let Some(port) = config.udp_port {
|
||||
cmd.args(["--udp-port", &port.to_string()]);
|
||||
}
|
||||
if let Some(ref bind_addr) = config.bind_address {
|
||||
cmd.args(["--bind", bind_addr]);
|
||||
}
|
||||
// Bind address: default to 0.0.0.0 so LAN-connected ESP32 nodes can reach us.
|
||||
let bind_addr = config
|
||||
.bind_address
|
||||
.as_deref()
|
||||
.unwrap_or("0.0.0.0");
|
||||
cmd.args(["--bind-addr", bind_addr]);
|
||||
// Pass log level via RUST_LOG env (sensing-server reads tracing_subscriber env).
|
||||
if let Some(ref log_level) = config.log_level {
|
||||
cmd.args(["--log-level", log_level]);
|
||||
cmd.env("RUST_LOG", log_level);
|
||||
}
|
||||
|
||||
// Set data source (default to "simulate" if not specified for demo mode)
|
||||
let source = config.source.as_deref().unwrap_or("simulate");
|
||||
// Set data source (default to "esp32" for real CSI ingest; UI may override)
|
||||
let source = config.source.as_deref().unwrap_or("esp32");
|
||||
cmd.args(["--source", source]);
|
||||
|
||||
// Auto-load bundled vital-signs RVF model if present next to the binary.
|
||||
// Searches: <exe_dir>/wifi-densepose-v1.rvf, then <resource_dir>/wifi-densepose-v1.rvf.
|
||||
let mut model_path: Option<std::path::PathBuf> = None;
|
||||
if let Ok(exe) = std::env::current_exe() {
|
||||
if let Some(dir) = exe.parent() {
|
||||
let candidate = dir.join("wifi-densepose-v1.rvf");
|
||||
if candidate.exists() {
|
||||
model_path = Some(candidate);
|
||||
}
|
||||
}
|
||||
}
|
||||
if model_path.is_none() {
|
||||
if let Ok(resource_dir) = app.path().resource_dir() {
|
||||
let candidate = resource_dir.join("wifi-densepose-v1.rvf");
|
||||
if candidate.exists() {
|
||||
model_path = Some(candidate);
|
||||
}
|
||||
}
|
||||
}
|
||||
if let Some(p) = model_path {
|
||||
tracing::info!("Auto-loading vital-signs RVF model: {}", p.display());
|
||||
cmd.args(["--load-rvf", &p.to_string_lossy()]);
|
||||
} else {
|
||||
tracing::warn!("No wifi-densepose-v1.rvf found next to binary or in resources; vital signs disabled");
|
||||
}
|
||||
|
||||
// Redirect stdout/stderr to pipes for monitoring
|
||||
cmd.stdout(Stdio::piped());
|
||||
cmd.stderr(Stdio::piped());
|
||||
|
|
|
|||
|
|
@ -1,12 +1,12 @@
|
|||
{
|
||||
"name": "ruview-desktop-ui",
|
||||
"version": "0.3.0",
|
||||
"version": "0.4.4",
|
||||
"lockfileVersion": 3,
|
||||
"requires": true,
|
||||
"packages": {
|
||||
"": {
|
||||
"name": "ruview-desktop-ui",
|
||||
"version": "0.3.0",
|
||||
"version": "0.4.4",
|
||||
"dependencies": {
|
||||
"@tauri-apps/api": "^2.0.0",
|
||||
"@tauri-apps/plugin-dialog": "^2.6.0",
|
||||
|
|
@ -53,7 +53,6 @@
|
|||
"integrity": "sha512-CGOfOJqWjg2qW/Mb6zNsDm+u5vFQ8DxXfbM09z69p5Z6+mE1ikP2jUXw+j42Pf1XTYED2Rni5f95npYeuwMDQA==",
|
||||
"dev": true,
|
||||
"license": "MIT",
|
||||
"peer": true,
|
||||
"dependencies": {
|
||||
"@babel/code-frame": "^7.29.0",
|
||||
"@babel/generator": "^7.29.0",
|
||||
|
|
@ -1247,7 +1246,6 @@
|
|||
"integrity": "sha512-z9VXpC7MWrhfWipitjNdgCauoMLRdIILQsAEV+ZesIzBq/oUlxk0m3ApZuMFCXdnS4U7KrI+l3WRUEGQ8K1QKw==",
|
||||
"dev": true,
|
||||
"license": "MIT",
|
||||
"peer": true,
|
||||
"dependencies": {
|
||||
"@types/prop-types": "*",
|
||||
"csstype": "^3.2.2"
|
||||
|
|
@ -1317,7 +1315,6 @@
|
|||
}
|
||||
],
|
||||
"license": "MIT",
|
||||
"peer": true,
|
||||
"dependencies": {
|
||||
"baseline-browser-mapping": "^2.9.0",
|
||||
"caniuse-lite": "^1.0.30001759",
|
||||
|
|
@ -1587,7 +1584,6 @@
|
|||
"integrity": "sha512-5gTmgEY/sqK6gFXLIsQNH19lWb4ebPDLA4SdLP7dsWkIXHWlG66oPuVvXSGFPppYZz8ZDZq0dYYrbHfBCVUb1Q==",
|
||||
"dev": true,
|
||||
"license": "MIT",
|
||||
"peer": true,
|
||||
"engines": {
|
||||
"node": ">=12"
|
||||
},
|
||||
|
|
@ -1629,7 +1625,6 @@
|
|||
"resolved": "https://registry.npmjs.org/react/-/react-18.3.1.tgz",
|
||||
"integrity": "sha512-wS+hAgJShR0KhEvPJArfuPVN1+Hz1t0Y6n5jLrGQbkb4urgPE/0Rve+1kMB1v/oWgHgm4WIcV+i7F2pTVj+2iQ==",
|
||||
"license": "MIT",
|
||||
"peer": true,
|
||||
"dependencies": {
|
||||
"loose-envify": "^1.1.0"
|
||||
},
|
||||
|
|
@ -1802,7 +1797,6 @@
|
|||
"integrity": "sha512-+Oxm7q9hDoLMyJOYfUYBuHQo+dkAloi33apOPP56pzj+vsdJDzr+j1NISE5pyaAuKL4A3UD34qd0lx5+kfKp2g==",
|
||||
"dev": true,
|
||||
"license": "MIT",
|
||||
"peer": true,
|
||||
"dependencies": {
|
||||
"esbuild": "^0.25.0",
|
||||
"fdir": "^6.4.4",
|
||||
|
|
|
|||
|
|
@ -3,7 +3,7 @@ import { invoke } from "@tauri-apps/api/core";
|
|||
import type { Node } from "../types";
|
||||
|
||||
interface UseNodesOptions {
|
||||
/** Auto-poll interval in milliseconds. Set to 0 to disable. Default: 10000 */
|
||||
/** Auto-poll interval in milliseconds. Set to 0 to disable. Default: 30000 */
|
||||
pollInterval?: number;
|
||||
/** Whether to start scanning on mount. Default: false */
|
||||
autoScan?: boolean;
|
||||
|
|
@ -23,7 +23,7 @@ interface UseNodesReturn {
|
|||
}
|
||||
|
||||
export function useNodes(options: UseNodesOptions = {}): UseNodesReturn {
|
||||
const { pollInterval = 10_000, autoScan = false } = options;
|
||||
const { pollInterval = 30_000, autoScan = false } = options;
|
||||
|
||||
const [nodes, setNodes] = useState<Node[]>([]);
|
||||
const [isScanning, setIsScanning] = useState(false);
|
||||
|
|
@ -37,9 +37,15 @@ export function useNodes(options: UseNodesOptions = {}): UseNodesReturn {
|
|||
|
||||
try {
|
||||
const discovered = await invoke<Node[]>("discover_nodes", {
|
||||
timeoutMs: 5000,
|
||||
timeoutMs: 8000,
|
||||
});
|
||||
setNodes(discovered);
|
||||
// Discovery is flaky on busy LANs — overall timeout races with the
|
||||
// per-request reqwest timeouts and sometimes returns 0 even when
|
||||
// sensors are reachable. Keep the last good list rather than
|
||||
// flashing to "no nodes".
|
||||
if (discovered.length > 0) {
|
||||
setNodes(discovered);
|
||||
}
|
||||
} catch (err) {
|
||||
const message =
|
||||
err instanceof Error ? err.message : String(err);
|
||||
|
|
|
|||
|
|
@ -5,11 +5,11 @@ import type { ServerConfig, ServerStatus } from "../types";
|
|||
const DEFAULT_CONFIG: ServerConfig = {
|
||||
http_port: 8080,
|
||||
ws_port: 8765,
|
||||
udp_port: 5005,
|
||||
udp_port: 5006,
|
||||
static_dir: null,
|
||||
model_dir: null,
|
||||
log_level: "info",
|
||||
source: "simulate",
|
||||
source: "esp32",
|
||||
};
|
||||
|
||||
interface UseServerOptions {
|
||||
|
|
|
|||
|
|
@ -36,9 +36,18 @@ const Dashboard: React.FC<DashboardProps> = ({ onNavigate }) => {
|
|||
setScanError(null);
|
||||
try {
|
||||
const { invoke } = await import("@tauri-apps/api/core");
|
||||
const found = await invoke<DiscoveredNode[]>("discover_nodes", { timeoutMs: 3000 });
|
||||
setNodes(found);
|
||||
if (found.length === 0) {
|
||||
const found = await invoke<DiscoveredNode[]>("discover_nodes", { timeoutMs: 8000 });
|
||||
// Merge with existing list — discovery on busy LANs sometimes misses
|
||||
// a node it found in the previous round. Add new entries, refresh
|
||||
// ones we see again, keep previously-found ones.
|
||||
if (found.length > 0) {
|
||||
setNodes((prev) => {
|
||||
const byIp = new Map(prev.map((n) => [n.ip, n]));
|
||||
for (const n of found) byIp.set(n.ip, n);
|
||||
return Array.from(byIp.values());
|
||||
});
|
||||
setScanError(null);
|
||||
} else if (nodes.length === 0) {
|
||||
setScanError("No nodes found. Ensure ESP32 devices are powered on and connected to the network.");
|
||||
}
|
||||
} catch (err) {
|
||||
|
|
|
|||
|
|
@ -68,7 +68,14 @@ const NetworkDiscovery: React.FC<NetworkDiscoveryProps> = ({ onNavigate }) => {
|
|||
const found = await invoke<DiscoveredNode[]>("discover_nodes", {
|
||||
timeoutMs: scanDuration,
|
||||
});
|
||||
setNodes(found);
|
||||
// Merge with existing — flaky LAN scans sometimes miss a node that
|
||||
// was found a moment ago. Add new entries, refresh ones we see again,
|
||||
// keep previously-found ones (incl. manual-added).
|
||||
setNodes((prev) => {
|
||||
const byIp = new Map(prev.map((n) => [n.ip, n]));
|
||||
for (const n of found) byIp.set(n.ip, n);
|
||||
return Array.from(byIp.values());
|
||||
});
|
||||
} catch (err) {
|
||||
setError(err instanceof Error ? err.message : String(err));
|
||||
} finally {
|
||||
|
|
|
|||
|
|
@ -303,7 +303,7 @@ export const Sensing: React.FC = () => {
|
|||
const [stopping, setStopping] = useState(false);
|
||||
|
||||
// Data source selection
|
||||
const [dataSource, setDataSource] = useState<DataSource>("simulate");
|
||||
const [dataSource, setDataSource] = useState<DataSource>("esp32");
|
||||
|
||||
// Log viewer state
|
||||
const [logEntries, setLogEntries] = useState<LogEntry[]>([]);
|
||||
|
|
@ -557,7 +557,6 @@ export const Sensing: React.FC = () => {
|
|||
opacity: isRunning ? 0.6 : 1,
|
||||
}}
|
||||
>
|
||||
<option value="simulate">Simulate</option>
|
||||
<option value="esp32">ESP32 (Real)</option>
|
||||
<option value="wifi">WiFi (RSSI)</option>
|
||||
<option value="auto">Auto Detect</option>
|
||||
|
|
|
|||
|
|
@ -170,7 +170,7 @@ export interface WasmModule {
|
|||
// Sensing Server
|
||||
// ---------------------------------------------------------------------------
|
||||
|
||||
export type DataSource = "auto" | "wifi" | "esp32" | "simulate";
|
||||
export type DataSource = "auto" | "wifi" | "esp32";
|
||||
|
||||
export interface ServerConfig {
|
||||
http_port: number;
|
||||
|
|
|
|||
|
|
@ -21,94 +21,116 @@ use std::path::{Path, PathBuf};
|
|||
|
||||
// ── Feature vector ───────────────────────────────────────────────────────────
|
||||
|
||||
/// Extended feature vector: 7 server features + 8 subcarrier-derived features = 15.
|
||||
const N_FEATURES: usize = 15;
|
||||
/// ADR-118: feature vector redesigned for multi-node use + multicollinearity
|
||||
/// reduction. Audit on 7-class training set showed:
|
||||
/// * 17-21 multicollinear pairs (|r|>0.85) — energy features and amplitude
|
||||
/// scalars were highly redundant.
|
||||
/// * `amp_min` constant 0.0 across all frames (null subcarrier of HT20),
|
||||
/// making `amp_range = amp_max - 0` fully redundant with `amp_max`.
|
||||
/// * On 6-node data F-stat 10× higher than 2-node, but classifier accuracy
|
||||
/// barely budged (40→44%) because the prior 15-feature pipeline used only
|
||||
/// `nodes.first()` — 5 of 6 sensors carried zero weight.
|
||||
///
|
||||
/// New 22-feature layout:
|
||||
/// [0..4] global signal features:
|
||||
/// variance, mean_rssi, dominant_freq_hz, change_points
|
||||
/// [4..22] per-node features (6 nodes × 3 features each):
|
||||
/// per node id N∈{1..6}, base = 4 + (N-1)*3:
|
||||
/// base+0: amp_std — motion / multipath spread
|
||||
/// base+1: amp_skew — distribution asymmetry (where strong scatterers are)
|
||||
/// base+2: amp_entropy — spectral diversity (normalised)
|
||||
/// Total: 22 features.
|
||||
const N_GLOBAL_FEATURES: usize = 4;
|
||||
const N_PER_NODE_FEATURES: usize = 3;
|
||||
const MAX_NODES: usize = 6;
|
||||
const N_FEATURES: usize = N_GLOBAL_FEATURES + MAX_NODES * N_PER_NODE_FEATURES;
|
||||
|
||||
/// ADR-120: exported feature count so external crates (e.g. the main
|
||||
/// crate's AppStateInner) can size their rolling buffers correctly.
|
||||
pub const N_FEATURES_PUB: usize = N_FEATURES;
|
||||
|
||||
/// Default class names for backward compatibility with old saved models.
|
||||
const DEFAULT_CLASSES: &[&str] = &["absent", "present_still", "present_moving", "active"];
|
||||
|
||||
/// Extract extended feature vector from a JSONL frame (features + raw amplitudes).
|
||||
/// Extract extended feature vector from a JSONL frame (features + per-node amplitudes).
|
||||
/// Missing-node features are zero-padded; z-score normalisation later treats
|
||||
/// them consistently.
|
||||
pub fn features_from_frame(frame: &serde_json::Value) -> [f64; N_FEATURES] {
|
||||
let feat = frame.get("features").cloned().unwrap_or(serde_json::Value::Null);
|
||||
let nodes = frame.get("nodes").and_then(|n| n.as_array());
|
||||
let amps: Vec<f64> = nodes
|
||||
.and_then(|ns| ns.first())
|
||||
.and_then(|n| n.get("amplitude"))
|
||||
.and_then(|a| a.as_array())
|
||||
.map(|arr| arr.iter().filter_map(|v| v.as_f64()).collect())
|
||||
.unwrap_or_default();
|
||||
let mut out = [0.0f64; N_FEATURES];
|
||||
|
||||
// Server-computed features (0-6).
|
||||
let variance = feat.get("variance").and_then(|v| v.as_f64()).unwrap_or(0.0);
|
||||
let mbp = feat.get("motion_band_power").and_then(|v| v.as_f64()).unwrap_or(0.0);
|
||||
let bbp = feat.get("breathing_band_power").and_then(|v| v.as_f64()).unwrap_or(0.0);
|
||||
let sp = feat.get("spectral_power").and_then(|v| v.as_f64()).unwrap_or(0.0);
|
||||
let df = feat.get("dominant_freq_hz").and_then(|v| v.as_f64()).unwrap_or(0.0);
|
||||
let cp = feat.get("change_points").and_then(|v| v.as_f64()).unwrap_or(0.0);
|
||||
let rssi = feat.get("mean_rssi").and_then(|v| v.as_f64()).unwrap_or(0.0);
|
||||
// ── Global signal features (0..4) ──
|
||||
out[0] = feat.get("variance").and_then(|v| v.as_f64()).unwrap_or(0.0);
|
||||
out[1] = feat.get("mean_rssi").and_then(|v| v.as_f64()).unwrap_or(0.0);
|
||||
out[2] = feat.get("dominant_freq_hz").and_then(|v| v.as_f64()).unwrap_or(0.0);
|
||||
out[3] = feat.get("change_points").and_then(|v| v.as_f64()).unwrap_or(0.0);
|
||||
|
||||
// Subcarrier-derived features (7-14).
|
||||
let (amp_mean, amp_std, amp_skew, amp_kurt, amp_iqr, amp_entropy, amp_max, amp_range) =
|
||||
subcarrier_stats(&amps);
|
||||
|
||||
[
|
||||
variance, mbp, bbp, sp, df, cp, rssi,
|
||||
amp_mean, amp_std, amp_skew, amp_kurt, amp_iqr, amp_entropy, amp_max, amp_range,
|
||||
]
|
||||
// ── Per-node features (4..22) ──
|
||||
if let Some(nodes) = frame.get("nodes").and_then(|n| n.as_array()) {
|
||||
for node_obj in nodes {
|
||||
let nid = node_obj.get("node_id").and_then(|v| v.as_u64()).unwrap_or(0) as usize;
|
||||
if nid == 0 || nid > MAX_NODES { continue; }
|
||||
let amps: Vec<f64> = node_obj.get("amplitude")
|
||||
.or_else(|| node_obj.get("amplitudes"))
|
||||
.and_then(|a| a.as_array())
|
||||
.map(|arr| arr.iter().filter_map(|v| v.as_f64()).collect())
|
||||
.unwrap_or_default();
|
||||
let (std_a, skew_a, entropy_a) = per_node_stats(&amps);
|
||||
let base = N_GLOBAL_FEATURES + (nid - 1) * N_PER_NODE_FEATURES;
|
||||
out[base] = std_a;
|
||||
out[base + 1] = skew_a;
|
||||
out[base + 2] = entropy_a;
|
||||
}
|
||||
}
|
||||
out
|
||||
}
|
||||
|
||||
/// Also keep a simpler version for runtime (no JSONL, just FeatureInfo + amps).
|
||||
pub fn features_from_runtime(feat: &serde_json::Value, amps: &[f64]) -> [f64; N_FEATURES] {
|
||||
let variance = feat.get("variance").and_then(|v| v.as_f64()).unwrap_or(0.0);
|
||||
let mbp = feat.get("motion_band_power").and_then(|v| v.as_f64()).unwrap_or(0.0);
|
||||
let bbp = feat.get("breathing_band_power").and_then(|v| v.as_f64()).unwrap_or(0.0);
|
||||
let sp = feat.get("spectral_power").and_then(|v| v.as_f64()).unwrap_or(0.0);
|
||||
let df = feat.get("dominant_freq_hz").and_then(|v| v.as_f64()).unwrap_or(0.0);
|
||||
let cp = feat.get("change_points").and_then(|v| v.as_f64()).unwrap_or(0.0);
|
||||
let rssi = feat.get("mean_rssi").and_then(|v| v.as_f64()).unwrap_or(0.0);
|
||||
let (amp_mean, amp_std, amp_skew, amp_kurt, amp_iqr, amp_entropy, amp_max, amp_range) =
|
||||
subcarrier_stats(amps);
|
||||
[
|
||||
variance, mbp, bbp, sp, df, cp, rssi,
|
||||
amp_mean, amp_std, amp_skew, amp_kurt, amp_iqr, amp_entropy, amp_max, amp_range,
|
||||
]
|
||||
/// Runtime variant: callers pass the already-aggregated feature struct and a
|
||||
/// slice of (node_id, &amplitudes) pairs. Compatible with the broadcast tick
|
||||
/// task which has access to all live nodes simultaneously.
|
||||
pub fn features_from_runtime(
|
||||
feat: &serde_json::Value,
|
||||
per_node_amps: &[(u8, &[f64])],
|
||||
) -> [f64; N_FEATURES] {
|
||||
let mut out = [0.0f64; N_FEATURES];
|
||||
|
||||
out[0] = feat.get("variance").and_then(|v| v.as_f64()).unwrap_or(0.0);
|
||||
out[1] = feat.get("mean_rssi").and_then(|v| v.as_f64()).unwrap_or(0.0);
|
||||
out[2] = feat.get("dominant_freq_hz").and_then(|v| v.as_f64()).unwrap_or(0.0);
|
||||
out[3] = feat.get("change_points").and_then(|v| v.as_f64()).unwrap_or(0.0);
|
||||
|
||||
for (nid, amps) in per_node_amps {
|
||||
let nid = *nid as usize;
|
||||
if nid == 0 || nid > MAX_NODES { continue; }
|
||||
let (std_a, skew_a, entropy_a) = per_node_stats(amps);
|
||||
let base = N_GLOBAL_FEATURES + (nid - 1) * N_PER_NODE_FEATURES;
|
||||
out[base] = std_a;
|
||||
out[base + 1] = skew_a;
|
||||
out[base + 2] = entropy_a;
|
||||
}
|
||||
out
|
||||
}
|
||||
|
||||
/// Compute statistical features from raw subcarrier amplitudes.
|
||||
fn subcarrier_stats(amps: &[f64]) -> (f64, f64, f64, f64, f64, f64, f64, f64) {
|
||||
/// Compute the 3 per-node statistics used in the new feature vector:
|
||||
/// std (motion / multipath spread), skew (distribution asymmetry),
|
||||
/// entropy (spectral diversity, normalised to [0, 1]).
|
||||
fn per_node_stats(amps: &[f64]) -> (f64, f64, f64) {
|
||||
if amps.is_empty() {
|
||||
return (0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0);
|
||||
return (0.0, 0.0, 0.0);
|
||||
}
|
||||
let n = amps.len() as f64;
|
||||
let mean = amps.iter().sum::<f64>() / n;
|
||||
let var = amps.iter().map(|a| (a - mean).powi(2)).sum::<f64>() / n;
|
||||
let std = var.sqrt().max(1e-9);
|
||||
|
||||
// Skewness (asymmetry).
|
||||
let skew = amps.iter().map(|a| ((a - mean) / std).powi(3)).sum::<f64>() / n;
|
||||
// Kurtosis (peakedness).
|
||||
let kurt = amps.iter().map(|a| ((a - mean) / std).powi(4)).sum::<f64>() / n - 3.0;
|
||||
|
||||
// IQR (inter-quartile range).
|
||||
let mut sorted = amps.to_vec();
|
||||
sorted.sort_by(|a, b| a.partial_cmp(b).unwrap());
|
||||
let q1 = sorted[sorted.len() / 4];
|
||||
let q3 = sorted[3 * sorted.len() / 4];
|
||||
let iqr = q3 - q1;
|
||||
|
||||
// Spectral entropy (normalised).
|
||||
let total_power: f64 = amps.iter().map(|a| a * a).sum::<f64>().max(1e-9);
|
||||
let entropy: f64 = amps.iter()
|
||||
.map(|a| {
|
||||
let p = (a * a) / total_power;
|
||||
if p > 1e-12 { -p * p.ln() } else { 0.0 }
|
||||
})
|
||||
.sum::<f64>() / n.ln().max(1e-9); // normalise to [0,1]
|
||||
|
||||
let max_val = sorted.last().copied().unwrap_or(0.0);
|
||||
let range = max_val - sorted.first().copied().unwrap_or(0.0);
|
||||
|
||||
(mean, std, skew, kurt, iqr, entropy, max_val, range)
|
||||
.sum::<f64>() / n.ln().max(1e-9);
|
||||
(std, skew, entropy)
|
||||
}
|
||||
|
||||
// ── Per-class statistics ─────────────────────────────────────────────────────
|
||||
|
|
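Review note: the 22-feature layout in the hunk above maps each 1-based node id to a fixed three-slot range, and the entropy feature is normalised by `ln(n)` so that it lands in [0, 1]. The sketch below restates both pieces in isolation, assuming the constants from the diff. The helpers `node_base` and `normalised_entropy` are hypothetical names, and the early return for fewer than two amplitudes is a simplification of the `max(1e-9)` guard in the original.

```rust
const N_GLOBAL_FEATURES: usize = 4;
const N_PER_NODE_FEATURES: usize = 3;
const MAX_NODES: usize = 6;
const N_FEATURES: usize = N_GLOBAL_FEATURES + MAX_NODES * N_PER_NODE_FEATURES; // 22

/// Index of the first per-node slot for a 1-based node id,
/// or `None` when the id falls outside 1..=MAX_NODES.
fn node_base(node_id: usize) -> Option<usize> {
    if node_id == 0 || node_id > MAX_NODES {
        None
    } else {
        Some(N_GLOBAL_FEATURES + (node_id - 1) * N_PER_NODE_FEATURES)
    }
}

/// Spectral entropy of the power distribution p_k = a_k^2 / sum(a^2),
/// divided by ln(n) so a flat spectrum gives 1 and a single spike gives 0.
fn normalised_entropy(amps: &[f64]) -> f64 {
    if amps.len() < 2 {
        return 0.0;
    }
    let total: f64 = amps.iter().map(|a| a * a).sum::<f64>().max(1e-9);
    let h: f64 = amps
        .iter()
        .map(|a| {
            let p = a * a / total;
            if p > 1e-12 { -p * p.ln() } else { 0.0 }
        })
        .sum();
    h / (amps.len() as f64).ln()
}
```

Node 1 occupies slots 4..7 and node 6 occupies slots 19..22, which is why the total comes to 22. Frames with fewer than six nodes leave the missing slots at zero.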
@ -121,15 +143,164 @@ pub struct ClassStats {
|
|||
pub stddev: [f64; N_FEATURES],
|
||||
}
|
||||
|
||||
/// ADR-119: MLP (multi-layer perceptron) hidden-layer width.
|
||||
/// 32 units is enough capacity for our 22-feature × 6-class problem
|
||||
/// (~1k parameters: 22×32 + 32 + 32×6 + 6 = 934) while staying small enough to train in <60s on the
|
||||
/// 151k-frame dataset and load instantly at runtime.
|
||||
const MLP_HIDDEN: usize = 32;
|
||||
|
||||
/// ADR-120: temporal window size (number of consecutive frames stacked
|
||||
/// into the windowed-MLP input). At the broadcast tick rate (~10 fps),
|
||||
/// 20 frames = 2 seconds of context — enough to capture walking step
|
||||
/// cadence (2 Hz), sit-stand transition cycles (0.5 Hz), and breathing
|
||||
/// modulation. Chosen to match WiFlow's training-time window so amplitude
|
||||
/// history buffers can be reused.
|
||||
pub const WINDOW_FRAMES: usize = 20;
|
||||
|
||||
/// ADR-120: windowed-MLP input dimensionality = WINDOW_FRAMES × N_FEATURES.
|
||||
const WINDOWED_INPUT: usize = WINDOW_FRAMES * N_FEATURES;
|
||||
|
||||
/// ADR-120: windowed-MLP hidden width. Larger than MLP_HIDDEN because
|
||||
/// input is 20× wider (440 vs 22). 64 keeps params under 30k.
|
||||
const WINDOWED_HIDDEN: usize = 64;
|
||||
|
||||
/// ADR-119: trained MLP classifier. Single hidden layer, ReLU activation,
|
||||
/// softmax output. Stored alongside the LogReg weights — when `is_trained()`
|
||||
/// returns true, `AdaptiveModel::classify` uses the MLP; otherwise it falls
|
||||
/// back to logistic regression (the legacy path from before ADR-119).
|
||||
#[derive(Debug, Clone, Serialize, Deserialize, Default)]
|
||||
pub struct MlpModel {
|
||||
/// Layer 1 weights, row-major `[N_FEATURES × MLP_HIDDEN]`.
|
||||
#[serde(default)]
|
||||
pub w1: Vec<f64>,
|
||||
/// Layer 1 bias, `[MLP_HIDDEN]`.
|
||||
#[serde(default)]
|
||||
pub b1: Vec<f64>,
|
||||
/// Layer 2 weights, row-major `[MLP_HIDDEN × n_classes]`.
|
||||
#[serde(default)]
|
||||
pub w2: Vec<f64>,
|
||||
/// Layer 2 bias, `[n_classes]`.
|
||||
#[serde(default)]
|
||||
pub b2: Vec<f64>,
|
||||
/// Number of output classes (== len(b2) when trained).
|
||||
#[serde(default)]
|
||||
pub n_classes: usize,
|
||||
}
|
||||
|
||||
impl MlpModel {
|
||||
pub fn is_trained(&self) -> bool {
|
||||
!self.w1.is_empty() && self.n_classes > 0 && self.b2.len() == self.n_classes
|
||||
}
|
||||
|
||||
/// Forward pass. Input is already z-score normalised by the caller.
|
||||
/// Returns softmax probabilities of length `n_classes`.
|
||||
pub fn forward(&self, x: &[f64; N_FEATURES]) -> Vec<f64> {
|
||||
// Layer 1: h = ReLU(x · W1 + b1)
|
||||
let mut h = vec![0.0f64; MLP_HIDDEN];
|
||||
for j in 0..MLP_HIDDEN {
|
||||
let mut s = self.b1[j];
|
||||
for i in 0..N_FEATURES {
|
||||
s += x[i] * self.w1[i * MLP_HIDDEN + j];
|
||||
}
|
||||
h[j] = s.max(0.0);
|
||||
}
|
||||
// Layer 2: logits = h · W2 + b2
|
||||
let mut logits = vec![0.0f64; self.n_classes];
|
||||
for c in 0..self.n_classes {
|
||||
let mut s = self.b2[c];
|
||||
for j in 0..MLP_HIDDEN {
|
||||
s += h[j] * self.w2[j * self.n_classes + c];
|
||||
}
|
||||
logits[c] = s;
|
||||
}
|
||||
// Softmax.
|
||||
let m = logits.iter().cloned().fold(f64::NEG_INFINITY, f64::max);
|
||||
let exp_sum: f64 = logits.iter().map(|z| (z - m).exp()).sum();
|
||||
logits.iter().map(|z| (z - m).exp() / exp_sum).collect()
|
||||
}
|
||||
}
|
||||
|
||||
/// ADR-120: Windowed MLP — same architecture as MlpModel but takes a
|
||||
/// 20-frame × 22-feature stack (440-d input) instead of a single frame.
|
||||
/// Captures temporal patterns (walking step cadence, sit-stand cycles,
|
||||
/// breathing modulation) that frame-level classifiers miss.
|
||||
#[derive(Debug, Clone, Serialize, Deserialize, Default)]
|
||||
pub struct WindowedMlpModel {
|
||||
/// Layer 1 weights, row-major `[WINDOWED_INPUT × WINDOWED_HIDDEN]`.
|
||||
#[serde(default)]
|
||||
pub w1: Vec<f64>,
|
||||
/// Layer 1 bias, `[WINDOWED_HIDDEN]`.
|
||||
#[serde(default)]
|
||||
pub b1: Vec<f64>,
|
||||
/// Layer 2 weights, row-major `[WINDOWED_HIDDEN × n_classes]`.
|
||||
#[serde(default)]
|
||||
pub w2: Vec<f64>,
|
||||
/// Layer 2 bias, `[n_classes]`.
|
||||
#[serde(default)]
|
||||
pub b2: Vec<f64>,
|
||||
/// Number of output classes (== len(b2) when trained).
|
||||
#[serde(default)]
|
||||
pub n_classes: usize,
|
||||
}
|
||||
|
||||
impl WindowedMlpModel {
|
||||
pub fn is_trained(&self) -> bool {
|
||||
!self.w1.is_empty()
|
||||
&& self.n_classes > 0
|
||||
&& self.b2.len() == self.n_classes
|
||||
&& self.w1.len() == WINDOWED_INPUT * WINDOWED_HIDDEN
|
||||
}
|
||||
|
||||
/// Forward pass. `window` is `WINDOW_FRAMES × N_FEATURES` flat,
|
||||
/// row-major (oldest-frame-first), already z-score normalised.
|
||||
/// Returns softmax probabilities of length `n_classes`.
|
||||
pub fn forward(&self, window: &[f64]) -> Vec<f64> {
|
||||
debug_assert_eq!(window.len(), WINDOWED_INPUT);
|
||||
// Layer 1: h = ReLU(window · W1 + b1)
|
||||
let mut h = vec![0.0f64; WINDOWED_HIDDEN];
|
||||
for j in 0..WINDOWED_HIDDEN {
|
||||
let mut s = self.b1[j];
|
||||
for i in 0..WINDOWED_INPUT {
|
||||
s += window[i] * self.w1[i * WINDOWED_HIDDEN + j];
|
||||
}
|
||||
h[j] = s.max(0.0);
|
||||
}
|
||||
// Layer 2: logits = h · W2 + b2
|
||||
let mut logits = vec![0.0f64; self.n_classes];
|
||||
for c in 0..self.n_classes {
|
||||
let mut s = self.b2[c];
|
||||
for j in 0..WINDOWED_HIDDEN {
|
||||
s += h[j] * self.w2[j * self.n_classes + c];
|
||||
}
|
||||
logits[c] = s;
|
||||
}
|
||||
let m = logits.iter().cloned().fold(f64::NEG_INFINITY, f64::max);
|
||||
let exp_sum: f64 = logits.iter().map(|z| (z - m).exp()).sum();
|
||||
logits.iter().map(|z| (z - m).exp() / exp_sum).collect()
|
||||
}
|
||||
}
|
||||
|
||||
// ── Trained model ────────────────────────────────────────────────────────────
|
||||
|
||||
#[derive(Debug, Clone, Serialize, Deserialize)]
|
||||
pub struct AdaptiveModel {
|
||||
/// Per-class feature statistics (centroid + spread).
|
||||
pub class_stats: Vec<ClassStats>,
|
||||
/// Logistic regression weights: [n_classes x (N_FEATURES + 1)] (last = bias).
|
||||
/// Dynamic: the outer Vec length equals the number of discovered classes.
|
||||
/// ADR-119: legacy logistic regression weights, kept as fallback.
|
||||
/// Shape: `[n_classes × (N_FEATURES + 1)]` (last column = bias).
|
||||
/// When `mlp.is_trained()` returns true, MLP wins and these are unused
|
||||
/// at classify time but still updated by `train_from_recordings` so
|
||||
/// rollback is one-line.
|
||||
pub weights: Vec<Vec<f64>>,
|
||||
/// ADR-119: trained MLP (frame-level fallback, used when WindowedMlp
|
||||
/// has no data yet — e.g. cold start before 20 frames accumulated).
|
||||
#[serde(default)]
|
||||
pub mlp: MlpModel,
|
||||
/// ADR-120: trained Windowed MLP (preferred classifier when trained
|
||||
/// AND a 20-frame window of fresh features is available at classify
|
||||
/// time). Captures temporal patterns the frame-level MLP can't see.
|
||||
#[serde(default)]
|
||||
pub windowed_mlp: WindowedMlpModel,
|
||||
/// Global feature normalisation: mean and stddev across all training data.
|
||||
pub global_mean: [f64; N_FEATURES],
|
||||
pub global_std: [f64; N_FEATURES],
|
||||
|
|
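Review note: both `MlpModel::forward` and `WindowedMlpModel::forward` above follow the same pattern, a ReLU hidden layer over row-major weights followed by a max-subtracted softmax. A dimension-agnostic sketch of that pattern follows. The function names `softmax` and `mlp_forward` are hypothetical, and the up-front length assertions are an addition of this sketch rather than something the diff does.

```rust
/// Numerically stable softmax: subtracting the maximum logit before
/// exponentiating avoids overflow for large inputs.
fn softmax(logits: &[f64]) -> Vec<f64> {
    let m = logits.iter().cloned().fold(f64::NEG_INFINITY, f64::max);
    let exps: Vec<f64> = logits.iter().map(|z| (z - m).exp()).collect();
    let sum: f64 = exps.iter().sum();
    exps.iter().map(|e| e / sum).collect()
}

/// Single-hidden-layer forward pass.
/// `w1` is row-major [n_in x n_hidden], `w2` is row-major [n_hidden x n_out].
fn mlp_forward(x: &[f64], w1: &[f64], b1: &[f64], w2: &[f64], b2: &[f64]) -> Vec<f64> {
    let (n_in, n_hidden, n_out) = (x.len(), b1.len(), b2.len());
    assert_eq!(w1.len(), n_in * n_hidden, "w1 shape mismatch");
    assert_eq!(w2.len(), n_hidden * n_out, "w2 shape mismatch");

    // Hidden layer: h = ReLU(x * W1 + b1)
    let h: Vec<f64> = (0..n_hidden)
        .map(|j| {
            let s: f64 = b1[j] + (0..n_in).map(|i| x[i] * w1[i * n_hidden + j]).sum::<f64>();
            s.max(0.0)
        })
        .collect();

    // Output layer: logits = h * W2 + b2, then softmax
    let logits: Vec<f64> = (0..n_out)
        .map(|c| b2[c] + (0..n_hidden).map(|j| h[j] * w2[j * n_out + c]).sum::<f64>())
        .collect();
    softmax(&logits)
}
```

One design point worth a look: `WindowedMlpModel::is_trained()` verifies `w1.len()`, while the frame-level `MlpModel::is_trained()` shown here only checks that `w1` is non-empty, so a weight buffer of the wrong size would surface as an index panic inside `forward`.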
@ -153,6 +324,8 @@ impl Default for AdaptiveModel {
|
|||
Self {
|
||||
class_stats: Vec::new(),
|
||||
weights: vec![vec![0.0; N_FEATURES + 1]; n_classes],
|
||||
mlp: MlpModel::default(),
|
||||
windowed_mlp: WindowedMlpModel::default(),
|
||||
global_mean: [0.0; N_FEATURES],
|
||||
global_std: [1.0; N_FEATURES],
|
||||
trained_frames: 0,
|
||||
|
|
@ -164,39 +337,86 @@ impl Default for AdaptiveModel {
|
|||
}
|
||||
|
||||
impl AdaptiveModel {
|
||||
/// Classify a raw feature vector. Returns (class_label, confidence).
|
||||
pub fn classify(&self, raw_features: &[f64; N_FEATURES]) -> (String, f64) {
|
||||
let n_classes = self.weights.len();
|
||||
if n_classes == 0 || self.class_stats.is_empty() {
|
||||
return ("present_still".to_string(), 0.5);
|
||||
/// ADR-120: classify using a temporal window of recent frames.
|
||||
/// `window` is `WINDOW_FRAMES × N_FEATURES` flat row-major (oldest first),
|
||||
/// in raw (un-normalised) units — this fn applies z-score normalisation
|
||||
/// internally using the model's `global_mean`/`global_std`.
|
||||
/// Falls back to frame-level `classify()` on the most recent frame when
|
||||
/// the windowed MLP isn't trained.
|
||||
pub fn classify_window(&self, window: &[f64]) -> (String, f64) {
|
||||
if self.windowed_mlp.is_trained() && window.len() == WINDOWED_INPUT {
|
||||
let mut norm = vec![0.0f64; WINDOWED_INPUT];
|
||||
for f in 0..WINDOW_FRAMES {
|
||||
for i in 0..N_FEATURES {
|
||||
let idx = f * N_FEATURES + i;
|
||||
norm[idx] = (window[idx] - self.global_mean[i]) / (self.global_std[i] + 1e-9);
|
||||
}
|
||||
}
|
||||
let probs = self.windowed_mlp.forward(&norm);
|
||||
let (best_c, best_p) = probs.iter().enumerate()
|
||||
.max_by(|a, b| a.1.partial_cmp(b.1).unwrap())
|
||||
.unwrap();
|
||||
let label = if best_c < self.class_names.len() {
|
||||
self.class_names[best_c].clone()
|
||||
} else {
|
||||
"present_still".to_string()
|
||||
};
|
||||
return (label, *best_p);
|
||||
}
|
||||
// Cold-start fallback: most recent frame via frame-level classifier.
|
||||
let mut last_frame = [0.0f64; N_FEATURES];
|
||||
if window.len() >= N_FEATURES {
|
||||
let off = window.len() - N_FEATURES;
|
||||
last_frame.copy_from_slice(&window[off..off + N_FEATURES]);
|
||||
}
|
||||
self.classify(&last_frame)
|
||||
}
|
||||
|
||||
// Normalise features.
|
||||
/// Classify a raw feature vector. Returns (class_label, confidence).
|
||||
/// ADR-119: prefers MLP when trained; falls back to logistic regression
|
||||
/// otherwise. ADR-120: temporal-context API is `classify_window` —
|
||||
/// prefer it when callers have a recent feature buffer.
|
||||
pub fn classify(&self, raw_features: &[f64; N_FEATURES]) -> (String, f64) {
|
||||
// Normalise features once (shared by MLP and LogReg).
|
||||
let mut x = [0.0f64; N_FEATURES];
|
||||
for i in 0..N_FEATURES {
|
||||
x[i] = (raw_features[i] - self.global_mean[i]) / (self.global_std[i] + 1e-9);
|
||||
}
|
||||
|
||||
// Compute logits: w·x + b for each class.
|
||||
// ADR-119: MLP path (preferred when trained).
|
||||
if self.mlp.is_trained() {
|
||||
let probs = self.mlp.forward(&x);
|
||||
let (best_c, best_p) = probs.iter().enumerate()
|
||||
.max_by(|a, b| a.1.partial_cmp(b.1).unwrap())
|
||||
.unwrap();
|
||||
let label = if best_c < self.class_names.len() {
|
||||
self.class_names[best_c].clone()
|
||||
} else {
|
||||
"present_still".to_string()
|
||||
};
|
||||
return (label, *best_p);
|
||||
}
|
||||
|
||||
// Legacy logistic regression fallback.
|
||||
let n_classes = self.weights.len();
|
||||
if n_classes == 0 || self.class_stats.is_empty() {
|
||||
return ("present_still".to_string(), 0.5);
|
||||
}
|
||||
let mut logits: Vec<f64> = vec![0.0; n_classes];
|
||||
for c in 0..n_classes {
|
||||
let w = &self.weights[c];
|
||||
let mut z = w[N_FEATURES]; // bias
|
||||
let mut z = w[N_FEATURES];
|
||||
for i in 0..N_FEATURES {
|
||||
z += w[i] * x[i];
|
||||
}
|
||||
logits[c] = z;
|
||||
}
|
||||
|
||||
// Softmax.
|
||||
let max_logit = logits.iter().cloned().fold(f64::NEG_INFINITY, f64::max);
|
||||
let exp_sum: f64 = logits.iter().map(|z| (z - max_logit).exp()).sum();
|
||||
let mut probs: Vec<f64> = vec![0.0; n_classes];
|
||||
for c in 0..n_classes {
|
||||
probs[c] = ((logits[c] - max_logit).exp()) / exp_sum;
|
||||
}
|
||||
|
||||
// Pick argmax.
|
||||
let (best_c, best_p) = probs.iter().enumerate()
|
||||
.max_by(|a, b| a.1.partial_cmp(b.1).unwrap())
|
||||
.unwrap();
|
||||
|
|
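Review note: `classify_window` above z-scores a flat, oldest-first window with the per-feature statistics and, when the windowed model is unavailable, falls back to the most recent frame. The sketch below isolates those two steps under the same row-major assumption. `normalise_window` and `last_frame` are hypothetical helper names, and returning `Option` on a shape mismatch is a choice of this sketch.

```rust
/// Z-score a flat row-major window [n_frames x n_features].
/// Element `idx` belongs to feature `idx % n_features`, so the same
/// per-feature mean and stddev are reused for every frame.
fn normalise_window(window: &[f64], mean: &[f64], stddev: &[f64]) -> Option<Vec<f64>> {
    let n = mean.len();
    if n == 0 || stddev.len() != n || window.len() % n != 0 {
        return None;
    }
    Some(
        window
            .iter()
            .enumerate()
            .map(|(idx, v)| {
                let i = idx % n;
                (v - mean[i]) / (stddev[i] + 1e-9)
            })
            .collect(),
    )
}

/// Most recent frame of an oldest-first window (the cold-start fallback input).
fn last_frame(window: &[f64], n_features: usize) -> Option<&[f64]> {
    if n_features == 0 || window.len() < n_features {
        return None;
    }
    Some(&window[window.len() - n_features..])
}
```

Because the fallback passes raw values on to the frame-level classifier, the window handed to `classify_window` should stay un-normalised, as the doc comment in the diff states.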
@ -226,6 +446,7 @@ impl AdaptiveModel {
|
|||
// ── Training ─────────────────────────────────────────────────────────────────
|
||||
|
||||
/// A labeled training sample.
|
||||
#[derive(Clone)]
|
||||
struct Sample {
|
||||
features: [f64; N_FEATURES],
|
||||
class_idx: usize,
|
||||
|
|
@ -314,13 +535,18 @@ pub fn train_from_recordings(recordings_dir: &Path) -> Result<AdaptiveModel, Str
|
|||
}
|
||||
|
||||
// Second pass: load recordings with the discovered class indices.
|
||||
// ADR-120: keep recordings grouped so windowed-MLP training can slide
|
||||
// a temporal window WITHIN each recording (not across recording
|
||||
// boundaries — would mix classes).
|
||||
let mut samples: Vec<Sample> = Vec::new();
|
||||
let mut recording_groups: Vec<Vec<Sample>> = Vec::new();
|
||||
for (path, fname, class_name) in &file_classes {
|
||||
let class_idx = class_map[class_name];
|
||||
let loaded = load_recording(path, class_idx);
|
||||
eprintln!(" Loaded {}: {} frames → class '{}'",
|
||||
fname, loaded.len(), class_name);
|
||||
samples.extend(loaded);
|
||||
samples.extend(loaded.clone());
|
||||
recording_groups.push(loaded);
|
||||
}
|
||||
|
||||
if samples.is_empty() {
|
||||
|
|
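Review note: the ADR-120 comment above says temporal windows slide within each recording and never across a recording boundary. A small sketch of how the window start offsets could be enumerated for one recording follows. The helper `window_starts` is a hypothetical name, with `window` and `stride` standing in for `WINDOW_FRAMES` and the stride chosen at training time.

```rust
/// Start offsets of every full window inside a single recording of `len` frames.
/// A recording shorter than `window` yields no windows, so short recordings
/// contribute nothing rather than being padded or merged with a neighbour.
fn window_starts(len: usize, window: usize, stride: usize) -> Vec<usize> {
    if window == 0 || stride == 0 || len < window {
        return Vec::new();
    }
    (0..=len - window).step_by(stride).collect()
}
```

For a recording of `len` frames this gives `(len - window) / stride + 1` windows, so a 100-frame recording with a 20-frame window and stride 5 yields 17 training windows, all carrying that recording's class label.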
@ -499,22 +725,428 @@ pub fn train_from_recordings(recordings_dir: &Path) -> Result<AdaptiveModel, Str
|
|||
}
|
||||
for c in 0..n_classes {
|
||||
let tot = class_total[c].max(1);
|
||||
eprintln!(" {}: {}/{} ({:.0}%)", class_names[c], class_correct[c], tot,
|
||||
eprintln!(" LogReg {}: {}/{} ({:.0}%)", class_names[c], class_correct[c], tot,
|
||||
class_correct[c] as f64 / tot as f64 * 100.0);
|
||||
}
|
||||
|
||||
// ── ADR-119: train MLP on the same normalised samples ──
|
||||
eprintln!("Training MLP (22 → {} → {}) ...", MLP_HIDDEN, n_classes);
|
||||
let mlp = train_mlp_classifier(&norm_samples, n_classes);
|
||||
let (mlp_acc, mlp_per_class) = eval_mlp(&mlp, &norm_samples, n_classes);
|
||||
eprintln!("MLP accuracy: {:.2}% (LogReg was {:.2}%)",
|
||||
mlp_acc * 100.0, accuracy * 100.0);
|
||||
for c in 0..n_classes {
|
||||
let tot = class_total[c].max(1);
|
||||
let corr = mlp_per_class[c];
|
||||
eprintln!(" MLP {}: {}/{} ({:.0}%)",
|
||||
class_names[c], corr, tot, corr as f64 / tot as f64 * 100.0);
|
||||
}
|
||||
|
||||
// ── ADR-120: Windowed MLP training ──
|
||||
// Build temporal-window samples within each recording (no cross-recording
|
||||
// mixing). Slide window of WINDOW_FRAMES with stride to balance class
|
||||
// count vs sample count.
|
||||
eprintln!("Building temporal windows ({} frames × {} features → {} dims)...",
|
||||
WINDOW_FRAMES, N_FEATURES, WINDOWED_INPUT);
|
||||
let window_stride = 5usize; // 4× overlap; ~28k windows total on 151k frames
|
||||
let mut win_samples: Vec<(Vec<f64>, usize)> = Vec::new();
|
||||
for group in &recording_groups {
|
||||
if group.len() < WINDOW_FRAMES { continue; }
|
||||
let class_idx = group[0].class_idx;
|
||||
let mut start = 0usize;
|
||||
while start + WINDOW_FRAMES <= group.len() {
|
||||
let mut flat: Vec<f64> = Vec::with_capacity(WINDOWED_INPUT);
|
||||
for f in 0..WINDOW_FRAMES {
|
||||
let frame = &group[start + f];
|
||||
for i in 0..N_FEATURES {
|
||||
let z = (frame.features[i] - global_mean[i]) / (global_std[i] + 1e-9);
|
||||
flat.push(z);
|
||||
}
|
||||
}
|
||||
win_samples.push((flat, class_idx));
|
||||
start += window_stride;
|
||||
}
|
||||
}
|
||||
eprintln!("Total windowed samples: {}", win_samples.len());
|
||||
|
||||
// Count per-class windowed samples.
|
||||
let mut win_class_total = vec![0usize; n_classes];
|
||||
for (_, c) in &win_samples { win_class_total[*c] += 1; }
|
||||
|
||||
eprintln!("Training Windowed MLP ({} → {} → {}) ...", WINDOWED_INPUT, WINDOWED_HIDDEN, n_classes);
|
||||
let windowed_mlp = train_windowed_mlp_classifier(&win_samples, n_classes);
|
||||
let (win_acc, win_per_class) = eval_windowed_mlp(&windowed_mlp, &win_samples, n_classes);
|
||||
eprintln!("Windowed MLP accuracy: {:.2}% (frame-level MLP was {:.2}%)",
|
||||
win_acc * 100.0, mlp_acc * 100.0);
|
||||
for c in 0..n_classes {
|
||||
let tot = win_class_total[c].max(1);
|
||||
let corr = win_per_class[c];
|
||||
eprintln!(" W-MLP {}: {}/{} ({:.0}%)",
|
||||
class_names[c], corr, tot, corr as f64 / tot as f64 * 100.0);
|
||||
}
|
||||
|
||||
// Pick the best classifier as final accuracy number.
|
||||
let final_accuracy = win_acc.max(mlp_acc).max(accuracy);
|
||||
|
||||
Ok(AdaptiveModel {
|
||||
class_stats,
|
||||
weights,
|
||||
mlp,
|
||||
windowed_mlp,
|
||||
global_mean,
|
||||
global_std,
|
||||
trained_frames: n,
|
||||
training_accuracy: accuracy,
|
||||
training_accuracy: final_accuracy,
|
||||
version: 1,
|
||||
class_names,
|
||||
})
|
||||
}
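The window-building loop above can be read in isolation. The following is a minimal sketch, assuming each frame is a plain `Vec<f64>` feature row; `build_windows` and its parameter names are hypothetical and not part of the crate.

```rust
/// Hypothetical helper: slide a `window`-frame view over `frames` with the
/// given `stride`, z-score each feature, and flatten every window into one
/// `window * n_features` vector (frame-major, feature-minor).
fn build_windows(
    frames: &[Vec<f64>],
    mean: &[f64],
    std: &[f64],
    window: usize,
    stride: usize,
) -> Vec<Vec<f64>> {
    let mut out = Vec::new();
    // Degenerate inputs yield no windows instead of looping forever.
    if window == 0 || stride == 0 || frames.len() < window {
        return out;
    }
    let mut start = 0;
    while start + window <= frames.len() {
        let mut flat = Vec::with_capacity(window * mean.len());
        for frame in &frames[start..start + window] {
            for i in 0..mean.len() {
                // Same epsilon guard as the training code above.
                flat.push((frame[i] - mean[i]) / (std[i] + 1e-9));
            }
        }
        out.push(flat);
        start += stride;
    }
    out
}
```

With a 20-frame window and a stride of 5, consecutive windows share 15 frames, so each interior frame contributes to four windows.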
|
||||
|
||||
// ── ADR-119: MLP training (manual backprop, no external ML crate) ────────────
|
||||
|
||||
/// Train a single-hidden-layer MLP on already-z-score-normalised samples.
|
||||
/// Architecture: N_FEATURES → MLP_HIDDEN → n_classes (ReLU + softmax).
|
||||
/// Optimiser: SGD + momentum 0.9 + weight decay 1e-4 + cosine LR decay.
|
||||
fn train_mlp_classifier(samples: &[([f64; N_FEATURES], usize)], n_classes: usize) -> MlpModel {
|
||||
let n_w1 = N_FEATURES * MLP_HIDDEN;
|
||||
let n_w2 = MLP_HIDDEN * n_classes;
|
||||
|
||||
// He initialisation: w ~ N(0, σ²) with σ = sqrt(2/fan_in), sampled via Box-Muller.
|
||||
let mut rng_state: u64 = 1337;
|
||||
let mut rng_u01 = move || -> f64 {
|
||||
rng_state = rng_state.wrapping_mul(6364136223846793005).wrapping_add(1442695040888963407);
|
||||
((rng_state >> 33) as f64) / ((u64::MAX >> 33) as f64)
|
||||
};
|
||||
let mut he_init = |n: usize, fan_in: usize| -> Vec<f64> {
|
||||
let s = (2.0 / fan_in as f64).sqrt();
|
||||
let mut v = Vec::with_capacity(n);
|
||||
let mut k = 0;
|
||||
while k < n {
|
||||
let u1 = rng_u01().max(1e-12);
|
||||
let u2 = rng_u01();
|
||||
let z0 = (-2.0 * u1.ln()).sqrt() * (2.0 * std::f64::consts::PI * u2).cos() * s;
|
||||
let z1 = (-2.0 * u1.ln()).sqrt() * (2.0 * std::f64::consts::PI * u2).sin() * s;
|
||||
v.push(z0);
|
||||
k += 1;
|
||||
if k < n { v.push(z1); k += 1; }
|
||||
}
|
||||
v
|
||||
};
|
||||
|
||||
let mut w1 = he_init(n_w1, N_FEATURES);
|
||||
let mut b1 = vec![0.0f64; MLP_HIDDEN];
|
||||
let mut w2 = he_init(n_w2, MLP_HIDDEN);
|
||||
let mut b2 = vec![0.0f64; n_classes];
|
||||
|
||||
let mut mw1 = vec![0.0f64; n_w1];
|
||||
let mut mb1 = vec![0.0f64; MLP_HIDDEN];
|
||||
let mut mw2 = vec![0.0f64; n_w2];
|
||||
let mut mb2 = vec![0.0f64; n_classes];
|
||||
|
||||
let momentum = 0.9f64;
|
||||
let weight_decay = 1e-4f64;
|
||||
let base_lr = 0.05f64;
|
||||
let batch_size = 64usize;
|
||||
let epochs = 30usize;
|
||||
let n = samples.len();
|
||||
|
||||
// Shuffle index buffer (avoid cloning sample arrays).
|
||||
let mut idx: Vec<usize> = (0..n).collect();
|
||||
let mut shuf_state: u64 = 7;
|
||||
let mut shuf_next = move || -> u64 {
|
||||
shuf_state = shuf_state.wrapping_mul(6364136223846793005).wrapping_add(1442695040888963407);
|
||||
shuf_state >> 33
|
||||
};
|
||||
|
||||
for epoch in 0..epochs {
|
||||
for i in (1..idx.len()).rev() {
|
||||
let j = (shuf_next() as usize) % (i + 1);
|
||||
idx.swap(i, j);
|
||||
}
|
||||
|
||||
let lr = base_lr * 0.5 * (1.0 + (std::f64::consts::PI * epoch as f64 / epochs as f64).cos());
|
||||
let mut epoch_loss = 0.0f64;
|
||||
let mut h_pre = vec![0.0f64; MLP_HIDDEN];
|
||||
let mut h = vec![0.0f64; MLP_HIDDEN];
|
||||
let mut logits = vec![0.0f64; n_classes];
|
||||
|
||||
let mut k = 0usize;
|
||||
while k < n {
|
||||
let bend = (k + batch_size).min(n);
|
||||
let mut gw1 = vec![0.0f64; n_w1];
|
||||
let mut gb1 = vec![0.0f64; MLP_HIDDEN];
|
||||
let mut gw2 = vec![0.0f64; n_w2];
|
||||
let mut gb2 = vec![0.0f64; n_classes];
|
||||
let bs = (bend - k) as f64;
|
||||
|
||||
for &si in &idx[k..bend] {
|
||||
let (x, target) = &samples[si];
|
||||
|
||||
// Forward.
|
||||
for j in 0..MLP_HIDDEN {
|
||||
let mut s = b1[j];
|
||||
for i in 0..N_FEATURES { s += x[i] * w1[i * MLP_HIDDEN + j]; }
|
||||
h_pre[j] = s;
|
||||
h[j] = s.max(0.0);
|
||||
}
|
||||
for c in 0..n_classes {
|
||||
let mut s = b2[c];
|
||||
for j in 0..MLP_HIDDEN { s += h[j] * w2[j * n_classes + c]; }
|
||||
logits[c] = s;
|
||||
}
|
||||
let mx = logits.iter().cloned().fold(f64::NEG_INFINITY, f64::max);
|
||||
let ex_sum: f64 = logits.iter().map(|z| (z - mx).exp()).sum();
|
||||
// d_logits = softmax - one_hot
|
||||
let mut d_logits = vec![0.0f64; n_classes];
|
||||
for c in 0..n_classes {
|
||||
let p = (logits[c] - mx).exp() / ex_sum;
|
||||
d_logits[c] = p - if c == *target { 1.0 } else { 0.0 };
|
||||
if c == *target { epoch_loss += -(p.max(1e-15)).ln(); }
|
||||
}
|
||||
|
||||
// Gradients.
|
||||
for c in 0..n_classes {
|
||||
gb2[c] += d_logits[c];
|
||||
for j in 0..MLP_HIDDEN {
|
||||
gw2[j * n_classes + c] += h[j] * d_logits[c];
|
||||
}
|
||||
}
|
||||
// Backprop through Layer-2 to hidden.
|
||||
let mut d_h = [0.0f64; MLP_HIDDEN];
|
||||
for j in 0..MLP_HIDDEN {
|
||||
if h_pre[j] <= 0.0 { continue; }
|
||||
let mut s = 0.0;
|
||||
for c in 0..n_classes { s += w2[j * n_classes + c] * d_logits[c]; }
|
||||
d_h[j] = s;
|
||||
}
|
||||
for j in 0..MLP_HIDDEN {
|
||||
gb1[j] += d_h[j];
|
||||
for i in 0..N_FEATURES { gw1[i * MLP_HIDDEN + j] += x[i] * d_h[j]; }
|
||||
}
|
||||
}
|
||||
|
||||
// SGD + momentum + weight decay.
|
||||
for q in 0..n_w1 {
|
||||
let g = gw1[q] / bs + weight_decay * w1[q];
|
||||
mw1[q] = momentum * mw1[q] + g;
|
||||
w1[q] -= lr * mw1[q];
|
||||
}
|
||||
for q in 0..MLP_HIDDEN {
|
||||
let g = gb1[q] / bs;
|
||||
mb1[q] = momentum * mb1[q] + g;
|
||||
b1[q] -= lr * mb1[q];
|
||||
}
|
||||
for q in 0..n_w2 {
|
||||
let g = gw2[q] / bs + weight_decay * w2[q];
|
||||
mw2[q] = momentum * mw2[q] + g;
|
||||
w2[q] -= lr * mw2[q];
|
||||
}
|
||||
for q in 0..n_classes {
|
||||
let g = gb2[q] / bs;
|
||||
mb2[q] = momentum * mb2[q] + g;
|
||||
b2[q] -= lr * mb2[q];
|
||||
}
|
||||
|
||||
k = bend;
|
||||
}
|
||||
if epoch % 5 == 0 || epoch == epochs - 1 {
|
||||
eprintln!(" MLP epoch {epoch:2}/{}: loss = {:.4}, lr = {:.4}",
|
||||
epochs, epoch_loss / n as f64, lr);
|
||||
}
|
||||
}
|
||||
|
||||
MlpModel { w1, b1, w2, b2, n_classes }
|
||||
}
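The optimiser recipe used in `train_mlp_classifier` (cosine learning-rate decay plus SGD with momentum and weight decay) can be sketched as two small functions. This is an illustrative extraction under the assumption that gradients arrive summed over the mini-batch; `cosine_lr` and `sgd_momentum_step` are hypothetical names.

```rust
/// Cosine decay: starts at `base_lr` for epoch 0 and approaches 0 as
/// `epoch` approaches `epochs` (it stays positive for epoch < epochs).
fn cosine_lr(base_lr: f64, epoch: usize, epochs: usize) -> f64 {
    let progress = epoch as f64 / epochs as f64;
    base_lr * 0.5 * (1.0 + (std::f64::consts::PI * progress).cos())
}

/// One SGD step with momentum and L2 weight decay.
/// `grad_sum` holds gradients summed (not averaged) over the mini-batch,
/// so they are divided by `batch_len` here.
fn sgd_momentum_step(
    w: &mut [f64],
    vel: &mut [f64],
    grad_sum: &[f64],
    batch_len: f64,
    lr: f64,
    momentum: f64,
    weight_decay: f64,
) {
    for q in 0..w.len() {
        let g = grad_sum[q] / batch_len + weight_decay * w[q];
        vel[q] = momentum * vel[q] + g;
        w[q] -= lr * vel[q];
    }
}
```

The training loop above applies weight decay to the weight matrices only; passing `weight_decay = 0.0` for the bias vectors reproduces that choice.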
|
||||
|
||||
/// Evaluate MLP accuracy and per-class correct counts on normalised samples.
|
||||
fn eval_mlp(mlp: &MlpModel, samples: &[([f64; N_FEATURES], usize)], n_classes: usize)
|
||||
-> (f64, Vec<usize>)
|
||||
{
|
||||
let mut correct = 0usize;
|
||||
let mut per_class = vec![0usize; n_classes];
|
||||
for (x, target) in samples {
|
||||
let probs = mlp.forward(x);
|
||||
let pred = probs.iter().enumerate()
|
||||
.max_by(|a, b| a.1.partial_cmp(b.1).unwrap())
|
||||
.unwrap().0;
|
||||
if pred == *target { correct += 1; per_class[*target] += 1; }
|
||||
}
|
||||
(correct as f64 / samples.len() as f64, per_class)
|
||||
}
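The argmax in `eval_mlp` unwraps `partial_cmp`, which panics if any probability is NaN, and unwraps `max_by`, which panics on an empty slice. A possible NaN-tolerant variant is sketched below; `argmax` is a hypothetical helper, and skipping NaN entries is an assumption about the desired behaviour rather than something the source specifies.

```rust
/// Hypothetical helper: index of the largest non-NaN value, or `None`
/// when the slice is empty or contains only NaN.
fn argmax(probs: &[f64]) -> Option<usize> {
    probs
        .iter()
        .enumerate()
        .filter(|(_, p)| !p.is_nan())
        // `total_cmp` is a total order on f64, so no unwrap is needed.
        .max_by(|a, b| a.1.total_cmp(b.1))
        .map(|(i, _)| i)
}
```

A caller would then treat `None` as a misclassification instead of aborting the evaluation pass.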
|
||||
|
||||
// ── ADR-120: Windowed MLP training ──────────────────────────────────────────
|
||||
|
||||
/// Train a windowed MLP on temporal-window samples.
|
||||
/// Each sample is a 440-d flat vector (20 frames × 22 features) labeled
|
||||
/// with a class index. Architecture: 440 → 64 ReLU → n_classes softmax.
|
||||
/// Same SGD + momentum + cosine-decay recipe as MLP, fewer epochs because
|
||||
/// each window is a richer training signal than a single frame.
|
||||
fn train_windowed_mlp_classifier(
|
||||
samples: &[(Vec<f64>, usize)],
|
||||
n_classes: usize,
|
||||
) -> WindowedMlpModel {
|
||||
let n_w1 = WINDOWED_INPUT * WINDOWED_HIDDEN;
|
||||
let n_w2 = WINDOWED_HIDDEN * n_classes;
|
||||
|
||||
let mut rng_state: u64 = 24601;
|
||||
let mut rng_u01 = move || -> f64 {
|
||||
rng_state = rng_state.wrapping_mul(6364136223846793005).wrapping_add(1442695040888963407);
|
||||
((rng_state >> 33) as f64) / ((u64::MAX >> 33) as f64)
|
||||
};
|
||||
let mut he_init = |n: usize, fan_in: usize| -> Vec<f64> {
|
||||
let s = (2.0 / fan_in as f64).sqrt();
|
||||
let mut v = Vec::with_capacity(n);
|
||||
let mut k = 0;
|
||||
while k < n {
|
||||
let u1 = rng_u01().max(1e-12);
|
||||
let u2 = rng_u01();
|
||||
let z0 = (-2.0 * u1.ln()).sqrt() * (2.0 * std::f64::consts::PI * u2).cos() * s;
|
||||
let z1 = (-2.0 * u1.ln()).sqrt() * (2.0 * std::f64::consts::PI * u2).sin() * s;
|
||||
v.push(z0); k += 1;
|
||||
if k < n { v.push(z1); k += 1; }
|
||||
}
|
||||
v
|
||||
};
|
||||
|
||||
let mut w1 = he_init(n_w1, WINDOWED_INPUT);
|
||||
let mut b1 = vec![0.0f64; WINDOWED_HIDDEN];
|
||||
let mut w2 = he_init(n_w2, WINDOWED_HIDDEN);
|
||||
let mut b2 = vec![0.0f64; n_classes];
|
||||
|
||||
let mut mw1 = vec![0.0f64; n_w1];
|
||||
let mut mb1 = vec![0.0f64; WINDOWED_HIDDEN];
|
||||
let mut mw2 = vec![0.0f64; n_w2];
|
||||
let mut mb2 = vec![0.0f64; n_classes];
|
||||
|
||||
let momentum = 0.9f64;
|
||||
let weight_decay = 1e-4f64;
|
||||
let base_lr = 0.03f64; // smaller LR for larger network (vs MLP's 0.05)
|
||||
let batch_size = 32usize;
|
||||
let epochs = 25usize;
|
||||
let n = samples.len();
|
||||
|
||||
let mut idx: Vec<usize> = (0..n).collect();
|
||||
let mut shuf_state: u64 = 11;
|
||||
let mut shuf_next = move || -> u64 {
|
||||
shuf_state = shuf_state.wrapping_mul(6364136223846793005).wrapping_add(1442695040888963407);
|
||||
shuf_state >> 33
|
||||
};
|
||||
|
||||
let mut h_pre = vec![0.0f64; WINDOWED_HIDDEN];
|
||||
let mut h = vec![0.0f64; WINDOWED_HIDDEN];
|
||||
let mut logits = vec![0.0f64; n_classes];
|
||||
|
||||
for epoch in 0..epochs {
|
||||
for i in (1..idx.len()).rev() {
|
||||
let j = (shuf_next() as usize) % (i + 1);
|
||||
idx.swap(i, j);
|
||||
}
|
||||
let lr = base_lr * 0.5 * (1.0 + (std::f64::consts::PI * epoch as f64 / epochs as f64).cos());
|
||||
let mut epoch_loss = 0.0f64;
|
||||
|
||||
let mut k = 0usize;
|
||||
while k < n {
|
||||
let bend = (k + batch_size).min(n);
|
||||
let mut gw1 = vec![0.0f64; n_w1];
|
||||
let mut gb1 = vec![0.0f64; WINDOWED_HIDDEN];
|
||||
let mut gw2 = vec![0.0f64; n_w2];
|
||||
let mut gb2 = vec![0.0f64; n_classes];
|
||||
let bs = (bend - k) as f64;
|
||||
|
||||
for &si in &idx[k..bend] {
|
||||
let (x, target) = &samples[si];
|
||||
debug_assert_eq!(x.len(), WINDOWED_INPUT);
|
||||
|
||||
// Forward.
|
||||
for j in 0..WINDOWED_HIDDEN {
|
||||
let mut s = b1[j];
|
||||
for i in 0..WINDOWED_INPUT { s += x[i] * w1[i * WINDOWED_HIDDEN + j]; }
|
||||
h_pre[j] = s;
|
||||
h[j] = s.max(0.0);
|
||||
}
|
||||
for c in 0..n_classes {
|
||||
let mut s = b2[c];
|
||||
for j in 0..WINDOWED_HIDDEN { s += h[j] * w2[j * n_classes + c]; }
|
||||
logits[c] = s;
|
||||
}
|
||||
let mx = logits.iter().cloned().fold(f64::NEG_INFINITY, f64::max);
|
||||
let ex_sum: f64 = logits.iter().map(|z| (z - mx).exp()).sum();
|
||||
let mut d_logits = vec![0.0f64; n_classes];
|
||||
for c in 0..n_classes {
|
||||
let p = (logits[c] - mx).exp() / ex_sum;
|
||||
d_logits[c] = p - if c == *target { 1.0 } else { 0.0 };
|
||||
if c == *target { epoch_loss += -(p.max(1e-15)).ln(); }
|
||||
}
|
||||
|
||||
for c in 0..n_classes {
|
||||
gb2[c] += d_logits[c];
|
||||
for j in 0..WINDOWED_HIDDEN {
|
||||
gw2[j * n_classes + c] += h[j] * d_logits[c];
|
||||
}
|
||||
}
|
||||
let mut d_h = vec![0.0f64; WINDOWED_HIDDEN];
|
||||
for j in 0..WINDOWED_HIDDEN {
|
||||
if h_pre[j] <= 0.0 { continue; }
|
||||
let mut s = 0.0;
|
||||
for c in 0..n_classes { s += w2[j * n_classes + c] * d_logits[c]; }
|
||||
d_h[j] = s;
|
||||
}
|
||||
for j in 0..WINDOWED_HIDDEN {
|
||||
gb1[j] += d_h[j];
|
||||
for i in 0..WINDOWED_INPUT { gw1[i * WINDOWED_HIDDEN + j] += x[i] * d_h[j]; }
|
||||
}
|
||||
}
|
||||
|
||||
for q in 0..n_w1 {
|
||||
let g = gw1[q] / bs + weight_decay * w1[q];
|
||||
mw1[q] = momentum * mw1[q] + g;
|
||||
w1[q] -= lr * mw1[q];
|
||||
}
|
||||
for q in 0..WINDOWED_HIDDEN {
|
||||
let g = gb1[q] / bs;
|
||||
mb1[q] = momentum * mb1[q] + g;
|
||||
b1[q] -= lr * mb1[q];
|
||||
}
|
||||
for q in 0..n_w2 {
|
||||
let g = gw2[q] / bs + weight_decay * w2[q];
|
||||
mw2[q] = momentum * mw2[q] + g;
|
||||
w2[q] -= lr * mw2[q];
|
||||
}
|
||||
for q in 0..n_classes {
|
||||
let g = gb2[q] / bs;
|
||||
mb2[q] = momentum * mb2[q] + g;
|
||||
b2[q] -= lr * mb2[q];
|
||||
}
|
||||
|
||||
k = bend;
|
||||
}
|
||||
if epoch % 3 == 0 || epoch == epochs - 1 {
|
||||
eprintln!(" W-MLP epoch {epoch:2}/{}: loss = {:.4}, lr = {:.4}",
|
||||
epochs, epoch_loss / n as f64, lr);
|
||||
}
|
||||
}
|
||||
|
||||
WindowedMlpModel { w1, b1, w2, b2, n_classes }
|
||||
}
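Both training functions seed their weights with the same construction: a 64-bit LCG feeding a Box-Muller transform, scaled to the He standard deviation. A standalone sketch follows, assuming the closure pair is refactored into a small struct; `Lcg` and this free-standing `he_init` are hypothetical names, while the multiplier and increment are the constants from the code above.

```rust
/// Hypothetical wrapper around the LCG state used in the training code.
struct Lcg(u64);

impl Lcg {
    /// Uniform sample in [0, 1] built from the top 31 bits of the state.
    fn next_u01(&mut self) -> f64 {
        self.0 = self
            .0
            .wrapping_mul(6364136223846793005)
            .wrapping_add(1442695040888963407);
        ((self.0 >> 33) as f64) / ((u64::MAX >> 33) as f64)
    }
}

/// He initialisation: `n` samples from N(0, sigma^2), sigma = sqrt(2 / fan_in).
fn he_init(rng: &mut Lcg, n: usize, fan_in: usize) -> Vec<f64> {
    let sigma = (2.0 / fan_in as f64).sqrt();
    let mut v = Vec::with_capacity(n);
    while v.len() < n {
        // Clamp u1 away from 0 so ln(u1) stays finite.
        let u1 = rng.next_u01().max(1e-12);
        let u2 = rng.next_u01();
        let r = (-2.0 * u1.ln()).sqrt();
        let theta = 2.0 * std::f64::consts::PI * u2;
        // Box-Muller yields two independent normals per uniform pair.
        v.push(r * theta.cos() * sigma);
        if v.len() < n {
            v.push(r * theta.sin() * sigma);
        }
    }
    v
}
```

Because the generator is seeded with a fixed constant, repeated training runs on the same data start from identical weights.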
|
||||
|
||||
/// Evaluate Windowed MLP accuracy + per-class correct counts.
|
||||
fn eval_windowed_mlp(
|
||||
mlp: &WindowedMlpModel,
|
||||
samples: &[(Vec<f64>, usize)],
|
||||
n_classes: usize,
|
||||
) -> (f64, Vec<usize>) {
|
||||
let mut correct = 0usize;
|
||||
let mut per_class = vec![0usize; n_classes];
|
||||
for (x, target) in samples {
|
||||
let probs = mlp.forward(x);
|
||||
let pred = probs.iter().enumerate()
|
||||
.max_by(|a, b| a.1.partial_cmp(b.1).unwrap())
|
||||
.unwrap().0;
|
||||
if pred == *target { correct += 1; per_class[*target] += 1; }
|
||||
}
|
||||
(correct as f64 / samples.len() as f64, per_class)
|
||||
}
|
||||
|
||||
/// Default path for the saved adaptive model.
|
||||
pub fn model_path() -> PathBuf {
|
||||
PathBuf::from("data/adaptive_model.json")
|
||||
|
|
|
|||
|
|
@@ -10,6 +10,68 @@ use crate::vital_signs::VitalSigns;
|
|||
|
||||
// ── ESP32 UDP frame parsers ─────────────────────────────────────────────────
|
||||
|
||||
/// Parse a 60-byte ADR-081 feature_state packet (magic 0xC511_0006).
|
||||
///
|
||||
/// Converts the on-wire rv_feature_state_t into an Esp32VitalsPacket so the
|
||||
/// existing vitals processing pipeline can consume it directly. Mapping:
|
||||
/// motion_score → motion_energy (and motion flag if > 0.05)
|
||||
/// presence_score → presence_score (and presence flag if > 0.5 and quality bit 0 is set)
|
||||
/// respiration_bpm → breathing_rate_bpm
|
||||
/// heartbeat_bpm → heartrate_bpm
|
||||
/// quality_flags → bit 0 gates the presence flag, bit 3 sets fall_detected
|
||||
pub fn parse_rv_feature_state(buf: &[u8]) -> Option<Esp32VitalsPacket> {
|
||||
if buf.len() < 60 { return None; }
|
||||
let magic = u32::from_le_bytes([buf[0], buf[1], buf[2], buf[3]]);
|
||||
if magic != 0xC511_0006 { return None; }
|
||||
|
||||
let node_id = buf[4];
|
||||
let _mode = buf[5];
|
||||
let _seq = u16::from_le_bytes([buf[6], buf[7]]);
|
||||
let ts_us = u64::from_le_bytes([
|
||||
buf[8], buf[9], buf[10], buf[11], buf[12], buf[13], buf[14], buf[15],
|
||||
]);
|
||||
let motion_score = f32::from_le_bytes([buf[16], buf[17], buf[18], buf[19]]);
|
||||
let presence_score = f32::from_le_bytes([buf[20], buf[21], buf[22], buf[23]]);
|
||||
let respiration_bpm = f32::from_le_bytes([buf[24], buf[25], buf[26], buf[27]]);
|
||||
let _respiration_conf = f32::from_le_bytes([buf[28], buf[29], buf[30], buf[31]]);
|
||||
let heartbeat_bpm = f32::from_le_bytes([buf[32], buf[33], buf[34], buf[35]]);
|
||||
let _heartbeat_conf = f32::from_le_bytes([buf[36], buf[37], buf[38], buf[39]]);
|
||||
let _anomaly_score = f32::from_le_bytes([buf[40], buf[41], buf[42], buf[43]]);
|
||||
let _env_shift_score = f32::from_le_bytes([buf[44], buf[45], buf[46], buf[47]]);
|
||||
let _node_coherence = f32::from_le_bytes([buf[48], buf[49], buf[50], buf[51]]);
|
||||
let quality_flags = u16::from_le_bytes([buf[52], buf[53]]);
|
||||
// ADR-100 D3: FW ships median RSSI in byte 54 (was `reserved`); 0 means
|
||||
// "not yet measured" → keep the historical -50 fallback so the UI's
|
||||
// RSSI trace isn't pinned at a misleading 0 dBm. Stays in sync with
|
||||
// the duplicate parser in main.rs (must remain identical).
|
||||
let rssi_byte = buf[54] as i8;
|
||||
let rssi: i8 = if rssi_byte == 0 { -50 } else { rssi_byte };
|
||||
|
||||
// Bit 0 of quality_flags = presence valid
|
||||
let presence_valid = (quality_flags & (1 << 0)) != 0;
|
||||
let presence = presence_valid && presence_score > 0.5;
|
||||
// Bit 3 = anomaly triggered → treat as fall (approximation)
|
||||
let fall_detected = (quality_flags & (1 << 3)) != 0;
|
||||
let motion = motion_score > 0.05;
|
||||
|
||||
// Single-node feature_state doesn't tell us number of persons; surface 1 when present.
|
||||
let n_persons = if presence { 1 } else { 0 };
|
||||
|
||||
Some(Esp32VitalsPacket {
|
||||
node_id,
|
||||
presence,
|
||||
fall_detected,
|
||||
motion,
|
||||
breathing_rate_bpm: respiration_bpm as f64,
|
||||
heartrate_bpm: heartbeat_bpm as f64,
|
||||
rssi,
|
||||
n_persons,
|
||||
motion_energy: motion_score,
|
||||
presence_score,
|
||||
timestamp_ms: (ts_us / 1000) as u32,
|
||||
})
|
||||
}
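The field mapping in `parse_rv_feature_state` reduces to three small rules: read little-endian floats at fixed offsets, substitute -50 dBm when the RSSI byte is 0, and gate the presence flag on quality bit 0. The sketch below isolates them; `read_f32_le`, `rssi_or_fallback` and `presence_flag` are hypothetical helpers, not functions from the crate.

```rust
/// Bounds-checked little-endian f32 read; `None` if the slice is too short.
fn read_f32_le(buf: &[u8], off: usize) -> Option<f32> {
    let b = buf.get(off..off.checked_add(4)?)?;
    Some(f32::from_le_bytes([b[0], b[1], b[2], b[3]]))
}

/// Byte 54 carries the median RSSI; 0 means "not yet measured", so fall
/// back to the historical -50 dBm placeholder.
fn rssi_or_fallback(raw: u8) -> i8 {
    let v = raw as i8;
    if v == 0 { -50 } else { v }
}

/// Presence is reported only when quality bit 0 (presence valid) is set
/// and the score exceeds 0.5.
fn presence_flag(quality_flags: u16, presence_score: f32) -> bool {
    (quality_flags & 1) != 0 && presence_score > 0.5
}
```

The bounds-checked read lets a caller drop the explicit length check without risking an out-of-range index.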
|
||||
|
||||
/// Parse a 32-byte edge vitals packet (magic 0xC511_0002).
|
||||
pub fn parse_esp32_vitals(buf: &[u8]) -> Option<Esp32VitalsPacket> {
|
||||
if buf.len() < 32 { return None; }
|
||||
|
|
@@ -67,14 +129,32 @@ pub fn parse_esp32_frame(buf: &[u8]) -> Option<Esp32Frame> {
|
|||
let magic = u32::from_le_bytes([buf[0], buf[1], buf[2], buf[3]]);
|
||||
if magic != 0xC511_0001 { return None; }
|
||||
|
||||
let node_id = buf[4];
|
||||
let n_antennas = buf[5];
|
||||
let n_subcarriers = buf[6];
|
||||
let freq_mhz = u16::from_le_bytes([buf[8], buf[9]]);
|
||||
let sequence = u32::from_le_bytes([buf[10], buf[11], buf[12], buf[13]]);
|
||||
let rssi_raw = buf[14] as i8;
|
||||
let rssi = if rssi_raw > 0 { rssi_raw.saturating_neg() } else { rssi_raw };
|
||||
let noise_floor = buf[15] as i8;
|
||||
// On-wire layout — must stay in lockstep with
|
||||
// firmware/esp32-csi-node/main/csi_collector.c::serialize_csi_frame().
|
||||
// ADR-100 D3 fix: the previous version of this parser had every field
|
||||
// after `n_antennas` shifted by 2 bytes (n_subcarriers read as u8,
|
||||
// freq_mhz/sequence misaligned, rssi read from buf[14] instead of
|
||||
// buf[16]). That made `mean_rssi` random noise (a byte taken from
|
||||
// mid-sequence) which the saturating_neg() workaround then forced
|
||||
// negative — hiding the bug from cursory log inspection while keeping
|
||||
// RSSI traces useless. Layout below matches the FW byte-for-byte.
|
||||
// [0..4] magic (u32 LE)
|
||||
// [4] node_id (u8)
|
||||
// [5] n_antennas (u8)
|
||||
// [6..8] n_subcarriers(u16 LE)
|
||||
// [8..12] freq_mhz (u32 LE)
|
||||
// [12..16] sequence (u32 LE)
|
||||
// [16] rssi (i8)
|
||||
// [17] noise_floor (i8)
|
||||
// [18..20] reserved
|
||||
// [20..] I/Q payload
|
||||
let node_id = buf[4];
|
||||
let n_antennas = buf[5];
|
||||
let n_subcarriers = u16::from_le_bytes([buf[6], buf[7]]) as u8;
|
||||
let freq_mhz = u16::from_le_bytes([buf[8], buf[9]]); // upper bytes always 0 in practice
|
||||
let sequence = u32::from_le_bytes([buf[12], buf[13], buf[14], buf[15]]);
|
||||
let rssi = buf[16] as i8; // already in [-128..127]
|
||||
let noise_floor = buf[17] as i8;
|
||||
|
||||
let iq_start = 20;
|
||||
let n_pairs = n_antennas as usize * n_subcarriers as usize;
|
||||
|
|
@@ -401,9 +481,16 @@ pub fn smooth_and_classify_node(ns: &mut NodeState, raw: &mut ClassificationInfo
|
|||
raw.confidence = (0.4 + sm * 0.6).clamp(0.0, 1.0);
|
||||
}
|
||||
|
||||
/// ADR-118: legacy single-node override variant kept for API compatibility.
|
||||
/// New callers should query per-node amps from AMP_HIST and pass the full
|
||||
/// `&[(u8, &[f64])]` slice. This variant degrades to "node 1 only" which
|
||||
/// produces a feature vector with 5 zero-padded node slots — usable for
|
||||
/// emergency fallback but the trained model expects the full multi-node
|
||||
/// vector.
|
||||
pub fn adaptive_override(state: &AppStateInner, features: &FeatureInfo, classification: &mut ClassificationInfo) {
|
||||
if let Some(ref model) = state.adaptive_model {
|
||||
let amps = state.frame_history.back().map(|v| v.as_slice()).unwrap_or(&[]);
|
||||
let amps_owned: Vec<f64> = state.frame_history.back().cloned().unwrap_or_default();
|
||||
let per_node_refs: Vec<(u8, &[f64])> = vec![(1u8, amps_owned.as_slice())];
|
||||
let feat_arr = adaptive_classifier::features_from_runtime(
|
||||
&serde_json::json!({
|
||||
"variance": features.variance,
|
||||
|
|
@@ -414,7 +501,7 @@ pub fn adaptive_override(state: &AppStateInner, features: &FeatureInfo, classifi
|
|||
"change_points": features.change_points,
|
||||
"mean_rssi": features.mean_rssi,
|
||||
}),
|
||||
amps,
|
||||
&per_node_refs,
|
||||
);
|
||||
let (label, conf) = model.classify(&feat_arr);
|
||||
classification.motion_level = label.to_string();
|
||||
|
|
@@ -673,3 +760,63 @@ pub fn chrono_timestamp() -> u64 {
|
|||
.map(|d| d.as_secs())
|
||||
.unwrap_or(0)
|
||||
}
|
||||
|
||||
#[cfg(test)]
|
||||
mod tests {
|
||||
use super::*;
|
||||
|
||||
/// Regression test for ADR-100 D3: parse_esp32_frame must extract
|
||||
/// fields from the exact offsets the firmware writes in
|
||||
/// csi_collector.c::serialize_csi_frame(). A previous version
|
||||
/// shifted every field after `n_antennas` by 2 bytes, making RSSI
|
||||
/// random noise. This test builds a synthetic frame with distinctive
|
||||
/// values for every header field and asserts the parser recovers
|
||||
/// each one.
|
||||
#[test]
|
||||
fn parse_esp32_frame_header_offsets_match_firmware() {
|
||||
let n_sub: u16 = 64;
|
||||
let freq_mhz: u32 = 2462; // channel 11
|
||||
let sequence: u32 = 0x1122_3344;
|
||||
let rssi: i8 = -57;
|
||||
let noise_floor: i8 = -95;
|
||||
let n_pairs = 1 * n_sub as usize;
|
||||
let mut buf = vec![0u8; 20 + n_pairs * 2];
|
||||
|
||||
buf[0..4].copy_from_slice(&0xC511_0001u32.to_le_bytes());
|
||||
buf[4] = 7; // node_id
|
||||
buf[5] = 1; // n_antennas
|
||||
buf[6..8].copy_from_slice(&n_sub.to_le_bytes()); // u16
|
||||
buf[8..12].copy_from_slice(&freq_mhz.to_le_bytes()); // u32
|
||||
buf[12..16].copy_from_slice(&sequence.to_le_bytes()); // u32
|
||||
buf[16] = rssi as u8;
|
||||
buf[17] = noise_floor as u8;
|
||||
// [18..20] reserved zeros
|
||||
// I/Q: leave zeros — parser still needs them present
|
||||
|
||||
let f = parse_esp32_frame(&buf).expect("frame parses");
|
||||
assert_eq!(f.node_id, 7);
|
||||
assert_eq!(f.n_antennas, 1);
|
||||
assert_eq!(f.n_subcarriers as u16, n_sub);
|
||||
assert_eq!(f.freq_mhz, freq_mhz as u16); // parser narrows to u16 (upper bytes always 0 in WiFi)
|
||||
assert_eq!(f.sequence, sequence);
|
||||
assert_eq!(f.rssi, -57, "rssi must come from byte 16, not 14");
|
||||
assert_eq!(f.noise_floor, -95, "noise_floor must come from byte 17, not 15");
|
||||
assert_eq!(f.amplitudes.len(), n_pairs);
|
||||
}
|
||||
|
||||
/// Boundary case: minimum-size frame (20 B header, zero I/Q pairs)
|
||||
/// must not panic and must still expose RSSI correctly.
|
||||
#[test]
|
||||
fn parse_esp32_frame_min_size_rssi_only() {
|
||||
let mut buf = vec![0u8; 20];
|
||||
buf[0..4].copy_from_slice(&0xC511_0001u32.to_le_bytes());
|
||||
buf[5] = 0; // 0 antennas → 0 IQ pairs
|
||||
buf[6..8].copy_from_slice(&0u16.to_le_bytes());
|
||||
buf[16] = (-71i8) as u8;
|
||||
buf[17] = (-92i8) as u8;
|
||||
let f = parse_esp32_frame(&buf).expect("min frame parses");
|
||||
assert_eq!(f.rssi, -71);
|
||||
assert_eq!(f.noise_floor, -92);
|
||||
assert!(f.amplitudes.is_empty());
|
||||
}
|
||||
}
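The 20-byte on-wire layout exercised by the tests above can also be decoded at full field width. The sketch below keeps `n_subcarriers` as `u16` and `freq_mhz` as `u32` instead of narrowing them; `CsiHeader` and `parse_csi_header` are hypothetical names, and the offsets are taken from the layout comment in `parse_esp32_frame`.

```rust
#[derive(Debug, PartialEq)]
struct CsiHeader {
    node_id: u8,
    n_antennas: u8,
    n_subcarriers: u16,
    freq_mhz: u32,
    sequence: u32,
    rssi: i8,
    noise_floor: i8,
}

const CSI_MAGIC: u32 = 0xC511_0001;
const CSI_HEADER_LEN: usize = 20;

/// Decode the fixed header; bytes 18..20 are reserved and ignored.
fn parse_csi_header(buf: &[u8]) -> Option<CsiHeader> {
    if buf.len() < CSI_HEADER_LEN {
        return None;
    }
    let magic = u32::from_le_bytes([buf[0], buf[1], buf[2], buf[3]]);
    if magic != CSI_MAGIC {
        return None;
    }
    Some(CsiHeader {
        node_id: buf[4],
        n_antennas: buf[5],
        n_subcarriers: u16::from_le_bytes([buf[6], buf[7]]),
        freq_mhz: u32::from_le_bytes([buf[8], buf[9], buf[10], buf[11]]),
        sequence: u32::from_le_bytes([buf[12], buf[13], buf[14], buf[15]]),
        rssi: buf[16] as i8,
        noise_floor: buf[17] as i8,
    })
}
```

Keeping the wide types avoids the silent truncation that `as u8` would apply to a subcarrier count above 255.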
|
||||
|
|
|
|||
|
|
@@ -19,3 +19,5 @@ pub mod sona;
|
|||
pub mod sparse_inference;
|
||||
#[allow(dead_code)]
|
||||
pub mod embedding;
|
||||
/// ADR-116: WiFlow-v1 supervised pose model loader + Rust forward pass.
|
||||
pub mod wiflow_v1;
|
||||
|
|
|
|||
File diff suppressed because it is too large
|
|
@@ -0,0 +1,473 @@
|
|||
//! ADR-116: WiFlow-v1 supervised pose model loader + inference.
|
||||
//!
|
||||
//! Ports `scripts/train-wiflow-supervised.js` inference path to Rust so
|
||||
//! sensing-server can serve real keypoints on `/api/v1/pose/*` instead of
|
||||
//! returning empty arrays per ADR-105 gate.
|
||||
//!
|
||||
//! The model on HuggingFace (`ruv/ruview/wiflow-v1/wiflow-v1.json`) is the
|
||||
//! **lite scale** (186,946 params), NOT the `full` scale that the exporter
|
||||
//! hardcodes into the `architecture` field. We trust `totalParams` to
|
||||
//! disambiguate.
|
||||
//!
|
||||
//! Topology (lite):
|
||||
//! * 2 TCN blocks, kernel=3, dilations=[1,2]
|
||||
//! * Per block: causal_conv1 → bn1 → relu → causal_conv2 → bn2
|
||||
//! + residual (1×1 projection if in_ch ≠ out_ch) → relu
|
||||
//! * tcnChannels: 35 → 32 → 32
|
||||
//! * Flatten (32 × 20 = 640) → fc1 (640→256) → relu → fc2 (256→34)
|
||||
//! * Sigmoid on final 34-dim vector → 17 (x,y) keypoints in [0, 1]
|
||||
//!
|
||||
//! Weight order (collectParams in train script):
|
||||
//! for each tcn block:
|
||||
//! conv1.weight, conv1.bias, bn1.gamma, bn1.beta,
|
||||
//! conv2.weight, conv2.bias, bn2.gamma, bn2.beta,
|
||||
//! (if in_ch ≠ out_ch: res.weight, res.bias)
|
||||
//! fc1.weight, fc1.bias, fc2.weight, fc2.bias
|
||||
//!
|
||||
//! All weights are f32 little-endian, base64-encoded in `weightsBase64`.
|
||||
|
||||
use std::path::Path;
|
||||
|
||||
const TIME_STEPS: usize = 20;
|
||||
const INPUT_DIM: usize = 35;
|
||||
const NUM_KP: usize = 17;
|
||||
const OUT_DIM: usize = NUM_KP * 2; // 34
|
||||
const TCN_CH: [usize; 3] = [INPUT_DIM, 32, 32]; // chain: 35 → 32 → 32
|
||||
const TCN_K: usize = 3;
|
||||
const TCN_DIL: [usize; 2] = [1, 2];
|
||||
const HIDDEN: usize = 256;
|
||||
const FLAT_DIM: usize = 32 * TIME_STEPS; // 640
|
||||
|
||||
/// CausalConv1d weights: `weight[oc*(in_ch*k) + ic*k + tap]`, bias `[oc]`.
|
||||
#[derive(Debug, Clone)]
|
||||
struct Conv1d {
|
||||
in_ch: usize,
|
||||
out_ch: usize,
|
||||
kernel: usize,
|
||||
dilation: usize,
|
||||
weight: Vec<f32>,
|
||||
bias: Vec<f32>,
|
||||
}
|
||||
|
||||
/// BatchNorm1d: 2 params per channel (gamma, beta). Running stats are NOT
|
||||
/// serialized — JS impl re-computes mean/var per window at inference time.
|
||||
#[derive(Debug, Clone)]
|
||||
struct BatchNorm {
|
||||
channels: usize,
|
||||
gamma: Vec<f32>,
|
||||
beta: Vec<f32>,
|
||||
}
|
||||
|
||||
#[derive(Debug, Clone)]
|
||||
struct TcnBlock {
|
||||
conv1: Conv1d,
|
||||
bn1: BatchNorm,
|
||||
conv2: Conv1d,
|
||||
bn2: BatchNorm,
|
||||
res: Option<Conv1d>, // 1×1 projection when in_ch ≠ out_ch
|
||||
}
|
||||
|
||||
#[derive(Debug, Clone)]
|
||||
struct Linear {
|
||||
in_dim: usize,
|
||||
out_dim: usize,
|
||||
/// Row-major `[in_dim, out_dim]` — matches JS `weight[i*outDim + j]`.
|
||||
weight: Vec<f32>,
|
||||
bias: Vec<f32>,
|
||||
}
|
||||
|
||||
#[derive(Debug, Clone)]
|
||||
pub struct WiflowModel {
|
||||
blocks: [TcnBlock; 2],
|
||||
fc1: Linear,
|
||||
fc2: Linear,
|
||||
}
|
||||
|
||||
#[derive(Debug)]
|
||||
pub struct LoadError(pub String);
|
||||
|
||||
impl std::fmt::Display for LoadError {
|
||||
fn fmt(&self, f: &mut std::fmt::Formatter<'_>) -> std::fmt::Result {
|
||||
write!(f, "wiflow_v1 load: {}", self.0)
|
||||
}
|
||||
}
|
||||
|
||||
impl std::error::Error for LoadError {}
|
||||
|
||||
impl WiflowModel {
|
||||
pub fn load_from_json(path: &Path) -> Result<Self, LoadError> {
|
||||
let raw = std::fs::read_to_string(path)
|
||||
.map_err(|e| LoadError(format!("read {}: {e}", path.display())))?;
|
||||
let v: serde_json::Value = serde_json::from_str(&raw)
|
||||
.map_err(|e| LoadError(format!("json parse: {e}")))?;
|
||||
|
||||
let total = v.get("totalParams").and_then(|x| x.as_u64()).unwrap_or(0) as usize;
|
||||
if total != 186_946 {
|
||||
return Err(LoadError(format!(
|
||||
"totalParams={total}, expected 186946 (lite scale). The exporter \
|
||||
hardcodes the `architecture` field to the full scale; \
|
||||
totalParams is the only reliable signal."
|
||||
)));
|
||||
}
|
||||
|
||||
let b64 = v.get("weightsBase64").and_then(|x| x.as_str())
|
||||
.ok_or_else(|| LoadError("missing weightsBase64".into()))?;
|
||||
let bytes = base64_decode(b64)
|
||||
.map_err(|e| LoadError(format!("base64: {e}")))?;
|
||||
if bytes.len() != total * 4 {
|
||||
return Err(LoadError(format!(
|
||||
"bytes={}, expected {} (totalParams*4)", bytes.len(), total * 4)));
|
||||
}
|
||||
let floats: Vec<f32> = bytes.chunks_exact(4)
|
||||
.map(|c| f32::from_le_bytes([c[0], c[1], c[2], c[3]]))
|
||||
.collect();
|
||||
|
||||
let mut cur = Cursor::new(&floats);
|
||||
let block0 = TcnBlock::take(&mut cur, TCN_CH[0], TCN_CH[1], TCN_K, TCN_DIL[0])?;
|
||||
let block1 = TcnBlock::take(&mut cur, TCN_CH[1], TCN_CH[2], TCN_K, TCN_DIL[1])?;
|
||||
let fc1 = Linear::take(&mut cur, FLAT_DIM, HIDDEN)?;
|
||||
let fc2 = Linear::take(&mut cur, HIDDEN, OUT_DIM)?;
|
||||
if cur.remaining() != 0 {
|
||||
return Err(LoadError(format!(
|
||||
"weight stream has {} unread floats after fc2 — topology mismatch",
|
||||
cur.remaining()
|
||||
)));
|
||||
}
|
||||
|
||||
Ok(Self { blocks: [block0, block1], fc1, fc2 })
|
||||
}
|
||||
|
||||
/// Forward pass.
|
||||
/// `input` is `[INPUT_DIM × TIME_STEPS]` row-major (channel-major):
|
||||
/// `input[c * TIME_STEPS + t]`.
|
||||
/// Returns 17 keypoints as (x, y) in [0, 1].
|
||||
pub fn forward(&self, input: &[f32]) -> [(f32, f32); NUM_KP] {
|
||||
debug_assert_eq!(input.len(), INPUT_DIM * TIME_STEPS);
|
||||
let mut x: Vec<f32> = input.to_vec();
|
||||
// TCN blocks
|
||||
x = self.blocks[0].forward(&x, TIME_STEPS);
|
||||
x = self.blocks[1].forward(&x, TIME_STEPS);
|
||||
// Flatten — channels-major matches JS `c * T + t` linearisation.
|
||||
debug_assert_eq!(x.len(), FLAT_DIM);
|
||||
// fc1 + relu
|
||||
let mut h = self.fc1.forward(&x);
|
||||
for v in h.iter_mut() { if *v < 0.0 { *v = 0.0; } }
|
||||
// fc2
|
||||
let out = self.fc2.forward(&h);
|
||||
// sigmoid → 17 (x, y)
|
||||
let mut kp = [(0.0f32, 0.0f32); NUM_KP];
|
||||
for i in 0..NUM_KP {
|
||||
kp[i].0 = sigmoid(out[i * 2]);
|
||||
kp[i].1 = sigmoid(out[i * 2 + 1]);
|
||||
}
|
||||
kp
|
||||
}
|
||||
}
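The `186_946` check in `load_from_json` can be reproduced from the lite topology listed in the module docs. The sketch below is only a worked count under that topology; all function names are hypothetical.

```rust
/// Conv1d: one weight per (out_ch, in_ch, tap) plus one bias per out_ch.
fn conv1d_params(in_ch: usize, out_ch: usize, kernel: usize) -> usize {
    in_ch * kernel * out_ch + out_ch
}

/// BatchNorm: gamma and beta per channel (running stats are not stored).
fn batchnorm_params(ch: usize) -> usize {
    2 * ch
}

/// TCN block: conv1 + bn1 + conv2 + bn2, plus a 1x1 residual projection
/// only when the channel count changes.
fn tcn_block_params(in_ch: usize, out_ch: usize, kernel: usize) -> usize {
    let res = if in_ch != out_ch { conv1d_params(in_ch, out_ch, 1) } else { 0 };
    conv1d_params(in_ch, out_ch, kernel)
        + batchnorm_params(out_ch)
        + conv1d_params(out_ch, out_ch, kernel)
        + batchnorm_params(out_ch)
        + res
}

fn linear_params(in_dim: usize, out_dim: usize) -> usize {
    in_dim * out_dim + out_dim
}

/// Lite scale: 35 -> 32 -> 32 channels, 20 time steps, 256 hidden, 34 outputs.
fn lite_total_params() -> usize {
    let (time_steps, input_dim, hidden, out_dim) = (20, 35, 256, 34);
    tcn_block_params(input_dim, 32, 3)
        + tcn_block_params(32, 32, 3)
        + linear_params(32 * time_steps, hidden)
        + linear_params(hidden, out_dim)
}
```

The two blocks contribute 7,776 and 6,336 parameters, and the two linear layers 164,096 and 8,738, which sums to 186,946.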
|
||||
|
||||
// ── Internal layer impls ─────────────────────────────────────────────────────
|
||||
|
||||
struct Cursor<'a> {
|
||||
data: &'a [f32],
|
||||
offset: usize,
|
||||
}
|
||||
|
||||
impl<'a> Cursor<'a> {
|
||||
fn new(d: &'a [f32]) -> Self { Self { data: d, offset: 0 } }
|
||||
fn take(&mut self, n: usize) -> Result<Vec<f32>, LoadError> {
|
||||
if self.offset + n > self.data.len() {
|
||||
return Err(LoadError(format!(
|
||||
"weight underrun: need {}, have {}", n, self.data.len() - self.offset)));
|
||||
}
|
||||
let out = self.data[self.offset..self.offset + n].to_vec();
|
||||
self.offset += n;
|
||||
Ok(out)
|
||||
}
|
||||
fn remaining(&self) -> usize { self.data.len() - self.offset }
|
||||
}
|
||||
|
||||
impl Conv1d {
|
||||
fn take(c: &mut Cursor<'_>, in_ch: usize, out_ch: usize, k: usize, dil: usize)
|
||||
-> Result<Self, LoadError>
|
||||
{
|
||||
let weight = c.take(in_ch * k * out_ch)?;
|
||||
let bias = c.take(out_ch)?;
|
||||
Ok(Self { in_ch, out_ch, kernel: k, dilation: dil, weight, bias })
|
||||
}
|
||||
|
||||
/// Causal conv with left padding. Input layout: `[in_ch * T]` row-major.
|
||||
fn forward(&self, input: &[f32], t_steps: usize) -> Vec<f32> {
|
||||
let eff_k = self.kernel + (self.kernel - 1) * (self.dilation - 1);
|
||||
let pad_left = eff_k - 1;
|
||||
let mut out = vec![0.0f32; self.out_ch * t_steps];
|
||||
for oc in 0..self.out_ch {
|
||||
for t in 0..t_steps {
|
||||
let mut sum = self.bias[oc];
|
||||
for ic in 0..self.in_ch {
|
||||
for k in 0..self.kernel {
|
||||
let t_idx_signed = t as isize + pad_left as isize
|
||||
- (k * self.dilation) as isize;
|
||||
// Implicit zero left-padding: tap k reads input[t - k*dilation]; skip it when that index is negative.
|
||||
let t_src = t_idx_signed - pad_left as isize;
|
||||
if t_src < 0 || t_src >= t_steps as isize { continue; }
|
||||
let w_idx = oc * (self.in_ch * self.kernel) + ic * self.kernel + k;
|
||||
sum += self.weight[w_idx] * input[ic * t_steps + t_src as usize];
|
||||
}
|
||||
}
|
||||
out[oc * t_steps + t] = sum;
|
||||
}
|
||||
}
|
||||
out
|
||||
}
|
||||
}
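The indexing in `Conv1d::forward` simplifies to "tap `k` reads the sample `k * dilation` steps in the past". A single-channel sketch of that rule follows, assuming the same tap ordering as the code above (tap 0 is the current sample); `causal_conv1d` is a hypothetical name.

```rust
/// Single-channel causal dilated convolution with implicit zero left-padding.
/// Output at time `t` depends only on `x[t]`, `x[t - d]`, `x[t - 2d]`, ...
fn causal_conv1d(x: &[f32], w: &[f32], bias: f32, dilation: usize) -> Vec<f32> {
    (0..x.len())
        .map(|t| {
            let mut sum = bias;
            for (k, wk) in w.iter().enumerate() {
                // `checked_sub` returns None where the tap would fall in the
                // zero padding before the start of the sequence.
                if let Some(src) = t.checked_sub(k * dilation) {
                    sum += wk * x[src];
                }
            }
            sum
        })
        .collect()
}
```

For `x = [1, 2, 3, 4]` and `w = [1, 10, 100]`, dilation 1 gives `[1, 12, 123, 234]` and dilation 2 gives `[1, 2, 13, 24]`, since the third tap never reaches a valid sample in a 4-step window.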
|
||||
|
||||
impl BatchNorm {
|
||||
fn take(c: &mut Cursor<'_>, channels: usize) -> Result<Self, LoadError> {
|
||||
let gamma = c.take(channels)?;
|
||||
let beta = c.take(channels)?;
|
||||
Ok(Self { channels, gamma, beta })
|
||||
}
|
||||
|
||||
/// Per-window normalisation matching JS impl: mean/var computed across
|
||||
/// the T axis at inference time (not from saved running stats).
|
||||
fn forward(&self, x: &mut [f32], t_steps: usize) {
|
||||
let eps = 1e-5f32;
|
||||
for c in 0..self.channels {
|
||||
let base = c * t_steps;
|
||||
let mut mean = 0.0f32;
|
||||
for t in 0..t_steps { mean += x[base + t]; }
|
||||
mean /= t_steps as f32;
|
||||
let mut var = 0.0f32;
|
||||
for t in 0..t_steps {
|
||||
let d = x[base + t] - mean;
|
||||
var += d * d;
|
||||
}
|
||||
var /= t_steps as f32;
|
||||
let inv_std = 1.0f32 / (var + eps).sqrt();
|
||||
let g = self.gamma[c];
|
||||
let b = self.beta[c];
|
||||
for t in 0..t_steps {
|
||||
x[base + t] = g * (x[base + t] - mean) * inv_std + b;
|
||||
}
|
||||
}
|
||||
}
|
||||
}
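
The per-window normalisation in `BatchNorm::forward` can be sketched for one channel as follows. `normalise_window` is a hypothetical name used only for illustration, and the `1e-5` epsilon is copied from the code above; the point of the sketch is that two windows differing only by a constant scale factor come out nearly identical.

```rust
/// Normalise one channel in place to zero mean and unit variance over the
/// window, then apply the learned scale `gamma` and shift `beta`.
/// Illustrative single-channel version of the per-window normalisation.
fn normalise_window(x: &mut [f32], gamma: f32, beta: f32) {
    if x.is_empty() {
        return; // nothing to normalise; avoids dividing by zero
    }
    let n = x.len() as f32;
    let mean = x.iter().sum::<f32>() / n;
    let var = x.iter().map(|v| (v - mean) * (v - mean)).sum::<f32>() / n;
    let inv_std = 1.0 / (var + 1e-5).sqrt();
    for v in x.iter_mut() {
        *v = gamma * (*v - mean) * inv_std + beta;
    }
}
```

With `gamma = 1.0` and `beta = 0.0`, the windows `[5, 10, 15]` and `[50, 100, 150]` should normalise to approximately the same values, differing only through the epsilon term.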
|
||||
|
||||
impl TcnBlock {
|
||||
fn take(c: &mut Cursor<'_>, in_ch: usize, out_ch: usize, k: usize, dil: usize)
|
||||
-> Result<Self, LoadError>
|
||||
{
|
||||
let conv1 = Conv1d::take(c, in_ch, out_ch, k, dil)?;
|
||||
let bn1 = BatchNorm::take(c, out_ch)?;
|
||||
let conv2 = Conv1d::take(c, out_ch, out_ch, k, dil)?;
|
||||
let bn2 = BatchNorm::take(c, out_ch)?;
|
||||
let res = if in_ch != out_ch {
|
||||
Some(Conv1d::take(c, in_ch, out_ch, 1, 1)?)
|
||||
} else { None };
|
||||
Ok(Self { conv1, bn1, conv2, bn2, res })
|
||||
}
|
||||
|
||||
fn forward(&self, input: &[f32], t_steps: usize) -> Vec<f32> {
|
||||
let mut x = self.conv1.forward(input, t_steps);
|
||||
self.bn1.forward(&mut x, t_steps);
|
||||
for v in x.iter_mut() { if *v < 0.0 { *v = 0.0; } } // relu
|
||||
|
||||
let mut y = self.conv2.forward(&x, t_steps);
|
||||
self.bn2.forward(&mut y, t_steps);
|
||||
|
||||
// Residual
|
||||
let res: Vec<f32> = if let Some(r) = &self.res {
|
||||
r.forward(input, t_steps)
|
||||
} else {
|
||||
input.to_vec()
|
||||
};
|
||||
debug_assert_eq!(y.len(), res.len());
|
||||
for (yv, rv) in y.iter_mut().zip(res.iter()) { *yv += *rv; }
|
||||
for v in y.iter_mut() { if *v < 0.0 { *v = 0.0; } } // relu after residual
|
||||
y
|
||||
}
|
||||
}
|
||||
|
||||
impl Linear {
|
||||
fn take(c: &mut Cursor<'_>, in_dim: usize, out_dim: usize) -> Result<Self, LoadError> {
|
||||
let weight = c.take(in_dim * out_dim)?;
|
||||
let bias = c.take(out_dim)?;
|
||||
Ok(Self { in_dim, out_dim, weight, bias })
|
||||
}
|
||||
|
||||
fn forward(&self, input: &[f32]) -> Vec<f32> {
|
||||
let mut out = vec![0.0f32; self.out_dim];
|
||||
for j in 0..self.out_dim {
|
||||
let mut s = self.bias[j];
|
||||
for i in 0..self.in_dim {
|
||||
s += input[i] * self.weight[i * self.out_dim + j];
|
||||
}
|
||||
out[j] = s;
|
||||
}
|
||||
out
|
||||
}
|
||||
}
|
||||
|
||||
fn sigmoid(x: f32) -> f32 {
|
||||
if x >= 0.0 {
|
||||
let e = (-x).exp();
|
||||
1.0 / (1.0 + e)
|
||||
} else {
|
||||
let e = x.exp();
|
||||
e / (1.0 + e)
|
||||
}
|
||||
}
|
||||
|
||||
// ── Inline base64 decoder ────────────────────────────────────────────────────
|
||||
//
|
||||
// Standard alphabet (A–Z, a–z, 0–9, +, /). Padding `=` tolerated. Whitespace
|
||||
// (including newlines) ignored — JSON.stringify can wrap base64 across lines
|
||||
// in some exporters. Avoids pulling the `base64` crate just for one decode.
|
||||
|
||||
fn base64_decode(s: &str) -> Result<Vec<u8>, String> {
|
||||
let mut out = Vec::with_capacity(s.len() * 3 / 4 + 4);
|
||||
let mut buf: u32 = 0;
|
||||
let mut bits: u32 = 0;
|
||||
for ch in s.bytes() {
|
||||
let v: u32 = match ch {
|
||||
b'A'..=b'Z' => (ch - b'A') as u32,
|
||||
b'a'..=b'z' => (ch - b'a' + 26) as u32,
|
||||
b'0'..=b'9' => (ch - b'0' + 52) as u32,
|
||||
b'+' => 62,
|
||||
b'/' => 63,
|
||||
b'=' => break,
|
||||
b' ' | b'\n' | b'\r' | b'\t' => continue,
|
||||
_ => return Err(format!("invalid base64 char {:#x}", ch)),
|
||||
};
|
||||
buf = (buf << 6) | v;
|
||||
bits += 6;
|
||||
if bits >= 8 {
|
||||
bits -= 8;
|
||||
out.push((buf >> bits) as u8);
|
||||
buf &= (1 << bits) - 1;
|
||||
}
|
||||
}
|
||||
Ok(out)
|
||||
}
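
For a genuine round-trip test of the decoder, a matching encoder is needed. The sketch below is an assumption, not part of the commit: `base64_encode` and the `B64` table are hypothetical names, and the function emits the standard alphabet with `=` padding that `base64_decode` accepts.

```rust
/// Standard base64 alphabet (A-Z, a-z, 0-9, +, /).
const B64: &[u8; 64] = b"ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";

/// Encode bytes as padded standard base64 (illustrative counterpart to the
/// inline decoder; not part of the original file).
fn base64_encode(data: &[u8]) -> String {
    let mut out = String::with_capacity((data.len() + 2) / 3 * 4);
    for chunk in data.chunks(3) {
        // Pack up to three bytes into the low 24 bits of a u32,
        // treating missing trailing bytes as zero.
        let b0 = chunk[0] as u32;
        let b1 = *chunk.get(1).unwrap_or(&0) as u32;
        let b2 = *chunk.get(2).unwrap_or(&0) as u32;
        let n = (b0 << 16) | (b1 << 8) | b2;
        out.push(B64[(n >> 18) as usize & 63] as char);
        out.push(B64[(n >> 12) as usize & 63] as char);
        // Positions that carry no input bits are emitted as '=' padding.
        out.push(if chunk.len() > 1 { B64[(n >> 6) as usize & 63] as char } else { '=' });
        out.push(if chunk.len() > 2 { B64[n as usize & 63] as char } else { '=' });
    }
    out
}
```

Feeding the output of such an encoder back through `base64_decode` would turn the existing decode-only test into a real round trip.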
|
||||
|
||||
// ── Convenience input helpers ────────────────────────────────────────────────
|
||||
|
||||
/// Build the `[INPUT_DIM × TIME_STEPS]` input tensor from the most recent
|
||||
/// `TIME_STEPS` per-frame amplitude vectors of a single node. Picks the
|
||||
/// `INPUT_DIM` (35) subcarriers with smallest NBVI score (most useful), using
|
||||
/// the same per-subcarrier `α·σ/μ² + (1−α)·σ/μ` formula the classifier uses,
|
||||
/// but with K=35 instead of NBVI_TOP_K=12 — model expects 35 channels.
|
||||
///
|
||||
/// Returns `None` if the history has fewer than `TIME_STEPS` frames or all
|
||||
/// subcarriers are zero / unusable.
|
||||
pub fn build_input_from_history(
|
||||
history: &std::collections::VecDeque<Vec<f64>>,
|
||||
) -> Option<Vec<f32>> {
|
||||
let n = history.len();
|
||||
if n < TIME_STEPS { return None; }
|
||||
// Take the last `TIME_STEPS` frames.
|
||||
let recent: Vec<&Vec<f64>> = history.iter().rev().take(TIME_STEPS).collect();
|
||||
// recent is reverse-chronological; we want chronological for forward pass.
|
||||
let recent: Vec<&Vec<f64>> = recent.into_iter().rev().collect();
|
||||
let n_sub = recent[0].len();
|
||||
if n_sub == 0 { return None; }
|
||||
|
||||
// Per-subcarrier mean and std over the `TIME_STEPS`-frame window.
|
||||
let mut score: Vec<(usize, f64)> = (0..n_sub).map(|k| {
|
||||
let mut sum = 0.0f64;
|
||||
for f in &recent { sum += f.get(k).copied().unwrap_or(0.0); }
|
||||
let mu = sum / TIME_STEPS as f64;
|
||||
if mu.abs() < 1e-9 { return (k, f64::INFINITY); }
|
||||
let mut var = 0.0f64;
|
||||
for f in &recent {
|
||||
let d = f.get(k).copied().unwrap_or(0.0) - mu;
|
||||
var += d * d;
|
||||
}
|
||||
let sigma = (var / TIME_STEPS as f64).sqrt();
|
||||
// NBVI (α = 0.5): 0.5 * (σ/μ²) + 0.5 * (σ/μ)
|
||||
let mu2 = mu * mu;
|
||||
let nbvi = 0.5 * (sigma / mu2) + 0.5 * (sigma / mu.abs());
|
||||
(k, nbvi)
|
||||
}).collect();
|
||||
|
||||
// 25th-percentile dead-zone gate (drop subcarriers with mean amplitude
|
||||
// below the lower quartile).
|
||||
let mut means: Vec<f64> = (0..n_sub).map(|k| {
|
||||
let mut s = 0.0f64;
|
||||
for f in &recent { s += f.get(k).copied().unwrap_or(0.0); }
|
||||
s / TIME_STEPS as f64
|
||||
}).collect();
|
||||
means.sort_by(|a, b| a.partial_cmp(b).unwrap_or(std::cmp::Ordering::Equal));
|
||||
let q25_idx = (n_sub as f64 * 0.25) as usize;
|
||||
let dead_thresh = means.get(q25_idx).copied().unwrap_or(0.0);
|
||||
for (k, s) in score.iter_mut() {
|
||||
// Recompute the mean for this k: `means` was sorted above, so its indices no longer map to subcarriers.
|
||||
let mut sum = 0.0f64;
|
||||
for f in &recent { sum += f.get(*k).copied().unwrap_or(0.0); }
|
||||
let mu = sum / TIME_STEPS as f64;
|
||||
if mu < dead_thresh { *s = f64::INFINITY; }
|
||||
}
|
||||
|
||||
score.sort_by(|a, b| a.1.partial_cmp(&b.1).unwrap_or(std::cmp::Ordering::Equal));
|
||||
if score.is_empty() || !score[0].1.is_finite() { return None; }
|
||||
|
||||
// Pick top-INPUT_DIM (35) by lowest NBVI. If fewer than 35 are finite,
|
||||
// pad the remaining channels with zeros (not subcarrier-0 duplicated —
|
||||
// the original implementation pushed `0` into `picks` which silently
|
||||
// duplicated channel 0 across all dead slots, fed the network 35x the
|
||||
// same data, and made the saturation worse).
|
||||
let mut picks: Vec<Option<usize>> = score.iter()
|
||||
.filter(|(_, s)| s.is_finite())
|
||||
.take(INPUT_DIM)
|
||||
.map(|(k, _)| Some(*k))
|
||||
.collect();
|
||||
if picks.is_empty() { return None; }
|
||||
while picks.len() < INPUT_DIM { picks.push(None); } // ← zero-pad, not dup
|
||||
|
||||
// Raw amplitudes pass-through. Training script (`scripts/train-wiflow-
|
||||
// supervised.js::loadJsonl`) feeds raw values; the two TCN BatchNorm
|
||||
// layers normalise per-channel per-window at inference time so absolute
|
||||
// scale (5–50 ESP32 amplitude range) is handled by the network itself.
|
||||
let mut out = vec![0.0f32; INPUT_DIM * TIME_STEPS];
|
||||
for (ci, pick) in picks.iter().enumerate() {
|
||||
match pick {
|
||||
Some(k) => {
|
||||
for (t, f) in recent.iter().enumerate() {
|
||||
out[ci * TIME_STEPS + t] = f.get(*k).copied().unwrap_or(0.0) as f32;
|
||||
}
|
||||
}
|
||||
None => { /* zero-padded channel, already 0.0 from vec init */ }
|
||||
}
|
||||
}
|
||||
Some(out)
|
||||
}
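
The NBVI score used for subcarrier ranking can be isolated into a small function. This is a sketch under stated assumptions: `nbvi_score` is a hypothetical helper, α is fixed at 0.5 as in the code above, and the population standard deviation (divide by `n`) is used to match `build_input_from_history`.

```rust
/// NBVI for one subcarrier over a window, with alpha = 0.5:
/// 0.5 * sigma / mu^2 + 0.5 * sigma / |mu|. Lower scores rank first.
/// Returns infinity for an empty window or a near-zero mean, so such
/// subcarriers sort last.
fn nbvi_score(samples: &[f64]) -> f64 {
    if samples.is_empty() {
        return f64::INFINITY;
    }
    let n = samples.len() as f64;
    let mu = samples.iter().sum::<f64>() / n;
    if mu.abs() < 1e-9 {
        return f64::INFINITY;
    }
    let var = samples.iter().map(|v| (v - mu) * (v - mu)).sum::<f64>() / n;
    let sigma = var.sqrt();
    0.5 * (sigma / (mu * mu)) + 0.5 * (sigma / mu.abs())
}
```

As a worked case, the window `[1.0, 3.0]` has μ = 2 and σ = 1, giving 0.5 · (1/4) + 0.5 · (1/2) = 0.375, while a perfectly constant non-zero window scores 0.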
|
||||
|
||||
// ── Tests ────────────────────────────────────────────────────────────────────
|
||||
|
||||
#[cfg(test)]
|
||||
mod tests {
|
||||
use super::*;
|
||||
|
||||
#[test]
|
||||
fn base64_round_trip_alphabet() {
|
||||
// "Man" -> "TWFu"
|
||||
assert_eq!(base64_decode("TWFu").unwrap(), b"Man");
|
||||
// padding
|
||||
assert_eq!(base64_decode("TWE=").unwrap(), b"Ma");
|
||||
assert_eq!(base64_decode("TQ==").unwrap(), b"M");
|
||||
// whitespace tolerated
|
||||
assert_eq!(base64_decode("T W\nF u").unwrap(), b"Man");
|
||||
}
|
||||
|
||||
#[test]
|
||||
fn sigmoid_bounds() {
|
||||
assert!((sigmoid(0.0) - 0.5).abs() < 1e-6);
|
||||
assert!(sigmoid(10.0) > 0.999);
|
||||
assert!(sigmoid(-10.0) < 0.001);
|
||||
}
|
||||
|
||||
#[test]
|
||||
fn build_input_zero_history() {
|
||||
let h = std::collections::VecDeque::new();
|
||||
assert!(build_input_from_history(&h).is_none());
|
||||
}
|
||||
}
|
||||
|
|
@@ -0,0 +1,509 @@
|
|||
<!doctype html>
|
||||
<html lang="en"><head>
|
||||
<meta charset="utf-8"/>
|
||||
<meta name="viewport" content="width=device-width, initial-scale=1"/>
|
||||
<title>RuView — Raw Signals</title>
|
||||
<style>
|
||||
:root { color-scheme: dark; }
|
||||
body { margin:0; padding:14px; font-family:-apple-system,Inter,system-ui,sans-serif;
|
||||
background:#0a0e13; color:#e6edf3; font-size:12px; }
|
||||
h1 { font-size:15px; font-weight:600; margin:0 0 2px; }
|
||||
.sub { font-size:11px; color:#888; margin:0 0 12px; }
|
||||
.topbar { display:flex; gap:14px; align-items:center; margin-bottom:10px; flex-wrap:wrap; }
|
||||
.pill { padding:4px 10px; border-radius:4px; font-family:JetBrains Mono,monospace; font-size:11px;
|
||||
background:#1c2128; }
|
||||
.pill.dis { background:#3a1418; color:#ff6a6a; }
|
||||
.pill.ok { background:#0e2a1a; color:#7ce38b; }
|
||||
button { background:#21262d; color:#e6edf3; border:1px solid #30363d; border-radius:4px;
|
||||
padding:4px 10px; font-size:11px; cursor:pointer; }
|
||||
.node { background:#161b22; border:1px solid #30363d; border-radius:6px;
|
||||
padding:10px 12px; margin-bottom:10px; }
|
||||
.node h2 { margin:0 0 6px; font-size:12px; font-weight:600; color:#7cb6ff;
|
||||
font-family:JetBrains Mono,monospace; display:flex; gap:14px; align-items:baseline; }
|
||||
.node h2 .stat { color:#888; font-weight:normal; font-size:11px; }
|
||||
.node h2 .stat b { color:#e6edf3; font-weight:600; }
|
||||
.badge { font-family:JetBrains Mono,monospace; font-size:11px; padding:2px 8px; border-radius:3px; }
|
||||
.badge.absent { background:#21262d; color:#888; }
|
||||
.badge.present_still { background:#1c3a55; color:#7cb6ff; }
|
||||
.badge.present_moving{ background:#3a5520; color:#90d36b; }
|
||||
.badge.active { background:#552020; color:#ff7a7a; }
|
||||
.row { display:grid; grid-template-columns: 1fr 360px; gap:10px; }
|
||||
@media (max-width: 900px) { .row { grid-template-columns: 1fr; } }
|
||||
canvas { display:block; width:100%; background:#0a0e13; border-radius:3px; }
|
||||
canvas.bars { height: 130px; }
|
||||
canvas.trace { height: 130px; }
|
||||
canvas.spark { height: 48px; margin-top: 6px; }
|
||||
.lbl { color:#666; font-size:10px; font-family:JetBrains Mono,monospace; margin:2px 0 0; }
|
||||
.controls { display:flex; gap:8px; margin-left:auto; }
|
||||
.controls label { font-size:11px; color:#aaa; }
|
||||
</style>
|
||||
</head>
|
||||
<body>
|
||||
<h1>RuView — Raw CSI signals</h1>
|
||||
<p class="sub">Per-node subcarrier amplitudes + RSSI/broadband traces. No DSP, no classification. Stream straight from the sensor.</p>
|
||||
|
||||
<div class="topbar">
|
||||
<span id="status" class="pill dis">disconnected</span>
|
||||
<span class="pill" id="rate">0 fps</span>
|
||||
<span class="pill" id="lastTs">last: --</span>
|
||||
<span class="badge absent" id="globalBadge" style="font-size:13px;padding:4px 12px;">absent</span>
|
||||
<span class="pill" id="globalCV">CV 0%</span>
|
||||
<div class="controls">
|
||||
<label>peak-hold <input type="checkbox" id="peakHold" checked></label>
|
||||
<label>log-y <input type="checkbox" id="logY"></label>
|
||||
<button onclick="resetState()">reset</button>
|
||||
<button id="calibrateBtn" onclick="startCalibrate()" title="Step out of the room, click, wait 90 s">calibrate empty</button>
|
||||
<span class="pill" id="calibStatus" style="display:none"></span>
|
||||
<!-- ADR-107: visible progress bar shown while baseline capture runs. -->
|
||||
<div id="calibProgress" style="display:none; position:relative; width:140px; height:14px;
|
||||
border:1px solid #30363d; border-radius:7px; overflow:hidden;
|
||||
background:#0a0e13;">
|
||||
<div id="calibProgressFill" style="position:absolute; left:0; top:0; bottom:0; width:0%;
|
||||
background:linear-gradient(90deg,#1f6feb,#3fb950);
|
||||
transition: width 0.4s linear;"></div>
|
||||
<span id="calibProgressLabel" style="position:absolute; inset:0; display:flex;
|
||||
align-items:center; justify-content:center;
|
||||
font-size:10px; font-family:JetBrains Mono,monospace;
|
||||
color:#e6edf3; text-shadow:0 0 2px #000;"></span>
|
||||
</div>
|
||||
</div>
|
||||
</div>
|
||||
|
||||
<div id="nodes"></div>
|
||||
|
||||
<script>
|
||||
// ── State ──────────────────────────────────────────────────────────
|
||||
const TRACE_SEC = 30; // seconds of history per node
|
||||
const TRACE_MAX_PTS = 1200; // safety cap
|
||||
const state = new Map(); // node_id -> { amp, peak, rssiHist[], meanAmpHist[], driftHist[], lastTs, frames, fps }
|
||||
let frameCount = 0;
|
||||
let lastRateTs = performance.now();
|
||||
let rateFps = 0;
|
||||
let logY = false;
|
||||
let peakHold = true;
|
||||
|
||||
function resetState() {
|
||||
state.clear();
|
||||
document.getElementById('nodes').innerHTML = '';
|
||||
frameCount = 0;
|
||||
}
|
||||
|
||||
document.getElementById('peakHold').addEventListener('change', e => { peakHold = e.target.checked; });
|
||||
document.getElementById('logY').addEventListener('change', e => { logY = e.target.checked; });
|
||||
|
||||
// ── Per-node block factory ─────────────────────────────────────────
|
||||
function ensureNodeBlock(nodeId) {
|
||||
if (state.has(nodeId)) return state.get(nodeId);
|
||||
const ent = {
|
||||
amp: [],
|
||||
peak: [],
|
||||
rssiHist: [], // { t, v }
|
||||
meanAmpHist: [],
|
||||
driftHist: [], // { t, v } — ADR-104 per-sub drift score
|
||||
lastTs: 0,
|
||||
frames: 0,
|
||||
lastFrameWall: performance.now(),
|
||||
fps: 0,
|
||||
};
|
||||
state.set(nodeId, ent);
|
||||
|
||||
const wrap = document.createElement('div');
|
||||
wrap.className = 'node';
|
||||
wrap.id = 'node-' + nodeId;
|
||||
wrap.innerHTML = `
|
||||
<h2>
|
||||
Node ${nodeId}
|
||||
<span class="badge absent" id="n${nodeId}-badge">absent</span>
|
||||
<span class="stat">CV <b id="n${nodeId}-cv">0%</b></span>
|
||||
<span class="stat">subc <b id="n${nodeId}-sub">0</b></span>
|
||||
<span class="stat">rssi <b id="n${nodeId}-rssi">--</b> dBm</span>
|
||||
<span class="stat">mean A <b id="n${nodeId}-meanA">0</b></span>
|
||||
<span class="stat">peak A <b id="n${nodeId}-peakA">0</b></span>
|
||||
<span class="stat">drift <b id="n${nodeId}-drift">--</b></span>
|
||||
<span class="stat">node fps <b id="n${nodeId}-fps">0</b></span>
|
||||
</h2>
|
||||
<div class="row">
|
||||
<div>
|
||||
<canvas class="bars" id="n${nodeId}-bars"></canvas>
|
||||
<p class="lbl">subcarrier amplitude bars (left → low freq, right → high freq)</p>
|
||||
</div>
|
||||
<div>
|
||||
<canvas class="trace" id="n${nodeId}-trace"></canvas>
|
||||
<p class="lbl"><span style="color:#8b949e">RSSI</span> <span style="color:#3fb950">broadband mean amplitude</span> (last ${TRACE_SEC}s)</p>
|
||||
<canvas class="spark" id="n${nodeId}-driftSpark"></canvas>
|
||||
<p class="lbl"><span style="color:#d29922">per-sub drift</span> — off-axis presence channel (ADR-104); dashed line = presence threshold 0.10</p>
|
||||
</div>
|
||||
</div>`;
|
||||
document.getElementById('nodes').appendChild(wrap);
|
||||
return ent;
|
||||
}
|
||||
|
||||
// ── Drawing ────────────────────────────────────────────────────────
|
||||
function drawBars(canvas, amps, peaks) {
|
||||
const w = canvas.clientWidth, h = canvas.clientHeight;
|
||||
if (canvas.width !== w || canvas.height !== h) { canvas.width = w; canvas.height = h; }
|
||||
const ctx = canvas.getContext('2d');
|
||||
ctx.fillStyle = '#0a0e13'; ctx.fillRect(0, 0, w, h);
|
||||
if (!amps.length) return;
|
||||
|
||||
// Determine scale
|
||||
let maxV = peakHold && peaks.length
|
||||
? Math.max(...peaks)
|
||||
: Math.max(...amps);
|
||||
if (!isFinite(maxV) || maxV <= 0) maxV = 1;
|
||||
|
||||
const n = amps.length;
|
||||
const bw = w / n;
|
||||
const margin = 4;
|
||||
|
||||
// Bars
|
||||
for (let i = 0; i < n; i++) {
|
||||
let v = amps[i];
|
||||
let pv = peaks[i] || 0;
|
||||
if (logY) {
|
||||
v = v > 0 ? Math.log10(v + 1) : 0;
|
||||
pv = pv > 0 ? Math.log10(pv + 1) : 0;
|
||||
}
|
||||
const scaleMax = logY ? Math.log10(maxV + 1) : maxV;
|
||||
const bh = Math.max(1, (v / scaleMax) * (h - margin));
|
||||
const ph = Math.max(1, (pv / scaleMax) * (h - margin));
|
||||
const x = i * bw;
|
||||
// peak (faint)
|
||||
if (peakHold && pv > 0) {
|
||||
ctx.fillStyle = '#1f3a5a';
|
||||
ctx.fillRect(x, h - ph, Math.max(1, bw - 1), 1.5);
|
||||
}
|
||||
// bar (active)
|
||||
const hue = 200 + (i / n) * 100;
|
||||
ctx.fillStyle = `hsl(${hue}, 70%, 55%)`;
|
||||
ctx.fillRect(x, h - bh, Math.max(1, bw - 1), bh);
|
||||
}
|
||||
|
||||
// Y-axis label
|
||||
ctx.fillStyle = '#555'; ctx.font = '9px monospace';
|
||||
ctx.fillText('max=' + maxV.toFixed(0), 4, 10);
|
||||
ctx.fillText('n=' + n, w - 40, 10);
|
||||
}
|
||||
|
||||
function drawTrace(canvas, rssiHist, meanAmpHist) {
|
||||
const w = canvas.clientWidth, h = canvas.clientHeight;
|
||||
if (canvas.width !== w || canvas.height !== h) { canvas.width = w; canvas.height = h; }
|
||||
const ctx = canvas.getContext('2d');
|
||||
ctx.fillStyle = '#0a0e13'; ctx.fillRect(0, 0, w, h);
|
||||
|
||||
const now = performance.now() / 1000;
|
||||
const t0 = now - TRACE_SEC;
|
||||
|
||||
const drawSeries = (arr, color, getRange) => {
|
||||
if (arr.length < 2) return;
|
||||
const visible = arr.filter(p => p.t >= t0);
|
||||
if (visible.length < 2) return;
|
||||
const { min, max } = getRange(visible);
|
||||
const span = (max - min) || 1;
|
||||
ctx.strokeStyle = color; ctx.lineWidth = 1.5; ctx.beginPath();
|
||||
for (let i = 0; i < visible.length; i++) {
|
||||
const p = visible[i];
|
||||
const x = ((p.t - t0) / TRACE_SEC) * w;
|
||||
const y = h - ((p.v - min) / span) * (h - 8) - 4;
|
||||
if (i === 0) ctx.moveTo(x, y); else ctx.lineTo(x, y);
|
||||
}
|
||||
ctx.stroke();
|
||||
// return the y-range; the caller draws the range labels
|
||||
ctx.fillStyle = color; ctx.font = '9px monospace';
|
||||
return { min, max };
|
||||
};
|
||||
|
||||
const rssiR = drawSeries(rssiHist, '#8b949e', arr => {
|
||||
const vals = arr.map(p => p.v);
|
||||
return { min: Math.min(...vals), max: Math.max(...vals) };
|
||||
});
|
||||
const ampR = drawSeries(meanAmpHist, '#3fb950', arr => {
|
||||
const vals = arr.map(p => p.v);
|
||||
return { min: 0, max: Math.max(...vals) };
|
||||
});
|
||||
|
||||
// labels
|
||||
ctx.font = '9px monospace';
|
||||
if (rssiR) { ctx.fillStyle = '#8b949e'; ctx.fillText(`rssi ${rssiR.min.toFixed(0)}…${rssiR.max.toFixed(0)} dBm`, 4, 10); }
|
||||
if (ampR) { ctx.fillStyle = '#3fb950'; ctx.fillText(`A ${ampR.min.toFixed(0)}…${ampR.max.toFixed(0)}`, 4, 22); }
|
||||
|
||||
// grid line at now
|
||||
ctx.strokeStyle = '#1c2128'; ctx.beginPath();
|
||||
ctx.moveTo(w - 1, 0); ctx.lineTo(w - 1, h); ctx.stroke();
|
||||
}
|
||||
|
||||
// ADR-104: per-sub drift sparkline. Fixed Y range [0, 0.30] so the
|
||||
// presence threshold (0.10, dashed) and warning threshold (0.15) are
|
||||
// directly readable across nodes — re-scaling per node would make it
|
||||
// impossible to tell "Node 0 fired" from "Node 1 fired" at a glance.
|
||||
const DRIFT_PRESENCE_THRESH = 0.10;
|
||||
const DRIFT_WARN_THRESH = 0.15;
|
||||
const DRIFT_MAX = 0.30;
|
||||
|
||||
function drawDriftSpark(canvas, hist) {
|
||||
const w = canvas.clientWidth, h = canvas.clientHeight;
|
||||
if (canvas.width !== w || canvas.height !== h) { canvas.width = w; canvas.height = h; }
|
||||
const ctx = canvas.getContext('2d');
|
||||
ctx.fillStyle = '#0a0e13'; ctx.fillRect(0, 0, w, h);
|
||||
|
||||
const now = performance.now() / 1000;
|
||||
const t0 = now - TRACE_SEC;
|
||||
const yOf = v => h - (Math.min(v, DRIFT_MAX) / DRIFT_MAX) * (h - 4) - 2;
|
||||
|
||||
// Threshold lines.
|
||||
ctx.setLineDash([3, 3]);
|
||||
ctx.strokeStyle = '#5a4a1a'; ctx.lineWidth = 1; ctx.beginPath();
|
||||
ctx.moveTo(0, yOf(DRIFT_PRESENCE_THRESH)); ctx.lineTo(w, yOf(DRIFT_PRESENCE_THRESH));
|
||||
ctx.stroke();
|
||||
ctx.strokeStyle = '#7a3030'; ctx.beginPath();
|
||||
ctx.moveTo(0, yOf(DRIFT_WARN_THRESH)); ctx.lineTo(w, yOf(DRIFT_WARN_THRESH));
|
||||
ctx.stroke();
|
||||
ctx.setLineDash([]);
|
||||
|
||||
const visible = hist.filter(p => p.t >= t0);
|
||||
if (visible.length >= 2) {
|
||||
ctx.strokeStyle = '#d29922'; ctx.lineWidth = 1.5; ctx.beginPath();
|
||||
for (let i = 0; i < visible.length; i++) {
|
||||
const p = visible[i];
|
||||
const x = ((p.t - t0) / TRACE_SEC) * w;
|
||||
const y = yOf(p.v);
|
||||
if (i === 0) ctx.moveTo(x, y); else ctx.lineTo(x, y);
|
||||
}
|
||||
ctx.stroke();
|
||||
}
|
||||
|
||||
// Axis text.
|
||||
ctx.fillStyle = '#666'; ctx.font = '9px monospace';
|
||||
ctx.fillText('0', 2, h - 2);
|
||||
ctx.fillText(DRIFT_MAX.toFixed(2), 2, 10);
|
||||
}
|
||||
|
||||
// ── Frame ingestion ────────────────────────────────────────────────
|
||||
function handleSensingUpdate(d) {
|
||||
const nodes = d.nodes || [];
|
||||
const ts = d.timestamp || (Date.now() / 1000);
|
||||
const now = performance.now() / 1000;
|
||||
for (const n of nodes) {
|
||||
const id = n.node_id;
|
||||
const amps = n.amplitude || [];
|
||||
// Skip empty-amp ticks (feature_state path doesn't carry raw CSI).
|
||||
// Bars/traces only refresh on real raw-CSI frames so what you see
|
||||
// is always a live snapshot, not a repeated stale vector.
|
||||
if (!amps.length) continue;
|
||||
const ent = ensureNodeBlock(id);
|
||||
ent.amp = amps;
|
||||
// peak-hold update
|
||||
if (ent.peak.length !== amps.length) ent.peak = amps.slice();
|
||||
else for (let i = 0; i < amps.length; i++) if (amps[i] > ent.peak[i]) ent.peak[i] = amps[i];
|
||||
|
||||
const meanA = amps.reduce((s, x) => s + x, 0) / amps.length;
|
||||
// Only push valid (non-zero) RSSI samples so the trace doesn't
|
||||
// jump between real dBm values and the "0 = no data" sentinel.
|
||||
if (n.rssi_dbm && n.rssi_dbm !== 0) {
|
||||
ent.rssiHist.push({ t: now, v: n.rssi_dbm });
|
||||
}
|
||||
ent.meanAmpHist.push({ t: now, v: meanA });
|
||||
const cutoff = now - TRACE_SEC;
|
||||
while (ent.rssiHist.length && ent.rssiHist[0].t < cutoff) ent.rssiHist.shift();
|
||||
while (ent.meanAmpHist.length && ent.meanAmpHist[0].t < cutoff) ent.meanAmpHist.shift();
|
||||
if (ent.rssiHist.length > TRACE_MAX_PTS) ent.rssiHist.splice(0, ent.rssiHist.length - TRACE_MAX_PTS);
|
||||
if (ent.meanAmpHist.length > TRACE_MAX_PTS) ent.meanAmpHist.splice(0, ent.meanAmpHist.length - TRACE_MAX_PTS);
|
||||
|
||||
// per-node fps: count frames in the last second, refresh once a sec
|
||||
// (instantaneous 1/dt was wildly noisy because multiple WS paths
|
||||
// emit duplicate per-node updates back-to-back).
|
||||
ent.fpsCounter = (ent.fpsCounter || 0) + 1;
|
||||
const nowMs = performance.now();
|
||||
if (!ent.fpsWindowStart) ent.fpsWindowStart = nowMs;
|
||||
if (nowMs - ent.fpsWindowStart >= 1000) {
|
||||
ent.fps = ent.fpsCounter * 1000 / (nowMs - ent.fpsWindowStart);
|
||||
ent.fpsCounter = 0;
|
||||
ent.fpsWindowStart = nowMs;
|
||||
}
|
||||
ent.lastFrameWall = nowMs;
|
||||
ent.frames++;
|
||||
ent.lastTs = ts;
|
||||
|
||||
document.getElementById(`n${id}-sub`).textContent = amps.length;
|
||||
// n.rssi_dbm comes from sensing_update.nodes[]; it can be 0 on
|
||||
// early ticks (history not yet populated). Coerce to "--" so the
|
||||
// operator doesn't think the AP is dead.
|
||||
const rssiVal = (n.rssi_dbm && Number.isFinite(n.rssi_dbm) && n.rssi_dbm !== 0)
|
||||
? n.rssi_dbm.toFixed(1)
|
||||
: '--';
|
||||
document.getElementById(`n${id}-rssi`).textContent = rssiVal;
|
||||
// Push to RSSI trace history if non-zero (so the chart shows the
|
||||
// real ladder of dBm steps, not a fake "0 → -54" jump on boot).
|
||||
if (n.rssi_dbm && n.rssi_dbm !== 0) {
|
||||
// (already handled by the ent.rssiHist push above)
|
||||
}
|
||||
document.getElementById(`n${id}-meanA`).textContent = meanA.toFixed(1);
|
||||
document.getElementById(`n${id}-peakA`).textContent = Math.max(...ent.peak).toFixed(1);
|
||||
document.getElementById(`n${id}-fps`).textContent = ent.fps.toFixed(1);
|
||||
}
|
||||
|
||||
document.getElementById('lastTs').textContent = 'last: ' + new Date(ts * 1000).toLocaleTimeString();
|
||||
|
||||
// Global classification badge (ADR-101 fused).
|
||||
const gcl = d.classification || {};
|
||||
const glvl = gcl.motion_level || 'absent';
|
||||
const gb = document.getElementById('globalBadge');
|
||||
if (gb) { gb.textContent = glvl; gb.className = 'badge ' + glvl; gb.style.fontSize = '13px'; gb.style.padding = '4px 12px'; }
|
||||
const gcv = document.getElementById('globalCV');
|
||||
if (gcv) gcv.textContent = 'CV ' + ((gcl.confidence || 0) * 100).toFixed(1) + '%';
|
||||
|
||||
// Per-node level badge from node_features[i].classification (ADR-101).
|
||||
const nfNow = performance.now() / 1000;
|
||||
const nf = d.node_features || [];
|
||||
for (const f of nf) {
|
||||
const id = f.node_id;
|
||||
const cls = f.classification || {};
|
||||
const lvl = cls.motion_level || 'absent';
|
||||
const badge = document.getElementById(`n${id}-badge`);
|
||||
if (badge) {
|
||||
badge.textContent = lvl;
|
||||
badge.className = 'badge ' + lvl;
|
||||
}
|
||||
const cvEl = document.getElementById(`n${id}-cv`);
|
||||
if (cvEl) cvEl.textContent = ((cls.confidence || 0) * 100).toFixed(1) + '%';
|
||||
|
||||
// ADR-104 per-sub drift score (off-axis presence). May be absent
|
||||
// when no per-sub baseline is loaded for this node — show '--'
|
||||
// instead of '0.000' so the operator can tell the channel is
|
||||
// unknown vs. known and stable.
|
||||
const driftEl = document.getElementById(`n${id}-drift`);
|
||||
const driftLive = state.get(id);
|
||||
if (typeof f.drift_score === 'number' && Number.isFinite(f.drift_score)) {
|
||||
if (driftEl) driftEl.textContent = f.drift_score.toFixed(3);
|
||||
if (driftLive) {
|
||||
driftLive.driftHist.push({ t: nfNow, v: f.drift_score });
|
||||
const cutoff = nfNow - TRACE_SEC;
|
||||
while (driftLive.driftHist.length && driftLive.driftHist[0].t < cutoff) {
|
||||
driftLive.driftHist.shift();
|
||||
}
|
||||
if (driftLive.driftHist.length > TRACE_MAX_PTS) {
|
||||
driftLive.driftHist.splice(0, driftLive.driftHist.length - TRACE_MAX_PTS);
|
||||
}
|
||||
}
|
||||
} else if (driftEl) {
|
||||
driftEl.textContent = '--';
|
||||
}
|
||||
}
|
||||
frameCount++;
|
||||
}
|
||||
|
||||
function renderTick() {
|
||||
for (const [id, ent] of state) {
|
||||
const bars = document.getElementById('n' + id + '-bars');
|
||||
const trace = document.getElementById('n' + id + '-trace');
|
||||
const spark = document.getElementById('n' + id + '-driftSpark');
|
||||
if (bars) drawBars(bars, ent.amp, ent.peak);
|
||||
if (trace) drawTrace(trace, ent.rssiHist, ent.meanAmpHist);
|
||||
if (spark) drawDriftSpark(spark, ent.driftHist);
|
||||
}
|
||||
// fps pill
|
||||
const now = performance.now();
|
||||
if (now - lastRateTs > 500) {
|
||||
rateFps = (frameCount * 1000) / (now - lastRateTs);
|
||||
document.getElementById('rate').textContent = rateFps.toFixed(1) + ' fps total';
|
||||
frameCount = 0;
|
||||
lastRateTs = now;
|
||||
}
|
||||
requestAnimationFrame(renderTick);
|
||||
}
|
||||
requestAnimationFrame(renderTick);
|
||||
|
||||
// ── ADR-107: baseline calibrate button + progress bar ─────────────
|
||||
let calibPollTimer = null;
|
||||
const CALIB_DURATION_SEC = 90;
|
||||
|
||||
function setCalibProgress(pct, label) {
|
||||
const bar = document.getElementById('calibProgress');
|
||||
const fill = document.getElementById('calibProgressFill');
|
||||
const txt = document.getElementById('calibProgressLabel');
|
||||
if (!bar || !fill || !txt) return;
|
||||
bar.style.display = pct < 0 ? 'none' : 'inline-block';
|
||||
fill.style.width = Math.max(0, Math.min(100, pct)) + '%';
|
||||
txt.textContent = label || '';
|
||||
}
|
||||
|
||||
async function startCalibrate() {
|
||||
if (!confirm(`Step OUT of the room now. Calibration will record for ${CALIB_DURATION_SEC} s.\nClick OK when you are out.`)) return;
|
||||
const btn = document.getElementById('calibrateBtn');
|
||||
const stat = document.getElementById('calibStatus');
|
||||
btn.disabled = true; btn.textContent = 'recording…';
|
||||
// Hide the text-pill while the progress bar is the primary indicator;
|
||||
// it reappears only on terminal status messages (error / complete).
|
||||
stat.style.display = 'none';
|
||||
setCalibProgress(0, 'starting…');
|
||||
try {
|
||||
const res = await fetch('/api/v1/baseline/calibrate', {
|
||||
method: 'POST',
|
||||
headers: {'Content-Type': 'application/json'},
|
||||
body: JSON.stringify({ duration_sec: CALIB_DURATION_SEC, trim_sec: 15, clean_window_sec: 30 }),
|
||||
});
|
||||
const j = await res.json();
|
||||
if (!j.started) {
|
||||
setCalibProgress(-1, '');
|
||||
stat.style.display = 'inline-block';
|
||||
stat.textContent = j.reason || 'failed to start';
|
||||
btn.disabled = false; btn.textContent = 'calibrate empty';
|
||||
return;
|
||||
}
|
||||
} catch (e) {
|
||||
setCalibProgress(-1, '');
|
||||
stat.style.display = 'inline-block';
|
||||
stat.textContent = 'network error';
|
||||
btn.disabled = false; btn.textContent = 'calibrate empty';
|
||||
return;
|
||||
}
|
||||
if (calibPollTimer) clearInterval(calibPollTimer);
|
||||
let elapsed = 0;
|
||||
calibPollTimer = setInterval(async () => {
|
||||
elapsed += 2;
|
||||
try {
|
||||
const r = await fetch('/api/v1/baseline'); const j = await r.json();
|
||||
const s = j.calibration_status || 'idle';
|
||||
if (s.startsWith('running')) {
|
||||
const pct = Math.min(99, (elapsed / CALIB_DURATION_SEC) * 100);
|
||||
setCalibProgress(pct, `${elapsed}/${CALIB_DURATION_SEC} s`);
|
||||
} else {
|
||||
clearInterval(calibPollTimer); calibPollTimer = null;
|
||||
btn.disabled = false; btn.textContent = 'calibrate empty';
|
||||
if (s === 'complete') {
|
||||
setCalibProgress(100, 'done');
|
||||
stat.style.display = 'inline-block';
|
||||
stat.textContent = 'baseline updated ✓';
|
||||
setTimeout(() => setCalibProgress(-1, ''), 3000);
|
||||
} else {
|
||||
setCalibProgress(-1, '');
|
||||
stat.style.display = 'inline-block';
|
||||
stat.textContent = s;
|
||||
}
|
||||
}
|
||||
} catch (e) {}
|
||||
}, 2000);
|
||||
}
|
||||
|
||||
// ── WS ─────────────────────────────────────────────────────────────
|
||||
function connect() {
|
||||
const ws = new WebSocket('ws://' + location.hostname + ':8765/ws/sensing');
|
||||
ws.onopen = () => {
|
||||
const p = document.getElementById('status');
|
||||
p.textContent = 'connected'; p.className = 'pill ok';
|
||||
};
|
||||
ws.onclose = () => {
|
||||
const p = document.getElementById('status');
|
||||
p.textContent = 'disconnected — reconnecting'; p.className = 'pill dis';
|
||||
setTimeout(connect, 1500);
|
||||
};
|
||||
ws.onmessage = (e) => {
|
||||
try {
|
||||
const d = JSON.parse(e.data);
|
||||
if (d.type === 'sensing_update') handleSensingUpdate(d);
|
||||
} catch (_) {}
|
||||
};
|
||||
}
|
||||
connect();
|
||||
</script>
|
||||
</body></html>
|
||||
File diff suppressed because it is too large
1000
v2/crates/wifi-densepose-sensing-server/tests/fixtures/replay_motion.jsonl
vendored
Normal file
File diff suppressed because it is too large
|
|
@@ -122,9 +122,30 @@ fn test_different_nodes_produce_different_frames() {
|
|||
/// Send multiple frames from different nodes to a UDP port.
|
||||
/// This test verifies the packet format is accepted by a real server
|
||||
/// if one is running, but doesn't fail if no server is available.
|
||||
///
|
||||
/// ADR-117: previously this test sent to `127.0.0.1:5005` unconditionally,
|
||||
/// hitting any live server on the same port. With `node_ids = [1,2,3,5,7]`
|
||||
/// × 10 frames + 5 vitals it injected 55 spurious node_ids into the
|
||||
/// server's NODE_ADDRS — the keepalive task then spawned one `ping` child
|
||||
/// process per unique nid, accumulating 250+ ping zombies in production.
|
||||
/// Mitigation is two-layered: server now filters loopback at the UDP
|
||||
/// receiver, AND this test refuses to fire if anything is already bound
|
||||
/// to 127.0.0.1:5005.
|
||||
#[test]
|
||||
fn test_multi_node_udp_send() {
|
||||
// Try to bind to a random port and send to localhost:5005
|
||||
// ADR-117 guard: if some other process is bound to 127.0.0.1:5005 (most
|
||||
// commonly a live sensing-server during dev), skip the send so we don't
|
||||
// pollute that process's state. The bind probe is the cheapest signal —
|
||||
// if we can bind even briefly, nobody owns the port; if not, abort.
|
||||
match UdpSocket::bind("127.0.0.1:5005") {
|
||||
Ok(probe) => drop(probe),
|
||||
Err(_) => {
|
||||
eprintln!("test_multi_node_udp_send: 127.0.0.1:5005 already in use — skipping (ADR-117)");
|
||||
return;
|
||||
}
|
||||
};
|
||||
|
||||
// Try to bind to a random port and send to localhost:5005.
|
||||
// This is a smoke test — it verifies frames can be sent without panic.
|
||||
let sock = UdpSocket::bind("0.0.0.0:0").expect("bind");
|
||||
sock.set_write_timeout(Some(Duration::from_millis(100))).ok();
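
The ADR-117 bind probe in this hunk can be factored into a small predicate. The sketch below is illustrative only: `udp_port_is_free` is a hypothetical helper, and it relies solely on `std::net::UdpSocket::bind` failing when the address is already bound by another socket.

```rust
use std::net::UdpSocket;

/// Returns true if a UDP socket can be bound to `addr` right now.
/// The probe socket is dropped as soon as the check completes, so the
/// result is only a snapshot: another process could bind the port
/// immediately afterwards.
fn udp_port_is_free(addr: &str) -> bool {
    UdpSocket::bind(addr).is_ok()
}
```

A test could then start with `if !udp_port_is_free("127.0.0.1:5005") { return; }`. The probe is inherently racy, which is presumably why the commit pairs it with loopback filtering on the server side.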
|
||||
|