feat(adr-107): raw.html calibrate button + ADR

UI side of ADR-107: green "calibrate empty" button in raw.html next
to the existing reset/log-y controls. Click → confirm dialog tells
the operator to step out → POST /api/v1/baseline/calibrate with
90 s capture window → polls GET /api/v1/baseline every 2 s, surfaces
"recording… N/90 s" then "baseline updated ✓".

ADR-107 documents:
  D1  in-process capture_baseline_to_disk (port of record-baseline.py)
  D2  BASELINE_BUS broadcast forwarder so capture stays decoupled from
      WS clients
  D3  POST /api/v1/baseline/calibrate (immediate ack, background work)
  D4  GET /api/v1/baseline (current state + cooldown + status)
  D5  auto_recalibrate_task — 30-min absent+low-CV trigger, 1-h cooldown
  D6  raw.html button + polling
This commit is contained in:
arsen 2026-05-17 12:15:09 +07:00
parent 0f373467e5
commit 45c1464cc0
2 changed files with 224 additions and 0 deletions

View File

@ -0,0 +1,186 @@
# ADR-107 — REST Baseline Calibration + Auto-Recalibrate
**Status**: Accepted
**Date**: 2026-05-17
**Scope**: `v2/crates/wifi-densepose-sensing-server/src/main.rs`
(`baseline_get`, `baseline_calibrate`, `auto_recalibrate_task`,
`capture_baseline_to_disk`, `BASELINE_BUS`), `static/raw.html`
(`calibrate empty` button), CLI flags
`--auto-recalibrate-quiet-sec` / `--auto-recalibrate-min-age-sec`.
## Context
ADR-103 introduced a persistent empty-room baseline at
`data/baseline.json` so the classifier no longer needed a 60 s warm-up
after every server restart. To refresh it the operator had to:
1. Step out of the room.
2. SSH / open a terminal, run `python scripts/record-baseline.py
--duration 90`.
3. Wait for the "saved" message.
4. Restart the sensing-server (so it reloads the file).
5. Walk back in.
Steps 2, 4 are friction. The operator asked to remove them so a
fresh device that just wants to monitor a room doesn't need a CLI
or a restart. Two changes:
* **`POST /api/v1/baseline/calibrate`** — fires the same record-and-
trim pipeline from inside the server, hot-reloads the override map
on success. UI button in `raw.html` triggers it.
* **Auto-recalibrate background task** — silently refreshes the
baseline when the classifier reports `absent` and CV stays low for
a long-enough window, without any operator action.
## Decisions
### D1 — `capture_baseline_to_disk` in-process
Pure-Rust port of `scripts/record-baseline.py`:
1. Subscribe to `BASELINE_BUS` (a `tokio::sync::broadcast::Sender<String>`
that mirrors every WS JSON message published by the broadcaster).
2. Collect `duration_sec` of per-node `(t, amplitudes, rssi)`.
3. Trim `trim_sec` from head and tail.
4. Slide `clean_window_sec` window across, pick lowest-CV chunk per
node.
5. Compute FULL-broadband mean/p50/p95/std/CV% (same schema as
ADR-103 v2; reload uses the same `load_baseline_file`).
6. Write `data/baseline.json` (configurable via JSON body `out`).
7. Call `load_baseline_file(path)` to hot-reload `AMP_BASELINE_OVERRIDE`
and `AMP_BASELINE_CV`.
### D2 — `BASELINE_BUS` broadcast forwarder
Decouples baseline capture from individual WS clients. A small task
spawned at startup subscribes to `AppState.tx` and re-publishes every
message into `BASELINE_BUS`. Capture subscribers don't need a WS
connection or any external network path.
### D3 — `POST /api/v1/baseline/calibrate`
Optional JSON body: `{ duration_sec, trim_sec, clean_window_sec, out }`.
Defaults: 90 / 15 / 30 s and `data/baseline.json`. Returns immediately
with `{ "started": true, "hint": "..." }`. Subsequent calls while a
job is running return `{ "started": false, "reason": "calibration
already running" }`.
### D4 — `GET /api/v1/baseline`
```json
{
"nodes": { "1": {"full_broadband_p95": …, "full_broadband_cv_pct": …}, … },
"last_written_sec_ago": <i64>,
"calibration_status": "idle" | "running" | "running (auto)"
| "complete" | "complete (auto)" | "error: …"
}
```
UI polls this every 2 s while a calibration is running to drive the
button state machine.
### D5 — Auto-recalibrate background task
Wakes every 5 s. State machine:
* Read latest `classification.motion_level` and `confidence` (=CV).
* `quiet = (motion_level == "absent") && (cv < 0.08)`.
* If `quiet` is true continuously for `--auto-recalibrate-quiet-sec`
(default 1800 = 30 min) **AND** the last baseline write is older than
`--auto-recalibrate-min-age-sec` (default 3600 = 1 h), kick off
`capture_baseline_to_disk(90, 5, 45, "data/baseline.json")` in the
background.
* On error, log + set `calibration_status` so the UI surfaces it.
The 30-minute / 1-hour defaults are conservative: a person briefly
walking through doesn't reset the baseline; long-term drift from
WiFi reconfiguration or furniture rearrangement does. `--auto-
recalibrate-quiet-sec 0` disables entirely.
### D6 — `raw.html` button
`calibrate empty` next to the existing `reset` button. Click →
`confirm()` reminds operator to step out → POSTs the endpoint → polls
status every 2 s, updating the inline pill `recording… 12/90 s`
`baseline updated ✓` on success. Disables itself while running.
## Files Touched
```
v2/crates/wifi-densepose-sensing-server/src/main.rs
- statics: BASELINE_LAST_WRITTEN, BASELINE_CALIBRATION_STATUS, BASELINE_BUS
- fn capture_baseline_to_disk (D1)
- fn auto_recalibrate_task (D5)
- fn baseline_get (D4)
- fn baseline_calibrate (D3)
- routes /api/v1/baseline + /api/v1/baseline/calibrate
- Args { auto_recalibrate_quiet_sec, auto_recalibrate_min_age_sec }
- main(): bus init + auto-recalibrate spawn
v2/crates/wifi-densepose-sensing-server/static/raw.html
- <button id="calibrateBtn"> (D6)
- <span id="calibStatus" class="pill"> (D6)
- JS: startCalibrate(), polling loop
docs/adr/ADR-107-auto-recalibrate-and-rest-baseline.md (this)
```
One impl commit so far: `0f373467`. UI button + ADR are in this
follow-up.
## Verified Acceptance
Boot log shows the new task wired:
```
baseline: loaded 2 node overrides from data/baseline.json
(node1=27.04, node2=14.72; node1_cv=2.62%, node2_cv=3.65%)
Auto-recalibrate enabled: trigger after 1800s of `absent`+low-CV,
min 3600s between writes
CSI keepalive: 25 ICMP pkt/s/node (interval 0.040s)
```
REST endpoints live:
```
GET /api/v1/baseline → current state + last_written_sec_ago
POST /api/v1/baseline/calibrate → { "started": true }
```
End-to-end smoke test (5 s capture window for speed):
```
POST → { started: true, duration_sec: 5 }
… 8 s elapsed …
GET → { calibration_status: "complete", last_written_sec_ago: 13 }
file: /tmp/test_baseline.json contains n_samples=86 per node + full_broadband_*
```
The hot-reload was visible immediately: `GET /api/v1/baseline.nodes`
showed the new (capture-window) values before any server restart.
## Out of scope / open
* **UI: progress bar instead of pill text** — current state shows
textual `recording… 12/90 s`. Could be a thin progress bar.
* **Multiple baseline profiles** — only one `data/baseline.json` per
server. Future: name-scoped baselines for different deployment
contexts (day / night, summer / winter).
* **Quiet detection that uses CV alone** — currently AND-gated with
`motion_level == "absent"` which itself depends on the loaded
baseline. Risk: if the loaded baseline is *bad*, classifier may
never report `absent`, auto-recalibrate never fires. Mitigation:
REST endpoint stays available; first call out of the box is always
manual via the UI button.
## References
* ADR-100 — gain lock (the prerequisite that makes baseline meaningful).
* ADR-101 — classifier whose `motion_level`/`confidence` drives the
quiet-detector.
* ADR-103 — persistent baseline file (this ADR adds two ways to
refresh it).
* ADR-105 — no synthetic data (auto-recalibrate is *real* data, not
synthesized — it just runs without operator intervention).
* ADR-106 — keepalive (ensures the capture window has enough raw CSI
frames to give a meaningful percentile).
* [`scripts/record-baseline.py`](../../scripts/record-baseline.py)
— original CLI workflow, kept for headless use.

View File

@ -51,6 +51,8 @@
<label>peak-hold <input type="checkbox" id="peakHold" checked></label>
<label>log-y <input type="checkbox" id="logY"></label>
<button onclick="resetState()">reset</button>
<button id="calibrateBtn" onclick="startCalibrate()" title="Step out of the room, click, wait 90 s">calibrate empty</button>
<span class="pill" id="calibStatus" style="display:none"></span>
</div>
</div>
@ -323,6 +325,42 @@ function renderTick() {
}
requestAnimationFrame(renderTick);
// ── ADR-107: baseline calibrate button + polling ──────────────────
let calibPollTimer = null;
async function startCalibrate() {
if (!confirm('Step OUT of the room now. Calibration will record for 90 s.\nClick OK when you are out.')) return;
const btn = document.getElementById('calibrateBtn');
const stat = document.getElementById('calibStatus');
btn.disabled = true; btn.textContent = 'recording…';
stat.style.display = 'inline-block'; stat.textContent = 'starting…';
try {
const res = await fetch('/api/v1/baseline/calibrate', {
method: 'POST',
headers: {'Content-Type': 'application/json'},
body: JSON.stringify({ duration_sec: 90, trim_sec: 15, clean_window_sec: 30 }),
});
const j = await res.json();
if (!j.started) { stat.textContent = j.reason || 'failed to start'; btn.disabled = false; btn.textContent = 'calibrate empty'; return; }
} catch (e) {
stat.textContent = 'network error'; btn.disabled = false; btn.textContent = 'calibrate empty'; return;
}
if (calibPollTimer) clearInterval(calibPollTimer);
let elapsed = 0;
calibPollTimer = setInterval(async () => {
elapsed += 2;
try {
const r = await fetch('/api/v1/baseline'); const j = await r.json();
const s = j.calibration_status || 'idle';
stat.textContent = s.startsWith('running') ? `recording… ${elapsed}/90 s` : s;
if (!s.startsWith('running')) {
clearInterval(calibPollTimer); calibPollTimer = null;
btn.disabled = false; btn.textContent = 'calibrate empty';
if (s === 'complete') stat.textContent = 'baseline updated ✓';
}
} catch (e) {}
}, 2000);
}
// ── WS ─────────────────────────────────────────────────────────────
function connect() {
const ws = new WebSocket('ws://' + location.hostname + ':8765/ws/sensing');