docs(adr-109): new ADR + close ADR-108 open items
ADR-109 documents POST /ota/recalibrate + gl_ap_mac NVS binding and supersedes the two Open Items in ADR-108. Co-Authored-By: claude-flow <ruv@ruv.net>
This commit is contained in:
parent
f92807cdaf
commit
169a589def
|
|
@ -152,16 +152,17 @@ I (4980) csi_collector: gain-lock PERSISTED to NVS (csi_cfg/gl_agc, gl_fft)
|
|||
|
||||
## Open Items
|
||||
|
||||
* **REST endpoint to clear gain-lock NVS** — today the operator has
|
||||
to USB-erase the namespace. A FW-side `POST /ota/recalibrate` that
|
||||
clears the two keys + `esp_restart()` would close that loop.
|
||||
~30 min FW + flash.
|
||||
* **Track AP MAC alongside AGC/FFT** — `csi_cfg/gl_ap_mac`. On boot,
|
||||
if current AP MAC ≠ saved → ignore the cached values and re-calibrate.
|
||||
Fully automatic invalidation. ~1 h FW.
|
||||
* **Per-channel cache** — `csi_cfg/gl_<chan>_agc`. If the channel hop
|
||||
table (ADR-029) is reactivated, each channel needs its own values.
|
||||
~1 h FW.
|
||||
~1 h FW. Deferred — channel hopping is out of scope for the current
|
||||
single-channel deployment.
|
||||
|
||||
## Closed
|
||||
|
||||
* **REST endpoint to clear gain-lock NVS** — shipped via
|
||||
`POST /ota/recalibrate` in ADR-109.
|
||||
* **Track AP MAC alongside AGC/FFT** — shipped via `gl_ap_mac` NVS key
|
||||
+ boot-time comparison in ADR-109.
|
||||
|
||||
## References
|
||||
|
||||
|
|
|
|||
|
|
@ -0,0 +1,145 @@
|
|||
# ADR-109 — FW Gain-Lock Invalidation (REST trigger + AP-MAC binding)
|
||||
|
||||
**Status**: Accepted
|
||||
**Date**: 2026-05-17
|
||||
**Scope**: `firmware/esp32-csi-node/main/ota_update.c`,
|
||||
`firmware/esp32-csi-node/main/csi_collector.c`. Closes both Open Items in
|
||||
ADR-108.
|
||||
|
||||
## Context
|
||||
|
||||
ADR-108 persists the FW-side gain-lock (AGC + FFT scale) to NVS so a
|
||||
reboot resumes detect mode in ~0.5 s. Two follow-ups remained:
|
||||
|
||||
1. **No way to clear the cache without USB.** When an operator moved a
|
||||
sensor or swapped the AP, they had to plug the device in and run
|
||||
`idf.py erase-flash` to force a re-calibration. Defeats the whole
|
||||
point of OTA-only ops.
|
||||
2. **No automatic invalidation on AP swap.** Gain-lock is tied to a
|
||||
specific RF path (AP location, distance, multipath). Connecting the
|
||||
same sensor to a different AP and re-using the cached AGC/FFT yields
|
||||
either over-saturated or under-amplified CSI for the whole session
|
||||
until manual intervention.
|
||||
|
||||
## Decisions
|
||||
|
||||
### D1 — `POST /ota/recalibrate` REST trigger
|
||||
|
||||
New HTTP handler registered on the existing port 8032 next to `/ota`
|
||||
and `/ota/status`. Same Bearer-token auth path as the firmware upload
|
||||
endpoint (reuses `ota_check_auth`).
|
||||
|
||||
Behaviour:
|
||||
|
||||
1. Open NVS namespace `csi_cfg` RW.
|
||||
2. Erase three keys: `gl_agc`, `gl_fft`, `gl_ap_mac` (D2).
|
||||
3. `nvs_commit` + close.
|
||||
4. Send `200 OK {status:"ok"}` JSON.
|
||||
5. `vTaskDelay(1 s)` to flush the response, then `esp_restart()`.
|
||||
|
||||
Next boot: `rv_gain_load_from_nvs` returns `ESP_ERR_NVS_NOT_FOUND` →
|
||||
the existing 300-packet calibration runs as on a never-calibrated chip.
|
||||
|
||||
### D2 — `gl_ap_mac` NVS key (6-byte blob)
|
||||
|
||||
Stored alongside `gl_agc` / `gl_fft` whenever the calibration writes
|
||||
back. Source: `esp_wifi_sta_get_ap_info(&ap).bssid`. Read at the same
|
||||
moment as AGC/FFT during the one-shot NVS short-circuit at the top of
|
||||
`rv_gain_lock_process`.
|
||||
|
||||
Comparison rule on boot:
|
||||
|
||||
| Saved MAC | Current AP MAC | Action |
|
||||
|--------------------|-------------------------|---------------------------------------|
|
||||
| all-zero (legacy) | any | Use cached gain-lock (wildcard match) |
|
||||
| matches current | same | Use cached gain-lock |
|
||||
| differs | any | Log warning, fall through to full cal |
|
||||
| any | AP info unavailable | Defensive: fall through to full cal |
|
||||
|
||||
The all-zero wildcard is the one-time upgrade case: NVS blobs written
|
||||
by ADR-108 builds predate ADR-109 and have no MAC. Treating them as
|
||||
match-anything avoids forcing every existing deployment to re-calibrate
|
||||
on the first ADR-109 boot. The next save (post-re-cal or at the next
|
||||
natural calibration trigger) populates the real MAC, after which the
|
||||
strict comparison applies.
|
||||
|
||||
### D3 — `rv_gain_save_to_nvs` writes MAC too
|
||||
|
||||
Signature changes from `(uint8_t agc, int8_t fft)` to
|
||||
`(uint8_t agc, int8_t fft, const uint8_t mac[6])`. The caller reads
|
||||
`ap.bssid` at save time so the saved MAC reflects the AP the
|
||||
calibration actually ran against (not whatever AP the sensor is
|
||||
connected to N seconds later, which on a roaming-capable mesh could
|
||||
differ).
|
||||
|
||||
If the save-time AP MAC is unavailable (extremely rare — the gain-lock
|
||||
hook only fires from a CSI callback, and CSI callbacks require an
|
||||
established WiFi link), the saved MAC is left as all-zero. The next
|
||||
boot then takes the wildcard path, preserving the current behaviour
|
||||
rather than failing closed.
|
||||
|
||||
### D4 — Recalibrate handler also clears `gl_ap_mac`
|
||||
|
||||
Even though removing only AGC/FFT would force a re-cal by virtue of
|
||||
the missing keys, also erasing `gl_ap_mac` is cleaner: the next write
|
||||
will repopulate it from the current AP, and there's no stale MAC
|
||||
sitting in NVS that could be partially restored by a future bug.
|
||||
|
||||
## Trade-offs
|
||||
|
||||
* **One-time false re-cal on first ADR-109 boot for chips that ever
|
||||
saw an AP swap before this ADR shipped.** Acceptable: gain-lock
|
||||
re-cal takes 6-12 s and produces a brief noise spike, but it's a
|
||||
one-time event and the result is correct from that point onward.
|
||||
* **No multi-AP cache.** If a sensor roams between two APs (rare in
|
||||
this deployment: each sensor is parked next to a fixed TP-Link)
|
||||
it will re-calibrate on every AP swap. Multi-AP storage would need
|
||||
per-AP-MAC sub-keys (`gl_agc:<bssid>`, etc.) — explicitly out of
|
||||
scope; cross-references ADR-108's per-channel cache item which has
|
||||
the same "wait until needed" disposition.
|
||||
* **`gl_ap_mac` blob doubles NVS size of the gain-lock bundle from
|
||||
2 bytes to 8 bytes.** Negligible — the gain-lock namespace `csi_cfg`
|
||||
already holds SSID/password/IP and a few other keys totalling a few
|
||||
hundred bytes.
|
||||
|
||||
## Files Touched
|
||||
|
||||
```
|
||||
firmware/esp32-csi-node/main/ota_update.c
|
||||
- ota_recalibrate_handler (D1, D4)
|
||||
- register POST /ota/recalibrate
|
||||
|
||||
firmware/esp32-csi-node/main/csi_collector.c
|
||||
- RV_GAIN_NVS_K_AP_MAC define (D2)
|
||||
- rv_gain_load_from_nvs: optional MAC out-param + wildcard support
|
||||
- rv_gain_save_to_nvs: MAC in-param + nvs_set_blob (D3)
|
||||
- rv_gain_lock_process: AP-MAC comparison branch (D2)
|
||||
- rv_gain_lock_process: read current bssid before save (D3)
|
||||
|
||||
docs/adr/ADR-109-fw-gain-lock-invalidation.md (this)
|
||||
```
|
||||
|
||||
## Verified Acceptance
|
||||
|
||||
1. `idf.py build` clean (only the pre-existing `wifi_promiscuous_cb`
|
||||
unused warning, unchanged by this ADR).
|
||||
2. After OTA flash of both nodes:
|
||||
* `curl -X POST http://192.168.0.100:8032/ota/recalibrate`
|
||||
* `curl -X POST http://192.168.0.101:8032/ota/recalibrate`
|
||||
Both return `{"status":"ok","message":"gain-lock NVS cleared; rebooting"}`.
|
||||
3. Boot log on next reboot shows `gain-lock APPLIED:` (full cal) +
|
||||
`gain-lock PERSISTED to NVS (AGC=N FFT=M AP=…)` instead of the
|
||||
`gain-lock RESTORED from NVS:` line that fast-path boots produce.
|
||||
4. AP-swap path verified by manually flipping the WiFi credentials to
|
||||
a different SSID via `provision.py`, re-flashing, and confirming
|
||||
the boot log shows `gain-lock NVS MISS: saved AP=… → current=…
|
||||
Re-calibrating.` followed by a full cal.
|
||||
|
||||
## References
|
||||
|
||||
* ADR-108 — NVS persistence of gain-lock. Both Open Items in ADR-108
|
||||
resolved by this ADR (REST trigger, AP-MAC binding).
|
||||
* ADR-050 — OTA Bearer-token auth. Same `ota_check_auth` shared with
|
||||
the new endpoint.
|
||||
* `docs/references/ota-pipeline.md` — port 8032 recipe; gains a new
|
||||
bullet for `/ota/recalibrate`.
|
||||
Loading…
Reference in New Issue