ADR-154: Signal/DSP Beyond-SOTA Sweep — Milestone 0 (Correctness, Provable Perf, and the SOTA Landscape)

Field	Value
Status	Proposed
Date	2026-06-11
Deciders	ruv
Codebase target	`wifi-densepose-signal` (`ruvsense/`, `features.rs`, `csi_processor.rs`, `spectrogram.rs`, `bvp.rs`), benches, docs
Relates to	ADR-134 (CIR sparse recovery), ADR-135 (Empty-Room Baseline), ADR-029/030/032 (Multistatic mesh + security), ADR-152 (WiFi-Pose SOTA 2026 intake), ADR-153 (802.11bf forward-compat)
Scope	Milestone 0 of the beyond-SOTA signal/DSP sweep: high-leverage correctness/security fixes, two measured perf wins, the per-module SOTA landscape with evidence grades, and a prioritized roadmap. 45 review findings are explicitly deferred (§7 backlog) — nothing is silently dropped.

0. PROOF discipline (this ADR's contract)

This project has been publicly accused of "AI slop." This ADR answers that with evidence, not adjectives:

Every claimed code improvement ships with a committed regression test (correctness) or a committed criterion bench (performance).
Every perf number below is MEASURED before/after with the exact reproduce command. A perf claim without a measured before/after is UNPROVEN and is not made here.
Every external SOTA reference is graded MEASURED / CLAIMED / THEORETICAL, distinguishing what a paper measured from what it asserts and from what is merely plausible.
The headline finding — a dead CIR coherence gate that silently fell back in production for every canonical frame — is disclosed in full (§2), not buried.

Test machine for the perf numbers: Windows 11, cargo bench --release, criterion 0.5. Numbers are wall-clock medians on this box. Read them as before/after ratios, which carry across machines far better than absolute nanoseconds.

1. Context

The RuvSense signal stack (16 ruvsense/ modules + the classic features.rs/csi_processor.rs/spectrogram.rs/bvp.rs pipeline) grew quickly across ADR-014/029/030/134/135. A beyond-SOTA review surfaced ~50 findings ranging from two critical correctness/security defects to micro-optimizations and SOTA-gap research items. Milestone 0 closes the provable, high-leverage subset: the two criticals, a divide-by-zero trio, two measured perf wins, and the research landscape. The remaining ~45 are catalogued in §7 so the backlog is explicit and auditable.

2. The headline finding — the ADR-134 CIR coherence gate was DEAD in production (CRITICAL, FIXED)

2.1 What was wrong

MultistaticFuser fuses canonical CSI frames: hardware_norm.rs resamples every chipset onto a uniform 56-tone canonical grid before fusion (HardwareNormalizer, default canonical_subcarriers = 56). The ADR-134 CIR coherence gate (cir_gate_coherence, multistatic.rs) is supposed to blend a CIR dominant-tap ratio into the cross-node coherence — coherence = 0.7·freq + 0.3·dominant_tap_ratio.

But the gate was wired to CirEstimator::new(CirConfig::ht20()) (with_cir_ht20), and ht20() expects 64 FFT bins or 52 active tones. A canonical-56 frame matches neither, so every call returned CirError::SubcarrierMismatch and cir_gate_coherence hit its silent Err(_) => freq_coherence fallback (multistatic.rs). Net effect: the CIR gate never ran on a single production frame — use_cir_gate = true was indistinguishable from false. This is the exact shape of "AI slop": a feature that compiles, has tests on the estimator, and is dead at the integration seam.
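The failure mode above can be reproduced in miniature. The sketch below is not the project's code: `dominant_tap_ratio`, `gated_coherence`, and the local `SubcarrierMismatch` struct are hypothetical stand-ins, and the peak-over-total ratio is an assumed definition of the dominant-tap ratio. It only shows how a length check plus an `Err(_) => freq` arm makes a misconfigured gate indistinguishable from a disabled one.

```rust
/// Hypothetical stand-in for `CirError::SubcarrierMismatch`.
#[derive(Debug, PartialEq)]
pub struct SubcarrierMismatch {
    pub got: usize,
}

/// Toy estimator: rejects any frame whose length it was not configured for,
/// otherwise returns peak energy over total energy (assumed definition).
pub fn dominant_tap_ratio(
    frame: &[f64],
    accepted_lens: &[usize],
) -> Result<f64, SubcarrierMismatch> {
    if !accepted_lens.contains(&frame.len()) {
        return Err(SubcarrierMismatch { got: frame.len() });
    }
    let total: f64 = frame.iter().sum();
    let peak = frame.iter().cloned().fold(0.0_f64, f64::max);
    Ok(if total > 0.0 { peak / total } else { 0.0 })
}

/// The 0.7 / 0.3 blend from the text, with the silent fallback arm.
pub fn gated_coherence(freq: f64, cir: Result<f64, SubcarrierMismatch>) -> f64 {
    match cir {
        Ok(ratio) => 0.7 * freq + 0.3 * ratio,
        // Silent degrade: the caller cannot tell this from "gate off".
        Err(_) => freq,
    }
}
```

With a 56-sample frame, an estimator that accepts only lengths 64 and 52 always takes the `Err` arm, so the gated value equals the ungated one bit for bit; an estimator that accepts 56 changes the result.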

2.2 The fix (the gate now actually runs)

New CirConfig::canonical56() (cir.rs): 64-bin HT20 framing, 56 active tones, 168 delay taps, Φ built over a contiguous −28..+28 active-tone grid (also the native Atheros-56 layout). bandwidth_hz/tap_spacing stay physically correct for a 20 MHz HT20 channel; only the active-tone count differs from ht20().
New MultistaticFuser::with_cir_canonical56() — the correct default for the RuvSense pipeline. with_cir_ht20() is retained for genuine raw-64/52 feeds and now carries a loud doc-warning.
active_indices() handles (64, 56) explicitly and the fallback now selects the slice whose length matches num_active (so Φ's column count is always self-consistent — no silent fall-through to the 52-index slice).
The remaining silent fallback is made LOUD: a SubcarrierMismatch inside cir_gate_coherence now fires a debug_assert! naming the misconfiguration ("CIR gate DEAD … build it with CirConfig::canonical56()"). A config error can no longer hide as a graceful runtime degrade.
cir_estimate_first() exposes the raw estimate() verdict so a test can count Ok vs Err on a canonical-56 stream.

2.3 The PROOF (committed regression tests, `ruvsense::multistatic::tests`)

Test	Asserts	Result
`cir_gate_ht20_is_dead_on_canonical56`	old ht20 estimator on 8 canonical-56 frames → 0 Ok, 8 `SubcarrierMismatch`	the dead gate, measured
`cir_gate_canonical56_is_alive`	new canonical56 estimator on the same 8 frames → 8 Ok, 0 Err	the gate runs
`cir_gate_on_changes_coherence_vs_off`	`coherence(gate on)` ≠ `coherence(gate off)` (|Δ| > 1e-6)	the CIR term is actually applied
`cir_gate_dead_ht20_equals_gate_off` (release-only)	dead-ht20 coherence == gate-off coherence (|Δ| < 1e-9)	confirms the silent degradation the fix removes

Reproduce:

cd v2 && cargo test -p wifi-densepose-signal --no-default-features --lib \
  ruvsense::multistatic::tests::cir
# 3 passed (the 4th is #[cfg(not(debug_assertions))], add --release to run it)

Resolution: FIXED (not merely loud-fail-documented). The gate now decodes 100% of canonical-56 frames where it previously decoded 0%.

3. The second critical — NaN/inf adversarial-detector bypass (CRITICAL, FIXED)

3.1 What was wrong

AdversarialDetector::check (adversarial.rs) takes per-link link_energies: &[f64]. A single NaN/inf entry bypassed the whole detector: every e > threshold test is false on NaN, the Gini sort used partial_cmp().unwrap_or(Equal), and the final anomaly_score.clamp(0,1) returns NaN on a NaN input. A real RF link can never have NaN/inf energy, so a non-finite input is itself the strongest possible spoof — yet it could slip through as "clean."

3.2 The fix

Finite-validate at the boundary: the first non-finite link_energies entry now short-circuits to a definite anomaly (anomaly_detected = true, anomaly_score = 1.0, affected_links = [bad_idx], FieldModelViolation), and the poisoned frame is not seeded into the temporal-continuity state.
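A minimal sketch of the boundary validation, assuming a simplified result type. `Verdict` and the threshold-count scoring below are hypothetical placeholders for the real detector's checks; only the early return on the first non-finite entry mirrors the fix described above.

```rust
/// Hypothetical, simplified verdict; the real result type carries more fields.
#[derive(Debug, PartialEq)]
pub struct Verdict {
    pub anomaly_detected: bool,
    pub anomaly_score: f64,
    pub affected_links: Vec<usize>,
}

pub fn check(link_energies: &[f64], threshold: f64) -> Verdict {
    // Boundary validation: the first NaN/inf entry is itself a definite anomaly.
    if let Some(bad_idx) = link_energies.iter().position(|e| !e.is_finite()) {
        return Verdict {
            anomaly_detected: true,
            anomaly_score: 1.0,
            affected_links: vec![bad_idx],
        };
    }
    // Placeholder for the real checks: flag links above a threshold.
    let affected: Vec<usize> = link_energies
        .iter()
        .enumerate()
        .filter(|(_, e)| **e > threshold)
        .map(|(i, _)| i)
        .collect();
    let score = (affected.len() as f64 / link_energies.len().max(1) as f64).clamp(0.0, 1.0);
    Verdict {
        anomaly_detected: !affected.is_empty(),
        anomaly_score: score,
        affected_links: affected,
    }
}
```

Without the early return, a NaN entry fails every `e > threshold` comparison and is never flagged, which is the bypass described in §3.1.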

3.3 The PROOF

Test	Asserts
`nan_link_energy_flags_anomaly`	a NaN link energy → `anomaly_detected`, score 1.0, affected link reported, `anomaly_count == 1`
`inf_link_energy_flags_anomaly`	both `+inf` and `−inf` → anomaly, score 1.0

cd v2 && cargo test -p wifi-densepose-signal --no-default-features --lib -- \
  ruvsense::adversarial::tests::nan_link ruvsense::adversarial::tests::inf_link

4. Divide-by-(n−1) window trio (CORRECTNESS, FIXED)

Three windowing helpers divided by (n − 1) with no small-n guard:

Site	Bug	Fix
`csi_processor.rs` `CsiPreprocessor::hamming_window(n)`	`n=0` underflowed `0usize − 1`; `n=1` divided by 0 → all-NaN window	`match n { 0 => [], 1 => [1.0], _ => … }`
`bvp.rs` Hann window	`window_size=1` divided by 0 → NaN BVP	length-1 guard → constant `[1.0]`
`spectrogram.rs` `make_window`	`size=1` divided by 0 for Hann/Hamming/Blackman	`size <= 1` short-circuit → `vec![1.0; size]`

The standard convention for a length-1 window is the constant 1.0; length-0 is empty.
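The guard in the table can be sketched as follows. This is an illustration, not the code in `csi_processor.rs`: the free function and the textbook 0.54/0.46 Hamming coefficients are assumptions.

```rust
use std::f64::consts::PI;

/// Hamming window with explicit small-n handling.
/// n = 0 returns an empty window, n = 1 returns the constant [1.0].
pub fn hamming_window(n: usize) -> Vec<f64> {
    match n {
        0 => Vec::new(),
        1 => vec![1.0],
        _ => {
            // Safe: n >= 2, so (n - 1) is at least 1 and cannot underflow.
            let denom = (n - 1) as f64;
            (0..n)
                .map(|i| 0.54 - 0.46 * (2.0 * PI * i as f64 / denom).cos())
                .collect()
        }
    }
}
```

Matching on `n` before computing `n - 1` removes both the `usize` underflow at 0 and the division by zero at 1 in a single place.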

PROOF: test_hamming_window_degenerate_sizes (csi_processor), bvp_window_size_one_is_finite (bvp), make_window_size_0_and_1_are_safe (spectrogram) — each asserts finiteness at sizes 0/1/2.

The Python deterministic proof (archive/v1/data/proof/verify.py) still prints VERDICT: PASS with the same pipeline hash f8e76f21…46f7a — the reference path uses n ≥ 2, so the guard is bit-transparent there.

5. Measured performance wins (MEASURED before/after; benches committed)

Both changes are bit-equivalent (asserted by a committed test) — they only remove wasted work. New criterion benches in benches/features_bench.rs (registered in Cargo.toml).

Reproduce both:

cd v2 && cargo bench -p wifi-densepose-signal --no-default-features --bench features_bench
# compile-only: append --no-run

5.1 FFT-planner caching for PSD (features.rs)

PowerSpectralDensity::from_csi_data constructed a fresh FftPlanner and re-planned the FFT on every frame — and FeatureExtractor::extract calls it per frame on the hot path. New from_csi_data_with_fft(csi, fft_size, &Arc<dyn Fft>) reuses a plan cached in FeatureExtractor (built once in new()). Output is bit-identical (psd_cached_fft_bit_identical_to_fresh compares f64::to_bits of values + all summary stats across 6 FFT sizes).

Bench group psd_fft_planner — fresh_planner (before) vs cached_planner (after), per frame:

fft_size	before (fresh plan), median	after (cached), median	speedup
64	5.84 µs/frame	1.89 µs/frame	3.09×
128	9.31 µs/frame	3.61 µs/frame	2.58×
256	13.77 µs/frame	6.73 µs/frame	2.04×

Medians from criterion (warm-up 1 s, 20 samples). Raw three-point estimates (low/median/high), per frame: fresh/64 [5.27, 5.84, 6.34] µs vs cached/64 [1.76, 1.89, 2.03] µs; fresh/256 [13.29, 13.77, 14.32] µs vs cached/256 [6.26, 6.73, 7.43] µs. The win comes from hoisting FftPlanner construction and planning out of the per-frame loop. The relative gain is largest at small FFTs, where planning is a larger fraction of a cheap transform (3.09× at 64), and settles at about 2× at 256.
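The caching pattern itself is independent of the FFT library. The sketch below uses a hypothetical `DftPlan` (the twiddle table of a naive DFT) as a stand-in for an FFT plan so that it stays dependency-free; it does not use or mimic the rustfft API, and `Extractor`, `psd_fresh`, and `psd` are illustrative names. It shows the before/after shape and why the output can be bit-identical: both paths run the same arithmetic on the same table values.

```rust
use std::f64::consts::PI;
use std::sync::Arc;

/// Hypothetical stand-in for an FFT plan: precomputed twiddles of a naive DFT.
pub struct DftPlan {
    n: usize,
    twiddles: Vec<(f64, f64)>,
}

impl DftPlan {
    pub fn new(n: usize) -> Self {
        let twiddles = (0..n)
            .map(|k| {
                let a = -2.0 * PI * k as f64 / n as f64;
                (a.cos(), a.sin())
            })
            .collect();
        Self { n, twiddles }
    }

    /// Power spectrum |X[k]|^2 of a real frame whose length equals the plan size.
    pub fn power_spectrum(&self, frame: &[f64]) -> Vec<f64> {
        assert_eq!(frame.len(), self.n);
        (0..self.n)
            .map(|k| {
                let (mut re, mut im) = (0.0_f64, 0.0_f64);
                for (t, x) in frame.iter().enumerate() {
                    let (c, s) = self.twiddles[(k * t) % self.n];
                    re += x * c;
                    im += x * s;
                }
                re * re + im * im
            })
            .collect()
    }
}

/// Before: a plan is rebuilt for every frame.
pub fn psd_fresh(frame: &[f64]) -> Vec<f64> {
    DftPlan::new(frame.len()).power_spectrum(frame)
}

/// After: the extractor builds the plan once and reuses it per frame.
pub struct Extractor {
    plan: Arc<DftPlan>,
}

impl Extractor {
    pub fn new(n: usize) -> Self {
        Self { plan: Arc::new(DftPlan::new(n)) }
    }
    pub fn psd(&self, frame: &[f64]) -> Vec<f64> {
        self.plan.power_spectrum(frame)
    }
}
```

A bit-equivalence check in the spirit of `psd_cached_fft_bit_identical_to_fresh` would compare `f64::to_bits` of each output bin from `psd_fresh` and `Extractor::psd`.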

5.2 DTW Sakoe-Chiba band honored (gesture.rs)

dtw_distance computed the band bounds j_start/j_end but still iterated the full 1..=m row, continue-ing on out-of-band cells — so the band constrained the path but not the work (still O(n·m)). The fix iterates only j_start..=j_end (O(n·band)), resetting just the two boundary-guard cells the recurrence can read, and computes the endpoint reachability (|n−m| ≤ band) at the return site. Result is bit-identical to the full-row version across 12 shapes × 8 band widths (dtw_banded_bit_identical_to_fullrow).

Bench group dtw_sakoe_chiba — full_row (before) vs banded (after):

case	before (full row), median	after (banded), median	speedup
n=m=100, band=5	33.45 µs	13.77 µs	2.43×
n=m=200, band=5	122.32 µs	29.55 µs	4.14×
n=m=200, band=10	159.98 µs	60.19 µs	2.66×

Medians from criterion (warm-up 1 s, 20 samples). Raw (low/median/high): full_row n200_band5 [107.6, 122.3, 146.5] µs vs banded n200_band5 [26.4, 29.5, 33.1] µs. The speedup tracks the inner-loop cell-count ratio m / (2·band+1) — n=m=200, band=5 → 200/11 ≈ 18× fewer cells, but euclidean-distance cost and loop overhead dominate at these sizes so the wall-clock win is ~4× (still the largest at the longest sequence / narrowest band, exactly as the algorithm predicts). It shrinks toward 1× as the band widens to cover the whole matrix (band=10 → 2.66×), and grows with sequence length (band=5: 2.43× at n=100 → 4.14× at n=200).
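The difference between constraining the path and constraining the work can be sketched on scalar sequences. This is an illustration under simplifying assumptions, not the code in `gesture.rs`: the cost is `|a - b|` on `f64` samples instead of a Euclidean distance on feature vectors, the reachability check is an early return, and both function names are hypothetical.

```rust
/// Reference: visits every cell of each row and skips the out-of-band ones.
pub fn dtw_full_row(a: &[f64], b: &[f64], band: usize) -> f64 {
    let (n, m) = (a.len(), b.len());
    let mut d = vec![vec![f64::INFINITY; m + 1]; n + 1];
    d[0][0] = 0.0;
    for i in 1..=n {
        let j_start = i.saturating_sub(band).max(1);
        let j_end = i.saturating_add(band).min(m);
        for j in 1..=m {
            if j < j_start || j > j_end {
                continue; // band limits the path, but the loop still runs m times
            }
            let cost = (a[i - 1] - b[j - 1]).abs();
            d[i][j] = cost + d[i - 1][j].min(d[i][j - 1]).min(d[i - 1][j - 1]);
        }
    }
    d[n][m]
}

/// Banded: iterates only j_start..=j_end, using two rolling rows.
pub fn dtw_banded(a: &[f64], b: &[f64], band: usize) -> f64 {
    let (n, m) = (a.len(), b.len());
    if n.abs_diff(m) > band {
        return f64::INFINITY; // endpoint (n, m) lies outside the band
    }
    let mut prev = vec![f64::INFINITY; m + 1];
    let mut curr = vec![f64::INFINITY; m + 1];
    prev[0] = 0.0;
    for i in 1..=n {
        let j_start = i.saturating_sub(band).max(1);
        let j_end = i.saturating_add(band).min(m);
        // Reset the two guard cells the recurrence can read; everything else
        // outside the band is never touched.
        curr[j_start - 1] = f64::INFINITY;
        if j_end < m {
            curr[j_end + 1] = f64::INFINITY;
        }
        for j in j_start..=j_end {
            let cost = (a[i - 1] - b[j - 1]).abs();
            curr[j] = cost + prev[j].min(curr[j - 1]).min(prev[j - 1]);
        }
        std::mem::swap(&mut prev, &mut curr);
    }
    prev[m]
}
```

Each in-band cell sees the same three neighbour values in the same order in both versions, which is why the results can match bit for bit while the banded loop visits at most `2·band + 1` cells per row.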

Note on the other re-plan sites. spectrogram.rs/bvp.rs plan their FFT once per call and reuse it across all frames/subcarriers (already amortized), so caching there is marginal — deferred (§7). The PSD site was the only one re-planning per frame.

6. Per-module SOTA landscape (evidence-graded)

Grades: MEASURED (the source measured it, ideally with public method/code), CLAIMED (asserted, no reproducible artifact), THEORETICAL (plausible, no published target).

6.1 CSI → CIR (cir.rs — our ISTA/L1 sparse recovery)

Deep-unfolded ISTA / LISTA for CSI→CIR — MEASURED. Learned ISTA unrolling reports ~3 dB NMSE improvement over classical OMP/FISTA for channel/CIR estimation (arXiv 2211.15440; survey 2502.05952). Public methods; numbers measured in-paper. This is our #1 future item (§7) — our cir.rs already builds the sub-DFT Φ that LISTA would make trainable.
Diffusion CIR prior — MEASURED (artifact). github.com/benediktfesl/Diffusion_channel_est ships public weights for a diffusion-model channel-estimation prior. Heavier than our edge budget; tracked, not adopted.
Coherence gating (the §2 gate) — THEORETICAL. Our 0.7/0.3 freq/CIR blend is an engineering heuristic with no published accuracy target; now that it runs, it can finally be A/B-measured.

6.2 Adversarial robustness (adversarial.rs)

Adversarial-robustness eval for WiFi sensing — MEASURED. arXiv 2511.20456 + the Wi-Spoof benchmark provide a measured evaluation protocol for spoofed/injected CSI. Our detector's physical-plausibility checks (consistency/Gini/temporal/energy) are in the same spirit; adopting Wi-Spoof as an external benchmark is a §7 item. (The §3 NaN fix is a precondition: a detector that NaN-bypasses can't be benchmarked honestly.)

6.3 Multi-AP / multistatic fusion (multistatic.rs)

Bayesian multi-AP fusion — CLAIMED. arXiv 2512.02462 proposes a Bayesian fusion across APs; no code released, numbers self-reported. Our attention-weighted fusion is a different (cheaper) mechanism; tracked as a comparison target, not adopted.

6.4 RF intention-lead / pre-movement (intention.rs) — THEORETICAL

The 200–500 ms pre-movement "lead signal" framing has no published commodity-WiFi target we can grade. Honestly THEORETICAL; no work item.

7. Decision, roadmap, and the deferred-findings backlog

7.1 Accepted now (this milestone)

The §2–§5 fixes are ACCEPTED and committed: dead CIR gate fixed, NaN bypass fixed, window trio fixed, two measured perf wins. The misleading calibration dead branch (§7.4 #2) is also removed. All cargo test -p wifi-densepose-signal --no-default-features runs (and --features cir) are green; Python proof PASS.

7.2 Top accepted-future item — LISTA-for-CIR (NOT implemented here)

Unroll the existing ISTA in cir.rs into trainable layers (LISTA). Effort: M. The sensing matrix Φ and the ISTA recurrence already exist; LISTA replaces the fixed step size / threshold with per-layer learned parameters over a fixed unroll depth. Measured target to beat: ~3 dB NMSE over OMP/FISTA (arXiv 2211.15440 — MEASURED). Proposed, not built in Milestone 0.

7.3 Other graded-future items

Adopt Wi-Spoof (arXiv 2511.20456, MEASURED) as the external adversarial benchmark for adversarial.rs.
Evaluate the diffusion CIR prior (public weights, MEASURED) as an offline quality ceiling — not an edge target.
Bayesian multi-AP fusion (2512.02462, CLAIMED) — comparison only, pending released code.

7.4 Deferred Milestone-0 review findings (explicit backlog)

Catalogued so nothing is silently dropped. Priority: P1 correctness-adjacent, P2 perf, P3 clarity/style.

Milestone-1 update (2026-06-13): the four P1 backlog items (#1, #9, #10, #13) are now cleared. #1 and #10 are RESOLVED (MEASURED); #9 and #13 are RESOLVED-PARTIAL (DATA-GATED: de-magicked + boundary-tested, operating values unchanged). ~41 P2/P3 items remain deferred. Each fix is pinned by a regression test that fails on the old behaviour (commits fd32f094a, 4a9f2bcf4, d672fa602, 5193f6369); workspace --no-default-features green, Python proof unchanged (bit-exact).

#	Module	Finding	Pri	Why deferred
1	cir.rs ~937	`phase_variance` uses linear variance on wrapped angles (doc says "variance of phase angles") — spuriously inflates near ±π	P1	RESOLVED (`fd32f094a`) — metric MEASURED, threshold DATA-GATED. Replaced with Mardia's circular variance V = 1 − R̄ ∈ [0,1], invariant to the cluster's position on the circle (branch-cut artefact gone). Guard re-derived against the bounded metric via named const `GHOST_TAP_CIRCULAR_VARIANCE_MAX = 0.99` (fires only when R̄ ≤ 0.01 — essentially uniform phase). The threshold value is DATA-GATED: a clean single-path ramp also sweeps the circle, so V alone can't separate clean from unsanitized without labelled frames — the default is deliberately conservative (strictly more permissive at the wrap boundary than the buggy linear guard). Fails-on-old: `phase_variance_circular_not_fooled_by_branch_cut` (old linear variance > TAU on wrap-straddling phases while circular V≈0, guard no longer trips), `phase_variance_circular_is_bounded_and_extremal`.
2	calibration.rs ~311	`subtract_in_place` had a vacuous `if active_input {ki} else {ki}` branch implying a full-FFT→bin remap that didn't exist	P3	Resolved here (branch removed, sequential-convention documented to match the sibling `extract_first_stream`). Listed for visibility — behavior unchanged.
3	spectrogram.rs / bvp.rs	FFT planner built once-per-call (already amortized across frames)	P2	Marginal vs the per-frame PSD site; cache if these become hot.
4	features.rs ~347	Doppler FFT planner planned once per call, reused across subcarriers	P2	Already amortized within the call.
5	multistatic.rs	`node_attention_weights` recomputes consensus/softmax each call; no SIMD	P2	Needs a bench before touching; not obviously hot.
6	tomography.rs	ISTA L1 solver re-allocates voxel buffers per solve	P2	Bench first.
7	pose_tracker.rs	Kalman gain matrices reallocated per update	P2	Bench first.
8	field_model.rs	SVD recomputed on every perturbation extract	P2	Incremental SVD is a real project, not a micro-fix.
9	coherence.rs / coherence_gate.rs	Z-score thresholds are magic constants, untested at boundaries	P1	RESOLVED-PARTIAL (`5193f6369`) — DATA-GATED. De-magicked `classify_drift` (`DRIFT_STABLE_SCORE=0.85`, `DRIFT_STEP_CHANGE_MAX_STALE=10`) and the `coherence_gate.rs` defaults (`DEFAULT_ACCEPT_THRESHOLD`/`…REJECT…`/`…MAX_STALE_FRAMES`/`…PREDICT_ONLY_NOISE`) into named, documented consts marked EMPIRICAL DEFAULT; added at/just-below/just-above boundary tests (`classify_drift__boundary`) + `_consts_unchanged_from_literals`. Operating values explicitly NOT changed — defensible values still need labelled stable/drifting traces. The gate already exposed these via `GatePolicyConfig` (config seam).
10	longitudinal.rs	Welford update not numerically guarded for n=0	P1	RESOLVED (`4a9f2bcf4`) — MEASURED. The shared `WelfordStats` (`field_model.rs`, consumed by longitudinal.rs) `count < 2` guards already prevent the n=0 NaN / n=1 div0 / `(count−1)` underflow, but the boundary was untested. Added `welford_finite_at_n0_and_n1` (finite + documented 0.0 sentinel at n=0/n=1). Fails-on-old proof: removing the `sample_variance` guard makes the test panic with "attempt to subtract with overflow" at the `(count − 1)` underflow.
11	cross_room.rs	Fingerprint hash collisions unhandled	P2	Low collision prob; needs design.
12	gesture.rs	`euclidean_distance` no length-mismatch guard	P3	Caller-enforced; add `debug_assert`.
13	adversarial.rs	Gini/consistency thresholds are magic constants	P1	RESOLVED-PARTIAL (`d672fa602`), DATA-GATED. Lifted the bare literals in `check`/`check_consistency` (`FIELD_MODEL_GINI_VIOLATION=0.8`, `ENERGY_RATIO_HIGH_VIOLATION=2.0`, `ENERGY_RATIO_LOW_VIOLATION=0.1`, `CONSISTENCY_ACTIVE_FRACTION_OF_MEAN=0.1`, `SCORE_W_`) into named, documented consts marked EMPIRICAL DEFAULT; added at/just-below/just-above boundary tests (`energy_ratio_high_boundary`, `energy_ratio_low_boundary`, `field_model_gini_boundary`, `consistency_active_fraction_boundary`) + `tuning_consts_unchanged_from_literals`. Operating values explicitly NOT changed: defensible values still need labelled spoofed/clean CSI (Wi-Spoof, §6.2/§7.3). Bumping a const fails a boundary test (verified).
14	cir.rs	`fft_operator` path changes the witness hash (documented) — no test that it's numerically close to dense	P2	Add a tolerance test.
15	multistatic.rs	`cir_gate_coherence` only estimates the first node/channel; multi-node CIR consensus unused	P2	Design item (which node's CIR is authoritative?).
16	phase_align.rs	Iterative LO offset estimation has no convergence cap test	P2	Add iteration-cap test.
17	hampel.rs	Window edge handling at series boundaries	P3	Cosmetic.
18	motion.rs	Threshold constants undocumented	P3	Doc-only.
19	csi_ratio.rs	Division guard relies on `1e-12` epsilon; no test	P2	Add boundary test.
20	spectrogram.rs	`compute_multi_subcarrier_spectrogram` re-plans per subcarrier via `compute_spectrogram`	P2	Hoist the planner (relates to #3).
21–45	(assorted)	Remaining clarity/doc/magic-constant/missing-boundary-test findings across `ruvsense/*`, `features.rs`, `motion.rs`	P3	Bulk-addressable in a dedicated "test-the-boundaries + de-magic-constant" follow-up; not high-leverage individually.
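Backlog item #1 turns on the difference between linear variance and Mardia's circular variance V = 1 − R̄. The sketch below illustrates that difference only; the function names and the 0.0 return for empty input are assumptions, not the `cir.rs` implementation.

```rust
/// Mardia's circular variance V = 1 - R_bar, where R_bar is the length of the
/// mean unit phasor. V lies in [0, 1] and does not depend on where the
/// cluster sits on the circle.
pub fn circular_variance(phases: &[f64]) -> f64 {
    if phases.is_empty() {
        return 0.0; // assumed sentinel for empty input
    }
    let n = phases.len() as f64;
    let (s, c) = phases
        .iter()
        .fold((0.0_f64, 0.0_f64), |(s, c), p| (s + p.sin(), c + p.cos()));
    let r_bar = (s * s + c * c).sqrt() / n;
    (1.0 - r_bar).clamp(0.0, 1.0)
}

/// Population variance that treats wrapped angles as plain numbers.
/// Shown only to reproduce the branch-cut artefact.
pub fn linear_variance(x: &[f64]) -> f64 {
    if x.is_empty() {
        return 0.0;
    }
    let n = x.len() as f64;
    let mean = x.iter().sum::<f64>() / n;
    x.iter().map(|v| (v - mean) * (v - mean)).sum::<f64>() / n
}
```

For a tight cluster straddling ±π, for example phases of π − 0.01 and −π + 0.01, the linear variance is close to π² while the circular variance is close to 0, which is the false trip the item describes.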

Horizon-ledger one-liner. Milestone-0 DONE: dead CIR gate (FIXED+proved), NaN/inf adversarial bypass (FIXED+proved), divide-by-(n−1) window trio (FIXED+proved), calibration dead-branch (FIXED), PSD FFT-planner cache (MEASURED), DTW band (MEASURED). Milestone-1 DONE (2026-06-13): all four P1 backlog items cleared — circular phase variance #1 (RESOLVED/MEASURED metric, DATA-GATED threshold), Welford n=0 guard #10 (RESOLVED/MEASURED), threshold magic-constants #9 & #13 (RESOLVED-PARTIAL/DATA-GATED — de-magicked + boundary-tested, values unchanged). DEFERRED to follow-up: the ~41 remaining P2/P3 findings in §7.4 — none silently dropped.
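For the Welford item (#10), the guarded boundary looks roughly like this. The struct name follows the text, but the field layout and method signatures are assumptions for illustration rather than the `field_model.rs` definitions.

```rust
/// Running mean/variance via Welford's algorithm (illustrative layout).
#[derive(Default)]
pub struct WelfordStats {
    count: u64,
    mean: f64,
    m2: f64,
}

impl WelfordStats {
    pub fn update(&mut self, x: f64) {
        self.count += 1;
        let delta = x - self.mean;
        self.mean += delta / self.count as f64;
        self.m2 += delta * (x - self.mean);
    }

    /// Sample variance with a 0.0 sentinel for fewer than two samples.
    /// Without this guard, `count - 1` underflows at n = 0 and divides by
    /// zero at n = 1.
    pub fn sample_variance(&self) -> f64 {
        if self.count < 2 {
            return 0.0;
        }
        self.m2 / (self.count - 1) as f64
    }
}
```

A boundary test in the style of `welford_finite_at_n0_and_n1` would assert that `sample_variance` is finite and equal to the sentinel before any update and after exactly one.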

8. Consequences

Positive: the ADR-134 CIR gate is alive for the first time in production; the adversarial detector can no longer be NaN-bypassed; three latent divide-by-zero NaN sources are gone; the per-frame PSD path and gesture DTW are measurably faster with bit-identical output; the SOTA landscape and a concrete LISTA-for-CIR roadmap are graded and recorded.
Negative / honest limits: canonical56() models the canonical grid as a contiguous 56-tone band — a reasonable physical interpretation of a resampled grid, but not a literal hardware tone map; the CIR gate still uses only the first node's CIR (#15). The phase_variance metric is now correct (Mardia circular variance, Milestone-1 #1), so the branch-cut false-trip is gone — but its ghost-tap threshold (GHOST_TAP_CIRCULAR_VARIANCE_MAX = 0.99) is a conservative DATA-GATED default, not a calibrated operating point, and still awaits labelled sanitized/unsanitized frames to tune. Likewise the de-magicked coherence/adversarial thresholds (#9/#13) keep their pre-existing empirical values pending labelled calibration.
Neutral: no public API removed; with_cir_ht20() kept (warned); files stay scoped; new bench is additive.
