ADR-260 (Accepted — v0.1 reference stack): RuField, the open specification
for camera-free multimodal field sensing — one FieldEvent/FieldTensor/
FusionGraph/PrivacyClass/ProvenanceReceipt model above WiFi CSI/CIR/BFLD,
UWB, BLE Channel Sounding, mmWave radar, ultrasound, subsonic, infrared,
and quantum sensors.
Published standalone as github.com/ruvnet/rufield and vendored here as the
vendor/rufield submodule (the vendor/rvcsi pattern — not a v2/ workspace
member). v0.1 reference stack: 6 crates, 60 tests/0 failed, clippy-clean.
All benchmark metrics SYNTHETIC (simulator ground truth, no hardware).
Co-authored-by: ruv <ruvnet@gmail.com>
* fix(firmware): gate phantom persons + add presence hysteresis (#998, #996)
Two ESP32 edge-vitals logic bugs in edge_processing.c. Both are
robustness/logic fixes — NOT validated-accuracy claims. True count/PCK
vs labelled ground truth remains hardware/data-gated (COM9 ESP32-S3).
#998 — n_persons over-counted (reported 4 for one person):
update_multi_person_vitals() split top-K subcarriers into top_k_count/2
groups and marked EVERY group active, so one body's multipath always
read the full EDGE_MAX_PERSONS. Added two pure, host-testable helpers:
- count_distinct_persons(): per-group energy gate
(EDGE_PERSON_MIN_ENERGY_RATIO) + spatial dedup
(EDGE_PERSON_MIN_SC_SEP) so weak/adjacent multipath groups don't
count as separate bodies. Strongest group always counts (>=1).
- person_count_debounce(): a gated count must hold
EDGE_PERSON_PERSIST_FRAMES consecutive frames before it's emitted,
so a single noisy frame can't promote a phantom.
The active flags now mark only the strongest stable_count groups.
#996 — presence flag flickered at ~50cm despite high presence_score:
the bare `score > threshold` compare chattered on a noisy score
(field-observed 2.6-26.7 frame-to-frame). Replaced with a Schmitt
trigger + clear-debounce (presence_flag_update): assert above
threshold, hold in the dead band down to threshold *
EDGE_PRESENCE_HYST_RATIO, clear only after EDGE_PRESENCE_CLEAR_FRAMES
consecutive sub-low frames. presence_score itself is unchanged and
still emitted for consumer-side thresholding.
All thresholds are named, documented constants in edge_processing.h.
Firmware builds clean for esp32s3 (idf.py build RC=0).
Co-Authored-By: claude-flow <ruv@ruv.net>
* test(firmware): host C99 tests for vitals count + presence logic (#998, #996)
test/test_vitals_count_presence.c pins the two fixes with deterministic
host-buildable tests (no ESP-IDF needed). 13 cases / 22 assertions, all
passing under gcc 13 -Wall -Wextra:
#998 count gate: single strong signature + multipath -> count==1;
two well-separated -> 2; two strong-but-adjacent -> 1 (dedup);
no signal -> 0; three well-separated -> 3.
#998 debounce: transient spike rejected; sustained change accepted;
flapping count stays stable.
#996 presence: dithering trace -> stable flag (no flicker); brief dips
held by clear-debounce; genuine departure clears within hold window;
dead-band holds state.
The named tuning constants are #include'd from the real
edge_processing.h so the test and firmware can never disagree on
thresholds. `make run_vitals` / `make host_tests` added; binaries
gitignored.
Hardware-gated caveat documented in the test header: these pin the
decision LOGIC; the exact energy/separation/hysteresis values that best
match a real room vs labelled occupancy remain on-device tuning.
Co-Authored-By: claude-flow <ruv@ruv.net>
* docs: record ESP32 vitals count/presence fixes (#998, #996)
CHANGELOG [Unreleased] Fixed: root cause + fix + named constants + test
+ explicit hardware/data-gated caveat for both bugs.
ADR-021 Implementation Notes: dated 2026-06 entry noting the edge-path
person-count + presence-flicker fixes are boolean/count emission-logic
fixes, not a validated-accuracy claim; thresholds pending on-device
calibration.
Co-Authored-By: claude-flow <ruv@ruv.net>
* fix(sensing-server): emit real field-derived person position/motion to /ws/sensing (#1050)
The Observatory 3D figure never animated because the sensing_update WS
frame carried no per-person position/motion_score/pose — only image-space
keypoints. The FigurePool/PoseSystem (and demo-data.js's own contract)
animate each figure from persons[i].position (room-world), .motion_score
(0..100), and .pose; none were on the live stream.
Honest scope (Case 2): the pipeline has no calibrated per-person room
localizer or per-person skeletal pose. New field_localize module extracts
the strongest peak(s) from the real signal_field grid (subcarrier
variances x motion-band power) and maps the peak cell to Observatory world
coords with the exact _buildSignalField transform. motion_score is the
measured motion_band_power passed through; pose is set only from a real
aggregate posture estimate, else None (never a fabricated skeleton).
Empty/below-threshold field -> persons: [] (no phantom); present person
with no resolvable peak keeps position [0,0,0], not invented coords.
attach_field_positions runs after the tracker step at all five broadcast
sites. New position/motion_score/pose fields added to both PersonDetection
structs. No UI change needed — the Observatory already reads these fields.
Tests: field_localize peak/coordinate/empty/separation units +
observatory_persons_field_position_tests (known-peak -> emitted position,
empty-room -> no phantom, pose real-or-None, below-threshold honesty).
sensing-server bin 441->451, 0 failed.
Co-Authored-By: claude-flow <ruv@ruv.net>
* docs(changelog): record #1050 Observatory persons position/motion fix
Co-Authored-By: claude-flow <ruv@ruv.net>
* perf(signal): hoist FFT planner across subcarriers (ADR-154 §7.4 #20)
compute_multi_subcarrier_spectrogram called compute_spectrogram once per
subcarrier, and each call built a fresh FftPlanner + re-planned the same
length-window_size FFT. Hoist the plan + window out of the per-subcarrier
loop via a new compute_spectrogram_with_plan core that takes a pre-planned
Arc<dyn Fft> and pre-built window. compute_spectrogram delegates to it
(unchanged behaviour); the multi-subcarrier path plans once and reuses.
MEASURED-HOT (dsp_perf_bench, this box): at 56 subcarriers, window 128,
fresh-planner-per-subcarrier 467.88 µs -> hoisted-plan 254.75 µs = 1.84x;
window 256: 627.27 µs -> 448.39 µs = 1.40x. Plan-forward cost alone is
~1.86 µs (w128), x56 subcarriers ~= the removed delta.
Output is bit-identical: multi_subcarrier_hoisted_plan_bit_identical
compares f64::to_bits of every spectrogram value + freq/time resolution
against the per-call fresh-planner path across all 4 window functions x
{power,magnitude} on a 56-subcarrier matrix. The numeric STFT body is the
old loop verbatim; only plan/window construction is lifted.
Co-Authored-By: claude-flow <ruv@ruv.net>
* test(signal): boundary/tolerance tests for ADR-154 §7.4 #14#16#19
Three "+ test" backlog gaps closed — pure additions, no behaviour change
(phase_align refactor is internal: estimate_phase_offsets still returns the
identical offset vector; a counted core is split out only to observe the
iteration count).
#14 cir.rs fft_operator — fft_operator_within_tolerance_of_dense_canonical56:
the opt-in FFT Φ/Φᴴ path changes the witness hash, so pin it numerically
CLOSE to the dense path (not silently divergent). Asserts the full Cir
output (every tap within 1e-2·dominant, dominant idx/ratio, active_tap_count,
ranging_valid, rms_delay_spread) on the production canonical-56 config
across τ ∈ {20,50,90} ns. Extends the existing HT20/single-τ test.
#16 phase_align.rs — refinement_terminates_at_iteration_cap_when_not_converging:
forces non-convergence (tolerance=0.0, unreachable) and asserts the loop
runs exactly max_iterations then returns — proving the cap, not convergence,
bounds the loop (no infinite spin). Companion
refinement_converges_before_cap_on_easy_input proves the cap is an upper
bound, not the only exit.
#19 csi_ratio.rs — ratio_finite_at_and_below_1e_12_epsilon: the module
implements the CSI ratio as the conjugate product H_i·conj(H_j) (no
division), so it is finite even at/below the 1e-12 magnitude boundary a
naive H_i/H_j division would need an epsilon to guard. Pins finiteness +
bit-exact conjugate product at the boundary (zero target → zero, never
inf/NaN), through the amplitude/phase extraction.
cargo test -p wifi-densepose-signal --no-default-features --lib: 447 passed,
0 failed; --features cir --lib: 447 passed, 0 failed.
Co-Authored-By: claude-flow <ruv@ruv.net>
* docs(adr-154): record Milestone-2 P2-perf verdicts + boundary tests (§7.4)
§7.4: #20 MEASURED-HOT (1.40–1.84× spectrogram FFT-plan hoist, bit-identical);
#5/#6/#7 MEASURED-NULL (benched, not hot, left as-is — sub-µs / stack-only /
alloc-once); #8 MEASUREMENT-ONLY (per-call 56×56 eigh cost; eigenvalue/BLAS
backend un-buildable on this Windows host, number deferred to a BLAS box, NOT
fabricated; also corrects the finding — extract_perturbation reuses cached
modes, the recompute is in estimate_occupancy). #14/#16/#19 RESOLVED (tolerance
/ convergence-cap / epsilon-boundary tests). Updated §7.4 intro + Horizon-ledger
(deferred count 41→36). CHANGELOG [Unreleased] entry added.
Co-Authored-By: claude-flow <ruv@ruv.net>
* bench(signal): committed P2 bench-first benches (ADR-154 §7.4 #5/#6/#7/#8/#20)
New dsp_perf_bench.rs backs every Milestone-2 perf verdict with a committed
criterion bench — no speedup claimed without a before/after number here, and
a benched NULL is the proof a micro-opt was unnecessary (the §5.x "already
amortized" pattern). Registered in Cargo.toml [[bench]].
MEASURED (this box, criterion medians):
#20 spectrogram_multi_subcarrier (fresh vs hoisted plan):
MEASURED-HOT — 467.88→254.75 µs (1.84x) @ sc56/w128; 627.27→448.39 µs
(1.40x) @ sc56/w256. Optimized in the prior commit.
#5 multistatic_attention/weights: MEASURED-NULL — 181 ns (2 nodes) ..
848 ns (8 nodes); sub-µs, no hot-path alloc — left as-is.
#6 tomography_reconstruct/solve: MEASURED-NULL — 47.5 µs (16 links) /
60.4 µs (32 links) for a full 50-iter ISTA solve; the 2 per-solve voxel
buffers (~4 KB) are negligible vs O(iters·links·voxels) compute, and
reconstruct(&self) reuses them across iterations already — left as-is.
#7 pose_kalman_update/cycles: MEASURED-NULL — 150 ns (17 kpts) / 2.82 µs
(170); the Kalman "gain matrices" are fixed-size STACK arrays
([[f32;3];6]), zero heap — nothing to reuse — left as-is.
#8 field_model_occupancy (eigenvalue feature): MEASUREMENT-ONLY — quantifies
the per-call n×n eigendecomposition cost; incremental SVD is a sized
future project, not attempted (number recorded in ADR-154 §7.4).
Reproduce:
cargo bench -p wifi-densepose-signal --no-default-features --bench dsp_perf_bench
cargo bench -p wifi-densepose-signal --bench dsp_perf_bench # adds #8
Cargo.lock: dev-dep (criterion/clap) graph + crate version bumps from the
build; no runtime-dependency change.
Co-Authored-By: claude-flow <ruv@ruv.net>
* fix(hardware): constant-time HMAC sync-beacon tag compare (ADR-157 §B4)
AuthenticatedBeacon::verify compared the 8-byte HMAC-SHA256 tag with
`self.hmac_tag == expected`, which short-circuits on the first differing
byte and leaks, via verification latency, how many leading bytes a forged
tag matched — a byte-by-byte tag-recovery oracle (~256·N trials vs 256^N).
Replace with a hand-rolled branch-free `constant_time_tag_eq`: XOR-accumulate
every byte difference into a single u8 with no early exit, compare to zero
once. `#[inline(never)]` + `core::hint::black_box(diff)` resist the optimizer
reintroducing a short-circuit or a non-constant-time memcmp; length mismatch
returns false without inspecting contents. No new dependency — ADR-157 had
deferred this only to avoid the `subtle` crate; a fixed 8-byte compare needs
none.
Test (hard gate): tag_compare_is_constant_time_shape — equal / first-differ /
last-differ / all-differ / length-mismatch + end-to-end verify() last-byte
tamper. Proven to fail on a last-byte-skipping constant-time bug. A coarse
timing smoke check (tag_compare_timing_invariance_smoke) is #[ignore]d to
avoid CI flakiness. Grade MEASURED (constant-time construction).
ADR-157 §8 §B4 → RESOLVED. wifi-densepose-hardware: 164 passed / 0 failed.
Co-Authored-By: claude-flow <ruv@ruv.net>
* feat(wifiscan): MEASURE native wlanapi.dll vs netsh throughput (ADR-157 §5 #4)
ADR-157 §5 #4 recorded the native wlanapi.dll multi-BSSID fast path as
"asserted but NOT implemented; live scanner is the ~2 Hz netsh shim". Audit
finding: that status is stale — wlanapi_native::scan_native already implements
the real WlanOpenHandle → WlanEnumInterfaces → WlanGetNetworkBssList →
WlanFreeMemory/WlanCloseHandle FFI (handle cleanup on all exits, length-bounded
buffer walks, #[cfg(windows)] with typed Unsupported off-Windows), and
WlanApiScanner::scan_instrumented already wires it native-first with a netsh
fallback. The missing piece was an honest MEASUREMENT.
Add benchmark_backend(backend, window): drives one specific backend over a
fixed wall-clock window so netsh is timed independently (the existing
benchmark() picks native-first and so never measures netsh on a box where
native works). Returns None for an unavailable native path (honest negative,
not a fabricated number).
MEASURED on this box (Intel Wi-Fi 7 BE201 320MHz, 2026-06-13), 10 s window:
native 21.42 Hz vs netsh 3.84 Hz = 5.57× (mean 5.0 BSSIDs/scan each).
native-only run: 18.0 Hz. 50/50 back-to-back native scans, no handle leak.
A real positive result — NOT a fabricated 10×. Achieved 21.4 Hz is in the
asserted >2 Hz regime, below the asserted 10–20 Hz upper bound.
Tests (live-WLAN, #[ignore] for CI, RUN here):
measure_native_vs_netsh_throughput, native_scans_dont_leak_handles,
measure_native_scan_rate. Non-ignored pin native_scan_runs_real_ffi_on_windows
(pre-existing) stays green. wifi-densepose-wifiscan: 94 passed / 0 failed.
ADR-157 §5 #4 + §8 → MEASURED (was ACCEPTED-FUTURE / CLAIMED-unmeasured).
Co-Authored-By: claude-flow <ruv@ruv.net>
* refactor(train): hoist canonical PCK/OKS to un-gated metrics_core; fold test_metrics onto production (ADR-155 M1 §8)
ADR-155 §8 deferred item: test_metrics.rs reference kernels validated
production against their OWN reimplementation — a test that cannot catch a
canonical-impl bug (both could be wrong the same way).
- Extract canonical_torso_size / pck_canonical / oks_canonical / sigmas /
bounding_box_diagonal into a new NON-tch-gated `metrics_core` module, so
the single metric definition is reachable under
`cargo test --no-default-features` (the `metrics` module is tch-gated).
`metrics` re-exports every item → still exactly ONE implementation.
- Rewrite tests/test_metrics.rs to assert the PRODUCTION pck_canonical /
oks_canonical equal hand-computed fixtures (not a reimplementation):
canonical_pck_matches_hand_computed_fixture (corr=3/total=4/pck=0.75),
hip↔hip normalizer pin, zero-visible⇒0.0, OKS perfect⇒1.0, fake-Gold pin.
- Keep an INDEPENDENT raw-threshold reference kernel only as a differential
cross-check: test_kernel_agrees_with_canonical asserts it AGREES with
canonical where torso==1.0 (genuine cross-check, not duplication).
Grade: MEASURED. test_metrics 10→12 tests, 0 failed.
Co-Authored-By: claude-flow <ruv@ruv.net>
* fix(sensing-server): relabel divergent live PCK/OKS so they're never conflated with canonical (ADR-155 M1 §2.1/§8 Goal C)
Goal C named training_api.rs:804 (torso-HEIGHT PCK). Auditing it surfaced
TWO findings the ADR-155 §1 table missed:
1. training_api.rs is an ORPHAN file — not declared `mod` in lib.rs OR main.rs,
so it does NOT compile into the crate. It does not drive the live server.
2. The REAL live `best_pck`/`best_oks` (main.rs training path → RVF metadata
JSON read by model_manager.rs) come from trainer.rs:
- `pck_at_threshold` = RAW-threshold PCK, NO torso normalization (the most
divergent kind), printed/serialized as bare "PCK@0.2".
- `oks_map` calls `oks_single(area=1.0)` = the EXACT fake-Gold pattern
ADR-155 §2.1 claimed closed elsewhere — still live here, inflating best_oks.
Resolution = RELABEL (torso/raw math is load-bearing on different data; the
pub fns can't be renamed without breaking API; sensing-server has no train/
ndarray dep). Honest unify is a tracked §8 backlog item.
- training_api.rs: `compute_pck` → `compute_pck_torso_height` + divergence doc;
val_pck/best_pck/val_oks struct fields documented as torso-HEIGHT proxies;
logs say `pck_torso_h@0.2`. Test torso_pck_is_labelled_distinctly_from_canonical.
- trainer.rs (LIVE): `pck_at_threshold` documented raw-unnormalized; `oks_map`
area=1.0 flagged fake-Gold; test pck_at_threshold_is_raw_unnormalized_not_canonical.
- main.rs: live print relabelled `pck_raw@0.2` / `oks_map(area=1.0 proxy)`.
No wire-format field renames (back-compat); no pub-API rename (no silent break).
Grade: MEASURED (relabel + divergence pinned). sensing-server 450→451 lib tests, 0 failed.
Co-Authored-By: claude-flow <ruv@ruv.net>
* docs(adr-155): mark §8 metric items RESOLVED + audit map + honest §1 under-count correction (M1b Goals A/D)
- §8.1: full PCK/OKS audit map (every def: file:line, basis, canonical/
legacy/distinct), the two §8 items marked RESOLVED with resolution+why.
- Honest finding: §1's "seven divergent metrics" was an UNDER-count —
sensing-server's LIVE trainer.rs has a raw-unnormalized PCK and an
area=1.0 fake-Gold OKS the table omitted, and the file §8 named
(training_api.rs) is orphaned dead code. §9 honest-limits updated.
- Goal D: metrics.rs *_v2 variants confirmed caller-less + deprecated;
noted for future cleanup, NOT deleted (public API, tch-gated).
- CHANGELOG [Unreleased] Fixed entry.
Co-Authored-By: claude-flow <ruv@ruv.net>
* feat(ruvector): RaBitQ Pass-2 randomized rotation + topk bugfix (ADR-156 §8)
Implements the deferred "Multi-bit / Extended RaBitQ Pass 2" backlog item
from ADR-156 §8: a deterministic randomized orthogonal rotation applied
before sign-quantization, the published RaBitQ construction (Gao & Long,
SIGMOD 2024).
Rotation construction: Fast Hadamard Transform + seeded ±1 sign flips
("HD" / randomized Hadamard), O(d log d) time and O(d) memory — a dense
d×d rotation is O(d²) and infeasible at the 65,535-d the wire format
provisions for. Pads to the next power of two; SplitMix64 seeds the sign
stream so index-time and query-time rotations are bit-identical.
API is additive and backward-compatible: Pass 1 (`from_embedding`) is
untouched; Pass 2 is opt-in via `Sketch::from_embedding_rotated` and
`SketchBank::with_rotation` (+ `insert_embedding` / `topk_embedding` /
`novelty_embedding` helpers that rotate consistently). Default behaviour
is unchanged.
While building the Pass-2 coverage harness, found and fixed a PRE-EXISTING
correctness bug in `SketchBank::topk`: the n>k heap path used
`BinaryHeap<Reverse<(d,id)>>` (a min-heap) but treated its peek as the
max, so it returned the k FARTHEST sketches as "nearest". The shipped unit
tests only exercised the n≤k fast path, so it went unnoticed. Fixed to a
plain max-heap; pinned by `topk_heap_path_returns_nearest` and
`tight_clusters_give_high_coverage_with_overfetch` (the latter measured
0.072 on the old code).
New tests (+17, 100→117 in the crate): rotation determinism/norm-preservation
(`rotation_is_deterministic_for_seed`, `rotation_preserves_norm`), Pass-2
shape-compatibility, `pass2_coverage_not_worse_than_pass1`, and a
deterministic coverage report.
MEASURED top-K coverage (anisotropic planted-cluster fixture, cosine ground
truth; dim=128 N=2048 K=8 64 clusters noise=0.35 128 queries):
candidate_k=K=8 : Pass1 36.13% -> Pass2 46.39% (both << 90% bar)
candidate_k=24 : Pass1 83.89% -> Pass2 91.60% (Pass2 clears 90%)
candidate_k=32 : Pass1/Pass2 100%
Honest result: rotation consistently helps (+10pp at strict K), but neither
pass clears the ADR-084 90% bar at candidate_k==K on this distribution.
Pass 2 reaches 90% only with ~3x over-fetch (the ADR-084 "candidate set"
deployment pattern). Multi-bit Pass 3 evaluated separately.
Co-Authored-By: claude-flow <ruv@ruv.net>
* feat(ruvector): multi-bit Pass-3 experiment + ADR-156/084 measured results
Adds the multi-bit half of the ADR-156 §8 "Multi-bit / Extended RaBitQ"
item as a MEASURED experiment (coverage::measure_multibit): rotate, then
b-bit uniform scalar-quantize each coord, rank by L1 over codes — the
natural multi-bit generalization of hamming. Measures the bit/coverage
tradeoff the backlog item asked for.
MEASURED at the strict bar (candidate_k=K=8, anisotropic planted-cluster
fixture, cosine ground truth):
Pass1 (1-bit, no rot) 36.13% 16 B/vec
Pass2 (1-bit, rot) 46.39% 16 B/vec
Pass3 (rot, 2-bit) 54.39% 32 B/vec
Pass3 (rot, 3-bit) 66.70% 48 B/vec
Pass3 (rot, 4-bit) 74.22% 64 B/vec
Honest: multi-bit monotonically helps but even 4-bit (4x memory) reaches
only 74% at the strict bar — neither rotation nor <=4-bit multi-bit clears
the strict-K 90% bar on this distribution. The bar is met via over-fetch
(Pass2 @ candidate_k=24). Tests: multibit_tradeoff_report,
multibit_1bit_matches_pass2_approx (+ sanity that 1-bit ~= Pass-2).
Docs:
- ADR-156 §8 item #2 marked RESOLVED-PARTIAL; §5 #2 grade CLAIMED ->
MEASURED-on-our-hardware; new §10 with full measured tables, the topk
bugfix disclosure, and graded deferred sub-items.
- ADR-084: "Pass 2" section answering the rotation open-question with
measured numbers + the topk bug note.
- CHANGELOG [Unreleased]: Added (Pass-2 milestone) + Fixed (topk heap).
Co-Authored-By: claude-flow <ruv@ruv.net>
* fix(signal): circular phase variance for ghost-tap guard (ADR-154 §7.4 #1)
`phase_variance` computed a LINEAR sample variance over phase angles that
wrap at ±π, so a tightly-clustered set straddling the branch cut reported
spuriously HIGH dispersion — false-tripping the `> TAU` ghost-tap guard on
real, tightly-clustered CIR taps.
Replace with Mardia's circular variance V = 1 − R̄, bounded [0,1] and
invariant to where the cluster sits on the circle. Re-derive the guard
against the bounded metric via a named const
`GHOST_TAP_CIRCULAR_VARIANCE_MAX` (the old TAU-scaled threshold is
meaningless on [0,1]).
Grade: metric fix MEASURED; threshold value DATA-GATED — a clean single-path
ramp also sweeps the circle, so V alone cannot separate clean from
unsanitized without labelled frames. Conservative default (0.99) errs toward
never false-rejecting, strictly more permissive at the wrap boundary than the
buggy linear guard.
Fails-on-old test: `phase_variance_circular_not_fooled_by_branch_cut` —
inlines the old linear variance to show it exceeds TAU on wrap-straddling
phases while circular V≈0 and the guard no longer trips. Plus
`phase_variance_circular_is_bounded_and_extremal` (V∈[0,1], V≈0 identical,
V≈1 uniform).
cargo test -p wifi-densepose-signal --no-default-features --features cir --lib
→ 432 passed, 0 failed.
Co-Authored-By: claude-flow <ruv@ruv.net>
* fix(signal): pin Welford n=0/n=1 finiteness guard (ADR-154 §7.4 #10)
The shared `WelfordStats` (field_model.rs, used by longitudinal.rs and others)
relies on `count < 2` guards in `variance`/`sample_variance`/`std_dev`/
`z_score` to stay finite at the boundaries. The guards existed but the n=0
boundary was UNTESTED — exactly the §4 divide-by-(n−1) family the ADR groups
this with.
Add `welford_finite_at_n0_and_n1` asserting every statistic is finite and
returns the documented sentinel (0.0) at n=0 and n=1, plus load-bearing doc
comments on the two guards.
Fails-on-old proof: with the `sample_variance` guard removed, the test FAILS
with "attempt to subtract with overflow" at the `(self.count - 1)` underflow
(0usize − 1); `variance` would similarly yield 0.0/0.0 = NaN. The guard is
restored; the test pins it so a future regression is caught.
Grade: MEASURED (boundary finiteness is asserted; the guard is the §4-family
fix made testable).
cargo test -p wifi-densepose-signal --no-default-features --lib field_model
→ 22 passed, 0 failed.
Co-Authored-By: claude-flow <ruv@ruv.net>
* refactor(signal): de-magic adversarial thresholds + boundary tests (ADR-154 §7.4 #13)
Lift the bare numeric literals buried in `check`/`check_consistency` into
named, documented module consts (FIELD_MODEL_GINI_VIOLATION=0.8,
ENERGY_RATIO_HIGH_VIOLATION=2.0, ENERGY_RATIO_LOW_VIOLATION=0.1,
CONSISTENCY_ACTIVE_FRACTION_OF_MEAN=0.1, SCORE_W_* weights). VALUES UNCHANGED —
each const equals the original literal; only names + pinning tests are new.
Grade: DATA-GATED. The operating values stay empirical (defensible values need
labelled spoofed/clean CSI — Wi-Spoof, §6.2/§7.3). The de-magicking +
characterization tests are MEASURED: `tuning_consts_unchanged_from_literals`,
`energy_ratio_high_boundary`, `energy_ratio_low_boundary`,
`field_model_gini_boundary`, `consistency_active_fraction_boundary` pin the
decision boundaries at/just-below/just-above each threshold, so a future
data-driven retune is a visible, tested change.
Fails-on-change proof: bumping ENERGY_RATIO_HIGH_VIOLATION 2.0→3.0 makes
`energy_ratio_high_boundary` FAIL (restored). Operating values explicitly
NOT changed.
cargo test -p wifi-densepose-signal --no-default-features --lib ruvsense::adversarial
→ 20 passed, 0 failed.
Co-Authored-By: claude-flow <ruv@ruv.net>
* refactor(signal): de-magic coherence drift/gate thresholds (ADR-154 §7.4 #9)
Lift the bare detection literals in `coherence.rs::classify_drift`
(DRIFT_STABLE_SCORE=0.85, DRIFT_STEP_CHANGE_MAX_STALE=10) and the
`coherence_gate.rs` Default impl (DEFAULT_ACCEPT_THRESHOLD=0.85,
DEFAULT_REJECT_THRESHOLD=0.5, DEFAULT_MAX_STALE_FRAMES=200,
DEFAULT_PREDICT_ONLY_NOISE=3.0) into named, documented consts. VALUES
UNCHANGED. The gate already exposed these via GatePolicyConfig (config seam);
this names + pins the defaults.
Grade: DATA-GATED. Operating values stay empirical (defensible Z-score
thresholds need labelled stable/drifting coherence traces). De-magicking +
boundary tests are MEASURED: `classify_drift_stable_score_boundary`,
`classify_drift_stale_count_boundary` pin the at/just-below/just-above
decisions; `drift_consts_unchanged_from_literals` /
`gate_default_consts_unchanged_from_literals` pin the values. Operating values
explicitly NOT changed.
cargo test -p wifi-densepose-signal --no-default-features --lib ruvsense::coherence
→ 40 passed, 0 failed.
Co-Authored-By: claude-flow <ruv@ruv.net>
* docs(adr-154): mark §7.4 P1 backlog cleared — Milestone-1 (#1,#10 RESOLVED; #9,#13 DATA-GATED)
Update ADR-154 §7.4 backlog rows #1, #9, #10, #13 with commit refs + grades,
the §7.4 intro count (four P1 items cleared, ~41 P2/P3 remain), the
Horizon-ledger one-liner (Milestone-1 DONE), and the §8 honest-limits #1 line
(metric now correct; threshold still DATA-GATED). Add CHANGELOG [Unreleased]
entry.
Grades: #1 RESOLVED (MEASURED metric / DATA-GATED threshold), #10 RESOLVED
(MEASURED), #9 & #13 RESOLVED-PARTIAL (DATA-GATED — de-magicked + boundary
tested, operating values unchanged).
Validation: cargo test --workspace --no-default-features → 2057 passed, 0
failed; wifi-densepose-signal lib → 442 passed (no-default + --features cir);
python archive/v1/data/proof/verify.py → VERDICT: PASS, hash f8e76f21…46f7a
UNCHANGED (CIR ghost-tap guard is not on the deterministic proof path).
Co-Authored-By: claude-flow <ruv@ruv.net>
* fix(sensing-server): stop leaking internal errors in HTTP responses (ADR-080 #2)
Six handlers in `main.rs` serialized the internal error `Display` straight
into the JSON response body, leaking server internals to any client (ADR-080
finding #2, CWE-209; reframed onto the Rust boundary by ADR-164 G11):
- edge_registry_endpoint: a panicked spawn_blocking `JoinError`
("task … panicked") in a 500, and the raw upstream error in a 503
- delete_model / delete_recording / start_recording: std::io::Error
strings carrying OS detail / filesystem paths
- calibration_start / calibration_stop: the FieldModel error chain
New `error_response` module: `internal_error` / `internal_error_json` /
`upstream_unavailable` log the full detail server-side only (tagged with a
correlation id) and return a generic body
(`{"error":"internal_error","correlation_id":…}`) — no `panicked`, no file
paths, no Debug chain. The correlation id lets an operator join a client
report to the exact server log line without ever shipping the detail.
Pinned by 5 error_response tests, incl. a leak-substring guard
(internal_error_body_does_not_leak_detail) verified to FAIL on the reverted
old body (returns the panic message / path / "os error"). The HOMECORE sweep
(ADR-161) covered homecore-server, not this crate.
Co-Authored-By: claude-flow <ruv@ruv.net>
* test(sensing-server): pin XFF-immunity + no-query-token (ADR-080 #1, #3)
Findings #1 (XFF-spoofing bypass) and #3 (JWT-in-URL, CWE-598) were logged
against the Python v1 API but are VERIFIED ABSENT on the current Rust
sensing-server, so they get regression tests rather than redundant fixes:
- #1 XFF: there is no IP-based rate-limiter or IP-allowlist to bypass, and
neither security middleware reads a forwarded header. Added
bearer_auth::xff_header_never_affects_auth_decision (spoofed
X-Forwarded-For never flips a 401<->200 decision) and
host_validation::forwarded_headers_never_bypass_host_allowlist (spoofed
X-Forwarded-Host: localhost never lets Host: evil.com past the allowlist).
- #3 JWT-in-URL: require_bearer reads the token only from the Authorization
header; WS handlers take no query token; the sole Query extractor
(EdgeRegistryParams) is a non-secret refresh flag. Added
bearer_auth::query_string_token_is_never_accepted — ?token= / ?access_token=
in the URL never authenticates (stays 401) while the header path still 200s.
Verified to FAIL when a query-token path is injected into require_bearer.
Co-Authored-By: claude-flow <ruv@ruv.net>
* docs(adr-080): mark P0 security findings #1-#3 RESOLVED; close ADR-164 G11
- ADR-080: Status note + per-finding closure (#1 XFF and #3 JWT-in-URL
verified absent + regression-pinned; #2 leaked errors fixed via the
error_response module). Records the v1-vs-Rust boundary distinction
explicitly: v1 paths remain archived; this closure governs the shipped
Rust sensing-server.
- ADR-164: Gap Register G11 and the Open/Gated Backlog entry marked
RESOLVED with the fix + branch reference.
- CHANGELOG: [Unreleased] -> ### Security entry covering all three findings.
Co-Authored-By: claude-flow <ruv@ruv.net>
* docs(adr): renumber 6 displaced ADRs to resolve duplicate-number collisions (ADR-164 G1)
Resolves the 5 duplicate ADR numbers (6 displaced files) flagged by ADR-164
Gap Register item G1. Canonical keeper per number = first file committed at
that number (date tie-broken by inbound cross-reference count / parent-appendix
relationship). Displaced files renumbered to the next free numbers (166-171):
050 keeps provisioning-tool-enhancements (5 refs vs 1)
-> ADR-166-quality-engineering-security-hardening
052 keeps tauri-desktop-frontend (parent ADR)
-> ADR-167-ddd-bounded-contexts (its appendix)
147 keeps nvidia-cosmos/OccWorld (the actual ADR, has Status header)
-> ADR-168-benchmark-proof (proof companion, no Status)
-> ADR-169-adam-mode-light-theme (was untracked)
148 keeps drone-swarm-control-system (committed #862)
-> ADR-170-yoga-mode-pose-system (was untracked)
149 keeps public-community-leaderboard-huggingface (committed 16:47 vs 17:38)
-> ADR-171-swarm-benchmarking-evaluation-methodology
Updates in-file `# ADR-NNN` headers and intra-file self-references (yoga-modes
* docs(adr): repoint inbound cross-references to renumbered ADRs (166-171)
Follow-up to the ADR renumbering (ADR-164 G1). Updates every inbound reference
that pointed at a displaced ADR, disambiguating shared numbers by title/slug so
only references to the DISPLACED topic move and keeper references stay put.
ADR-168 (was 147 benchmark-proof): README, CHANGELOG, user-guide,
proof-of-capabilities, research docs 00/03 — all path/label refs updated.
ADR-169 (was 147 adam-mode) / ADR-170 (was 148 yoga-mode): docs/adr/README index.
ADR-171 (was 149 swarm-benchmarking): all ruview-swarm eval code+docs
(Cargo.toml, evals/, eval_swarm.rs, metrics/mod/report/runner.rs), research
doc 03 (every §-ref matched ADR-171 sections, not AetherArena), 00-system-review,
series README, CHANGELOG, and ADR-148's forward/"open issues" pointers.
ADR-166 (was 050 quality-engineering / security-hardening): disambiguated from the
ADR-050 provisioning KEEPER by topic. The HMAC/secure_tdm, directory-traversal,
bind-address, and OTA-PSK-auth references in code comments
(wifi-densepose-hardware Cargo.toml + secure_tdm.rs, sensing-server main.rs) and
in ADR-052-tauri / ADR-167 all describe the security-hardening ADR -> ADR-166.
ADR-167 (was 052 ddd-appendix): inbound appendix references.
Index/registry updates: docs/adr/README.md, gap-analysis/census.md (rows +
header count), gap-analysis/lens-findings.md (collision table marked RESOLVED),
and ADR-164 Gap Register G1 marked RESOLVED with the full renumber map.
Keeper references deliberately untouched: all ADR-147 OccWorld code, all ADR-148
drone-swarm code/docs, all ADR-149 AetherArena refs (incl. ADR-150's SSL/resampling
refs, which ADR-150 explicitly binds to the AetherArena benchmark), ADR-050
provisioning refs, ADR-052 tauri refs. The frozen GitHub blob URLs in
docs/adr/.issue-177-body.md (pinned to an old branch) are left as historical.
Comment-only code edits; no behavior change. wifi-densepose-hardware compiles
clean; the sensing-server build's sole blocker is the pre-existing upstream
midstreamer-temporal-compare@0.2.1 registry crate, unrelated to these edits.
Co-Authored-By: claude-flow <ruv@ruv.net>
Records the remediation done in this branch:
- G3 (homecore-recorder/migrate phantom ADRs) → RESOLVED: ADR-132 + ADR-165 written.
- G5 (10 streaming-engine Proposed-while-built) → RESOLVED: 136-145 flipped to
"Accepted — partial", with the honest caveat that the notes describe building
blocks built+tested, not live-path integration.
- G2 (missing Status headers) → corrected: ADR-134-CIR was mislabeled as missing
(it has a Status row); the 2 genuine misses (147-benchmark-proof, 052-ddd) are
both inside owner-gated duplicate-number collisions, so left untouched. Early
ADRs using "| Status |" vs "| **Status** |" are different-format-but-present.
Net: 0 status headers added.
- Updated Coverage-Gaps bullets for recorder/migrate.
Renumbering/dedup of the 6 collisions left owner-gated, as instructed.
Co-Authored-By: claude-flow <ruv@ruv.net>
All 10 streaming-engine ADRs (136-145) carried Status: Proposed while each has a
concrete commit-pinned "Built -- tested building block" Implementation-Status note
(136: 11f89727f; 137: 4fa3847ac; 138: fc7674bde; 139: 521a012d8; 140: 169a355bd;
141: 7d88eb84c; 142: 1f8e180d6; 143: 2d4f3dea5; 144: b10bc2e9a; 145: 0f336b7d3),
each with a test count.
Flipped each to "Accepted — partial (built + tested building block; integration
glue pending — see Implementation Status, commit <hash>)". Honest "partial", not
full Accepted: the notes themselves state the blocks are tested+compiling but
"mostly not yet on the live 20 Hz path". 143 (v2 dataset-gated) and 144 (no UWB
radio in fleet) carry their specific residual gates inline.
Co-Authored-By: claude-flow <ruv@ruv.net>
homecore-migrate cited "ADR-134 (HOMECORE-MIGRATE)", but on-disk ADR-134 is
"First-Class CIR Support" — a different decision. The migrate crate was governed
by a phantom identity (ADR-164 Gap G3).
- New ADR-165-homecore-migrate-from-home-assistant.md (next free number),
reverse-documented from the shipped P1 scaffold: HA .storage reader, versioned
format gate (unknown minor_version = hard error), per-artifact parsers, inspect
CLI, structured errors. Status: Accepted — P1 scaffold (full conversion P2).
Trust-boundary rationale for the untrusted .storage import is the centerpiece.
- Repointed every ADR-134 governing reference in v2/crates/homecore-migrate/
(Cargo.toml, README.md, src/lib.rs, src/config_entries.rs,
src/storage_format/mod.rs) → ADR-165. Left the ADR-132 (recorder-feature)
refs intact. Explanatory renumber notes retained.
- On-disk ADR-134 (CIR) untouched. ADR-126 series-map registry row owner-gated.
Docs/comments only — cargo build -p homecore-migrate --no-default-features
still compiles.
Co-Authored-By: claude-flow <ruv@ruv.net>
Adds benchmarks/edge-latency/RESULTS.md (wiflow-std RESULTS style: each
measured number with reproduce command, machine, MEASURED-on-host grade,
and the honest host-vs-ESP32 / steady-state-vs-cold-start caveats) and
ADR-163 (HEADLINE: CLAIMED latency budgets -> MEASURED-on-host, closing
M5/M6 measurement debt; ESP32-on-hardware still pending).
- ADR-160 deferred 'criterion benches for process_frame budget claims'
line updated to DONE (host) with the ESP32-pending note.
- PROOF.md performance table gains the two edge-latency reproduce rows;
provenance ADR range extended to ADR-163.
- prove.sh gated section gains the edge-latency bench note (host proxy
only; not asserted, never claims the ESP32 figure).
Benches/docs only; no crate republishes.
Co-Authored-By: claude-flow <ruv@ruv.net>
Records the Milestone 7 audit: library cores are real (anti-slop positive) but
the network boundary had a CRITICAL WS auth bypass (A1) + reply-theater (A2) +
documented-but-no-op automation (A3-A7) + a network-exposed dev bin (A8), all
fixed and graded MEASURED with failing-on-old tests. Cites the NO-ACTION
security positives (uuid::v4 CSPRNG refuted-suspicion, hardened CORS,
no-traversal migrate, no-secrets-in-logs, honest HAP stub) and the deferred
backlog (plugin authority-isolation P5, sig-verification P4, HAP real pairing
P2, bounded run-modes, YAML load-at-boot).
Co-Authored-By: claude-flow <ruv@ruv.net>
- tests/honest_labeling.rs: 10 source-presence tests asserting the A1-A5 claim
invariants (disclaimers present, uncited stat removed, WEAPON_ALERT no longer
exported, med_* feature-gated, no static-mut event buffers). Each is designed to
FAIL on the pre-fix source (ADR-159 A5 manifest-roundtrip style).
- ADR-160: records the headline (0 stubs/0 theater, all real DSP -> claim-surface
honesty debt), the graded A1-A5 fixes, NO-ACTION positives, per-prefix
classification, and the DATA-GATED deferred backlog (criterion benches,
per-skill accuracy validation, wasm32 static_mut_refs CI confirmation).
- ADR-159: its deferred-backlog line "wasm-edge ... honestly labelled, not claimed"
is now actually TRUE.
Validation (all 0 failed, host --features std):
DEFAULT 615 | MEDICAL (+medical-experimental) 653 | NO-DEFAULT 615; 0 warnings.
Co-Authored-By: claude-flow <ruv@ruv.net>
Update specification.md §3.6 ONLY with an honest implementation-status note:
the matching algorithm is now implemented and tested in
v2/crates/wifi-densepose-bfld/, weights remain unvalidated design intent, and
named-identity locking is data-gated (cardiac+respiratory alone are not
separable — measured gap ~0.0005). The broader Soul Signature system remains
Pre-Implementation.
Co-Authored-By: claude-flow <ruv@ruv.net>
Documents Milestone 3 across the four acquisition crates (vitals, hardware,
wifiscan, calibration). Honest headline: this layer was already well-hardened,
so the real work is small.
- §A1 (perf, MEASURED): Vec::remove(0) O(n^2) sliding windows -> VecDeque.
End-to-end win is NULL within noise at realistic window sizes (DSP dominates);
the win is the algorithmic O(n^2)->O(n) shown in isolation. Claimed nothing
more -- the committed bench proves the null.
- §A2 (correctness): breathing partial-weights scale-mixing -> normalized by
Sigma(effective weights). Pinned by two fail-on-old tests.
- §A3 (stability): IIR resonator divergence. Corrected the research report's
physically-inaccurate trigger (divergence needs |r|>=1, i.e. bw>=4, not "r
negative"); clamp + finite-guard. Pinned by two fail-on-old tests.
- §B1 hardening on an unreachable (already-gated) truncation path -- disclosed.
- §B4 (constant-time HMAC compare) DEFERRED: not worth a new direct `subtle`
dependency for an 8-byte LAN sync-beacon tag.
- MEASURED negative-results section (the centerpiece): esp32_parser length gate,
sync_packet infallible slices, the whole ieee80211bf validate-on-deserialize /
no-panic-FSM / single-role / SBP-single-evaluate model, secure_tdm HMAC+replay,
netsh_scanner fixed-argv + Option parse, geometry_embedding MAX_COORD_M -- each
cited file:line, all NO-ACTION.
- SOTA landscape: deep-CSI vitals (DATA-GATED), 802.11bf conformance (CLAIMED,
non-public suite), per-room calibration (CLAIMED on numbers), native wlanapi
FFI multi-BSSID (CLAIMED-unmeasured -- explicitly NOT claiming the 10x). Mostly
NO-ACTION / ACCEPTED-FUTURE.
- Deferred backlog (§8): nothing silently dropped.
Validation: cargo test --workspace --no-default-features = 3054 passed / 0
failed; python verify.py = VERDICT PASS (hash unchanged, Rust-only changes).
Co-Authored-By: claude-flow <ruv@ruv.net>
Records the integrity-critical fixes (unified canonical metric, leak-free
subject-disjoint split + synthetic-val disclosure, rapid_adapt real gradients,
proof margin + committed-hash rigor), the Tier-2 correctness/security fixes, the
measured Tier-3 perf win, the NN SOTA landscape graded MEASURED/CLAIMED/
THEORETICAL (GraphPose-Fi as top ACCEPTED-future candidate; INT4; CSI-JEPA-vs-MAE
with the honest "no JEPA/MAE-on-WiFi-pose yet" caveat; "Mamba-CSI-pose does not
exist"), and the ~45-finding deferred backlog. Discloses the libtorch/tch-gating
limitation and that the Rust proof is honestly in SKIP until a baseline is
committed.
Co-Authored-By: claude-flow <ruv@ruv.net>
Records Milestone-0 of the signal/DSP beyond-SOTA sweep with full PROOF
discipline (MEASURED vs CLAIMED vs THEORETICAL grading throughout):
- §2 discloses the headline anti-slop finding: the ADR-134 CIR coherence gate
was DEAD in production (canonical-56 frames -> SubcarrierMismatch -> silent
freq-domain fallback for every frame). Documents the canonical56() fix + the
4 committed proof tests.
- §3 NaN/inf adversarial bypass; §4 divide-by-(n-1) window trio.
- §5 the two MEASURED perf wins with before/after medians + reproduce commands.
- §6 per-module SOTA landscape, evidence-graded: deep-unfolded ISTA/LISTA for
CSI->CIR (~3 dB NMSE, MEASURED, arXiv 2211.15440 + 2502.05952), diffusion CIR
prior (public weights, MEASURED), Wi-Spoof adversarial eval (MEASURED, arXiv
2511.20456), Bayesian multi-AP fusion (CLAIMED, no code, 2512.02462),
coherence gating + RF intention-lead (THEORETICAL).
- §7 roadmap: LISTA-for-CIR as the top ACCEPTED-future item (M effort; the ISTA
+ Phi already exist in cir.rs) — proposed, NOT implemented this milestone —
plus the explicit deferred-findings backlog (the ~45 review findings not
fixed here, graded P1/P2/P3) so nothing is silently dropped, with a
horizon-ledger DONE-vs-DEFERRED one-liner.
Co-Authored-By: claude-flow <ruv@ruv.net>
* docs(research): add RuView beyond-SOTA system review (00)
First document of the beyond-SOTA research series: capability audit of
the current RuView engine with role-to-crate maturity matrix, ruvsense
module inventory, gap analysis, and risk register.
https://claude.ai/code/session_01MjBucx95K4BuUxZi8NWwRH
* docs(research): add beyond-SOTA architecture design (02, in progress)
https://claude.ai/code/session_01MjBucx95K4BuUxZi8NWwRH
* docs(research): finalize beyond-SOTA architecture (02)
https://claude.ai/code/session_01MjBucx95K4BuUxZi8NWwRH
* docs(research): add benchmark/validation methodology snapshot (03)
https://claude.ai/code/session_01MjBucx95K4BuUxZi8NWwRH
* docs(research): add beyond-SOTA series index with validation results; changelog
README index ties the 5 research docs together with the session's
measured validation evidence: 2,797 workspace tests / 0 failed, Python
proof PASS (bit-exact), and paired pre/post criterion CIR benchmarks.
https://claude.ai/code/session_01MjBucx95K4BuUxZi8NWwRH
* perf(signal): precompute CIR warm-start system; hoist tomography solver allocs
Exact, determinism-safe optimizations (bit-identical float results):
- cir.rs: diag(PhiH Phi)+lambda*I and its CSR matrix depend only on Phi
and lambda (fixed at CirEstimator::new) but were rebuilt every frame
(O(K*G) pass + CSR allocation). Now built once in new() via
build_warm_start_system; summation order unchanged.
- tomography.rs: ISTA gradient buffer hoisted out of the 100-iteration
loop (fill(0.0) reset) and the Frobenius Lipschitz bound moved from
per-reconstruct to construction.
Verified: signal 456 tests green; engine 11/11 green including
cycle_is_deterministic and witness-stability tests. Criterion paired
pre/post: cir_estimate/he40 -3.9% (p<0.01), multiband -1.2/-1.4%.
https://claude.ai/code/session_01MjBucx95K4BuUxZi8NWwRH
* fix(worldgraph): bound SemanticState growth with deterministic retention
StreamingEngine::process_cycle appended one SemanticState belief per cycle
with no eviction — ~1.7M nodes/day at 20 Hz (beyond-SOTA roadmap finding #6).
Add WorldGraph::prune_semantic_states(max): deterministic eviction of the
oldest beliefs by (valid_from_unix_ms, id); structural nodes (rooms, zones,
sensors, anchors, tracks, events) are never eligible. Wire it into the
engine after each belief append (DEFAULT_SEMANTIC_RETENTION = 7,200, ~6 min
at 20 Hz; set_semantic_retention to tune). The WorldGraph holds current
beliefs; durable history is the recorder's job, so no audit data is lost.
3 new tests: end-to-end bounded growth, oldest-only eviction, deterministic
equal-timestamp tie-break. Workspace gate: 2,865 passed, 0 failed.
https://claude.ai/code/session_01MjBucx95K4BuUxZi8NWwRH
* feat(sensing-server): route live frames through the governed StreamingEngine
Closes the live-trust-path gap (ADR-136 section 8, beyond-SOTA system review):
the running server fused live CSI with the bare MultistaticFuser, while the
privacy/provenance/witness control plane (ADR-135..146) only ever ran on
synthetic in-test frames. The privacy control plane was therefore bypassable
on the real path.
New engine_bridge module drives StreamingEngine::process_cycle from the
server's live NodeState map, reusing the existing NodeState -> MultiBandCsiFrame
conversion. It lazily wires each contributing node as a WorldGraph sensor
(idempotent), bounds belief growth via the retention cap, and forwards explicit
timestamps/calibration ids so the path stays deterministic and replayable.
Wired additively into both live ESP32/WiFi fusion sites in main.rs via a
split-borrow off the write guard, so person-count behavior is unchanged; the
latest BLAKE3 witness is stored on AppState. Every published belief now carries
evidence + model + calibration + privacy decision and a deterministic witness.
Adds wifi-densepose-engine/-worldgraph/-bfld/-geo deps. 6 new bridge tests
(witnessed belief with full provenance, cross-run determinism, idempotent node
registration, retention bound, privacy-mode propagation). sensing-server suite
430+128 green; workspace gate 2,904 passed / 0 failed.
https://claude.ai/code/session_01MjBucx95K4BuUxZi8NWwRH
* feat(train): falsifiable occupancy benchmark with anti-overfitting gate
Makes the presence/person-count "beyond SOTA" claim falsifiable in code
instead of aspirational (the unfalsifiability gap from the beyond-SOTA system
review). occupancy_bench grades predictions vs ground truth and gates a SOTA
claim behind one claim_allowed invariant requiring ALL of:
- DataProvenance::Measured — synthetic/mock data is scorable for regression
but never claimable (anti-mock-contamination; the CLAUDE.md Kconfig-bug
lesson made structural).
- A leak-free EvalSplit — validate() refuses any split where a subject OR
environment id appears in both train and test (subject leakage /
per-environment overfitting).
- n_test >= min_test_samples (small-N guard).
- Presence F1 whose bootstrap-CI lower bound (deterministic seeded splitmix64)
clears the threshold — not the point estimate.
- Count MAE within threshold.
The claim string is unreadable except through the gate (NO_CLAIM otherwise),
same discipline as the ruview-gamma acceptance gate. What remains is data, not
method: a frozen, SHA-pinned, subject/environment-disjoint measured replay set
turns the claim into a passing/failing test.
Lives in wifi-densepose-train (the eval bounded context, alongside ablation/
eval/metrics). 10 tests cover each refusal path; warning-clean under the
crate's missing_docs lint. Workspace gate 2,914 passed / 0 failed. Doc 03
updated.
https://claude.ai/code/session_01MjBucx95K4BuUxZi8NWwRH
* feat(engine): per-room adapter provenance + drift-to-recalibration advisor
Closes the trust-chain gap where an ~11 KB per-room LoRA adapter (ADR-150
section 3.4) could silently change inference without the witness noticing:
provenance carried only "rfenc-v<N>" with no notion of adapter identity.
- StreamingEngine::set_room_adapter(AdapterInfo): pins the adapter's
content-derived id into provenance model_version
("rfenc-v1+adapter:<id>") — and therefore into the BLAKE3 witness — so
swapping or clearing adapter weights always shifts the witness. Engine test
proves base -> adapter -> other-adapter -> cleared all witness differently
and cleared == base.
- RecalibrationAdvisor: recommends re-running the ADR-135 empty-room baseline
/ refitting the room adapter on sustained low fusion coherence (streak
threshold, default 60 cycles ~ 3 s at 20 Hz) or an ADR-142 change-point.
Surfaced as TrustedOutput::recalibration_recommended, stored on the
sensing-server AppState alongside the witness at both live fusion sites.
- Bridge plumbing: EngineBridge::{set_room_adapter, clear_room_adapter} +
live-path test that the adapter id flows into the live witness.
Scope note (honest): this is the deployable provenance/trigger half of the
"retrained model" roadmap item. Fitting the adapter itself runs in the
existing external calibration service (aether-arena/calibration/); a trained
RF-encoder checkpoint still does not exist in-tree.
Engine 15 tests, bridge 7 tests. Workspace gate: 2,918 passed / 0 failed.
https://claude.ai/code/session_01MjBucx95K4BuUxZi8NWwRH
* fix(mat): gate api module behind its feature — standalone no-default-features builds
pub mod api was unconditional while its only dependency, serde, is optional
behind the 'api' feature, so any build without default features failed with
101 unresolved-serde errors (masked in --workspace runs by feature
unification). The api module and its create_router/AppState re-export are now
cfg(feature = "api")-gated with docsrs annotations.
All combos compile: bare --no-default-features (was 101 errors, now 0),
--no-default-features --features api, and full default (177 tests pass).
Workspace gate: 2,918 passed / 0 failed.
https://claude.ai/code/session_01MjBucx95K4BuUxZi8NWwRH
* perf(signal): opt-in FFT operator for the CIR ISTA solver (8-14x measured)
Phi is a sub-DFT, so each ISTA mat-vec can run as one length-G FFT
(O(G log G)) instead of a dense O(K*G) product — the dominant-latency-hazard
finding from the beyond-SOTA optimization roadmap.
New CirConfig::fft_operator, default FALSE: the dense path stays the
bit-exact witness default. The FFT evaluates the same sums in a different
order, so enabling it shifts float results in the last bits and requires
regenerating any pinned witness — strictly opt-in per deployment.
FftOperator (rustfft, planned once at CirEstimator::new, scratch buffers
reused across the ISTA loop) dispatches inside ista_solve:
Phi x = scale * forward-FFT(x) sampled at bins (k_idx mod G)
Phi^H v = scale * unnormalised inverse-FFT of v scattered into those bins
Warm-start and Lipschitz estimation stay dense at construction.
Measured (criterion, same run, same machine):
ht20: 2.22 ms -> 265 us (8.4x)
ht40: 10.26 ms -> 717 us (14.3x)
The real HE40 grid (K=484, G=1452) scales further per the O(K*G)/O(G log G)
ratio.
3 new tests: FFT<->dense matvec equivalence to float tolerance on ht20 and
he40 grids; end-to-end dominant-tap agreement on a single-path frame; all
default configs keep FFT off. New cir_estimate_fft bench group.
Workspace gate: 2,921 passed / 0 failed (default path bit-exact, witnesses
unchanged).
https://claude.ai/code/session_01MjBucx95K4BuUxZi8NWwRH
* feat(core): canonical frame decoder — capture-to-claim replay (ADR-136)
The encode half of the ADR-136 frame contract existed (ComplexSample,
to_canonical_bytes, witness_hash) but there was no decoder: a captured
canonical frame could be witnessed but never reconstructed, blocking
replay-from-capture.
CsiFrame::from_canonical_bytes is the exact inverse: same id, metadata,
complex payload, and witness hash (tested as the round-trip law AC7 — the
replayed frame re-encodes byte-identically). Amplitude/phase are recomputed
from the payload (projections, not independent state). Every malformed-input
class fails closed (AC8): header truncation -> Truncated, payload truncation
-> PayloadMismatch, unknown discriminants, non-UTF-8 device id, trailing
bytes. Nil calibration uuid decodes as None per the documented encoding.
Core: 36 tests pass. Workspace gate: 2,937 passed / 0 failed.
https://claude.ai/code/session_01MjBucx95K4BuUxZi8NWwRH
* feat(engine): dynamic min-cut mesh partition guard (ruvector-mincut)
Maintains an exact min-cut over the live mesh coupling graph — nodes are
sensing nodes, coupling is the product of fusion attention weights — and
surfaces per cycle, as TrustedOutput::mesh:
- cut value: the global "how close is the array to partitioning" number,
a structural measure per-node heuristics miss;
- weak side: which specific nodes would split off (failure/jamming triage,
feeds ADR-032 posture);
- at-risk flag: counts as a structural event for the drift->recalibration
advisor (alongside ADR-142 change-points).
Degenerate cases fail toward risk: a node with zero coupling is reported as
already partitioned (cut 0, that node as the weak side).
Measured cost policy (criterion, 12-node mesh — the honest part):
- weights quantized (1/64) + change-gated: steady-state cycles do ZERO graph
work and reuse the cached cut (~7.3 us, ~23x cheaper than building);
- on any real change a full exact rebuild (~171 us) is used, because ONE
DynamicMinCut delete+insert measured ~240 us — the subpolynomial machinery
amortizes on much larger graphs, so rebuild-on-change is the measured
optimum at mesh scale (one-edge case -28% after switching policy);
- full process_cycle with the guard: ~33 us for 4 nodes vs the 50 ms budget.
9 mesh_guard tests (weak-node detection, steady-state zero updates,
sub-quantum gating, join/drop rebuild, determinism, disconnection) + an
engine-level wiring test (down-weighted node -> weak side -> recalibration).
Engine 24 tests; workspace gate 2,946 passed / 0 failed.
https://claude.ai/code/session_01MjBucx95K4BuUxZi8NWwRH
* feat(engine): mesh partition risk demotes privacy + enters the witness (ADR-032)
Completes the mesh-guard integration: its at_risk signal was advisory-only
(fed the recalibration advisor). It now also contributes to the ADR-141
privacy demotion alongside fusion- and array-level contradictions — a mesh
close to partitioning makes the fused belief less trustworthy, so the cycle
emits at a more restricted class (monotonic; information only removed).
Because effective_class feeds the BLAKE3 witness, a fragmenting array now
shifts the witness: partition risk is auditable, not just logged. The mesh
computation moved ahead of the demotion step in process_cycle; mesh_guard_mut
exposes risk-threshold tuning.
Test: a forced-risk 3-node cycle demotes PrivateHome Anonymous->Restricted
and shifts the witness vs a clean baseline. Engine 25 tests; workspace gate
2,947 passed / 0 failed.
https://claude.ai/code/session_01MjBucx95K4BuUxZi8NWwRH
* fix: public-PR review findings — privacy-path honesty, gate holes, mesh-guard cliff
- sensing-server: engine errors logged+counted (no silent swallow), trust
state exposed via status surface, privacy-demotion claims aligned with
the actual parallel-audit-path behavior
- occupancy_bench: vacuous-F1 hole closed (degenerate test sets fail with
their own criterion); CI-lower-bound test made probative
- mesh_guard: quantization scaled to observed coupling range — >=65-node
balanced meshes no longer permanently at_risk (regression test)
- engine: both wiring tests made probative (same-topology witness compare,
deterministic risk-crossing fixture)
- mat: axum/tokio optional behind api; real serde feature (api enables it)
- core: canonical decoder strict (non-zero reserved bytes and nil UUID
rejected — injective on accepted domain, forged-bytes tests)
- CHANGELOG: un-spliced the FFT/adapter bullet mangle
Co-Authored-By: claude-flow <ruv@ruv.net>
* chore: strip private-track references for public PR
Reword the occupancy-benchmark changelog bullet to drop a cross-reference
to the private research track, and restore the WorldGraph retention bullet
header that was glued onto the preceding MAT bullet.
Co-Authored-By: claude-flow <ruv@ruv.net>
* chore: lockfile refresh for cherry-picked feature set
Co-Authored-By: claude-flow <ruv@ruv.net>
---------
Co-authored-by: Claude <noreply@anthropic.com>
* docs(adr): ADR-151 — Per-Room Calibration & Specialized Model Training
Room-first calibration -> bank of small specialised ruVector models
(breathing, heartbeat, restlessness, posture, presence, anomaly) distilled
from the frozen Hugging-Face-published RF Foundation Encoder (ADR-150).
Four-stage local-first pipeline: baseline (ADR-135 environmental fingerprint)
-> guided enrollment (NEW EnrollmentProtocol, clean anchors not hours) ->
feature extraction (reuse signal_features + ruvsense) -> specialist bank
training (rapid_adapt LoRA heads, RVF storage, HNSW prototypes).
Invariants: specialisation over scale; local heads over a shared public base;
honest STALE degradation on baseline drift. Indexes ADR-149/150/151.
Co-Authored-By: claude-flow <ruv@ruv.net>
* feat(cli): calibration HTTP API for UI-driven baseline capture (ADR-135/151)
Adds `wifi-densepose calibrate-serve` — an Axum HTTP API that wraps the
ADR-135 CalibrationRecorder so a UI (or any client) can drive an empty-room
baseline capture remotely. Stage 1 ("teach the room") of the ADR-151 room
calibration & training pipeline.
A single background task owns the UDP socket (ESP32 0xC511_0001 frames) and
the optional active recorder; HTTP handlers talk to it over an mpsc command
channel and read a shared status snapshot, keeping the &mut recorder
lock-free. CORS permissive so a browser UI can call it.
Endpoints (/api/v1/calibration/*):
GET /health liveness + UDP ingest stats (frames_seen, streaming)
POST /start { tier?, duration_s?, room_id?, min_frames? }
GET /status live progress (state, frames, progress, z, eta) — poll for UI
POST /stop finalize the current session early
GET /result finalized baseline summary (amp/phase-dispersion averages)
GET /baselines list persisted baseline .bin files
Reuses the existing calibrate.rs ESP32 wire parser (made pub(crate)); honest
abort when <10 frames arrive in the window (e.g. ESP32 not streaming).
Verified end-to-end over loopback: start -> 300 replayed HT20 frames ->
state=complete, 52-subcarrier baseline, phase_dispersion_avg=0.00096
(concentrated/valid), persisted to disk; all 6 endpoints exercised.
CLI: 19 tests pass; crate builds clean.
Co-Authored-By: claude-flow <ruv@ruv.net>
* test(cli): firewall-free CSI UDP relay for local Windows ESP32 testing
Windows Defender blocks inbound LAN UDP to a freshly-built binary without an
admin allow-rule; python.exe is already allowed. This relay binds the public
CSI port and forwards each datagram verbatim to a loopback port where
`calibrate-serve --udp-bind 127.0.0.1 --udp-port 5006` listens (loopback is
firewall-exempt). No admin required.
Validated: ESP32-format 0xC5110001 frames -> :5005 -> relay -> :5006 ->
calibrate-serve -> state=complete, 52-subcarrier baseline,
phase_dispersion_avg=0.00098 (clean). Completes the no-admin live-test path.
Co-Authored-By: claude-flow <ruv@ruv.net>
* docs(changelog): record ADR-151 calibration API (calibrate-serve)
Co-Authored-By: claude-flow <ruv@ruv.net>
* feat(calibration): ADR-151 Stages 2–5 — enrollment, extraction, specialist bank, runtime
New crate wifi-densepose-calibration implementing the per-room pipeline beyond
Stage-1 baseline:
- anchor.rs: guided-anchor sequence + event-sourced EnrollmentSession (Stage 2)
- enrollment.rs: AnchorQualityGate + AnchorRecorder — gates anchors against the
ADR-135 baseline deviation (presence/motion), re-prompts bad captures
- extract.rs: Features + AnchorFeature — autocorrelation periodicity (breathing/
HR bands), variance/motion (Stage 3)
- specialist.rs: 6 small room-calibrated models — presence (learned threshold),
posture (nearest-prototype), breathing/heartbeat (band periodicity),
restlessness (calm/active normalization), anomaly (novelty vs anchors) (Stage 4)
- bank.rs: SpecialistBank — train/persist + baseline-drift STALE invalidation
- runtime.rs: MixtureOfSpecialists — presence short-circuit + anomaly veto +
stale flagging (Stage 5)
Statistical heads make the pipeline runnable/validatable today; the ADR-150 HF
RF Foundation Encoder backbone is the documented upgrade path. 29 unit tests pass.
Co-Authored-By: claude-flow <ruv@ruv.net>
* feat(cli): wire ADR-151 enroll / train-room / room-status / room-watch
Integrates the wifi-densepose-calibration crate into the CLI as four
subcommands driving the full Stage 2–5 pipeline against a live ESP32 raw-CSI
stream (edge_tier=0):
- enroll: walks the guided anchor sequence, gates each capture against the
ADR-135 baseline deviation (re-prompts bad anchors), writes labelled features
- train-room: fits the SpecialistBank from the enrollment, persists JSON
- room-status: prints a trained bank's summary
- room-watch: live mixture-of-specialists readout (presence/posture/breathing/
heart/restless) over a rolling window, with anomaly veto + STALE flagging
Per-frame scalar is the mean CSI amplitude (carries presence/motion + breathing
modulation). Validated end-to-end on the live ESP32 (COM8, edge_tier=0): the
real parser → feature extraction → runtime detected breathing (~16–31 BPM) on
hardware. Full multi-anchor enrollment accuracy requires the operator to perform
the poses; phase-based breathing extraction is a noted refinement.
48 tests pass (29 calibration + 19 CLI).
Co-Authored-By: claude-flow <ruv@ruv.net>
* docs(adr-151): mark Stages 1–5 implemented; expand CHANGELOG
Co-Authored-By: claude-flow <ruv@ruv.net>
* fix(cli): keep proven mean-amplitude carrier for room features
The max-variance-subcarrier carrier locked onto motion artifacts (not
breathing) and also had an out-of-bounds bug on variable CSI subcarrier
counts. Reverted to the mean-amplitude carrier, which is validated live to
detect breathing. Phase-based extraction on a stable subcarrier remains the
proper higher-SNR refinement (ADR-151 §4).
Co-Authored-By: claude-flow <ruv@ruv.net>
* feat(calibration): multistatic fusion of co-located nodes (ADR-029/151)
MultiNodeMixture fuses several co-located nodes (each with its own
room-calibrated SpecialistBank) into one RoomState:
- presence: OR across nodes (any node seeing a person wins)
- posture/breathing/heartbeat: highest-confidence node (best viewpoint)
- restlessness/anomaly: max across nodes
- veto: any node's physically-implausible signal vetoes the room's vitals
(anti-hallucination, same as single-node runtime) + presence short-circuit
- stale: any node's STALE flag propagates
Same-room multistatic only; cross-room is federation (ADR-105), not fusion.
6 unit tests (presence OR, best-confidence breathing, single-node veto,
staleness). 35 calibration tests pass.
Co-Authored-By: claude-flow <ruv@ruv.net>
* feat(cli): multistatic room-watch — fuse co-located nodes (ADR-029/151)
`room-watch --node-bank N:path` (repeatable) groups live CSI frames by node_id
and fuses per-node banks via MultiNodeMixture. Validated live on COM8 (node 9,
edge_tier=0): frames grouped + fused end-to-end. True 2-node fusion is covered
by unit tests; a second raw-CSI node is the hardware blocker. 54 tests pass.
Co-Authored-By: claude-flow <ruv@ruv.net>
* docs(integration): calibration → cognitum-v0 appliance integration overview
Detailed cross-repo integration spec for cognitum-one/v0-appliance: data
contracts (CSI wire format, ADR-135 baseline binary, enrollment/bank/RoomState
JSON schemas), calibrate-serve HTTP API, public crate API, Pi5+Hailo tiering,
and a 5-step appliance integration plan. Grounded in the verified cognitum-v0
inventory (aarch64, cargo 1.96, HAILO10H, ruview-vitals-worker:50054).
Co-Authored-By: claude-flow <ruv@ruv.net>
* fix(calibration): address PR review — aarch64 decouple, API auth, path traversal, throttle
Resolves the review on #989:
- **Cross-compile (the appliance blocker):** make wifi-densepose-mat optional
and feature-gate it (`mat`), so `cargo build -p wifi-densepose-cli
--no-default-features` excludes the mat→nn→ort(ONNX)→openssl-sys chain.
Verified: `cargo tree --no-default-features` shows 0 ort/openssl deps →
calibration cross-compiles clean for the Pi.
- **Security (must-fix before LAN):**
- `--token` / CALIBRATE_TOKEN bearer-auth middleware on every route; warns if
bound non-loopback without a token.
- sanitize client-supplied `room_id` to [A-Za-z0-9_-] (≤64) before it reaches
the baseline write path — kills the `../` file-write primitive. + test.
- **Perf:** stop locking shared status + cloning SessionStatus on every UDP
frame — counters/snapshot flush on the 200 ms tick instead (no CPU
starvation under flood). finalize write moved to async `tokio::fs::write`.
- **Docs:** ADR-151 STALE wording matches the impl (baseline-id change;
drift-threshold = P6 refinement); integration doc gets the
`--no-default-features` build + auth/sanitize notes.
35 calibration + 15 CLI tests (no-default) / 20 CLI (default) pass.
Co-Authored-By: claude-flow <ruv@ruv.net>
* docs(worldgraph,worldmodel): add crates.io READMEs
Plain-language overviews + feature lists, comparison tables (symbolic graph vs
predictive occupancy; graph vs grid vs event-log), usage, and technical
details. Adds readme = "README.md" to both manifests so they render on
crates.io on the next release.
Co-Authored-By: claude-flow <ruv@ruv.net>
* release: worldgraph & worldmodel 0.3.1 (READMEs on crates.io)
Co-Authored-By: claude-flow <ruv@ruv.net>
* docs: precise calibration validation scope (capture+API+auth proven; clean enroll→train→infer not yet on-target)
Aligns ADR-151 §7 + the appliance integration doc with the PR #989 scope
clarification: nothing has run a clean baseline → enroll → train → infer on
live CSI; the live breathing read used the stateless head, not a trained bank.
Adds --source-format adr018v6 to the backlog.
Co-Authored-By: claude-flow <ruv@ruv.net>
* feat(calibrate-serve): live GET /room/state endpoint (mixture over CSI window)
Adds a live RoomState readout over HTTP — the appliance UI's main need. The
ingest task maintains a rolling per-frame scalar window (flushed on the 200 ms
tick, no per-frame lock); the handler loads a bank (resolved as a sanitized
name under output_dir — same path-traversal defense as room_id), runs the
MixtureOfSpecialists over the window, returns RoomState JSON.
Validated live (ESP32-S3 via relay): breathing 14-19 BPM over HTTP; a
bank=../../etc/passwd query is neutralized to 'etcpasswd' (no traversal).
Co-Authored-By: claude-flow <ruv@ruv.net>
* feat(calibrate-serve): POST /room/train + fix AnchorLabel JSON to snake_case
- POST /api/v1/room/train: { room_id, baseline_id, anchors[] } → trains a
SpecialistBank and persists it as <output_dir>/<room_id>.json (path-sanitized),
readable via /room/state?bank=<room_id>. Completes the HTTP train→infer loop.
- Fix data-contract bug: AnchorLabel serialized as PascalCase variant names
(serde default) while as_str() + the integration doc used snake_case. Added
#[serde(rename_all = "snake_case")] so the JSON wire format matches the
documented contract (empty/stand_still/…). Locked with a roundtrip test.
Validated live (ESP32-S3): POST train (4 anchors → 6 specialists, persisted) →
GET /room/state returns RoomState with the trained presence/restlessness; the
synthetic-vs-real scale mismatch correctly triggers the anomaly veto. 36
calibration tests pass.
Co-Authored-By: claude-flow <ruv@ruv.net>
* feat(calibrate-serve): live enroll-over-HTTP (POST /enroll/anchor + /enroll/status)
Closes the last HTTP gap — the appliance can now drive the ENTIRE calibration
pipeline over HTTP without the CLI:
baseline (start/stop) -> enroll/anchor x8 -> room/train -> room/state
- POST /enroll/anchor { room_id, baseline, label, duration_s? }: the ingest task
loads the baseline (sanitized name under output_dir), captures the anchor for
the duration against it (AnchorRecorder + per-frame series), runs the quality
gate, and on completion replies with the verdict + accumulates the AnchorFeature
in an in-server enrollment map keyed by room_id. Re-prompts on rejection.
- GET /enroll/status?room=<id>: accepted anchors, next, complete.
- POST /room/train now falls back to the in-server enrollment when anchors[] is
omitted.
Validated live (ESP32-S3): capture baseline -> enroll stand_still (271 frames,
6s) -> gate correctly rejects "no person detected (presence_z 0.90 < 1.50)"
relative to a same-occupancy baseline (a clean empty-room baseline is the
documented on-target prerequisite). Builds clean; CLI tests pass.
Co-Authored-By: claude-flow <ruv@ruv.net>
* test(calibrate-serve): HTTP integration tests for the room/enroll endpoints
Factor the router into build_router() (shared by execute + tests) and add
tower-oneshot integration tests (no network/ingest needed):
- health + descriptor → 200
- POST /room/train persists the bank; GET /room/state → 200; train with no
anchors/enrollment → 400
- path-traversal: /room/state?bank=../../etc/passwd → 404 (sanitized, never
reads outside output_dir)
- enroll/status empty; /enroll/anchor with an unknown label → 400
CI regression coverage for the endpoints added this session. 18 CLI tests pass.
Co-Authored-By: claude-flow <ruv@ruv.net>
* fix(mat): make serde non-optional — unblocks `cargo test --workspace --no-default-features`
Making wifi-densepose-mat optional in the CLI (for the aarch64/ort decouple)
exposed a latent feature bug: mat's `api` module compiles unconditionally and
uses serde, but `serde` was an optional dep enabled only via the `api`/`serde`
features. Previously the CLI's *unconditional* mat dependency enabled those
features transitively, so `--workspace --no-default-features` still got serde;
once mat became optional+gated, the workspace build lost it →
`error[E0432]: unresolved import serde` across mat's api/* (CI red).
mat already pulls serde_json + axum unconditionally, so making `serde`
non-optional has no real cost and restores the workspace build. Does NOT affect
the aarch64 CLI build (mat isn't built there at all): verified
`cargo tree -p wifi-densepose-cli --no-default-features` still shows 0
ort/openssl deps, and `cargo test --workspace --no-default-features` compiles
clean.
Co-Authored-By: claude-flow <ruv@ruv.net>
* docs(claude.md): add wifi-densepose-calibration to crate table (pre-merge)
Co-Authored-By: claude-flow <ruv@ruv.net>
* docs(adr): ADR-152 — WiFi-pose SOTA 2026 intake (geometry-conditioned calibration, external benchmarks, encoder recipe)
Records the 2026-06-10 deep-research run (22 sources, 110 claims, 25
adversarially verified: 24 confirmed / 1 refuted) and the decisions it
implies:
- §2.1 ACCEPTED: geometry-condition the ADR-151 calibration system —
NodeGeometry at enrollment, geometry embeddings for future LoRA heads,
PerceptAlign-style two-checkerboard camera↔WiFi alignment for the
ADR-079 supervised path. PerceptAlign (MobiCom'26) names the failure
mode ("coordinate overfitting") that matches our own ADR-150 cross-
subject collapse.
- §2.2 ACCEPTED: benchmark protocol vs external "WiFlow-STD (DY2434)"
(claimed 97.25% PCK@20, Apache-2.0 weights+dataset) with a no-citation
rule until measured on our 17-keypoint ESP32 eval set. Name collision
with our internal WiFlow is disambiguated.
- §2.3 ACCEPTED: amend ADR-150 training recipe per UNSW MAE study —
80% masking, (30,3) patches, data-over-capacity priority (log-linear,
unsaturated at 1.3M samples).
- §2.4 watch items: IEEE 802.11bf-2025 published 2025-09-26;
esp_wifi_sensing as external presence baseline (drop-in claim REFUTED
0-3); ZTECSITool 160MHz/512-subcarrier anchor node (procurement-gated).
- §2.5 NOT adopted: non-WiFi "foundation model" papers; DensePose-UV
(no 2025-2026 work does UV regression from commodity WiFi).
Every number is evidence-graded CLAIMED vs MEASURED in the source
register. Re-check horizon 2026-12.
Co-Authored-By: RuFlo <ruv@ruv.net>
* test(calibration): full-loop integration test — baseline→enroll→train→infer proven in-process (ADR-151 §7 gap, software half)
Closes the software half of PR #989's headline validation gap: the
complete calibration loop had never run end-to-end anywhere, even
in-process. tests/full_loop.rs (412 lines, deterministic xorshift32
room simulator, HT20/52-subcarrier/20Hz, same fingerprint family as
the ADR-135 roundtrip test) now drives the CLI's exact stage order
through the public API:
1. baseline — 600 static frames, zero motion flags post-warmup,
calibration_uuid() exactly as the CLI derives it
2. enroll — all 8 AnchorLabel::SEQUENCE anchors through
AnchorQualityGate::default(), session is_complete()
3. extract — AnchorFeature::from_series recovers injected 0.25Hz
and 0.125Hz breathing within ±0.04Hz
4. train — SpecialistBank::train fits all 6 specialists; JSON
round-trip and the runtime consumes the RELOADED bank
5. infer — positive: never-enrolled 0.30Hz subject reads present,
18±2 BPM; negative: empty window reads absent;
degradation: foreign baseline_id flags STALE
Seed-robust (5 seeds), passes with and without default features:
36 unit + 1 integration green.
Validation docs updated (ADR-151 §7 + integration doc §7 matrix): what
remains is strictly the on-target hardware session (real CSI, physically
empty room, operator performing the guided anchors). Three behavioral
findings from building the test are recorded for pre-session triage:
z-band squeeze between baseline motion flagging (z>2.0) and the still-
anchor gate (presence_z≥1.5) — likeliest on-hardware enroll failure;
variance-only PresenceSpecialist missing motionless-person mean shift;
ungated breathing_hz/heart_hz in noise-window embeddings.
Co-Authored-By: RuFlo <ruv@ruv.net>
* fix(calibration): close all four ADR-152 behavioral findings pre-hardware-session
The full-loop integration test surfaced three findings; fixing the third
exposed a fourth. All four are fixed and regression-guarded:
1. z-band squeeze (enrollment.rs) — anchor motion is now measured from
frame-to-frame deltas of the deviation series (|Δz| > Z_DELTA_MOTION
0.5 ∨ |Δφ| > π/6), not from the absolute motion_flagged, which fires
at amplitude_z_median > 2.0 vs the EMPTY baseline and so conflated
presence strength with motion. A strongly-reflecting still person
(z = 3.0 — every frame flagged by the old heuristic) now enrolls.
The old unit tests mocked (z=3.0, motion=false), a combination the
real deviation() can never emit — which is exactly how the squeeze
hid; tests now derive the flag from z the way the producer does.
2. variance-only presence (specialist.rs) — PresenceSpecialist gains a
mean-shift channel: present when variance > threshold OR
|mean − empty_mean| > mean_dist_threshold (trained at half the
empty→occupied mean distance, None when the means don't separate).
Detects the motionless person whose body raises the scalar mean but
not its variance. Old persisted banks deserialize with the channel
inert (serde default None) — variance-only behavior preserved,
proven by a fixture test against pre-change JSON.
3. ungated hz embedding (extract.rs) — Features::embedding() zeroes
breathing_hz/heart_hz below EMBED_MIN_SCORE (0.25), keeping the
random in-band peaks of noise windows out of the posture/anomaly
prototype space. Raw fields stay ungated (specialists have their
own stricter gates).
4. heart-band lag-floor leakage (extract.rs, found while fixing 3) —
a pure 0.30 Hz breathing signal scored 0.67 in the heart band at
3.33 Hz: out-of-band rhythm leaks as a monotonic slope whose max
sits at the band's lag floor, so score gating alone cannot stop it.
autocorr_dominant now requires the winning lag to be an interior
local maximum; band-edge "peaks" are rejected, true in-band peaks
(interior by definition) are preserved.
full_loop.rs strengthened to drive the fixes end-to-end: the StandStill
anchor is now a z=3.0 strong reflector (unenrollable pre-fix), and a new
motionless-person runtime case proves mean-channel detection at empty-
level variance.
Validation: 41 calibration unit + 1 full-loop integration + 23 CLI tests
green; cargo test --workspace --no-default-features exit 0.
Co-Authored-By: RuFlo <ruv@ruv.net>
The --export-rvf handler ran *before* the --train/--pretrain handlers and
unconditionally wrote placeholder sine-wave weights, then returned. So the
documented `--train --dataset … --export-rvf <path>` workflow
(user-guide.md) short-circuited to a PLACEHOLDER model and never trained —
printing "exported successfully" for a non-functional model. Given the
project's anti-"is it fake" stance, silently emitting a fake model is the
wrong default.
Fix:
- Only emit the placeholder container-format demo when --export-rvf is used
*standalone* (new `export_emits_placeholder_demo` guard). With
--train/--pretrain, fall through so the real training pipeline runs and
exports calibrated weights.
- The standalone path now prints a clear WARNING that it writes a
container-format demo with placeholder weights — not a trained model —
pointing to --train / a pretrained encoder (#894).
- Docs: flag --export-rvf as a placeholder demo in the flag table, and fix
the Docker training example to use --save-rvf (consistent with the
from-source example) instead of the placeholder --export-rvf.
3 unit tests for the guard. Full crate unit suite: 429 + 117 passed, 0 failed.
The v1 "100% presence accuracy" headline was already retracted in the
README / user-guide intro / proof-of-capabilities — but 6 secondary
spots still flatly claimed "100% accuracy, never false alarms", which
made proof-of-capabilities.md's "replaced everywhere" assertion untrue.
Completed the retraction in-place with the honest label-free metric
(82.3% held-out temporal-triplet; v1 was a single-class recording where
a constant "yes" scores ~99.98%):
- docs/readme-details.md — 2 benchmark tables + the pre-trained-model row
- docs/user-guide.md — capability table, model-file comment, applications list
- CHANGELOG.md — annotated the historical entry in-place (kept as public
record per built-in-public ethos, not rewritten)
Verified: no remaining flat "100% presence/accuracy" claim lacks a
retraction marker; proof-of-capabilities.md "replaced everywhere" is now
accurate.
verify.py's published hash is now f8e76f21 (doppler excluded). Document
that the proof reproduces bit-for-bit across Windows / two Linux hosts /
the Azure CI runner, that the peak-normalized Doppler is excluded due to
its cross-microarch argmax instability, and that a relative-tolerance
check against a committed reference vector backs the five stable features.
- README: replace retracted "100% presence" claim with honest 82.3%
held-out temporal-triplet; correct stale "pose model not in this
release" (now live at ruvnet/wifi-densepose-mmfi-pose, 82.69%
torso-PCK@20 SOTA); add a Results & proof table (HF models,
AetherArena, benchmark study, deterministic verify.py proof, witness).
- user-guide: same 100%->82.3% correction in two places; add Results &
proof pointers and the SOTA pose model + AetherArena links.
- docs/proof-of-capabilities.md (new): evidence-first rebuttal to the
"fake / misleading" claims. Concedes what was fair (over-stated early
metrics, AI-doc tone), refutes the category errors (simulate-mode
mistaken for fraud; missing weights mistaken for missing pipeline),
and gives copy-paste "prove it yourself" steps (verify.py VERDICT:
PASS + published SHA-256, cargo test, HF model pull, ESP32 CSI).
Emphasizes built-in-public history (git, 96 ADRs, CHANGELOG, issues
incl. #803/#872 bug->fix arcs) as the anti-facade evidence.
- aether-arena/VERIFY.md: cross-link the whole-platform proof doc.
Verified: python archive/v1/data/proof/verify.py -> VERDICT: PASS
(hash ca58956c...9199 matches published expected_features.sha256).
Co-Authored-By: claude-flow <ruv@ruv.net>
Random frozen encoder + trained head matches a fully-trained encoder to
within 2-4pts (cross-subject <2pts). WiFi-CSI sensing is largely a
random-features + target-readout problem: barely a learned representation
to transfer, which unifies the zero-shot collapse, no-transfer results,
foundation-encoder failure, and why per-room calibration works. Practical:
invest in readout + calibration, not encoder pretraining.
Co-Authored-By: claude-flow <ruv@ruv.net>
Re-ran transfer on 14-class person-ID (harder than 6-activity HAR): same
null-transfer result (MM-Fi pretrain 91.7% = random 92.8%). Unified root
cause: CSI in-domain classification lives in the target-trained readout
(random projection already separable); learned reps don't transfer across
subjects/rooms/datasets. WiFi-CSI is distribution-locked. Addresses the
'HAR too easy' caveat.
Co-Authored-By: claude-flow <ruv@ruv.net>
Tested the cross-dataset frontier: MM-Fi-trained CSI representation does NOT
transfer beneficially to NTU-Fi HAR (frozen probe 91.5% = random features
93%; full fine-tune 75% < probe). CSI reps are distribution-locked, same
root cause as within-MM-Fi cross-subject/-env collapse. Caveat: NTU-Fi 6
coarse activities are an easy target (random->93%). Updates the study's
cross-dataset limitation from 'untested' to this measured result.
Co-Authored-By: claude-flow <ruv@ruv.net>
Consolidates the full campaign into one committed, citable artifact (the
detailed log was in a gitignored staging report): pose SOTA 83.6% + 20KB
int4 edge model; action recognition 88% (a WiFi task MM-Fi never
benchmarked); the generalization story (zero-shot collapse, few-shot
calibration rescue, task-general across pose+action); all honest negatives
(CORAL/DANN/instance-norm/SupCon/distillation/subject-scaling); the 11KB
calibration-adapter deployment recipe; honest limitations (cross-dataset
untested, ARM latency pending).
Co-Authored-By: claude-flow <ruv@ruv.net>
Verified on a 2nd MM-Fi task: 27-class action recognition (which MM-Fi
never benchmarked for WiFi; only published baseline WiDistill 34%). In-domain
88% (leaky); cross-subject zero-shot collapses to ~10%; few-shot calibration
rescues 10->76% (1000 samples). Same mechanism as pose -> few-shot in-room
calibration is the universal WiFi-sensing generalization answer, not a pose
quirk.
Co-Authored-By: claude-flow <ruv@ruv.net>
Decisive capstone: cross-environment (unseen room+people) zero-shot
10.6%, but 5 calibration samples/person -> 60%, 200 -> 73%. The hard
frontier is calibration-soluble, MORE dramatically than cross-subject
(+62.5 vs +12 at K=200). The unsolved-frontier framing was a zero-shot
artifact. Reframes generalization: ship few-shot calibration, not
zero-shot invariance. Recommend accepting ADR-150 re-scoped around the
calibration mechanism.
Co-Authored-By: claude-flow <ruv@ruv.net>
Compared per-room calibration methods at K=200: LoRA rank-8 recovers
63.6->72.5% (SOTA-level) with just 11K params (~11KB), 0.5% the model
size. Validates the ship-base-once + tiny-per-room-adapter mechanism for
the RuView calibration service. Accuracy/size knob documented.
Co-Authored-By: claude-flow <ruv@ruv.net>
Measured cross-subject PCK vs N training subjects: 4->8 = +21pts, but
24->32 = +0.45pt. Saturates ~64%, ~19pt below in-domain. Correction to
'more data': subject-count returns vanish past ~16-20; the residual is
device/room/protocol shift. Re-scope phase-1 capture around DIVERSITY
(rooms/devices/protocols) + few-shot target adaptation, not headcount.
Co-Authored-By: claude-flow <ruv@ruv.net>
Published deployable int4-QAT micro (verified 74.08%, ~20KB) at
ruvnet/wifi-densepose-mmfi-pose/edge. Runs 0.135ms single-thread x86 CPU
(no GPU) - real-time pose without an accelerator. ARM on-device validation
pending fleet availability.
Co-Authored-By: claude-flow <ruv@ruv.net>
Swept model size on MM-Fi random_split: every config from micro (75,237
params, 0.22ms, 74.30%) up beats MultiFormer (72.25%); nano (40K, 0.13ms)
within 0.5pt. Pareto-dominant (smaller AND more accurate than prior SOTA).
Orthogonal to the data-bound accuracy frontier (ADR-150).
Co-Authored-By: claude-flow <ruv@ruv.net>