wifi-densepose

Commit Graph

Author	SHA1	Message	Date
rUv	261ce80a72	feat(adr-260): RuField MFS spec + vendor/rufield submodule (#1061) ADR-260 (Accepted — v0.1 reference stack): RuField, the open specification for camera-free multimodal field sensing — one FieldEvent/FieldTensor/FusionGraph/PrivacyClass/ProvenanceReceipt model above WiFi CSI/CIR/BFLD, UWB, BLE Channel Sounding, mmWave radar, ultrasound, subsonic, infrared, and quantum sensors. Published standalone as github.com/ruvnet/rufield and vendored here as the vendor/rufield submodule (the vendor/rvcsi pattern — not a v2/ workspace member). v0.1 reference stack: 6 crates, 60 tests/0 failed, clippy-clean. All benchmark metrics SYNTHETIC (simulator ground truth, no hardware). Co-authored-by: ruv <ruvnet@gmail.com>	2026-06-14 01:17:11 -04:00
rUv	0c2b1c16cc	fix: ESP32 vitals over-count + presence flicker (#998/#996) + Observatory per-person position/motion (#1050 ) (#1060 ) * fix(firmware): gate phantom persons + add presence hysteresis (#998, #996) Two ESP32 edge-vitals logic bugs in edge_processing.c. Both are robustness/logic fixes — NOT validated-accuracy claims. True count/PCK vs labelled ground truth remains hardware/data-gated (COM9 ESP32-S3). #998 — n_persons over-counted (reported 4 for one person): update_multi_person_vitals() split top-K subcarriers into top_k_count/2 groups and marked EVERY group active, so one body's multipath always read the full EDGE_MAX_PERSONS. Added two pure, host-testable helpers: - count_distinct_persons(): per-group energy gate (EDGE_PERSON_MIN_ENERGY_RATIO) + spatial dedup (EDGE_PERSON_MIN_SC_SEP) so weak/adjacent multipath groups don't count as separate bodies. Strongest group always counts (>=1). - person_count_debounce(): a gated count must hold EDGE_PERSON_PERSIST_FRAMES consecutive frames before it's emitted, so a single noisy frame can't promote a phantom. The active flags now mark only the strongest stable_count groups. #996 — presence flag flickered at ~50cm despite high presence_score: the bare `score > threshold` compare chattered on a noisy score (field-observed 2.6-26.7 frame-to-frame). Replaced with a Schmitt trigger + clear-debounce (presence_flag_update): assert above threshold, hold in the dead band down to threshold * EDGE_PRESENCE_HYST_RATIO, clear only after EDGE_PRESENCE_CLEAR_FRAMES consecutive sub-low frames. presence_score itself is unchanged and still emitted for consumer-side thresholding. All thresholds are named, documented constants in edge_processing.h. Firmware builds clean for esp32s3 (idf.py build RC=0). 
Co-Authored-By: claude-flow <ruv@ruv.net> * test(firmware): host C99 tests for vitals count + presence logic (#998, #996) test/test_vitals_count_presence.c pins the two fixes with deterministic host-buildable tests (no ESP-IDF needed). 13 cases / 22 assertions, all passing under gcc 13 -Wall -Wextra: #998 count gate: single strong signature + multipath -> count==1; two well-separated -> 2; two strong-but-adjacent -> 1 (dedup); no signal -> 0; three well-separated -> 3. #998 debounce: transient spike rejected; sustained change accepted; flapping count stays stable. #996 presence: dithering trace -> stable flag (no flicker); brief dips held by clear-debounce; genuine departure clears within hold window; dead-band holds state. The named tuning constants are #include'd from the real edge_processing.h so the test and firmware can never disagree on thresholds. `make run_vitals` / `make host_tests` added; binaries gitignored. Hardware-gated caveat documented in the test header: these pin the decision LOGIC; the exact energy/separation/hysteresis values that best match a real room vs labelled occupancy remain on-device tuning. Co-Authored-By: claude-flow <ruv@ruv.net> * docs: record ESP32 vitals count/presence fixes (#998, #996) CHANGELOG [Unreleased] Fixed: root cause + fix + named constants + test + explicit hardware/data-gated caveat for both bugs. ADR-021 Implementation Notes: dated 2026-06 entry noting the edge-path person-count + presence-flicker fixes are boolean/count emission-logic fixes, not a validated-accuracy claim; thresholds pending on-device calibration. Co-Authored-By: claude-flow <ruv@ruv.net> * fix(sensing-server): emit real field-derived person position/motion to /ws/sensing (#1050) The Observatory 3D figure never animated because the sensing_update WS frame carried no per-person position/motion_score/pose — only image-space keypoints. 
The FigurePool/PoseSystem (and demo-data.js's own contract) animate each figure from persons[i].position (room-world), .motion_score (0..100), and .pose; none were on the live stream. Honest scope (Case 2): the pipeline has no calibrated per-person room localizer or per-person skeletal pose. New field_localize module extracts the strongest peak(s) from the real signal_field grid (subcarrier variances x motion-band power) and maps the peak cell to Observatory world coords with the exact _buildSignalField transform. motion_score is the measured motion_band_power passed through; pose is set only from a real aggregate posture estimate, else None (never a fabricated skeleton). Empty/below-threshold field -> persons: [] (no phantom); present person with no resolvable peak keeps position [0,0,0], not invented coords. attach_field_positions runs after the tracker step at all five broadcast sites. New position/motion_score/pose fields added to both PersonDetection structs. No UI change needed — the Observatory already reads these fields. Tests: field_localize peak/coordinate/empty/separation units + observatory_persons_field_position_tests (known-peak -> emitted position, empty-room -> no phantom, pose real-or-None, below-threshold honesty). sensing-server bin 441->451, 0 failed. Co-Authored-By: claude-flow <ruv@ruv.net> * docs(changelog): record #1050 Observatory persons position/motion fix Co-Authored-By: claude-flow <ruv@ruv.net>	2026-06-14 00:31:30 -04:00
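Note on the #996 fix in commit 0c2b1c16cc: the Schmitt trigger plus clear-debounce described there can be sketched as below. This is a minimal illustration in Rust, not the firmware's C code. The struct, the method name, and both constant values are hypothetical; the real thresholds are named in edge_processing.h and their values do not appear in this log.

```rust
/// Hypothetical stand-ins for EDGE_PRESENCE_HYST_RATIO and
/// EDGE_PRESENCE_CLEAR_FRAMES; the real values are not given in the log.
const PRESENCE_HYST_RATIO: f32 = 0.5;
const PRESENCE_CLEAR_FRAMES: u32 = 3;

/// Presence flag with hysteresis and a clear-debounce counter.
#[derive(Debug, Default)]
pub struct PresenceState {
    pub present: bool,
    low_frames: u32,
}

impl PresenceState {
    /// Feed one presence score and return the debounced flag.
    pub fn update(&mut self, score: f32, threshold: f32) -> bool {
        let low = threshold * PRESENCE_HYST_RATIO;
        if score > threshold {
            // Above the upper threshold: assert immediately.
            self.present = true;
            self.low_frames = 0;
        } else if score >= low {
            // Dead band: hold the current state and restart the clear count,
            // because clearing requires consecutive sub-low frames.
            self.low_frames = 0;
        } else if self.present {
            // Below the lower threshold: clear only after enough frames in a row.
            self.low_frames += 1;
            if self.low_frames >= PRESENCE_CLEAR_FRAMES {
                self.present = false;
                self.low_frames = 0;
            }
        }
        self.present
    }
}
```

With a threshold of 10.0 the dead band in this sketch is 5.0 to 10.0, so a score dithering inside it never toggles the flag, and a brief dip below 5.0 is held until three such frames arrive back to back.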
rUv	1d12e8831a	refactor(beyond-sota): ADR-155 M2 — host-verifiable §8 closeout (7 de-magic, 9 boundary tests, native-conv honest-null) (#1059 ) * refactor(train): ADR-155 M2 §8 — de-magic train non-tch tuning constants + boundary tests Lift bare numeric literals used as thresholds / guard epsilons in the non-tch (host-verifiable) train surface into named, documented consts and pin each set with a _consts_unchanged_from_literals test. Values are bit-identical to the prior inline literals — cleanup, no behaviour change. De-magicked (const + pin test): - metrics_core.rs: VISIBILITY_THRESHOLD (0.5), MIN_REFERENCE_EXTENT (1e-6), OKS_FALLBACK_SIGMA (0.07) - ruview_metrics.rs: NUM_KEYPOINTS (17), VISIBILITY_THRESHOLD (0.5), PCK_THRESHOLD (0.2), MIN_BBOX_DIAG (1e-3), MIN_DURATION_MINUTES (1e-6) - subcarrier.rs: SPARSE_BASIS_SIGMA (0.15), SPARSE_BASIS_THRESHOLD (1e-4), SPARSE_REGULARIZATION_LAMBDA (0.1), SPARSE_COO_PRUNE_EPS (1e-8), SPARSE_SOLVER_TOL (1e-5 f64), SPARSE_SOLVER_MAX_ITERS (500) - eval.rs: MIN_POSITIVE_MPJPE (1e-10) - domain.rs: LAYER_NORM_EPS (1e-5) - virtual_aug.rs: BOX_MULLER_U1_FLOOR (1e-10), MIN_ROOM_SCALE (1e-10) Boundary / characterization tests (pin CURRENT behaviour): - visibility_threshold_boundary_is_inclusive (>= 0.5 at the edge) - degenerate_extent_below_floor_is_unscoreable ((0,0,0.0)/0.0, not perfect) - tracking_zero_duration_does_not_divide_by_zero - oks_short_array_is_bounded_at_keypoint_count (16 rows, no panic) - compute_interp_weights_single_target_is_index_zero (target_sc==1) - sparse_interp_single_target_is_finite - domain_gap_infinite_when_in_domain_perfect_but_cross_nonzero - domain_gap_unity_when_everything_perfect - augment_frame_zero_room_scale_passes_amplitude_finite Doc-only (no behaviour change): - rapid_adapt.rs: correct module-doc O(eps) -> O(eps^2) for central differences - geometry.rs: add # Panics to DeepSets::encode (documents existing assert!) train --no-default-features: 191 lib (was 176), 303 total (was 288), 0 failed. 
Co-Authored-By: claude-flow <ruv@ruv.net> feat(nn): ADR-155 M2 §3 — pure-Rust LinearHead::try_new input guard + de-magic softplus threshold ADR-155 §3 found rf_encoder.rs has no adversarial checkpoint-deserialization assert — its assert_eq!s in LinearHead::new are construction-time API contracts on programmer-supplied vectors. This adds the honest, in-scope improvement the M2 task allows: a pure-Rust fallible constructor so weights from an untrusted / deserialized checkpoint can be shape-validated without panicking. - Add RfHeadError (WeightShape / BiasShape / VarWeightShape) + Display + Error. - Add LinearHead::try_new returning Result<Self, RfHeadError>; on success the head is byte-identical to LinearHead::new. new() is unchanged (still asserts; now documents # Panics and points to try_new) — no behaviour change for existing callers. - De-magic softplus's bare 20.0 overflow threshold into SOFTPLUS_LINEAR_THRESHOLD (value unchanged) + pin test. Tests: try_new_accepts_valid_and_rejects_each_bad_shape (valid == new forward; each bad shape → typed error), softplus_threshold_unchanged_from_literal. nn --no-default-features lib: 37 passed (was 35), 0 failed. Co-Authored-By: claude-flow <ruv@ruv.net> * perf(nn): ADR-155 M2 §4 — native-conv bench-first → MEASURED-INCONCLUSIVE (no perf change shipped) The §8 "native-conv naive-loop rewrite" backlog item: DensePoseHead:: apply_conv_layer is a pure-Rust 6-nested-loop conv (benchable on this host, not tch/ort-gated). Bench-first per the §0 PROOF discipline. - Add committed criterion bench benches/native_conv_bench.rs measuring forward() through the naive conv on representative single-layer configs (--no-default- features; no ort download). - Prototyped a bit-identical range-clamped variant (hoist the per-tap in-bounds branch by pre-clamping kh/kw ranges; same ic→kh→kw MAC order ⇒ bit-identical). 
MEASURED before/after on this host: ~35% faster on padding-heavy small-channel maps (4.40→2.84 ms) but a ~3% regression on channel-heavy maps (11.09→11.48 ms), all inside a ±20% run-to-run noise floor. Verdict: INCONCLUSIVE — the benefit is not robustly positive, so the rewrite is NOT shipped and NOT a fabricated speedup. Reverted to the naive loop; honestly deferred (ADR-155 §8). - Add native_conv_matches_reference: a hand-computed characterization anchor (1×1 = scalar MAC; same-padded 3×3 ones = truncated-window sums 9/6/4) pinning CURRENT conv behaviour for any future rewrite. nn --no-default-features lib: 38 passed (was 37), 0 failed. No behaviour change. Co-Authored-By: claude-flow <ruv@ruv.net> * docs(adr-155): M2 §8.2 — enumerated host-verifiable P3 backlog clearance + CHANGELOG Replace the §8 bulk "~40 lower-severity findings" line with the real, enumerated M2 resolution (§8.2): 7 de-magicked (const + pin == prior literal), 9 boundary tests, 1 input guard (rf_encoder try_new), 2 doc-only, 1 perf bench-first MEASURED-INCONCLUSIVE (not shipped). Mark native-conv + rf_encoder RESOLVED; state which §8 items stay data-gated (GraphPose-Fi/INT4/CSI-JEPA) or tch-gated (proof/trainer/model panic sites, metrics *_v2 dead code) and ONNX read-lock upstream-gated — blocked, not dropped. Declare the non-tch-verifiable subset of §8 cleared. Validation: train --no-default-features 303 passed (was 288); nn lib 38 (was 35); workspace --no-default-features 3,293 passed, 0 failed; Python proof VERDICT PASS, hash f8e76f21…46f7a UNCHANGED bit-exact. Co-Authored-By: claude-flow <ruv@ruv.net>	2026-06-14 00:07:56 -04:00
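Note on the LinearHead::try_new guard in commit 1d12e8831a: the pattern of a fallible constructor next to a panicking one might look like the sketch below. The field layout (row-major weights of length in_dim * out_dim, bias of length out_dim), the type names, and the forward signature are assumptions made for illustration; the log only names RfHeadError with its WeightShape / BiasShape / VarWeightShape variants and the Result<Self, RfHeadError> return type.

```rust
use std::fmt;

/// Hypothetical error type modelled on the RfHeadError variants named in the log.
#[derive(Debug, PartialEq, Eq)]
pub enum HeadShapeError {
    WeightShape { expected: usize, got: usize },
    BiasShape { expected: usize, got: usize },
}

impl fmt::Display for HeadShapeError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            Self::WeightShape { expected, got } => {
                write!(f, "weight length {got}, expected {expected}")
            }
            Self::BiasShape { expected, got } => {
                write!(f, "bias length {got}, expected {expected}")
            }
        }
    }
}

impl std::error::Error for HeadShapeError {}

/// Illustrative linear head; the layout is an assumption, not the crate's type.
#[derive(Debug, PartialEq)]
pub struct LinearHeadSketch {
    in_dim: usize,
    out_dim: usize,
    weights: Vec<f32>, // row-major, out_dim rows of in_dim values
    bias: Vec<f32>,
}

impl LinearHeadSketch {
    /// Validate shapes and return a typed error instead of panicking.
    pub fn try_new(
        in_dim: usize,
        out_dim: usize,
        weights: Vec<f32>,
        bias: Vec<f32>,
    ) -> Result<Self, HeadShapeError> {
        // saturating_mul keeps hostile dimensions from overflowing the check.
        let expected = in_dim.saturating_mul(out_dim);
        if weights.len() != expected {
            return Err(HeadShapeError::WeightShape { expected, got: weights.len() });
        }
        if bias.len() != out_dim {
            return Err(HeadShapeError::BiasShape { expected: out_dim, got: bias.len() });
        }
        Ok(Self { in_dim, out_dim, weights, bias })
    }

    /// Panicking constructor for programmer-supplied vectors.
    ///
    /// # Panics
    /// Panics on a shape mismatch; use `try_new` for untrusted input.
    pub fn new(in_dim: usize, out_dim: usize, weights: Vec<f32>, bias: Vec<f32>) -> Self {
        Self::try_new(in_dim, out_dim, weights, bias).expect("LinearHeadSketch shape mismatch")
    }

    /// Compute y = W x + b, or None if `x` has the wrong length.
    pub fn forward(&self, x: &[f32]) -> Option<Vec<f32>> {
        if x.len() != self.in_dim {
            return None;
        }
        let mut y = self.bias.clone();
        for o in 0..self.out_dim {
            for i in 0..self.in_dim {
                y[o] += self.weights[o * self.in_dim + i] * x[i];
            }
        }
        Some(y)
    }
}
```

In this sketch new delegates to try_new, which differs from the commit (there new keeps its original asserts); the point is only that both paths yield the same head on valid input while deserialized weights can be rejected without a panic.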
rUv	8c24b8bdfe	refactor(beyond-sota): ADR-154 M3 — clear §7.4 P3 backlog (22 de-magic + 6 boundary tests, backlog 36→0) (#1057 ) * refactor(signal): de-magic motion.rs tuning constants (ADR-154 §7.4 #18) Lift the bare fusion weights, normalization scales, confidence-indicator weights, and adaptive-threshold clamp bounds in motion.rs out of the scoring functions into named, documented EMPIRICAL-DEFAULT consts. Values are bit-identical to the prior literals — this is cleanup, no behaviour change. Adds boundary/characterization tests pinning current behaviour: - motion_tuning_consts_unchanged_from_literals (consts == old literals) - doppler_component_saturates_at_full_scale (/100 then clamp(0,1)) - correlation_score_zero_below_n2_boundary (n<2 guard) - temporal_variance_zero_below_two_history (len<2 guard) - adaptive_threshold_engages_at_history_boundary (history 9 vs 10) Co-Authored-By: claude-flow <ruv@ruv.net> * refactor(signal): gesture.rs euclidean length guard + de-magic (ADR-154 §7.4 #12) - Add a debug_assert! to euclidean_distance documenting the same-dimension caller contract: zip() silently truncates on a length mismatch, so a mismatch is now loud in debug builds while the release operating path and output are unchanged. - De-magic the bare 1e-10 confidence epsilon into a documented const CONFIDENCE_SECOND_BEST_EPSILON (value unchanged). Tests pinning current behaviour: - confidence_epsilon_unchanged_from_literal - dtw_empty_sequence_is_infinite (n=0/m=0 boundary) - euclidean_distance_equal_length_is_l2 (same-dim contract) Co-Authored-By: claude-flow <ruv@ruv.net> * refactor(signal): de-magic longitudinal.rs drift thresholds (ADR-154 §7.4) Lift the bare drift-detection literals (7-day baseline, 2-sigma z-score, 3-day sustained, 7-day escalation, EMA alpha, cosine epsilon) into named, documented EMPIRICAL-DEFAULT consts encoding the module's Key Invariants. The duplicated `>= 7` in is_ready/is_ready_at now share one const. 
EMA alpha kept as the exact 0.05 literal (1.0 - 0.95_f32 is not bit-identical in f32). Values unchanged. Tests: - drift_consts_unchanged_from_literals - is_ready_at_day_boundary (day 6 vs 7) - cosine_similarity_zero_vector_is_zero (zero-norm guard) Co-Authored-By: claude-flow <ruv@ruv.net> * refactor(signal): de-magic division/zero-norm epsilons + boundary tests (ADR-154 §7.4) De-magic the bare division-guard epsilons in four modules into named, documented consts (values unchanged) and pin the previously-untested zero-norm / zero-variance / degenerate boundaries: - cross_room.rs: COSINE_SIMILARITY_EPSILON (1e-9) + test_cosine_similarity_zero_vector - multiband.rs: PEARSON_DENOMINATOR_EPSILON (1e-12) + pearson_correlation_zero_variance - intention.rs: LEAD_TIME_MIN_ACCEL (1e-10) + lead_time_zero_for_static_stream - hampel.rs: ZERO_MAD_EPSILON (1e-15) + test_zero_half_window_error + test_zero_mad_constant_window; documented hampel_filter # Errors Each module also gets a _unchanged_from_literal const-pin test. Co-Authored-By: claude-flow <ruv@ruv.net> refactor(signal): de-magic rf_slam + attractor_drift constants (ADR-154 §7.4) rf_slam.rs: - NS_PER_DAY (86_400_000_000_000.0), MIGRATION_MIN_SPAN_DAYS (1e-9), and the fixed-map defaults (FIXED_MAP_ASSOC_RADIUS_M/MIN_SIGHTINGS/MIN_COHERENCE) lifted out of inline literals (values unchanged). - migration_zero_span_is_zero_rate pins the single-sighting zero-span guard. attractor_drift.rs: - METRIC_BUFFER_CAPACITY (365), STABLE_CENTER_WINDOW (10) de-magicked. - Documented the implicit recent.len()>=1 divide-safety in the PointAttractor branch (guaranteed by the count < min_observations guard). - analyze_min_observations_boundary pins the off-by-one boundary. Each module gets a _consts_unchanged_from_literals pin test. 
Co-Authored-By: claude-flow <ruv@ruv.net> refactor(signal): de-magic coherence.rs variance floor + default decay (ADR-154 §7.4) Completes the M1 #9 de-magic for coherence.rs: the four bare 1e-6 variance-floor literals (update_reference floor + coherence_score/per_subcarrier_zscores epsilon) collapse to one VARIANCE_FLOOR const, and the inline 0.95 default decay becomes DEFAULT_EMA_DECAY. Values unchanged. Tests: - drift_consts_unchanged_from_literals extended (VARIANCE_FLOOR, DEFAULT_EMA_DECAY) - coherence_score_finite_with_zero_variance pins the floor's effect Co-Authored-By: claude-flow <ruv@ruv.net> * refactor(signal): de-magic calibration.rs thresholds + min-frames default (ADR-154 §7.4 #2) Lift the bare calibration literals into named EMPIRICAL-DEFAULT consts (values unchanged, bit-identical; calibration is off the Python proof path): - DEFAULT_MIN_FRAMES (600) — was repeated across all four tier constructors - AMP_STD_FLOOR (1e-12) z-score divisor floor - MOTION_AMP_Z_THRESHOLD (2.0) / MOTION_PHASE_DRIFT_THRESHOLD (π/6) — the two motion_flagged sites now share one definition - SUBTRACT_MIN_NORM (1e-30) baseline-subtraction guard Test calibration_consts_unchanged_from_literals pins all five and asserts every tier constructor shares DEFAULT_MIN_FRAMES. Co-Authored-By: claude-flow <ruv@ruv.net> * refactor(signal): de-magic fusion_quality + temporal_gesture constants (ADR-154 §7.4) fusion_quality.rs: - CONTRADICTION_PENALTY (0.8) and CONTRADICTION_BOUND_HALFWIDTH (0.1) named. - no_contradiction_is_identity pins the n=0 boundary (penalty 0.8^0 = 1.0, zero-width bounds). temporal_gesture.rs: - CONFIDENCE_SECOND_BEST_EPSILON (1e-10, mirrors gesture.rs) and NORM_QUANTIZATION_SCALE (1000.0) named. Each module gets a _consts_unchanged_from_literals pin test. Values unchanged. 
Co-Authored-By: claude-flow <ruv@ruv.net> docs(adr-154): record Milestone-3 — §7.4 row #21-45 P3 backlog cleared Replace the lumped #21-45 backlog row with the enumerated M3 resolution: 22 magic constants de-magicked into named EMPIRICAL-DEFAULT consts (each pinned == prior literal), 6 boundary/characterization tests, ~4 doc-only, across 11 modules; not-real findings reported + skipped (unreachable attractor_drift div0, non-existent gesture thresholds, proof-path features.rs). Update residual P3 rows #2/#12/#17/#18 to RESOLVED, the deferred count (36 -> 0), the scope field, and the Horizon-ledger one-liner. §7.4 backlog fully cleared across M0-M3. CHANGELOG [Unreleased] entry added. Validation: signal lib --no-default-features 476/0/1; --features cir 476/0; workspace 3,275/0; Python proof PASS, hash f8e76f21...46f7a UNCHANGED. Co-Authored-By: claude-flow <ruv@ruv.net> --------- Co-authored-by: ruv <ruvnet@gmail.com>	2026-06-13 19:36:05 -04:00
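Note on the de-magic pattern used throughout commit 8c24b8bdfe: a bare division-guard epsilon becomes a named constant, a pin test asserts the constant still equals the old literal, and a boundary test covers the degenerate input. The sketch below applies that to a cosine similarity. The constant name and its 1e-9 value come from the log (cross_room.rs); the f64 element type, the function body, and the exact way the guard is applied (comparing the norm product against the epsilon) are assumptions.

```rust
/// Guard for the norm product; value taken from the log entry for cross_room.rs.
pub const COSINE_SIMILARITY_EPSILON: f64 = 1e-9;

/// Cosine similarity that returns 0.0 when either vector is (near) zero-norm.
/// How the real module applies the epsilon is not shown in the log; this is
/// one plausible form.
pub fn cosine_similarity(a: &[f64], b: &[f64]) -> f64 {
    // zip() silently truncates on a length mismatch, so make it loud in debug builds.
    debug_assert_eq!(a.len(), b.len(), "cosine_similarity: dimension mismatch");
    let dot: f64 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let norm_a = a.iter().map(|x| x * x).sum::<f64>().sqrt();
    let norm_b = b.iter().map(|x| x * x).sum::<f64>().sqrt();
    let denom = norm_a * norm_b;
    if denom < COSINE_SIMILARITY_EPSILON {
        0.0
    } else {
        dot / denom
    }
}
```

A pin test in this style compares bit patterns (COSINE_SIMILARITY_EPSILON.to_bits() against 1e-9_f64.to_bits()), so a later edit to the constant fails loudly instead of silently shifting behaviour.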
rUv	91248536bc	feat(beyond-sota): ADR-156 M2 — RaBitQ unbiased distance estimator (rigorous published negative on strict-K) (#1056 ) * feat(ruvector): RaBitQ unbiased distance estimator (ADR-156 M2) Implement the real Gao & Long (SIGMOD 2024) RaBitQ contribution on top of the existing Pass-2 rotation: an unbiased estimator of the inner product / squared distance recovered from the 1-bit code plus 8 B/vec per-vector side info (residual_norm + x_dot_o), used to rerank the candidate set instead of raw Hamming. - src/estimator.rs (new): EstimatorSketch, SideInfo, EstimatorQuery, DistanceEstimator (estimate_inner_product / estimate_sq_distance / ranking_key / cosine_ranking_key), EstimatorBank (topk_estimated[_cosine], with_centroid). Zero-centroid simplification documented; paper-faithful centroid path also built. - src/rotation.rs: extract apply_padded() (full padded FHT frame the code lives in); apply() now truncates apply_padded(). No behaviour change. - lib.rs: export estimator types. Additive + backward-compatible: Pass-1 Sketch / Pass-2 SketchBank / WireSketch wire format unchanged; all external callers use Pass-1 and are unaffected. Co-Authored-By: claude-flow <ruv@ruv.net> * test(ruvector): estimator strict-K coverage harness (ADR-156 M2) Add measure_estimator (cosine rerank) + measure_estimator_euclidean to the coverage harness, on the BIT-IDENTICAL fixture / cluster centres / query stream / cosine ground truth as measure_pass1/measure_pass2 — apples-to-apples sign-Hamming vs unbiased-estimator-rerank. Regression tests: - estimator_rerank_not_worse_than_sign (>= sign-only Pass-2 on a fixed fixture) - estimator_coverage_is_deterministic - estimator_coverage_report (--nocapture prints the strict-K table) MEASURED strict-K (candidate_k=K=8): Pass-1 36.13% -> Pass-2-sign 46.39% -> estimator-cosine 49.71%. Still short of the ADR-084 90% strict bar; estimator reaches 95.12% at candidate_k=24 (vs sign 91.60%). Published negative. 
Co-Authored-By: claude-flow <ruv@ruv.net> * docs(ruvector): record RaBitQ estimator measured negative (ADR-156 §11, ADR-084) - sketch_bench: estimator cosine/euclid columns in the coverage table. - ADR-156 §11 (new): estimator formula + zero-centroid simplification stated honestly; strict-K coverage table; RESOLVED-NEGATIVE verdict (49.71% strict, short of 90%); pinning test names. §5 #2 + §10.5 updated. - ADR-084 'Pass 2b' (new): estimator landed + measured strict-K vs the bar. - CHANGELOG [Unreleased]: ADR-156 §11 Milestone-2 entry. Co-Authored-By: claude-flow <ruv@ruv.net>	2026-06-13 18:24:40 -04:00
rUv	865f9dee77	perf(beyond-sota): ADR-154 M2 — FFT planner hoist (1.84x, bit-identical) + 3 honest perf nulls + boundary tests (#1055 ) * perf(signal): hoist FFT planner across subcarriers (ADR-154 §7.4 #20) compute_multi_subcarrier_spectrogram called compute_spectrogram once per subcarrier, and each call built a fresh FftPlanner + re-planned the same length-window_size FFT. Hoist the plan + window out of the per-subcarrier loop via a new compute_spectrogram_with_plan core that takes a pre-planned Arc<dyn Fft> and pre-built window. compute_spectrogram delegates to it (unchanged behaviour); the multi-subcarrier path plans once and reuses. MEASURED-HOT (dsp_perf_bench, this box): at 56 subcarriers, window 128, fresh-planner-per-subcarrier 467.88 µs -> hoisted-plan 254.75 µs = 1.84x; window 256: 627.27 µs -> 448.39 µs = 1.40x. Plan-forward cost alone is ~1.86 µs (w128), x56 subcarriers ~= the removed delta. Output is bit-identical: multi_subcarrier_hoisted_plan_bit_identical compares f64::to_bits of every spectrogram value + freq/time resolution against the per-call fresh-planner path across all 4 window functions x {power,magnitude} on a 56-subcarrier matrix. The numeric STFT body is the old loop verbatim; only plan/window construction is lifted. Co-Authored-By: claude-flow <ruv@ruv.net> * test(signal): boundary/tolerance tests for ADR-154 §7.4 #14 #16 #19 Three "+ test" backlog gaps closed — pure additions, no behaviour change (phase_align refactor is internal: estimate_phase_offsets still returns the identical offset vector; a counted core is split out only to observe the iteration count). #14 cir.rs fft_operator — fft_operator_within_tolerance_of_dense_canonical56: the opt-in FFT Φ/Φᴴ path changes the witness hash, so pin it numerically CLOSE to the dense path (not silently divergent). 
Asserts the full Cir output (every tap within 1e-2·dominant, dominant idx/ratio, active_tap_count, ranging_valid, rms_delay_spread) on the production canonical-56 config across τ ∈ {20,50,90} ns. Extends the existing HT20/single-τ test. #16 phase_align.rs — refinement_terminates_at_iteration_cap_when_not_converging: forces non-convergence (tolerance=0.0, unreachable) and asserts the loop runs exactly max_iterations then returns — proving the cap, not convergence, bounds the loop (no infinite spin). Companion refinement_converges_before_cap_on_easy_input proves the cap is an upper bound, not the only exit. #19 csi_ratio.rs — ratio_finite_at_and_below_1e_12_epsilon: the module implements the CSI ratio as the conjugate product H_i·conj(H_j) (no division), so it is finite even at/below the 1e-12 magnitude boundary a naive H_i/H_j division would need an epsilon to guard. Pins finiteness + bit-exact conjugate product at the boundary (zero target → zero, never inf/NaN), through the amplitude/phase extraction. cargo test -p wifi-densepose-signal --no-default-features --lib: 447 passed, 0 failed; --features cir --lib: 447 passed, 0 failed. Co-Authored-By: claude-flow <ruv@ruv.net> * docs(adr-154): record Milestone-2 P2-perf verdicts + boundary tests (§7.4) §7.4: #20 MEASURED-HOT (1.40–1.84× spectrogram FFT-plan hoist, bit-identical); #5/#6/#7 MEASURED-NULL (benched, not hot, left as-is — sub-µs / stack-only / alloc-once); #8 MEASUREMENT-ONLY (per-call 56×56 eigh cost; eigenvalue/BLAS backend un-buildable on this Windows host, number deferred to a BLAS box, NOT fabricated; also corrects the finding — extract_perturbation reuses cached modes, the recompute is in estimate_occupancy). #14/#16/#19 RESOLVED (tolerance / convergence-cap / epsilon-boundary tests). Updated §7.4 intro + Horizon-ledger (deferred count 41→36). CHANGELOG [Unreleased] entry added. 
Co-Authored-By: claude-flow <ruv@ruv.net> * bench(signal): committed P2 bench-first benches (ADR-154 §7.4 #5/#6/#7/#8/#20) New dsp_perf_bench.rs backs every Milestone-2 perf verdict with a committed criterion bench — no speedup claimed without a before/after number here, and a benched NULL is the proof a micro-opt was unnecessary (the §5.x "already amortized" pattern). Registered in Cargo.toml [[bench]]. MEASURED (this box, criterion medians): #20 spectrogram_multi_subcarrier (fresh vs hoisted plan): MEASURED-HOT — 467.88→254.75 µs (1.84x) @ sc56/w128; 627.27→448.39 µs (1.40x) @ sc56/w256. Optimized in the prior commit. #5 multistatic_attention/weights: MEASURED-NULL — 181 ns (2 nodes) .. 848 ns (8 nodes); sub-µs, no hot-path alloc — left as-is. #6 tomography_reconstruct/solve: MEASURED-NULL — 47.5 µs (16 links) / 60.4 µs (32 links) for a full 50-iter ISTA solve; the 2 per-solve voxel buffers (~4 KB) are negligible vs O(iters·links·voxels) compute, and reconstruct(&self) reuses them across iterations already — left as-is. #7 pose_kalman_update/cycles: MEASURED-NULL — 150 ns (17 kpts) / 2.82 µs (170); the Kalman "gain matrices" are fixed-size STACK arrays ([[f32;3];6]), zero heap — nothing to reuse — left as-is. #8 field_model_occupancy (eigenvalue feature): MEASUREMENT-ONLY — quantifies the per-call n×n eigendecomposition cost; incremental SVD is a sized future project, not attempted (number recorded in ADR-154 §7.4). Reproduce: cargo bench -p wifi-densepose-signal --no-default-features --bench dsp_perf_bench cargo bench -p wifi-densepose-signal --bench dsp_perf_bench # adds #8 Cargo.lock: dev-dep (criterion/clap) graph + crate version bumps from the build; no runtime-dependency change. Co-Authored-By: claude-flow <ruv@ruv.net>	2026-06-13 17:34:37 -04:00
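Note on the #19 csi_ratio.rs test in commit 865f9dee77: the log states that the module forms the CSI ratio as the conjugate product H_i·conj(H_j), so no division and no epsilon guard is involved. A minimal sketch of why that stays finite is below. The Complex struct and function names are hypothetical and exist only to keep the example dependency-free; the real module's types are not shown in the log.

```rust
/// Minimal complex number, defined here only so the sketch needs no crates.
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct Complex {
    pub re: f64,
    pub im: f64,
}

/// H_i * conj(H_j): (a + bi)(c - di) = (ac + bd) + (bc - ad)i.
/// Only multiplies and adds are used, so a zero or tiny H_j cannot produce inf/NaN.
pub fn conjugate_product(h_i: Complex, h_j: Complex) -> Complex {
    Complex {
        re: h_i.re * h_j.re + h_i.im * h_j.im,
        im: h_i.im * h_j.re - h_i.re * h_j.im,
    }
}

/// Amplitude and phase of a complex value.
pub fn amplitude_phase(z: Complex) -> (f64, f64) {
    (z.re.hypot(z.im), z.im.atan2(z.re))
}
```

The phase of the product equals the phase difference between the two inputs, which is the quantity a division-based ratio H_i/H_j would also give, while the amplitude is the product of the magnitudes rather than their quotient.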
rUv	cf2a85db66	feat(beyond-sota): ADR-157 M1 — constant-time HMAC compare + MEASURED 5.57x native wlanapi scan (#1054 ) * fix(hardware): constant-time HMAC sync-beacon tag compare (ADR-157 §B4) AuthenticatedBeacon::verify compared the 8-byte HMAC-SHA256 tag with `self.hmac_tag == expected`, which short-circuits on the first differing byte and leaks, via verification latency, how many leading bytes a forged tag matched — a byte-by-byte tag-recovery oracle (~256·N trials vs 256^N). Replace with a hand-rolled branch-free `constant_time_tag_eq`: XOR-accumulate every byte difference into a single u8 with no early exit, compare to zero once. `#[inline(never)]` + `core::hint::black_box(diff)` resist the optimizer reintroducing a short-circuit or a non-constant-time memcmp; length mismatch returns false without inspecting contents. No new dependency — ADR-157 had deferred this only to avoid the `subtle` crate; a fixed 8-byte compare needs none. Test (hard gate): tag_compare_is_constant_time_shape — equal / first-differ / last-differ / all-differ / length-mismatch + end-to-end verify() last-byte tamper. Proven to fail on a last-byte-skipping constant-time bug. A coarse timing smoke check (tag_compare_timing_invariance_smoke) is #[ignore]d to avoid CI flakiness. Grade MEASURED (constant-time construction). ADR-157 §8 §B4 → RESOLVED. wifi-densepose-hardware: 164 passed / 0 failed. Co-Authored-By: claude-flow <ruv@ruv.net> * feat(wifiscan): MEASURE native wlanapi.dll vs netsh throughput (ADR-157 §5 #4) ADR-157 §5 #4 recorded the native wlanapi.dll multi-BSSID fast path as "asserted but NOT implemented; live scanner is the ~2 Hz netsh shim". 
Audit finding: that status is stale — wlanapi_native::scan_native already implements the real WlanOpenHandle → WlanEnumInterfaces → WlanGetNetworkBssList → WlanFreeMemory/WlanCloseHandle FFI (handle cleanup on all exits, length-bounded buffer walks, #[cfg(windows)] with typed Unsupported off-Windows), and WlanApiScanner::scan_instrumented already wires it native-first with a netsh fallback. The missing piece was an honest MEASUREMENT. Add benchmark_backend(backend, window): drives one specific backend over a fixed wall-clock window so netsh is timed independently (the existing benchmark() picks native-first and so never measures netsh on a box where native works). Returns None for an unavailable native path (honest negative, not a fabricated number). MEASURED on this box (Intel Wi-Fi 7 BE201 320MHz, 2026-06-13), 10 s window: native 21.42 Hz vs netsh 3.84 Hz = 5.57× (mean 5.0 BSSIDs/scan each). native-only run: 18.0 Hz. 50/50 back-to-back native scans, no handle leak. A real positive result — NOT a fabricated 10×. Achieved 21.4 Hz is in the asserted >2 Hz regime, below the asserted 10–20 Hz upper bound. Tests (live-WLAN, #[ignore] for CI, RUN here): measure_native_vs_netsh_throughput, native_scans_dont_leak_handles, measure_native_scan_rate. Non-ignored pin native_scan_runs_real_ffi_on_windows (pre-existing) stays green. wifi-densepose-wifiscan: 94 passed / 0 failed. ADR-157 §5 #4 + §8 → MEASURED (was ACCEPTED-FUTURE / CLAIMED-unmeasured). Co-Authored-By: claude-flow <ruv@ruv.net>	2026-06-13 16:32:34 -04:00
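Note on the §B4 fix in commit cf2a85db66: the branch-free comparison described there (XOR-accumulate every byte difference into one u8, compare to zero once) can be sketched as follows. The function name matches the one quoted in the log, but the body is a reconstruction from that description, not the repository's code, and a construction like this reduces rather than formally guarantees timing leaks under every compiler.

```rust
use core::hint::black_box;

/// Compare two tags without exiting early on the first differing byte.
/// A length mismatch returns false without inspecting contents.
#[inline(never)]
pub fn constant_time_tag_eq(a: &[u8], b: &[u8]) -> bool {
    if a.len() != b.len() {
        return false;
    }
    let mut diff: u8 = 0;
    for (x, y) in a.iter().zip(b.iter()) {
        // OR in every byte difference; there is no data-dependent branch in the loop.
        diff |= x ^ y;
    }
    // black_box discourages the optimizer from reintroducing a short-circuit.
    black_box(diff) == 0
}
```

For a fixed 8-byte tag the loop always performs eight XOR/OR steps, so the time taken does not depend on how many leading bytes of a forged tag happen to match.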
rUv	9b07dff298	feat(beyond-sota): ADR-155 metric unification + ADR-156 RaBitQ Pass-2 (honest negative + latent topk bugfix) (#1053 ) * refactor(train): hoist canonical PCK/OKS to un-gated metrics_core; fold test_metrics onto production (ADR-155 M1 §8) ADR-155 §8 deferred item: test_metrics.rs reference kernels validated production against their OWN reimplementation — a test that cannot catch a canonical-impl bug (both could be wrong the same way). - Extract canonical_torso_size / pck_canonical / oks_canonical / sigmas / bounding_box_diagonal into a new NON-tch-gated `metrics_core` module, so the single metric definition is reachable under `cargo test --no-default-features` (the `metrics` module is tch-gated). `metrics` re-exports every item → still exactly ONE implementation. - Rewrite tests/test_metrics.rs to assert the PRODUCTION pck_canonical / oks_canonical equal hand-computed fixtures (not a reimplementation): canonical_pck_matches_hand_computed_fixture (corr=3/total=4/pck=0.75), hip↔hip normalizer pin, zero-visible⇒0.0, OKS perfect⇒1.0, fake-Gold pin. - Keep an INDEPENDENT raw-threshold reference kernel only as a differential cross-check: test_kernel_agrees_with_canonical asserts it AGREES with canonical where torso==1.0 (genuine cross-check, not duplication). Grade: MEASURED. test_metrics 10→12 tests, 0 failed. Co-Authored-By: claude-flow <ruv@ruv.net> * fix(sensing-server): relabel divergent live PCK/OKS so they're never conflated with canonical (ADR-155 M1 §2.1/§8 Goal C) Goal C named training_api.rs:804 (torso-HEIGHT PCK). Auditing it surfaced TWO findings the ADR-155 §1 table missed: 1. training_api.rs is an ORPHAN file — not declared `mod` in lib.rs OR main.rs, so it does NOT compile into the crate. It does not drive the live server. 2. 
The REAL live `best_pck`/`best_oks` (main.rs training path → RVF metadata JSON read by model_manager.rs) come from trainer.rs: - `pck_at_threshold` = RAW-threshold PCK, NO torso normalization (the most divergent kind), printed/serialized as bare "PCK@0.2". - `oks_map` calls `oks_single(area=1.0)` = the EXACT fake-Gold pattern ADR-155 §2.1 claimed closed elsewhere — still live here, inflating best_oks. Resolution = RELABEL (torso/raw math is load-bearing on different data; the pub fns can't be renamed without breaking API; sensing-server has no train/ ndarray dep). Honest unify is a tracked §8 backlog item. - training_api.rs: `compute_pck` → `compute_pck_torso_height` + divergence doc; val_pck/best_pck/val_oks struct fields documented as torso-HEIGHT proxies; logs say `pck_torso_h@0.2`. Test torso_pck_is_labelled_distinctly_from_canonical. - trainer.rs (LIVE): `pck_at_threshold` documented raw-unnormalized; `oks_map` area=1.0 flagged fake-Gold; test pck_at_threshold_is_raw_unnormalized_not_canonical. - main.rs: live print relabelled `pck_raw@0.2` / `oks_map(area=1.0 proxy)`. No wire-format field renames (back-compat); no pub-API rename (no silent break). Grade: MEASURED (relabel + divergence pinned). sensing-server 450→451 lib tests, 0 failed. Co-Authored-By: claude-flow <ruv@ruv.net> * docs(adr-155): mark §8 metric items RESOLVED + audit map + honest §1 under-count correction (M1b Goals A/D) - §8.1: full PCK/OKS audit map (every def: file:line, basis, canonical/ legacy/distinct), the two §8 items marked RESOLVED with resolution+why. - Honest finding: §1's "seven divergent metrics" was an UNDER-count — sensing-server's LIVE trainer.rs has a raw-unnormalized PCK and an area=1.0 fake-Gold OKS the table omitted, and the file §8 named (training_api.rs) is orphaned dead code. §9 honest-limits updated. - Goal D: metrics.rs _v2 variants confirmed caller-less + deprecated; noted for future cleanup, NOT deleted (public API, tch-gated). - CHANGELOG [Unreleased] Fixed entry. 
Co-Authored-By: claude-flow <ruv@ruv.net> feat(ruvector): RaBitQ Pass-2 randomized rotation + topk bugfix (ADR-156 §8) Implements the deferred "Multi-bit / Extended RaBitQ Pass 2" backlog item from ADR-156 §8: a deterministic randomized orthogonal rotation applied before sign-quantization, the published RaBitQ construction (Gao & Long, SIGMOD 2024). Rotation construction: Fast Hadamard Transform + seeded ±1 sign flips ("HD" / randomized Hadamard), O(d log d) time and O(d) memory — a dense d×d rotation is O(d²) and infeasible at the 65,535-d the wire format provisions for. Pads to the next power of two; SplitMix64 seeds the sign stream so index-time and query-time rotations are bit-identical. API is additive and backward-compatible: Pass 1 (`from_embedding`) is untouched; Pass 2 is opt-in via `Sketch::from_embedding_rotated` and `SketchBank::with_rotation` (+ `insert_embedding` / `topk_embedding` / `novelty_embedding` helpers that rotate consistently). Default behaviour is unchanged. While building the Pass-2 coverage harness, found and fixed a PRE-EXISTING correctness bug in `SketchBank::topk`: the n>k heap path used `BinaryHeap<Reverse<(d,id)>>` (a min-heap) but treated its peek as the max, so it returned the k FARTHEST sketches as "nearest". The shipped unit tests only exercised the n≤k fast path, so it went unnoticed. Fixed to a plain max-heap; pinned by `topk_heap_path_returns_nearest` and `tight_clusters_give_high_coverage_with_overfetch` (the latter measured 0.072 on the old code). New tests (+17, 100→117 in the crate): rotation determinism/norm-preservation (`rotation_is_deterministic_for_seed`, `rotation_preserves_norm`), Pass-2 shape-compatibility, `pass2_coverage_not_worse_than_pass1`, and a deterministic coverage report. 
MEASURED top-K coverage (anisotropic planted-cluster fixture, cosine ground truth; dim=128 N=2048 K=8 64 clusters noise=0.35 128 queries): candidate_k=K=8 : Pass1 36.13% -> Pass2 46.39% (both << 90% bar) candidate_k=24 : Pass1 83.89% -> Pass2 91.60% (Pass2 clears 90%) candidate_k=32 : Pass1/Pass2 100% Honest result: rotation consistently helps (+10pp at strict K), but neither pass clears the ADR-084 90% bar at candidate_k==K on this distribution. Pass 2 reaches 90% only with ~3x over-fetch (the ADR-084 "candidate set" deployment pattern). Multi-bit Pass 3 evaluated separately. Co-Authored-By: claude-flow <ruv@ruv.net> * feat(ruvector): multi-bit Pass-3 experiment + ADR-156/084 measured results Adds the multi-bit half of the ADR-156 §8 "Multi-bit / Extended RaBitQ" item as a MEASURED experiment (coverage::measure_multibit): rotate, then b-bit uniform scalar-quantize each coord, rank by L1 over codes — the natural multi-bit generalization of hamming. Measures the bit/coverage tradeoff the backlog item asked for. MEASURED at the strict bar (candidate_k=K=8, anisotropic planted-cluster fixture, cosine ground truth): Pass1 (1-bit, no rot) 36.13% 16 B/vec Pass2 (1-bit, rot) 46.39% 16 B/vec Pass3 (rot, 2-bit) 54.39% 32 B/vec Pass3 (rot, 3-bit) 66.70% 48 B/vec Pass3 (rot, 4-bit) 74.22% 64 B/vec Honest: multi-bit monotonically helps but even 4-bit (4x memory) reaches only 74% at the strict bar — neither rotation nor <=4-bit multi-bit clears the strict-K 90% bar on this distribution. The bar is met via over-fetch (Pass2 @ candidate_k=24). Tests: multibit_tradeoff_report, multibit_1bit_matches_pass2_approx (+ sanity that 1-bit ~= Pass-2). Docs: - ADR-156 §8 item #2 marked RESOLVED-PARTIAL; §5 #2 grade CLAIMED -> MEASURED-on-our-hardware; new §10 with full measured tables, the topk bugfix disclosure, and graded deferred sub-items. - ADR-084: "Pass 2" section answering the rotation open-question with measured numbers + the topk bug note. 
- CHANGELOG [Unreleased]: Added (Pass-2 milestone) + Fixed (topk heap). Co-Authored-By: claude-flow <ruv@ruv.net>	2026-06-13 16:02:18 -04:00
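A minimal sketch of the heap fix described in commit 9b07dff298 above, assuming distances are plain `u32` values paired with a `u32` id. The function name `topk_nearest` and the input shape are hypothetical and are not the crate's `SketchBank::topk` signature. It keeps a size-k max-heap so that `peek()` really is the farthest candidate kept so far, which is the property the `Reverse`-wrapped min-heap violated.

```rust
use std::collections::BinaryHeap;

/// Return the k nearest (distance, id) pairs, sorted nearest-first.
/// `dists` is a hypothetical stand-in for precomputed sketch distances.
fn topk_nearest(dists: &[(u32, u32)], k: usize) -> Vec<(u32, u32)> {
    if k == 0 {
        return Vec::new();
    }
    // Plain max-heap: peek() is the worst (farthest) candidate currently kept.
    let mut heap: BinaryHeap<(u32, u32)> = BinaryHeap::with_capacity(k);
    for &(d, id) in dists {
        if heap.len() < k {
            heap.push((d, id));
        } else if heap.peek().map_or(false, |&worst| (d, id) < worst) {
            // The new candidate is closer than the farthest one kept: swap it in.
            heap.pop();
            heap.push((d, id));
        }
    }
    // into_sorted_vec() yields ascending order, i.e. nearest first.
    heap.into_sorted_vec()
}

fn main() {
    // n > k, so the heap path (not the n <= k fast path) is exercised.
    let dists = [(9, 0), (1, 1), (7, 2), (3, 3), (5, 4)];
    println!("{:?}", topk_nearest(&dists, 2));
}
```

With a `BinaryHeap<Reverse<_>>` the same `peek()` would return the nearest element instead, so comparing against it evicts the wrong side of the set; exercising an input with n > k, as in `main` above, is what exposes that.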
rUv	42dcf49f4d	fix(adr): resolve duplicate ADR numbers + close ADR-080 security + ADR-154 M1 signal backlog (#1051 ) * fix(signal): circular phase variance for ghost-tap guard (ADR-154 §7.4 #1) `phase_variance` computed a LINEAR sample variance over phase angles that wrap at ±π, so a tightly-clustered set straddling the branch cut reported spuriously HIGH dispersion — false-tripping the `> TAU` ghost-tap guard on real, tightly-clustered CIR taps. Replace with Mardia's circular variance V = 1 − R̄, bounded [0,1] and invariant to where the cluster sits on the circle. Re-derive the guard against the bounded metric via a named const `GHOST_TAP_CIRCULAR_VARIANCE_MAX` (the old TAU-scaled threshold is meaningless on [0,1]). Grade: metric fix MEASURED; threshold value DATA-GATED — a clean single-path ramp also sweeps the circle, so V alone cannot separate clean from unsanitized without labelled frames. Conservative default (0.99) errs toward never false-rejecting, strictly more permissive at the wrap boundary than the buggy linear guard. Fails-on-old test: `phase_variance_circular_not_fooled_by_branch_cut` — inlines the old linear variance to show it exceeds TAU on wrap-straddling phases while circular V≈0 and the guard no longer trips. Plus `phase_variance_circular_is_bounded_and_extremal` (V∈[0,1], V≈0 identical, V≈1 uniform). cargo test -p wifi-densepose-signal --no-default-features --features cir --lib → 432 passed, 0 failed. Co-Authored-By: claude-flow <ruv@ruv.net> * fix(signal): pin Welford n=0/n=1 finiteness guard (ADR-154 §7.4 #10) The shared `WelfordStats` (field_model.rs, used by longitudinal.rs and others) relies on `count < 2` guards in `variance`/`sample_variance`/`std_dev`/ `z_score` to stay finite at the boundaries. The guards existed but the n=0 boundary was UNTESTED — exactly the §4 divide-by-(n−1) family the ADR groups this with. 
Add `welford_finite_at_n0_and_n1` asserting every statistic is finite and returns the documented sentinel (0.0) at n=0 and n=1, plus load-bearing doc comments on the two guards. Fails-on-old proof: with the `sample_variance` guard removed, the test FAILS with "attempt to subtract with overflow" at the `(self.count - 1)` underflow (0usize − 1); `variance` would similarly yield 0.0/0.0 = NaN. The guard is restored; the test pins it so a future regression is caught. Grade: MEASURED (boundary finiteness is asserted; the guard is the §4-family fix made testable). cargo test -p wifi-densepose-signal --no-default-features --lib field_model → 22 passed, 0 failed. Co-Authored-By: claude-flow <ruv@ruv.net> * refactor(signal): de-magic adversarial thresholds + boundary tests (ADR-154 §7.4 #13) Lift the bare numeric literals buried in `check`/`check_consistency` into named, documented module consts (FIELD_MODEL_GINI_VIOLATION=0.8, ENERGY_RATIO_HIGH_VIOLATION=2.0, ENERGY_RATIO_LOW_VIOLATION=0.1, CONSISTENCY_ACTIVE_FRACTION_OF_MEAN=0.1, SCORE_W_* weights). VALUES UNCHANGED — each const equals the original literal; only names + pinning tests are new. Grade: DATA-GATED. The operating values stay empirical (defensible values need labelled spoofed/clean CSI — Wi-Spoof, §6.2/§7.3). The de-magicking + characterization tests are MEASURED: `tuning_consts_unchanged_from_literals`, `energy_ratio_high_boundary`, `energy_ratio_low_boundary`, `field_model_gini_boundary`, `consistency_active_fraction_boundary` pin the decision boundaries at/just-below/just-above each threshold, so a future data-driven retune is a visible, tested change. Fails-on-change proof: bumping ENERGY_RATIO_HIGH_VIOLATION 2.0→3.0 makes `energy_ratio_high_boundary` FAIL (restored). Operating values explicitly NOT changed. cargo test -p wifi-densepose-signal --no-default-features --lib ruvsense::adversarial → 20 passed, 0 failed. 
Co-Authored-By: claude-flow <ruv@ruv.net> * refactor(signal): de-magic coherence drift/gate thresholds (ADR-154 §7.4 #9) Lift the bare detection literals in `coherence.rs::classify_drift` (DRIFT_STABLE_SCORE=0.85, DRIFT_STEP_CHANGE_MAX_STALE=10) and the `coherence_gate.rs` Default impl (DEFAULT_ACCEPT_THRESHOLD=0.85, DEFAULT_REJECT_THRESHOLD=0.5, DEFAULT_MAX_STALE_FRAMES=200, DEFAULT_PREDICT_ONLY_NOISE=3.0) into named, documented consts. VALUES UNCHANGED. The gate already exposed these via GatePolicyConfig (config seam); this names + pins the defaults. Grade: DATA-GATED. Operating values stay empirical (defensible Z-score thresholds need labelled stable/drifting coherence traces). De-magicking + boundary tests are MEASURED: `classify_drift_stable_score_boundary`, `classify_drift_stale_count_boundary` pin the at/just-below/just-above decisions; `drift_consts_unchanged_from_literals` / `gate_default_consts_unchanged_from_literals` pin the values. Operating values explicitly NOT changed. cargo test -p wifi-densepose-signal --no-default-features --lib ruvsense::coherence → 40 passed, 0 failed. Co-Authored-By: claude-flow <ruv@ruv.net> * docs(adr-154): mark §7.4 P1 backlog cleared — Milestone-1 (#1,#10 RESOLVED; #9,#13 DATA-GATED) Update ADR-154 §7.4 backlog rows #1, #9, #10, #13 with commit refs + grades, the §7.4 intro count (four P1 items cleared, ~41 P2/P3 remain), the Horizon-ledger one-liner (Milestone-1 DONE), and the §8 honest-limits #1 line (metric now correct; threshold still DATA-GATED). Add CHANGELOG [Unreleased] entry. Grades: #1 RESOLVED (MEASURED metric / DATA-GATED threshold), #10 RESOLVED (MEASURED), #9 & #13 RESOLVED-PARTIAL (DATA-GATED — de-magicked + boundary tested, operating values unchanged). 
Validation: cargo test --workspace --no-default-features → 2057 passed, 0 failed; wifi-densepose-signal lib → 442 passed (no-default + --features cir); python archive/v1/data/proof/verify.py → VERDICT: PASS, hash f8e76f21…46f7a UNCHANGED (CIR ghost-tap guard is not on the deterministic proof path). Co-Authored-By: claude-flow <ruv@ruv.net> * fix(sensing-server): stop leaking internal errors in HTTP responses (ADR-080 #2) Six handlers in `main.rs` serialized the internal error `Display` straight into the JSON response body, leaking server internals to any client (ADR-080 finding #2, CWE-209; reframed onto the Rust boundary by ADR-164 G11): - edge_registry_endpoint: a panicked spawn_blocking `JoinError` ("task … panicked") in a 500, and the raw upstream error in a 503 - delete_model / delete_recording / start_recording: std::io::Error strings carrying OS detail / filesystem paths - calibration_start / calibration_stop: the FieldModel error chain New `error_response` module: `internal_error` / `internal_error_json` / `upstream_unavailable` log the full detail server-side only (tagged with a correlation id) and return a generic body (`{"error":"internal_error","correlation_id":…}`) — no `panicked`, no file paths, no Debug chain. The correlation id lets an operator join a client report to the exact server log line without ever shipping the detail. Pinned by 5 error_response tests, incl. a leak-substring guard (internal_error_body_does_not_leak_detail) verified to FAIL on the reverted old body (returns the panic message / path / "os error"). The HOMECORE sweep (ADR-161) covered homecore-server, not this crate. 
Co-Authored-By: claude-flow <ruv@ruv.net> * test(sensing-server): pin XFF-immunity + no-query-token (ADR-080 #1, #3) Findings #1 (XFF-spoofing bypass) and #3 (JWT-in-URL, CWE-598) were logged against the Python v1 API but are VERIFIED ABSENT on the current Rust sensing-server, so they get regression tests rather than redundant fixes: - #1 XFF: there is no IP-based rate-limiter or IP-allowlist to bypass, and neither security middleware reads a forwarded header. Added bearer_auth::xff_header_never_affects_auth_decision (spoofed X-Forwarded-For never flips a 401<->200 decision) and host_validation::forwarded_headers_never_bypass_host_allowlist (spoofed X-Forwarded-Host: localhost never lets Host: evil.com past the allowlist). - #3 JWT-in-URL: require_bearer reads the token only from the Authorization header; WS handlers take no query token; the sole Query extractor (EdgeRegistryParams) is a non-secret refresh flag. Added bearer_auth::query_string_token_is_never_accepted — ?token= / ?access_token= in the URL never authenticates (stays 401) while the header path still 200s. Verified to FAIL when a query-token path is injected into require_bearer. Co-Authored-By: claude-flow <ruv@ruv.net> * docs(adr-080): mark P0 security findings #1-#3 RESOLVED; close ADR-164 G11 - ADR-080: Status note + per-finding closure (#1 XFF and #3 JWT-in-URL verified absent + regression-pinned; #2 leaked errors fixed via the error_response module). Records the v1-vs-Rust boundary distinction explicitly: v1 paths remain archived; this closure governs the shipped Rust sensing-server. - ADR-164: Gap Register G11 and the Open/Gated Backlog entry marked RESOLVED with the fix + branch reference. - CHANGELOG: [Unreleased] -> ### Security entry covering all three findings. Co-Authored-By: claude-flow <ruv@ruv.net> * docs(adr): renumber 6 displaced ADRs to resolve duplicate-number collisions (ADR-164 G1) Resolves the 5 duplicate ADR numbers (6 displaced files) flagged by ADR-164 Gap Register item G1. 
Canonical keeper per number = first file committed at that number (date tie-broken by inbound cross-reference count / parent-appendix relationship). Displaced files renumbered to the next free numbers (166-171): 050 keeps provisioning-tool-enhancements (5 refs vs 1) -> ADR-166-quality-engineering-security-hardening 052 keeps tauri-desktop-frontend (parent ADR) -> ADR-167-ddd-bounded-contexts (its appendix) 147 keeps nvidia-cosmos/OccWorld (the actual ADR, has Status header) -> ADR-168-benchmark-proof (proof companion, no Status) -> ADR-169-adam-mode-light-theme (was untracked) 148 keeps drone-swarm-control-system (committed #862) -> ADR-170-yoga-mode-pose-system (was untracked) 149 keeps public-community-leaderboard-huggingface (committed 16:47 vs 17:38) -> ADR-171-swarm-benchmarking-evaluation-methodology Updates in-file `# ADR-NNN` headers and intra-file self-references (yoga-modes * docs(adr): repoint inbound cross-references to renumbered ADRs (166-171) Follow-up to the ADR renumbering (ADR-164 G1). Updates every inbound reference that pointed at a displaced ADR, disambiguating shared numbers by title/slug so only references to the DISPLACED topic move and keeper references stay put. ADR-168 (was 147 benchmark-proof): README, CHANGELOG, user-guide, proof-of-capabilities, research docs 00/03 — all path/label refs updated. ADR-169 (was 147 adam-mode) / ADR-170 (was 148 yoga-mode): docs/adr/README index. ADR-171 (was 149 swarm-benchmarking): all ruview-swarm eval code+docs (Cargo.toml, evals/, eval_swarm.rs, metrics/mod/report/runner.rs), research doc 03 (every §-ref matched ADR-171 sections, not AetherArena), 00-system-review, series README, CHANGELOG, and ADR-148's forward/"open issues" pointers. ADR-166 (was 050 quality-engineering / security-hardening): disambiguated from the ADR-050 provisioning KEEPER by topic. 
The HMAC/secure_tdm, directory-traversal, bind-address, and OTA-PSK-auth references in code comments (wifi-densepose-hardware Cargo.toml + secure_tdm.rs, sensing-server main.rs) and in ADR-052-tauri / ADR-167 all describe the security-hardening ADR -> ADR-166. ADR-167 (was 052 ddd-appendix): inbound appendix references. Index/registry updates: docs/adr/README.md, gap-analysis/census.md (rows + header count), gap-analysis/lens-findings.md (collision table marked RESOLVED), and ADR-164 Gap Register G1 marked RESOLVED with the full renumber map. Keeper references deliberately untouched: all ADR-147 OccWorld code, all ADR-148 drone-swarm code/docs, all ADR-149 AetherArena refs (incl. ADR-150's SSL/resampling refs, which ADR-150 explicitly binds to the AetherArena benchmark), ADR-050 provisioning refs, ADR-052 tauri refs. The frozen GitHub blob URLs in docs/adr/.issue-177-body.md (pinned to an old branch) are left as historical. Comment-only code edits; no behavior change. wifi-densepose-hardware compiles clean; the sensing-server build's sole blocker is the pre-existing upstream midstreamer-temporal-compare@0.2.1 registry crate, unrelated to these edits. Co-Authored-By: claude-flow <ruv@ruv.net>	2026-06-13 14:31:38 -04:00
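A short sketch of the circular-variance fix from commit 42dcf49f4d above, assuming phases are `f64` radians. The function names and the empty-input sentinel are hypothetical and do not reproduce the crate's `phase_variance`. It computes Mardia's V = 1 − R̄ from the mean resultant length and contrasts it with a linear sample variance on phases that straddle the ±π branch cut.

```rust
use std::f64::consts::{PI, TAU};

/// Mardia's circular variance V = 1 - R_bar, bounded to [0, 1].
/// Returns 0.0 for empty input (a sentinel chosen for this sketch).
fn circular_variance(phases: &[f64]) -> f64 {
    if phases.is_empty() {
        return 0.0;
    }
    let n = phases.len() as f64;
    let (s, c) = phases
        .iter()
        .fold((0.0_f64, 0.0_f64), |(s, c), &p| (s + p.sin(), c + p.cos()));
    // R_bar is the length of the mean unit vector.
    let r_bar = (s * s + c * c).sqrt() / n;
    (1.0 - r_bar).clamp(0.0, 1.0)
}

/// Linear sample variance, shown only for contrast; it ignores wrap-around.
fn linear_variance(xs: &[f64]) -> f64 {
    if xs.len() < 2 {
        return 0.0;
    }
    let n = xs.len() as f64;
    let mean = xs.iter().sum::<f64>() / n;
    xs.iter().map(|x| (x - mean).powi(2)).sum::<f64>() / (n - 1.0)
}

fn main() {
    // Tightly clustered around the branch cut at +/- pi.
    let wrapped = [PI - 0.01, -PI + 0.01, PI - 0.02, -PI + 0.02];
    println!("linear   = {:.3} (TAU = {:.3})", linear_variance(&wrapped), TAU);
    println!("circular = {:.6}", circular_variance(&wrapped));
}
```

On this input the linear variance exceeds `TAU` even though the phases sit within a few hundredths of a radian of each other on the circle, while the circular variance stays near zero. As the commit message notes, a clean ramp that sweeps the whole circle also yields V near 1, so the bounded metric fixes the wrap artifact without settling the threshold value.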
ruv	41665d3de9	test(wasm-edge): synthetic-ground-truth validation harness for edge skills (ADR-160) Plant signals with known answers, run the real detector, MEASURE detection accuracy / precision / recall / rate-error — synthetic-ground-truth ONLY, not field accuracy. MEASURED-on-synthetic (12 tests, all green): - vital_trend, exo_ghost_hunter(hidden breathing), occupancy, intrusion, exo_rain_detect, sig_optimal_transport: acc 1.000 - exo_time_crystal: 1.000 on periodic-vs-aperiodic (its sub-harmonic-vs-clean- period claim is NOT separable by autocorrelation — recorded honestly) - sig_flash_attention: 8/8 peak localization; spt_spiking_tracker: 4/4 zone localization (sparse plant); sig_mincut_person_match: 0 id-swaps/40 frames - lrn_dtw_gesture_learn: enrollment validated (replay-match reported, not asserted) - sig_sparse_recovery: trigger validated; recovery accuracy reported NEGATIVE (-2.2% vs unrecovered baseline) — only its detect/trigger path is validated DATA-GATED (listed, NOT faked): med_seizure/apnea/cardiac/respiratory/gait, sec_weapon_detect, exo_emotion/happiness/dream_stage/gesture_language — each needs real labelled clinical/affect/ASL/metal-object data; no number claimed. benchmarks/edge-skills/RESULTS.md documents every result + reproduce command and the explicit honesty boundary. ADR-160 deferred 'per-skill accuracy validation' item updated to PARTIALLY MEASURED-on-synthetic + DATA-GATED. Suite: 631 passed default / 669 medical, 0 failed. Co-Authored-By: claude-flow <ruv@ruv.net>	2026-06-13 00:33:51 -04:00
ruv	8fd4ee917d	docs(adr): mark ADR-164 Gap Register items resolved (G3, G5) + correct G2 Records the remediation done in this branch: - G3 (homecore-recorder/migrate phantom ADRs) → RESOLVED: ADR-132 + ADR-165 written. - G5 (10 streaming-engine Proposed-while-built) → RESOLVED: 136-145 flipped to "Accepted — partial", with the honest caveat that the notes describe building blocks built+tested, not live-path integration. - G2 (missing Status headers) → corrected: ADR-134-CIR was mislabeled as missing (it has a Status row); the 2 genuine misses (147-benchmark-proof, 052-ddd) are both inside owner-gated duplicate-number collisions, so left untouched. Early ADRs using "| Status |" vs "| Status |" are different-format-but-present. Net: 0 status headers added. - Updated Coverage-Gaps bullets for recorder/migrate. Renumbering/dedup of the 6 collisions left owner-gated, as instructed. Co-Authored-By: claude-flow <ruv@ruv.net>	2026-06-12 23:01:10 -04:00
ruv	5c5112db0e	docs(adr): correct streaming-engine statuses 136-145 Proposed→Accepted — ADR-164 G5 All 10 streaming-engine ADRs (136-145) carried Status: Proposed while each has a concrete commit-pinned "Built -- tested building block" Implementation-Status note (136: 11f89727f; 137: 4fa3847ac; 138: fc7674bde; 139: 521a012d8; 140: 169a355bd; 141: 7d88eb84c; 142: 1f8e180d6; 143: 2d4f3dea5; 144: b10bc2e9a; 145: `0f336b7d3`), each with a test count. Flipped each to "Accepted — partial (built + tested building block; integration glue pending — see Implementation Status, commit <hash>)". Honest "partial", not full Accepted: the notes themselves state the blocks are tested+compiling but "mostly not yet on the live 20 Hz path". 143 (v2 dataset-gated) and 144 (no UWB radio in fleet) carry their specific residual gates inline. Co-Authored-By: claude-flow <ruv@ruv.net>	2026-06-12 23:00:54 -04:00
ruv	e3696da8d8	docs(adr): write ADR-165 (HOMECORE-MIGRATE), repoint migrate 134→165 — ADR-164 G3 homecore-migrate cited "ADR-134 (HOMECORE-MIGRATE)", but on-disk ADR-134 is "First-Class CIR Support" — a different decision. The migrate crate was governed by a phantom identity (ADR-164 Gap G3). - New ADR-165-homecore-migrate-from-home-assistant.md (next free number), reverse-documented from the shipped P1 scaffold: HA .storage reader, versioned format gate (unknown minor_version = hard error), per-artifact parsers, inspect CLI, structured errors. Status: Accepted — P1 scaffold (full conversion P2). Trust-boundary rationale for the untrusted .storage import is the centerpiece. - Repointed every ADR-134 governing reference in v2/crates/homecore-migrate/ (Cargo.toml, README.md, src/lib.rs, src/config_entries.rs, src/storage_format/mod.rs) → ADR-165. Left the ADR-132 (recorder-feature) refs intact. Explanatory renumber notes retained. - On-disk ADR-134 (CIR) untouched. ADR-126 series-map registry row owner-gated. Docs/comments only — cargo build -p homecore-migrate --no-default-features still compiles. Co-Authored-By: claude-flow <ruv@ruv.net>	2026-06-12 23:00:33 -04:00
ruv	9457d441b2	docs(adr): write missing ADR-132 (HOMECORE-RECORDER) — resolves ADR-164 G3 homecore-recorder cites "ADR-132" in Cargo.toml/README/lib.rs/schema.rs/ semantic.rs, but no ADR-132 file existed — the durable-state backbone was ungoverned (ADR-164 Gap G3 / Coverage-Gaps Lens A). Reverse-documented from the shipped, tested crate (not invented): SQLite HA-compatible recorder schema v48 (P1, 14 tests), ruvector HNSW semantic index (P2, feature-gated, 20 tests), hash-embedding honesty note, P3 real embeddings planned. Status: Accepted (shipped). Filename matches the link the crate README already pointed at. Documented retroactively; honest about hash-embedding limits and unbenchmarked latency targets. Co-Authored-By: claude-flow <ruv@ruv.net>	2026-06-12 23:00:15 -04:00
ruv	260fceefe9	docs(adr): ADR-164 corpus gap analysis + research notes (162 ADRs) Parallel gap analysis of all 162 ADRs (14-agent workflow): status distribution, prioritized Gap Register, supersession integrity, contradictions/retractions (anti-slop centerpiece), coverage gaps, and the honestly-gated backlog. Key findings: 6 duplicate ADR numbers + 3 missing Status headers (breaks the index); shipped crates citing phantom governing ADRs (homecore-recorder->ADR-132 nonexistent, homecore-migrate->ADR-134 mis-identified); streaming-engine ADRs 136-145 marked Proposed but actually Built; open ADR-080 sensing-server security findings never closed; ~64 proposed-only ADRs; pre-ADR-155 accuracy claims are CLAIMED not MEASURED. Detail in docs/adr/gap-analysis/{census,lens-findings}.md. Co-Authored-By: claude-flow <ruv@ruv.net>	2026-06-12 22:40:32 -04:00
ruv	1a17cc5b06	docs(ADR-163): edge-latency RESULTS + PROOF/prove.sh wiring (T3) Adds benchmarks/edge-latency/RESULTS.md (wiflow-std RESULTS style: each measured number with reproduce command, machine, MEASURED-on-host grade, and the honest host-vs-ESP32 / steady-state-vs-cold-start caveats) and ADR-163 (HEADLINE: CLAIMED latency budgets -> MEASURED-on-host, closing M5/M6 measurement debt; ESP32-on-hardware still pending). - ADR-160 deferred 'criterion benches for process_frame budget claims' line updated to DONE (host) with the ESP32-pending note. - PROOF.md performance table gains the two edge-latency reproduce rows; provenance ADR range extended to ADR-163. - prove.sh gated section gains the edge-latency bench note (host proxy only; not asserted, never claims the ESP32 figure). Benches/docs only; no crate republishes. Co-Authored-By: claude-flow <ruv@ruv.net>	2026-06-12 08:02:07 -04:00
ruv	e7b1b66f74	docs(adr): ADR-162 — plugin security + bounded RunModes; mark ADR-161 P4/P5/§A5 DONE ADR-162 records the M8 work that makes ADR-161's honestly-deferred plugin security claims TRUE: P4 (Ed25519 signature + SHA-256 integrity verification, secure-default trust policy), P5 (capability/authority isolation on hc_state_set), and §A5 (bounded Restart/Queued/max RunModes). Each fix MEASURED with a failing-on-old test; threat model table (tampered module, untrusted publisher, over-privileged write, run-mode exhaustion); cog-ha-matter Ed25519 reuse cited; remaining honest deferral (key provisioning/rotation, native in-process plugins, HAP pairing). ADR-161 deferred-backlog lines for P4/P5/RunModes struck through and marked DONE → ADR-162; §B5 note points forward to the now-implemented P4 gate. Co-Authored-By: claude-flow <ruv@ruv.net>	2026-06-12 01:47:30 -04:00
ruv	d0da5888e3	docs(adr): ADR-161 — HOMECORE server-layer security & honest-labeling sweep (M7) Records the Milestone 7 audit: library cores are real (anti-slop positive) but the network boundary had a CRITICAL WS auth bypass (A1) + reply-theater (A2) + documented-but-no-op automation (A3-A7) + a network-exposed dev bin (A8), all fixed and graded MEASURED with failing-on-old tests. Cites the NO-ACTION security positives (uuid::v4 CSPRNG refuted-suspicion, hardened CORS, no-traversal migrate, no-secrets-in-logs, honest HAP stub) and the deferred backlog (plugin authority-isolation P5, sig-verification P4, HAP real pairing P2, bounded run-modes, YAML load-at-boot). Co-Authored-By: claude-flow <ruv@ruv.net>	2026-06-12 00:55:52 -04:00
ruv	8ad0d0f91c	test+docs(wasm-edge): honest-labeling presence tests + ADR-160 (ADR-159 backlog now TRUE) - tests/honest_labeling.rs: 10 source-presence tests asserting the A1-A5 claim invariants (disclaimers present, uncited stat removed, WEAPON_ALERT no longer exported, med_* feature-gated, no static-mut event buffers). Each is designed to FAIL on the pre-fix source (ADR-159 A5 manifest-roundtrip style). - ADR-160: records the headline (0 stubs/0 theater, all real DSP -> claim-surface honesty debt), the graded A1-A5 fixes, NO-ACTION positives, per-prefix classification, and the DATA-GATED deferred backlog (criterion benches, per-skill accuracy validation, wasm32 static_mut_refs CI confirmation). - ADR-159: its deferred-backlog line "wasm-edge ... honestly labelled, not claimed" is now actually TRUE. Validation (all 0 failed, host --features std): DEFAULT 615 | MEDICAL (+medical-experimental) 653 | NO-DEFAULT 615; 0 warnings. Co-Authored-By: claude-flow <ruv@ruv.net>	2026-06-12 00:01:22 -04:00
ruv	772ece4568	docs(adr): ADR-159 Cognitum appliance beyond-SOTA sweep Records the anti-AI-slop sweep over cog-person-count, cog-pose-estimation, cog-ha-matter, ruview-swarm. HEADLINE: the "never identified anyone" accusation is REFUTED (real SHA-pinned Ed25519-signed trained Candle models, honest 34%/3% accuracy in manifests). Documents claim-surface fixes A1-A5 (MEASURED), NO-ACTION positives (witness chain, fusion, PPO + randn audit), graded SOTA landscape (counting/pose DATA-GATED, swarm MARL untrained-at-runtime by design), and the deferred backlog (benches, Location/Vector, Matter v0.8, wasm-edge accuracy). Co-Authored-By: claude-flow <ruv@ruv.net>	2026-06-11 23:10:03 -04:00
ruv	3d96789475	docs(adr): ADR-158 MAT/world-model beyond-SOTA sweep (graded, MEASURED) Records the cluster sweep: §1 triage unification, §2 real RSSI + dedup, §3 real ESP32/UDP/PCAP ingest with honest typed errors, §4 parabolic interpolation, §5 real GDOP, §6 occworld-prior fail-safe (mat consumes none). Graded SOTA table (RF-through-rubble DATA-GATED; worldgraph NO-ACTION already-SOTA; worldmodel clamp-proven; pointcloud cited), confirmed negative results, deferred backlog (nothing dropped), and reproduction commands. Co-Authored-By: claude-flow <ruv@ruv.net>	2026-06-11 21:54:04 -04:00
ruv	b0ee2a4aaf	docs(soul): mark §3.6 matching algorithm as implemented + data-gated Update specification.md §3.6 ONLY with an honest implementation-status note: the matching algorithm is now implemented and tested in v2/crates/wifi-densepose-bfld/, weights remain unvalidated design intent, and named-identity locking is data-gated (cardiac+respiratory alone are not separable — measured gap ~0.0005). The broader Soul Signature system remains Pre-Implementation. Co-Authored-By: claude-flow <ruv@ruv.net>	2026-06-11 21:16:41 -04:00
ruv	66ebf798e5	docs(adr): ADR-157 Hardware/Sensing beyond-SOTA sweep — Milestone 3 Documents Milestone 3 across the four acquisition crates (vitals, hardware, wifiscan, calibration). Honest headline: this layer was already well-hardened, so the real work is small. - §A1 (perf, MEASURED): Vec::remove(0) O(n^2) sliding windows -> VecDeque. End-to-end win is NULL within noise at realistic window sizes (DSP dominates); the win is the algorithmic O(n^2)->O(n) shown in isolation. Claimed nothing more -- the committed bench proves the null. - §A2 (correctness): breathing partial-weights scale-mixing -> normalized by Sigma(effective weights). Pinned by two fail-on-old tests. - §A3 (stability): IIR resonator divergence. Corrected the research report's physically-inaccurate trigger (divergence needs \|r\|>=1, i.e. bw>=4, not "r negative"); clamp + finite-guard. Pinned by two fail-on-old tests. - §B1 hardening on an unreachable (already-gated) truncation path -- disclosed. - §B4 (constant-time HMAC compare) DEFERRED: not worth a new direct `subtle` dependency for an 8-byte LAN sync-beacon tag. - MEASURED negative-results section (the centerpiece): esp32_parser length gate, sync_packet infallible slices, the whole ieee80211bf validate-on-deserialize / no-panic-FSM / single-role / SBP-single-evaluate model, secure_tdm HMAC+replay, netsh_scanner fixed-argv + Option parse, geometry_embedding MAX_COORD_M -- each cited file:line, all NO-ACTION. - SOTA landscape: deep-CSI vitals (DATA-GATED), 802.11bf conformance (CLAIMED, non-public suite), per-room calibration (CLAIMED on numbers), native wlanapi FFI multi-BSSID (CLAIMED-unmeasured -- explicitly NOT claiming the 10x). Mostly NO-ACTION / ACCEPTED-FUTURE. - Deferred backlog (§8): nothing silently dropped. Validation: cargo test --workspace --no-default-features = 3054 passed / 0 failed; python verify.py = VERDICT PASS (hash unchanged, Rust-only changes). Co-Authored-By: claude-flow <ruv@ruv.net>	2026-06-11 21:00:59 -04:00
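A small sketch of the `Vec::remove(0)` to `VecDeque` change that commit 66ebf798e5 above lists under §A1. The `SlidingWindow` type, its methods, and the zero sentinel for an empty window are hypothetical and are not taken from the vitals crates. Evicting the oldest sample with `pop_front` is O(1), whereas `Vec::remove(0)` shifts every remaining element on each push.

```rust
use std::collections::VecDeque;

/// Fixed-capacity sliding window over the most recent samples.
struct SlidingWindow {
    buf: VecDeque<f64>,
    cap: usize,
}

impl SlidingWindow {
    fn new(cap: usize) -> Self {
        assert!(cap > 0, "window capacity must be non-zero");
        Self { buf: VecDeque::with_capacity(cap), cap }
    }

    /// Append a sample, evicting the oldest one once the window is full.
    fn push(&mut self, x: f64) {
        if self.buf.len() == self.cap {
            self.buf.pop_front(); // O(1); Vec::remove(0) would be O(n)
        }
        self.buf.push_back(x);
    }

    /// Mean of the current window; 0.0 when empty (sentinel for this sketch).
    fn mean(&self) -> f64 {
        if self.buf.is_empty() {
            return 0.0;
        }
        self.buf.iter().sum::<f64>() / self.buf.len() as f64
    }
}

fn main() {
    let mut w = SlidingWindow::new(3);
    for x in [1.0, 2.0, 3.0, 4.0, 5.0] {
        w.push(x);
    }
    println!("window = {:?}, mean = {}", w.buf, w.mean());
}
```

The commit message itself reports the end-to-end gain as null within noise at realistic window sizes, so the sketch illustrates the data-structure swap rather than a measurable speedup.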
ruv	0ce2ac6440	docs(adr): ADR-156 RuVector/Fusion beyond-SOTA sweep — Milestone 2 Documents Milestone 2 of the beyond-SOTA sweep on the cross-viewpoint fusion path: four correctness/integrity/security fixes (each pinned by a bug-catching test), one MEASURED hot-path perf win, and the ANN/fusion SOTA landscape graded MEASURED/CLAIMED/data-gated. - Integrity: honest dimensionless GDOP (was RMSE mislabelled); canonical wrapped angular distance (disclosed numeric no-op under cos kernel — landed for contract/single-source-of-truth, not claimed as a behaviour change). - Security: crafted-index/zero-bin DoS panics closed on the multistatic path. - Perf: fuse() double-clone eliminated, ~2.17x on marshalling (MEASURED). - SOTA landscape: SymphonyQG (#1, CLAIMED — reproduction deferred) + multi-bit/Extended RaBitQ (#2, accepted near-term, the sketch.rs Pass-2); GraphPose-Fi learned fusion head documented ACCEPTED-FUTURE, data-gated per ADR-152 (b); CRB/sensor-placement investigated, no action (already SOTA). - Deferred backlog (§8): nothing silently dropped. Validation: cargo test --workspace --no-default-features = 3050 passed / 0 failed; python verify.py = VERDICT PASS. Co-Authored-By: claude-flow <ruv@ruv.net>	2026-06-11 20:23:43 -04:00
ruv	ea5ead7fb7	docs(adr): ADR-155 NN/training beyond-SOTA sweep — Milestone 1 Records the integrity-critical fixes (unified canonical metric, leak-free subject-disjoint split + synthetic-val disclosure, rapid_adapt real gradients, proof margin + committed-hash rigor), the Tier-2 correctness/security fixes, the measured Tier-3 perf win, the NN SOTA landscape graded MEASURED/CLAIMED/ THEORETICAL (GraphPose-Fi as top ACCEPTED-future candidate; INT4; CSI-JEPA-vs-MAE with the honest "no JEPA/MAE-on-WiFi-pose yet" caveat; "Mamba-CSI-pose does not exist"), and the ~45-finding deferred backlog. Discloses the libtorch/tch-gating limitation and that the Rust proof is honestly in SKIP until a baseline is committed. Co-Authored-By: claude-flow <ruv@ruv.net>	2026-06-11 19:57:54 -04:00
ruv	6511ca90fb	docs(adr): ADR-154 signal/DSP beyond-SOTA sweep — Milestone 0 Records Milestone-0 of the signal/DSP beyond-SOTA sweep with full PROOF discipline (MEASURED vs CLAIMED vs THEORETICAL grading throughout): - §2 discloses the headline anti-slop finding: the ADR-134 CIR coherence gate was DEAD in production (canonical-56 frames -> SubcarrierMismatch -> silent freq-domain fallback for every frame). Documents the canonical56() fix + the 4 committed proof tests. - §3 NaN/inf adversarial bypass; §4 divide-by-(n-1) window trio. - §5 the two MEASURED perf wins with before/after medians + reproduce commands. - §6 per-module SOTA landscape, evidence-graded: deep-unfolded ISTA/LISTA for CSI->CIR (~3 dB NMSE, MEASURED, arXiv 2211.15440 + 2502.05952), diffusion CIR prior (public weights, MEASURED), Wi-Spoof adversarial eval (MEASURED, arXiv 2511.20456), Bayesian multi-AP fusion (CLAIMED, no code, 2512.02462), coherence gating + RF intention-lead (THEORETICAL). - §7 roadmap: LISTA-for-CIR as the top ACCEPTED-future item (M effort; the ISTA + Phi already exist in cir.rs) — proposed, NOT implemented this milestone — plus the explicit deferred-findings backlog (the ~45 review findings not fixed here, graded P1/P2/P3) so nothing is silently dropped, with a horizon-ledger DONE-vs-DEFERRED one-liner. Co-Authored-By: claude-flow <ruv@ruv.net>	2026-06-11 19:21:31 -04:00
ruv	d22616c488	docs(research): WiFlow-STD audit writeup (published as public gist + upstream issue) Gist: https://gist.github.com/ruvnet/47d4369c0bd251ed233bbc450d50f6e6 Upstream report: DY2434/WiFlow...issues/3 Co-Authored-By: claude-flow <ruv@ruv.net>	2026-06-11 17:13:10 -04:00
rUv	17471e93ff	ADR-152: WiFi-Pose SOTA 2026 intake — WiFlow-STD benchmark, Rust integrations, ADR-153 802.11bf layer, efficiency frontier (#1008 ) * feat(calibration): NodeGeometry transceiver-geometry recording (ADR-152 §2.1.1) PerceptAlign-motivated geometry capture at enrollment: per-node optional records (position, antenna orientation, inter-node distances, acquisition method) — recorded when known, never required. Event-sourced via EnrollmentEvent::GeometryRecorded (latest recording wins); persisted on SpecialistBank with serde defaults so pre-ADR-152 bank JSON loads cleanly (fixture-proven, and geometry-free banks serialize byte-shape-identical to the old schema); threaded through MultiNodeMixture as data only — the learned geometry embeddings and algorithmic fusion use are §2.1.2, deliberately deferred until the ADR-151 P6 LoRA heads exist. Geometry recorded from now on means banks captured today remain usable for layout-conditioned training later — you can't retroactively add geometry to data you didn't record. 8 new tests (3 geometry, 2 anchor, 2 bank, 1 multistatic) + full-loop extension (2-node geometry, one tape-measured + one unknown, surviving the bank JSON round-trip the runtime loads from). 50/50 calibration (both feature configs) + 23 CLI tests green. Co-Authored-By: RuFlo <ruv@ruv.net> * feat(training): two-checkerboard camera↔room calibration for ADR-079 labels (ADR-152 §2.1.3) Defends the camera-supervised pipeline against PerceptAlign's "coordinate overfitting": MediaPipe keypoints were emitted in raw camera coordinates with no shared frame and no transceiver-geometry metadata — the exact label shape that memorizes deployment layout and collapses cross-layout. - scripts/calibrate-camera-room.py + calibration_lib.py: OpenCV two-checkerboard calibration → versioned bundle JSON (intrinsics, camera→room extrinsics, checkerboard spec, transceiver geometry, sha256 calibration_id). 
Intrinsics resolve from file > cache > multi-view computation > loud-warning 2-view fallback. - collect-ground-truth.py --calibration <bundle>: every sample gains keypoints_room (unit bearing rays from the camera center in the room frame — documented projective alignment; raw image coords preserved so training chooses), camera_origin_room, calibration_id, and the transceiver geometry stamp. Without the flag, output is byte-identical to before (tested) + a one-line ADR-152 warning. Design finding (recorded for ADR-152): a single planar checkerboard's corner grid is centrosymmetric — the reversed corner ordering fits a ghost camera pose with IDENTICAL reprojection error, so per-board flip disambiguation is mathematically ill-posed. solve_two_board_extrinsics solves the joint wall+floor set over all 4 flip combinations, where the minimum is unique — an independent reason the TWO-checkerboard method is required, beyond what PerceptAlign states. 15 headless pytest tests green (synthetic corners: extrinsics recovery incl. ghost resolution, bundle round-trip + hash stability, ray transforms w/ distortion + cross-resolution, no-calibration byte identity). Co-Authored-By: RuFlo <ruv@ruv.net> * feat(benchmarks): WiFlow-STD reproduction harness + measurement (a) results (ADR-152 §2.2) Shipped checkpoint REFUTED (0.08% PCK@20, wrong keypoint normalization); 6 reproducibility defects documented (broken imports, corrupted dataset tail with float32-max garbage that NaN-poisons fp16 BatchNorm, unreachable test phase). After repairs, retraining with upstream defaults reproduces 96.09% PCK@20 full-test / 96.61% corruption-free (published 97.25%) on RTX 5080. Claims graded MEASURED-EQUIVALENT; 2.23M params + ~0.055 GFLOPs verified. Third-party code/weights/data stay out of tree (gitignored). 
Co-Authored-By: claude-flow <ruv@ruv.net> * feat: ADR-152 Rust integrations + ADR-153 802.11bf protocol model - calibration: GeometryEmbedding — 32-slot permutation-invariant NodeGeometry featurization for future LoRA-head conditioning (ADR-152 §2.1.2); derived SpecialistBank::geometry_embedding() accessor; 59 tests - train: MaePretrainConfig + patchify/random-mask with UNSW measured recipe (80% masking, (30,3) patches; ADR-152 §2.3, arXiv 2511.18792); strict no-truncate/no-NaN policy; proptest properties - train: WiFlowStdModel — tch-gated port of the verified ~96%-PCK@20 WiFlow-STD architecture (ADR-152 §2.2 beyond-SOTA); ungated param formula pinned to 2,225,042; 15/17-keypoint support; 239 crate tests - hardware: ieee80211bf forward-compatibility protocol model (ADR-153): SpecProfile gates, SensingCapabilities negotiation, required ConsentMode, session FSM, SensingTransport + SimTransport + OpportunisticCsiBridge; full acceptance checklist covered; 156+4 tests - deps: ruvector bumps per ADR-152 §2.6 survey (mincut/solver 2.0.6, attention 2.1.0, gnn 2.2.0); vendor/ruvector synced to a083bd77f - docs: ADR-153 accepted; ADR-152 §2.2 status, §2.4 amendment, §2.6 added Workspace: 162 test suites green (--no-default-features); Python proof PASS. Known pre-existing flake: homecore-api env_empty_falls_back_to_defaults (unserialized env-var mutation) — untouched, follow-up. 
Co-Authored-By: claude-flow <ruv@ruv.net> * docs: CHANGELOG + CLAUDE.md entries for ADR-152 integrations and ADR-153 Co-Authored-By: claude-flow <ruv@ruv.net> * fix(train): repair tch-backend bit-rot — gated path compiles and tests run again Mechanical API refresh against current tch: Vec::from(Tensor) -> try_from (+ explicit flatten), numel() usize cast, Rem/div ops -> remainder() / divide_scalar_mode(floor) — the latter fixed a silent true-division bug in heatmap argmax decoding; clamp(1.0, f64::MAX) -> clamp_min (torch 2.x scalar overflow panic); petgraph EdgeRef import; missing EvalMetrics and verify_checkpoint_dir APIs that tests documented. wiflow_std roundtrip test uses safetensors (.pt _save_parameters roundtrip broken in torch 2.11 Windows). Gated: 349 passed (incl. all 20 wiflow_std); ungated: unchanged. Known pre-existing: gaussian-heatmap convention mismatch (2 tests), proof seed race under parallel threads — documented, deliberate follow-ups. Co-Authored-By: claude-flow <ruv@ruv.net> * feat(train): WiFlow-STD PyTorch->tch weight import + numerical parity proof export_to_safetensors.py maps the retrained checkpoint (295 tensors -> 248 mapped, param sum exactly 2,225,042; num_batches_tracked dropped) into a tch-loadable safetensors plus a deterministic parity fixture. Gated #[ignore] integration test loads it strictly and asserts forward-pass agreement: max abs diff 1.192e-7 on the seed-42 fixture. dump_variable_names test makes the tch name layout authoritative. Zero architecture discrepancies found. Co-Authored-By: claude-flow <ruv@ruv.net> * fix: workflow-review findings — BN gamma init, ThresholdParams serde, init docs Concurrent validation workflow (2 review lanes + adversarial verification, 13 agents): 5 confirmed findings, 3 refuted. 
Fixes: - wiflow_std: pin BatchNorm gamma to 1.0 (tch default draws Uniform(0,1) — silently halves activations in from-scratch training; loaded checkpoints unaffected, parity re-verified after the change) - wiflow_std: document the conv-init divergences vs the reference's effective kaiming_normal(fan_out) re-init (from-scratch dynamics only) - ieee80211bf: ThresholdParams deserialization validates via try_from so the <=100 invariant holds for untrusted payloads (+ rejection test) Benchmarks (release, ruvzen): GeometryEmbedding 1.84us/call (542k/s), MAE tokenization 7.38us/window (135k/s), 802.11bf FSM 8.9M events/s — nothing suspicious. Co-Authored-By: claude-flow <ruv@ruv.net> * docs(adr): ADR-152 §2.1.4 gate resolved — PerceptAlign repo MIT, dataset on HF Co-Authored-By: claude-flow <ruv@ruv.net> * feat(benchmarks): edge optimization measured + measurement (b) blocked + 92.9% retraction Edge optimization (ADR-152 optimize track): ONNX Runtime fp32 is the CPU latency win (3.2 ms/window, ~3.4x faster than torch, parity 2.4e-7); ORT dynamic int8 reaches 2.44 MB (paper's ~2.2 MB claim plausible only via conv-capable toolchains; -0.16pt PCK@20, +18% MPJPE, 2x slower); torch dynamic quant converts 0% of this conv-only model; fp16 halves storage free but is slower on CPU. Measurement (b) BLOCKED-ON-DATA: only 1,077 paired ESP32 windows exist (stop rule <2k). Forensic recheck of the surviving April holdout RETRACTS the ADR-079 '92.9% PCK@20' figure: constant-output model, absolute (not torso) threshold, 69 near-static frames — mean predictor scores 100% under that protocol; torso-PCK@20 is 19.1%. Corroborates PR #535. Stale citations removed from user-guide, readme-details, ADR-152 §2.1.3; no-citation rule extended to ADR-079 accuracy claims. Unblock: >=2k-window multi-pose paired session + torso-PCK re-baseline. 
Co-Authored-By: claude-flow <ruv@ruv.net> * docs(user-guide): corrected camera-supervised collection tutorial Step 0 CSI-rate check + session-length math (window yield = frames/20 — the May session's 8x under-delivery was a ~12 Hz CSI rate, not an aligner bug); two-checkerboard calibration step (ADR-152 §2.1.3); pose-variety and confidence guidance; torso-normalized PCK + temporal-split + pred-variance eval protocol (lessons from the 92.9% retraction); scale presets re-keyed to realistic window counts. Co-Authored-By: claude-flow <ruv@ruv.net> * feat(benchmarks): static PTQ int8 (calibrated) results + overnight capture script Conv-only static QDQ beats dynamic int8 on accuracy (PCK@20 96.61-96.63% vs 96.52%, MPJPE +10% vs +18% over fp32) at ~equal size/latency; all-ops QDQ strictly worse (int8 activations through attention glue). Entropy calibration verified bit-identical to MinMax on this data. Deployment: ONNX fp32 for speed (3.2ms), static conv-only QDQ for smallest (2.53MB). Also: scripts/overnight-empty-capture.py — segmented UDP CSI recorder for empty-room baselines (no glob collisions, detach-safe). Co-Authored-By: claude-flow <ruv@ruv.net> * feat(benchmarks): measurement (b) MEASURED — optimization transfer only, mean-pose baseline wins WiFlow-STD fine-tuned on 2,046 fresh single-room ESP32 paired windows (temporal 70/15/15, 70->540 adapter, K=17): pretrained-init 65% PCK@20 vs scratch 0% (optimization transfer) but frozen-trunk ~0% (no feature transfer), and NOTHING beats the mean-pose baseline (95.9% PCK@20 — single subject, near-static normalized coords). Honesty gates held: pred std 0.0113 (non-constant model) but mean-baseline dominance means no citable CSI->pose capability from this data. ADR-152 open question 1 answered partially; definitive answer needs multi-subject/position data. 
Two new aligner findings: heterogeneous csi_shape with silent zero-padding (~20%), and extractCsiMatrix's transposed shape label (frame-major data, [nSc, nFrames] label) — fixes pending. Co-Authored-By: claude-flow <ruv@ruv.net> * feat(benchmarks): efficiency sweep MEASURED — half model dominates full reference Compact WiFlow-STD variants on the same data/split/protocol: half (843,834 params, 0.38x) strictly dominates the 2.23M reference (PCK@20 96.62 vs 96.61, PCK@50 99.47 vs 99.11, MPJPE 0.00898 vs 0.0094) — the published architecture is over-parameterized for its own benchmark. quarter (338k) 96.05%; tiny (56,290 params, 1/39.5) holds 94.11% — a ~220KB fp32 edge candidate. In-domain caveats recorded; cross-domain untested. Co-Authored-By: claude-flow <ruv@ruv.net> * feat(train): compact WiFlow-STD presets in Rust + tiny edge artifact (ADR-152) WiFlowStdConfig gains half()/quarter()/tiny() mirroring the overnight sweep exactly: TcnGroupsMode (Fixed/Gcd/Depthwise), input_pw_groups, derived stride schedule and decoder-mid (all default to upstream behavior; legacy serde JSON unaffected). Param formulas pin to trained ground truth first try: 843,834 / 338,600 / 56,290; default 2,225,042 pin and 1.192e-7 parity unchanged. 248 tests green. Tiny edge artifact (tiny_edge_bench.py): ONNX fp32 = 295 KB, 0.66 ms/win (~1,500/s CPU), 94.11% PCK@20 (matches sweep clean-test exactly; parity 1.49e-7). Static int8 is a bad trade at this scale (-1.43pt, +19% MPJPE, -16% size, slower) — recorded as negative result. Export note: width-16 breaks AdaptiveAvgPool((15,1)) TorchScript export; replaced by exact mean+matmul equivalent, proven by parity. 
Co-Authored-By: claude-flow <ruv@ruv.net> * fix: resolve all 10 confirmed code-review findings (7-angle review, 20/20 verified) wiflow_std: min_feature_width (default 15) replaces the keypoints->stride coupling — for_keypoints(17) now provably builds the trained [2,2,2,2] graph and pools 15->17, matching the validated Python protocol (pinned by tests); param_count() total on invalid configs; random_mask returns Result and rejects non-finite/out-of-range ratios; trainer checkpoints switched to safetensors (.pt VarStore roundtrip broken on Windows torch 2.11). ieee80211bf: SBP proxy now re-triggers instances and relays reports via Action::RelaySbpReport -> SensingFrame::SbpReport (clients consume via their existing path); missed_instances reset on success = consecutive semantics; SessionTable gains a guarded SBP entry point + unknown-id drop counter; initiator-role sessions reject inbound setup/SBP requests (RejectedNotSupported) closing the idle hijack; StartSetup/StartSbp outside Idle return InvalidStateForCommand; SBP validation unified through evaluate_setup with a 1:1 SetupStatus->SbpStatus mapping. events.rs split out to honor the 500-line cap. calibration/cli: enrollment geometry now actually reaches trained banks — both production call sites attach .with_geometry; --geometry flag on train-room and POST /enroll/geometry + train-body geometry on calibrate-serve give production a recording surface; geometry-free banks log the ADR-152 §2.1.2 note. benchmarks: corruption masks committed as ground truth (unregenerable after in-place cleaning; verified bit-identical regeneration from the pristine copy) + generate_corruption_masks.py producer; _bench_common.py dedups the 5x-copied shim/evaluate/seed/remap (post-refactor PCK@20 re-verified equal to the last digit); remote scripts get the mmap patch; tiny_edge --calib validated multiple-of-64; onnx_bench --help no longer executes (and overwrote) the export — artifact restored byte-exact. 
Workspace: 2,963 tests passed, 0 failed; Python proof PASS. Co-Authored-By: claude-flow <ruv@ruv.net> * ci: build workspace tests without debuginfo — runner disk exhaustion The combined 38-crate debug target exceeds the GitHub runner's disk ('final link failed: No space left on device'); the same tree measured 151GB locally with full debuginfo. CARGO_PROFILE_{DEV,TEST}_DEBUG=0 shrinks the target ~5-10x; debuginfo serves no purpose in CI test runs. Co-Authored-By: claude-flow <ruv@ruv.net>	2026-06-11 17:02:23 -04:00
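The 92.9% retraction in the commit above rests on a protocol point: under an absolute threshold on near-static frames, a constant mean-pose predictor can score perfectly, while a torso-normalized threshold exposes it. The following is a minimal sketch of that effect, not the repository's evaluation code. The keypoint layout (index 0 = shoulder, 1 = hip), the coordinate units, and the toy frames are all assumptions made for illustration.

```python
import math

def pck(pred, truth, alpha=0.2, norm=None):
    """Fraction of keypoints with error below alpha * scale.

    norm=None uses an absolute threshold (scale = 1.0 coordinate unit);
    otherwise norm(truth_frame) supplies a per-frame scale such as torso length.
    """
    hits = total = 0
    for p_frame, t_frame in zip(pred, truth):
        scale = 1.0 if norm is None else norm(t_frame)
        for (px, py), (tx, ty) in zip(p_frame, t_frame):
            hits += math.hypot(px - tx, py - ty) < alpha * scale
            total += 1
    return hits / total

def torso_length(frame, shoulder=0, hip=1):
    """Hypothetical torso scale: shoulder-to-hip distance."""
    (sx, sy), (hx, hy) = frame[shoulder], frame[hip]
    return math.hypot(sx - hx, sy - hy)

# Three near-static frames: shoulder and hip fixed, only the wrist moves a little.
truth = [
    [(0.50, 0.40), (0.50, 0.50), (0.60, 0.45)],
    [(0.50, 0.40), (0.50, 0.50), (0.64, 0.45)],
    [(0.50, 0.40), (0.50, 0.50), (0.56, 0.45)],
]

# Constant "mean pose" predictor: the same per-keypoint mean for every frame.
n = len(truth)
mean_pose = [
    (sum(f[k][0] for f in truth) / n, sum(f[k][1] for f in truth) / n)
    for k in range(len(truth[0]))
]
pred = [mean_pose] * n

pck_absolute = pck(pred, truth)                   # absolute threshold of 0.2 units
pck_torso = pck(pred, truth, norm=torso_length)   # threshold of 0.2 * torso length
```

With these toy frames the constant predictor hits every keypoint under the absolute threshold, while the torso-normalized score drops because the wrist error (0.04) exceeds 0.2 × 0.10.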
rUv	29de574e63	Beyond-SOTA engine/signal/train improvements: mesh partition guard, FFT CIR solver, canonical frame decoder, falsifiable occupancy benchmark, governed streaming, adapter provenance (#1018 ) * docs(research): add RuView beyond-SOTA system review (00) First document of the beyond-SOTA research series: capability audit of the current RuView engine with role-to-crate maturity matrix, ruvsense module inventory, gap analysis, and risk register. https://claude.ai/code/session_01MjBucx95K4BuUxZi8NWwRH * docs(research): add beyond-SOTA architecture design (02, in progress) https://claude.ai/code/session_01MjBucx95K4BuUxZi8NWwRH * docs(research): finalize beyond-SOTA architecture (02) https://claude.ai/code/session_01MjBucx95K4BuUxZi8NWwRH * docs(research): add benchmark/validation methodology snapshot (03) https://claude.ai/code/session_01MjBucx95K4BuUxZi8NWwRH * docs(research): add beyond-SOTA series index with validation results; changelog README index ties the 5 research docs together with the session's measured validation evidence: 2,797 workspace tests / 0 failed, Python proof PASS (bit-exact), and paired pre/post criterion CIR benchmarks. https://claude.ai/code/session_01MjBucx95K4BuUxZi8NWwRH * perf(signal): precompute CIR warm-start system; hoist tomography solver allocs Exact, determinism-safe optimizations (bit-identical float results): - cir.rs: diag(PhiH Phi)+lambdaI and its CSR matrix depend only on Phi and lambda (fixed at CirEstimator::new) but were rebuilt every frame (O(KG) pass + CSR allocation). Now built once in new() via build_warm_start_system; summation order unchanged. - tomography.rs: ISTA gradient buffer hoisted out of the 100-iteration loop (fill(0.0) reset) and the Frobenius Lipschitz bound moved from per-reconstruct to construction. Verified: signal 456 tests green; engine 11/11 green including cycle_is_deterministic and witness-stability tests. Criterion paired pre/post: cir_estimate/he40 -3.9% (p<0.01), multiband -1.2/-1.4%. 
https://claude.ai/code/session_01MjBucx95K4BuUxZi8NWwRH * fix(worldgraph): bound SemanticState growth with deterministic retention StreamingEngine::process_cycle appended one SemanticState belief per cycle with no eviction — ~1.7M nodes/day at 20 Hz (beyond-SOTA roadmap finding #6). Add WorldGraph::prune_semantic_states(max): deterministic eviction of the oldest beliefs by (valid_from_unix_ms, id); structural nodes (rooms, zones, sensors, anchors, tracks, events) are never eligible. Wire it into the engine after each belief append (DEFAULT_SEMANTIC_RETENTION = 7,200, ~6 min at 20 Hz; set_semantic_retention to tune). The WorldGraph holds current beliefs; durable history is the recorder's job, so no audit data is lost. 3 new tests: end-to-end bounded growth, oldest-only eviction, deterministic equal-timestamp tie-break. Workspace gate: 2,865 passed, 0 failed. https://claude.ai/code/session_01MjBucx95K4BuUxZi8NWwRH * feat(sensing-server): route live frames through the governed StreamingEngine Closes the live-trust-path gap (ADR-136 section 8, beyond-SOTA system review): the running server fused live CSI with the bare MultistaticFuser, while the privacy/provenance/witness control plane (ADR-135..146) only ever ran on synthetic in-test frames. The privacy control plane was therefore bypassable on the real path. New engine_bridge module drives StreamingEngine::process_cycle from the server's live NodeState map, reusing the existing NodeState -> MultiBandCsiFrame conversion. It lazily wires each contributing node as a WorldGraph sensor (idempotent), bounds belief growth via the retention cap, and forwards explicit timestamps/calibration ids so the path stays deterministic and replayable. Wired additively into both live ESP32/WiFi fusion sites in main.rs via a split-borrow off the write guard, so person-count behavior is unchanged; the latest BLAKE3 witness is stored on AppState. 
Every published belief now carries evidence + model + calibration + privacy decision and a deterministic witness. Adds wifi-densepose-engine/-worldgraph/-bfld/-geo deps. 6 new bridge tests (witnessed belief with full provenance, cross-run determinism, idempotent node registration, retention bound, privacy-mode propagation). sensing-server suite 430+128 green; workspace gate 2,904 passed / 0 failed. https://claude.ai/code/session_01MjBucx95K4BuUxZi8NWwRH * feat(train): falsifiable occupancy benchmark with anti-overfitting gate Makes the presence/person-count "beyond SOTA" claim falsifiable in code instead of aspirational (the unfalsifiability gap from the beyond-SOTA system review). occupancy_bench grades predictions vs ground truth and gates a SOTA claim behind one claim_allowed invariant requiring ALL of: - DataProvenance::Measured — synthetic/mock data is scorable for regression but never claimable (anti-mock-contamination; the CLAUDE.md Kconfig-bug lesson made structural). - A leak-free EvalSplit — validate() refuses any split where a subject OR environment id appears in both train and test (subject leakage / per-environment overfitting). - n_test >= min_test_samples (small-N guard). - Presence F1 whose bootstrap-CI lower bound (deterministic seeded splitmix64) clears the threshold — not the point estimate. - Count MAE within threshold. The claim string is unreadable except through the gate (NO_CLAIM otherwise), same discipline as the ruview-gamma acceptance gate. What remains is data, not method: a frozen, SHA-pinned, subject/environment-disjoint measured replay set turns the claim into a passing/failing test. Lives in wifi-densepose-train (the eval bounded context, alongside ablation/ eval/metrics). 10 tests cover each refusal path; warning-clean under the crate's missing_docs lint. Workspace gate 2,914 passed / 0 failed. Doc 03 updated. 
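The claim gate described above can be sketched as a single conjunction over provenance, split hygiene, sample size, a bootstrap lower bound, and count error. This is an illustrative sketch only: the function names, the 200-resample count, the 2.5% quantile, and the thresholds are assumptions, not the occupancy_bench API. The splitmix64 step uses the generator's standard published constants.

```python
MASK64 = (1 << 64) - 1

def splitmix64(state):
    """One splitmix64 step: returns (new_state, 64-bit output)."""
    state = (state + 0x9E3779B97F4A7C15) & MASK64
    z = state
    z = ((z ^ (z >> 30)) * 0xBF58476D1CE4E5B9) & MASK64
    z = ((z ^ (z >> 27)) * 0x94D049BB133111EB) & MASK64
    return state, z ^ (z >> 31)

def f1(pred, truth):
    """Presence F1 over boolean predictions; 0.0 when there is nothing to score."""
    tp = sum(p and t for p, t in zip(pred, truth))
    fp = sum(p and not t for p, t in zip(pred, truth))
    fn = sum(t and not p for p, t in zip(pred, truth))
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0

def f1_ci_lower(pred, truth, seed=42, n_boot=200, q=0.025):
    """Deterministic seeded bootstrap: lower quantile of resampled F1 scores."""
    n, state, scores = len(truth), seed, []
    for _ in range(n_boot):
        idx = []
        for _ in range(n):
            state, r = splitmix64(state)
            idx.append(r % n)
        scores.append(f1([pred[i] for i in idx], [truth[i] for i in idx]))
    scores.sort()
    return scores[int(q * n_boot)]

def split_is_leak_free(train_ids, test_ids):
    """True only if no subject or environment id appears on both sides."""
    return not (set(train_ids) & set(test_ids))

def claim_allowed(measured, leak_free, n_test, min_test,
                  f1_lower, f1_min, count_mae, mae_max):
    """Every condition must hold; any single failure means no claim."""
    return (measured and leak_free and n_test >= min_test
            and f1_lower >= f1_min and count_mae <= mae_max)

# Toy usage: 40 test samples, one missed detection.
truth = [True, False] * 20
pred = list(truth)
pred[0] = False
lower = f1_ci_lower(pred, truth)
```

Gating on `lower` rather than on `f1(pred, truth)` is the design point: a small or lucky test set can produce a high point estimate whose lower bound still fails the threshold.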
https://claude.ai/code/session_01MjBucx95K4BuUxZi8NWwRH * feat(engine): per-room adapter provenance + drift-to-recalibration advisor Closes the trust-chain gap where an ~11 KB per-room LoRA adapter (ADR-150 section 3.4) could silently change inference without the witness noticing: provenance carried only "rfenc-v<N>" with no notion of adapter identity. - StreamingEngine::set_room_adapter(AdapterInfo): pins the adapter's content-derived id into provenance model_version ("rfenc-v1+adapter:<id>") — and therefore into the BLAKE3 witness — so swapping or clearing adapter weights always shifts the witness. Engine test proves base -> adapter -> other-adapter -> cleared all witness differently and cleared == base. - RecalibrationAdvisor: recommends re-running the ADR-135 empty-room baseline / refitting the room adapter on sustained low fusion coherence (streak threshold, default 60 cycles ~ 3 s at 20 Hz) or an ADR-142 change-point. Surfaced as TrustedOutput::recalibration_recommended, stored on the sensing-server AppState alongside the witness at both live fusion sites. - Bridge plumbing: EngineBridge::{set_room_adapter, clear_room_adapter} + live-path test that the adapter id flows into the live witness. Scope note (honest): this is the deployable provenance/trigger half of the "retrained model" roadmap item. Fitting the adapter itself runs in the existing external calibration service (aether-arena/calibration/); a trained RF-encoder checkpoint still does not exist in-tree. Engine 15 tests, bridge 7 tests. Workspace gate: 2,918 passed / 0 failed. https://claude.ai/code/session_01MjBucx95K4BuUxZi8NWwRH * fix(mat): gate api module behind its feature — standalone no-default-features builds pub mod api was unconditional while its only dependency, serde, is optional behind the 'api' feature, so any build without default features failed with 101 unresolved-serde errors (masked in --workspace runs by feature unification). 
The api module and its create_router/AppState re-export are now cfg(feature = "api")-gated with docsrs annotations. All combos compile: bare --no-default-features (was 101 errors, now 0), --no-default-features --features api, and full default (177 tests pass). Workspace gate: 2,918 passed / 0 failed. https://claude.ai/code/session_01MjBucx95K4BuUxZi8NWwRH * perf(signal): opt-in FFT operator for the CIR ISTA solver (8-14x measured) Phi is a sub-DFT, so each ISTA mat-vec can run as one length-G FFT (O(G log G)) instead of a dense O(KG) product — the dominant-latency-hazard finding from the beyond-SOTA optimization roadmap. New CirConfig::fft_operator, default FALSE: the dense path stays the bit-exact witness default. The FFT evaluates the same sums in a different order, so enabling it shifts float results in the last bits and requires regenerating any pinned witness — strictly opt-in per deployment. FftOperator (rustfft, planned once at CirEstimator::new, scratch buffers reused across the ISTA loop) dispatches inside ista_solve: Phi x = scale * forward-FFT(x) sampled at bins (k_idx mod G); Phi^H v = scale * unnormalised inverse-FFT of v scattered into those bins. Warm-start and Lipschitz estimation stay dense at construction. Measured (criterion, same run, same machine): ht20: 2.22 ms -> 265 us (8.4x); ht40: 10.26 ms -> 717 us (14.3x). The real HE40 grid (K=484, G=1452) scales further per the O(KG)/O(G log G) ratio. 3 new tests: FFT<->dense matvec equivalence to float tolerance on ht20 and he40 grids; end-to-end dominant-tap agreement on a single-path frame; all default configs keep FFT off. New cir_estimate_fft bench group. Workspace gate: 2,921 passed / 0 failed (default path bit-exact, witnesses unchanged). 
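The sub-DFT identity behind the FFT operator can be checked numerically. The sketch below compares a dense Phi mat-vec and its adjoint against the FFT-based forms on a toy grid. It is an illustration under stated assumptions: G = 8 (a power of two, so a textbook radix-2 FFT suffices, unlike the real grids handled by rustfft), an arbitrary k_idx set, and scale = 1/sqrt(G) are all hypothetical choices, not values from cir.rs.

```python
import cmath
import math

def fft(x, inverse=False):
    """Recursive radix-2 FFT; the inverse is left unnormalised."""
    n = len(x)
    if n == 1:
        return list(x)
    even = fft(x[0::2], inverse)
    odd = fft(x[1::2], inverse)
    sign = 1 if inverse else -1
    out = [0j] * n
    for k in range(n // 2):
        tw = cmath.exp(sign * 2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + tw
        out[k + n // 2] = even[k] - tw
    return out

def dense_forward(x, k_idx, G, scale):
    """Phi x with Phi[k, g] = scale * exp(-2*pi*i*k*g/G): O(K*G)."""
    return [scale * sum(x[g] * cmath.exp(-2j * cmath.pi * k * g / G)
                        for g in range(G)) for k in k_idx]

def fft_forward(x, k_idx, G, scale):
    """Same product via one length-G FFT sampled at bins k mod G."""
    X = fft(x)
    return [scale * X[k % G] for k in k_idx]

def dense_adjoint(v, k_idx, G, scale):
    """Phi^H v computed directly from the conjugate-transposed entries."""
    return [scale * sum(vi * cmath.exp(2j * cmath.pi * k * g / G)
                        for vi, k in zip(v, k_idx)) for g in range(G)]

def fft_adjoint(v, k_idx, G, scale):
    """Scatter v into bins k mod G, then one unnormalised inverse FFT."""
    bins = [0j] * G
    for vi, k in zip(v, k_idx):
        bins[k % G] += vi
    return [scale * y for y in fft(bins, inverse=True)]

G = 8
k_idx = [-3, -1, 1, 2, 5]          # hypothetical subcarrier indices
scale = 1 / math.sqrt(G)           # hypothetical normalisation
x = [complex(g + 1, -0.5 * g) for g in range(G)]
v = [complex(1.0, i) for i in range(len(k_idx))]

fwd_err = max(abs(a - b) for a, b in zip(dense_forward(x, k_idx, G, scale),
                                         fft_forward(x, k_idx, G, scale)))
adj_err = max(abs(a - b) for a, b in zip(dense_adjoint(v, k_idx, G, scale),
                                         fft_adjoint(v, k_idx, G, scale)))
```

Both errors come out at floating-point rounding level rather than exactly zero, which is consistent with the commit's note that the FFT path shifts results in the last bits and therefore stays opt-in.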
https://claude.ai/code/session_01MjBucx95K4BuUxZi8NWwRH * feat(core): canonical frame decoder — capture-to-claim replay (ADR-136) The encode half of the ADR-136 frame contract existed (ComplexSample, to_canonical_bytes, witness_hash) but there was no decoder: a captured canonical frame could be witnessed but never reconstructed, blocking replay-from-capture. CsiFrame::from_canonical_bytes is the exact inverse: same id, metadata, complex payload, and witness hash (tested as the round-trip law AC7 — the replayed frame re-encodes byte-identically). Amplitude/phase are recomputed from the payload (projections, not independent state). Every malformed-input class fails closed (AC8): header truncation -> Truncated, payload truncation -> PayloadMismatch, unknown discriminants, non-UTF-8 device id, trailing bytes. Nil calibration uuid decodes as None per the documented encoding. Core: 36 tests pass. Workspace gate: 2,937 passed / 0 failed. https://claude.ai/code/session_01MjBucx95K4BuUxZi8NWwRH * feat(engine): dynamic min-cut mesh partition guard (ruvector-mincut) Maintains an exact min-cut over the live mesh coupling graph — nodes are sensing nodes, coupling is the product of fusion attention weights — and surfaces per cycle, as TrustedOutput::mesh: - cut value: the global "how close is the array to partitioning" number, a structural measure per-node heuristics miss; - weak side: which specific nodes would split off (failure/jamming triage, feeds ADR-032 posture); - at-risk flag: counts as a structural event for the drift->recalibration advisor (alongside ADR-142 change-points). Degenerate cases fail toward risk: a node with zero coupling is reported as already partitioned (cut 0, that node as the weak side). 
Measured cost policy (criterion, 12-node mesh — the honest part): - weights quantized (1/64) + change-gated: steady-state cycles do ZERO graph work and reuse the cached cut (~7.3 us, ~23x cheaper than building); - on any real change a full exact rebuild (~171 us) is used, because ONE DynamicMinCut delete+insert measured ~240 us — the subpolynomial machinery amortizes on much larger graphs, so rebuild-on-change is the measured optimum at mesh scale (one-edge case -28% after switching policy); - full process_cycle with the guard: ~33 us for 4 nodes vs the 50 ms budget. 9 mesh_guard tests (weak-node detection, steady-state zero updates, sub-quantum gating, join/drop rebuild, determinism, disconnection) + an engine-level wiring test (down-weighted node -> weak side -> recalibration). Engine 24 tests; workspace gate 2,946 passed / 0 failed. https://claude.ai/code/session_01MjBucx95K4BuUxZi8NWwRH * feat(engine): mesh partition risk demotes privacy + enters the witness (ADR-032) Completes the mesh-guard integration: its at_risk signal was advisory-only (fed the recalibration advisor). It now also contributes to the ADR-141 privacy demotion alongside fusion- and array-level contradictions — a mesh close to partitioning makes the fused belief less trustworthy, so the cycle emits at a more restricted class (monotonic; information only removed). Because effective_class feeds the BLAKE3 witness, a fragmenting array now shifts the witness: partition risk is auditable, not just logged. The mesh computation moved ahead of the demotion step in process_cycle; mesh_guard_mut exposes risk-threshold tuning. Test: a forced-risk 3-node cycle demotes PrivateHome Anonymous->Restricted and shifts the witness vs a clean baseline. Engine 25 tests; workspace gate 2,947 passed / 0 failed. 
https://claude.ai/code/session_01MjBucx95K4BuUxZi8NWwRH * fix: public-PR review findings — privacy-path honesty, gate holes, mesh-guard cliff - sensing-server: engine errors logged+counted (no silent swallow), trust state exposed via status surface, privacy-demotion claims aligned with the actual parallel-audit-path behavior - occupancy_bench: vacuous-F1 hole closed (degenerate test sets fail with their own criterion); CI-lower-bound test made probative - mesh_guard: quantization scaled to observed coupling range — >=65-node balanced meshes no longer permanently at_risk (regression test) - engine: both wiring tests made probative (same-topology witness compare, deterministic risk-crossing fixture) - mat: axum/tokio optional behind api; real serde feature (api enables it) - core: canonical decoder strict (non-zero reserved bytes and nil UUID rejected — injective on accepted domain, forged-bytes tests) - CHANGELOG: un-spliced the FFT/adapter bullet mangle Co-Authored-By: claude-flow <ruv@ruv.net> * chore: strip private-track references for public PR Reword the occupancy-benchmark changelog bullet to drop a cross-reference to the private research track, and restore the WorldGraph retention bullet header that was glued onto the preceding MAT bullet. Co-Authored-By: claude-flow <ruv@ruv.net> * chore: lockfile refresh for cherry-picked feature set Co-Authored-By: claude-flow <ruv@ruv.net> --------- Co-authored-by: Claude <noreply@anthropic.com>	2026-06-11 16:08:54 -04:00
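The deterministic SemanticState retention introduced in this commit (evict the oldest beliefs by (valid_from_unix_ms, id), never touch structural nodes) can be sketched as follows. The dictionary node shape and the "kind" labels are assumptions for illustration, not the WorldGraph types. The constants at the end reproduce the commit's arithmetic.

```python
def prune_semantic_states(nodes, max_states):
    """Keep at most max_states beliefs, evicting oldest first.

    Ordering by (valid_from_unix_ms, id) makes ties deterministic.
    Nodes of any other kind (rooms, sensors, ...) are never evicted.
    """
    beliefs = sorted(
        (n for n in nodes if n["kind"] == "semantic_state"),
        key=lambda n: (n["valid_from_unix_ms"], n["id"]),
    )
    excess = max(len(beliefs) - max_states, 0)
    evict = {n["id"] for n in beliefs[:excess]}
    return [n for n in nodes if n["id"] not in evict]

nodes = [
    {"id": "room-1", "kind": "room", "valid_from_unix_ms": 0},
    {"id": "b", "kind": "semantic_state", "valid_from_unix_ms": 10},
    {"id": "a", "kind": "semantic_state", "valid_from_unix_ms": 10},
    {"id": "c", "kind": "semantic_state", "valid_from_unix_ms": 20},
]
kept = [n["id"] for n in prune_semantic_states(nodes, max_states=2)]

# Arithmetic from the commit message, at one belief per cycle and 20 Hz:
beliefs_per_day = 20 * 60 * 60 * 24      # unbounded growth per day
retention_seconds = 7200 / 20            # window covered by the default cap
```

Here "a" and "b" share a timestamp, so the id tie-break decides that "a" is evicted first; the room node survives regardless of its age.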
rUv	d0e27e652e	fix(firmware): C6 IDF v5.5 guard + HE-LTF host ingest + WITNESS-LOG-110 B1 resolution (#1005 ) (#1011 ) * fix(firmware): c6_sync_espnow IDF v5.5 send-callback guard + B1 HE-LTF resolution (#1005) Espressif backported the esp_now_send_cb_t signature change to v5.5 (esp_now_send_info_t = wifi_tx_info_t there), so the #944 guard must be ESP_IDF_VERSION >= VAL(5,5,0), not MAJOR >= 6. Validated on this repo's hardware toolchain: - WITHOUT fix, IDF v5.5.2 esp32c6 build fails with the reporter's exact incompatible-pointer error at c6_sync_espnow.c:199 (reproduced) - WITH fix, clean build on IDF v5.5.2 (esp32c6) AND IDF v5.4 (regression) Docs: WITNESS-LOG-110 §B1 marked RESOLVED WITH MEASUREMENT (external, @stuinfla, issue #1005): IDF v5.4 driver downconverts HE->HT; v5.5.2 delivers true HE-LTF (532B / 256 bins / 242 tones, PPDU 0x01 HE-SU). ADR-110 capability table updated accordingly. Co-Authored-By: claude-flow <ruv@ruv.net> * docs: WITNESS-LOG-110 §B1 — in-house HE-LTF replication on the original COM12 C6 84% of 1,525 frames at 532B/PPDU 0x01 (HE-SU) with IDF v5.5.2 + the #1005 guard fix, AP ruv.net 11ax 2.4GHz. Two independent rigs now confirm: v5.4 downconverts, v5.5.2 delivers 242-tone HE20. Co-Authored-By: claude-flow <ruv@ruv.net> * fix(host): 256-bin HE-LTF ingest end-to-end + latent offset bugs (#1005) Audit of every ADR-018 consumer against live C6 HE20 frames (532B/256-bin): - sensing-server + CLI calibrate parsers read n_subcarriers from one byte (256 decoded as 0) with stale seq/rssi offsets (rssi always 0 — latent, pre-existing, confirmed vs firmware csi_collector.c). Fixed to the real ADR-018 layout; n_subcarriers u8->u16; byte 18 surfaced as typed PpduType. 
- sensing-server probe buffer 256B -> 2048B (532B datagram errored on Windows) - per-node grid gate: lock densest (n_subcarriers, ppdu_type) grid, re-warm on upgrade, skip sparser minority frames — HT-64 never mixes into an HE-256 baseline window - hardware parser: HE-aware bandwidth classification (256-FFT HE20 = 20MHz, was Bw160); PpduType/Adr018Flags re-exported - verbatim live frames (532B HE-SU, 148B HT) embedded as regression fixtures - archive python parser: bandwidth heuristic mirror fix Live-validated: calibrate --tier he20 consumed 600x 256-bin frames into an ADR-135 He20 baseline (242 tones) skipping 94 HT frames; sensing-server shows node 12 active with real RSSI (-40dBm). 765 tests green across the three crates; workspace check clean; Python proof PASS. Co-Authored-By: claude-flow <ruv@ruv.net> * test(fuzz): esp_netif/ping_sock/ip_addr stubs — un-break ADR-061 fuzz build after #954 csi_collector.c gained esp_netif.h / ping/ping_sock.h / lwip/ip_addr.h includes for the #954 gateway self-ping; the host-fuzz stub env lacked them, breaking the fuzz build on main since `5789351b7`. Stubs return no-gateway so the self-ping path early-outs (compiles + links, never exercised — matches the fuzz threat model which targets frame serialization, not the network stack). Co-Authored-By: claude-flow <ruv@ruv.net>	2026-06-11 11:00:37 -04:00
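The "256 decoded as 0" parser bug fixed in this commit is the classic symptom of reading a 16-bit count through a single byte. A minimal sketch follows. The header offset, the little-endian byte order, and the surrounding padding are hypothetical and do not reproduce the real ADR-018 layout.

```python
import struct

# Hypothetical offset of the subcarrier count inside a frame header.
N_SC_OFFSET = 6

def n_subcarriers_u8(frame: bytes) -> int:
    """Buggy read: only the low byte of the field is seen."""
    return frame[N_SC_OFFSET]

def n_subcarriers_u16(frame: bytes) -> int:
    """Fixed read: the full 16-bit little-endian field (byte order assumed)."""
    return struct.unpack_from("<H", frame, N_SC_OFFSET)[0]

def make_frame(n_subcarriers: int) -> bytes:
    """Build a toy header: zero padding around the 16-bit count."""
    return bytes(N_SC_OFFSET) + struct.pack("<H", n_subcarriers) + bytes(4)

he20 = make_frame(256)   # 256-bin HE-LTF frame
ht = make_frame(64)      # 64-bin HT frame

he20_buggy = n_subcarriers_u8(he20)
he20_fixed = n_subcarriers_u16(he20)
ht_buggy = n_subcarriers_u8(ht)
```

Counts below 256 read identically through both paths, which would explain how such a bug can stay latent until the first 256-bin frames arrive.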
rUv	2a307138f2	feat: per-room calibration system (ADR-151) + cognitum-v0 appliance integration spec (#989 ) * docs(adr): ADR-151 — Per-Room Calibration & Specialized Model Training Room-first calibration -> bank of small specialised ruVector models (breathing, heartbeat, restlessness, posture, presence, anomaly) distilled from the frozen Hugging-Face-published RF Foundation Encoder (ADR-150). Four-stage local-first pipeline: baseline (ADR-135 environmental fingerprint) -> guided enrollment (NEW EnrollmentProtocol, clean anchors not hours) -> feature extraction (reuse signal_features + ruvsense) -> specialist bank training (rapid_adapt LoRA heads, RVF storage, HNSW prototypes). Invariants: specialisation over scale; local heads over a shared public base; honest STALE degradation on baseline drift. Indexes ADR-149/150/151. Co-Authored-By: claude-flow <ruv@ruv.net> * feat(cli): calibration HTTP API for UI-driven baseline capture (ADR-135/151) Adds `wifi-densepose calibrate-serve` — an Axum HTTP API that wraps the ADR-135 CalibrationRecorder so a UI (or any client) can drive an empty-room baseline capture remotely. Stage 1 ("teach the room") of the ADR-151 room calibration & training pipeline. A single background task owns the UDP socket (ESP32 0xC511_0001 frames) and the optional active recorder; HTTP handlers talk to it over an mpsc command channel and read a shared status snapshot, keeping the &mut recorder lock-free. CORS permissive so a browser UI can call it. Endpoints (/api/v1/calibration/): GET /health liveness + UDP ingest stats (frames_seen, streaming) POST /start { tier?, duration_s?, room_id?, min_frames? 
} GET /status live progress (state, frames, progress, z, eta) — poll for UI POST /stop finalize the current session early GET /result finalized baseline summary (amp/phase-dispersion averages) GET /baselines list persisted baseline .bin files Reuses the existing calibrate.rs ESP32 wire parser (made pub(crate)); honest abort when <10 frames arrive in the window (e.g. ESP32 not streaming). Verified end-to-end over loopback: start -> 300 replayed HT20 frames -> state=complete, 52-subcarrier baseline, phase_dispersion_avg=0.00096 (concentrated/valid), persisted to disk; all 6 endpoints exercised. CLI: 19 tests pass; crate builds clean. Co-Authored-By: claude-flow <ruv@ruv.net> test(cli): firewall-free CSI UDP relay for local Windows ESP32 testing Windows Defender blocks inbound LAN UDP to a freshly-built binary without an admin allow-rule; python.exe is already allowed. This relay binds the public CSI port and forwards each datagram verbatim to a loopback port where `calibrate-serve --udp-bind 127.0.0.1 --udp-port 5006` listens (loopback is firewall-exempt). No admin required. Validated: ESP32-format 0xC5110001 frames -> :5005 -> relay -> :5006 -> calibrate-serve -> state=complete, 52-subcarrier baseline, phase_dispersion_avg=0.00098 (clean). Completes the no-admin live-test path. 
Co-Authored-By: claude-flow <ruv@ruv.net> * docs(changelog): record ADR-151 calibration API (calibrate-serve) Co-Authored-By: claude-flow <ruv@ruv.net> * feat(calibration): ADR-151 Stages 2–5 — enrollment, extraction, specialist bank, runtime New crate wifi-densepose-calibration implementing the per-room pipeline beyond Stage-1 baseline: - anchor.rs: guided-anchor sequence + event-sourced EnrollmentSession (Stage 2) - enrollment.rs: AnchorQualityGate + AnchorRecorder — gates anchors against the ADR-135 baseline deviation (presence/motion), re-prompts bad captures - extract.rs: Features + AnchorFeature — autocorrelation periodicity (breathing/ HR bands), variance/motion (Stage 3) - specialist.rs: 6 small room-calibrated models — presence (learned threshold), posture (nearest-prototype), breathing/heartbeat (band periodicity), restlessness (calm/active normalization), anomaly (novelty vs anchors) (Stage 4) - bank.rs: SpecialistBank — train/persist + baseline-drift STALE invalidation - runtime.rs: MixtureOfSpecialists — presence short-circuit + anomaly veto + stale flagging (Stage 5) Statistical heads make the pipeline runnable/validatable today; the ADR-150 HF RF Foundation Encoder backbone is the documented upgrade path. 29 unit tests pass. 
Co-Authored-By: claude-flow <ruv@ruv.net> * feat(cli): wire ADR-151 enroll / train-room / room-status / room-watch Integrates the wifi-densepose-calibration crate into the CLI as four subcommands driving the full Stage 2–5 pipeline against a live ESP32 raw-CSI stream (edge_tier=0): - enroll: walks the guided anchor sequence, gates each capture against the ADR-135 baseline deviation (re-prompts bad anchors), writes labelled features - train-room: fits the SpecialistBank from the enrollment, persists JSON - room-status: prints a trained bank's summary - room-watch: live mixture-of-specialists readout (presence/posture/breathing/ heart/restless) over a rolling window, with anomaly veto + STALE flagging Per-frame scalar is the mean CSI amplitude (carries presence/motion + breathing modulation). Validated end-to-end on the live ESP32 (COM8, edge_tier=0): the real parser → feature extraction → runtime detected breathing (~16–31 BPM) on hardware. Full multi-anchor enrollment accuracy requires the operator to perform the poses; phase-based breathing extraction is a noted refinement. 48 tests pass (29 calibration + 19 CLI). Co-Authored-By: claude-flow <ruv@ruv.net> * docs(adr-151): mark Stages 1–5 implemented; expand CHANGELOG Co-Authored-By: claude-flow <ruv@ruv.net> * fix(cli): keep proven mean-amplitude carrier for room features The max-variance-subcarrier carrier locked onto motion artifacts (not breathing) and also had an out-of-bounds bug on variable CSI subcarrier counts. Reverted to the mean-amplitude carrier, which is validated live to detect breathing. Phase-based extraction on a stable subcarrier remains the proper higher-SNR refinement (ADR-151 §4). 
Co-Authored-By: claude-flow <ruv@ruv.net> * feat(calibration): multistatic fusion of co-located nodes (ADR-029/151) MultiNodeMixture fuses several co-located nodes (each with its own room-calibrated SpecialistBank) into one RoomState: - presence: OR across nodes (any node seeing a person wins) - posture/breathing/heartbeat: highest-confidence node (best viewpoint) - restlessness/anomaly: max across nodes - veto: any node's physically-implausible signal vetoes the room's vitals (anti-hallucination, same as single-node runtime) + presence short-circuit - stale: any node's STALE flag propagates Same-room multistatic only; cross-room is federation (ADR-105), not fusion. 6 unit tests (presence OR, best-confidence breathing, single-node veto, staleness). 35 calibration tests pass. Co-Authored-By: claude-flow <ruv@ruv.net> * feat(cli): multistatic room-watch — fuse co-located nodes (ADR-029/151) `room-watch --node-bank N:path` (repeatable) groups live CSI frames by node_id and fuses per-node banks via MultiNodeMixture. Validated live on COM8 (node 9, edge_tier=0): frames grouped + fused end-to-end. True 2-node fusion is covered by unit tests; a second raw-CSI node is the hardware blocker. 54 tests pass. Co-Authored-By: claude-flow <ruv@ruv.net> * docs(integration): calibration → cognitum-v0 appliance integration overview Detailed cross-repo integration spec for cognitum-one/v0-appliance: data contracts (CSI wire format, ADR-135 baseline binary, enrollment/bank/RoomState JSON schemas), calibrate-serve HTTP API, public crate API, Pi5+Hailo tiering, and a 5-step appliance integration plan. Grounded in the verified cognitum-v0 inventory (aarch64, cargo 1.96, HAILO10H, ruview-vitals-worker:50054). 
Co-Authored-By: claude-flow <ruv@ruv.net> * fix(calibration): address PR review — aarch64 decouple, API auth, path traversal, throttle Resolves the review on #989: - Cross-compile (the appliance blocker): make wifi-densepose-mat optional and feature-gate it (`mat`), so `cargo build -p wifi-densepose-cli --no-default-features` excludes the mat→nn→ort(ONNX)→openssl-sys chain. Verified: `cargo tree --no-default-features` shows 0 ort/openssl deps → calibration cross-compiles clean for the Pi. - Security (must-fix before LAN): - `--token` / CALIBRATE_TOKEN bearer-auth middleware on every route; warns if bound non-loopback without a token. - sanitize client-supplied `room_id` to [A-Za-z0-9_-] (≤64) before it reaches the baseline write path — kills the `../` file-write primitive. + test. - Perf: stop locking shared status + cloning SessionStatus on every UDP frame — counters/snapshot flush on the 200 ms tick instead (no CPU starvation under flood). finalize write moved to async `tokio::fs::write`. - Docs: ADR-151 STALE wording matches the impl (baseline-id change; drift-threshold = P6 refinement); integration doc gets the `--no-default-features` build + auth/sanitize notes. 35 calibration + 15 CLI tests (no-default) / 20 CLI (default) pass. Co-Authored-By: claude-flow <ruv@ruv.net> * docs(worldgraph,worldmodel): add crates.io READMEs Plain-language overviews + feature lists, comparison tables (symbolic graph vs predictive occupancy; graph vs grid vs event-log), usage, and technical details. Adds readme = "README.md" to both manifests so they render on crates.io on the next release. 
Co-Authored-By: claude-flow <ruv@ruv.net> * release: worldgraph & worldmodel 0.3.1 (READMEs on crates.io) Co-Authored-By: claude-flow <ruv@ruv.net> * docs: precise calibration validation scope (capture+API+auth proven; clean enroll→train→infer not yet on-target) Aligns ADR-151 §7 + the appliance integration doc with the PR #989 scope clarification: nothing has run a clean baseline → enroll → train → infer on live CSI; the live breathing read used the stateless head, not a trained bank. Adds --source-format adr018v6 to the backlog. Co-Authored-By: claude-flow <ruv@ruv.net> * feat(calibrate-serve): live GET /room/state endpoint (mixture over CSI window) Adds a live RoomState readout over HTTP — the appliance UI's main need. The ingest task maintains a rolling per-frame scalar window (flushed on the 200 ms tick, no per-frame lock); the handler loads a bank (resolved as a sanitized name under output_dir — same path-traversal defense as room_id), runs the MixtureOfSpecialists over the window, returns RoomState JSON. Validated live (ESP32-S3 via relay): breathing 14-19 BPM over HTTP; a bank=../../etc/passwd query is neutralized to 'etcpasswd' (no traversal). Co-Authored-By: claude-flow <ruv@ruv.net> * feat(calibrate-serve): POST /room/train + fix AnchorLabel JSON to snake_case - POST /api/v1/room/train: { room_id, baseline_id, anchors[] } → trains a SpecialistBank and persists it as <output_dir>/<room_id>.json (path-sanitized), readable via /room/state?bank=<room_id>. Completes the HTTP train→infer loop. - Fix data-contract bug: AnchorLabel serialized as PascalCase variant names (serde default) while as_str() + the integration doc used snake_case. Added #[serde(rename_all = "snake_case")] so the JSON wire format matches the documented contract (empty/stand_still/…). Locked with a roundtrip test. 
Validated live (ESP32-S3): POST train (4 anchors → 6 specialists, persisted) → GET /room/state returns RoomState with the trained presence/restlessness; the synthetic-vs-real scale mismatch correctly triggers the anomaly veto. 36 calibration tests pass. Co-Authored-By: claude-flow <ruv@ruv.net> * feat(calibrate-serve): live enroll-over-HTTP (POST /enroll/anchor + /enroll/status) Closes the last HTTP gap — the appliance can now drive the ENTIRE calibration pipeline over HTTP without the CLI: baseline (start/stop) -> enroll/anchor x8 -> room/train -> room/state - POST /enroll/anchor { room_id, baseline, label, duration_s? }: the ingest task loads the baseline (sanitized name under output_dir), captures the anchor for the duration against it (AnchorRecorder + per-frame series), runs the quality gate, and on completion replies with the verdict + accumulates the AnchorFeature in an in-server enrollment map keyed by room_id. Re-prompts on rejection. - GET /enroll/status?room=<id>: accepted anchors, next, complete. - POST /room/train now falls back to the in-server enrollment when anchors[] is omitted. Validated live (ESP32-S3): capture baseline -> enroll stand_still (271 frames, 6s) -> gate correctly rejects "no person detected (presence_z 0.90 < 1.50)" relative to a same-occupancy baseline (a clean empty-room baseline is the documented on-target prerequisite). Builds clean; CLI tests pass. 
Co-Authored-By: claude-flow <ruv@ruv.net> * test(calibrate-serve): HTTP integration tests for the room/enroll endpoints Factor the router into build_router() (shared by execute + tests) and add tower-oneshot integration tests (no network/ingest needed): - health + descriptor → 200 - POST /room/train persists the bank; GET /room/state → 200; train with no anchors/enrollment → 400 - path-traversal: /room/state?bank=../../etc/passwd → 404 (sanitized, never reads outside output_dir) - enroll/status empty; /enroll/anchor with an unknown label → 400 CI regression coverage for the endpoints added this session. 18 CLI tests pass. Co-Authored-By: claude-flow <ruv@ruv.net> * fix(mat): make serde non-optional — unblocks `cargo test --workspace --no-default-features` Making wifi-densepose-mat optional in the CLI (for the aarch64/ort decouple) exposed a latent feature bug: mat's `api` module compiles unconditionally and uses serde, but `serde` was an optional dep enabled only via the `api`/`serde` features. Previously the CLI's unconditional mat dependency enabled those features transitively, so `--workspace --no-default-features` still got serde; once mat became optional+gated, the workspace build lost it → `error[E0432]: unresolved import serde` across mat's api/* (CI red). mat already pulls serde_json + axum unconditionally, so making `serde` non-optional has no real cost and restores the workspace build. Does NOT affect the aarch64 CLI build (mat isn't built there at all): verified `cargo tree -p wifi-densepose-cli --no-default-features` still shows 0 ort/openssl deps, and `cargo test --workspace --no-default-features` compiles clean. 
Co-Authored-By: claude-flow <ruv@ruv.net> * docs(claude.md): add wifi-densepose-calibration to crate table (pre-merge) Co-Authored-By: claude-flow <ruv@ruv.net> * docs(adr): ADR-152 — WiFi-pose SOTA 2026 intake (geometry-conditioned calibration, external benchmarks, encoder recipe) Records the 2026-06-10 deep-research run (22 sources, 110 claims, 25 adversarially verified: 24 confirmed / 1 refuted) and the decisions it implies: - §2.1 ACCEPTED: geometry-condition the ADR-151 calibration system — NodeGeometry at enrollment, geometry embeddings for future LoRA heads, PerceptAlign-style two-checkerboard camera↔WiFi alignment for the ADR-079 supervised path. PerceptAlign (MobiCom'26) names the failure mode ("coordinate overfitting") that matches our own ADR-150 cross- subject collapse. - §2.2 ACCEPTED: benchmark protocol vs external "WiFlow-STD (DY2434)" (claimed 97.25% PCK@20, Apache-2.0 weights+dataset) with a no-citation rule until measured on our 17-keypoint ESP32 eval set. Name collision with our internal WiFlow is disambiguated. - §2.3 ACCEPTED: amend ADR-150 training recipe per UNSW MAE study — 80% masking, (30,3) patches, data-over-capacity priority (log-linear, unsaturated at 1.3M samples). - §2.4 watch items: IEEE 802.11bf-2025 published 2025-09-26; esp_wifi_sensing as external presence baseline (drop-in claim REFUTED 0-3); ZTECSITool 160MHz/512-subcarrier anchor node (procurement-gated). - §2.5 NOT adopted: non-WiFi "foundation model" papers; DensePose-UV (no 2025-2026 work does UV regression from commodity WiFi). Every number is evidence-graded CLAIMED vs MEASURED in the source register. Re-check horizon 2026-12. Co-Authored-By: RuFlo <ruv@ruv.net> * test(calibration): full-loop integration test — baseline→enroll→train→infer proven in-process (ADR-151 §7 gap, software half) Closes the software half of PR #989's headline validation gap: the complete calibration loop had never run end-to-end anywhere, even in-process. 
tests/full_loop.rs (412 lines, deterministic xorshift32 room simulator, HT20/52-subcarrier/20Hz, same fingerprint family as the ADR-135 roundtrip test) now drives the CLI's exact stage order through the public API: 1. baseline — 600 static frames, zero motion flags post-warmup, calibration_uuid() exactly as the CLI derives it 2. enroll — all 8 AnchorLabel::SEQUENCE anchors through AnchorQualityGate::default(), session is_complete() 3. extract — AnchorFeature::from_series recovers injected 0.25Hz and 0.125Hz breathing within ±0.04Hz 4. train — SpecialistBank::train fits all 6 specialists; JSON round-trip and the runtime consumes the RELOADED bank 5. infer — positive: never-enrolled 0.30Hz subject reads present, 18±2 BPM; negative: empty window reads absent; degradation: foreign baseline_id flags STALE Seed-robust (5 seeds), passes with and without default features: 36 unit + 1 integration green. Validation docs updated (ADR-151 §7 + integration doc §7 matrix): what remains is strictly the on-target hardware session (real CSI, physically empty room, operator performing the guided anchors). Three behavioral findings from building the test are recorded for pre-session triage: z-band squeeze between baseline motion flagging (z>2.0) and the still- anchor gate (presence_z≥1.5) — likeliest on-hardware enroll failure; variance-only PresenceSpecialist missing motionless-person mean shift; ungated breathing_hz/heart_hz in noise-window embeddings. Co-Authored-By: RuFlo <ruv@ruv.net> * fix(calibration): close all four ADR-152 behavioral findings pre-hardware-session The full-loop integration test surfaced three findings; fixing the third exposed a fourth. All four are fixed and regression-guarded: 1. 
z-band squeeze (enrollment.rs) — anchor motion is now measured from frame-to-frame deltas of the deviation series (\|Δz\| > Z_DELTA_MOTION 0.5 ∨ \|Δφ\| > π/6), not from the absolute motion_flagged, which fires at amplitude_z_median > 2.0 vs the EMPTY baseline and so conflated presence strength with motion. A strongly-reflecting still person (z = 3.0 — every frame flagged by the old heuristic) now enrolls. The old unit tests mocked (z=3.0, motion=false), a combination the real deviation() can never emit — which is exactly how the squeeze hid; tests now derive the flag from z the way the producer does. 2. variance-only presence (specialist.rs) — PresenceSpecialist gains a mean-shift channel: present when variance > threshold OR \|mean − empty_mean\| > mean_dist_threshold (trained at half the empty→occupied mean distance, None when the means don't separate). Detects the motionless person whose body raises the scalar mean but not its variance. Old persisted banks deserialize with the channel inert (serde default None) — variance-only behavior preserved, proven by a fixture test against pre-change JSON. 3. ungated hz embedding (extract.rs) — Features::embedding() zeroes breathing_hz/heart_hz below EMBED_MIN_SCORE (0.25), keeping the random in-band peaks of noise windows out of the posture/anomaly prototype space. Raw fields stay ungated (specialists have their own stricter gates). 4. heart-band lag-floor leakage (extract.rs, found while fixing 3) — a pure 0.30 Hz breathing signal scored 0.67 in the heart band at 3.33 Hz: out-of-band rhythm leaks as a monotonic slope whose max sits at the band's lag floor, so score gating alone cannot stop it. autocorr_dominant now requires the winning lag to be an interior local maximum; band-edge "peaks" are rejected, true in-band peaks (interior by definition) are preserved. 
full_loop.rs strengthened to drive the fixes end-to-end: the StandStill anchor is now a z=3.0 strong reflector (unenrollable pre-fix), and a new motionless-person runtime case proves mean-channel detection at empty- level variance. Validation: 41 calibration unit + 1 full-loop integration + 23 CLI tests green; cargo test --workspace --no-default-features exit 0. Co-Authored-By: RuFlo <ruv@ruv.net>	2026-06-10 15:21:09 -04:00
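The mean-shift presence channel described in finding 2 of commit 2a307138f2 (present when variance > threshold OR |mean - empty_mean| > mean_dist_threshold, with the threshold trained at half the empty-to-occupied mean distance) can be sketched in Python as below. This is a minimal sketch, not the crate's Rust implementation: the function names are hypothetical, and the `min_separation` cutoff is an assumption, since the log does not say how "the means don't separate" is decided.

```python
from statistics import fmean, pvariance
from typing import Optional, Sequence


def train_mean_dist_threshold(
    empty_mean: float,
    occupied_mean: float,
    min_separation: float = 1e-6,  # assumption: cutoff for "means don't separate"
) -> Optional[float]:
    """Half the empty->occupied mean distance, or None if the means coincide."""
    distance = abs(occupied_mean - empty_mean)
    if distance <= min_separation:
        return None
    return distance / 2.0


def is_present(
    window: Sequence[float],
    variance_threshold: float,
    empty_mean: float,
    mean_dist_threshold: Optional[float],
) -> bool:
    """Variance channel OR mean-shift channel; the latter is inert when None."""
    if pvariance(window) > variance_threshold:
        return True
    if mean_dist_threshold is None:
        # Matches the described behavior of old banks: variance-only.
        return False
    return abs(fmean(window) - empty_mean) > mean_dist_threshold


threshold = train_mean_dist_threshold(empty_mean=10.0, occupied_mean=12.0)
motionless_person = [12.0] * 20   # raised mean, zero variance
empty_room = [10.0] * 20

detected = is_present(motionless_person, 0.5, 10.0, threshold)
empty_detected = is_present(empty_room, 0.5, 10.0, threshold)
legacy_detected = is_present(motionless_person, 0.5, 10.0, None)
```

The numbers are made up for illustration. The point is that a motionless person with empty-level variance is caught only by the mean channel, while a bank without the channel keeps the old variance-only result.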
rUv	0cfd255730	fix: --export-rvf no longer silently produces a placeholder model (#920) The --export-rvf handler ran before the --train/--pretrain handlers and unconditionally wrote placeholder sine-wave weights, then returned. So the documented `--train --dataset … --export-rvf <path>` workflow (user-guide.md) short-circuited to a PLACEHOLDER model and never trained — printing "exported successfully" for a non-functional model. Given the project's anti-"is it fake" stance, silently emitting a fake model is the wrong default. Fix: - Only emit the placeholder container-format demo when --export-rvf is used standalone (new `export_emits_placeholder_demo` guard). With --train/--pretrain, fall through so the real training pipeline runs and exports calibrated weights. - The standalone path now prints a clear WARNING that it writes a container-format demo with placeholder weights — not a trained model — pointing to --train / a pretrained encoder (#894). - Docs: flag --export-rvf as a placeholder demo in the flag table, and fix the Docker training example to use --save-rvf (consistent with the from-source example) instead of the placeholder --export-rvf. 3 unit tests for the guard. Full crate unit suite: 429 + 117 passed, 0 failed.	2026-06-03 08:55:36 +02:00
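The guard logic described in commit 0cfd255730 (emit the placeholder demo only when --export-rvf is used standalone) reduces to a small predicate. The Python sketch below borrows the guard's name from the commit message, but the signature and boolean arguments are assumptions made for illustration; the real guard lives in the project's Rust code.

```python
def export_emits_placeholder_demo(export_rvf: bool, train: bool, pretrain: bool) -> bool:
    """True only when --export-rvf is given without --train or --pretrain.

    Assumed signature: each argument says whether the corresponding
    command-line flag was present.
    """
    return export_rvf and not (train or pretrain)


# Standalone export: placeholder container-format demo (with a warning).
standalone = export_emits_placeholder_demo(export_rvf=True, train=False, pretrain=False)

# Export combined with training: fall through to the real pipeline.
with_train = export_emits_placeholder_demo(export_rvf=True, train=True, pretrain=False)
with_pretrain = export_emits_placeholder_demo(export_rvf=True, train=False, pretrain=True)

# No export flag at all: nothing to emit.
no_export = export_emits_placeholder_demo(export_rvf=False, train=True, pretrain=False)
```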
rUv	91b0e625bd	docs(#882): complete the "100% presence" retraction across all docs (#916) The v1 "100% presence accuracy" headline was already retracted in the README / user-guide intro / proof-of-capabilities — but 6 secondary spots still flatly claimed "100% accuracy, never false alarms", which made proof-of-capabilities.md's "replaced everywhere" assertion untrue. Completed the retraction in-place with the honest label-free metric (82.3% held-out temporal-triplet; v1 was a single-class recording where a constant "yes" scores ~99.98%): - docs/readme-details.md — 2 benchmark tables + the pre-trained-model row - docs/user-guide.md — capability table, model-file comment, applications list - CHANGELOG.md — annotated the historical entry in-place (kept as public record per built-in-public ethos, not rewritten) Verified: no remaining flat "100% presence/accuracy" claim lacks a retraction marker; proof-of-capabilities.md "replaced everywhere" is now accurate.	2026-06-02 18:50:39 +02:00
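The remark in commit 91b0e625bd that a constant "yes" scores ~99.98% on a single-class recording can be reproduced with a few lines of Python. The frame counts below are hypothetical, chosen only so the arithmetic lands on 99.98%; the log does not give the real counts. Balanced accuracy is used here purely to show why raw accuracy misleads. It is not the project's metric, which the commit names as a held-out temporal-triplet score.

```python
from typing import Sequence


def accuracy(labels: Sequence[int], predictions: Sequence[int]) -> float:
    """Fraction of frames where the prediction equals the label."""
    correct = sum(1 for y, p in zip(labels, predictions) if y == p)
    return correct / len(labels)


def balanced_accuracy(labels: Sequence[int], predictions: Sequence[int]) -> float:
    """Mean of per-class recall, so each class counts equally."""
    recalls = []
    for cls in sorted(set(labels)):
        total = sum(1 for y in labels if y == cls)
        hits = sum(1 for y, p in zip(labels, predictions) if y == cls and p == cls)
        recalls.append(hits / total)
    return sum(recalls) / len(recalls)


# Hypothetical, almost single-class recording: 9,998 "present" frames, 2 "absent".
labels = [1] * 9998 + [0] * 2
always_yes = [1] * len(labels)   # a "model" that never looks at the signal

raw = accuracy(labels, always_yes)
balanced = balanced_accuracy(labels, always_yes)
```

Under these assumed counts the constant predictor reaches 0.9998 raw accuracy but only 0.5 balanced accuracy, which is chance level for two classes.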
ruv	c79e2e60ca	docs(proof): update hash + note cross-platform determinism gate verify.py's published hash is now f8e76f21 (doppler excluded). Document that the proof reproduces bit-for-bit across Windows / two Linux hosts / the Azure CI runner, that the peak-normalized Doppler is excluded due to its cross-microarch argmax instability, and that a relative-tolerance check against a committed reference vector backs the five stable features.	2026-05-31 12:22:53 -04:00
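The relative-tolerance check against a committed reference vector mentioned in commit c79e2e60ca can be sketched with the standard library as below. This is not verify.py itself: the function name, the tolerance, and the five feature values are hypothetical placeholders.

```python
import math
from typing import Sequence


def features_match(
    features: Sequence[float],
    reference: Sequence[float],
    rel_tol: float = 1e-9,  # assumed tolerance; the log does not state one
) -> bool:
    """True when every feature is within rel_tol of its reference value."""
    if len(features) != len(reference):
        return False
    # math.isclose uses abs_tol=0.0 by default, so a reference value of
    # exactly 0.0 only matches 0.0; pass abs_tol if that case matters.
    return all(
        math.isclose(f, r, rel_tol=rel_tol) for f, r in zip(features, reference)
    )


# Hypothetical committed reference vector with five features.
reference = [0.125, 3.5, 42.0, 1e-3, 7.25]

# Same values with last-bits noise, as different hosts might produce.
noisy = [r * (1.0 + 1e-12) for r in reference]

# One feature genuinely drifted.
drifted = list(reference)
drifted[2] = 42.1

noisy_ok = features_match(noisy, reference)
drifted_ok = features_match(drifted, reference)
```

A tolerance check like this accepts harmless floating-point noise that would change a bit-exact hash, while still rejecting a real change in a feature value.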
ruv	138449a378	Merge remote-tracking branch 'origin/main' into feat/adr-149-aether-arena # Conflicts: # CHANGELOG.md	2026-05-31 10:36:12 -04:00
ruv	0fbdd15955	docs: results+proof links, capabilities-proof rebuttal, fix stale claims - README: replace retracted "100% presence" claim with honest 82.3% held-out temporal-triplet; correct stale "pose model not in this release" (now live at ruvnet/wifi-densepose-mmfi-pose, 82.69% torso-PCK@20 SOTA); add a Results & proof table (HF models, AetherArena, benchmark study, deterministic verify.py proof, witness). - user-guide: same 100%->82.3% correction in two places; add Results & proof pointers and the SOTA pose model + AetherArena links. - docs/proof-of-capabilities.md (new): evidence-first rebuttal to the "fake / misleading" claims. Concedes what was fair (over-stated early metrics, AI-doc tone), refutes the category errors (simulate-mode mistaken for fraud; missing weights mistaken for missing pipeline), and gives copy-paste "prove it yourself" steps (verify.py VERDICT: PASS + published SHA-256, cargo test, HF model pull, ESP32 CSI). Emphasizes built-in-public history (git, 96 ADRs, CHANGELOG, issues incl. #803/#872 bug->fix arcs) as the anti-facade evidence. - aether-arena/VERIFY.md: cross-link the whole-platform proof doc. Verified: python archive/v1/data/proof/verify.py -> VERDICT: PASS (hash ca58956c...9199 matches published expected_features.sha256). Co-Authored-By: claude-flow <ruv@ruv.net>	2026-05-31 10:29:28 -04:00
ruv	1d9c0b3d4c	docs(study): sharpest finding — the encoder barely matters for CSI pose Random frozen encoder + trained head matches a fully-trained encoder to within 2-4pts (cross-subject <2pts). WiFi-CSI sensing is largely a random-features + target-readout problem: barely a learned representation to transfer, which unifies the zero-shot collapse, no-transfer results, foundation-encoder failure, and why per-room calibration works. Practical: invest in readout + calibration, not encoder pretraining. Co-Authored-By: claude-flow <ruv@ruv.net>	2026-05-31 03:43:14 -04:00
ruv	c95dd308fd	docs(study): cross-dataset confirmed on harder NTU-Fi-HumanID task Re-ran transfer on 14-class person-ID (harder than 6-activity HAR): same null-transfer result (MM-Fi pretrain 91.7% = random 92.8%). Unified root cause: CSI in-domain classification lives in the target-trained readout (random projection already separable); learned reps don't transfer across subjects/rooms/datasets. WiFi-CSI is distribution-locked. Addresses the 'HAR too easy' caveat. Co-Authored-By: claude-flow <ruv@ruv.net>	2026-05-31 03:37:19 -04:00
ruv	af68bd68d8	docs(study): cross-dataset transfer tested (MM-Fi -> NTU-Fi, honest negative) Tested the cross-dataset frontier: MM-Fi-trained CSI representation does NOT transfer beneficially to NTU-Fi HAR (frozen probe 91.5% = random features 93%; full fine-tune 75% < probe). CSI reps are distribution-locked, same root cause as within-MM-Fi cross-subject/-env collapse. Caveat: NTU-Fi 6 coarse activities are an easy target (random->93%). Updates the study's cross-dataset limitation from 'untested' to this measured result. Co-Authored-By: claude-flow <ruv@ruv.net>	2026-05-31 03:27:38 -04:00
ruv	695b5fb700	docs: complete MM-Fi WiFi-sensing study (pose + action, the honest picture) Consolidates the full campaign into one committed, citable artifact (the detailed log was in a gitignored staging report): pose SOTA 83.6% + 20KB int4 edge model; action recognition 88% (a WiFi task MM-Fi never benchmarked); the generalization story (zero-shot collapse, few-shot calibration rescue, task-general across pose+action); all honest negatives (CORAL/DANN/instance-norm/SupCon/distillation/subject-scaling); the 11KB calibration-adapter deployment recipe; honest limitations (cross-dataset untested, ARM latency pending). Co-Authored-By: claude-flow <ruv@ruv.net>	2026-05-31 03:06:54 -04:00
ruv	dac40e5df2	docs(adr-150): calibration thesis is task-general (action recognition) Verified on a 2nd MM-Fi task: 27-class action recognition (which MM-Fi never benchmarked for WiFi; only published baseline WiDistill 34%). In-domain 88% (leaky); cross-subject zero-shot collapses to ~10%; few-shot calibration rescues 10->76% (1000 samples). Same mechanism as pose -> few-shot in-room calibration is the universal WiFi-sensing generalization answer, not a pose quirk. Co-Authored-By: claude-flow <ruv@ruv.net>	2026-05-31 03:01:50 -04:00
ruv	5533ffe43e	docs(adr-150): cross-env few-shot — no unsolved deployment case Decisive capstone: cross-environment (unseen room+people) zero-shot 10.6%, but 5 calibration samples/person -> 60%, 200 -> 73%. The hard frontier is calibration-soluble, MORE dramatically than cross-subject (+62.5 vs +12 at K=200). The unsolved-frontier framing was a zero-shot artifact. Reframes generalization: ship few-shot calibration, not zero-shot invariance. Recommend accepting ADR-150 re-scoped around the calibration mechanism. Co-Authored-By: claude-flow <ruv@ruv.net>	2026-05-31 02:09:03 -04:00
ruv	ef4344f0f9	docs(adr-150): LoRA calibration data requirement — completes calibration spec 11KB adapter needs ~100-200 labeled samples/room for ~72% (knee ~50->70%); below ~20 it hurts. Evidence-complete calibration-service spec: base + ~100-200 samples -> 11KB LoRA -> ~72% cross-subject. Encoder goal now precisely posed: cut the sample requirement / lift the per-budget ceiling. Co-Authored-By: claude-flow <ruv@ruv.net>	2026-05-31 02:04:37 -04:00
ruv	ed1294a176	docs(adr-150): deployable adapter calibration — 11KB LoRA = calibration service Compared per-room calibration methods at K=200: LoRA rank-8 recovers 63.6->72.5% (SOTA-level) with just 11K params (~11KB), 0.5% the model size. Validates the ship-base-once + tiny-per-room-adapter mechanism for the RuView calibration service. Accuracy/size knob documented. Co-Authored-By: claude-flow <ruv@ruv.net>	2026-05-31 01:54:23 -04:00
ruv	898aaef053	docs(adr-150): few-shot adaptation resolves the cross-subject frontier Decisive result: 50 labeled frames/subject of in-room calibration -> 72.2% (reaches SOTA), 200 -> 76.1%, 1000 -> 78.3%. Few-shot target adaptation dominates source volume (+24 subjects bought +6pt; 200 target frames bought +12.4pt). Re-scopes the deployment story: ship a ~30s on-site calibration, not a mass corpus. Foundation encoder's role shifts to making that calibration cheaper. Supersedes the earlier data-bound pessimism. Co-Authored-By: claude-flow <ruv@ruv.net>	2026-05-31 01:47:00 -04:00
ruv	70bf9e41fe	docs(adr-150): subject-scaling study — capture diversity, not volume Measured cross-subject PCK vs N training subjects: 4->8 = +21pts, but 24->32 = +0.45pt. Saturates ~64%, ~19pt below in-domain. Correction to 'more data': subject-count returns vanish past ~16-20; the residual is device/room/protocol shift. Re-scope phase-1 capture around DIVERSITY (rooms/devices/protocols) + few-shot target adaptation, not headcount. Co-Authored-By: claude-flow <ruv@ruv.net>	2026-05-31 01:43:49 -04:00
ruv	96ccfa58fb	bench: ship int4 edge artifact + CPU latency Published deployable int4-QAT micro (verified 74.08%, ~20KB) at ruvnet/wifi-densepose-mmfi-pose/edge. Runs 0.135ms single-thread x86 CPU (no GPU) - real-time pose without an accelerator. ARM on-device validation pending fleet availability. Co-Authored-By: claude-flow <ruv@ruv.net>	2026-05-31 01:30:29 -04:00
ruv	92d433523d	bench: deployed quantized accuracy + QAT for micro edge model int8 PTQ lossless (74.70%, 73.5KB); int4 naive PTQ drops below SOTA (70.21%) but QAT recovers to 74.46% (36.7KB) - still beats MultiFormer. A SOTA-beating WiFi-pose model genuinely runs in ~37KB int4 (QAT) / 73KB int8. Distillation negative noted. Co-Authored-By: claude-flow <ruv@ruv.net>	2026-05-31 01:23:30 -04:00
ruv	d64323c2d6	bench: add quantized footprint — SOTA-beating WiFi pose in 37KB int4 micro (74.87%, beats MultiFormer 72.25%) = 36.7KB int4 / 73.5KB int8; nano (~72%) = 19.5KB int4. Distillation tested, no gain (direct training wins). A SOTA-beating pose model fits on the sensing node itself. Co-Authored-By: claude-flow <ruv@ruv.net>	2026-05-31 01:16:16 -04:00
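The footprint figures quoted in commit d64323c2d6 follow from parameter count times bits per weight. The Python sketch below assumes a weights-only footprint with no container overhead and that "KB" means 1024 bytes; under those assumptions it reproduces the quoted numbers from the 75,237-parameter count given for micro in the efficiency-frontier commit and from the rounded 40K figure for nano. The function name is hypothetical.

```python
def footprint_kib(n_params: int, bits_per_weight: int) -> float:
    """Weights-only size in KiB (assumes no headers, scales, or padding)."""
    return n_params * bits_per_weight / 8 / 1024


MICRO_PARAMS = 75_237   # from the efficiency-frontier commit
NANO_PARAMS = 40_000    # "40K" in the log, taken as a round number

micro_int4 = round(footprint_kib(MICRO_PARAMS, 4), 1)   # quoted as 36.7KB
micro_int8 = round(footprint_kib(MICRO_PARAMS, 8), 1)   # quoted as 73.5KB
nano_int4 = round(footprint_kib(NANO_PARAMS, 4), 1)     # quoted as 19.5KB
```

A real quantized artifact also carries scales and metadata, so this is a lower bound rather than a file-size prediction.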
ruv	9c64d90054	bench: WiFi-CSI pose efficiency frontier — 75K-param model beats SOTA Swept model size on MM-Fi random_split: every config from micro (75,237 params, 0.22ms, 74.30%) up beats MultiFormer (72.25%); nano (40K, 0.13ms) within 0.5pt. Pareto-dominant (smaller AND more accurate than prior SOTA). Orthogonal to the data-bound accuracy frontier (ADR-150). Co-Authored-By: claude-flow <ruv@ruv.net>	2026-05-31 01:10:33 -04:00

289 Commits