wifi-densepose/docs/research/ruview-beyond-sota
rUv 42dcf49f4d
fix(adr): resolve duplicate ADR numbers + close ADR-080 security + ADR-154 M1 signal backlog (#1051)
* fix(signal): circular phase variance for ghost-tap guard (ADR-154 §7.4 #1)

`phase_variance` computed a LINEAR sample variance over phase angles that
wrap at ±π, so a tightly-clustered set straddling the branch cut reported
spuriously HIGH dispersion — false-tripping the `> TAU` ghost-tap guard on
real, tightly-clustered CIR taps.

Replace with Mardia's circular variance V = 1 − R̄, bounded [0,1] and
invariant to where the cluster sits on the circle. Re-derive the guard
against the bounded metric via a named const
`GHOST_TAP_CIRCULAR_VARIANCE_MAX` (the old TAU-scaled threshold is
meaningless on [0,1]).

Grade: metric fix MEASURED; threshold value DATA-GATED — a clean single-path
ramp also sweeps the circle, so V alone cannot separate clean from
unsanitized without labelled frames. Conservative default (0.99) errs toward
never false-rejecting, strictly more permissive at the wrap boundary than the
buggy linear guard.

Fails-on-old test: `phase_variance_circular_not_fooled_by_branch_cut` —
inlines the old linear variance to show it exceeds TAU on wrap-straddling
phases while circular V≈0 and the guard no longer trips. Plus
`phase_variance_circular_is_bounded_and_extremal` (V∈[0,1], V≈0 identical,
V≈1 uniform).

cargo test -p wifi-densepose-signal --no-default-features --features cir --lib
→ 432 passed, 0 failed.

Co-Authored-By: claude-flow <ruv@ruv.net>

* fix(signal): pin Welford n=0/n=1 finiteness guard (ADR-154 §7.4 #10)

The shared `WelfordStats` (field_model.rs, used by longitudinal.rs and others)
relies on `count < 2` guards in `variance`/`sample_variance`/`std_dev`/
`z_score` to stay finite at the boundaries. The guards existed but the n=0
boundary was UNTESTED — exactly the §4 divide-by-(n−1) family the ADR groups
this with.

Add `welford_finite_at_n0_and_n1` asserting every statistic is finite and
returns the documented sentinel (0.0) at n=0 and n=1, plus load-bearing doc
comments on the two guards.

Fails-on-old proof: with the `sample_variance` guard removed, the test FAILS
with "attempt to subtract with overflow" at the `(self.count - 1)` underflow
(0usize − 1); `variance` would similarly yield 0.0/0.0 = NaN. The guard is
restored; the test pins it so a future regression is caught.

Grade: MEASURED (boundary finiteness is asserted; the guard is the §4-family
fix made testable).

cargo test -p wifi-densepose-signal --no-default-features --lib field_model
→ 22 passed, 0 failed.

Co-Authored-By: claude-flow <ruv@ruv.net>

* refactor(signal): de-magic adversarial thresholds + boundary tests (ADR-154 §7.4 #13)

Lift the bare numeric literals buried in `check`/`check_consistency` into
named, documented module consts (FIELD_MODEL_GINI_VIOLATION=0.8,
ENERGY_RATIO_HIGH_VIOLATION=2.0, ENERGY_RATIO_LOW_VIOLATION=0.1,
CONSISTENCY_ACTIVE_FRACTION_OF_MEAN=0.1, SCORE_W_* weights). VALUES UNCHANGED —
each const equals the original literal; only names + pinning tests are new.

Grade: DATA-GATED. The operating values stay empirical (defensible values need
labelled spoofed/clean CSI — Wi-Spoof, §6.2/§7.3). The de-magicking +
characterization tests are MEASURED: `tuning_consts_unchanged_from_literals`,
`energy_ratio_high_boundary`, `energy_ratio_low_boundary`,
`field_model_gini_boundary`, `consistency_active_fraction_boundary` pin the
decision boundaries at/just-below/just-above each threshold, so a future
data-driven retune is a visible, tested change.

Fails-on-change proof: bumping ENERGY_RATIO_HIGH_VIOLATION 2.0→3.0 makes
`energy_ratio_high_boundary` FAIL (restored). Operating values explicitly
NOT changed.

cargo test -p wifi-densepose-signal --no-default-features --lib ruvsense::adversarial
→ 20 passed, 0 failed.

Co-Authored-By: claude-flow <ruv@ruv.net>

* refactor(signal): de-magic coherence drift/gate thresholds (ADR-154 §7.4 #9)

Lift the bare detection literals in `coherence.rs::classify_drift`
(DRIFT_STABLE_SCORE=0.85, DRIFT_STEP_CHANGE_MAX_STALE=10) and the
`coherence_gate.rs` Default impl (DEFAULT_ACCEPT_THRESHOLD=0.85,
DEFAULT_REJECT_THRESHOLD=0.5, DEFAULT_MAX_STALE_FRAMES=200,
DEFAULT_PREDICT_ONLY_NOISE=3.0) into named, documented consts. VALUES
UNCHANGED. The gate already exposed these via GatePolicyConfig (config seam);
this names + pins the defaults.

Grade: DATA-GATED. Operating values stay empirical (defensible Z-score
thresholds need labelled stable/drifting coherence traces). De-magicking +
boundary tests are MEASURED: `classify_drift_stable_score_boundary`,
`classify_drift_stale_count_boundary` pin the at/just-below/just-above
decisions; `drift_consts_unchanged_from_literals` /
`gate_default_consts_unchanged_from_literals` pin the values. Operating values
explicitly NOT changed.

cargo test -p wifi-densepose-signal --no-default-features --lib ruvsense::coherence
→ 40 passed, 0 failed.

Co-Authored-By: claude-flow <ruv@ruv.net>

* docs(adr-154): mark §7.4 P1 backlog cleared — Milestone-1 (#1,#10 RESOLVED; #9,#13 DATA-GATED)

Update ADR-154 §7.4 backlog rows #1, #9, #10, #13 with commit refs + grades,
the §7.4 intro count (four P1 items cleared, ~41 P2/P3 remain), the
Horizon-ledger one-liner (Milestone-1 DONE), and the §8 honest-limits #1 line
(metric now correct; threshold still DATA-GATED). Add CHANGELOG [Unreleased]
entry.

Grades: #1 RESOLVED (MEASURED metric / DATA-GATED threshold), #10 RESOLVED
(MEASURED), #9 & #13 RESOLVED-PARTIAL (DATA-GATED — de-magicked + boundary
tested, operating values unchanged).

Validation: cargo test --workspace --no-default-features → 2057 passed, 0
failed; wifi-densepose-signal lib → 442 passed (no-default + --features cir);
python archive/v1/data/proof/verify.py → VERDICT: PASS, hash f8e76f21…46f7a
UNCHANGED (CIR ghost-tap guard is not on the deterministic proof path).

Co-Authored-By: claude-flow <ruv@ruv.net>

* fix(sensing-server): stop leaking internal errors in HTTP responses (ADR-080 #2)

Six handlers in `main.rs` serialized the internal error `Display` straight
into the JSON response body, leaking server internals to any client (ADR-080
finding #2, CWE-209; reframed onto the Rust boundary by ADR-164 G11):

  - edge_registry_endpoint: a panicked spawn_blocking `JoinError`
    ("task … panicked") in a 500, and the raw upstream error in a 503
  - delete_model / delete_recording / start_recording: std::io::Error
    strings carrying OS detail / filesystem paths
  - calibration_start / calibration_stop: the FieldModel error chain

New `error_response` module: `internal_error` / `internal_error_json` /
`upstream_unavailable` log the full detail server-side only (tagged with a
correlation id) and return a generic body
(`{"error":"internal_error","correlation_id":…}`) — no `panicked`, no file
paths, no Debug chain. The correlation id lets an operator join a client
report to the exact server log line without ever shipping the detail.

Pinned by 5 error_response tests, incl. a leak-substring guard
(internal_error_body_does_not_leak_detail) verified to FAIL on the reverted
old body (returns the panic message / path / "os error"). The HOMECORE sweep
(ADR-161) covered homecore-server, not this crate.

Co-Authored-By: claude-flow <ruv@ruv.net>

* test(sensing-server): pin XFF-immunity + no-query-token (ADR-080 #1, #3)

Findings #1 (XFF-spoofing bypass) and #3 (JWT-in-URL, CWE-598) were logged
against the Python v1 API but are VERIFIED ABSENT on the current Rust
sensing-server, so they get regression tests rather than redundant fixes:

  - #1 XFF: there is no IP-based rate-limiter or IP-allowlist to bypass, and
    neither security middleware reads a forwarded header. Added
    bearer_auth::xff_header_never_affects_auth_decision (spoofed
    X-Forwarded-For never flips a 401<->200 decision) and
    host_validation::forwarded_headers_never_bypass_host_allowlist (spoofed
    X-Forwarded-Host: localhost never lets Host: evil.com past the allowlist).

  - #3 JWT-in-URL: require_bearer reads the token only from the Authorization
    header; WS handlers take no query token; the sole Query extractor
    (EdgeRegistryParams) is a non-secret refresh flag. Added
    bearer_auth::query_string_token_is_never_accepted — ?token= / ?access_token=
    in the URL never authenticates (stays 401) while the header path still 200s.
    Verified to FAIL when a query-token path is injected into require_bearer.

Co-Authored-By: claude-flow <ruv@ruv.net>

* docs(adr-080): mark P0 security findings #1-#3 RESOLVED; close ADR-164 G11

- ADR-080: Status note + per-finding closure (#1 XFF and #3 JWT-in-URL
  verified absent + regression-pinned; #2 leaked errors fixed via the
  error_response module). Records the v1-vs-Rust boundary distinction
  explicitly: v1 paths remain archived; this closure governs the shipped
  Rust sensing-server.
- ADR-164: Gap Register G11 and the Open/Gated Backlog entry marked
  RESOLVED with the fix + branch reference.
- CHANGELOG: [Unreleased] -> ### Security entry covering all three findings.

Co-Authored-By: claude-flow <ruv@ruv.net>

* docs(adr): renumber 6 displaced ADRs to resolve duplicate-number collisions (ADR-164 G1)

Resolves the 5 duplicate ADR numbers (6 displaced files) flagged by ADR-164
Gap Register item G1. Canonical keeper per number = first file committed at
that number (date tie-broken by inbound cross-reference count / parent-appendix
relationship). Displaced files renumbered to the next free numbers (166-171):

  050 keeps provisioning-tool-enhancements (5 refs vs 1)
    -> ADR-166-quality-engineering-security-hardening
  052 keeps tauri-desktop-frontend (parent ADR)
    -> ADR-167-ddd-bounded-contexts (its appendix)
  147 keeps nvidia-cosmos/OccWorld (the actual ADR, has Status header)
    -> ADR-168-benchmark-proof (proof companion, no Status)
    -> ADR-169-adam-mode-light-theme (was untracked)
  148 keeps drone-swarm-control-system (committed #862)
    -> ADR-170-yoga-mode-pose-system (was untracked)
  149 keeps public-community-leaderboard-huggingface (committed 16:47 vs 17:38)
    -> ADR-171-swarm-benchmarking-evaluation-methodology

Updates in-file `# ADR-NNN` headers and intra-file self-references (yoga-modes

* docs(adr): repoint inbound cross-references to renumbered ADRs (166-171)

Follow-up to the ADR renumbering (ADR-164 G1). Updates every inbound reference
that pointed at a displaced ADR, disambiguating shared numbers by title/slug so
only references to the DISPLACED topic move and keeper references stay put.

ADR-168 (was 147 benchmark-proof): README, CHANGELOG, user-guide,
  proof-of-capabilities, research docs 00/03 — all path/label refs updated.
ADR-169 (was 147 adam-mode) / ADR-170 (was 148 yoga-mode): docs/adr/README index.
ADR-171 (was 149 swarm-benchmarking): all ruview-swarm eval code+docs
  (Cargo.toml, evals/, eval_swarm.rs, metrics/mod/report/runner.rs), research
  doc 03 (every §-ref matched ADR-171 sections, not AetherArena), 00-system-review,
  series README, CHANGELOG, and ADR-148's forward/"open issues" pointers.
ADR-166 (was 050 quality-engineering / security-hardening): disambiguated from the
  ADR-050 provisioning KEEPER by topic. The HMAC/secure_tdm, directory-traversal,
  bind-address, and OTA-PSK-auth references in code comments
  (wifi-densepose-hardware Cargo.toml + secure_tdm.rs, sensing-server main.rs) and
  in ADR-052-tauri / ADR-167 all describe the security-hardening ADR -> ADR-166.
ADR-167 (was 052 ddd-appendix): inbound appendix references.

Index/registry updates: docs/adr/README.md, gap-analysis/census.md (rows +
header count), gap-analysis/lens-findings.md (collision table marked RESOLVED),
and ADR-164 Gap Register G1 marked RESOLVED with the full renumber map.

Keeper references deliberately untouched: all ADR-147 OccWorld code, all ADR-148
drone-swarm code/docs, all ADR-149 AetherArena refs (incl. ADR-150's SSL/resampling
refs, which ADR-150 explicitly binds to the AetherArena benchmark), ADR-050
provisioning refs, ADR-052 tauri refs. The frozen GitHub blob URLs in
docs/adr/.issue-177-body.md (pinned to an old branch) are left as historical.

Comment-only code edits; no behavior change. wifi-densepose-hardware compiles
clean; the sensing-server build's sole blocker is the pre-existing upstream
midstreamer-temporal-compare@0.2.1 registry crate, unrelated to these edits.

Co-Authored-By: claude-flow <ruv@ruv.net>
2026-06-13 14:31:38 -04:00
..
00-system-review.md fix(adr): resolve duplicate ADR numbers + close ADR-080 security + ADR-154 M1 signal backlog (#1051) 2026-06-13 14:31:38 -04:00
01-sota-landscape-2026.md Beyond-SOTA engine/signal/train improvements: mesh partition guard, FFT CIR solver, canonical frame decoder, falsifiable occupancy benchmark, governed streaming, adapter provenance (#1018) 2026-06-11 16:08:54 -04:00
02-beyond-sota-architecture.md Beyond-SOTA engine/signal/train improvements: mesh partition guard, FFT CIR solver, canonical frame decoder, falsifiable occupancy benchmark, governed streaming, adapter provenance (#1018) 2026-06-11 16:08:54 -04:00
03-benchmark-validation-methodology.md fix(adr): resolve duplicate ADR numbers + close ADR-080 security + ADR-154 M1 signal backlog (#1051) 2026-06-13 14:31:38 -04:00
04-optimization-roadmap.md Beyond-SOTA engine/signal/train improvements: mesh partition guard, FFT CIR solver, canonical frame decoder, falsifiable occupancy benchmark, governed streaming, adapter provenance (#1018) 2026-06-11 16:08:54 -04:00
README.md fix(adr): resolve duplicate ADR numbers + close ADR-080 security + ADR-154 M1 signal backlog (#1051) 2026-06-13 14:31:38 -04:00

README.md

RuView Beyond-SOTA Research Series

Research swarm output (2026-06-09) defining what a beyond-state-of-the-art RuView implementation is, what the current system actually delivers, and the validation/benchmark/optimization evidence gathered in the same session.

Produced by a 5-agent hierarchical research swarm (system reviewer, SOTA surveyor, architect, benchmark methodologist, performance analyst) plus a validation pass run against the working tree.

Documents

Doc Scope One-line takeaway
00-system-review.md Capability audit of the current engine Signal layer is the deepest asset (ruvsense/ ≈14.4k lines, 310 in-module tests); the model tier is the emptiest (no trained checkpoint in-tree); the live 20 Hz path is the main integration gap
01-sota-landscape-2026.md Published SOTA per capability axis (web-verified) Defines the beyond-SOTA bar: 12-row capability → published SOTA → RuView-today → target table; IEEE 802.11bf-2025 is ratified and moves the moat up-stack
02-beyond-sota-architecture.md Target architecture 8 pillars (RF foundation encoder + UQ heads, differentiable RF forward model, RF-SLAM×WorldGraph loop, camera→RF distillation, swarm apertures, continual adaptation, deterministic WASM edge, NV fusion) — all landing inside existing crates, no rewrite (per ADR-136 §2.1)
03-benchmark-validation-methodology.md Test/validation/benchmark methodology 6-layer validation pyramid; 15 criterion bench targets inventoried; benchmark_baseline.json is a live-capture anchor, not a criterion baseline; statistical protocol from ADR-171 (≥10 seeds, IQM, bootstrap CIs)
04-optimization-roadmap.md Performance review + 90-day plan ISTA CIR solver is the dominant latency hazard (~1.1 GFLOP/frame at HE40); exact zero-risk wins identified; WorldGraph grows unboundedly (no eviction) — a real bug-class

Validation results (this session, 2026-06-09)

All measured on this branch (claude/ruview-beyond-sota-xgv8aq), Linux container, cargo test --workspace --exclude wifi-densepose-desktop --no-default-features (the desktop crate needs GTK system libraries absent in the container; this is an environment limitation, not a code failure).

Layer Command Result
L0 unit/integration cargo test --workspace --exclude wifi-densepose-desktop --no-default-features 154 suites, 2,797 passed, 0 failed (pre-optimization baseline; re-run post-optimization also green)
L1 deterministic proof python archive/v1/data/proof/verify.py VERDICT: PASS — hash f8e76f21a0f9852b70b6d9dd5318239f6b20cbcb4cdd995863263cecdc446f7a (bit-exact)
L2 criterion (CIR) cargo bench -p wifi-densepose-signal --bench cir_bench --no-default-features --features cir Baselines captured pre/post optimization (below)

Known pre-existing issue (not introduced here): cargo check -p wifi-densepose-mat --no-default-features fails standalone with 101 serde feature-unification errors; it builds and passes inside --workspace runs. Fixed on this branch: pub mod api (the only serde user) is now gated behind the api feature that owns the optional serde dependency; all feature combos compile.

Optimizations applied (this session)

Two exact (bit-identical float results — summation order unchanged, witness chain unaffected) optimizations from the 04 roadmap's "zero-risk" tier were implemented and verified:

  1. cir.rs warm-start precompute — the diagonal Tikhonov preconditioner diag(Φ^H Φ) + λI and its CSR matrix depend only on Φ and λ (fixed at CirEstimator::new) but were rebuilt on every frame (O(K·G) pass + CSR allocation). Moved to construction (crates/wifi-densepose-signal/src/ruvsense/cir.rs, build_warm_start_system).
  2. tomography.rs solver hoisting — the ISTA gradient Vec was allocated inside the 100-iteration loop and the Frobenius Lipschitz bound recomputed per reconstruct call; both hoisted (crates/wifi-densepose-signal/src/ruvsense/tomography.rs).

Measured impact (criterion, paired pre/post baselines, same container)

Bench Pre-opt Post-opt Change Significant?
cir_estimate/he40 12.34 ms 11.86 ms 3.9 % yes (p < 0.01)
cir_multiband_3band (30 ms group) 30.16 ms 29.72 ms 1.4 % yes (p < 0.01)
cir_multiband (142 ms group) 141.9 ms 140.1 ms 1.2 % yes (p < 0.01)
cir_estimate/ht40 11.73 ms 11.78 ms +0.4 % no (p = 0.28)
cir_estimate/he20 2.49 ms 2.49 ms 0.1 % no (p = 0.85)
cir_estimate/ht20 2.48 ms 2.58 ms +3.8 % noise — see note

Note on ht20: cir_estimator_new/ht20 (construction, which now does strictly more work) also shows "+3 %", establishing a ≈34 % container noise floor; the ht20 estimate delta is within it. The honest summary: the warm-start precompute removes 1 of ~101 O(K·G) passes per frame, so the expected gain is ≈14 % — consistent with what was measured. The dominant per-frame cost is the 100-iteration ISTA loop itself, which is exactly what the roadmap's flag-gated FFT-operator proposal (840× on the mat-vecs, requires witnessed hash regeneration) targets next.

Correctness post-optimization: wifi-densepose-signal 456 tests green; wifi-densepose-engine 11/11 green including cycle_is_deterministic and calibration_mismatch_demotes_and_witness_stable (witness-chain stability).

Headline conclusions

  1. "Beyond SOTA" is currently unfalsifiable without a real-CSI ground-truth benchmark — standing one up (per doc 03's acceptance table and ADR-171's statistical protocol) is the highest-leverage next step.
  2. The path is evolution, not rewrite: all eight architecture pillars in doc 02 land inside existing crates on the ADR-136 Stage<I,O>/FrameMeta contract spine.
  3. The biggest engineering gaps are the live 20 Hz ingest path, a trained RF encoder checkpoint, and WorldGraph retention/eviction — ahead of any frontier capability work.
  4. Determinism is the differentiator: every optimization and new pillar must preserve the witness chain; the advisory-vs-witnessed split (doc 02 §determinism) is the mechanism that lets frontier components in without breaking it.