|
|
||
|---|---|---|
| .. | ||
| 00-system-review.md | ||
| 01-sota-landscape-2026.md | ||
| 02-beyond-sota-architecture.md | ||
| 03-benchmark-validation-methodology.md | ||
| 04-optimization-roadmap.md | ||
| README.md | ||
README.md
RuView Beyond-SOTA Research Series
Research swarm output (2026-06-09) defining what a beyond-state-of-the-art RuView implementation is, what the current system actually delivers, and the validation/benchmark/optimization evidence gathered in the same session.
Produced by a 5-agent hierarchical research swarm (system reviewer, SOTA surveyor, architect, benchmark methodologist, performance analyst) plus a validation pass run against the working tree.
Documents
| Doc | Scope | One-line takeaway |
|---|---|---|
| 00-system-review.md | Capability audit of the current engine | Signal layer is the deepest asset (ruvsense/ ≈14.4k lines, 310 in-module tests); the model tier is the emptiest (no trained checkpoint in-tree); the live 20 Hz path is the main integration gap |
| 01-sota-landscape-2026.md | Published SOTA per capability axis (web-verified) | Defines the beyond-SOTA bar: 12-row capability → published SOTA → RuView-today → target table; IEEE 802.11bf-2025 is ratified and moves the moat up-stack |
| 02-beyond-sota-architecture.md | Target architecture | 8 pillars (RF foundation encoder + UQ heads, differentiable RF forward model, RF-SLAM×WorldGraph loop, camera→RF distillation, swarm apertures, continual adaptation, deterministic WASM edge, NV fusion) — all landing inside existing crates, no rewrite (per ADR-136 §2.1) |
| 03-benchmark-validation-methodology.md | Test/validation/benchmark methodology | 6-layer validation pyramid; 15 criterion bench targets inventoried; benchmark_baseline.json is a live-capture anchor, not a criterion baseline; statistical protocol from ADR-149 (≥10 seeds, IQM, bootstrap CIs) |
| 04-optimization-roadmap.md | Performance review + 90-day plan | ISTA CIR solver is the dominant latency hazard (~1.1 GFLOP/frame at HE40); exact zero-risk wins identified; WorldGraph grows unboundedly (no eviction) — a real bug-class |
Validation results (this session, 2026-06-09)
All measured on this branch (claude/ruview-beyond-sota-xgv8aq), Linux
container, cargo test --workspace --exclude wifi-densepose-desktop --no-default-features (the desktop crate needs GTK system libraries absent in
the container; this is an environment limitation, not a code failure).
| Layer | Command | Result |
|---|---|---|
| L0 unit/integration | cargo test --workspace --exclude wifi-densepose-desktop --no-default-features |
154 suites, 2,797 passed, 0 failed (pre-optimization baseline; re-run post-optimization also green) |
| L1 deterministic proof | python archive/v1/data/proof/verify.py |
VERDICT: PASS — hash f8e76f21a0f9852b70b6d9dd5318239f6b20cbcb4cdd995863263cecdc446f7a (bit-exact) |
| L2 criterion (CIR) | cargo bench -p wifi-densepose-signal --bench cir_bench --no-default-features --features cir |
Baselines captured pre/post optimization (below) |
Known pre-existing issue (not introduced here):
Fixed on this branch: cargo check -p wifi-densepose-mat --no-default-features fails standalone with 101 serde
feature-unification errors; it builds and passes inside --workspace runs.pub mod api (the only serde user) is now gated
behind the api feature that owns the optional serde dependency; all feature
combos compile.
Optimizations applied (this session)
Two exact (bit-identical float results — summation order unchanged, witness chain unaffected) optimizations from the 04 roadmap's "zero-risk" tier were implemented and verified:
cir.rswarm-start precompute — the diagonal Tikhonov preconditionerdiag(Φ^H Φ) + λIand its CSR matrix depend only on Φ and λ (fixed atCirEstimator::new) but were rebuilt on every frame (O(K·G) pass + CSR allocation). Moved to construction (crates/wifi-densepose-signal/src/ruvsense/cir.rs,build_warm_start_system).tomography.rssolver hoisting — the ISTA gradientVecwas allocated inside the 100-iteration loop and the Frobenius Lipschitz bound recomputed perreconstructcall; both hoisted (crates/wifi-densepose-signal/src/ruvsense/tomography.rs).
Measured impact (criterion, paired pre/post baselines, same container)
| Bench | Pre-opt | Post-opt | Change | Significant? |
|---|---|---|---|---|
cir_estimate/he40 |
12.34 ms | 11.86 ms | −3.9 % | yes (p < 0.01) |
cir_multiband_3band (30 ms group) |
30.16 ms | 29.72 ms | −1.4 % | yes (p < 0.01) |
cir_multiband (142 ms group) |
141.9 ms | 140.1 ms | −1.2 % | yes (p < 0.01) |
cir_estimate/ht40 |
11.73 ms | 11.78 ms | +0.4 % | no (p = 0.28) |
cir_estimate/he20 |
2.49 ms | 2.49 ms | −0.1 % | no (p = 0.85) |
cir_estimate/ht20 |
2.48 ms | 2.58 ms | +3.8 % | noise — see note |
Note on ht20: cir_estimator_new/ht20 (construction, which now does strictly
more work) also shows "+3 %", establishing a ≈3–4 % container noise floor;
the ht20 estimate delta is within it. The honest summary: the warm-start
precompute removes 1 of ~101 O(K·G) passes per frame, so the expected gain is
≈1–4 % — consistent with what was measured. The dominant per-frame cost is
the 100-iteration ISTA loop itself, which is exactly what the roadmap's
flag-gated FFT-operator proposal (8–40× on the mat-vecs, requires witnessed
hash regeneration) targets next.
Correctness post-optimization: wifi-densepose-signal 456 tests green;
wifi-densepose-engine 11/11 green including cycle_is_deterministic and
calibration_mismatch_demotes_and_witness_stable (witness-chain stability).
Headline conclusions
- "Beyond SOTA" is currently unfalsifiable without a real-CSI ground-truth benchmark — standing one up (per doc 03's acceptance table and ADR-149's statistical protocol) is the highest-leverage next step.
- The path is evolution, not rewrite: all eight architecture pillars in
doc 02 land inside existing crates on the ADR-136
Stage<I,O>/FrameMetacontract spine. - The biggest engineering gaps are the live 20 Hz ingest path, a trained RF encoder checkpoint, and WorldGraph retention/eviction — ahead of any frontier capability work.
- Determinism is the differentiator: every optimization and new pillar must preserve the witness chain; the advisory-vs-witnessed split (doc 02 §determinism) is the mechanism that lets frontier components in without breaking it.