Adds benchmarks/edge-latency/RESULTS.md (wiflow-std RESULTS style: each
measured number with reproduce command, machine, MEASURED-on-host grade,
and the honest host-vs-ESP32 / steady-state-vs-cold-start caveats) and
ADR-163 (HEADLINE: CLAIMED latency budgets -> MEASURED-on-host, closing
M5/M6 measurement debt; ESP32-on-hardware still pending).
- ADR-160 deferred 'criterion benches for process_frame budget claims'
line updated to DONE (host) with the ESP32-pending note.
- PROOF.md performance table gains the two edge-latency reproduce rows;
provenance ADR range extended to ADR-163.
- prove.sh gated section gains the edge-latency bench note (host proxy
only; not asserted, never claims the ESP32 figure).
Benches/docs only; no crate republishes.
Co-Authored-By: claude-flow <ruv@ruv.net>
Records the Milestone 7 audit: library cores are real (anti-slop positive) but
the network boundary had a CRITICAL WS auth bypass (A1) + reply-theater (A2) +
documented-but-no-op automation (A3-A7) + a network-exposed dev bin (A8), all
fixed and graded MEASURED with failing-on-old tests. Cites the NO-ACTION
security positives (uuid::v4 CSPRNG refuted-suspicion, hardened CORS,
no-traversal migrate, no-secrets-in-logs, honest HAP stub) and the deferred
backlog (plugin authority-isolation P5, sig-verification P4, HAP real pairing
P2, bounded run-modes, YAML load-at-boot).
Co-Authored-By: claude-flow <ruv@ruv.net>
- tests/honest_labeling.rs: 10 source-presence tests asserting the A1-A5 claim
invariants (disclaimers present, uncited stat removed, WEAPON_ALERT no longer
exported, med_* feature-gated, no static-mut event buffers). Each is designed to
FAIL on the pre-fix source (ADR-159 A5 manifest-roundtrip style).
- ADR-160: records the headline (0 stubs/0 theater, all real DSP -> claim-surface
honesty debt), the graded A1-A5 fixes, NO-ACTION positives, per-prefix
classification, and the DATA-GATED deferred backlog (criterion benches,
per-skill accuracy validation, wasm32 static_mut_refs CI confirmation).
- ADR-159: its deferred-backlog line "wasm-edge ... honestly labelled, not claimed"
is now actually TRUE.
Validation (all 0 failed, host --features std):
DEFAULT 615 | MEDICAL (+medical-experimental) 653 | NO-DEFAULT 615; 0 warnings.
Co-Authored-By: claude-flow <ruv@ruv.net>
Update specification.md §3.6 ONLY with an honest implementation-status note:
the matching algorithm is now implemented and tested in
v2/crates/wifi-densepose-bfld/, weights remain unvalidated design intent, and
named-identity locking is data-gated (cardiac+respiratory alone are not
separable — measured gap ~0.0005). The broader Soul Signature system remains
Pre-Implementation.
Co-Authored-By: claude-flow <ruv@ruv.net>
Documents Milestone 3 across the four acquisition crates (vitals, hardware,
wifiscan, calibration). Honest headline: this layer was already well-hardened,
so the real work is small.
- §A1 (perf, MEASURED): Vec::remove(0) O(n^2) sliding windows -> VecDeque.
End-to-end win is NULL within noise at realistic window sizes (DSP dominates);
the win is the algorithmic O(n^2)->O(n) shown in isolation. Claimed nothing
more -- the committed bench proves the null.
- §A2 (correctness): breathing partial-weights scale-mixing -> normalized by
Sigma(effective weights). Pinned by two fail-on-old tests.
- §A3 (stability): IIR resonator divergence. Corrected the research report's
physically-inaccurate trigger (divergence needs |r|>=1, i.e. bw>=4, not "r
negative"); clamp + finite-guard. Pinned by two fail-on-old tests.
- §B1 hardening on an unreachable (already-gated) truncation path -- disclosed.
- §B4 (constant-time HMAC compare) DEFERRED: not worth a new direct `subtle`
dependency for an 8-byte LAN sync-beacon tag.
- MEASURED negative-results section (the centerpiece): esp32_parser length gate,
sync_packet infallible slices, the whole ieee80211bf validate-on-deserialize /
no-panic-FSM / single-role / SBP-single-evaluate model, secure_tdm HMAC+replay,
netsh_scanner fixed-argv + Option parse, geometry_embedding MAX_COORD_M -- each
cited file:line, all NO-ACTION.
- SOTA landscape: deep-CSI vitals (DATA-GATED), 802.11bf conformance (CLAIMED,
non-public suite), per-room calibration (CLAIMED on numbers), native wlanapi
FFI multi-BSSID (CLAIMED-unmeasured -- explicitly NOT claiming the 10x). Mostly
NO-ACTION / ACCEPTED-FUTURE.
- Deferred backlog (§8): nothing silently dropped.
Validation: cargo test --workspace --no-default-features = 3054 passed / 0
failed; python verify.py = VERDICT PASS (hash unchanged, Rust-only changes).
Co-Authored-By: claude-flow <ruv@ruv.net>
Records the integrity-critical fixes (unified canonical metric, leak-free
subject-disjoint split + synthetic-val disclosure, rapid_adapt real gradients,
proof margin + committed-hash rigor), the Tier-2 correctness/security fixes, the
measured Tier-3 perf win, the NN SOTA landscape graded MEASURED/CLAIMED/
THEORETICAL (GraphPose-Fi as top ACCEPTED-future candidate; INT4; CSI-JEPA-vs-MAE
with the honest "no JEPA/MAE-on-WiFi-pose yet" caveat; "Mamba-CSI-pose does not
exist"), and the ~45-finding deferred backlog. Discloses the libtorch/tch-gating
limitation and that the Rust proof is honestly in SKIP until a baseline is
committed.
Co-Authored-By: claude-flow <ruv@ruv.net>
Records Milestone-0 of the signal/DSP beyond-SOTA sweep with full PROOF
discipline (MEASURED vs CLAIMED vs THEORETICAL grading throughout):
- §2 discloses the headline anti-slop finding: the ADR-134 CIR coherence gate
was DEAD in production (canonical-56 frames -> SubcarrierMismatch -> silent
freq-domain fallback for every frame). Documents the canonical56() fix + the
4 committed proof tests.
- §3 NaN/inf adversarial bypass; §4 divide-by-(n-1) window trio.
- §5 the two MEASURED perf wins with before/after medians + reproduce commands.
- §6 per-module SOTA landscape, evidence-graded: deep-unfolded ISTA/LISTA for
CSI->CIR (~3 dB NMSE, MEASURED, arXiv 2211.15440 + 2502.05952), diffusion CIR
prior (public weights, MEASURED), Wi-Spoof adversarial eval (MEASURED, arXiv
2511.20456), Bayesian multi-AP fusion (CLAIMED, no code, 2512.02462),
coherence gating + RF intention-lead (THEORETICAL).
- §7 roadmap: LISTA-for-CIR as the top ACCEPTED-future item (M effort; the ISTA
+ Phi already exist in cir.rs) — proposed, NOT implemented this milestone —
plus the explicit deferred-findings backlog (the ~45 review findings not
fixed here, graded P1/P2/P3) so nothing is silently dropped, with a
horizon-ledger DONE-vs-DEFERRED one-liner.
Co-Authored-By: claude-flow <ruv@ruv.net>
* docs(research): add RuView beyond-SOTA system review (00)
First document of the beyond-SOTA research series: capability audit of
the current RuView engine with role-to-crate maturity matrix, ruvsense
module inventory, gap analysis, and risk register.
https://claude.ai/code/session_01MjBucx95K4BuUxZi8NWwRH
* docs(research): add beyond-SOTA architecture design (02, in progress)
https://claude.ai/code/session_01MjBucx95K4BuUxZi8NWwRH
* docs(research): finalize beyond-SOTA architecture (02)
https://claude.ai/code/session_01MjBucx95K4BuUxZi8NWwRH
* docs(research): add benchmark/validation methodology snapshot (03)
https://claude.ai/code/session_01MjBucx95K4BuUxZi8NWwRH
* docs(research): add beyond-SOTA series index with validation results; changelog
README index ties the 5 research docs together with the session's
measured validation evidence: 2,797 workspace tests / 0 failed, Python
proof PASS (bit-exact), and paired pre/post criterion CIR benchmarks.
https://claude.ai/code/session_01MjBucx95K4BuUxZi8NWwRH
* perf(signal): precompute CIR warm-start system; hoist tomography solver allocs
Exact, determinism-safe optimizations (bit-identical float results):
- cir.rs: diag(PhiH Phi)+lambda*I and its CSR matrix depend only on Phi
and lambda (fixed at CirEstimator::new) but were rebuilt every frame
(O(K*G) pass + CSR allocation). Now built once in new() via
build_warm_start_system; summation order unchanged.
- tomography.rs: ISTA gradient buffer hoisted out of the 100-iteration
loop (fill(0.0) reset) and the Frobenius Lipschitz bound moved from
per-reconstruct to construction.
Verified: signal 456 tests green; engine 11/11 green including
cycle_is_deterministic and witness-stability tests. Criterion paired
pre/post: cir_estimate/he40 -3.9% (p<0.01), multiband -1.2/-1.4%.
https://claude.ai/code/session_01MjBucx95K4BuUxZi8NWwRH
* fix(worldgraph): bound SemanticState growth with deterministic retention
StreamingEngine::process_cycle appended one SemanticState belief per cycle
with no eviction — ~1.7M nodes/day at 20 Hz (beyond-SOTA roadmap finding #6).
Add WorldGraph::prune_semantic_states(max): deterministic eviction of the
oldest beliefs by (valid_from_unix_ms, id); structural nodes (rooms, zones,
sensors, anchors, tracks, events) are never eligible. Wire it into the
engine after each belief append (DEFAULT_SEMANTIC_RETENTION = 7,200, ~6 min
at 20 Hz; set_semantic_retention to tune). The WorldGraph holds current
beliefs; durable history is the recorder's job, so no audit data is lost.
3 new tests: end-to-end bounded growth, oldest-only eviction, deterministic
equal-timestamp tie-break. Workspace gate: 2,865 passed, 0 failed.
https://claude.ai/code/session_01MjBucx95K4BuUxZi8NWwRH
* feat(sensing-server): route live frames through the governed StreamingEngine
Closes the live-trust-path gap (ADR-136 section 8, beyond-SOTA system review):
the running server fused live CSI with the bare MultistaticFuser, while the
privacy/provenance/witness control plane (ADR-135..146) only ever ran on
synthetic in-test frames. The privacy control plane was therefore bypassable
on the real path.
New engine_bridge module drives StreamingEngine::process_cycle from the
server's live NodeState map, reusing the existing NodeState -> MultiBandCsiFrame
conversion. It lazily wires each contributing node as a WorldGraph sensor
(idempotent), bounds belief growth via the retention cap, and forwards explicit
timestamps/calibration ids so the path stays deterministic and replayable.
Wired additively into both live ESP32/WiFi fusion sites in main.rs via a
split-borrow off the write guard, so person-count behavior is unchanged; the
latest BLAKE3 witness is stored on AppState. Every published belief now carries
evidence + model + calibration + privacy decision and a deterministic witness.
Adds wifi-densepose-engine/-worldgraph/-bfld/-geo deps. 6 new bridge tests
(witnessed belief with full provenance, cross-run determinism, idempotent node
registration, retention bound, privacy-mode propagation). sensing-server suite
430+128 green; workspace gate 2,904 passed / 0 failed.
https://claude.ai/code/session_01MjBucx95K4BuUxZi8NWwRH
* feat(train): falsifiable occupancy benchmark with anti-overfitting gate
Makes the presence/person-count "beyond SOTA" claim falsifiable in code
instead of aspirational (the unfalsifiability gap from the beyond-SOTA system
review). occupancy_bench grades predictions vs ground truth and gates a SOTA
claim behind one claim_allowed invariant requiring ALL of:
- DataProvenance::Measured — synthetic/mock data is scorable for regression
but never claimable (anti-mock-contamination; the CLAUDE.md Kconfig-bug
lesson made structural).
- A leak-free EvalSplit — validate() refuses any split where a subject OR
environment id appears in both train and test (subject leakage /
per-environment overfitting).
- n_test >= min_test_samples (small-N guard).
- Presence F1 whose bootstrap-CI lower bound (deterministic seeded splitmix64)
clears the threshold — not the point estimate.
- Count MAE within threshold.
The claim string is unreadable except through the gate (NO_CLAIM otherwise),
same discipline as the ruview-gamma acceptance gate. What remains is data, not
method: a frozen, SHA-pinned, subject/environment-disjoint measured replay set
turns the claim into a passing/failing test.
Lives in wifi-densepose-train (the eval bounded context, alongside ablation/
eval/metrics). 10 tests cover each refusal path; warning-clean under the
crate's missing_docs lint. Workspace gate 2,914 passed / 0 failed. Doc 03
updated.
https://claude.ai/code/session_01MjBucx95K4BuUxZi8NWwRH
* feat(engine): per-room adapter provenance + drift-to-recalibration advisor
Closes the trust-chain gap where an ~11 KB per-room LoRA adapter (ADR-150
section 3.4) could silently change inference without the witness noticing:
provenance carried only "rfenc-v<N>" with no notion of adapter identity.
- StreamingEngine::set_room_adapter(AdapterInfo): pins the adapter's
content-derived id into provenance model_version
("rfenc-v1+adapter:<id>") — and therefore into the BLAKE3 witness — so
swapping or clearing adapter weights always shifts the witness. Engine test
proves base -> adapter -> other-adapter -> cleared all witness differently
and cleared == base.
- RecalibrationAdvisor: recommends re-running the ADR-135 empty-room baseline
/ refitting the room adapter on sustained low fusion coherence (streak
threshold, default 60 cycles ~ 3 s at 20 Hz) or an ADR-142 change-point.
Surfaced as TrustedOutput::recalibration_recommended, stored on the
sensing-server AppState alongside the witness at both live fusion sites.
- Bridge plumbing: EngineBridge::{set_room_adapter, clear_room_adapter} +
live-path test that the adapter id flows into the live witness.
Scope note (honest): this is the deployable provenance/trigger half of the
"retrained model" roadmap item. Fitting the adapter itself runs in the
existing external calibration service (aether-arena/calibration/); a trained
RF-encoder checkpoint still does not exist in-tree.
Engine 15 tests, bridge 7 tests. Workspace gate: 2,918 passed / 0 failed.
https://claude.ai/code/session_01MjBucx95K4BuUxZi8NWwRH
* fix(mat): gate api module behind its feature — standalone no-default-features builds
pub mod api was unconditional while its only dependency, serde, is optional
behind the 'api' feature, so any build without default features failed with
101 unresolved-serde errors (masked in --workspace runs by feature
unification). The api module and its create_router/AppState re-export are now
cfg(feature = "api")-gated with docsrs annotations.
All combos compile: bare --no-default-features (was 101 errors, now 0),
--no-default-features --features api, and full default (177 tests pass).
Workspace gate: 2,918 passed / 0 failed.
https://claude.ai/code/session_01MjBucx95K4BuUxZi8NWwRH
* perf(signal): opt-in FFT operator for the CIR ISTA solver (8-14x measured)
Phi is a sub-DFT, so each ISTA mat-vec can run as one length-G FFT
(O(G log G)) instead of a dense O(K*G) product — the dominant-latency-hazard
finding from the beyond-SOTA optimization roadmap.
New CirConfig::fft_operator, default FALSE: the dense path stays the
bit-exact witness default. The FFT evaluates the same sums in a different
order, so enabling it shifts float results in the last bits and requires
regenerating any pinned witness — strictly opt-in per deployment.
FftOperator (rustfft, planned once at CirEstimator::new, scratch buffers
reused across the ISTA loop) dispatches inside ista_solve:
Phi x = scale * forward-FFT(x) sampled at bins (k_idx mod G)
Phi^H v = scale * unnormalised inverse-FFT of v scattered into those bins
Warm-start and Lipschitz estimation stay dense at construction.
Measured (criterion, same run, same machine):
ht20: 2.22 ms -> 265 us (8.4x)
ht40: 10.26 ms -> 717 us (14.3x)
The real HE40 grid (K=484, G=1452) scales further per the O(K*G)/O(G log G)
ratio.
3 new tests: FFT<->dense matvec equivalence to float tolerance on ht20 and
he40 grids; end-to-end dominant-tap agreement on a single-path frame; all
default configs keep FFT off. New cir_estimate_fft bench group.
Workspace gate: 2,921 passed / 0 failed (default path bit-exact, witnesses
unchanged).
https://claude.ai/code/session_01MjBucx95K4BuUxZi8NWwRH
* feat(core): canonical frame decoder — capture-to-claim replay (ADR-136)
The encode half of the ADR-136 frame contract existed (ComplexSample,
to_canonical_bytes, witness_hash) but there was no decoder: a captured
canonical frame could be witnessed but never reconstructed, blocking
replay-from-capture.
CsiFrame::from_canonical_bytes is the exact inverse: same id, metadata,
complex payload, and witness hash (tested as the round-trip law AC7 — the
replayed frame re-encodes byte-identically). Amplitude/phase are recomputed
from the payload (projections, not independent state). Every malformed-input
class fails closed (AC8): header truncation -> Truncated, payload truncation
-> PayloadMismatch, unknown discriminants, non-UTF-8 device id, trailing
bytes. Nil calibration uuid decodes as None per the documented encoding.
Core: 36 tests pass. Workspace gate: 2,937 passed / 0 failed.
https://claude.ai/code/session_01MjBucx95K4BuUxZi8NWwRH
* feat(engine): dynamic min-cut mesh partition guard (ruvector-mincut)
Maintains an exact min-cut over the live mesh coupling graph — nodes are
sensing nodes, coupling is the product of fusion attention weights — and
surfaces per cycle, as TrustedOutput::mesh:
- cut value: the global "how close is the array to partitioning" number,
a structural measure per-node heuristics miss;
- weak side: which specific nodes would split off (failure/jamming triage,
feeds ADR-032 posture);
- at-risk flag: counts as a structural event for the drift->recalibration
advisor (alongside ADR-142 change-points).
Degenerate cases fail toward risk: a node with zero coupling is reported as
already partitioned (cut 0, that node as the weak side).
Measured cost policy (criterion, 12-node mesh — the honest part):
- weights quantized (1/64) + change-gated: steady-state cycles do ZERO graph
work and reuse the cached cut (~7.3 us, ~23x cheaper than building);
- on any real change a full exact rebuild (~171 us) is used, because ONE
DynamicMinCut delete+insert measured ~240 us — the subpolynomial machinery
amortizes on much larger graphs, so rebuild-on-change is the measured
optimum at mesh scale (one-edge case -28% after switching policy);
- full process_cycle with the guard: ~33 us for 4 nodes vs the 50 ms budget.
9 mesh_guard tests (weak-node detection, steady-state zero updates,
sub-quantum gating, join/drop rebuild, determinism, disconnection) + an
engine-level wiring test (down-weighted node -> weak side -> recalibration).
Engine 24 tests; workspace gate 2,946 passed / 0 failed.
https://claude.ai/code/session_01MjBucx95K4BuUxZi8NWwRH
* feat(engine): mesh partition risk demotes privacy + enters the witness (ADR-032)
Completes the mesh-guard integration: its at_risk signal was advisory-only
(fed the recalibration advisor). It now also contributes to the ADR-141
privacy demotion alongside fusion- and array-level contradictions — a mesh
close to partitioning makes the fused belief less trustworthy, so the cycle
emits at a more restricted class (monotonic; information only removed).
Because effective_class feeds the BLAKE3 witness, a fragmenting array now
shifts the witness: partition risk is auditable, not just logged. The mesh
computation moved ahead of the demotion step in process_cycle; mesh_guard_mut
exposes risk-threshold tuning.
Test: a forced-risk 3-node cycle demotes PrivateHome Anonymous->Restricted
and shifts the witness vs a clean baseline. Engine 25 tests; workspace gate
2,947 passed / 0 failed.
https://claude.ai/code/session_01MjBucx95K4BuUxZi8NWwRH
* fix: public-PR review findings — privacy-path honesty, gate holes, mesh-guard cliff
- sensing-server: engine errors logged+counted (no silent swallow), trust
state exposed via status surface, privacy-demotion claims aligned with
the actual parallel-audit-path behavior
- occupancy_bench: vacuous-F1 hole closed (degenerate test sets fail with
their own criterion); CI-lower-bound test made probative
- mesh_guard: quantization scaled to observed coupling range — >=65-node
balanced meshes no longer permanently at_risk (regression test)
- engine: both wiring tests made probative (same-topology witness compare,
deterministic risk-crossing fixture)
- mat: axum/tokio optional behind api; real serde feature (api enables it)
- core: canonical decoder strict (non-zero reserved bytes and nil UUID
rejected — injective on accepted domain, forged-bytes tests)
- CHANGELOG: un-spliced the FFT/adapter bullet mangle
Co-Authored-By: claude-flow <ruv@ruv.net>
* chore: strip private-track references for public PR
Reword the occupancy-benchmark changelog bullet to drop a cross-reference
to the private research track, and restore the WorldGraph retention bullet
header that was glued onto the preceding MAT bullet.
Co-Authored-By: claude-flow <ruv@ruv.net>
* chore: lockfile refresh for cherry-picked feature set
Co-Authored-By: claude-flow <ruv@ruv.net>
---------
Co-authored-by: Claude <noreply@anthropic.com>
* docs(adr): ADR-151 — Per-Room Calibration & Specialized Model Training
Room-first calibration -> bank of small specialised ruVector models
(breathing, heartbeat, restlessness, posture, presence, anomaly) distilled
from the frozen Hugging-Face-published RF Foundation Encoder (ADR-150).
Four-stage local-first pipeline: baseline (ADR-135 environmental fingerprint)
-> guided enrollment (NEW EnrollmentProtocol, clean anchors not hours) ->
feature extraction (reuse signal_features + ruvsense) -> specialist bank
training (rapid_adapt LoRA heads, RVF storage, HNSW prototypes).
Invariants: specialisation over scale; local heads over a shared public base;
honest STALE degradation on baseline drift. Indexes ADR-149/150/151.
Co-Authored-By: claude-flow <ruv@ruv.net>
* feat(cli): calibration HTTP API for UI-driven baseline capture (ADR-135/151)
Adds `wifi-densepose calibrate-serve` — an Axum HTTP API that wraps the
ADR-135 CalibrationRecorder so a UI (or any client) can drive an empty-room
baseline capture remotely. Stage 1 ("teach the room") of the ADR-151 room
calibration & training pipeline.
A single background task owns the UDP socket (ESP32 0xC511_0001 frames) and
the optional active recorder; HTTP handlers talk to it over an mpsc command
channel and read a shared status snapshot, keeping the &mut recorder
lock-free. CORS permissive so a browser UI can call it.
Endpoints (/api/v1/calibration/*):
GET /health liveness + UDP ingest stats (frames_seen, streaming)
POST /start { tier?, duration_s?, room_id?, min_frames? }
GET /status live progress (state, frames, progress, z, eta) — poll for UI
POST /stop finalize the current session early
GET /result finalized baseline summary (amp/phase-dispersion averages)
GET /baselines list persisted baseline .bin files
Reuses the existing calibrate.rs ESP32 wire parser (made pub(crate)); honest
abort when <10 frames arrive in the window (e.g. ESP32 not streaming).
Verified end-to-end over loopback: start -> 300 replayed HT20 frames ->
state=complete, 52-subcarrier baseline, phase_dispersion_avg=0.00096
(concentrated/valid), persisted to disk; all 6 endpoints exercised.
CLI: 19 tests pass; crate builds clean.
Co-Authored-By: claude-flow <ruv@ruv.net>
* test(cli): firewall-free CSI UDP relay for local Windows ESP32 testing
Windows Defender blocks inbound LAN UDP to a freshly-built binary without an
admin allow-rule; python.exe is already allowed. This relay binds the public
CSI port and forwards each datagram verbatim to a loopback port where
`calibrate-serve --udp-bind 127.0.0.1 --udp-port 5006` listens (loopback is
firewall-exempt). No admin required.
Validated: ESP32-format 0xC5110001 frames -> :5005 -> relay -> :5006 ->
calibrate-serve -> state=complete, 52-subcarrier baseline,
phase_dispersion_avg=0.00098 (clean). Completes the no-admin live-test path.
Co-Authored-By: claude-flow <ruv@ruv.net>
* docs(changelog): record ADR-151 calibration API (calibrate-serve)
Co-Authored-By: claude-flow <ruv@ruv.net>
* feat(calibration): ADR-151 Stages 2–5 — enrollment, extraction, specialist bank, runtime
New crate wifi-densepose-calibration implementing the per-room pipeline beyond
Stage-1 baseline:
- anchor.rs: guided-anchor sequence + event-sourced EnrollmentSession (Stage 2)
- enrollment.rs: AnchorQualityGate + AnchorRecorder — gates anchors against the
ADR-135 baseline deviation (presence/motion), re-prompts bad captures
- extract.rs: Features + AnchorFeature — autocorrelation periodicity (breathing/
HR bands), variance/motion (Stage 3)
- specialist.rs: 6 small room-calibrated models — presence (learned threshold),
posture (nearest-prototype), breathing/heartbeat (band periodicity),
restlessness (calm/active normalization), anomaly (novelty vs anchors) (Stage 4)
- bank.rs: SpecialistBank — train/persist + baseline-drift STALE invalidation
- runtime.rs: MixtureOfSpecialists — presence short-circuit + anomaly veto +
stale flagging (Stage 5)
Statistical heads make the pipeline runnable/validatable today; the ADR-150 HF
RF Foundation Encoder backbone is the documented upgrade path. 29 unit tests pass.
Co-Authored-By: claude-flow <ruv@ruv.net>
* feat(cli): wire ADR-151 enroll / train-room / room-status / room-watch
Integrates the wifi-densepose-calibration crate into the CLI as four
subcommands driving the full Stage 2–5 pipeline against a live ESP32 raw-CSI
stream (edge_tier=0):
- enroll: walks the guided anchor sequence, gates each capture against the
ADR-135 baseline deviation (re-prompts bad anchors), writes labelled features
- train-room: fits the SpecialistBank from the enrollment, persists JSON
- room-status: prints a trained bank's summary
- room-watch: live mixture-of-specialists readout (presence/posture/breathing/
heart/restless) over a rolling window, with anomaly veto + STALE flagging
Per-frame scalar is the mean CSI amplitude (carries presence/motion + breathing
modulation). Validated end-to-end on the live ESP32 (COM8, edge_tier=0): the
real parser → feature extraction → runtime detected breathing (~16–31 BPM) on
hardware. Full multi-anchor enrollment accuracy requires the operator to perform
the poses; phase-based breathing extraction is a noted refinement.
48 tests pass (29 calibration + 19 CLI).
Co-Authored-By: claude-flow <ruv@ruv.net>
* docs(adr-151): mark Stages 1–5 implemented; expand CHANGELOG
Co-Authored-By: claude-flow <ruv@ruv.net>
* fix(cli): keep proven mean-amplitude carrier for room features
The max-variance-subcarrier carrier locked onto motion artifacts (not
breathing) and also had an out-of-bounds bug on variable CSI subcarrier
counts. Reverted to the mean-amplitude carrier, which is validated live to
detect breathing. Phase-based extraction on a stable subcarrier remains the
proper higher-SNR refinement (ADR-151 §4).
Co-Authored-By: claude-flow <ruv@ruv.net>
* feat(calibration): multistatic fusion of co-located nodes (ADR-029/151)
MultiNodeMixture fuses several co-located nodes (each with its own
room-calibrated SpecialistBank) into one RoomState:
- presence: OR across nodes (any node seeing a person wins)
- posture/breathing/heartbeat: highest-confidence node (best viewpoint)
- restlessness/anomaly: max across nodes
- veto: any node's physically-implausible signal vetoes the room's vitals
(anti-hallucination, same as single-node runtime) + presence short-circuit
- stale: any node's STALE flag propagates
Same-room multistatic only; cross-room is federation (ADR-105), not fusion.
6 unit tests (presence OR, best-confidence breathing, single-node veto,
staleness). 35 calibration tests pass.
Co-Authored-By: claude-flow <ruv@ruv.net>
* feat(cli): multistatic room-watch — fuse co-located nodes (ADR-029/151)
`room-watch --node-bank N:path` (repeatable) groups live CSI frames by node_id
and fuses per-node banks via MultiNodeMixture. Validated live on COM8 (node 9,
edge_tier=0): frames grouped + fused end-to-end. True 2-node fusion is covered
by unit tests; a second raw-CSI node is the hardware blocker. 54 tests pass.
Co-Authored-By: claude-flow <ruv@ruv.net>
* docs(integration): calibration → cognitum-v0 appliance integration overview
Detailed cross-repo integration spec for cognitum-one/v0-appliance: data
contracts (CSI wire format, ADR-135 baseline binary, enrollment/bank/RoomState
JSON schemas), calibrate-serve HTTP API, public crate API, Pi5+Hailo tiering,
and a 5-step appliance integration plan. Grounded in the verified cognitum-v0
inventory (aarch64, cargo 1.96, HAILO10H, ruview-vitals-worker:50054).
Co-Authored-By: claude-flow <ruv@ruv.net>
* fix(calibration): address PR review — aarch64 decouple, API auth, path traversal, throttle
Resolves the review on #989:
- **Cross-compile (the appliance blocker):** make wifi-densepose-mat optional
and feature-gate it (`mat`), so `cargo build -p wifi-densepose-cli
--no-default-features` excludes the mat→nn→ort(ONNX)→openssl-sys chain.
Verified: `cargo tree --no-default-features` shows 0 ort/openssl deps →
calibration cross-compiles clean for the Pi.
- **Security (must-fix before LAN):**
- `--token` / CALIBRATE_TOKEN bearer-auth middleware on every route; warns if
bound non-loopback without a token.
- sanitize client-supplied `room_id` to [A-Za-z0-9_-] (≤64) before it reaches
the baseline write path — kills the `../` file-write primitive. + test.
- **Perf:** stop locking shared status + cloning SessionStatus on every UDP
frame — counters/snapshot flush on the 200 ms tick instead (no CPU
starvation under flood). finalize write moved to async `tokio::fs::write`.
- **Docs:** ADR-151 STALE wording matches the impl (baseline-id change;
drift-threshold = P6 refinement); integration doc gets the
`--no-default-features` build + auth/sanitize notes.
35 calibration + 15 CLI tests (no-default) / 20 CLI (default) pass.
Co-Authored-By: claude-flow <ruv@ruv.net>
* docs(worldgraph,worldmodel): add crates.io READMEs
Plain-language overviews + feature lists, comparison tables (symbolic graph vs
predictive occupancy; graph vs grid vs event-log), usage, and technical
details. Adds readme = "README.md" to both manifests so they render on
crates.io on the next release.
Co-Authored-By: claude-flow <ruv@ruv.net>
* release: worldgraph & worldmodel 0.3.1 (READMEs on crates.io)
Co-Authored-By: claude-flow <ruv@ruv.net>
* docs: precise calibration validation scope (capture+API+auth proven; clean enroll→train→infer not yet on-target)
Aligns ADR-151 §7 + the appliance integration doc with the PR #989 scope
clarification: nothing has run a clean baseline → enroll → train → infer on
live CSI; the live breathing read used the stateless head, not a trained bank.
Adds --source-format adr018v6 to the backlog.
Co-Authored-By: claude-flow <ruv@ruv.net>
* feat(calibrate-serve): live GET /room/state endpoint (mixture over CSI window)
Adds a live RoomState readout over HTTP — the appliance UI's main need. The
ingest task maintains a rolling per-frame scalar window (flushed on the 200 ms
tick, no per-frame lock); the handler loads a bank (resolved as a sanitized
name under output_dir — same path-traversal defense as room_id), runs the
MixtureOfSpecialists over the window, returns RoomState JSON.
Validated live (ESP32-S3 via relay): breathing 14-19 BPM over HTTP; a
bank=../../etc/passwd query is neutralized to 'etcpasswd' (no traversal).
Co-Authored-By: claude-flow <ruv@ruv.net>
* feat(calibrate-serve): POST /room/train + fix AnchorLabel JSON to snake_case
- POST /api/v1/room/train: { room_id, baseline_id, anchors[] } → trains a
SpecialistBank and persists it as <output_dir>/<room_id>.json (path-sanitized),
readable via /room/state?bank=<room_id>. Completes the HTTP train→infer loop.
- Fix data-contract bug: AnchorLabel serialized as PascalCase variant names
(serde default) while as_str() + the integration doc used snake_case. Added
#[serde(rename_all = "snake_case")] so the JSON wire format matches the
documented contract (empty/stand_still/…). Locked with a roundtrip test.
Validated live (ESP32-S3): POST train (4 anchors → 6 specialists, persisted) →
GET /room/state returns RoomState with the trained presence/restlessness; the
synthetic-vs-real scale mismatch correctly triggers the anomaly veto. 36
calibration tests pass.
Co-Authored-By: claude-flow <ruv@ruv.net>
* feat(calibrate-serve): live enroll-over-HTTP (POST /enroll/anchor + /enroll/status)
Closes the last HTTP gap — the appliance can now drive the ENTIRE calibration
pipeline over HTTP without the CLI:
baseline (start/stop) -> enroll/anchor x8 -> room/train -> room/state
- POST /enroll/anchor { room_id, baseline, label, duration_s? }: the ingest task
loads the baseline (sanitized name under output_dir), captures the anchor for
the duration against it (AnchorRecorder + per-frame series), runs the quality
gate, and on completion replies with the verdict + accumulates the AnchorFeature
in an in-server enrollment map keyed by room_id. Re-prompts on rejection.
- GET /enroll/status?room=<id>: accepted anchors, next, complete.
- POST /room/train now falls back to the in-server enrollment when anchors[] is
omitted.
Validated live (ESP32-S3): capture baseline -> enroll stand_still (271 frames,
6s) -> gate correctly rejects "no person detected (presence_z 0.90 < 1.50)"
relative to a same-occupancy baseline (a clean empty-room baseline is the
documented on-target prerequisite). Builds clean; CLI tests pass.
Co-Authored-By: claude-flow <ruv@ruv.net>
* test(calibrate-serve): HTTP integration tests for the room/enroll endpoints
Factor the router into build_router() (shared by execute + tests) and add
tower-oneshot integration tests (no network/ingest needed):
- health + descriptor → 200
- POST /room/train persists the bank; GET /room/state → 200; train with no
anchors/enrollment → 400
- path-traversal: /room/state?bank=../../etc/passwd → 404 (sanitized, never
reads outside output_dir)
- enroll/status empty; /enroll/anchor with an unknown label → 400
CI regression coverage for the endpoints added this session. 18 CLI tests pass.
Co-Authored-By: claude-flow <ruv@ruv.net>
* fix(mat): make serde non-optional — unblocks `cargo test --workspace --no-default-features`
Making wifi-densepose-mat optional in the CLI (for the aarch64/ort decouple)
exposed a latent feature bug: mat's `api` module compiles unconditionally and
uses serde, but `serde` was an optional dep enabled only via the `api`/`serde`
features. Previously the CLI's *unconditional* mat dependency enabled those
features transitively, so `--workspace --no-default-features` still got serde;
once mat became optional+gated, the workspace build lost it →
`error[E0432]: unresolved import serde` across mat's api/* (CI red).
mat already pulls serde_json + axum unconditionally, so making `serde`
non-optional has no real cost and restores the workspace build. Does NOT affect
the aarch64 CLI build (mat isn't built there at all): verified
`cargo tree -p wifi-densepose-cli --no-default-features` still shows 0
ort/openssl deps, and `cargo test --workspace --no-default-features` compiles
clean.
Co-Authored-By: claude-flow <ruv@ruv.net>
* docs(claude.md): add wifi-densepose-calibration to crate table (pre-merge)
Co-Authored-By: claude-flow <ruv@ruv.net>
* docs(adr): ADR-152 — WiFi-pose SOTA 2026 intake (geometry-conditioned calibration, external benchmarks, encoder recipe)
Records the 2026-06-10 deep-research run (22 sources, 110 claims, 25
adversarially verified: 24 confirmed / 1 refuted) and the decisions it
implies:
- §2.1 ACCEPTED: geometry-condition the ADR-151 calibration system —
NodeGeometry at enrollment, geometry embeddings for future LoRA heads,
PerceptAlign-style two-checkerboard camera↔WiFi alignment for the
ADR-079 supervised path. PerceptAlign (MobiCom'26) names the failure
mode ("coordinate overfitting") that matches our own ADR-150 cross-
subject collapse.
- §2.2 ACCEPTED: benchmark protocol vs external "WiFlow-STD (DY2434)"
(claimed 97.25% PCK@20, Apache-2.0 weights+dataset) with a no-citation
rule until measured on our 17-keypoint ESP32 eval set. Name collision
with our internal WiFlow is disambiguated.
- §2.3 ACCEPTED: amend ADR-150 training recipe per UNSW MAE study —
80% masking, (30,3) patches, data-over-capacity priority (log-linear,
unsaturated at 1.3M samples).
- §2.4 watch items: IEEE 802.11bf-2025 published 2025-09-26;
esp_wifi_sensing as external presence baseline (drop-in claim REFUTED
0-3); ZTECSITool 160MHz/512-subcarrier anchor node (procurement-gated).
- §2.5 NOT adopted: non-WiFi "foundation model" papers; DensePose-UV
(no 2025-2026 work does UV regression from commodity WiFi).
Every number is evidence-graded CLAIMED vs MEASURED in the source
register. Re-check horizon 2026-12.
Co-Authored-By: RuFlo <ruv@ruv.net>
* test(calibration): full-loop integration test — baseline→enroll→train→infer proven in-process (ADR-151 §7 gap, software half)
Closes the software half of PR #989's headline validation gap: the
complete calibration loop had never run end-to-end anywhere, even
in-process. tests/full_loop.rs (412 lines, deterministic xorshift32
room simulator, HT20/52-subcarrier/20Hz, same fingerprint family as
the ADR-135 roundtrip test) now drives the CLI's exact stage order
through the public API:
1. baseline — 600 static frames, zero motion flags post-warmup,
calibration_uuid() exactly as the CLI derives it
2. enroll — all 8 AnchorLabel::SEQUENCE anchors through
AnchorQualityGate::default(), session is_complete()
3. extract — AnchorFeature::from_series recovers injected 0.25Hz
and 0.125Hz breathing within ±0.04Hz
4. train — SpecialistBank::train fits all 6 specialists; JSON
round-trip and the runtime consumes the RELOADED bank
5. infer — positive: never-enrolled 0.30Hz subject reads present,
18±2 BPM; negative: empty window reads absent;
degradation: foreign baseline_id flags STALE
Seed-robust (5 seeds), passes with and without default features:
36 unit + 1 integration green.
Validation docs updated (ADR-151 §7 + integration doc §7 matrix): what
remains is strictly the on-target hardware session (real CSI, physically
empty room, operator performing the guided anchors). Three behavioral
findings from building the test are recorded for pre-session triage:
z-band squeeze between baseline motion flagging (z>2.0) and the still-
anchor gate (presence_z≥1.5) — likeliest on-hardware enroll failure;
variance-only PresenceSpecialist missing motionless-person mean shift;
ungated breathing_hz/heart_hz in noise-window embeddings.
Co-Authored-By: RuFlo <ruv@ruv.net>
* fix(calibration): close all four ADR-152 behavioral findings pre-hardware-session
The full-loop integration test surfaced three findings; fixing the third
exposed a fourth. All four are fixed and regression-guarded:
1. z-band squeeze (enrollment.rs) — anchor motion is now measured from
frame-to-frame deltas of the deviation series (|Δz| > Z_DELTA_MOTION
0.5 ∨ |Δφ| > π/6), not from the absolute motion_flagged, which fires
at amplitude_z_median > 2.0 vs the EMPTY baseline and so conflated
presence strength with motion. A strongly-reflecting still person
(z = 3.0 — every frame flagged by the old heuristic) now enrolls.
The old unit tests mocked (z=3.0, motion=false), a combination the
real deviation() can never emit — which is exactly how the squeeze
hid; tests now derive the flag from z the way the producer does.
2. variance-only presence (specialist.rs) — PresenceSpecialist gains a
mean-shift channel: present when variance > threshold OR
|mean − empty_mean| > mean_dist_threshold (trained at half the
empty→occupied mean distance, None when the means don't separate).
Detects the motionless person whose body raises the scalar mean but
not its variance. Old persisted banks deserialize with the channel
inert (serde default None) — variance-only behavior preserved,
proven by a fixture test against pre-change JSON.
3. ungated hz embedding (extract.rs) — Features::embedding() zeroes
breathing_hz/heart_hz below EMBED_MIN_SCORE (0.25), keeping the
random in-band peaks of noise windows out of the posture/anomaly
prototype space. Raw fields stay ungated (specialists have their
own stricter gates).
4. heart-band lag-floor leakage (extract.rs, found while fixing 3) —
a pure 0.30 Hz breathing signal scored 0.67 in the heart band at
3.33 Hz: out-of-band rhythm leaks as a monotonic slope whose max
sits at the band's lag floor, so score gating alone cannot stop it.
autocorr_dominant now requires the winning lag to be an interior
local maximum; band-edge "peaks" are rejected, true in-band peaks
(interior by definition) are preserved.
full_loop.rs strengthened to drive the fixes end-to-end: the StandStill
anchor is now a z=3.0 strong reflector (unenrollable pre-fix), and a new
motionless-person runtime case proves mean-channel detection at empty-
level variance.
Validation: 41 calibration unit + 1 full-loop integration + 23 CLI tests
green; cargo test --workspace --no-default-features exit 0.
Co-Authored-By: RuFlo <ruv@ruv.net>
The --export-rvf handler ran *before* the --train/--pretrain handlers and
unconditionally wrote placeholder sine-wave weights, then returned. So the
documented `--train --dataset … --export-rvf <path>` workflow
(user-guide.md) short-circuited to a PLACEHOLDER model and never trained —
printing "exported successfully" for a non-functional model. Given the
project's anti-"is it fake" stance, silently emitting a fake model is the
wrong default.
Fix:
- Only emit the placeholder container-format demo when --export-rvf is used
*standalone* (new `export_emits_placeholder_demo` guard). With
--train/--pretrain, fall through so the real training pipeline runs and
exports calibrated weights.
- The standalone path now prints a clear WARNING that it writes a
container-format demo with placeholder weights — not a trained model —
pointing to --train / a pretrained encoder (#894).
- Docs: flag --export-rvf as a placeholder demo in the flag table, and fix
the Docker training example to use --save-rvf (consistent with the
from-source example) instead of the placeholder --export-rvf.
3 unit tests for the guard. Full crate unit suite: 429 + 117 passed, 0 failed.
The v1 "100% presence accuracy" headline was already retracted in the
README / user-guide intro / proof-of-capabilities — but 6 secondary
spots still flatly claimed "100% accuracy, never false alarms", which
made proof-of-capabilities.md's "replaced everywhere" assertion untrue.
Completed the retraction in-place with the honest label-free metric
(82.3% held-out temporal-triplet; v1 was a single-class recording where
a constant "yes" scores ~99.98%):
- docs/readme-details.md — 2 benchmark tables + the pre-trained-model row
- docs/user-guide.md — capability table, model-file comment, applications list
- CHANGELOG.md — annotated the historical entry in-place (kept as public
record per built-in-public ethos, not rewritten)
Verified: no remaining flat "100% presence/accuracy" claim lacks a
retraction marker; proof-of-capabilities.md "replaced everywhere" is now
accurate.
verify.py's published hash is now f8e76f21 (doppler excluded). Document
that the proof reproduces bit-for-bit across Windows / two Linux hosts /
the Azure CI runner, that the peak-normalized Doppler is excluded due to
its cross-microarch argmax instability, and that a relative-tolerance
check against a committed reference vector backs the five stable features.
- README: replace retracted "100% presence" claim with honest 82.3%
held-out temporal-triplet; correct stale "pose model not in this
release" (now live at ruvnet/wifi-densepose-mmfi-pose, 82.69%
torso-PCK@20 SOTA); add a Results & proof table (HF models,
AetherArena, benchmark study, deterministic verify.py proof, witness).
- user-guide: same 100%->82.3% correction in two places; add Results &
proof pointers and the SOTA pose model + AetherArena links.
- docs/proof-of-capabilities.md (new): evidence-first rebuttal to the
"fake / misleading" claims. Concedes what was fair (over-stated early
metrics, AI-doc tone), refutes the category errors (simulate-mode
mistaken for fraud; missing weights mistaken for missing pipeline),
and gives copy-paste "prove it yourself" steps (verify.py VERDICT:
PASS + published SHA-256, cargo test, HF model pull, ESP32 CSI).
Emphasizes built-in-public history (git, 96 ADRs, CHANGELOG, issues
incl. #803/#872 bug->fix arcs) as the anti-facade evidence.
- aether-arena/VERIFY.md: cross-link the whole-platform proof doc.
Verified: python archive/v1/data/proof/verify.py -> VERDICT: PASS
(hash ca58956c...9199 matches published expected_features.sha256).
Co-Authored-By: claude-flow <ruv@ruv.net>
Random frozen encoder + trained head matches a fully-trained encoder to
within 2-4pts (cross-subject <2pts). WiFi-CSI sensing is largely a
random-features + target-readout problem: barely a learned representation
to transfer, which unifies the zero-shot collapse, no-transfer results,
foundation-encoder failure, and why per-room calibration works. Practical:
invest in readout + calibration, not encoder pretraining.
Co-Authored-By: claude-flow <ruv@ruv.net>
Re-ran transfer on 14-class person-ID (harder than 6-activity HAR): same
null-transfer result (MM-Fi pretrain 91.7% = random 92.8%). Unified root
cause: CSI in-domain classification lives in the target-trained readout
(random projection already separable); learned reps don't transfer across
subjects/rooms/datasets. WiFi-CSI is distribution-locked. Addresses the
'HAR too easy' caveat.
Co-Authored-By: claude-flow <ruv@ruv.net>
Tested the cross-dataset frontier: MM-Fi-trained CSI representation does NOT
transfer beneficially to NTU-Fi HAR (frozen probe 91.5% = random features
93%; full fine-tune 75% < probe). CSI reps are distribution-locked, same
root cause as within-MM-Fi cross-subject/-env collapse. Caveat: NTU-Fi 6
coarse activities are an easy target (random->93%). Updates the study's
cross-dataset limitation from 'untested' to this measured result.
Co-Authored-By: claude-flow <ruv@ruv.net>
Consolidates the full campaign into one committed, citable artifact (the
detailed log was in a gitignored staging report): pose SOTA 83.6% + 20KB
int4 edge model; action recognition 88% (a WiFi task MM-Fi never
benchmarked); the generalization story (zero-shot collapse, few-shot
calibration rescue, task-general across pose+action); all honest negatives
(CORAL/DANN/instance-norm/SupCon/distillation/subject-scaling); the 11KB
calibration-adapter deployment recipe; honest limitations (cross-dataset
untested, ARM latency pending).
Co-Authored-By: claude-flow <ruv@ruv.net>
Verified on a 2nd MM-Fi task: 27-class action recognition (which MM-Fi
never benchmarked for WiFi; only published baseline WiDistill 34%). In-domain
88% (leaky); cross-subject zero-shot collapses to ~10%; few-shot calibration
rescues 10->76% (1000 samples). Same mechanism as pose -> few-shot in-room
calibration is the universal WiFi-sensing generalization answer, not a pose
quirk.
Co-Authored-By: claude-flow <ruv@ruv.net>
Decisive capstone: cross-environment (unseen room+people) zero-shot
10.6%, but 5 calibration samples/person -> 60%, 200 -> 73%. The hard
frontier is calibration-soluble, MORE dramatically than cross-subject
(+62.5 vs +12 at K=200). The unsolved-frontier framing was a zero-shot
artifact. Reframes generalization: ship few-shot calibration, not
zero-shot invariance. Recommend accepting ADR-150 re-scoped around the
calibration mechanism.
Co-Authored-By: claude-flow <ruv@ruv.net>
Compared per-room calibration methods at K=200: LoRA rank-8 recovers
63.6->72.5% (SOTA-level) with just 11K params (~11KB), 0.5% the model
size. Validates the ship-base-once + tiny-per-room-adapter mechanism for
the RuView calibration service. Accuracy/size knob documented.
Co-Authored-By: claude-flow <ruv@ruv.net>
Measured cross-subject PCK vs N training subjects: 4->8 = +21pts, but
24->32 = +0.45pt. Saturates ~64%, ~19pt below in-domain. Correction to
'more data': subject-count returns vanish past ~16-20; the residual is
device/room/protocol shift. Re-scope phase-1 capture around DIVERSITY
(rooms/devices/protocols) + few-shot target adaptation, not headcount.
Co-Authored-By: claude-flow <ruv@ruv.net>
Published deployable int4-QAT micro (verified 74.08%, ~20KB) at
ruvnet/wifi-densepose-mmfi-pose/edge. Runs 0.135ms single-thread x86 CPU
(no GPU) - real-time pose without an accelerator. ARM on-device validation
pending fleet availability.
Co-Authored-By: claude-flow <ruv@ruv.net>
Swept model size on MM-Fi random_split: every config from micro (75,237
params, 0.22ms, 74.30%) up beats MultiFormer (72.25%); nano (40K, 0.13ms)
within 0.5pt. Pareto-dominant (smaller AND more accurate than prior SOTA).
Orthogonal to the data-bound accuracy frontier (ADR-150).
Co-Authored-By: claude-flow <ruv@ruv.net>
Measured all near-term levers on the official MM-Fi cross-subject split:
- mixup+TTA+ensemble = best at 64.92% (+0.9 over doc 64.04)
- pose-contrastive foundation pretrain: estimated +5..+12, MEASURED -2.3
(SupCon loss pinned at ln(B) across K/BS/seeds -> same-pose CSI is not
contrastively alignable across subjects)
- instance-norm+SpecAugment -4.6; CORAL/DANN ~0
Conclusion: the 18-pt in-domain<->cross-subject gap is fundamental subject
shift, not algorithmic. Promotes multi-subject data collection to the primary
lever; recommends re-scoping ADR-150 phase 1 around capture.
Co-Authored-By: claude-flow <ruv@ruv.net>
Per direction "remove the initial number, optimize for benchmark first" + "include
witness chain capabilities for proof and repeatability analysis":
- Empty board, no seeded numbers: ledger seeds to genesis only. Every result is a
real scoring-pipeline witness; RuView gets no hand-entered baseline.
- Real model scoring: aa_score_runner now loads predictions + an eval split
(--split/--pred) and scores them through the real ruview_metrics pose harness —
not just a synthetic fixture. Committed public smoke split (fixtures/smoke_*.json).
- Witness chain: each score emits a witness = inputs_sha256 (binds it to the exact
inputs) + proof_sha256 (cross-platform-stable score hash) + harness_version.
- Repeatability analysis: --repeat N runs the harness N× and fails if it ever
yields >=2 distinct proof hashes (16/16 identical locally).
- Witness ledger: ledger/ledger_tools.py — append-only, hash-chained, tamper-
evident (seed/append/verify); editing any past row breaks the chain.
- CI gate extended: determinism + repeatability(16) + real-scoring smoke + ledger
chain verify on every PR.
Co-Authored-By: claude-flow <ruv@ruv.net>
AetherArena ("AA") — the official, project-agnostic Spatial-Intelligence Benchmark
(ADR-149, Accepted). Iteration 1 of the long-horizon build:
- ADR-149 accepted: name locked (ruvnet/aether-arena), v0 metrics locked
(pose/presence/latency/determinism), dataset legality resolved (MM-Fi CC BY-NC
only; Wi-Pose excluded). Adds four-part framing, threat model, arena_score
formula, submission state machine, neutrality/governance, and the §7 acceptance test.
- aa_score_runner: deterministic scorer bin reusing the real ruview_metrics pose
harness on a fixed seed=42 fixture → RuViewTier-style verdict + cross-platform
SHA-256 proof hash. Builds --no-default-features (no torch/GPU). VERDICT: PASS.
- CI harness gate: .github/workflows/aether-arena-harness.yml runs the scorer on
every PR — the "PR that runs the harness as part of the build" requirement.
- Scaffold: aether-arena/{README,VERIFY,STATUS}.md + schema/aa-submission.toml.
- Horizon record persisted (.claude-flow/horizons/aether-arena-aa.json).
Infra = the deliverable; model SOTA (MM-Fi PCK@20) is a separate effort blocked on
ADR-079 data collection, tracked as a stretch goal, not an infra exit.
Co-Authored-By: claude-flow <ruv@ruv.net>
- User guide: full retrain workflow (record → vqvae → transformer → serve)
with checkpoint path usage
- README: note fine-tune capability in world model capability row
Co-Authored-By: claude-flow <ruv@ruv.net>
Weaves the three framing points into every ADR in the series:
- skeleton/scaffolding (data contracts + trust/privacy/audit machinery +
algorithms; real, tested, compiling) that existing sensing code plugs into
- Built (tested building block) vs Integration glue (not yet on the live 20 Hz
path) — per-ADR, with commit + issue references
- trust throughline (traceable evidence, sensor agreement, calibration
provenance, auditable privacy)
ADR-136 §8 carries the full series framing; 137-146 carry per-ADR status.
Co-Authored-By: claude-flow <ruv@ruv.net>
Operator-initiated calibration that records 30 s of stationary CSI,
emits a per-subcarrier baseline (amplitude mean+variance via Welford,
phase via circular sin/cos sums with von Mises dispersion), and gates
downstream stages on a deviation z-score. Plugs into multistatic
coherence gating, motion/presence detection, and the new ADR-134 CIR
estimator as a reference-subtracted input.
API surface (under wifi_densepose_signal):
CalibrationConfig::{ht20, ht40, he20, he40}
CalibrationRecorder { record(), finalize(), frames_recorded() }
BaselineCalibration {
subcarriers: Vec<SubcarrierBaseline>,
deviation(&CsiFrame), subtract_in_place(&mut CsiFrame),
to_bytes(), from_bytes()
}
CalibrationDeviationScore { amplitude_z_median, amplitude_z_max,
phase_drift_median, motion_flagged }
CalibrationError { SubcarrierMismatch, TierMismatch,
InsufficientFrames, VersionMismatch, TruncatedBuffer }
Binary baseline format: magic 0xCA1B_0001 + u8 version=1 + u8 tier +
captured_at_unix_s (i64) + frame_count (u64) + num_subcarriers (u32) +
[SubcarrierBaseline; N] as 16 bytes each (amp_mean, amp_variance,
phase_mean, phase_dispersion as f32 LE). Hand-written serialisation so
the format is stable across Rust toolchain versions without serde drift.
CLI: new `wifi-densepose calibrate` subcommand binds a UDP listener
(0xC511_0001 frames), streams them through CalibrationRecorder, prints
a real-time z-score banner per ADR-135 §risk 1 (operator-may-be-moving),
aborts on sustained high deviation, and writes the binary baseline to
disk. Local UDP packet parser duplicated from sensing-server (per ADR
discussion — avoids cross-crate API churn).
Witness: cross-platform-deterministic SHA-256 over the per-subcarrier
quantised baseline profile (u16 LE at 1e-2/1e-4/1e-3, no sort) using
the lesson learnt from the CIR PR #837 libm-jitter fix. Hash:
d6bce07ecb1648e6936561df44bf4a3bfc17bb0ba5f692646b2301d105b52f67
CI guard: new "ADR-135 calibration witness proof (determinism guard)"
step under the Rust Workspace Tests job, adjacent to the existing
ADR-134 CIR guard. Regressions are unambiguously attributable.
Hardware-in-loop validation: full 600-frame capture exercised via the
new scripts/synth-csi-udp.py emitter targeting 127.0.0.1:5005. The CLI
binary received 600 frames at 20 Hz, z_med stable at ~0.7, motion
correctly NOT flagged, finalised baseline written to baseline.bin (860
bytes) with correct magic + version + timestamp in the header. Live
ESP32 capture from COM9 is operator follow-up — requires provisioning
the firmware's UDP target IP to match the host running the CLI.
Test results (cargo test -p wifi-densepose-signal --no-default-features):
lib: 382 pass / 0 fail / 1 ignored
calibration_synthetic: 17 pass / 0 fail
calibration_drift: 5 pass / 0 fail
calibration_roundtrip: 10 pass / 0 fail
cir_*: 9 pass + 6 documented P2 ignores
doctest: 10 pass
Bench: 20 Criterion combinations registered
(recorder_record / recorder_finalize / deviation / record_600 /
to_bytes across HT20/HT40/HE20/HE40 tiers).
Witness: bash scripts/verify-calibration-proof.sh → VERDICT: PASS
Co-Authored-By: claude-flow <ruv@ruv.net>
* feat(signal): ADR-134 — CSI→CIR via ISTA + NeumannSolver warm-start
End-to-end first-class Channel Impulse Response estimation in the Rust
workspace. Bridges CSI (frequency domain) to CIR (delay domain) so
multistatic coherence gating, NLOS/LOS classification, and (at HT40+)
ToF ranging become tractable in `wifi-densepose-signal`.
Algorithm: ISTA L1 sparse recovery over a normalized DFT sub-matrix
sensing operator Φ ∈ ℂ^(K×G) with G = 3K (3× super-resolution). The
Tikhonov-regularised warm start re-uses `ruvector_solver::neumann::
NeumannSolver` — same call pattern as `fresnel.rs:280` and
`train/subcarrier.rs:225` — so no new crate dependencies.
Tiers supported: HT20 / HT40 / HE20 (Tier A-HE, C6) / HE40. The C6
HE-LTF tier is the preferred Tier A target whenever an 11ax AP is in
range; firmware substrate already shipped at v0.7.0-esp32 per ADR-110.
Measured performance (release, single CirEstimator shared across 12
links): HT20 2.72 ms / HE20 3.20 ms / HT40 13.43 ms / HE40 9.71 ms per
estimate(). HT20 12-link multistatic 17.7 ms — fits the 50 ms RuvSense
cycle; HT40 12-link 74 ms exceeds it and is flagged in ADR-134 §2.7 as
requiring Rayon parallelism or G=2K super-res reduction.
Measured Φ conditioning: κ(Φ) ≈ 1.00 identically across all tiers.
ADR-134 §2.3 was corrected — the C6 advantage is statistical SNR gain
(√(242/52) ≈ 2.16×) from more independent measurements, not improved
conditioning.
Witness: bit-deterministic SHA-256 over CirEstimator output on the
synthetic ADR-028 reference signal (100 frames, top-5 taps, 1e-6
quantization). Hash committed to expected_cir_features.sha256;
verify-cir-proof.sh wires the check into the existing witness bundle.
CI: cargo test --features cir + verify-cir-proof.sh added as separate
steps under the Rust Workspace Tests job; regressions are unambiguously
attributable.
Files:
- ADR + WITNESS-LOG-028 row 34 + CLAUDE.md module count (14 → 15)
- src/ruvsense/cir.rs (~540 LOC) + lib.rs re-exports + multistatic.rs
wire-up (reversible via `use_cir_gate=false`)
- 3 integration tests + Criterion bench + 3 deterministic fixtures
- cir_proof_runner binary + sha256 + verify-cir-proof.sh
Test rate: 395 pass / 6 ignored (P2 ISTA hyperparameter tuning; see
#[ignore] reasons) / 0 fail. cargo check clean; verify-cir-proof.sh
VERDICT: PASS.
Co-Authored-By: claude-flow <ruv@ruv.net>
* fix(signal): make CIR witness cross-platform-deterministic
The first witness (Windows-generated hash 89704bfd…) failed on Linux CI
with a different hash (b36741bf…). Root cause: hashing `re`/`im` parts of
top-5 taps at 1e-6 precision is too tight against libm differences in
sin/cos/sqrt across glibc, MSVC, and Apple-clang. The previous
"top-5 sorted by magnitude" form also suffered from rank instability when
taps are near-tied — libm jitter could shuffle the ordering even when the
algorithm is unchanged.
New canonical form: full per-tap quantised-magnitude profile in natural
index order, no sort.
- 156 taps × 2 bytes (u16 le) per frame = 312 bytes/frame.
- Quantisation 1e-2 — robust to ~1e-3 float drift while still tripping
on real algorithmic changes (e.g., a 10× lambda shift moves magnitudes
by >1e-2).
- No top-K selection — eliminates the unstable magnitude-sort step.
Regenerated expected_cir_features.sha256 — new hash 120bd7b1…
If the next CI run still mismatches, the cause is structural (rustfft SIMD
code path selection or NeumannSolver internal ordering), not magnitudes,
and the witness needs further coarsening or to be made platform-tagged.
Co-Authored-By: claude-flow <ruv@ruv.net>