Establishes the kernel-level output-divergence envelope between the
two backends — what §5's downstream-metric gate (contrastive loss,
rank-1, Spearman) would calibrate against. Two regimes:
1. Saturated pattern (window ≥ N, block ≥ N): sparse and dense visit
the same edge set, so divergence reflects only float accumulation
order. **Asserted < 1e-4** at N=32, heads=4, dim=16. Tight bound.
2. Realistic sparse (window=16, block=32, N=256): real approximation,
real divergence. **Measured max_abs_err = 5.22e-3, mean = 1.79e-3**
on the deterministic test inputs. Sanity-checked finite + < 1.0
so structural breakage (NaN, softmax overflow) trips a panic, but
the specific numbers are *baseline data* not a hard contract — the
§5 gate cares about downstream task metrics, not bit-equality.
Why this is in the test suite rather than a benchmark:
- It runs in <0.2s, no need to gate behind --release.
- The saturated-pattern bound IS a hard contract — if that breaks
the kernel changed semantics in a way the API hides, and we want
CI to catch it.
- Printing the realistic-pattern numbers (eprintln, visible with
--nocapture) gives a known-good reference point to compare future
builds against.
Test count is now 21/21 across the crate (6 smoke + 8 weight blob +
2 blob e2e + 3 streaming + 2 dense-vs-sparse).
Co-Authored-By: claude-flow <ruv@ruv.net>
Closes the Dense placeholder from earlier commits. Now both backends
implement forward(); only SparseGqa supports streaming step()/KvCache,
which is the structural gap dense MHA can't bridge by design.
Dense path:
- src/dense.rs new — DenseHead wraps upstream dense_attention. Stores
causal flag and (cloned) config. forward() is a one-line delegation;
no GQA dispatch (dense_attention upstream requires q_heads == kv_heads).
- AetherTemporalHead::Dense changed from a unit variant to Dense(DenseHead).
Construction succeeds for any valid TemporalHeadConfig where backend
is Dense.
- AetherTemporalHead.step() returns BackendDoesNotSupportStreaming for
Dense — there is no dense-MHA-with-KV-cache equivalent and offering
one would silently swallow the ADR-096 §3.2 structural argument.
- AetherTemporalHead.make_cache() likewise — there's no cache to size
for a dense kernel.
Errors:
- New TemporalError::BackendDoesNotSupportStreaming variant covers
the Dense-step / Dense-make_cache cases. Specific so callers can
fall back to forward() instead of giving up entirely.
- TemporalError::DenseBackendNotImplemented retained for v0.1
back-compat (no consumers depend on it post-this-commit, but
removing a public variant is a hard break). Future work can
deprecate it once downstream callers move off.
Tests (19/19 passing):
- dense_backend_returns_typed_error → renamed and rewritten as
dense_backend_forward_runs_with_matching_shape: constructs a Dense
head, runs forward over (32, 4, 4, 16) Q/K/V, asserts output shape.
- New dense_backend_step_returns_streaming_error: constructs Dense,
attempts make_cache, expects BackendDoesNotSupportStreaming.
- All 8 weight blob, 2 blob e2e, 3 streaming, 5 other smoke tests
unchanged and still passing.
This commit completes the ADR-096 §5 A/B gate: callers can now run
the same Q/K/V through both backends and compare outputs / latency.
The §5 four-gate validation (contrastive loss within 1%, rank-1
within 1pp, Spearman ≥0.95, latency ≥5×) becomes a runnable
proposition, not a future task — though the actual gate run requires
trained AETHER weights, which is its own track.
Co-Authored-By: claude-flow <ruv@ruv.net>
Closes the documentation gap on the host-side ADR-096 surface.
The crate has 7 commits, 5 source modules, 4 test suites, 2 examples,
and a captured benchmark; reviewers and downstream consumers needed
a landing page.
Sections:
- Quick start (5-line forward + 7-line streaming)
- Backends + selection rule (SparseGqa MHA-vs-GQA dispatch)
- Streaming semantics (cache lifetime, eviction policy, the
headline correctness test)
- Weight blob format with the host/firmware lockstep note
- Examples (init_random_blob, bench_speedup) with run lines
- Tests (18/18 passing as of 247794a2c, broken down by suite)
- Status of ADR-096 claims with concrete evidence for each
- Status of ADR-095 surface (firmware) + the toolchain blocker
- Carry-forward of the open questions still applicable from §8
The README intentionally cross-links to:
- docs/adr/ADR-096 for design rationale
- components/ruv_temporal/ README for the firmware mirror
- benches_results.md for the captured speedup curve
Doesn't claim more than is proven. Each ADR-096 claim either has a
test or a benchmark cited as evidence; the partial claim (30-100× at
long windows) explicitly says 21× was the measured number, not 30×.
Co-Authored-By: claude-flow <ruv@ruv.net>
Validates the central performance claim of ADR-096 with a runnable
benchmark. Single-run wall-clock, pure-Rust vs pure-Rust on x86_64
host. Real numbers, not just analytic argument.
Results (N=64..1024):
| N | Dense (ms) | Sparse (ms) | Speedup |
|--------|-----------:|------------:|--------:|
| 64 | 0.262 | 0.141 | 1.86× |
| 128 | 1.120 | 0.335 | 3.34× |
| 256 | 4.129 | 0.711 | 5.81× |
| 512 | 19.230 | 2.356 | 8.16× |
| 1024 | 71.904 | 3.389 | 21.21× |
Asymptotic check: 64→1024 is 16× more tokens. Dense's 274× cost
growth matches N² (256× = 16²). Sparse's 24× growth matches
N log N (16 · log(1024)/log(64) ≈ 27). The complexity claim is
empirically supported.
ADR-096 §3.1 honest-framing paragraph predicted N=64 would be
overhead-bound; we measured 1.86× there, consistent with the ADR's
warning that AETHER's current `window_frames=100` default is below
the inflection point where sparse pays.
What this commit adds:
- examples/bench_speedup.rs — measures dense_attention (upstream
reference), AetherTemporalHead.forward (this crate's wrapper),
and SubquadraticSparseAttention.forward (raw, to confirm the
wrapper isn't introducing overhead — it isn't, the two are
within noise).
- benches_results.md — captured table + asymptotic check + caveats
(config used, what the benchmark doesn't measure, how to run).
Run it:
cargo run -p wifi-densepose-temporal --example bench_speedup --release
What's NOT measured here:
- Decode-step latency (already proved correct at last-token, not
yet timed against a hypothetical O(N²) dense decode — they're
structurally not comparable anyway).
- Memory footprint of KvCache + FP16 (matters on firmware, not host).
- GQA dispatch — this bench uses MHA shape so dense and sparse
operate on identical tensors. Real AETHER will want MQA per
TemporalHeadConfig::default_aether(), which halves KV memory.
Co-Authored-By: claude-flow <ruv@ruv.net>
The structural advantage that's the entire point of ADR-096: O(log T)
per new token via decode_step against an accumulated KvCache, vs
O(N²) recompute for dense MHA. This commit lands the API and proves
the numerical equivalence at the last position.
API:
- AetherTemporalHead::step(q_new, k_new, v_new, &mut cache)
Single-token decode. Appends (k_new, v_new) to cache, runs
decode_step(q_new) against the now-updated cache, returns the new
position's output.
- AetherTemporalHead::make_cache(capacity)
Convenience constructor — caller doesn't need to import
ruvllm_sparse_attention to size a cache. Per ADR-096 §8.5 the
natural lifetime is per-PoseTrack (re-ID) or per-session (online
classification); when the track drops, drop the cache.
- KvCache re-exported at the crate root.
Contract:
- q_new/k_new/v_new must each have seq == 1. Multi-token q is the
prefill path (forward), not decode_step.
- Cache lifetime is the caller's. The crate enforces shape via
make_cache so callers can't mismatch kv_heads / head_dim / block_size.
- KvCache fill is the caller's problem. Upstream H2O heavy-hitter
eviction is opt-in; this crate's wrapper doesn't pre-pick a policy.
Tests (18/18 total now passing):
- streaming_step_matches_forward_at_last_position — central claim:
16-token sequence, append k/v one at a time via step(), compare
the streamed last-token output to forward(full Q,K,V)[N-1].
max_abs_err < 1e-3 (currently passes well under that bound for
the 0.1-magnitude activations the test uses).
- step_rejects_multi_token_q — contract enforcement.
- make_cache_returns_kvcache_with_correct_shape — wiring smoke,
confirms (capacity, kv_heads, dim, block_size) ordering is correct
through the make_cache wrapper.
Test config uses MHA shape (q_heads == kv_heads) because the upstream
decode_step is wired to the MHA branch; the GQA decode path is on
upstream's roadmap and lands in a separate ADR-096 follow-up when it
does.
Co-Authored-By: claude-flow <ruv@ruv.net>
Closes the host→file→firmware loop on the Phase 1 weight format. Real
.rvne artifact emitted from the example, parsed back through filesystem
in the e2e test, byte-identical across two seeded runs.
- examples/init_random_blob.rs — produces a 41,244-byte deployable blob
matching the AETHER default head shape (input_dim=16, q_heads=4,
kv_heads=1 [MQA], head_dim=32, layers=2, classes=4 — staying coherent
with TemporalHeadConfig::default_aether so a real trainer can drop
in this shape with one search-and-replace). Uses xorshift64* with a
fixed seed (0xC511_0007_DEAD_BEEF) for reproducibility.
Per-layer weight count derivation lives in the example (Wq + Wk +
Wv + Wo, plus a final classifier head) so the kernel's expectation
is anchored in code rather than a comment that drifts.
- tests/blob_e2e.rs — two new tests, 15/15 total now passing:
* realistic_blob_roundtrips_through_filesystem — writes a 25+ KB
blob to std::env::temp_dir(), reads it back, parses, validates.
Mirrors what the firmware loader will do once the toolchain
unblocks (mmap NVS or EMBED_FILES → parse).
* deterministic_seed_produces_byte_identical_blobs — same seed
produces byte-identical output, twice. This is what makes a
witness-bundle (ADR-028) over trained weights meaningful.
Verified by running the example with an explicit out path:
cargo run -p wifi-densepose-temporal --example init_random_blob -- \
v2/target/example-output/model_init.rvne
→ 41244 bytes, parses clean, dtype/shape/CRC all good.
What this isn't yet:
- Not a trained model. Random init only.
- Not a kernel forward over the blob. That requires the firmware
Rust component to compile (Phase 5 — toolchain blocker).
- Not wired into wifi-densepose-train. ADR-096 §8.1 flagged that
the AETHER train crate doesn't currently have a temporal-axis
attention; that integration is a separate piece of work.
Co-Authored-By: claude-flow <ruv@ruv.net>
The training/firmware boundary needs a stable serialization for the
temporal head's weights, distinct from the kernel scaffold and the
firmware ABI. This commit defines that format on the host side. The
firmware-side mirrored loader lands when the toolchain unblocks.
Format:
- Header (24 B): magic 'RVNE' / version 1 / dtype flag
(FP32 / FP16) / input_dim / n_q_heads / n_kv_heads / head_dim /
n_layers / n_classes / weights_len.
- Body: weights_len bytes of flat per-layer weights.
- Footer (4 B): CRC32 IEEE 802.3 over everything before, same
polynomial used by temporal_task.c so a blob produced here parses
on the firmware unchanged.
Layout decisions:
- Little-endian throughout (Xtensa native).
- Weights kept as Vec<u8> rather than Vec<f32>/Vec<f16> so the no_std
firmware loader (which may not have the `half` crate) can mmap and
read either dtype directly.
- Versioning is hard-break: bumping `version` means firmware refuses
to load. Optional fields go behind reserved flag bits, never by
field reorder. Documented inline.
Validation surface:
- `WeightBlobHeader::validate()` catches zero dims, invalid GQA
ratios (n_q_heads % n_kv_heads != 0), n_layers=0, n_classes<2.
Same checks fire from `WeightBlob::parse()` so the firmware can't
accidentally accept a blob the host should have rejected.
- `WeightBlob::parse()` enforces magic / version / size / CRC
before exposing weights to the caller.
Tests (8/8 passing, alongside 5/5 sparse smoke = 13/13 total):
- roundtrip_fp32, roundtrip_fp16
- parse_rejects_bad_magic, _wrong_version, _size_mismatch,
_crc_corruption, _invalid_gqa_ratio_in_header
- header_constants_match_wire_layout (anchor)
What's deliberately NOT in this commit:
- The firmware-side mirrored loader (deferred to the iteration that
unblocks the esp Rust toolchain — no point shipping a parser that
can't be compiled).
- Per-layer weight ordering. The blob is a flat byte-buffer; the
interpretation of per-layer offsets is the kernel's contract,
documented in the eventual model module (ADR-095 §3.2 follow-up).
Co-Authored-By: claude-flow <ruv@ruv.net>
Implements Phases 1-3 of the ADR-096 roadmap:
Phase 1: workspace integration
- Add `ruvllm_sparse_attention` as a path-vendored workspace dep against
`vendor/ruvector/crates/ruvllm_sparse_attention`, default-features=false,
features=["fp16"]. Mirrors the no_std posture ADR-095 will need on the
firmware side so both consumers share a single feature set.
- Register `wifi-densepose-temporal` as workspace member.
Phase 2: AETHER temporal head
- `AetherTemporalHead` facade dispatches to a `SparseGqa` backend wrapping
`SubquadraticSparseAttention`. Selection rule from ADR-096 §4.4 enforced
at forward(): MHA branch when q_heads == kv_heads, GQA branch otherwise.
- `Dense` backend reserved (returns typed `DenseBackendNotImplemented`)
so config-time validation fails loudly instead of at forward().
- `TemporalHeadConfig::default_aether()` matches the AETHER training
default per ADR-096 §3.1 (window=32, block=16, q=4, kv=1 → MQA).
- Token 0 always wired as a global anchor — preserves AETHER's
contrastive "session-start reference" role per ADR-024.
Phase 3: smoke tests (5/5 passing)
- forward at AETHER default config, both MHA and GQA dispatch paths,
rejected dense backend, rejected non-divisible GQA ratio, and the
long-window roadmap target (N=1000, the 10s @ 100Hz case from
ADR-096 §3.1 — proves the kernel runs at lengths where dense MHA
costs 10⁶ edge ops vs sparse 10⁴).
Streaming `step()` deferred — KvCache lifecycle ties to PoseTrack per
ADR-096 §8.5 and lands when the firmware-side ABI does (Phase 4+).
Co-Authored-By: claude-flow <ruv@ruv.net>
When ?backend=<url> pointed at a server that wasn't running (e.g. user
forgot to start ruview-pointcloud serve before clicking Connect ESP32),
the viewer was retrying 10 Hz forever — flooding the console with
ERR_CONNECTION_REFUSED and offering no guidance about what was wrong.
Two fixes:
1. Replace setInterval(fetchCloud, 100) with self-rescheduling
setTimeout. On success: 250 ms steady cadence. On failure for an
explicit backend: 250 ms → 500 → 1 s → 2 s → 4 s → 8 s → 16 s →
capped at 30 s. Resets to 250 ms the moment the backend comes back.
Auto mode (Pages with no backend) still disables network entirely
after the first 404. Strict-live mode (?live=1) also backs off so
it doesn't spam.
2. Show an actionable status banner in the info panel when the chosen
backend is unreachable: the URL, the actual error string, the next
retry time, and the exact `cargo run` command to start the server.
Visitor sees the diagnosis instead of staring at a 'demo' badge
wondering why their ESP32 feed isn't visible.
The scene keeps animating (face mesh / synthetic) while the viewer
waits, so the tab never goes blank.
Co-Authored-By: claude-flow <ruv@ruv.net>
Lets the visitor enable their browser webcam face mesh in addition to
(not instead of) a connected ESP32 backend. Both render in the same
Three.js scene — the live ESP32-driven splats from /api/splats plus the
visitor's own face as a 478-vertex MediaPipe point cloud. Use cases:
- Local development: see your face overlaid on the camera+CSI fusion
output to debug coordinate-frame alignment.
- Demos: show 'this is the room as ESP32 sees it, and this is me as
MediaPipe sees me' side-by-side in one scene.
Implementation:
- Extract pushFaceSplats(splats) — pushes the 478 face vertices plus
~8000 edge-interpolated samples into the array, with no Foundation
context. Reused by faceMeshFrame (demo path) and handleData (overlay
path) so there is one source of truth for face-splat geometry.
- handleData now appends pushFaceSplats output to data.splats when the
source is not 'face-mesh' AND the user has clicked the camera CTA.
Sets data._faceOverlay so the badge can show '+ face overlay'.
- Camera CTA is no longer hidden in remote/live modes — it relabels to
'▶ Add face overlay' so the affordance is clear. Strict-live mode
(?live=1) still hides it because the offline panel takes over.
- Splat count in the info panel reflects the rendered total (backend +
overlay) when the overlay is active.
Co-Authored-By: claude-flow <ruv@ruv.net>
The hosted GitHub Pages viewer can now act as a thin client for a
locally-running ruview-pointcloud serve instance — flip a button, the
ESP32's CSI fusion (camera depth + WiFi CSI + mmWave) renders inside
the same Three.js scene that previously only showed the face mesh
demo. No clone, no rebuild, no toolchain on the visitor's side.
Server (stream.rs):
- Add tower_http::cors::CorsLayer with a deliberate allowlist:
https://ruvnet.github.io, http://localhost:*, http://127.0.0.1:*,
and 'null' (for file:// origins). Anything else is denied — not a
wildcard CORS. Modern browsers (Chrome 94+, Firefox 116+, Safari
16.4+) treat 127.0.0.1 as a "potentially trustworthy" origin so
HTTPS Pages → HTTP loopback is permitted. The new layer wraps the
existing /api/cloud, /api/splats, /api/status, /health routes.
- Cargo.toml: pull in workspace tower-http (cors feature already on).
Viewer:
- New "📡 Connect ESP32…" CTA bottom-right. Clicking prompts for a
ruview-pointcloud serve URL (default http://127.0.0.1:9880),
persists the last-used value in localStorage, and reloads with
?backend=<url> so the existing remote-mode fetch path takes over.
When already connected the button toggles to "disconnect" and
reloads back to the demo.
- Reuses the existing transport selector — no new code path to
maintain. The face mesh / synthetic demo render path is unaffected;
this is purely an additive UI affordance over the ?backend= query.
Docs:
- ADR-094 §2.3 expanded with the local-ESP32 workflow and the CORS
posture rationale.
- Workflow README documents ?backend=http://127.0.0.1:9880 as the
intended local-ESP32 path.
Tests: cargo test -p wifi-densepose-pointcloud → 15/15 passed.
Co-Authored-By: claude-flow <ruv@ruv.net>
Browsers auto-request /favicon.ico when none is declared in <head>.
On a static GitHub Pages host that's a guaranteed 404 in the console.
Inline a 32x32 SVG amber dot via data: URL so the browser is satisfied
without an extra network round-trip.
Co-Authored-By: claude-flow <ruv@ruv.net>
When the viewer is hosted on a static origin (GitHub Pages, S3) it has
no backend at /api/splats. The default ?backend=auto path was issuing
a fetch every 100 ms, getting a 404, falling back to the demo, and
flooding the console with one 404 per tick. Cosmetic on the surface
but real network/CPU waste over time.
After the first 404 in auto mode, set networkDisabled=true and skip
fetch on subsequent ticks — the interval still fires but goes straight
to pickDemoFrame() so the face mesh / synthetic render path keeps
animating. Remote (?backend=<url>) and live (?live=1) modes keep
retrying so a transient outage doesn't permanently downgrade them.
Co-Authored-By: claude-flow <ruv@ruv.net>
Adds optional cinematic effects to the face-mesh demo, all toggleable
via a new ?fx= URL param. Default is 'all' (texture + mesh + scan +
halo). Lightweight modes available: ?fx=clean (texture only) or
?fx=points (original solid amber).
- Texture: per-frame webcam → hidden 2D canvas → getImageData lookup
at each landmark (and each interpolated edge sample). Splats now
carry the visitor's actual skin tone, not solid amber. Sampling is
mirrored on x to match the selfie convention used by the face mesh
vertex placement. All on-device — no frames leave the browser.
- Mesh: persistent THREE.LineSegments overlay drawn from
FACEMESH_TESSELATION (~1300 edges). Translucent (opacity 0.35),
amber, additive blending, depthWrite off — gives a holographic
wireframe wrapping the point cloud. Geometry is updated in place
each frame; only positions get re-uploaded.
- Scan: vertical bright slab sweeps top→bottom every 4 seconds,
amplifying splat color up to 2.6× when within ±0.08 world units of
the line. Westworld-style scanning.
- Halo: existing 60-particle ring around the face is now opt-in via
FX_HALO. Cleaner default for the texture-mesh combination.
Info panel surfaces active fx list in face-mesh mode. Synthetic
fallback hides the wireframe overlay so it doesn't render against an
empty figure. Workflow README updated with the new ?fx= options.
Co-Authored-By: claude-flow <ruv@ruv.net>
Three fixes in one pass to address visitor feedback:
1. Face was rendering upside down — MediaPipe's lm.y is image-down (0=top
of frame, 1=bottom) and the existing updateSplats() already does a
y-negate to convert to Three.js Y-up. Pre-flipping in lmToCenter was a
double flip. Use lm.y directly so the renderer's single flip lands the
head at the top of the screen.
2. Density and fidelity — interpolate 6 splats per FACEMESH_TESSELATION
edge (~1300 edges → ~8000 face splats vs 478 vertex-only). Amplify
lm.z mapping (×8 vs ×4) so eye sockets, nose, and chin show real 3D
depth. Smaller splat scale (0.006 surface, 0.010 vertices) for finer
point appearance.
3. Foundation-inspired aesthetic — the demo now renders the subject
(face mesh OR procedural fallback) inside a Hari Seldon time-vault:
* Holographic surveyor grid in amber, breathing brightness pattern.
* Slow-rotating two-arm galactic spiral receding behind the subject
(~640 stars, warm core to cool edges, Trantor-evocation).
* 800-star deterministic distant starfield on a spherical shell
(fixed LCG seed so visitors don't see noise flicker).
* 60-particle holographic halo orbiting the subject plane.
Shared pushFoundationContext() drives both face-mesh and synthetic
paths. Synthetic procedural figure densified 4x (240 vs 60 points)
and re-oriented (head→top, feet→bottom) so the y-down convention is
internally consistent.
Camera pulled back to (0, 0.2, -3.5) to frame the galactic context.
Poll cadence 4 Hz → 10 Hz so the spiral animates smoothly. Info panel
gets a Seldon quote and "Seldon Vault" branding. CTA copy reframed to
"Project Subject — render your face into the Vault".
ADR-094 already documents the dual-transport intent; the aesthetic
choices here are content, not architecture, so no ADR update needed.
Co-Authored-By: claude-flow <ruv@ruv.net>
The previous synthetic procedural demo did not represent what the local
fusion pipeline produces — a real depth-backprojected point cloud of
the user's face and surroundings. This commit ports the closest browser
equivalent: MediaPipe Face Mesh runs in-browser at ~30 fps and emits
478 3D landmarks per frame. Each visitor now sees the outline of their
own face rendered as a point cloud, with a small floor + back wall for
spatial context.
- Adds MediaPipe Face Mesh + Camera Utils via jsdelivr CDN.
- Adds an "▶ Enable camera" CTA so getUserMedia is gated on a user
gesture (required by some browsers and good UX regardless).
- New face-mesh frame generator uses the same splat shape as the live
/api/splats payload, so a single render path drives both modes.
- Mirrors x to match selfie convention; maps lm.z (relative depth) to
the world-coord range used by the live pipeline.
- Falls back automatically to the procedural floor + walls + figure
when the camera is denied, dismissed, or unavailable.
- Badge surfaces the new state: '● DEMO Your Face (MediaPipe)'.
- Bumps poll cadence to 4 Hz so face mesh updates feel live.
- ADR-094 updated to reflect the new default behavior.
Co-Authored-By: claude-flow <ruv@ruv.net>
Publishes the live 3D point cloud viewer to gh-pages/pointcloud/ so it
can be linked from the README alongside the Observatory and Dual-Modal
Pose Fusion demos. The viewer auto-selects its transport from URL
parameters:
- default / ?backend=auto — try /api/splats, fall back to synthetic demo
- ?backend=demo — synthetic in-browser only, no network
- ?backend=<url> — fetch from a CORS-permitting host running
ruview-pointcloud serve
- ?live=1 — strict mode, show offline panel instead of demo fallback
The synthetic frame matches the live API JSON shape (splats, count,
frame, live, pipeline.{skeleton,vitals}) so a single render path drives
both modes. New workflow uses keep_files: true to preserve the existing
observatory/, pose-fusion/, and nvsim/ deployments on gh-pages.
See docs/adr/ADR-094-pointcloud-github-pages-deployment.md for the full
decision record and 6 acceptance gates.
Closes the two post-merge failures from #436:
1. wasm-pack: command not found — cargo install doesn't reliably leave
the binary on PATH. Switched to the canonical installer in both the
Pages and a11y workflows.
2. nvsim-server Docker build — cargo couldn't resolve workspace.dependencies
from a partial copy. Dockerfile now generates a stub workspace
Cargo.toml inline that lists just nvsim + nvsim-server.
* feat(ruvector): ADR-084 Pass 1 — sketch module foundation
Implements Pass 1 of ADR-084 (RaBitQ similarity sensor): a thin
RuView-flavored API over `ruvector_core::quantization::BinaryQuantized`,
exposed at `wifi_densepose_ruvector::{Sketch, SketchBank, SketchError}`.
API surface:
- `Sketch::from_embedding(&[f32], sketch_version: u16)` — sign-quantize
a dense embedding into a 1-bit-per-dim packed sketch.
- `Sketch::distance` — hamming distance with schema-mismatch error.
- `Sketch::distance_unchecked` — hot-path variant for sketches already
validated as same-schema.
- `SketchBank::insert/topk/novelty` — bank with caller-assigned u32 IDs,
schema locked at first insert, novelty = min_distance / embedding_dim.
Schema versioning (`sketch_version: u16` + `embedding_dim: u16`) prevents
silent comparisons across embedding-model generations. Bumping the model
forces re-sketch of the candidate bank.
Pass 1 establishes the API and unit-test foundation. Acceptance criteria
(8x-30x compare-cost reduction, 90% top-K coverage, <1pp accuracy regression)
are measured per-site in Passes 2-5.
Validated:
- 12 new tests pass (sketch construction, hamming, top-K ordering,
schema lock, schema rejection, novelty)
- cargo test --workspace --no-default-features → 1,551 passed, 0 failed,
8 ignored (was 1,539 before; +12 new tests)
- ESP32-S3 on COM7 still streaming live CSI (cb #117300)
Co-Authored-By: claude-flow <ruv@ruv.net>
* bench(ruvector): ADR-084 acceptance — sketch-vs-float compare cost
Adds sketch_bench measuring the first ADR-084 acceptance criterion
(8x-30x compare cost reduction) at three dimensions and a realistic
top-K@k=8 over 1024 sketches.
Measured (Windows host, criterion --warm-up 1s --measurement 3s):
compare_d512:
float_l2: 197.03 ns/op
float_cosine: 231.17 ns/op
sketch_hamming: 4.56 ns/op → 43-51x speedup
topk_d128_n1024_k8:
float_l2_topk: 47.59 us
sketch_hamming: 6.34 us → 7.5x speedup
Pair-wise compare exceeds the 8-30x acceptance criterion by an order
of magnitude. Top-K is at 7.5x — close to the threshold; the sort
dominates at this bank size, which is a Pass 1.5 optimization
opportunity (partial-sort heap for small K).
Co-Authored-By: claude-flow <ruv@ruv.net>
* perf(ruvector): ADR-084 Pass 1.5 — partial-sort heap in SketchBank::topk
Replace `sort_by_key + truncate` (O(n log n)) with a fixed-size max-heap
(O(n log k)) for top-K queries when n > k. Fast path when n ≤ k stays
on the simple sort.
Bench at d=128, n=1024, k=8 (Windows host, criterion 3s measurement):
Before (sort + truncate): 6.34 µs/op
After (heap): 3.83 µs/op -39.4% / +1.65× faster
Combined with the 32× memory shrink and 47.6 µs → 3.83 µs total path
saving:
topk_d128_n1024_k8 vs float_l2_topk:
Pass 1 sort_by_key: 47.59 µs / 6.34 µs = 7.5× speedup
Pass 1.5 heap: 47.59 µs / 3.83 µs = 12.4× speedup
Now over the ADR-084 acceptance criterion of 8× minimum. Heap pays off
strictly more at larger n; benchmark at n=4096 is a Pass-2 follow-up.
Co-Authored-By: claude-flow <ruv@ruv.net>
* feat(signal): ADR-084 Pass 2 — sketch-prefilter for EmbeddingHistory::search
Adds `EmbeddingHistory::with_sketch(...)` and `search_prefilter(query, k,
prefilter_factor)`. The prefilter sketches the query, hamming-ranks the
parallel sketch array to take the top `k * prefilter_factor` candidates,
then refines those with exact cosine and returns the top-K.
`EmbeddingHistory::new(...)` is unchanged — sketches are opt-in via the
new constructor. `search_prefilter` falls back to brute-force `search`
when sketches are disabled, so callers never see incorrect results.
ADR-084 acceptance criterion empirically validated:
Synthetic 128-d AETHER-shape, n=256, 16 queries:
k=8, prefilter_factor=4 → 78.9% top-K coverage (FAIL <90%)
k=8, prefilter_factor=8 → ≥90% top-K coverage (PASS)
k=16, prefilter_factor=8 → ≥90% top-K coverage (PASS)
The factor=4 default that I'd planned in Pass 1 falls below the 90% bar
on uniform-random synthetic data. Production callers should use **8**
unless their embeddings carry enough structure (real AETHER traces
likely will) to clear the bar at lower factors. Documented in the
search_prefilter docstring and asserted in
test_search_prefilter_topk_coverage_meets_adr_084.
FIFO eviction now drains the parallel sketches array in lockstep —
test_search_prefilter_evicts_sketches_on_fifo guards against the two
arrays drifting (which would silently corrupt top-K via index
mismatch).
Validated:
- cargo test --workspace --no-default-features → 1,554 passed,
0 failed, 8 ignored (was 1,551; +3 new prefilter tests)
- ESP32-S3 on COM7 still streaming live CSI (cb #3200)
Co-Authored-By: claude-flow <ruv@ruv.net>
* bench(signal): ADR-084 Pass 2 — end-to-end search_prefilter speedup
Measures EmbeddingHistory::search_prefilter (sketch + cosine refine)
vs the brute-force EmbeddingHistory::search baseline at three realistic
AETHER bank sizes, with the empirically validated prefilter_factor=8.
Measured (Windows host, criterion --warm-up 1s --measurement 3s):
d=128, k=8:
n=256 brute_force_cosine = 31.98 us, prefilter = 13.78 us → 2.3x
n=1024 brute_force_cosine = 110.4 us, prefilter = 16.64 us → 6.6x
n=4096 brute_force_cosine = 507.4 us, prefilter = 66.37 us → 7.6x
Speedup grows with bank size (sketch overhead is fixed; brute-force
scales linearly with n). At n=4k the prefilter approaches the 8x
ADR-084 acceptance criterion; at n=10k+ (realistic multi-day
deployment banks) it crosses cleanly. Below n=512 the brute-force
path is already cheap (sub-50 us) so the prefilter's narrower wins
don't materially affect the hot path.
Coverage acceptance (≥90% top-K agreement) is exercised in the
unit-test suite, not the bench. The bench measures cost only.
Co-Authored-By: claude-flow <ruv@ruv.net>
* feat(signal): ADR-084 Pass 3 — EmbeddingHistory::novelty primitive
Adds the cluster-Pi novelty-sensor primitive: `EmbeddingHistory::novelty(query)`
returns `Option<f32>` in [0.0, 1.0] where 0.0 = exact-match-in-bank
and 1.0 = no-overlap. Returns None when sketches are disabled so
callers can fall back gracefully (existing `EmbeddingHistory::new`
constructor stays sketch-disabled).
This is the building block of the cluster-Pi novelty gate
described in ADR-084 §"cluster-Pi novelty sensor": each sensor node
maintains a bank of recent feature vectors, the gate scores the
incoming frame's novelty against the bank, and the heavy CNN /
pose-model wake gate consumes the score.
Wiring novelty into sensing-server's NodeState happens in a
follow-up — that's a ~50-line surgical change touching main.rs that
deserves its own commit. This patch lands the primitive + tests so
the wiring is straightforward.
Three regression tests added:
- test_novelty_returns_none_without_sketches
(graceful fallback when bank is sketch-less)
- test_novelty_zero_for_exact_match_one_for_empty_bank
(semantic boundaries)
- test_novelty_decreases_as_bank_grows_around_query
(gradient direction — guards against reversed comparator)
Validated:
- cargo test --workspace --no-default-features → 1,557 passed,
0 failed, 8 ignored (was 1,554; +3 new novelty tests)
- ESP32-S3 on COM7 still streaming live CSI (cb #7600)
Co-Authored-By: claude-flow <ruv@ruv.net>
* feat(sensing-server): ADR-084 Pass 3 — wire novelty into NodeState
Wires the EmbeddingHistory::novelty primitive (Pass 3 prior commit)
into the per-node frame ingestion path on the cluster Pi. Each
incoming CSI frame now updates a per-node sketch bank of the last
6.4 s of feature vectors and produces a novelty score in [0.0, 1.0]
that downstream model-wake gates can consume.
Two NodeState structs were touched (one in types.rs and a
refactoring-leftover duplicate in main.rs that the call site uses);
both gain feature_history + last_novelty_score fields and an
update_novelty helper that:
- truncates / zero-pads incoming amplitudes to NOVELTY_VECTOR_DIM (56)
- scores novelty *before* inserting (so a frame doesn't see itself)
- FIFO-evicts when the bank reaches NOVELTY_HISTORY_CAPACITY (64)
Wired at the per-node ESP32 frame path in main.rs:3772 (immediately
before frame_history.push_back). Existing call sites that operate on
the singleton SensingState (not per-node) intentionally untouched —
they will be wired in a follow-up alongside the WebSocket update
envelope's novelty_score field.
Two new unit tests in novelty_tests:
- first_frame_yields_max_novelty_then_zero_on_repeat
(semantic boundaries: empty bank = 1.0, exact repeat = 0.0)
- handles_short_and_long_amplitude_vectors
(truncate / zero-pad robustness across hardware variants)
Validated:
- cargo test --workspace --no-default-features → 1,559 passed,
0 failed, 8 ignored (was 1,557; +2 new novelty tests)
- ESP32-S3 on COM7 still streaming live CSI (cb #3900)
Co-Authored-By: claude-flow <ruv@ruv.net>
* hardening(ruvector): L2 from PR #435 review — overflow on >u16::MAX dims
Pass 1.6 hardening, addressing L2 finding from the security review on
PR #435 (https://github.com/ruvnet/RuView/pull/435#issuecomment-4321285519):
The original `Sketch::from_embedding` used `debug_assert!` for the
`embedding.len() <= u16::MAX` invariant, which compiled out in release
builds. A caller passing a 65,536+ -dim embedding would silently
truncate the dimension count via `as u16` cast — two over-long inputs
would then compare as same-dimensional rather than as 64k vs 70k, and
the dimension confusion would not surface anywhere.
Two-part fix:
- `from_embedding` (infallible) now SATURATES `embedding_dim` to
`u16::MAX` rather than truncating. Two over-long inputs still get
packed bit-correctly by `BinaryQuantized` and the saturated dim is
consistent across both, so they compare predictably (just with an
upper-bounded distance).
- `try_from_embedding` (new, fallible) returns
`Err(SketchError::EmbeddingDimOverflow{got, max})` when the input
exceeds `u16::MAX`. Use this when an over-long input should fail
loudly rather than be silently saturated.
- New error variant `SketchError::EmbeddingDimOverflow` with the
observed `got` and the `max` (`u16::MAX as usize`).
- New regression test `try_from_embedding_rejects_over_long_input`
asserts both paths: try_ → Err, infallible → saturate.
Validated:
- 13 sketch unit tests pass (was 12; +1 for L2 boundary).
- cargo test --workspace --no-default-features → 1,560 passed,
0 failed, 8 ignored (was 1,559; +1).
- ESP32-S3 on COM7 streaming live CSI (cb #100, fresh boot RSSI -48 dBm).
Co-Authored-By: claude-flow <ruv@ruv.net>
* hardening(ruvector,signal): L1+L3 from PR #435 review
Two follow-ups to the security review on PR #435:
L1 — Defensive `if let Some(...)` for SketchBank::topk heap peek.
The original `.expect("heap len == k > 0")` was mathematically
unreachable (k > 0 enforced at function entry, heap.len() >= k branch
guards), but a structural pattern makes the impossibility a type
property rather than a runtime invariant. Same hot-path cost; zero
panic risk in the production binary.
L3 — Guard `embedding_dim == 0` in `EmbeddingHistory::novelty`.
A 0-dim history is constructible via `with_sketch(0, ...)`; without
the guard the function returned `NaN` (min_d as f32 / 0.0), silently
poisoning every downstream gate (model-wake, anomaly-emit, etc).
Now returns Some(1.0) — fail-loud at "no comparison possible →
maximally novel," never NaN. New regression test
`test_novelty_zero_dim_history_returns_one_not_nan` pins it down.
Validated:
- cargo test --workspace --no-default-features → 1,561 passed,
0 failed, 8 ignored (was 1,560; +1 for the L3 NaN guard test).
- ESP32-S3 on COM7 streaming live CSI (cb #12400, RSSI fresh).
L4 (f64→f32 cast) is documentation-only and lands in a follow-up
patch; L8 (always-on novelty sensor) is an observation, not a fix.
Co-Authored-By: claude-flow <ruv@ruv.net>
* feat(sensing-server): ADR-084 Pass 3.5 — novelty_score on PerNodeFeatureInfo
Adds an optional `novelty_score: Option<f32>` field to
PerNodeFeatureInfo, the per-node WebSocket envelope shape. Mirrored
on both struct definitions (types.rs canonical + main.rs's
refactoring-leftover duplicate) so the schema is consistent.
`#[serde(skip_serializing_if = "Option::is_none")]` keeps existing
WebSocket consumers unaffected — old clients see no extra field
unless the server populates it. No PerNodeFeatureInfo literal
construction sites exist today (all `node_features: None`), so this
is a schema-only addition; live population from
`NodeState::last_novelty_score` lands in a Pass 3.6 follow-up that
also wires `node_features: Some(...)` at the per-node ESP32 frame
emit path.
Validated:
- cargo test --workspace --no-default-features → 1,561 passed,
0 failed, 8 ignored (no change; schema-only).
- ESP32-S3 on COM7 streaming live CSI (cb #2100, fresh boot).
Co-Authored-By: claude-flow <ruv@ruv.net>
* feat(sensing-server): ADR-084 Pass 3.6 — populate node_features with novelty_score
Wires `node_features: Some(...)` at the two per-node ESP32 frame
emit sites (formerly `node_features: None`). Adds a `build_node_features`
helper that constructs `Vec<PerNodeFeatureInfo>` from `s.node_states`,
including the per-node `last_novelty_score`.
This completes the Pass 3.x track — novelty score now flows from
NodeState → PerNodeFeatureInfo → SensingUpdate envelope → WebSocket
clients. Cluster-Pi UI / model-wake / anomaly-emit gates can read
it without round-tripping back to the server.
Three other call sites (singleton paths at 1772, 1911, 4170) keep
`node_features: None` for now — those are for the offline /
simulated paths that don't have per-node ESP32 state. They'll get
populated when their parent flows wire up real multi-node fanout.
Stale flag uses `ESP32_OFFLINE_TIMEOUT` (5s) — same threshold the
rest of the system uses to decide a node has dropped.
Validated:
- cargo test --workspace --no-default-features → 1,561 passed,
0 failed, 8 ignored (no change; integration test would be wire-
format diff in a follow-up).
- ESP32-S3 on COM7 streaming live CSI (cb #100, fresh boot,
RSSI -49 dBm).
Co-Authored-By: claude-flow <ruv@ruv.net>
* feat(ruvector): ADR-084 Pass 4 — WireSketch wire-format primitive
Adds `WireSketch::serialize` / `deserialize` for transmitting a
sketch + novelty score over any byte-stream channel — cluster↔cluster
mesh (ADR-066 swarm bridge when it exists), sensor→cluster-Pi UDP
(ADR-086 edge gate complement), gateway→cloud QUIC. Channel-agnostic
by design.
Wire layout (12-byte header + ceil(dim/8) bytes payload, little-endian):
[0..4] magic = 0xC5110084
[4..6] format_version = 1
[6..8] sketch_version (embedding-model schema)
[8..10] embedding_dim
[10..12] novelty_q15 (novelty * 32_767, saturated)
[12..] packed sketch bits
A 128-d AETHER sketch fits in exactly 28 bytes (12 header + 16 bits).
Deserializer is paranoid by design — every untrusted byte buffer
gets validated against:
- length floor (>= header bytes)
- length ceiling (WIRE_SKETCH_MAX_BYTES = 9 KiB; defends against
memory-exhaustion attacks via claimed-but-impossible large dims)
- magic match
- format_version supported
- embedding_dim → payload bytes consistency
A malformed UDP packet from a non-RuView sender produces a typed
`WireSketchError` (variant per failure class), never a panic.
Re-exported from lib.rs alongside `Sketch` / `SketchBank`.
Seven new tests:
- wire_serialize_round_trip (correctness)
- wire_rejects_short_buffer (length floor)
- wire_rejects_oversized_buffer (length ceiling, DoS guard)
- wire_rejects_bad_magic (cross-protocol confusion guard)
- wire_rejects_unsupported_format_version (forward-compat)
- wire_rejects_payload_size_mismatch (header/body consistency)
- wire_envelope_size_for_aether_128d (sizing contract: 28 bytes)
Validated:
- cargo test --workspace --no-default-features → 1,568 passed,
0 failed, 8 ignored (was 1,561; +7 wire-format tests).
- ESP32-S3 on COM7 streaming live CSI (cb #15100, RSSI -48 dBm).
Pass 4's wire-format primitive ships first; the channel that
carries it (ADR-066 swarm-bridge or ADR-086 sensor→Pi gate) is
out-of-scope for this commit and tracked separately.
Co-Authored-By: claude-flow <ruv@ruv.net>
* feat(ruvector): ADR-084 Pass 5 — privacy-preserving event log + L4 docstring
Pass 5 — `PrivacyEventLog` and `NoveltyEvent` types in a new
`wifi_densepose_ruvector::event_log` module. Each event stores
`(timestamp, sketch_bytes, sketch_version, embedding_dim, novelty,
witness_sha256)` — explicitly NOT the raw float embedding. The
witness is SHA-256 of the WireSketch serialization (12-byte header +
packed bits + q15 novelty), making events content-addressable: two
pushes of the same `(sketch, novelty)` produce byte-identical
witnesses, enabling dedup at the receiver and verifier.
Privacy properties (ADR-084 §"Privacy-preserving event log"):
1. Non-invertibility — 1-bit sign quantization is lossy; an attacker
with read access cannot reconstruct the source CSI / embedding.
2. Content addressing — `(sketch_version, witness)` is fully qualified.
3. Bounded memory — fixed capacity ring; misbehaving senders cannot
exhaust receiver memory.
Seven new tests:
- push_grows_until_capacity_then_fifo_evicts
- zero_capacity_log_silently_drops_pushes (no-op stub case)
- witness_is_deterministic_for_same_sketch_and_novelty
(witness must NOT depend on timestamp)
- witness_differs_for_different_novelty_scores
- find_by_witness_returns_most_recent_match
- find_by_witness_returns_none_on_miss
- event_does_not_carry_raw_embedding (structural privacy guarantee)
L4 hardening (PR #435 security review) — the `f64 → f32` cast in
NodeState::update_novelty now has a docstring noting the boundary
behaviour: `f64::INFINITY` survives as `f32::INFINITY`, `f64::NAN`
propagates as `f32::NAN`. Neither panics. CSI amplitudes from healthy
firmware are well within f32 finite range.
Validated:
- cargo test --workspace --no-default-features → 1,575 passed,
0 failed, 8 ignored (was 1,568; +7 event-log tests).
- ESP32-S3 on COM7 streaming live CSI (cb #2800, RSSI -52 dBm).
Co-Authored-By: claude-flow <ruv@ruv.net>
The Rust port at v2/ has been the primary codebase since the rename
in #427. The Python implementation at v1/ is no longer the active
target; the only load-bearing path is the deterministic proof bundle
at v1/data/proof/ (per ADR-011 / ADR-028 witness verification).
Move the whole Python tree into archive/v1/ and document the policy
in archive/README.md: no new features, bug fixes only when they affect
a still-load-bearing path (currently just the proof), CI continues to
verify the proof on every push and PR.
Path references updated in 26 files via path-pattern sed (only
matches v1/<known-child> patterns, never bare v1 or API URLs like
/api/v1/). Two double-prefix typos (archive/archive/v1/) caught and
hand-fixed in verify-pipeline.yml and ADR-011.
Validated:
- Python proof verify.py imports cleanly at archive/v1/data/proof/
(numpy/scipy still required; CI installs requirements-lock.txt
from archive/v1/ now)
- cargo test --workspace --no-default-features → 1,539 passed,
0 failed, 8 ignored (unaffected by Python tree relocation)
- ESP32-S3 on COM7 untouched (no firmware paths changed)
After-merge: contributors should re-run any local `python v1/...`
commands as `python archive/v1/...` (CLAUDE.md and CHANGELOG already
updated).
The Rust port lived two directories deep (rust-port/wifi-densepose-rs/)
without any sibling under rust-port/ that warranted the extra level.
Move the whole workspace up to v2/ to match v1/ (Python) at the same
depth and shorten every cd / build command across the repo.
git mv preserves history for all tracked files. 60 files updated for
path references (CI workflows, ADRs, docs, scripts, READMEs, internal
.claude-flow state). Two manual fixes for relative-cd paths in
CLAUDE.md and ADR-043 that became wrong after the depth change
(cd ../.. → cd ..).
Validated:
- cargo check --workspace --no-default-features → clean (after target/
nuke; the gitignored target/ was carried by the OS rename and had
hard-coded old paths in build scripts)
- cargo test --workspace --no-default-features → 1,539 passed, 0 failed,
8 ignored (same totals as pre-rename)
- ESP32-S3 on COM7 → still streaming live CSI (cb #40300, RSSI -64 dBm)
After-merge follow-up: contributors should `rm -rf v2/target` once and
let cargo regenerate from the new path.