wifi-densepose

ruv 247794a2c5 bench(temporal): empirical sparse-vs-dense speedup curve (ADR-096 §3.1, #513)

Validates the central performance claim of ADR-096 with a runnable benchmark. Single-run wall-clock, pure-Rust vs pure-Rust on x86_64 host. Real numbers, not just analytic argument.

Results (N=64..1024):

| N    | Dense (ms) | Sparse (ms) | Speedup |
|------|-----------:|------------:|--------:|
| 64   |      0.262 |       0.141 |   1.86× |
| 128  |      1.120 |       0.335 |   3.34× |
| 256  |      4.129 |       0.711 |   5.81× |
| 512  |     19.230 |       2.356 |   8.16× |
| 1024 |     71.904 |       3.389 |  21.21× |

Asymptotic check: 64→1024 is 16× more tokens. Dense's 274× cost growth matches N² (256× = 16²). Sparse's 24× growth matches N log N (16 · log(1024)/log(64) ≈ 27). The complexity claim is empirically supported.

ADR-096 §3.1 honest-framing paragraph predicted N=64 would be overhead-bound; we measured 1.86× there, consistent with the ADR's warning that AETHER's current `window_frames=100` default is below the inflection point where sparse pays.

What this commit adds:
- examples/bench_speedup.rs: measures dense_attention (upstream reference), AetherTemporalHead.forward (this crate's wrapper), and SubquadraticSparseAttention.forward (raw, to confirm the wrapper isn't introducing overhead; it isn't, the two are within noise).
- benches_results.md: captured table + asymptotic check + caveats (config used, what the benchmark doesn't measure, how to run).

Run it:

    cargo run -p wifi-densepose-temporal --example bench_speedup --release

What's NOT measured here:
- Decode-step latency (already proved correct at last-token, not yet timed against a hypothetical O(N²) dense decode; they're structurally not comparable anyway).
- Memory footprint of KvCache + FP16 (matters on firmware, not host).
- GQA dispatch: this bench uses MHA shape so dense and sparse operate on identical tensors. Real AETHER will want MQA per TemporalHeadConfig::default_aether(), which halves KV memory.

Co-Authored-By: claude-flow <ruv@ruv.net>		2026-05-08 12:02:36 -04:00
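
The asymptotic check in the commit message can be reproduced by hand. The sketch below is not part of the repository: it hard-codes the five timings from the results table and recomputes the measured growth from N=64 to N=1024 alongside the growth predicted by N² and N log N. The names `RESULTS`, `speedup`, `quadratic_growth`, `n_log_n_growth`, and `measured_growth` are hypothetical helpers introduced for illustration only.

```rust
/// (N, dense ms, sparse ms), copied from the results table in the commit message.
const RESULTS: [(u32, f64, f64); 5] = [
    (64, 0.262, 0.141),
    (128, 1.120, 0.335),
    (256, 4.129, 0.711),
    (512, 19.230, 2.356),
    (1024, 71.904, 3.389),
];

/// Speedup of sparse over dense for a single row of the table.
fn speedup(dense_ms: f64, sparse_ms: f64) -> f64 {
    dense_ms / sparse_ms
}

/// Cost growth predicted by O(N^2) when going from n0 to n1 tokens.
fn quadratic_growth(n0: u32, n1: u32) -> f64 {
    let ratio = f64::from(n1) / f64::from(n0);
    ratio * ratio
}

/// Cost growth predicted by O(N log N) when going from n0 to n1 tokens.
/// The log base cancels in the ratio, so natural log is fine.
fn n_log_n_growth(n0: u32, n1: u32) -> f64 {
    let (a, b) = (f64::from(n0), f64::from(n1));
    (b * b.ln()) / (a * a.ln())
}

/// Measured (dense, sparse) cost growth between the first and last table rows.
fn measured_growth() -> (f64, f64) {
    let (_, dense_first, sparse_first) = RESULTS[0];
    let (_, dense_last, sparse_last) = RESULTS[RESULTS.len() - 1];
    (dense_last / dense_first, sparse_last / sparse_first)
}
```

Calling `measured_growth()` and comparing its two values against `quadratic_growth(64, 1024)` and `n_log_n_growth(64, 1024)` gives the same comparison the commit message makes; `speedup` applied to each row of `RESULTS` recovers the Speedup column. Because the timings are single-run wall-clock numbers, small deviations from the predicted ratios are expected.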
..
.claude-flow	chore(repo): rename rust-port/wifi-densepose-rs → v2/ (flatten to one level) (#427)	2026-04-25 21:28:13 -04:00
crates	bench(temporal): empirical sparse-vs-dense speedup curve (ADR-096 §3.1, #513)	2026-05-08 12:02:36 -04:00
data	chore(repo): rename rust-port/wifi-densepose-rs → v2/ (flatten to one level) (#427)	2026-04-25 21:28:13 -04:00
docs	chore(repo): rename rust-port/wifi-densepose-rs → v2/ (flatten to one level) (#427)	2026-04-25 21:28:13 -04:00
examples	chore(repo): rename rust-port/wifi-densepose-rs → v2/ (flatten to one level) (#427)	2026-04-25 21:28:13 -04:00
patches/ruvector-crv	chore(repo): rename rust-port/wifi-densepose-rs → v2/ (flatten to one level) (#427)	2026-04-25 21:28:13 -04:00
Cargo.lock	feat(temporal): scaffold wifi-densepose-temporal crate (ADR-096 Phase 1-3, #513)	2026-05-08 09:26:18 -04:00
Cargo.toml	feat(temporal): scaffold wifi-densepose-temporal crate (ADR-096 Phase 1-3, #513)	2026-05-08 09:26:18 -04:00