Commit Graph

1 Commits

Author SHA1 Message Date
ruv a92b043143 perf(ruvector): eliminate fuse() double-clone (~2.17x marshalling) + bench (ADR-156 §2.4, §4)
MultistaticArray::fuse / fuse_ungated cloned every viewpoint embedding twice per
fusion (once into `extracted`, again when building the attention input). Now the
embeddings are MOVED out of `extracted` (one clone per viewpoint instead of two),
capturing geometry/ids by Copy in the same pass. Correctness-neutral — all 100
viewpoint/mat lib tests pass unchanged.

MEASURED (new benches/fusion_bench.rs, embedding_extract A/B, 8 vp x 128-d):
  before_double_clone 1.0029 us -> after_single_clone 461.6 ns  (~2.17x)
End-to-end fusion_pipeline (8 vp): 202 us — marshalling is <1% of fusion
(n*n attention dominates), so end-to-end win is modest; the A/B isolates the
clone elimination. Reproduce:
  cargo bench -p wifi-densepose-ruvector --bench fusion_bench

Co-Authored-By: claude-flow <ruv@ruv.net>
2026-06-11 20:23:27 -04:00