wifi-densepose

Commit Graph

Author	SHA1	Message	Date
rUv	f756a8af49	feat(ADR-261): ruvector HNSW graph-ANN (25x measured vs linear) + honest SymphonyQG-direction refutation (#1063)
* feat(ruvector): real float HNSW + SymphonyQG-style quantized-traversal index (ADR-261)
Adds the graph-ANN index the ruvector retrieval path was missing (ADR-156 §5 #1 noted there was no HNSW baseline to measure SymphonyQG against).
- hnsw.rs: correct float HNSW (Malkov & Yashunin): multi-layer NSW graph, ef_construction/ef_search, Algorithm-4 neighbour selection, seeded-deterministic level assignment (SplitMix64, reused from rotation.rs), L2 + cosine, brute-force ground truth, full degenerate-case guards. recall@10 correctness gate >=0.95 vs brute force (L2 + cosine).
- hnsw_quantized.rs: SymphonyQG-style variant: same graph, traversal scored by cheap 1-bit Hamming over the RaBitQ Pass-2 rotated sign code, final exact-float rerank.
- ann_measure.rs: shared deterministic planted-cluster fixture + recall/QPS measurement (ann_bench_report is the ADR source of truth).
Fixes an index-out-of-bounds bug the recall gate caught: insert wired bidirectional edges before pushing the node's own link row.
+20 tests, ruvector lib 131->151, 0 failed.
Co-Authored-By: claude-flow <ruv@ruv.net>
* bench(ruvector): criterion ann_bench for HNSW vs quantized vs linear (ADR-261)
Times the same shared ann_measure fixture/indices through criterion so the bench and the report test can never measure different graphs.
Co-Authored-By: claude-flow <ruv@ruv.net>
* docs(adr-261): graph-ANN index ADR with MEASURED HNSW vs quantized verdict
ADR-261 (Accepted): float HNSW ~25x QPS over linear scan at recall >=0.99 (the baseline ADR-156 said was missing).
Honest negative: the 1-bit quantized traversal is too coarse to beat float HNSW at equal recall at N=10k (best recall 0.738, no >=0.90 equal-recall point); the SymphonyQG 3.5-17x is NOT reproduced by our 1-bit construction; expected crossover at large N + a multi-bit code.
Caveat: our HNSW + our quant, not SymphonyQG's system; direction tested, not a 1:1 reproduction.
ADR-156 §5 #1 + §8 backlog: CLAIMED -> MEASURED-direction-tested. CHANGELOG [Unreleased] entry.
Co-Authored-By: claude-flow <ruv@ruv.net>	2026-06-14 02:33:32 -04:00
rUv	91248536bc	feat(beyond-sota): ADR-156 M2: RaBitQ unbiased distance estimator (rigorous published negative on strict-K) (#1056)
* feat(ruvector): RaBitQ unbiased distance estimator (ADR-156 M2)
Implement the real Gao & Long (SIGMOD 2024) RaBitQ contribution on top of the existing Pass-2 rotation: an unbiased estimator of the inner product / squared distance recovered from the 1-bit code plus 8 B/vec per-vector side info (residual_norm + x_dot_o), used to rerank the candidate set instead of raw Hamming.
- src/estimator.rs (new): EstimatorSketch, SideInfo, EstimatorQuery, DistanceEstimator (estimate_inner_product / estimate_sq_distance / ranking_key / cosine_ranking_key), EstimatorBank (topk_estimated[_cosine], with_centroid). Zero-centroid simplification documented; paper-faithful centroid path also built.
- src/rotation.rs: extract apply_padded() (full padded FHT frame the code lives in); apply() now truncates apply_padded(). No behaviour change.
- lib.rs: export estimator types.
Additive + backward-compatible: Pass-1 Sketch / Pass-2 SketchBank / WireSketch wire format unchanged; all external callers use Pass-1 and are unaffected.
Co-Authored-By: claude-flow <ruv@ruv.net>
* test(ruvector): estimator strict-K coverage harness (ADR-156 M2)
Add measure_estimator (cosine rerank) + measure_estimator_euclidean to the coverage harness, on the BIT-IDENTICAL fixture / cluster centres / query stream / cosine ground truth as measure_pass1/measure_pass2: apples-to-apples sign-Hamming vs unbiased-estimator-rerank.
Regression tests:
- estimator_rerank_not_worse_than_sign (>= sign-only Pass-2 on a fixed fixture)
- estimator_coverage_is_deterministic
- estimator_coverage_report (--nocapture prints the strict-K table)
MEASURED strict-K (candidate_k=K=8): Pass-1 36.13% -> Pass-2-sign 46.39% -> estimator-cosine 49.71%. Still short of the ADR-084 90% strict bar; estimator reaches 95.12% at candidate_k=24 (vs sign 91.60%). Published negative.
Co-Authored-By: claude-flow <ruv@ruv.net>
* docs(ruvector): record RaBitQ estimator measured negative (ADR-156 §11, ADR-084)
- sketch_bench: estimator cosine/euclid columns in the coverage table.
- ADR-156 §11 (new): estimator formula + zero-centroid simplification stated honestly; strict-K coverage table; RESOLVED-NEGATIVE verdict (49.71% strict, short of 90%); pinning test names. §5 #2 + §10.5 updated.
- ADR-084 'Pass 2b' (new): estimator landed + measured strict-K vs the bar.
- CHANGELOG [Unreleased]: ADR-156 §11 Milestone-2 entry.
Co-Authored-By: claude-flow <ruv@ruv.net>	2026-06-13 18:24:40 -04:00
rUv	9b07dff298	feat(beyond-sota): ADR-155 metric unification + ADR-156 RaBitQ Pass-2 (honest negative + latent topk bugfix) (#1053)
* refactor(train): hoist canonical PCK/OKS to un-gated metrics_core; fold test_metrics onto production (ADR-155 M1 §8)
ADR-155 §8 deferred item: test_metrics.rs reference kernels validated production against their OWN reimplementation, a test that cannot catch a canonical-impl bug (both could be wrong the same way).
- Extract canonical_torso_size / pck_canonical / oks_canonical / sigmas / bounding_box_diagonal into a new NON-tch-gated `metrics_core` module, so the single metric definition is reachable under `cargo test --no-default-features` (the `metrics` module is tch-gated). `metrics` re-exports every item → still exactly ONE implementation.
- Rewrite tests/test_metrics.rs to assert the PRODUCTION pck_canonical / oks_canonical equal hand-computed fixtures (not a reimplementation): canonical_pck_matches_hand_computed_fixture (corr=3/total=4/pck=0.75), hip↔hip normalizer pin, zero-visible⇒0.0, OKS perfect⇒1.0, fake-Gold pin.
- Keep an INDEPENDENT raw-threshold reference kernel only as a differential cross-check: test_kernel_agrees_with_canonical asserts it AGREES with canonical where torso==1.0 (genuine cross-check, not duplication).
Grade: MEASURED. test_metrics 10→12 tests, 0 failed.
Co-Authored-By: claude-flow <ruv@ruv.net>
* fix(sensing-server): relabel divergent live PCK/OKS so they're never conflated with canonical (ADR-155 M1 §2.1/§8 Goal C)
Goal C named training_api.rs:804 (torso-HEIGHT PCK). Auditing it surfaced TWO findings the ADR-155 §1 table missed:
1. training_api.rs is an ORPHAN file: not declared `mod` in lib.rs OR main.rs, so it does NOT compile into the crate. It does not drive the live server.
2. The REAL live `best_pck`/`best_oks` (main.rs training path → RVF metadata JSON read by model_manager.rs) come from trainer.rs:
- `pck_at_threshold` = RAW-threshold PCK, NO torso normalization (the most divergent kind), printed/serialized as bare "PCK@0.2".
- `oks_map` calls `oks_single(area=1.0)` = the EXACT fake-Gold pattern ADR-155 §2.1 claimed closed elsewhere; still live here, inflating best_oks.
Resolution = RELABEL (torso/raw math is load-bearing on different data; the pub fns can't be renamed without breaking API; sensing-server has no train/ndarray dep). Honest unify is a tracked §8 backlog item.
- training_api.rs: `compute_pck` → `compute_pck_torso_height` + divergence doc; val_pck/best_pck/val_oks struct fields documented as torso-HEIGHT proxies; logs say `pck_torso_h@0.2`. Test torso_pck_is_labelled_distinctly_from_canonical.
- trainer.rs (LIVE): `pck_at_threshold` documented raw-unnormalized; `oks_map` area=1.0 flagged fake-Gold; test pck_at_threshold_is_raw_unnormalized_not_canonical.
- main.rs: live print relabelled `pck_raw@0.2` / `oks_map(area=1.0 proxy)`.
No wire-format field renames (back-compat); no pub-API rename (no silent break).
Grade: MEASURED (relabel + divergence pinned). sensing-server 450→451 lib tests, 0 failed.
Co-Authored-By: claude-flow <ruv@ruv.net>
* docs(adr-155): mark §8 metric items RESOLVED + audit map + honest §1 under-count correction (M1b Goals A/D)
- §8.1: full PCK/OKS audit map (every def: file:line, basis, canonical/legacy/distinct), the two §8 items marked RESOLVED with resolution+why.
- Honest finding: §1's "seven divergent metrics" was an UNDER-count: sensing-server's LIVE trainer.rs has a raw-unnormalized PCK and an area=1.0 fake-Gold OKS the table omitted, and the file §8 named (training_api.rs) is orphaned dead code. §9 honest-limits updated.
- Goal D: metrics.rs _v2 variants confirmed caller-less + deprecated; noted for future cleanup, NOT deleted (public API, tch-gated).
- CHANGELOG [Unreleased] Fixed entry.
Co-Authored-By: claude-flow <ruv@ruv.net>
* feat(ruvector): RaBitQ Pass-2 randomized rotation + topk bugfix (ADR-156 §8)
Implements the deferred "Multi-bit / Extended RaBitQ Pass 2" backlog item from ADR-156 §8: a deterministic randomized orthogonal rotation applied before sign-quantization, the published RaBitQ construction (Gao & Long, SIGMOD 2024).
Rotation construction: Fast Hadamard Transform + seeded ±1 sign flips ("HD" / randomized Hadamard), O(d log d) time and O(d) memory; a dense d×d rotation is O(d²) and infeasible at the 65,535-d the wire format provisions for. Pads to the next power of two; SplitMix64 seeds the sign stream so index-time and query-time rotations are bit-identical.
API is additive and backward-compatible: Pass 1 (`from_embedding`) is untouched; Pass 2 is opt-in via `Sketch::from_embedding_rotated` and `SketchBank::with_rotation` (+ `insert_embedding` / `topk_embedding` / `novelty_embedding` helpers that rotate consistently). Default behaviour is unchanged.
While building the Pass-2 coverage harness, found and fixed a PRE-EXISTING correctness bug in `SketchBank::topk`: the n>k heap path used `BinaryHeap<Reverse<(d,id)>>` (a min-heap) but treated its peek as the max, so it returned the k FARTHEST sketches as "nearest". The shipped unit tests only exercised the n≤k fast path, so it went unnoticed. Fixed to a plain max-heap; pinned by `topk_heap_path_returns_nearest` and `tight_clusters_give_high_coverage_with_overfetch` (the latter measured 0.072 on the old code).
New tests (+17, 100→117 in the crate): rotation determinism/norm-preservation (`rotation_is_deterministic_for_seed`, `rotation_preserves_norm`), Pass-2 shape-compatibility, `pass2_coverage_not_worse_than_pass1`, and a deterministic coverage report.
MEASURED top-K coverage (anisotropic planted-cluster fixture, cosine ground truth; dim=128 N=2048 K=8 64 clusters noise=0.35 128 queries):
candidate_k=K=8 : Pass1 36.13% -> Pass2 46.39% (both << 90% bar)
candidate_k=24  : Pass1 83.89% -> Pass2 91.60% (Pass2 clears 90%)
candidate_k=32  : Pass1/Pass2 100%
Honest result: rotation consistently helps (+10pp at strict K), but neither pass clears the ADR-084 90% bar at candidate_k==K on this distribution. Pass 2 reaches 90% only with ~3x over-fetch (the ADR-084 "candidate set" deployment pattern). Multi-bit Pass 3 evaluated separately.
Co-Authored-By: claude-flow <ruv@ruv.net>
* feat(ruvector): multi-bit Pass-3 experiment + ADR-156/084 measured results
Adds the multi-bit half of the ADR-156 §8 "Multi-bit / Extended RaBitQ" item as a MEASURED experiment (coverage::measure_multibit): rotate, then b-bit uniform scalar-quantize each coord, rank by L1 over codes, the natural multi-bit generalization of hamming. Measures the bit/coverage tradeoff the backlog item asked for.
MEASURED at the strict bar (candidate_k=K=8, anisotropic planted-cluster fixture, cosine ground truth):
Pass1 (1-bit, no rot)  36.13%  16 B/vec
Pass2 (1-bit, rot)     46.39%  16 B/vec
Pass3 (rot, 2-bit)     54.39%  32 B/vec
Pass3 (rot, 3-bit)     66.70%  48 B/vec
Pass3 (rot, 4-bit)     74.22%  64 B/vec
Honest: multi-bit monotonically helps but even 4-bit (4x memory) reaches only 74% at the strict bar; neither rotation nor <=4-bit multi-bit clears the strict-K 90% bar on this distribution. The bar is met via over-fetch (Pass2 @ candidate_k=24).
Tests: multibit_tradeoff_report, multibit_1bit_matches_pass2_approx (+ sanity that 1-bit ~= Pass-2).
Docs:
- ADR-156 §8 item #2 marked RESOLVED-PARTIAL; §5 #2 grade CLAIMED -> MEASURED-on-our-hardware; new §10 with full measured tables, the topk bugfix disclosure, and graded deferred sub-items.
- ADR-084: "Pass 2" section answering the rotation open-question with measured numbers + the topk bug note.
- CHANGELOG [Unreleased]: Added (Pass-2 milestone) + Fixed (topk heap).
Co-Authored-By: claude-flow <ruv@ruv.net>	2026-06-13 16:02:18 -04:00
ruv	0ce2ac6440	docs(adr): ADR-156 RuVector/Fusion beyond-SOTA sweep, Milestone 2
Documents Milestone 2 of the beyond-SOTA sweep on the cross-viewpoint fusion path: four correctness/integrity/security fixes (each pinned by a bug-catching test), one MEASURED hot-path perf win, and the ANN/fusion SOTA landscape graded MEASURED/CLAIMED/data-gated.
- Integrity: honest dimensionless GDOP (was RMSE mislabelled); canonical wrapped angular distance (disclosed numeric no-op under cos kernel; landed for contract/single-source-of-truth, not claimed as a behaviour change).
- Security: crafted-index/zero-bin DoS panics closed on the multistatic path.
- Perf: fuse() double-clone eliminated, ~2.17x on marshalling (MEASURED).
- SOTA landscape: SymphonyQG (#1, CLAIMED, reproduction deferred) + multi-bit/Extended RaBitQ (#2, accepted near-term, the sketch.rs Pass-2); GraphPose-Fi learned fusion head documented ACCEPTED-FUTURE, data-gated per ADR-152 (b); CRB/sensor-placement investigated, no action (already SOTA).
- Deferred backlog (§8): nothing silently dropped.
Validation: cargo test --workspace --no-default-features = 3050 passed / 0 failed; python verify.py = VERDICT PASS.
Co-Authored-By: claude-flow <ruv@ruv.net>	2026-06-11 20:23:43 -04:00
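
Commit 9b07dff298 describes a top-k bug where a min-heap's peek was treated as the maximum, and a fix that switches to a plain max-heap. The sketch below illustrates that bounded max-heap pattern in isolation. It is not the repository's `SketchBank::topk`: the function name `topk_nearest` and the `(distance, id)` tuple layout are assumptions made for illustration.

```rust
use std::collections::BinaryHeap;

/// Hypothetical stand-in for a sketch bank's top-k query (illustrative only).
/// `items` holds (distance, id) pairs; returns the `k` pairs with the
/// smallest distance, sorted nearest first.
pub fn topk_nearest(items: &[(u32, u32)], k: usize) -> Vec<(u32, u32)> {
    if k == 0 {
        return Vec::new();
    }
    // Plain max-heap bounded to k entries: peek() is the FARTHEST candidate
    // kept so far, which is exactly the one to evict when a nearer one arrives.
    let mut heap: BinaryHeap<(u32, u32)> = BinaryHeap::with_capacity(k.min(items.len()));
    for &(d, id) in items {
        if heap.len() < k {
            heap.push((d, id));
        } else if let Some(&worst) = heap.peek() {
            if (d, id) < worst {
                heap.pop();
                heap.push((d, id));
            }
        }
    }
    // into_sorted_vec() yields ascending order, i.e. nearest first.
    heap.into_sorted_vec()
}
```

A regression test for this pattern needs more items than `k`, since the commit notes that the shipped tests only exercised the n≤k fast path and so missed the bug.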

4 Commits
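
The Pass-2 rotation in commit 9b07dff298 is described as a Fast Hadamard Transform with seeded ±1 sign flips, padded to the next power of two, followed by sign quantization and Hamming ranking. A minimal sketch of that construction follows, assuming a 1/sqrt(n) normalization and one SplitMix64 draw per coordinate. All function names here (`rotate_padded`, `sign_code`, `hamming`) are hypothetical and do not reflect the crate's actual API or bit layout.

```rust
/// One SplitMix64 step (standard published constants).
fn splitmix64(state: &mut u64) -> u64 {
    *state = state.wrapping_add(0x9E37_79B9_7F4A_7C15);
    let mut z = *state;
    z = (z ^ (z >> 30)).wrapping_mul(0xBF58_476D_1CE4_E5B9);
    z = (z ^ (z >> 27)).wrapping_mul(0x94D0_49BB_1331_11EB);
    z ^ (z >> 31)
}

/// In-place unnormalized fast Walsh-Hadamard transform, O(n log n).
/// The slice length must be a power of two.
fn fwht(v: &mut [f32]) {
    let n = v.len();
    debug_assert!(n.is_power_of_two());
    let mut h = 1;
    while h < n {
        for block in (0..n).step_by(2 * h) {
            for i in block..block + h {
                let (a, b) = (v[i], v[i + h]);
                v[i] = a + b;
                v[i + h] = a - b;
            }
        }
        h *= 2;
    }
}

/// Hypothetical "HD" rotation: zero-pad, flip signs from a seeded stream,
/// apply the Hadamard transform, then scale so the map is orthogonal.
pub fn rotate_padded(x: &[f32], seed: u64) -> Vec<f32> {
    let n = x.len().max(1).next_power_of_two();
    let mut v = vec![0.0f32; n];
    v[..x.len()].copy_from_slice(x);
    let mut state = seed;
    for c in v.iter_mut() {
        // Assumption: the low bit of each draw picks the sign.
        if splitmix64(&mut state) & 1 == 1 {
            *c = -*c;
        }
    }
    fwht(&mut v);
    let scale = 1.0 / (n as f32).sqrt();
    for c in v.iter_mut() {
        *c *= scale;
    }
    v
}

/// 1-bit sign quantization packed into u64 words (bit set when coord >= 0).
pub fn sign_code(v: &[f32]) -> Vec<u64> {
    let mut words = vec![0u64; (v.len() + 63) / 64];
    for (i, &c) in v.iter().enumerate() {
        if c >= 0.0 {
            words[i / 64] |= 1u64 << (i % 64);
        }
    }
    words
}

/// Hamming distance between two packed sign codes of equal length.
pub fn hamming(a: &[u64], b: &[u64]) -> u32 {
    a.iter().zip(b).map(|(x, y)| (x ^ y).count_ones()).sum()
}
```

Because the sign stream depends only on the seed, rotating the same vector at index time and at query time gives identical output, which is the property the commit's `rotation_is_deterministic_for_seed` test name suggests. At 128 dimensions the packed code occupies two `u64` words, matching the 16 B/vec figure quoted for the 1-bit passes.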