refactor(beyond-sota): ADR-155 M2 — host-verifiable §8 closeout (7 de-magic, 9 boundary tests, native-conv honest-null) (#1059)
* refactor(train): ADR-155 M2 §8 — de-magic train non-tch tuning constants + boundary tests Lift bare numeric literals used as thresholds / guard epsilons in the non-tch (host-verifiable) train surface into named, documented consts and pin each set with a *_consts_unchanged_from_literals test. Values are bit-identical to the prior inline literals — cleanup, no behaviour change. De-magicked (const + pin test): - metrics_core.rs: VISIBILITY_THRESHOLD (0.5), MIN_REFERENCE_EXTENT (1e-6), OKS_FALLBACK_SIGMA (0.07) - ruview_metrics.rs: NUM_KEYPOINTS (17), VISIBILITY_THRESHOLD (0.5), PCK_THRESHOLD (0.2), MIN_BBOX_DIAG (1e-3), MIN_DURATION_MINUTES (1e-6) - subcarrier.rs: SPARSE_BASIS_SIGMA (0.15), SPARSE_BASIS_THRESHOLD (1e-4), SPARSE_REGULARIZATION_LAMBDA (0.1), SPARSE_COO_PRUNE_EPS (1e-8), SPARSE_SOLVER_TOL (1e-5 f64), SPARSE_SOLVER_MAX_ITERS (500) - eval.rs: MIN_POSITIVE_MPJPE (1e-10) - domain.rs: LAYER_NORM_EPS (1e-5) - virtual_aug.rs: BOX_MULLER_U1_FLOOR (1e-10), MIN_ROOM_SCALE (1e-10) Boundary / characterization tests (pin CURRENT behaviour): - visibility_threshold_boundary_is_inclusive (>= 0.5 at the edge) - degenerate_extent_below_floor_is_unscoreable ((0,0,0.0)/0.0, not perfect) - tracking_zero_duration_does_not_divide_by_zero - oks_short_array_is_bounded_at_keypoint_count (16 rows, no panic) - compute_interp_weights_single_target_is_index_zero (target_sc==1) - sparse_interp_single_target_is_finite - domain_gap_infinite_when_in_domain_perfect_but_cross_nonzero - domain_gap_unity_when_everything_perfect - augment_frame_zero_room_scale_passes_amplitude_finite Doc-only (no behaviour change): - rapid_adapt.rs: correct module-doc O(eps) -> O(eps^2) for central differences - geometry.rs: add # Panics to DeepSets::encode (documents existing assert!) train --no-default-features: 191 lib (was 176), 303 total (was 288), 0 failed. Co-Authored-By: claude-flow <ruv@ruv.net> * feat(nn): ADR-155 M2 §3 — pure-Rust LinearHead::try_new input guard + de-magic softplus threshold ADR-155 §3 found rf_encoder.rs has no adversarial checkpoint-deserialization assert — its assert_eq!s in LinearHead::new are construction-time API contracts on programmer-supplied vectors. This adds the honest, in-scope improvement the M2 task allows: a pure-Rust *fallible* constructor so weights from an untrusted / deserialized checkpoint can be shape-validated without panicking. - Add RfHeadError (WeightShape / BiasShape / VarWeightShape) + Display + Error. - Add LinearHead::try_new returning Result<Self, RfHeadError>; on success the head is byte-identical to LinearHead::new. new() is unchanged (still asserts; now documents # Panics and points to try_new) — no behaviour change for existing callers. - De-magic softplus's bare 20.0 overflow threshold into SOFTPLUS_LINEAR_THRESHOLD (value unchanged) + pin test. Tests: try_new_accepts_valid_and_rejects_each_bad_shape (valid == new forward; each bad shape → typed error), softplus_threshold_unchanged_from_literal. nn --no-default-features lib: 37 passed (was 35), 0 failed. Co-Authored-By: claude-flow <ruv@ruv.net> * perf(nn): ADR-155 M2 §4 — native-conv bench-first → MEASURED-INCONCLUSIVE (no perf change shipped) The §8 "native-conv naive-loop rewrite" backlog item: DensePoseHead:: apply_conv_layer is a pure-Rust 6-nested-loop conv (benchable on this host, not tch/ort-gated). Bench-first per the §0 PROOF discipline. - Add committed criterion bench benches/native_conv_bench.rs measuring forward() through the naive conv on representative single-layer configs (--no-default- features; no ort download). - Prototyped a bit-identical range-clamped variant (hoist the per-tap in-bounds branch by pre-clamping kh/kw ranges; same ic→kh→kw MAC order ⇒ bit-identical). MEASURED before/after on this host: ~35% faster on padding-heavy small-channel maps (4.40→2.84 ms) but a ~3% *regression* on channel-heavy maps (11.09→11.48 ms), all inside a ±20% run-to-run noise floor. Verdict: INCONCLUSIVE — the benefit is not robustly positive, so the rewrite is NOT shipped and NOT a fabricated speedup. Reverted to the naive loop; honestly deferred (ADR-155 §8). - Add native_conv_matches_reference: a hand-computed characterization anchor (1×1 = scalar MAC; same-padded 3×3 ones = truncated-window sums 9/6/4) pinning CURRENT conv behaviour for any future rewrite. nn --no-default-features lib: 38 passed (was 37), 0 failed. No behaviour change. Co-Authored-By: claude-flow <ruv@ruv.net> * docs(adr-155): M2 §8.2 — enumerated host-verifiable P3 backlog clearance + CHANGELOG Replace the §8 bulk "~40 lower-severity findings" line with the real, enumerated M2 resolution (§8.2): 7 de-magicked (const + pin == prior literal), 9 boundary tests, 1 input guard (rf_encoder try_new), 2 doc-only, 1 perf bench-first MEASURED-INCONCLUSIVE (not shipped). Mark native-conv + rf_encoder RESOLVED; state which §8 items stay data-gated (GraphPose-Fi/INT4/CSI-JEPA) or tch-gated (proof/trainer/model panic sites, metrics *_v2 dead code) and ONNX read-lock upstream-gated — blocked, not dropped. Declare the non-tch-verifiable subset of §8 cleared. Validation: train --no-default-features 303 passed (was 288); nn lib 38 (was 35); workspace --no-default-features 3,293 passed, 0 failed; Python proof VERDICT PASS, hash f8e76f21…46f7a UNCHANGED bit-exact. Co-Authored-By: claude-flow <ruv@ruv.net>
This commit is contained in:
parent
8c24b8bdfe
commit
1d12e8831a
|
|
@ -31,6 +31,7 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
|
|||
- **Mesh partition risk now demotes the privacy class and is witnessed (ADR-032).** The dynamic min-cut guard's `at_risk` signal was advisory-only (it fed the recalibration advisor). It now also contributes to the ADR-141 privacy demotion alongside fusion- and array-level contradictions: a mesh close to partitioning makes the fused belief less trustworthy, so the cycle emits at a more restricted class (monotonic — information only removed). Because `effective_class` feeds the BLAKE3 witness, a fragmenting array now shifts the witness — partition risk is auditable, not just logged. The mesh computation moved ahead of the demotion step in `process_cycle`; new `mesh_guard_mut()` exposes risk-threshold tuning. Test proves a forced-risk 3-node cycle demotes PrivateHome Anonymous→Restricted and shifts the witness vs a clean *same-topology* baseline (the only delta between the two cycles is the forced risk).
|
||||
|
||||
### Added
|
||||
- **ADR-155 Milestone-2 — cleared the host-verifiable subset of the §8 P3 backlog in `wifi-densepose-train` (+ the pure-Rust `rf_encoder.rs`/`densepose.rs` the §3/§4 items named).** Mirrors the ADR-154 M3 cleanup discipline. **Honest enumeration first (grep, not the ADR's "~40" estimate):** the actual non-tch train/nn surface is smaller — **7 de-magicked (const + `*_consts_unchanged_from_literals` pin == prior literal), 9 boundary/characterization tests, 1 added input guard (`rf_encoder::LinearHead::try_new`) + test, 2 doc-only fixes, 1 perf item bench-first → MEASURED-INCONCLUSIVE (not shipped)**. **This is cleanup — no operating value or behaviour changed:** each lifted literal is bit-identical to its prior value, each boundary test pins CURRENT behaviour. De-magicked: `metrics_core.rs` (`VISIBILITY_THRESHOLD`/`MIN_REFERENCE_EXTENT`/`OKS_FALLBACK_SIGMA`), `ruview_metrics.rs` (`NUM_KEYPOINTS`/`VISIBILITY_THRESHOLD`/`PCK_THRESHOLD`/`MIN_BBOX_DIAG`/`MIN_DURATION_MINUTES`), `subcarrier.rs` (6 `SPARSE_*` consts), `eval.rs` (`MIN_POSITIVE_MPJPE`), `domain.rs` (`LAYER_NORM_EPS`), `virtual_aug.rs` (`BOX_MULLER_U1_FLOOR`/`MIN_ROOM_SCALE`), `rf_encoder.rs` (`SOFTPLUS_LINEAR_THRESHOLD`). **§3 `rf_encoder.rs`:** added a pure-Rust fallible `LinearHead::try_new` → typed `RfHeadError` so untrusted/deserialized checkpoint weights can be shape-validated without the `new()` panic (`new` unchanged; additive). **§4 native-conv:** `densepose.rs::apply_conv_layer` (pure-Rust naive loop) was benched (committed `benches/native_conv_bench.rs`); a bit-identical range-clamped rewrite measured ~35% faster on padding-heavy small-channel maps but ~3% *slower* on channel-heavy maps, all inside a ±20% host-noise floor — **MEASURED-INCONCLUSIVE, so NOT shipped** (no fabricated number), characterized by `native_conv_matches_reference` and honestly deferred. **Skipped honestly (not-real / already-handled):** `ablation.rs` (NaN-sort + boundaries already fixed/tested in M1), `signal_features.rs` (consts already named, n=0 tested), `mae.rs` (no bare guard literals). `wifi-densepose-train --no-default-features`: **303 passed** (was 288, +15), 0 failed; `wifi-densepose-nn --no-default-features` lib: **38** (was 35, +3). Workspace `--no-default-features`: GREEN (single clean run). Python proof **VERDICT: PASS**, hash **`f8e76f21…46f7a` UNCHANGED, bit-exact** (asserted — the metrics path is off the deterministic signal proof path). **Remaining §8 backlog stays deferred-not-dropped:** GraphPose-Fi / ONNX-INT4 / CSI-JEPA (data/model-gated), ONNX read-lock (upstream `ort`-gated), tch-gated panic sites in `proof.rs`/`trainer.rs`/`model.rs` + `metrics.rs` `*_v2` dead-code (tch-gated — need a libtorch host). **The non-tch-verifiable subset of §8 is now cleared.**
|
||||
- **ADR-154 Milestone-3 — cleared the §7.4 row #21–45 P3 backlog in `wifi-densepose-signal` (the lumped "remaining clarity/doc/magic-constant/missing-boundary-test findings across `ruvsense/*`, `features.rs`, `motion.rs`").** Honest enumeration first (grep, not the ADR's estimate): the lumped row was **~25 findings → 22 real, de-magicked across 11 modules; 6 boundary/characterization tests added; ~4 doc-only; the rest were already-handled or not-real and are reported as such** (the "row #21–45" count was an estimate — there were not 25 *distinct* magic constants left after M0–M2). **This is cleanup — no operating value or behaviour changed:** every de-magicked literal becomes a named, documented EMPIRICAL-DEFAULT const that **equals the prior literal exactly** (each module ships a `*_consts_unchanged_from_literals` pin test), and every boundary test pins **current** behaviour so a future retune is a visible, tested change. Modules touched: `motion.rs` (#18, fusion weights/normalization/adaptive-threshold consts + 5 tests), `gesture.rs` (#12, `euclidean_distance` length-mismatch `debug_assert` documenting the silent-truncation contract + DTW n=0/m=0 boundary), `longitudinal.rs` (drift thresholds 7-day/2σ/3-day/7-day/EMA + day-6/7 + zero-vector cosine), `cross_room.rs`/`multiband.rs`/`intention.rs`/`hampel.rs` (division-guard epsilons + zero-norm/zero-variance/zero-MAD boundary + `half_window==0` error path), `rf_slam.rs` (`NS_PER_DAY` + fixed-map defaults + zero-span guard), `attractor_drift.rs` (buffer/recent-window consts + documented the implicit `recent.len()≥1` divide-safety + `min_observations` off-by-one boundary), `coherence.rs` (#9 completion — variance-floor + default-decay), `calibration.rs` (#2 — `DEFAULT_MIN_FRAMES` deduped across 4 tier constructors + motion/subtract thresholds), `fusion_quality.rs` (contradiction penalty/bounds + n=0 identity), `temporal_gesture.rs` (confidence epsilon + quantization scale). **A "magic" the agents flagged that was NOT real:** an `attractor_drift.rs:301` "divide-by-zero" is unreachable (the `count < min_observations` guard guarantees `recent.len()≥1`) — documented + boundary-tested rather than guarded, per the no-behaviour-change rule. Signal crate lib `--no-default-features`: **476 passed, 0 failed, 1 ignored**; `--no-default-features --features cir`: **476 passed, 0 failed** (plain `--features cir` is unbuildable on this Windows host — the default `eigenvalue` feature pulls `openblas-src`, the same BLAS gate documented in M2 #8). Workspace `--no-default-features`: **3,275 / 0 failed** (single clean run). Python proof **VERDICT: PASS**, hash **`f8e76f21…46f7a` UNCHANGED, bit-exact** (asserted explicitly — these modules are off the deterministic PSD/Doppler proof path, and the de-magicked consts are bit-identical regardless). **This clears ADR-154's §7.4 deferred backlog to zero across M0–M3.**
|
||||
- **ADR-154 Milestone-2 — bench-first P2 perf subset + missing boundary tests (`wifi-densepose-signal`, §7.4 #5/#6/#7/#8/#14/#16/#19/#20).** PROOF discipline (ADR-154 §0): every perf item was **benched before being touched** (new committed `benches/dsp_perf_bench.rs`, criterion, this Windows box); only the one item the bench proved hot was optimized, the rest are committed MEASURED-NULLs — a benched null is the proof the micro-opt was unnecessary, the §5.1 "already amortized" pattern. Every behaviour-changing edit is pinned bit-identical (or documented-tolerance). Signal crate lib `--no-default-features`: **447 passed, 0 failed, 1 ignored**; `--features cir`: **447 passed, 0 failed**.
|
||||
- **#20 MEASURED-HOT, optimized (bit-identical).** `compute_multi_subcarrier_spectrogram` re-planned a fresh `FftPlanner` for *every* subcarrier (via `compute_spectrogram`). Hoisted the plan + window out of the per-subcarrier loop (new `compute_spectrogram_with_plan` core; `compute_spectrogram` delegates, unchanged). **56-subcarrier: 467.88 µs → 254.75 µs = 1.84×** (window 128); **627.27 µs → 448.39 µs = 1.40×** (window 256). Bit-identical via `multi_subcarrier_hoisted_plan_bit_identical` (`f64::to_bits` of every value across all 4 window functions × {power,magnitude}). The §7.4 intro's predicted "most likely real win" — confirmed.
|
||||
|
|
|
|||
|
|
@ -187,13 +187,41 @@ The gap review surfaced ~60 findings; this milestone scoped to the provable inte
|
|||
- **GraphPose-Fi graph decoder** — build the §5 top candidate (ACCEPTED-future, not built).
|
||||
- **ONNX INT4** quantization; **CSI-JEPA vs MAE** A/B; the rest of the §5 roadmap.
|
||||
- **ONNX read-lock concurrency win** — blocked on an `ort` release exposing `&self` `Session::run` (§4.2); harness already committed.
|
||||
- **native-conv naive-loop** perf rewrite (§4).
|
||||
- **`rf_encoder.rs` `assert_eq!`-on-checkpoint** and any other **tch-gated** panic-on-input sites — require a libtorch host to compile/verify (`model.rs` `amp_fc1` unbounded alloc is *indirectly* guarded by the new `config.validate()` upper bounds, but a direct guard + test is deferred).
|
||||
- ~~**native-conv naive-loop** perf rewrite (§4).~~ — **RESOLVED in Milestone-2 (see §8.2): bench-first → MEASURED-INCONCLUSIVE, no perf change shipped.**
|
||||
- ~~**`rf_encoder.rs` `assert_eq!`-on-checkpoint**~~ — **RESOLVED in Milestone-2 (see §8.2): a pure-Rust fallible `LinearHead::try_new` guard was added.** Any genuine **tch-gated** panic-on-input sites remain deferred — they require a libtorch host to compile/verify (`model.rs` `amp_fc1` unbounded alloc is *indirectly* guarded by the new `config.validate()` upper bounds, but a direct guard + test is deferred).
|
||||
- ~~**`sensing-server/training_api.rs` PCK**~~ — **RESOLVED in Milestone-1b (see §8.1, Goal C).** Relabelled (not unified) — and the audit found the *real* live divergence is in `trainer.rs`, not the orphaned `training_api.rs`.
|
||||
- ~~**`test_metrics.rs` reference kernels**~~ — **RESOLVED in Milestone-1b (see §8.1, Goal B).** Canonical core hoisted to an un-gated module; the integration test now validates the production functions against hand-computed fixtures + a differential cross-check.
|
||||
- **`metrics.rs` `compute_pck_v2`/`compute_oks_v2`/`MetricsAccumulatorV2`/`evaluate_dataset_v2`/`hungarian_assignment_v2`** — confirmed to have **zero external callers** (only `evaluate_dataset_v2`→`MetricsAccumulatorV2` internally). They are already `#[deprecated]` and route through canonical, so they are not a *divergent-definition* risk, only dead weight. Left in place this pass (public API in a tch-gated module; deleting needs a deprecation-cycle + tch host to verify) — flagged here for a future cleanup, NOT deleted silently.
|
||||
- **`sensing-server/trainer.rs` `pck_at_threshold` (raw) + `oks_map(area=1.0)` and the `training_bench.rs` raw kernel** — relabelled in Milestone-1b (§8.1); true unification onto `pck_canonical`/`oks_canonical` (needs a torso scale + the train crate as a sensing-server dep) remains deferred.
|
||||
- The remaining ~40 lower-severity review findings (style, micro-opt, doc) from the NN/training gap review.
|
||||
- ~~The remaining ~40 lower-severity review findings (style, micro-opt, doc).~~ — **RESOLVED in Milestone-2 (§8.2): the host-verifiable subset is cleared.** The "~40" was an estimate; the actual host-verifiable (non-tch) train/nn surface is smaller. Enumerated resolution below.
|
||||
|
||||
### 8.2 Milestone-2 — host-verifiable §8 P3 backlog clearance — RESOLVED
|
||||
|
||||
Mirroring the ADR-154 M3 cleanup discipline, M2 closed the **host-verifiable (non-tch) subset** of the §8 backlog in `wifi-densepose-train` (+ the pure-Rust `rf_encoder.rs`/`densepose.rs` in `wifi-densepose-nn` that the §3/§4 items named). Everything behind `#[cfg(feature = "tch-backend")]` (`metrics.rs`, `model.rs`, `losses.rs`, `proof.rs`, `trainer.rs`, `wiflow_std/{layers,model}.rs`) is **out of host-verifiable scope** — it cannot be compiled/verified without libtorch and stays genuinely deferred (not dropped).
|
||||
|
||||
**PROOF discipline held:** every de-magicked constant is pinned `== prior literal` by a `*_consts_unchanged_from_literals` test; every boundary test characterizes CURRENT behaviour; no operating-value or behaviour change; the Python proof stays bit-exact at `f8e76f21…46f7a` (the metrics path is off the signal proof path — asserted, not assumed). A smaller-but-true count was reported rather than inventing 40 fixes.
|
||||
|
||||
**Enumerated finding → resolution (real counts):**
|
||||
|
||||
| # | Finding (location) | Action | Pin/characterization test |
|
||||
|---|---|---|---|
|
||||
| 1 | `metrics_core.rs` — `0.5` vis / `1e-6` extent / `0.07` OKS-fallback sigma | de-magic → `VISIBILITY_THRESHOLD` / `MIN_REFERENCE_EXTENT` / `OKS_FALLBACK_SIGMA` | `metrics_core_consts_unchanged_from_literals`; `visibility_threshold_boundary_is_inclusive`; `degenerate_extent_below_floor_is_unscoreable` |
|
||||
| 2 | `ruview_metrics.rs` — `17` / `0.5` / `0.2` / `1e-3` / `1e-6` | de-magic → `NUM_KEYPOINTS` / `VISIBILITY_THRESHOLD` / `PCK_THRESHOLD` / `MIN_BBOX_DIAG` / `MIN_DURATION_MINUTES` | `ruview_metrics_consts_unchanged_from_literals`; `tracking_zero_duration_does_not_divide_by_zero`; `oks_short_array_is_bounded_at_keypoint_count` |
|
||||
| 3 | `subcarrier.rs` — sparse-interp `0.15`/`1e-4`/`0.1`/`1e-8`/`1e-5`/`500` | de-magic → 6 `SPARSE_*` consts | `sparse_interp_consts_unchanged_from_literals`; `compute_interp_weights_single_target_is_index_zero`; `sparse_interp_single_target_is_finite` |
|
||||
| 4 | `eval.rs` — `1e-10` division guard (×3) | de-magic → `MIN_POSITIVE_MPJPE` | `eval_min_positive_mpjpe_unchanged_from_literal`; `domain_gap_infinite_when_in_domain_perfect_but_cross_nonzero`; `domain_gap_unity_when_everything_perfect` |
|
||||
| 5 | `domain.rs` — `1e-5` LayerNorm eps | de-magic → `LAYER_NORM_EPS` | `layer_norm_eps_unchanged_from_literal` (n=0/zero-var boundary already covered) |
|
||||
| 6 | `virtual_aug.rs` — `1e-10` Box-Muller / room-scale guards | de-magic → `BOX_MULLER_U1_FLOOR` / `MIN_ROOM_SCALE` | `virtual_aug_guard_consts_unchanged_from_literals`; `augment_frame_zero_room_scale_passes_amplitude_finite` |
|
||||
| 7 | `rf_encoder.rs` — `20.0` softplus overflow threshold | de-magic → `SOFTPLUS_LINEAR_THRESHOLD` | `softplus_threshold_unchanged_from_literal` |
|
||||
| 8 | `rf_encoder.rs` — panic-only `LinearHead::new` for untrusted weights (§3) | add pure-Rust fallible `try_new` → typed `RfHeadError` (additive; `new` unchanged) | `try_new_accepts_valid_and_rejects_each_bad_shape` |
|
||||
| 9 | `densepose.rs::apply_conv_layer` naive-loop (§4) | **bench-first → MEASURED-INCONCLUSIVE**, no perf change shipped; committed bench + characterization anchor | `native_conv_matches_reference` + `benches/native_conv_bench.rs` |
|
||||
| 10 | `rapid_adapt.rs` module-doc "O(ε)" inconsistency | doc-only fix → "O(ε²)" (central differences) | n/a (doc) |
|
||||
| 11 | `geometry.rs` `DeepSets::encode` missing `# Panics` | doc-only fix (documents existing `assert!`) | n/a (doc) |
|
||||
|
||||
**Tally:** **7 de-magicked (const + pin test)**, **9 new boundary/characterization tests**, **1 added input guard (`try_new`) + test**, **2 doc-only fixes**, **1 perf item bench-first MEASURED-INCONCLUSIVE (not shipped, deferred)**. New tests: train `--no-default-features` **303** (was 288, +15); nn `--no-default-features` lib **38** (was 35, +3).
|
||||
|
||||
**Skipped honestly (flagged-but-not-real):** `ablation.rs` (NaN sort + boundary already fixed/tested in M1 — clean), `signal_features.rs` (consts already named, n=0 boundary already tested), `mae.rs` (no bare guard literals found), `metrics_core` already had thorough zero-visible/hip-normalizer coverage from M1. No churn was manufactured to hit a count.
|
||||
|
||||
**Genuinely data-gated / tch-gated — remaining backlog (blocked, not dropped):** GraphPose-Fi graph decoder, ONNX INT4, CSI-JEPA vs MAE A/B (all **data/model-gated** — need a training run + datasets); ONNX read-lock concurrency win (**upstream-gated** on `ort`); the tch-gated panic-on-input sites in `proof.rs`/`trainer.rs`/`model.rs` and the `metrics.rs` `*_v2` dead-code deletion (**tch-gated** — need a libtorch host to compile/verify). **The non-tch-verifiable subset of §8 is now cleared.**
|
||||
|
||||
### 8.1 Milestone-1b — metric-definition unification (the §8 metric subset) — RESOLVED
|
||||
|
||||
|
|
|
|||
|
|
@ -63,3 +63,7 @@ harness = false
|
|||
name = "onnx_bench"
|
||||
harness = false
|
||||
required-features = ["onnx"]
|
||||
|
||||
[[bench]]
|
||||
name = "native_conv_bench"
|
||||
harness = false
|
||||
|
|
|
|||
|
|
@ -0,0 +1,79 @@
|
|||
//! ADR-155 M2 §4 — native (pure-Rust) DensePose conv benchmark.
|
||||
//!
|
||||
//! `DensePoseHead::apply_conv_layer` is a pure-Rust naive 6-nested-loop
|
||||
//! convolution (the §8 "native-conv naive-loop" backlog item). This bench
|
||||
//! measures `forward()` (which runs the shared-conv + segmentation + UV conv
|
||||
//! stacks through that naive loop) on a representative single-layer config so a
|
||||
//! perf claim can be made (or refused) with a MEASURED before/after — never a
|
||||
//! fabricated number.
|
||||
//!
|
||||
//! Reproduce:
|
||||
//! cargo bench -p wifi-densepose-nn --no-default-features --bench native_conv_bench
|
||||
//!
|
||||
//! The bench is `--no-default-features` (no `onnx`/`ort` download needed): the
|
||||
//! conv path is pure-Rust and benchable on any host.
|
||||
|
||||
use criterion::{criterion_group, criterion_main, BenchmarkId, Criterion, Throughput};
|
||||
use ndarray::{Array1, Array4};
|
||||
use std::hint::black_box;
|
||||
use wifi_densepose_nn::densepose::{ConvLayerWeights, DensePoseWeights};
|
||||
use wifi_densepose_nn::{DensePoseConfig, DensePoseHead, Tensor};
|
||||
|
||||
/// Build a single same-padding conv layer `in_ch -> out_ch`, kernel `k`, with a
|
||||
/// bias (no batch-norm) — deterministic, small, representative of one stage.
|
||||
fn conv_layer(in_ch: usize, out_ch: usize, k: usize) -> ConvLayerWeights {
|
||||
let weight = Array4::from_shape_fn((out_ch, in_ch, k, k), |(o, i, kh, kw)| {
|
||||
// Deterministic, bounded weights.
|
||||
((o + i + kh + kw) as f32 * 0.013).sin()
|
||||
});
|
||||
ConvLayerWeights {
|
||||
weight,
|
||||
bias: Some(Array1::from_shape_fn(out_ch, |o| o as f32 * 0.01)),
|
||||
bn_gamma: None,
|
||||
bn_beta: None,
|
||||
bn_mean: None,
|
||||
bn_var: None,
|
||||
}
|
||||
}
|
||||
|
||||
/// A head whose shared-conv stack is one `ch->ch` conv, with empty seg/uv heads,
|
||||
/// so the bench isolates a single conv-layer cost.
|
||||
fn single_conv_head(ch: usize, k: usize) -> DensePoseHead {
|
||||
let mut config = DensePoseConfig::new(ch, 1, 2);
|
||||
config.kernel_size = k;
|
||||
config.padding = k / 2; // same padding
|
||||
config.hidden_channels = vec![ch];
|
||||
let weights = DensePoseWeights {
|
||||
shared_conv: vec![conv_layer(ch, ch, k)],
|
||||
segmentation_head: vec![],
|
||||
uv_head: vec![],
|
||||
};
|
||||
DensePoseHead::with_weights(config, weights).expect("valid head")
|
||||
}
|
||||
|
||||
fn bench_native_conv(c: &mut Criterion) {
|
||||
let mut group = c.benchmark_group("native_conv");
|
||||
// (channels, spatial, kernel) — a modest map and a larger one.
|
||||
for &(ch, hw, k) in &[(16usize, 32usize, 3usize), (32, 32, 3)] {
|
||||
let head = single_conv_head(ch, k);
|
||||
let input = Tensor::Float4D(Array4::from_shape_fn((1, ch, hw, hw), |(_, c, y, x)| {
|
||||
((c + y + x) as f32 * 0.001).cos()
|
||||
}));
|
||||
// Throughput in output elements processed.
|
||||
group.throughput(Throughput::Elements((ch * hw * hw) as u64));
|
||||
group.bench_with_input(
|
||||
BenchmarkId::from_parameter(format!("ch{ch}_hw{hw}_k{k}")),
|
||||
&input,
|
||||
|bencher, inp| {
|
||||
bencher.iter(|| {
|
||||
let out = head.forward(black_box(inp)).expect("forward ok");
|
||||
black_box(out);
|
||||
});
|
||||
},
|
||||
);
|
||||
}
|
||||
group.finish();
|
||||
}
|
||||
|
||||
criterion_group!(benches, bench_native_conv);
|
||||
criterion_main!(benches);
|
||||
|
|
@ -338,7 +338,16 @@ impl DensePoseHead {
|
|||
|
||||
let mut output = Array4::zeros((batch, out_channels, out_height, out_width));
|
||||
|
||||
// Simple convolution implementation (not optimized)
|
||||
// Naive direct convolution (one MAC per tap). ADR-155 M2 §4: a
|
||||
// range-clamped variant (hoisting the per-tap in-bounds branch out of the
|
||||
// inner loops) was prototyped and proven bit-identical, but a committed
|
||||
// criterion bench (`benches/native_conv_bench.rs`) showed the perf result
|
||||
// is INCONCLUSIVE on this host: a ~35% win on padding-heavy small-channel
|
||||
// maps but a small (~3%) *regression* on channel-heavy maps, all inside a
|
||||
// ±20% run-to-run noise floor. Per the §0 PROOF discipline we do not ship
|
||||
// a perf change whose benefit isn't robustly positive, nor fabricate a
|
||||
// number — the naive loop is kept and the rewrite is honestly deferred
|
||||
// (see ADR-155 §8). Behaviour pinned by `native_conv_matches_reference`.
|
||||
for b in 0..batch {
|
||||
for oc in 0..out_channels {
|
||||
for oh in 0..out_height {
|
||||
|
|
@ -565,6 +574,61 @@ impl BodyPart {
|
|||
#[cfg(test)]
|
||||
mod tests {
|
||||
use super::*;
|
||||
use ndarray::Array4;
|
||||
|
||||
/// ADR-155 M2 §4: characterize the native conv against **hand-computed**
|
||||
/// values so the §8 native-conv perf rewrite (or any future change) has a
|
||||
/// behaviour anchor — a 1×1 conv is just a per-pixel scalar multiply, and a
|
||||
/// same-padded 3×3 corner has a known truncated-window sum. Pins CURRENT
|
||||
/// behaviour (no behaviour change in this milestone — the rewrite was
|
||||
/// reverted as perf-inconclusive; see `benches/native_conv_bench.rs`).
|
||||
#[test]
|
||||
fn native_conv_matches_reference() {
|
||||
// --- Case 1: a 1×1 conv (no padding) is exactly `out = w·in + b`. ---
|
||||
let w11 = ConvLayerWeights {
|
||||
weight: Array4::from_shape_fn((1, 1, 1, 1), |_| 2.0_f32),
|
||||
bias: Some(ndarray::Array1::from_elem(1, 0.5_f32)),
|
||||
bn_gamma: None,
|
||||
bn_beta: None,
|
||||
bn_mean: None,
|
||||
bn_var: None,
|
||||
};
|
||||
let input = Array4::from_shape_fn((1, 1, 2, 2), |(_, _, y, x)| (y * 2 + x) as f32);
|
||||
let mut cfg = DensePoseConfig::new(1, 1, 2);
|
||||
cfg.kernel_size = 1;
|
||||
cfg.padding = 0;
|
||||
cfg.hidden_channels = vec![1];
|
||||
let head = DensePoseHead::new(cfg).unwrap();
|
||||
let out = head.apply_conv_layer(&input, &w11).unwrap();
|
||||
assert_eq!(out.dim(), (1, 1, 2, 2));
|
||||
// out[y,x] = 2·in[y,x] + 0.5 ⇒ {0.5, 2.5, 4.5, 6.5}.
|
||||
for (got, want) in out.iter().zip([0.5_f32, 2.5, 4.5, 6.5].iter()) {
|
||||
assert!((got - want).abs() < 1e-6, "1x1 conv: got {got}, want {want}");
|
||||
}
|
||||
|
||||
// --- Case 2: a same-padded 3×3 all-ones kernel sums the in-bounds
|
||||
// window. Input is all 1.0 on a 3×3 map ⇒ the centre output = 9 (full
|
||||
// window), each corner = 4 (2×2 truncated window). ---
|
||||
let w33 = ConvLayerWeights {
|
||||
weight: Array4::from_elem((1, 1, 3, 3), 1.0_f32),
|
||||
bias: None,
|
||||
bn_gamma: None,
|
||||
bn_beta: None,
|
||||
bn_mean: None,
|
||||
bn_var: None,
|
||||
};
|
||||
let ones = Array4::from_elem((1, 1, 3, 3), 1.0_f32);
|
||||
let mut cfg2 = DensePoseConfig::new(1, 1, 2);
|
||||
cfg2.kernel_size = 3;
|
||||
cfg2.padding = 1;
|
||||
cfg2.hidden_channels = vec![1];
|
||||
let head2 = DensePoseHead::new(cfg2).unwrap();
|
||||
let out2 = head2.apply_conv_layer(&ones, &w33).unwrap();
|
||||
assert_eq!(out2.dim(), (1, 1, 3, 3));
|
||||
assert!((out2[[0, 0, 1, 1]] - 9.0).abs() < 1e-6, "centre full window = 9");
|
||||
assert!((out2[[0, 0, 0, 0]] - 4.0).abs() < 1e-6, "corner 2x2 window = 4");
|
||||
assert!((out2[[0, 0, 0, 1]] - 6.0).abs() < 1e-6, "edge 2x3 window = 6");
|
||||
}
|
||||
|
||||
#[test]
|
||||
fn test_config_validation() {
|
||||
|
|
|
|||
|
|
@ -98,8 +98,64 @@ pub struct LinearHead {
|
|||
var_b: f32,
|
||||
}
|
||||
|
||||
/// A shape mismatch when building a [`LinearHead`] from supplied weights.
|
||||
///
|
||||
/// Returned by [`LinearHead::try_new`] so a caller loading weights from an
|
||||
/// **untrusted / deserialized** source can validate the tensor shapes without
|
||||
/// the panic that [`LinearHead::new`] raises on a programmer-supplied mismatch
|
||||
/// (ADR-155 M2 §3: a pure-Rust input guard ahead of the construction contract).
|
||||
#[derive(Debug, Clone, PartialEq, Eq)]
|
||||
pub enum RfHeadError {
|
||||
/// `w.len()` was not `out_dim * EMBEDDING_DIM`.
|
||||
WeightShape {
|
||||
/// Expected length (`out_dim * EMBEDDING_DIM`).
|
||||
expected: usize,
|
||||
/// Actual `w.len()`.
|
||||
got: usize,
|
||||
},
|
||||
/// `b.len()` was not `out_dim`.
|
||||
BiasShape {
|
||||
/// Expected length (`out_dim`).
|
||||
expected: usize,
|
||||
/// Actual `b.len()`.
|
||||
got: usize,
|
||||
},
|
||||
/// `var_w.len()` was not `EMBEDDING_DIM`.
|
||||
VarWeightShape {
|
||||
/// Expected length (`EMBEDDING_DIM`).
|
||||
expected: usize,
|
||||
/// Actual `var_w.len()`.
|
||||
got: usize,
|
||||
},
|
||||
}
|
||||
|
||||
impl std::fmt::Display for RfHeadError {
|
||||
fn fmt(&self, f: &mut std::fmt::Formatter<'_>) -> std::fmt::Result {
|
||||
match self {
|
||||
Self::WeightShape { expected, got } => {
|
||||
write!(f, "weight shape mismatch: expected {expected}, got {got}")
|
||||
}
|
||||
Self::BiasShape { expected, got } => {
|
||||
write!(f, "bias shape mismatch: expected {expected}, got {got}")
|
||||
}
|
||||
Self::VarWeightShape { expected, got } => {
|
||||
write!(f, "var weight shape mismatch: expected {expected}, got {got}")
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
impl std::error::Error for RfHeadError {}
|
||||
|
||||
impl LinearHead {
|
||||
/// Build a head with given weights. `w.len()` must be `out_dim * EMBEDDING_DIM`.
|
||||
///
|
||||
/// # Panics
|
||||
///
|
||||
/// Panics on a shape mismatch (`w`/`b`/`var_w`). This is a construction-time
|
||||
/// API contract on *programmer-supplied* vectors. For weights from an
|
||||
/// untrusted / deserialized source, prefer [`LinearHead::try_new`], which
|
||||
/// returns a typed [`RfHeadError`] instead of panicking.
|
||||
#[must_use]
|
||||
pub fn new(task: TaskKind, out_dim: usize, w: Vec<f32>, b: Vec<f32>, var_w: Vec<f32>, var_b: f32) -> Self {
|
||||
assert_eq!(w.len(), out_dim * EMBEDDING_DIM, "weight shape mismatch");
|
||||
|
|
@ -108,6 +164,40 @@ impl LinearHead {
|
|||
Self { task, w, b, out_dim, var_w, var_b }
|
||||
}
|
||||
|
||||
/// Fallible constructor: validate the weight shapes and return a typed
|
||||
/// [`RfHeadError`] on mismatch instead of panicking (ADR-155 M2 §3).
|
||||
///
|
||||
/// Use this when `w` / `b` / `var_w` originate from a checkpoint or any
|
||||
/// untrusted source. On success the produced head is byte-for-byte identical
|
||||
/// to [`LinearHead::new`] with the same arguments.
|
||||
///
|
||||
/// # Errors
|
||||
///
|
||||
/// Returns [`RfHeadError`] when any of:
|
||||
/// - `w.len() != out_dim * EMBEDDING_DIM`
|
||||
/// - `b.len() != out_dim`
|
||||
/// - `var_w.len() != EMBEDDING_DIM`
|
||||
pub fn try_new(
|
||||
task: TaskKind,
|
||||
out_dim: usize,
|
||||
w: Vec<f32>,
|
||||
b: Vec<f32>,
|
||||
var_w: Vec<f32>,
|
||||
var_b: f32,
|
||||
) -> Result<Self, RfHeadError> {
|
||||
let expected_w = out_dim * EMBEDDING_DIM;
|
||||
if w.len() != expected_w {
|
||||
return Err(RfHeadError::WeightShape { expected: expected_w, got: w.len() });
|
||||
}
|
||||
if b.len() != out_dim {
|
||||
return Err(RfHeadError::BiasShape { expected: out_dim, got: b.len() });
|
||||
}
|
||||
if var_w.len() != EMBEDDING_DIM {
|
||||
return Err(RfHeadError::VarWeightShape { expected: EMBEDDING_DIM, got: var_w.len() });
|
||||
}
|
||||
Ok(Self { task, w, b, out_dim, var_w, var_b })
|
||||
}
|
||||
|
||||
/// A zero-initialised head (uncertainty = softplus(0) ≈ 0.693).
|
||||
#[must_use]
|
||||
pub fn zeros(task: TaskKind, out_dim: usize) -> Self {
|
||||
|
|
@ -136,9 +226,14 @@ impl LinearHead {
|
|||
}
|
||||
}
|
||||
|
||||
/// Input magnitude above which `softplus(x) ≈ x` to f32 precision, so the
|
||||
/// `exp` is skipped to avoid overflow (ADR-155 M2 §8: de-magicked from a bare
|
||||
/// `20.0`; value unchanged). At x = 20, `ln(1+e^20) − 20 ≈ 2e-9`, below f32 eps.
|
||||
const SOFTPLUS_LINEAR_THRESHOLD: f32 = 20.0;
|
||||
|
||||
fn softplus(x: f32) -> f32 {
|
||||
// Numerically stable softplus.
|
||||
if x > 20.0 {
|
||||
if x > SOFTPLUS_LINEAR_THRESHOLD {
|
||||
x
|
||||
} else {
|
||||
(1.0 + x.exp()).ln()
|
||||
|
|
@ -270,6 +365,48 @@ mod tests {
|
|||
RfEmbedding::new(vec![fill; EMBEDDING_DIM])
|
||||
}
|
||||
|
||||
/// ADR-155 M2 §8: the de-magicked softplus linear-threshold must equal the
|
||||
/// prior inline `20.0` literal exactly (operating-value guard).
|
||||
#[test]
|
||||
fn softplus_threshold_unchanged_from_literal() {
|
||||
assert_eq!(SOFTPLUS_LINEAR_THRESHOLD, 20.0_f32);
|
||||
}
|
||||
|
||||
/// ADR-155 M2 §3: `try_new` accepts correctly-shaped weights and produces a
|
||||
/// head byte-identical to `new`, but returns a typed error on a mismatched
|
||||
/// (e.g. corrupt-checkpoint) shape instead of panicking.
|
||||
#[test]
|
||||
fn try_new_accepts_valid_and_rejects_each_bad_shape() {
|
||||
let out_dim = 2;
|
||||
let w = vec![0.0; out_dim * EMBEDDING_DIM];
|
||||
let b = vec![0.0; out_dim];
|
||||
let var_w = vec![0.0; EMBEDDING_DIM];
|
||||
|
||||
// Valid: try_new == new (forward identical on a probe embedding).
|
||||
let head = LinearHead::try_new(TaskKind::Presence, out_dim, w.clone(), b.clone(), var_w.clone(), 0.0)
|
||||
.expect("valid shapes must construct");
|
||||
let reference = LinearHead::new(TaskKind::Presence, out_dim, w.clone(), b.clone(), var_w.clone(), 0.0);
|
||||
assert_eq!(head.forward(&emb(0.5)).values, reference.forward(&emb(0.5)).values);
|
||||
|
||||
// Bad weight length.
|
||||
assert_eq!(
|
||||
LinearHead::try_new(TaskKind::Presence, out_dim, vec![0.0; 3], b.clone(), var_w.clone(), 0.0)
|
||||
.unwrap_err(),
|
||||
RfHeadError::WeightShape { expected: out_dim * EMBEDDING_DIM, got: 3 }
|
||||
);
|
||||
// Bad bias length.
|
||||
assert_eq!(
|
||||
LinearHead::try_new(TaskKind::Presence, out_dim, w.clone(), vec![0.0; 1], var_w.clone(), 0.0)
|
||||
.unwrap_err(),
|
||||
RfHeadError::BiasShape { expected: out_dim, got: 1 }
|
||||
);
|
||||
// Bad var-weight length.
|
||||
assert_eq!(
|
||||
LinearHead::try_new(TaskKind::Presence, out_dim, w, b, vec![0.0; 5], 0.0).unwrap_err(),
|
||||
RfHeadError::VarWeightShape { expected: EMBEDDING_DIM, got: 5 }
|
||||
);
|
||||
}
|
||||
|
||||
#[test]
|
||||
fn head_forward_produces_values_and_finite_uncertainty() {
|
||||
let head = LinearHead::zeros(TaskKind::Presence, 2);
|
||||
|
|
|
|||
|
|
@ -10,6 +10,11 @@
|
|||
// Helper math functions
|
||||
// ---------------------------------------------------------------------------
|
||||
|
||||
/// LayerNorm numerical-stability epsilon added under the variance square root
|
||||
/// (`(x − μ)/√(σ² + ε)`). The standard transformer default (ADR-155 M2 §8:
|
||||
/// de-magicked from a bare `1e-5`; value unchanged, no behaviour change).
|
||||
const LAYER_NORM_EPS: f32 = 1e-5;
|
||||
|
||||
/// GELU activation (Hendrycks & Gimpel, 2016 approximation).
|
||||
pub fn gelu(x: f32) -> f32 {
|
||||
let c = (2.0_f32 / std::f32::consts::PI).sqrt();
|
||||
|
|
@ -24,7 +29,7 @@ pub fn layer_norm(x: &[f32]) -> Vec<f32> {
|
|||
}
|
||||
let mean = x.iter().sum::<f32>() / n;
|
||||
let var = x.iter().map(|v| (v - mean).powi(2)).sum::<f32>() / n;
|
||||
let inv_std = 1.0 / (var + 1e-5_f32).sqrt();
|
||||
let inv_std = 1.0 / (var + LAYER_NORM_EPS).sqrt();
|
||||
x.iter().map(|v| (v - mean) * inv_std).collect()
|
||||
}
|
||||
|
||||
|
|
@ -390,6 +395,13 @@ mod tests {
|
|||
assert!(layer_norm(&[]).is_empty());
|
||||
}
|
||||
|
||||
/// ADR-155 M2 §8: the de-magicked LayerNorm epsilon must equal the prior
|
||||
/// inline `1e-5` literal exactly (operating-value guard).
|
||||
#[test]
|
||||
fn layer_norm_eps_unchanged_from_literal() {
|
||||
assert_eq!(LAYER_NORM_EPS, 1e-5_f32);
|
||||
}
|
||||
|
||||
#[test]
|
||||
fn mean_pool_simple() {
|
||||
let p = global_mean_pool(&[1.0, 2.0, 3.0, 5.0, 6.0, 7.0], 2, 3);
|
||||
|
|
|
|||
|
|
@ -5,6 +5,12 @@
|
|||
|
||||
use std::collections::HashMap;
|
||||
|
||||
/// Smallest in-domain / few-shot MPJPE treated as positive before it divides a
|
||||
/// ratio. Below this the denominator is considered ≈0 and the ratio falls back
|
||||
/// to a sentinel (`1.0` or `INFINITY`) rather than dividing by ≈0 (ADR-155 M2
|
||||
/// §8: de-magicked from a bare `1e-10`; value unchanged, no behaviour change).
|
||||
const MIN_POSITIVE_MPJPE: f32 = 1e-10;
|
||||
|
||||
/// Aggregated cross-domain evaluation metrics.
|
||||
#[derive(Debug, Clone)]
|
||||
pub struct CrossDomainMetrics {
|
||||
|
|
@ -79,14 +85,14 @@ impl CrossDomainEvaluator {
|
|||
} else {
|
||||
cross_dom
|
||||
};
|
||||
let gap = if in_dom > 1e-10 {
|
||||
let gap = if in_dom > MIN_POSITIVE_MPJPE {
|
||||
cross_dom / in_dom
|
||||
} else if cross_dom > 1e-10 {
|
||||
} else if cross_dom > MIN_POSITIVE_MPJPE {
|
||||
f32::INFINITY
|
||||
} else {
|
||||
1.0
|
||||
};
|
||||
let speedup = if few_shot > 1e-10 {
|
||||
let speedup = if few_shot > MIN_POSITIVE_MPJPE {
|
||||
cross_dom / few_shot
|
||||
} else {
|
||||
1.0
|
||||
|
|
@ -132,6 +138,43 @@ fn mean_of(v: Option<&Vec<f32>>) -> f32 {
|
|||
mod tests {
|
||||
use super::*;
|
||||
|
||||
/// ADR-155 M2 §8: the de-magicked division-guard floor must equal the prior
|
||||
/// inline `1e-10` literal exactly (operating-value guard).
|
||||
#[test]
|
||||
fn eval_min_positive_mpjpe_unchanged_from_literal() {
|
||||
assert_eq!(MIN_POSITIVE_MPJPE, 1e-10_f32);
|
||||
}
|
||||
|
||||
/// Characterize the `in_dom ≈ 0` boundary: a perfect in-domain fit but
|
||||
/// nonzero cross-domain error yields the `INFINITY` gap sentinel (the
|
||||
/// middle branch), not a divide-by-≈0 NaN.
|
||||
#[test]
|
||||
fn domain_gap_infinite_when_in_domain_perfect_but_cross_nonzero() {
|
||||
let ev = CrossDomainEvaluator::new(1);
|
||||
let preds = vec![
|
||||
(vec![1.0, 2.0, 3.0], vec![1.0, 2.0, 3.0]), // dom 0: err 0
|
||||
(vec![0.0, 0.0, 0.0], vec![2.0, 0.0, 0.0]), // dom 1: err 2
|
||||
];
|
||||
let m = ev.evaluate(&preds, &[0, 1]);
|
||||
assert!((m.in_domain_mpjpe).abs() < MIN_POSITIVE_MPJPE);
|
||||
assert!(m.domain_gap_ratio.is_infinite());
|
||||
}
|
||||
|
||||
/// Characterize the all-perfect boundary: in-domain AND cross-domain both ≈0
|
||||
/// ⇒ gap falls back to the `1.0` sentinel (the final else branch), never NaN.
|
||||
#[test]
|
||||
fn domain_gap_unity_when_everything_perfect() {
|
||||
let ev = CrossDomainEvaluator::new(1);
|
||||
let preds = vec![
|
||||
(vec![1.0, 2.0, 3.0], vec![1.0, 2.0, 3.0]),
|
||||
(vec![4.0, 5.0, 6.0], vec![4.0, 5.0, 6.0]),
|
||||
];
|
||||
let m = ev.evaluate(&preds, &[0, 1]);
|
||||
assert!((m.domain_gap_ratio - 1.0).abs() < 1e-6);
|
||||
// few_shot derived = (0+0)/2 = 0 ⇒ speedup also falls back to 1.0.
|
||||
assert!((m.adaptation_speedup - 1.0).abs() < 1e-6);
|
||||
}
|
||||
|
||||
#[test]
|
||||
fn mpjpe_known_value() {
|
||||
assert!((mpjpe(&[0.0, 0.0, 0.0], &[3.0, 4.0, 0.0], 1) - 5.0).abs() < 1e-6);
|
||||
|
|
|
|||
|
|
@ -166,6 +166,13 @@ impl DeepSets {
|
|||
}
|
||||
|
||||
/// Encode a set of embeddings (each of length `geometry_dim`) into one vector.
|
||||
///
|
||||
/// # Panics
|
||||
///
|
||||
/// Panics if `ap_embeddings` is empty — a permutation-invariant mean-pool
|
||||
/// over zero elements is undefined. Callers with optional AP sets must guard
|
||||
/// for the empty case before calling (no behaviour change; documents the
|
||||
/// existing `assert!`).
|
||||
pub fn encode(&self, ap_embeddings: &[Vec<f32>]) -> Vec<f32> {
|
||||
assert!(
|
||||
!ap_embeddings.is_empty(),
|
||||
|
|
|
|||
|
|
@ -72,6 +72,28 @@ pub const CANON_LEFT_HIP: usize = 11;
|
|||
/// COCO joint index of the right hip.
|
||||
pub const CANON_RIGHT_HIP: usize = 12;
|
||||
|
||||
// --- Tuning constants (ADR-155 M2 §8: de-magicked from bare literals; values
|
||||
// are bit-identical to the prior inline literals — documentation only, no
|
||||
// behaviour change). ---
|
||||
|
||||
/// Visibility cutoff: a keypoint counts as *visible* iff `visibility[j] >= 0.5`.
|
||||
///
|
||||
/// This is the COCO convention (visibility flag 2 = "labelled and visible";
|
||||
/// any soft confidence ≥ 0.5 is treated as present). Used identically in
|
||||
/// [`bounding_box_diagonal`], [`canonical_torso_size`], [`pck_canonical`] and
|
||||
/// [`oks_canonical`].
|
||||
const VISIBILITY_THRESHOLD: f32 = 0.5;
|
||||
|
||||
/// Minimum positive extent for a usable reference scale (torso width or bbox
|
||||
/// diagonal). Below this the sample has no measurable evidence and is reported
|
||||
/// as unscoreable (PCK `(0,0,0.0)` / OKS `0.0`) rather than dividing by ≈0.
|
||||
const MIN_REFERENCE_EXTENT: f32 = 1e-6;
|
||||
|
||||
/// Fallback per-joint OKS sigma for joint indices beyond the 17 COCO-defined
|
||||
/// keypoints (defensive: the canonical path only ever scores `j < 17`). Mid-range
|
||||
/// of the COCO sigma band — see [`COCO_KP_SIGMAS`].
|
||||
const OKS_FALLBACK_SIGMA: f32 = 0.07;
|
||||
|
||||
/// Compute the Euclidean diagonal of the bounding box of visible keypoints.
|
||||
///
|
||||
/// The bounding box is defined by the axis-aligned extent of all keypoints
|
||||
|
|
@ -89,7 +111,7 @@ pub(crate) fn bounding_box_diagonal(
|
|||
let mut any_visible = false;
|
||||
|
||||
for j in 0..num_joints {
|
||||
if visibility[j] >= 0.5 {
|
||||
if visibility[j] >= VISIBILITY_THRESHOLD {
|
||||
let x = kp[[j, 0]];
|
||||
let y = kp[[j, 1]];
|
||||
x_min = x_min.min(x);
|
||||
|
|
@ -123,19 +145,19 @@ pub fn canonical_torso_size(gt_kpts: &Array2<f32>, visibility: &Array1<f32>) ->
|
|||
let n = gt_kpts.shape()[0].min(visibility.len());
|
||||
if CANON_LEFT_HIP < n
|
||||
&& CANON_RIGHT_HIP < n
|
||||
&& visibility[CANON_LEFT_HIP] >= 0.5
|
||||
&& visibility[CANON_RIGHT_HIP] >= 0.5
|
||||
&& visibility[CANON_LEFT_HIP] >= VISIBILITY_THRESHOLD
|
||||
&& visibility[CANON_RIGHT_HIP] >= VISIBILITY_THRESHOLD
|
||||
{
|
||||
let dx = gt_kpts[[CANON_LEFT_HIP, 0]] - gt_kpts[[CANON_RIGHT_HIP, 0]];
|
||||
let dy = gt_kpts[[CANON_LEFT_HIP, 1]] - gt_kpts[[CANON_RIGHT_HIP, 1]];
|
||||
let torso = (dx * dx + dy * dy).sqrt();
|
||||
if torso > 1e-6 {
|
||||
if torso > MIN_REFERENCE_EXTENT {
|
||||
return Some(torso);
|
||||
}
|
||||
}
|
||||
// Fallback: bounding-box diagonal of visible keypoints.
|
||||
let diag = bounding_box_diagonal(gt_kpts, visibility, n);
|
||||
if diag > 1e-6 {
|
||||
if diag > MIN_REFERENCE_EXTENT {
|
||||
Some(diag)
|
||||
} else {
|
||||
None
|
||||
|
|
@ -179,7 +201,7 @@ pub fn pck_canonical(
|
|||
let mut correct = 0usize;
|
||||
let mut total = 0usize;
|
||||
for j in 0..n {
|
||||
if visibility[j] < 0.5 {
|
||||
if visibility[j] < VISIBILITY_THRESHOLD {
|
||||
continue;
|
||||
}
|
||||
total += 1;
|
||||
|
|
@ -229,7 +251,7 @@ pub fn oks_canonical(
|
|||
let mut num = 0.0f32;
|
||||
let mut den = 0.0f32;
|
||||
for j in 0..n {
|
||||
if visibility[j] < 0.5 {
|
||||
if visibility[j] < VISIBILITY_THRESHOLD {
|
||||
continue;
|
||||
}
|
||||
den += 1.0;
|
||||
|
|
@ -239,7 +261,7 @@ pub fn oks_canonical(
|
|||
let k = if j < COCO_KP_SIGMAS.len() {
|
||||
COCO_KP_SIGMAS[j]
|
||||
} else {
|
||||
0.07
|
||||
OKS_FALLBACK_SIGMA
|
||||
};
|
||||
num += (-d_sq / (2.0 * s_sq * k * k)).exp();
|
||||
}
|
||||
|
|
@ -249,3 +271,65 @@ pub fn oks_canonical(
|
|||
0.0
|
||||
}
|
||||
}
|
||||
|
||||
#[cfg(test)]
|
||||
mod consts_tests {
|
||||
use super::*;
|
||||
|
||||
/// ADR-155 M2 §8: the de-magicked tuning consts must equal the prior inline
|
||||
/// literals exactly — this pins them so a future "tidy-up" cannot silently
|
||||
/// shift the metric definition (operating-value guard).
|
||||
#[test]
|
||||
fn metrics_core_consts_unchanged_from_literals() {
|
||||
assert_eq!(VISIBILITY_THRESHOLD, 0.5_f32);
|
||||
assert_eq!(MIN_REFERENCE_EXTENT, 1e-6_f32);
|
||||
assert_eq!(OKS_FALLBACK_SIGMA, 0.07_f32);
|
||||
assert_eq!(CANON_LEFT_HIP, 11);
|
||||
assert_eq!(CANON_RIGHT_HIP, 12);
|
||||
}
|
||||
|
||||
/// Characterize the visibility-threshold boundary: a keypoint at exactly the
|
||||
/// cutoff (vis == 0.5) is INCLUDED (`>=`), just below (0.499) is EXCLUDED.
|
||||
/// Pins current `>=`-inclusive behaviour at the edge.
|
||||
#[test]
|
||||
fn visibility_threshold_boundary_is_inclusive() {
|
||||
// Two GT hips give a positive torso; vary the (single) scored joint's
|
||||
// visibility around the 0.5 cutoff and confirm it flips total in/out.
|
||||
let gt = Array2::from_shape_vec(
|
||||
(13, 2),
|
||||
(0..13).flat_map(|j| [j as f32, 0.0]).collect::<Vec<_>>(),
|
||||
)
|
||||
.unwrap();
|
||||
// hips at 11,12 give torso = |11-12| = 1.0 along x.
|
||||
let pred = gt.clone();
|
||||
let mk_vis = |v0: f32| {
|
||||
let mut vis = Array1::<f32>::zeros(13);
|
||||
vis[CANON_LEFT_HIP] = 1.0;
|
||||
vis[CANON_RIGHT_HIP] = 1.0;
|
||||
vis[0] = v0; // joint 0 is the one we toggle
|
||||
vis
|
||||
};
|
||||
// At exactly 0.5 → joint 0 is counted (total includes it: 3 visible).
|
||||
let (_, total_at, _) = pck_canonical(&pred, >, &mk_vis(0.5), 0.2);
|
||||
assert_eq!(total_at, 3, "vis == 0.5 must be INCLUDED (>=)");
|
||||
// Just below → joint 0 excluded (only the 2 hips visible).
|
||||
let (_, total_below, _) = pck_canonical(&pred, >, &mk_vis(0.499), 0.2);
|
||||
assert_eq!(total_below, 2, "vis < 0.5 must be EXCLUDED");
|
||||
}
|
||||
|
||||
/// Characterize the reference-extent floor: a near-zero-extent GT pose (all
|
||||
/// keypoints coincident, hips coincident) is UNSCOREABLE → `(0,0,0.0)`,
|
||||
/// never a trivial perfect score. Pins the `MIN_REFERENCE_EXTENT` guard.
|
||||
#[test]
|
||||
fn degenerate_extent_below_floor_is_unscoreable() {
|
||||
// All 13 joints at the same point ⇒ torso ≈ 0, bbox diag ≈ 0 < 1e-6.
|
||||
let gt = Array2::<f32>::zeros((13, 2));
|
||||
let pred = gt.clone();
|
||||
let mut vis = Array1::<f32>::zeros(13);
|
||||
vis[CANON_LEFT_HIP] = 1.0;
|
||||
vis[CANON_RIGHT_HIP] = 1.0;
|
||||
assert!(canonical_torso_size(>, &vis).is_none());
|
||||
assert_eq!(pck_canonical(&pred, >, &vis, 0.2), (0, 0, 0.0));
|
||||
assert_eq!(oks_canonical(&pred, >, &vis), 0.0);
|
||||
}
|
||||
}
|
||||
|
|
|
|||
|
|
@ -11,8 +11,9 @@
|
|||
//! by the code. That placeholder is gone. The two `*_loss` functions are now
|
||||
//! pure evaluators of the real objective, and [`RapidAdaptation::adapt`]
|
||||
//! descends them with a **finite-difference gradient** of that exact loss.
|
||||
//! Finite differences genuinely minimize the stated objective (to O(ε)
|
||||
//! truncation), so "the adaptation loss decreases" is now a real, reproducible
|
||||
//! Finite differences genuinely minimize the stated objective (central
|
||||
//! differences are accurate to O(ε²) truncation; see [`RapidAdaptation::adapt`]),
|
||||
//! so "the adaptation loss decreases" is now a real, reproducible
|
||||
//! measurement rather than an artefact of a hand-tuned fake step.
|
||||
//!
|
||||
//! **Scope caveat (still honest):** this minimizes a *self-supervised proxy*
|
||||
|
|
|
|||
|
|
@ -108,6 +108,31 @@ const COCO_SIGMAS: [f32; 17] = [
|
|||
/// left_hip, right_hip.
|
||||
const TORSO_INDICES: [usize; 4] = [5, 6, 11, 12];
|
||||
|
||||
// --- Tuning constants (ADR-155 M2 §8: de-magicked from bare literals; values
|
||||
// bit-identical to the prior inline literals — documentation only, no behaviour
|
||||
// change). ---
|
||||
|
||||
/// Number of COCO body keypoints. Loops over keypoints are bounded by this so
|
||||
/// short/adversarial inputs cannot panic (ADR-155 §Tier-2).
|
||||
const NUM_KEYPOINTS: usize = 17;
|
||||
|
||||
/// Visibility cutoff: a keypoint is *visible* iff `visibility[j] >= 0.5`
|
||||
/// (COCO convention; matches [`crate::metrics_core`]).
|
||||
const VISIBILITY_THRESHOLD: f32 = 0.5;
|
||||
|
||||
/// PCK acceptance ratio: a keypoint is correct iff its error ≤ `0.2 · bbox_diag`
|
||||
/// (the ADR-152 / WiFlow-STD PCK@0.2 convention).
|
||||
const PCK_THRESHOLD: f32 = 0.2;
|
||||
|
||||
/// Floor on the GT bounding-box diagonal used as the OKS/PCK reference scale.
|
||||
/// Guards the `dist_thr = ratio · diag` and OKS `s` against a degenerate
|
||||
/// (≈0-extent) pose producing a divide-by-≈0 (Inf/NaN) score.
|
||||
const MIN_BBOX_DIAG: f32 = 1e-3;
|
||||
|
||||
/// Floor on a tracking-sequence duration (minutes) before it divides the
|
||||
/// false-track count, so a zero-length window cannot yield `Inf` per-minute.
|
||||
const MIN_DURATION_MINUTES: f32 = 1e-6;
|
||||
|
||||
/// Evaluate Metric 1: Joint Error.
|
||||
///
|
||||
/// # Arguments
|
||||
|
|
@ -141,21 +166,21 @@ pub fn evaluate_joint_error(
|
|||
}
|
||||
|
||||
// PCK@0.2 computation.
|
||||
let pck_threshold = 0.2;
|
||||
let pck_threshold = PCK_THRESHOLD;
|
||||
let mut all_correct = 0_usize;
|
||||
let mut all_total = 0_usize;
|
||||
let mut torso_correct = 0_usize;
|
||||
let mut torso_total = 0_usize;
|
||||
let mut oks_sum = 0.0_f64;
|
||||
let mut per_kp_errors: Vec<Vec<f32>> = vec![Vec::new(); 17];
|
||||
let mut per_kp_errors: Vec<Vec<f32>> = vec![Vec::new(); NUM_KEYPOINTS];
|
||||
|
||||
for i in 0..n {
|
||||
let bbox_diag = compute_bbox_diag(>_kpts[i], &visibility[i]);
|
||||
let safe_diag = bbox_diag.max(1e-3);
|
||||
let safe_diag = bbox_diag.max(MIN_BBOX_DIAG);
|
||||
let dist_thr = pck_threshold * safe_diag;
|
||||
|
||||
for (j, kp_errors) in per_kp_errors.iter_mut().enumerate() {
|
||||
if visibility[i][j] < 0.5 {
|
||||
if visibility[i][j] < VISIBILITY_THRESHOLD {
|
||||
continue;
|
||||
}
|
||||
let dx = pred_kpts[i][[j, 0]] - gt_kpts[i][[j, 0]];
|
||||
|
|
@ -378,7 +403,7 @@ pub fn evaluate_tracking(
|
|||
};
|
||||
|
||||
// False tracks per minute.
|
||||
let safe_duration = duration_minutes.max(1e-6);
|
||||
let safe_duration = duration_minutes.max(MIN_DURATION_MINUTES);
|
||||
let false_tracks_per_min = total_false_positives as f32 / safe_duration;
|
||||
|
||||
// MOTA = 1 - (misses + false_positives + id_switches) / total_gt
|
||||
|
|
@ -612,8 +637,8 @@ fn compute_bbox_diag(kp: &Array2<f32>, vis: &Array1<f32>) -> f32 {
|
|||
let mut y_max = f32::MIN;
|
||||
let mut any = false;
|
||||
|
||||
for j in 0..17.min(kp.shape()[0]) {
|
||||
if vis[j] >= 0.5 {
|
||||
for j in 0..NUM_KEYPOINTS.min(kp.shape()[0]) {
|
||||
if vis[j] >= VISIBILITY_THRESHOLD {
|
||||
let x = kp[[j, 0]];
|
||||
let y = kp[[j, 1]];
|
||||
x_min = x_min.min(x);
|
||||
|
|
@ -640,11 +665,11 @@ fn compute_single_oks(pred: &Array2<f32>, gt: &Array2<f32>, vis: &Array1<f32>, s
|
|||
let s_sq = s * s;
|
||||
// ADR-155 §Tier-2: bound the loop to the actual array extents so adversarial
|
||||
// / short inputs (< 17 rows, mismatched vis length) cannot panic on `[j]`.
|
||||
let n = pred.shape()[0].min(gt.shape()[0]).min(vis.len()).min(17);
|
||||
let n = pred.shape()[0].min(gt.shape()[0]).min(vis.len()).min(NUM_KEYPOINTS);
|
||||
let mut num = 0.0_f32;
|
||||
let mut den = 0.0_f32;
|
||||
for j in 0..n {
|
||||
if vis[j] < 0.5 {
|
||||
if vis[j] < VISIBILITY_THRESHOLD {
|
||||
continue;
|
||||
}
|
||||
den += 1.0;
|
||||
|
|
@ -675,7 +700,7 @@ fn compute_torso_jitter(pred_kpts: &[Array2<f32>], visibility: &[Array1<f32>]) -
|
|||
let mut cy = 0.0_f32;
|
||||
let mut count = 0_usize;
|
||||
for &idx in &TORSO_INDICES {
|
||||
if vis[idx] >= 0.5 {
|
||||
if vis[idx] >= VISIBILITY_THRESHOLD {
|
||||
cx += kp[[idx, 0]];
|
||||
cy += kp[[idx, 1]];
|
||||
count += 1;
|
||||
|
|
@ -730,6 +755,50 @@ mod tests {
|
|||
use super::*;
|
||||
use ndarray::{Array1, Array2};
|
||||
|
||||
/// ADR-155 M2 §8: the de-magicked tuning consts must equal the prior inline
|
||||
/// literals exactly (operating-value guard against a future silent shift).
|
||||
#[test]
|
||||
fn ruview_metrics_consts_unchanged_from_literals() {
|
||||
assert_eq!(NUM_KEYPOINTS, 17);
|
||||
assert_eq!(VISIBILITY_THRESHOLD, 0.5_f32);
|
||||
assert_eq!(PCK_THRESHOLD, 0.2_f32);
|
||||
assert_eq!(MIN_BBOX_DIAG, 1e-3_f32);
|
||||
assert_eq!(MIN_DURATION_MINUTES, 1e-6_f32);
|
||||
}
|
||||
|
||||
/// Characterize `evaluate_tracking`'s duration floor: a zero-minute window
|
||||
/// must NOT produce an Inf per-minute false-track rate — it divides by the
|
||||
/// `MIN_DURATION_MINUTES` floor instead. Pins the guard.
|
||||
#[test]
|
||||
fn tracking_zero_duration_does_not_divide_by_zero() {
|
||||
let frames = vec![TrackingFrame {
|
||||
frame_idx: 0,
|
||||
gt_ids: vec![1],
|
||||
pred_ids: vec![1, 2], // one extra ⇒ a false positive track
|
||||
assignments: vec![(1, 1)],
|
||||
}];
|
||||
let r = evaluate_tracking(&frames, 0.0, &TrackingThresholds::default());
|
||||
assert!(
|
||||
r.false_tracks_per_min.is_finite(),
|
||||
"zero duration must not yield Inf false-tracks/min: {}",
|
||||
r.false_tracks_per_min
|
||||
);
|
||||
}
|
||||
|
||||
/// Characterize `compute_single_oks`'s short-array bound at exactly the
|
||||
/// `NUM_KEYPOINTS` edge and just below: fewer than 17 rows must score the
|
||||
/// available joints without panicking on `[j]`.
|
||||
#[test]
|
||||
fn oks_short_array_is_bounded_at_keypoint_count() {
|
||||
// 16 rows (one below NUM_KEYPOINTS): must not panic, finite result.
|
||||
let pred = Array2::<f32>::zeros((16, 2));
|
||||
let gt = Array2::<f32>::zeros((16, 2));
|
||||
let mut vis = Array1::<f32>::ones(16);
|
||||
vis[0] = 1.0;
|
||||
let oks = compute_single_oks(&pred, >, &vis, 1.0);
|
||||
assert!(oks.is_finite());
|
||||
}
|
||||
|
||||
fn make_perfect_kpts() -> (Array2<f32>, Array2<f32>, Array1<f32>) {
|
||||
let kp = Array2::from_shape_fn((17, 2), |(j, d)| {
|
||||
if d == 0 {
|
||||
|
|
|
|||
|
|
@ -20,6 +20,34 @@ use ndarray::{s, Array4};
|
|||
use ruvector_solver::neumann::NeumannSolver;
|
||||
use ruvector_solver::types::CsrMatrix;
|
||||
|
||||
// --- Sparse-interpolation tuning constants (ADR-155 M2 §8: de-magicked from
|
||||
// bare literals in `interpolate_subcarriers_sparse`; values bit-identical to the
|
||||
// prior inline literals — documentation only, no behaviour change). ---
|
||||
|
||||
/// Gaussian-basis width (in the normalised `[0,1]` subcarrier position space)
|
||||
/// for the sparse-interpolation kernel `exp(-Δ²/σ²)`. Wider σ ⇒ smoother fit.
|
||||
const SPARSE_BASIS_SIGMA: f32 = 0.15;
|
||||
|
||||
/// Sparsity cutoff: basis entries below this magnitude are dropped from the
|
||||
/// normal-equations assembly, keeping `AᵀA` sparse.
|
||||
const SPARSE_BASIS_THRESHOLD: f32 = 1e-4;
|
||||
|
||||
/// Tikhonov regularisation strength `λ` added to the `AᵀA` diagonal for
|
||||
/// numerical stability of the (possibly ill-conditioned) normal equations.
|
||||
const SPARSE_REGULARIZATION_LAMBDA: f32 = 0.1;
|
||||
|
||||
/// Magnitude below which an assembled `AᵀA` entry is treated as structurally
|
||||
/// zero and omitted from the COO triplet list.
|
||||
const SPARSE_COO_PRUNE_EPS: f32 = 1e-8;
|
||||
|
||||
/// Convergence tolerance for the Neumann-series sparse solver (`f64` to match
|
||||
/// [`NeumannSolver::new`]).
|
||||
const SPARSE_SOLVER_TOL: f64 = 1e-5;
|
||||
|
||||
/// Maximum Neumann-series iterations before the solver returns (falls back to
|
||||
/// linear interpolation on non-convergence).
|
||||
const SPARSE_SOLVER_MAX_ITERS: usize = 500;
|
||||
|
||||
// ---------------------------------------------------------------------------
|
||||
// interpolate_subcarriers
|
||||
// ---------------------------------------------------------------------------
|
||||
|
|
@ -167,7 +195,7 @@ pub fn interpolate_subcarriers_sparse(arr: &Array4<f32>, target_sc: usize) -> Ar
|
|||
|
||||
// Build the Gaussian basis matrix A: [src_sc, target_sc]
|
||||
// A[j, k] = exp(-((j/(n_sc-1) - k/(target_sc-1))^2) / sigma^2)
|
||||
let sigma = 0.15_f32;
|
||||
let sigma = SPARSE_BASIS_SIGMA;
|
||||
let sigma_sq = sigma * sigma;
|
||||
|
||||
// Source and target normalized positions in [0, 1]
|
||||
|
|
@ -191,12 +219,12 @@ pub fn interpolate_subcarriers_sparse(arr: &Array4<f32>, target_sc: usize) -> Ar
|
|||
.collect();
|
||||
|
||||
// Only include entries above a sparsity threshold
|
||||
let threshold = 1e-4_f32;
|
||||
let threshold = SPARSE_BASIS_THRESHOLD;
|
||||
|
||||
// Build A^T A + λI regularized system for normal equations
|
||||
// We solve: (A^T A + λI) x = A^T b
|
||||
// A^T A is [target_sc × target_sc]
|
||||
let lambda = 0.1_f32; // regularization
|
||||
let lambda = SPARSE_REGULARIZATION_LAMBDA;
|
||||
let mut ata_coo: Vec<(usize, usize, f32)> = Vec::new();
|
||||
|
||||
// Compute A^T A
|
||||
|
|
@ -226,7 +254,7 @@ pub fn interpolate_subcarriers_sparse(arr: &Array4<f32>, target_sc: usize) -> Ar
|
|||
for (k, row) in ata.iter().enumerate() {
|
||||
for (k2, &cell) in row.iter().enumerate() {
|
||||
let val = cell + if k == k2 { lambda } else { 0.0 };
|
||||
if val.abs() > 1e-8 {
|
||||
if val.abs() > SPARSE_COO_PRUNE_EPS {
|
||||
ata_coo.push((k, k2, val));
|
||||
}
|
||||
}
|
||||
|
|
@ -234,7 +262,7 @@ pub fn interpolate_subcarriers_sparse(arr: &Array4<f32>, target_sc: usize) -> Ar
|
|||
|
||||
// Build CsrMatrix for the normal equations system (A^T A + λI)
|
||||
let normal_matrix = CsrMatrix::<f32>::from_coo(target_sc, target_sc, ata_coo);
|
||||
let solver = NeumannSolver::new(1e-5, 500);
|
||||
let solver = NeumannSolver::new(SPARSE_SOLVER_TOL, SPARSE_SOLVER_MAX_ITERS);
|
||||
|
||||
let mut out = Array4::<f32>::zeros((n_t, n_tx, n_rx, target_sc));
|
||||
|
||||
|
|
@ -350,6 +378,42 @@ mod tests {
|
|||
use super::*;
|
||||
use approx::assert_abs_diff_eq;
|
||||
|
||||
/// ADR-155 M2 §8: the de-magicked sparse-interpolation consts must equal the
|
||||
/// prior inline literals exactly (operating-value guard).
|
||||
#[test]
|
||||
fn sparse_interp_consts_unchanged_from_literals() {
|
||||
assert_eq!(SPARSE_BASIS_SIGMA, 0.15_f32);
|
||||
assert_eq!(SPARSE_BASIS_THRESHOLD, 1e-4_f32);
|
||||
assert_eq!(SPARSE_REGULARIZATION_LAMBDA, 0.1_f32);
|
||||
assert_eq!(SPARSE_COO_PRUNE_EPS, 1e-8_f32);
|
||||
assert_eq!(SPARSE_SOLVER_TOL, 1e-5_f64);
|
||||
assert_eq!(SPARSE_SOLVER_MAX_ITERS, 500);
|
||||
}
|
||||
|
||||
/// Characterize the `target_sc == 1` boundary of `compute_interp_weights`:
|
||||
/// the single output maps to source index 0 with zero fraction (the special
|
||||
/// branch that avoids dividing by `target_sc - 1 == 0`).
|
||||
#[test]
|
||||
fn compute_interp_weights_single_target_is_index_zero() {
|
||||
let w = compute_interp_weights(7, 1);
|
||||
assert_eq!(w.len(), 1);
|
||||
let (i0, i1, frac) = w[0];
|
||||
assert_eq!(i0, 0);
|
||||
assert_eq!(i1, 0);
|
||||
assert_abs_diff_eq!(frac, 0.0_f32, epsilon = 1e-6);
|
||||
}
|
||||
|
||||
/// Characterize sparse interpolation to a single subcarrier: must produce
|
||||
/// the right shape and a finite value (exercises the `target_sc == 1`
|
||||
/// normalized-position branch).
|
||||
#[test]
|
||||
fn sparse_interp_single_target_is_finite() {
|
||||
let arr = Array4::<f32>::from_shape_fn((2, 1, 1, 8), |(_, _, _, k)| k as f32);
|
||||
let out = interpolate_subcarriers_sparse(&arr, 1);
|
||||
assert_eq!(out.shape(), &[2, 1, 1, 1]);
|
||||
assert!(out.iter().all(|v| v.is_finite()));
|
||||
}
|
||||
|
||||
#[test]
|
||||
fn identity_resample() {
|
||||
let arr =
|
||||
|
|
|
|||
|
|
@ -17,6 +17,15 @@
|
|||
|
||||
use std::f32::consts::PI;
|
||||
|
||||
/// Floor on the Box-Muller `u1` sample so `ln(u1)` stays finite when the PRNG
|
||||
/// returns ≈0 (ADR-155 M2 §8: de-magicked from a bare `1e-10`; value unchanged).
|
||||
const BOX_MULLER_U1_FLOOR: f32 = 1e-10;
|
||||
|
||||
/// Magnitude below which `room_scale` is treated as zero and the amplitude
|
||||
/// division is skipped (guards `val / room_scale` against ÷≈0). De-magicked from
|
||||
/// a bare `1e-10`; value unchanged, no behaviour change.
|
||||
const MIN_ROOM_SCALE: f32 = 1e-10;
|
||||
|
||||
// ---------------------------------------------------------------------------
|
||||
// Xorshift64 PRNG (matches dataset.rs pattern)
|
||||
// ---------------------------------------------------------------------------
|
||||
|
|
@ -67,7 +76,7 @@ impl Xorshift64 {
|
|||
/// Sample an approximate Gaussian (mean=0, std=1) via Box-Muller.
|
||||
#[inline]
|
||||
pub fn next_gaussian(&mut self) -> f32 {
|
||||
let u1 = self.next_f32().max(1e-10);
|
||||
let u1 = self.next_f32().max(BOX_MULLER_U1_FLOOR);
|
||||
let u2 = self.next_f32();
|
||||
(-2.0 * u1.ln()).sqrt() * (2.0 * PI * u2).cos()
|
||||
}
|
||||
|
|
@ -158,7 +167,7 @@ impl VirtualDomainAugmentor {
|
|||
for (k, &val) in frame.iter().enumerate() {
|
||||
let k_f = k as f32;
|
||||
// 1. Room-scale amplitude attenuation (guard against zero scale)
|
||||
let scaled = if domain.room_scale.abs() < 1e-10 {
|
||||
let scaled = if domain.room_scale.abs() < MIN_ROOM_SCALE {
|
||||
val
|
||||
} else {
|
||||
val / domain.room_scale
|
||||
|
|
@ -207,6 +216,42 @@ impl VirtualDomainAugmentor {
|
|||
mod tests {
|
||||
use super::*;
|
||||
|
||||
/// ADR-155 M2 §8: the de-magicked guard epsilons must equal the prior inline
|
||||
/// `1e-10` literals exactly (operating-value guard).
|
||||
#[test]
|
||||
fn virtual_aug_guard_consts_unchanged_from_literals() {
|
||||
assert_eq!(BOX_MULLER_U1_FLOOR, 1e-10_f32);
|
||||
assert_eq!(MIN_ROOM_SCALE, 1e-10_f32);
|
||||
}
|
||||
|
||||
/// Characterize the zero-room-scale guard: a `room_scale` of exactly 0 must
|
||||
/// pass amplitude through unscaled (the guard branch), never produce
|
||||
/// Inf/NaN from `val / 0`.
|
||||
#[test]
|
||||
fn augment_frame_zero_room_scale_passes_amplitude_finite() {
|
||||
let aug = VirtualDomainAugmentor::default();
|
||||
let domain = VirtualDomain {
|
||||
room_scale: 0.0,
|
||||
// reflection_coeff = 1.0 ⇒ refl = 1.0 + (1-1)·cos(..) = 1.0 (constant,
|
||||
// so the reflection step is the identity for this characterization).
|
||||
reflection_coeff: 1.0,
|
||||
n_scatterers: 0, // no scatterer interference
|
||||
noise_std: 0.0, // no additive noise
|
||||
domain_id: 1,
|
||||
};
|
||||
let frame = vec![1.0_f32, 2.0, 3.0, 4.0];
|
||||
let out = aug.augment_frame(&frame, &domain);
|
||||
assert_eq!(out.len(), frame.len());
|
||||
assert!(
|
||||
out.iter().all(|v| v.is_finite()),
|
||||
"zero room_scale must not yield Inf/NaN: {out:?}"
|
||||
);
|
||||
// With every other transform neutralised, the guard leaves amplitude as-is.
|
||||
for (o, f) in out.iter().zip(frame.iter()) {
|
||||
assert!((o - f).abs() < 1e-6, "expected pass-through, got {o} vs {f}");
|
||||
}
|
||||
}
|
||||
|
||||
fn make_domain(scale: f32, coeff: f32, scatter: usize, noise: f32, id: u32) -> VirtualDomain {
|
||||
VirtualDomain {
|
||||
room_scale: scale,
|
||||
|
|
|
|||
Loading…
Reference in New Issue