* refactor(train): ADR-155 M2 §8 — de-magic train non-tch tuning constants + boundary tests Lift bare numeric literals used as thresholds / guard epsilons in the non-tch (host-verifiable) train surface into named, documented consts and pin each set with a *_consts_unchanged_from_literals test. Values are bit-identical to the prior inline literals — cleanup, no behaviour change. De-magicked (const + pin test): - metrics_core.rs: VISIBILITY_THRESHOLD (0.5), MIN_REFERENCE_EXTENT (1e-6), OKS_FALLBACK_SIGMA (0.07) - ruview_metrics.rs: NUM_KEYPOINTS (17), VISIBILITY_THRESHOLD (0.5), PCK_THRESHOLD (0.2), MIN_BBOX_DIAG (1e-3), MIN_DURATION_MINUTES (1e-6) - subcarrier.rs: SPARSE_BASIS_SIGMA (0.15), SPARSE_BASIS_THRESHOLD (1e-4), SPARSE_REGULARIZATION_LAMBDA (0.1), SPARSE_COO_PRUNE_EPS (1e-8), SPARSE_SOLVER_TOL (1e-5 f64), SPARSE_SOLVER_MAX_ITERS (500) - eval.rs: MIN_POSITIVE_MPJPE (1e-10) - domain.rs: LAYER_NORM_EPS (1e-5) - virtual_aug.rs: BOX_MULLER_U1_FLOOR (1e-10), MIN_ROOM_SCALE (1e-10) Boundary / characterization tests (pin CURRENT behaviour): - visibility_threshold_boundary_is_inclusive (>= 0.5 at the edge) - degenerate_extent_below_floor_is_unscoreable ((0,0,0.0)/0.0, not perfect) - tracking_zero_duration_does_not_divide_by_zero - oks_short_array_is_bounded_at_keypoint_count (16 rows, no panic) - compute_interp_weights_single_target_is_index_zero (target_sc==1) - sparse_interp_single_target_is_finite - domain_gap_infinite_when_in_domain_perfect_but_cross_nonzero - domain_gap_unity_when_everything_perfect - augment_frame_zero_room_scale_passes_amplitude_finite Doc-only (no behaviour change): - rapid_adapt.rs: correct module-doc O(eps) -> O(eps^2) for central differences - geometry.rs: add # Panics to DeepSets::encode (documents existing assert!) train --no-default-features: 191 lib (was 176), 303 total (was 288), 0 failed. Co-Authored-By: claude-flow <ruv@ruv.net> * feat(nn): ADR-155 M2 §3 — pure-Rust LinearHead::try_new input guard + de-magic softplus threshold ADR-155 §3 found rf_encoder.rs has no adversarial checkpoint-deserialization assert — its assert_eq!s in LinearHead::new are construction-time API contracts on programmer-supplied vectors. This adds the honest, in-scope improvement the M2 task allows: a pure-Rust *fallible* constructor so weights from an untrusted / deserialized checkpoint can be shape-validated without panicking. - Add RfHeadError (WeightShape / BiasShape / VarWeightShape) + Display + Error. - Add LinearHead::try_new returning Result<Self, RfHeadError>; on success the head is byte-identical to LinearHead::new. new() is unchanged (still asserts; now documents # Panics and points to try_new) — no behaviour change for existing callers. - De-magic softplus's bare 20.0 overflow threshold into SOFTPLUS_LINEAR_THRESHOLD (value unchanged) + pin test. Tests: try_new_accepts_valid_and_rejects_each_bad_shape (valid == new forward; each bad shape → typed error), softplus_threshold_unchanged_from_literal. nn --no-default-features lib: 37 passed (was 35), 0 failed. Co-Authored-By: claude-flow <ruv@ruv.net> * perf(nn): ADR-155 M2 §4 — native-conv bench-first → MEASURED-INCONCLUSIVE (no perf change shipped) The §8 "native-conv naive-loop rewrite" backlog item: DensePoseHead:: apply_conv_layer is a pure-Rust 6-nested-loop conv (benchable on this host, not tch/ort-gated). Bench-first per the §0 PROOF discipline. - Add committed criterion bench benches/native_conv_bench.rs measuring forward() through the naive conv on representative single-layer configs (--no-default- features; no ort download). - Prototyped a bit-identical range-clamped variant (hoist the per-tap in-bounds branch by pre-clamping kh/kw ranges; same ic→kh→kw MAC order ⇒ bit-identical). MEASURED before/after on this host: ~35% faster on padding-heavy small-channel maps (4.40→2.84 ms) but a ~3% *regression* on channel-heavy maps (11.09→11.48 ms), all inside a ±20% run-to-run noise floor. Verdict: INCONCLUSIVE — the benefit is not robustly positive, so the rewrite is NOT shipped and NOT a fabricated speedup. Reverted to the naive loop; honestly deferred (ADR-155 §8). - Add native_conv_matches_reference: a hand-computed characterization anchor (1×1 = scalar MAC; same-padded 3×3 ones = truncated-window sums 9/6/4) pinning CURRENT conv behaviour for any future rewrite. nn --no-default-features lib: 38 passed (was 37), 0 failed. No behaviour change. Co-Authored-By: claude-flow <ruv@ruv.net> * docs(adr-155): M2 §8.2 — enumerated host-verifiable P3 backlog clearance + CHANGELOG Replace the §8 bulk "~40 lower-severity findings" line with the real, enumerated M2 resolution (§8.2): 7 de-magicked (const + pin == prior literal), 9 boundary tests, 1 input guard (rf_encoder try_new), 2 doc-only, 1 perf bench-first MEASURED-INCONCLUSIVE (not shipped). Mark native-conv + rf_encoder RESOLVED; state which §8 items stay data-gated (GraphPose-Fi/INT4/CSI-JEPA) or tch-gated (proof/trainer/model panic sites, metrics *_v2 dead code) and ONNX read-lock upstream-gated — blocked, not dropped. Declare the non-tch-verifiable subset of §8 cleared. Validation: train --no-default-features 303 passed (was 288); nn lib 38 (was 35); workspace --no-default-features 3,293 passed, 0 failed; Python proof VERDICT PASS, hash f8e76f21…46f7a UNCHANGED bit-exact. Co-Authored-By: claude-flow <ruv@ruv.net> |
||
|---|---|---|
| .. | ||
| benches | ||
| src | ||
| tests/fixtures | ||
| Cargo.toml | ||
| README.md | ||
README.md
wifi-densepose-nn
Multi-backend neural network inference for WiFi-based DensePose estimation.
Overview
wifi-densepose-nn provides the inference engine that maps processed WiFi CSI features to
DensePose body surface predictions. It supports three backends -- ONNX Runtime (default),
PyTorch via tch-rs, and Candle -- so models can run on CPU, CUDA GPU, or TensorRT depending
on the deployment target.
The crate implements two key neural components:
- DensePose Head -- Predicts 24 body part segmentation masks and per-part UV coordinate regression.
- Modality Translator -- Translates CSI feature embeddings into visual feature space, bridging the domain gap between WiFi signals and image-based pose estimation.
Features
- ONNX Runtime backend (default) -- Load and run
.onnxmodels with CPU or GPU execution providers. - PyTorch backend (
tch-backend) -- Native PyTorch inference via libtorch FFI. - Candle backend (
candle-backend) -- Pure-Rust inference withcandle-coreandcandle-nn. - CUDA acceleration (
cuda) -- GPU execution for supported backends. - TensorRT optimization (
tensorrt) -- INT8/FP16 optimized inference via ONNX Runtime. - Batched inference -- Process multiple CSI frames in a single forward pass.
- Model caching -- Memory-mapped model weights via
memmap2.
Feature flags
| Flag | Default | Description |
|---|---|---|
onnx |
yes | ONNX Runtime backend |
tch-backend |
no | PyTorch (tch-rs) backend |
candle-backend |
no | Candle pure-Rust backend |
cuda |
no | CUDA GPU acceleration |
tensorrt |
no | TensorRT via ONNX Runtime |
all-backends |
no | Enable onnx + tch + candle together |
Quick Start
use wifi_densepose_nn::{InferenceEngine, DensePoseConfig, OnnxBackend};
// Create inference engine with ONNX backend
let config = DensePoseConfig::default();
let backend = OnnxBackend::from_file("model.onnx")?;
let engine = InferenceEngine::new(backend, config)?;
// Run inference on a CSI feature tensor
let input = ndarray::Array4::zeros((1, 256, 64, 64));
let output = engine.infer(&input)?;
println!("Body parts: {}", output.body_parts.shape()[1]); // 24
Architecture
wifi-densepose-nn/src/
lib.rs -- Re-exports, constants (NUM_BODY_PARTS=24), prelude
densepose.rs -- DensePoseHead, DensePoseConfig, DensePoseOutput
inference.rs -- Backend trait, InferenceEngine, InferenceOptions
onnx.rs -- OnnxBackend, OnnxSession (feature-gated)
tensor.rs -- Tensor, TensorShape utilities
translator.rs -- ModalityTranslator (CSI -> visual space)
error.rs -- NnError, NnResult
Related Crates
| Crate | Role |
|---|---|
wifi-densepose-core |
Foundation types and NeuralInference trait |
wifi-densepose-signal |
Produces CSI features consumed by inference |
wifi-densepose-train |
Trains the models this crate loads |
ort |
ONNX Runtime Rust bindings |
tch |
PyTorch Rust bindings |
candle-core |
Hugging Face pure-Rust ML framework |
License
MIT OR Apache-2.0