wifi-densepose/benchmarks
ruv 41665d3de9 test(wasm-edge): synthetic-ground-truth validation harness for edge skills (ADR-160)
Plant signals with known answers, run the real detector, MEASURE detection
accuracy / precision / recall / rate-error — synthetic-ground-truth ONLY, not
field accuracy.

MEASURED-on-synthetic (12 tests, all green):
- vital_trend, exo_ghost_hunter(hidden breathing), occupancy, intrusion,
  exo_rain_detect, sig_optimal_transport: acc 1.000
- exo_time_crystal: 1.000 on periodic-vs-aperiodic (its sub-harmonic-vs-clean-
  period claim is NOT separable by autocorrelation — recorded honestly)
- sig_flash_attention: 8/8 peak localization; spt_spiking_tracker: 4/4 zone
  localization (sparse plant); sig_mincut_person_match: 0 id-swaps/40 frames
- lrn_dtw_gesture_learn: enrollment validated (replay-match reported, not asserted)
- sig_sparse_recovery: trigger validated; recovery accuracy reported NEGATIVE
  (-2.2% vs unrecovered baseline) — only its detect/trigger path is validated

DATA-GATED (listed, NOT faked): med_seizure/apnea/cardiac/respiratory/gait,
sec_weapon_detect, exo_emotion/happiness/dream_stage/gesture_language — each
needs real labelled clinical/affect/ASL/metal-object data; no number claimed.

benchmarks/edge-skills/RESULTS.md documents every result + reproduce command and
the explicit honesty boundary. ADR-160 deferred 'per-skill accuracy validation'
item updated to PARTIALLY MEASURED-on-synthetic + DATA-GATED.

Suite: 631 passed default / 669 medical, 0 failed.

Co-Authored-By: claude-flow <ruv@ruv.net>
2026-06-13 00:33:51 -04:00
..
edge-latency docs(ADR-163): edge-latency RESULTS + PROOF/prove.sh wiring (T3) 2026-06-12 08:02:07 -04:00
edge-skills test(wasm-edge): synthetic-ground-truth validation harness for edge skills (ADR-160) 2026-06-13 00:33:51 -04:00
wiflow-std ADR-152: WiFi-Pose SOTA 2026 intake — WiFlow-STD benchmark, Rust integrations, ADR-153 802.11bf layer, efficiency frontier (#1008) 2026-06-11 17:02:23 -04:00