Commit Graph

955 Commits

Author SHA1 Message Date
rUv b8e870b314
Merge pull request #1025 from ruvnet/feat/v2-beyond-sota-sweep-m7
Beyond-SOTA sweep M7 (ADR-161): HOMECORE WS auth-bypass fix + automation engine + security
2026-06-12 01:15:42 -04:00
ruv d1328b0299 test(homecore-api): serialize HOMECORE_CORS_ORIGINS env tests (fix parallel race)
env_override_* and env_empty_* both set_var/remove_var the same process-global
HOMECORE_CORS_ORIGINS; under full-workspace parallelism they raced (one's
remove_var wiped the other's value mid-assert). Serialize via a poison-tolerant
module Mutex. Test-only.

Co-Authored-By: claude-flow <ruv@ruv.net>
2026-06-12 01:00:58 -04:00
ruv d0da5888e3 docs(adr): ADR-161 — HOMECORE server-layer security & honest-labeling sweep (M7)
Records the Milestone 7 audit: library cores are real (anti-slop positive) but
the network boundary had a CRITICAL WS auth bypass (A1) + reply-theater (A2) +
documented-but-no-op automation (A3-A7) + a network-exposed dev bin (A8), all
fixed and graded MEASURED with failing-on-old tests. Cites the NO-ACTION
security positives (uuid::v4 CSPRNG refuted-suspicion, hardened CORS,
no-traversal migrate, no-secrets-in-logs, honest HAP stub) and the deferred
backlog (plugin authority-isolation P5, sig-verification P4, HAP real pairing
P2, bounded run-modes, YAML load-at-boot).

Co-Authored-By: claude-flow <ruv@ruv.net>
2026-06-12 00:55:52 -04:00
ruv e51704cd25 docs(homecore-plugins): label sig/hash fields '(P4 - not yet enforced)' (ADR-161 B5)
manifest.rs documented wasm_module_hash as 'verified before execution' but
wasm_module_hash/wasm_module_sig/publisher_key are never read for verification
(only set to None in tests). Re-doc'd the three fields as P4-not-yet-enforced
so the doc matches the code. No verification code added (that is P4); no false
capability claimed.

Co-Authored-By: claude-flow <ruv@ruv.net>
2026-06-12 00:55:51 -04:00
ruv dff75a479e fix(homecore-automation): start engine + implement time/run-mode/choose/template (ADR-161 A3-A7)
A3 (HIGH): homecore-server constructed AutomationEngine then dropped it
immediately while the doc claimed automation was active. Now .start()s the
engine into a long-lived binding (event loop + timer task).

A4 (HIGH): Trigger::Time was hard-coded false with no timer. Added a 1 Hz
wall-clock timer task that fires time: automations when local HH:MM:SS matches
'at' (HH:MM or HH:MM:SS); matches_sync(Time)=false is now correct + documented.

A5 (HIGH): RunMode was documented as AtomicBool-enforced but every trigger
spawned unbounded parallel. Each automation now carries a running AtomicBool;
Single/IgnoreFirst skip re-entrant triggers, Parallel fires every time.
(Bounded Queued/Restart/max → ACCEPTED-FUTURE, honestly stated in the doc.)

A6 (HIGH): Action::Choose discarded choices and always ran default. Now
deserialises each branch's conditions, evaluates them, and runs the first
matching branch; default only if none match.

A7 (MEDIUM): template: conditions were always false in the engine path
(EvalContext built with template_env: None). The engine now builds a
TemplateEnvironment over the state machine and threads it into every
EvalContext (event loop, timer, Choose).

Tests (fail on old source):
- engine_behaviors::time_trigger_fires_via_timer_path (A4)
- engine_behaviors::single_mode_does_not_double_fire_on_rapid_triggers (A5; old fired 2x)
- engine_behaviors::parallel_mode_does_fire_concurrently (A5)
- action::choose_runs_matching_branch_not_default (A6; old ran default)
- engine_behaviors::template_condition_evaluates_true_in_engine (A7; old always false)

engine.rs kept <500 lines; behavioral tests moved to tests/engine_behaviors.rs.

Co-Authored-By: claude-flow <ruv@ruv.net>
2026-06-12 00:55:34 -04:00
ruv 9d52d49c0b fix(homecore-api): close WS auth bypass + reply-theater, harden dev bin (ADR-161 A1/A2/A8)
A1 (CRITICAL): the /api/websocket handshake accepted any non-empty token,
ignoring the LongLivedTokenStore whitelist the REST path enforces — a full
WS auth bypass. Now validates via state.tokens().is_valid() before auth_ok;
wrong tokens get auth_invalid + close.

A2 (HIGH): WS command replies were pushed into an mpsc whose only consumer
logged and discarded them — no result/pong/event reached the client. Split
the socket with futures StreamExt::split; a dedicated writer task drains the
response channel onto the wire.

A8 (HIGH): the homecore-api dev bin bound 0.0.0.0 with unconditional
allow-any auth and no env path. Wired the HOMECORE_TOKENS env path (dev
fallback warn-logged when unset) and defaulted the bind to 127.0.0.1
(HOMECORE_BIND to opt into LAN).

Tests (fail on old source):
- ws_handshake::wrong_token_is_rejected (old → auth_ok)
- ws_handshake::result_reply_is_received / ping_pong_reply_is_received (old → timeout)
- server_bin_auth::provisioned_bin_rejects_wrong_bearer / from_env_path_enforces_whitelist

Co-Authored-By: claude-flow <ruv@ruv.net>
2026-06-12 00:55:16 -04:00
rUv d0a7690f8f
Merge pull request #1024 from ruvnet/feat/v2-beyond-sota-sweep-m5
Beyond-SOTA sweep M5–M6 (ADR-159/160): appliance + edge-skill honesty + crates.io publish
2026-06-12 00:39:21 -04:00
ruv 8487192d0f docs(proof): PROOF.md capstone + scripts/prove.sh reproduction harness
One-command harness: clone, run scripts/prove.sh, and every headline claim is
either verified on your machine (re-runs the bug-catching tests) or printed as
'CLAIMED — not reproduced here' with the exact prerequisite. Hard gate =
workspace tests + deterministic Python proof; section 3 re-runs 7 anti-slop
assertion tests (each fails on pre-fix code); gated claims (GPU/dataset/hardware/
trained-checkpoint/named-identity) are honestly listed, never faked.

Co-Authored-By: claude-flow <ruv@ruv.net>
2026-06-12 00:19:43 -04:00
ruv d120cc2278 test(sensing-server): unique per-process temp dirs (deterministic under concurrent runs)
checkpoint_round_trip / rvf_test / rvf_pipeline_test shared fixed temp_dir paths
and remove_dir at teardown, so two concurrent/repeated test runs raced (one's
teardown wiped the other's file -> NotFound). Make each dir process-unique.
Test-only; no public API change.

Co-Authored-By: claude-flow <ruv@ruv.net>
2026-06-12 00:11:24 -04:00
ruv 8ad0d0f91c test+docs(wasm-edge): honest-labeling presence tests + ADR-160 (ADR-159 backlog now TRUE)
- tests/honest_labeling.rs: 10 source-presence tests asserting the A1-A5 claim
  invariants (disclaimers present, uncited stat removed, WEAPON_ALERT no longer
  exported, med_* feature-gated, no static-mut event buffers). Each is designed to
  FAIL on the pre-fix source (ADR-159 A5 manifest-roundtrip style).
- ADR-160: records the headline (0 stubs/0 theater, all real DSP -> claim-surface
  honesty debt), the graded A1-A5 fixes, NO-ACTION positives, per-prefix
  classification, and the DATA-GATED deferred backlog (criterion benches,
  per-skill accuracy validation, wasm32 static_mut_refs CI confirmation).
- ADR-159: its deferred-backlog line "wasm-edge ... honestly labelled, not claimed"
  is now actually TRUE.

Validation (all 0 failed, host --features std):
  DEFAULT 615 | MEDICAL (+medical-experimental) 653 | NO-DEFAULT 615; 0 warnings.

Co-Authored-By: claude-flow <ruv@ruv.net>
2026-06-12 00:01:22 -04:00
ruv 36af09a4a8 feat(wasm-edge): honest labeling + static-mut soundness for edge skills (ADR-160)
The wasm-edge skill library runs real DSP with 0 stubs / 0 theater; the exposure
is an over-confident claim surface on unvalidated skills plus a latent static-mut
soundness issue. Make the labels TRUE (do not pretend to validate the capability)
and fix the soundness mechanically:

- A1 (HIGH): med_seizure/cardiac/respiratory/sleep_apnea/gait -- add mandatory
  "EXPERIMENTAL / NOT VALIDATED AGAINST CLINICAL DATA / NOT A MEDICAL DEVICE"
  disclaimers, soften assertive verbs to "flags candidate <X>-like signatures",
  and gate all 5 behind a NON-default medical-experimental cargo feature so they
  cannot be silently shipped. DSP kept.
- A2 (HIGH): exo_happiness_score/exo_emotion_detect -- delete the uncited
  "~12% faster" stat, add "speculative, unvalidated affect heuristic; outputs are
  NOT measurements of emotion" disclaimers, reframe HAPPINESS_SCORE as a
  gait-energy proxy. Math kept.
- A3 (MEDIUM): sec_weapon_detect -- rename EVENT_WEAPON_ALERT ->
  EVENT_HIGH_METAL_REFLECTIVITY and WEAPON_RATIO_THRESH -> HIGH_REFLECTIVITY_THRESH
  (a variance ratio measures reflectivity, not weapons). Registry updated.
- A4 (MEDIUM): exo_dream_stage/exo_gesture_language -- add experimental
  disclaimers, promote the Exotic/Research tag into the header.
- A5 (MEDIUM, soundness): replace ~61 `static mut EVENTS`/EV/TE/EMPTY per-call
  scratch buffers (60 modules) with owned per-instance `events` fields returned as
  `&self.events[..n]`. Public signature unchanged; behavior preserved. Only the
  two legitimate single-threaded WASM module singletons (lib.rs STATE,
  ghost_hunter DETECTOR) remain as static mut. Removes the static_mut_refs source.

NO-ACTION positives (cited, labels untouched): qnt_* (quantum-/Grover-inspired,
disclosed), exo_time_crystal, exo_ghost_hunter, sig_*/lrn_* algorithm-named skills.

Co-Authored-By: claude-flow <ruv@ruv.net>
2026-06-12 00:01:04 -04:00
ruv 772ece4568 docs(adr): ADR-159 Cognitum appliance beyond-SOTA sweep
Records the anti-AI-slop sweep over cog-person-count, cog-pose-estimation,
cog-ha-matter, ruview-swarm. HEADLINE: the "never identified anyone"
accusation is REFUTED (real SHA-pinned Ed25519-signed trained Candle
models, honest 34%/3% accuracy in manifests). Documents claim-surface
fixes A1-A5 (MEASURED), NO-ACTION positives (witness chain, fusion, PPO +
randn audit), graded SOTA landscape (counting/pose DATA-GATED, swarm MARL
untrained-at-runtime by design), and the deferred backlog (benches,
Location/Vector, Matter v0.8, wasm-edge accuracy).

Co-Authored-By: claude-flow <ruv@ruv.net>
2026-06-11 23:10:03 -04:00
ruv 48b002fa7e docs(cog-ha-matter): stop claiming Matter until it exists (ADR-159 A5)
Matter commissioning is deferred to v0.8 (TlsConfig::Off, LAN-only, per
tls_defaults_to_off_for_v1_lan_only). Soften the Cargo.toml description
from "Home Assistant + Matter integration" to "Home Assistant (MQTT)
integration ... Matter Bridge commissioning is deferred to v0.8 and not
yet implemented" (honest-absence, ADR-158 pattern). No code change.

Co-Authored-By: claude-flow <ruv@ruv.net>
2026-06-11 23:10:02 -04:00
ruv 8d9c5994db fix(ruview-swarm): honest NED metres in Remote ID, not WGS84 (ADR-159 A3)
RemoteIdBroadcast::update stored NED metres (state.position.x/.y) into
drone_lat/drone_lon, so the ASTM F3411 broadcast would carry physically
-impossible coordinates ("latitude = 37.5 m"). The module doc claimed a
Location/Vector message but only encode_basic_id() exists.

- Rename drone_lat/drone_lon -> drone_north_m/drone_east_m (NED metres
  relative to the operator/takeoff datum), documented as non-geodetic.
  operator_lat/lon stay true WGS84.
- Correct the module doc to claim Basic ID only; Location/Vector encoding
  is deferred until a datum-anchored NED->WGS84 transform lands.

Never broadcast physically-impossible coordinates.

Failing-on-old test:
security::remote_id::tests::test_ned_offset_stored_as_metres_not_latlon.

Co-Authored-By: claude-flow <ruv@ruv.net>
2026-06-11 23:10:02 -04:00
ruv 6b5fd3cf25 fix(cog-person-count): emit real signed manifest from CLI (ADR-159 A4)
cmd_manifest emitted a null skeleton (binary_sha256: null) while the
real signed manifest existed on disk at
cog/artifacts/manifests/<arch>/manifest.json.

- New manifest module include_str!-embeds the real signed manifests
  (x86_64 + arm), selected by build target arch.
- cmd_manifest parses-then-emits the embedded signed manifest, mirroring
  cog-pose-estimation manifest_roundtrips. CLI now reports the real
  binary_sha256, weights_sha256, Ed25519 signature, and honest
  build_metadata (training_class1_accuracy = 0.343).

Failing-on-old test:
manifest::tests::embedded_manifest_has_non_null_binary_sha256 (+
embedded_manifest_is_signed, embedded_manifest_id_matches_cog).
Verified end-to-end: cog-person-count manifest -> non-null sha256.

Co-Authored-By: claude-flow <ruv@ruv.net>
2026-06-11 23:10:01 -04:00
ruv 2400216920 fix(cog-person-count): flag untrained-class counts low_confidence (ADR-159 A2)
The count head has 8 classes but count_train_results.json only has
support for classes 0/1 (presence, not multi-occupant counting). An
argmax on classes 2..=7 is out-of-distribution, yet the cog emitted it
as a confident headcount and the crate billed itself a "multi-person
counter".

- Add MAX_TRAINED_CLASS=1, CountPrediction::is_low_confidence() and
  clamped_count().
- person.count events now carry low_confidence + raw_count, downgrade to
  level "warn" when OOD, and clamp the reported count to the trained
  range (no fabricated headcount).
- run.started discloses count_max_trained_class / count_classes.
- Cargo.toml description: "multi-person counter" ->
  "presence detector + (data-gated) person count".

Multi-occupant accuracy stays DATA-GATED (not fabricated).

Failing-on-old test: untrained_class_argmax_is_flagged_low_confidence.

Co-Authored-By: claude-flow <ruv@ruv.net>
2026-06-11 23:10:01 -04:00
ruv 98bf8c4726 fix(cog-pose-estimation): emit frames under default config (ADR-159 A1)
pose_v1 has no confidence head, so infer() emits a constant 0.185 per
frame. The config default_min_confidence was 0.3 and the runtime gates
on confidence >= min_confidence, so a default install silently emitted
ZERO pose.frame events while health reported healthy.

- Add inference::MODEL_TYPICAL_CONFIDENCE (0.185, the validation PCK@50)
  as the single published per-frame confidence.
- Pin default_min_confidence() to MODEL_TYPICAL_CONFIDENCE so a default
  install clears its own gate and emits.
- Warn at run.started when min_confidence exceeds the model typical
  confidence (disclosed, not silent); document the trade-off in the
  config field, the JSON schema, and inference.rs.

Failing-on-old test: default_config_emits_frames_with_real_model
(with old 0.3 it panics: "default install would emit zero pose.frame
events").

Co-Authored-By: claude-flow <ruv@ruv.net>
2026-06-11 23:10:00 -04:00
ruv 2e4461d64d release: bump 9 crates changed in the beyond-SOTA sweep for crates.io
vitals/wifiscan/hardware/nn 0.3.0->0.3.1, ruvector 0.3.1->0.3.2,
signal 0.3.2->0.3.3, train 0.3.1->0.3.2, mat 0.3.0->0.3.1,
sensing-server 0.3.1->0.3.2.

Co-Authored-By: claude-flow <ruv@ruv.net>
2026-06-11 22:41:21 -04:00
rUv 427c56881b
Merge pull request #1023 from ruvnet/feat/v2-beyond-sota-sweep
Beyond-SOTA v2/crates sweep (ADR-154–158) + implement every stub for real (no AI-slop)
2026-06-11 22:27:59 -04:00
ruv 97fae198d1 docs(changelog): beyond-SOTA sweep ADR-154–158 + stub-implementation push
Co-Authored-By: claude-flow <ruv@ruv.net>
2026-06-11 22:16:05 -04:00
ruv 156323564a docs(readme): correct person-identification claims to measured reality (#1021)
An external audit correctly found the person-ID/Soul-Signature capability was
spec-only with a no-op oracle. The §3.6 matcher is now real (wifi-densepose-bfld)
but WiFi-only channels are MEASURED not-separable (cardiac+respiratory gap ~0.0005);
named identity is data-gated on enrollment with the decisive AETHER/body-resonance
channel. README now frames person re-id as experimental research, not a shipped feature.

Co-Authored-By: claude-flow <ruv@ruv.net>
2026-06-11 22:13:05 -04:00
ruv d79c22e03a fix(homecore-assist): exact in-memory cosine k-NN, drop fragile :memory: HNSW
The semantic recognizer built a ruvector-core VectorDB at ":memory:"; under
full-workspace feature unification the file-storage backend is enabled and
":memory:" is an invalid Windows filename (os error 123), panicking via
.expect(). Replace the external index with an exact in-memory cosine k-NN over
the enrolled exemplars (embeddings are L2-normalised, so cosine = dot product).
For HOMECORE's small intent vocabularies this is faster, fully deterministic,
and removes the storage backend + cross-crate feature coupling entirely.
ruvector-core dropped from the crate (only used here). Workspace 3122 passed/0 failed.

Co-Authored-By: claude-flow <ruv@ruv.net>
2026-06-11 22:13:04 -04:00
ruv 3d96789475 docs(adr): ADR-158 MAT/world-model beyond-SOTA sweep (graded, MEASURED)
Records the cluster sweep: §1 triage unification, §2 real RSSI + dedup, §3 real
ESP32/UDP/PCAP ingest with honest typed errors, §4 parabolic interpolation,
§5 real GDOP, §6 occworld-prior fail-safe (mat consumes none). Graded SOTA table
(RF-through-rubble DATA-GATED; worldgraph NO-ACTION already-SOTA; worldmodel
clamp-proven; pointcloud cited), confirmed negative results, deferred backlog
(nothing dropped), and reproduction commands.

Co-Authored-By: claude-flow <ruv@ruv.net>
2026-06-11 21:54:04 -04:00
ruv e1dc6e05ab feat(mat): wire real ESP32/UDP/PCAP CSI ingest; honest typed errors for gated adapters (ADR-158 §3)
hardware_adapter read_esp32_csi/read_udp_csi/read_pcap_csi returned 'not yet
implemented'. Wired them to the real CsiParser/PcapCsiReader that already live in
csi_receiver:
 - UDP: bind + recv + parse (auto-detect) -> CsiReadings. End-to-end test sends a
   real JSON datagram on the wire and parses it.
 - PCAP: load + read_next + parse. End-to-end test writes a real little-endian
   .pcap with one record and reads it back.
 - ESP32: parse CSI_DATA CSV via the real parser; live serial byte I/O behind an
   optional  feature (native serialport gated off the default/appliance
   build) — without it, live reads return a typed UnsupportedAdapter while the
   byte parser still works (tested).

Intel5300/Atheros/PicoScenes now return typed HardwareUnavailable/UnsupportedAdapter
(no device/driver/validatable-format here) instead of fake CSI — added
AdapterError::HardwareUnavailable and ::UnsupportedAdapter. Test asserts the gated
adapters error honestly.

Co-Authored-By: claude-flow <ruv@ruv.net>
2026-06-11 21:54:04 -04:00
ruv 982994ca3c fix(mat): real dimensionless GDOP = sqrt(trace((HtH)^-1)), not ad-hoc angle factor (ADR-158 §5)
estimate_gdop returned an average-pair-angle factor merely labelled GDOP (the same
class of defect ADR-156 §2.3 fixed). Replaced with the genuine Geometric Dilution
of Precision computed from the range-measurement Jacobian H (unit target->sensor
bearings): GDOP = sqrt(trace((HtH)^-1)), dimensionless, returning None for singular
(collinear) geometry which the caller treats as factor 1.0. Tests assert a
well-spread array yields lower GDOP than a near-collinear one, cross-check the
closed form, and confirm singular geometry returns None.

Co-Authored-By: claude-flow <ruv@ruv.net>
2026-06-11 21:54:04 -04:00
ruv c9a8ca758a feat(mat): real 3-point parabolic peak interpolation in find_dominant_frequency (ADR-158 §4)
The comment claimed interpolation but the function returned the bin center,
capping breathing-rate resolution at +/-half a bin. Implemented quadratic
(3-point parabolic) peak interpolation: delta = 0.5*(yL-yR)/(yL-2y0+yR), clamped
to [-0.5,0.5], with an edge fallback to bin center. For a parabola-shaped peak the
recovery is exact (delta=0.4 for a true peak at bin 10.4). Test asserts the result
lands within half a bin of truth and strictly beats the old bin-center estimate.

Co-Authored-By: claude-flow <ruv@ruv.net>
2026-06-11 21:54:04 -04:00
ruv 650e2b5c52 fix(mat): real RSSI localization + vitals-signature dedup, kill count inflation (ADR-158 §2)
simulate_rssi_measurements always returned vec![], so every survivor got
location: None, which disabled spatial dedup — one person re-detected across N
scan cycles became N survivors, fabricating a mass-casualty event. Two fixes:

1. Real RSSI source: SensorPosition gains an optional last_rssi (populated by the
   hardware layer from actual signal-strength readings). collect_rssi_measurements
   reads only real per-sensor RSSI and feeds the existing triangulator; it NEVER
   fabricates a value. <min_sensors real readings -> None location (honest).

2. Zone + vitals-signature dedup: when no usable location exists, record_detection
   matches an existing active, un-located survivor in the same zone whose latest
   vital signature (breathing presence + START rate band, heartbeat presence,
   movement class) is compatible — collapsing repeat detections of one person while
   keeping genuinely distinct survivors (different rate bands) separate.

Tests (fail on old code): 3x identical-vitals/None-location -> 1 survivor (was 3);
distinct vitals stay 2; real-RSSI path yields a position; no-RSSI path yields None.

Co-Authored-By: claude-flow <ruv@ruv.net>
2026-06-11 21:54:04 -04:00
ruv 78821f1657 fix(mat): unify divergent triage engines to single canonical source (ADR-158 §1)
The ensemble gate (EnsembleClassifier::determine_triage) and the survivor
record (Survivor::new -> TriageCalculator::calculate) used two different
START-protocol approximations with different rate bands and movement handling.
The pipeline gated on the ensemble triage then discarded it and recomputed via
TriageCalculator, so a survivor could be admitted as one priority and recorded
as another (e.g. 28 bpm + Tremor: gate said Delayed, record said Immediate).
In a mass-casualty tool that divergence is a life-safety defect.

determine_triage now delegates to TriageCalculator (the single source of truth),
retaining only the ensemble confidence gate (low confidence -> Unknown, except
Immediate which is never suppressed). Updated unit + integration tests to the
canonical expectations and added a divergent-boundary regression asserting
gate triage == survivor-record triage.

Co-Authored-By: claude-flow <ruv@ruv.net>
2026-06-11 21:54:03 -04:00
ruv 67dd539e68 bench(pointcloud): sweep points-per-cell density for splats bench
Realistic depth backprojection is dense (many points per 8 cm voxel). Sweep
points-per-cell {4,16,64,256} at n=50k instead of point-count, so the
measurement reflects where the 9-pass→2-pass reduction actually applies.
Parity guard (old≡new, bit-for-bit) holds at every density.

Co-Authored-By: claude-flow <ruv@ruv.net>
2026-06-11 21:47:19 -04:00
ruv 2754af804e feat(occworld): real conv encoder/decoder forward pass + honesty flag
Replace the `Tensor::randn` stubs in occworld-candle's VQVAE encoder
(`encode_occupancy`) and decoder (`decode_to_logits`) with a real,
deterministic, input-dependent convolutional forward pass. Previously
`predict()` emitted trajectory waypoints + confidence that were a function
of RANDOM NOISE, independent of the input and silently presented as model
output — the exact "AI slop" the project must eliminate.

occworld-candle:
- New `cnn.rs`: `Encoder2D` (3× Conv2d + GELU, interpolate2d to pin the
  token grid) and `Decoder2D` (upsample_nearest2d + Conv2d + 1×1 head).
  Both are deterministic functions of the input — same input → identical
  output; different input → different output. No randn in any forward path.
- Deterministic weight init (`det_fill`, seeded xorshift64*) across all
  `dummy()` constructors (encoder/decoder, VQ codebook, quant-convs,
  transformer), so untrained engines are bit-for-bit reproducible.
- `InferenceOutput.weights_trained: bool` — honest disclosure flag. `false`
  for `dummy()` (real but untrained net), `true` only after `load()` reads a
  real checkpoint. Priors are always from the real forward pass, never faked.
- VQ codebook + quant/post-quant convs kept and wired encoder→VQ→decoder.
- Centerpiece tests in `tests/predict_honesty.rs` (input-dependence,
  run-to-run + cross-engine determinism, untrained flag). All three FAIL on
  the old randn stub (verified by temporarily reinstating randn).

pointcloud:
- Optimize `to_gaussian_splats` hot path: 9 separate `.iter().sum()` passes
  per voxel → 2 fused accumulation passes. Bit-identical output.
- `benches/splats_bench.rs` (criterion) measures old 9-pass vs new 2-pass
  with a parity guard. ~1.3× faster on representative cloud sizes.
- Confirmed: no `randn`/placeholder in any claimed production path. The
  remaining synthetic generators (`send_test_frames`, `demo_depth_cloud`)
  and honestly-flagged heuristics (`heuristic_pose_from_amplitude`,
  luminance pseudo-depth fallback) are explicitly disclosed, not faked output.

DATA-GATED: a trained checkpoint. An untrained-but-real net is the honest
deliverable; accuracy is flagged via `weights_trained`, never claimed.

Tests: occworld 16 unit + 3 integration + 2 doc, pointcloud 18 — all pass
(CPU `Device::Cpu`; CUDA feature is GPU-gated and untouched).

Co-Authored-By: claude-flow <ruv@ruv.net>
2026-06-11 21:47:19 -04:00
ruv 7c80711454 feat(homecore-assist,homecore-recorder): replace stubs with real impls (ADR-132/133)
Implements the three placeholder paths with real, tested behaviour and an
honest typed result wherever a capability is genuinely data-gated.

homecore-assist:
- runner.rs: add LocalRunner — runs the real IntentRecognizer pipeline and
  returns a fully-formed RufloResponse (resolved intent + speech). NoopRunner
  is now honest: typed NotStarted before spawn, explicit empty after (never a
  silent fabricated response). A live ruflo-agent.js subprocess remains the
  data-gated future path.
- recognizer.rs / semantic_recognizer.rs: real SemanticIntentRecognizer — embeds
  the utterance (deterministic feature-hash embedding, new embedding.rs) and runs
  ruvector-core HNSW nearest-neighbour search over enrolled exemplars, accepting
  matches above a configurable cosine-similarity threshold (default 0.75) and
  falling back to regex below it. Measured: paraphrase "turn on the kitchen
  light" vs exemplar "turn on the light" -> sim 0.855 (match); "schedule a
  dentist appointment" -> sim 0.106 (no-match). `semantic` feature on by default.

homecore-recorder:
- db.rs: search_states_by_text — real SQL LIKE query over entity_id/state/attrs
  returning real rows (newest-first, k-capped, LIKE-escaped). search_semantic now
  falls back to it when the vector index yields no hits, so it is no longer
  always-empty under the default NullSemanticIndex.

Tests (real behaviour; each fails on the old always-empty stub, verified):
- homecore-assist: 39 passed / 0 failed
- homecore-recorder (P1, no features): 19 passed / 0 failed
- homecore-recorder (P2, --features ruvector): 25 passed / 0 failed
All files < 500 lines; homecore-server consumer still builds.

Co-Authored-By: claude-flow <ruv@ruv.net>
2026-06-11 21:40:20 -04:00
ruv a0e72eef50 feat(wifiscan,sensing): native wlanapi.dll FFI + real Matter manual code
wifiscan (Tier 2 wlanapi adapter ONLY):
- Real native wlanapi.dll BSS-list FFI (new adapter/wlanapi_native.rs):
  WlanOpenHandle -> WlanEnumInterfaces -> WlanGetNetworkBssList ->
  WlanFreeMemory/WlanCloseHandle via windows-sys 0.59 (already in lock
  tree). Per-BSSID RSSI(dBm)/channel/band/radio-type/SSID + CSI-capable
  filter. #[cfg(windows)] real path; #[cfg(not(windows))] returns typed
  WifiScanError::Unsupported (honest, never fabricated).
- wlanapi_scanner now native-first with documented netsh fallback,
  native_scans metric, scan_native()/scan_native_csi_capable(), and a
  benchmark() that MEASURES real Hz (no hardcoded "10x" claim).
- MEASURED 9.74 Hz native on ruvzen (30 iters, Native backend) vs netsh
  ~2 Hz baseline. Live measurement kept as an #[ignore] test.
- Cargo.toml: unsafe_code forbid->deny so only the audited wlan_ffi
  module opts into unsafe; all unsafe confined + null-checked + freed.

sensing-server (Matter commissioning):
- Replaced the lossy modulo placeholder in matter/commissioning.rs with
  the real Matter Core Spec 1.3 §5.1.4.1.1 field-packing. Canonical
  vector (20202021, 3840) now encodes to the published 34970112332.
- Added ManualPairingCode::decode + DecodedManualCode proving the code
  is real/lossless (passcode round-trips bit-for-bit; short
  discriminator = top 4 bits) with Verhoeff integrity, incl. proptest.

Tests: wifi-densepose-wifiscan 145 passed (real FFI exercised on
Windows); wifi-densepose-sensing-server 614 passed. 0 failed.

Co-Authored-By: claude-flow <ruv@ruv.net>
2026-06-11 21:39:42 -04:00
ruv b0ee2a4aaf docs(soul): mark §3.6 matching algorithm as implemented + data-gated
Update specification.md §3.6 ONLY with an honest implementation-status note:
the matching algorithm is now implemented and tested in
v2/crates/wifi-densepose-bfld/, weights remain unvalidated design intent, and
named-identity locking is data-gated (cardiac+respiratory alone are not
separable — measured gap ~0.0005). The broader Soul Signature system remains
Pre-Implementation.

Co-Authored-By: claude-flow <ruv@ruv.net>
2026-06-11 21:16:41 -04:00
ruv e2864bbd52 test(bfld): measured §3.6 separability + audit's cardiac-alone negative result
Deterministic synthetic-data tests producing reproducible, honestly-labeled
numbers (MEASURED-on-synthetic, explicitly NOT real-person identification):

- same_person_scores_higher_than_cross_person: self-match ≈1.0000,
  cross-person ≈0.8088 (full channels) — a real but modest ~0.19 margin.
- cardiac_alone_cannot_separate_identity_matches_audit (centerpiece): with the
  decisive channels (AETHER 0.35, subcarrier 0.20) absent, cardiac (0.15) +
  respiratory (0.10) alone give same=1.0000 cross=0.9995, gap=0.0005 — no
  threshold fits, so the matcher correctly refuses to lock identity. Proves the
  audit's claim 'your heartbeat alone overlaps too much' with real numbers.
- Graceful degradation, zero-norm/NaN safety, insufficient-channels typed
  result, empty-enrolled-set, threshold boundary, min-channels gate.

13 new tests; full crate suite 364 passed / 0 failed.

Co-Authored-By: claude-flow <ruv@ruv.net>
2026-06-11 21:16:20 -04:00
ruv b08e49e47c feat(bfld): implement §3.6 Soul Signature matcher + real SoulMatchOracle
First running implementation of the spec's §3.6 per-channel weighted-cosine
matcher (docs/research/soul/specification.md). Replaces reliance on NullOracle
(which always returns NotEnrolled) with a real EnrolledMatcher oracle.

- soul_channels.rs: 8-channel SoulChannels container (AETHER reuses
  IdentityEmbedding, preserving invariant I2 — no Clone/Serialize, zeroized on
  Drop), MatchWeights with the §3.6 default table (unvalidated design intent),
  heapless FeatureVector. no_std-compatible.
- soul_match.rs: match_score() implementing the exact formula
  Σ w·cos / Σ w·availability, with graceful degradation, zero-norm/NaN safety,
  and a typed 'insufficient channels' result (never a default-high score).
  EnrolledMatcher (std) satisfies the existing SoulMatchOracle trait, gated on
  a score threshold AND a minimum shared-channel count (so a single low-weight
  channel can never lock identity). NullOracle retained as the disabled default.

Named-identity locking remains data-gated: it requires real AETHER enrollment +
body-resonance data, which has not been provided.

Co-Authored-By: claude-flow <ruv@ruv.net>
2026-06-11 21:16:05 -04:00
ruv 66ebf798e5 docs(adr): ADR-157 Hardware/Sensing beyond-SOTA sweep — Milestone 3
Documents Milestone 3 across the four acquisition crates (vitals, hardware,
wifiscan, calibration). Honest headline: this layer was already well-hardened,
so the real work is small.

- §A1 (perf, MEASURED): Vec::remove(0) O(n^2) sliding windows -> VecDeque.
  End-to-end win is NULL within noise at realistic window sizes (DSP dominates);
  the win is the algorithmic O(n^2)->O(n) shown in isolation. Claimed nothing
  more -- the committed bench proves the null.
- §A2 (correctness): breathing partial-weights scale-mixing -> normalized by
  Sigma(effective weights). Pinned by two fail-on-old tests.
- §A3 (stability): IIR resonator divergence. Corrected the research report's
  physically-inaccurate trigger (divergence needs |r|>=1, i.e. bw>=4, not "r
  negative"); clamp + finite-guard. Pinned by two fail-on-old tests.
- §B1 hardening on an unreachable (already-gated) truncation path -- disclosed.
- §B4 (constant-time HMAC compare) DEFERRED: not worth a new direct `subtle`
  dependency for an 8-byte LAN sync-beacon tag.
- MEASURED negative-results section (the centerpiece): esp32_parser length gate,
  sync_packet infallible slices, the whole ieee80211bf validate-on-deserialize /
  no-panic-FSM / single-role / SBP-single-evaluate model, secure_tdm HMAC+replay,
  netsh_scanner fixed-argv + Option parse, geometry_embedding MAX_COORD_M -- each
  cited file:line, all NO-ACTION.
- SOTA landscape: deep-CSI vitals (DATA-GATED), 802.11bf conformance (CLAIMED,
  non-public suite), per-room calibration (CLAIMED on numbers), native wlanapi
  FFI multi-BSSID (CLAIMED-unmeasured -- explicitly NOT claiming the 10x). Mostly
  NO-ACTION / ACCEPTED-FUTURE.
- Deferred backlog (§8): nothing silently dropped.

Validation: cargo test --workspace --no-default-features = 3054 passed / 0
failed; python verify.py = VERDICT PASS (hash unchanged, Rust-only changes).

Co-Authored-By: claude-flow <ruv@ruv.net>
2026-06-11 21:00:59 -04:00
ruv 0b78eb6e03 fix(hardware): drop-instead-of-truncate subcarrier count in 802.11bf bridge (ADR-157 §B1)
OpportunisticCsiBridge::ingest built CsiReportPayload.n_subcarriers via
`self.amp_accum.len() as u16`, which would silently wrap a count above 65_535.
Replace with `u16::try_from(...).ok()?` (drop-instead-of-truncate). Disclosed
honestly as defense-in-depth on an UNREACHABLE path: ingest already gates
subcarrier_count > MAX_REPORT_SUBCARRIERS (484) at entry and report.validate()
rejects oversized counts downstream, so the cast can never wrap in practice.
Correct-by-construction rather than gate-dependent; no behavior change, no new
test (the gate prevents the input that would exercise it).

Co-Authored-By: claude-flow <ruv@ruv.net>
2026-06-11 21:00:32 -04:00
ruv 8fb6ef6547 fix(vitals): renormalize partial-weight fusion + clamp IIR resonator (ADR-157 §A2/§A3)
§A2 (correctness): BreathingExtractor weighted fusion was an un-normalized sum.
When `weights` was supplied shorter than n, supplied entries were used raw while
the missing tail defaulted to uniform 1/n -- two scales summed with no
renormalization, silently mis-scaling the breathing signal by a factor of
weights.len(). Extract to fuse_weighted_residuals() and normalize by
Sigma(effective weights), mirroring heartrate::compute_phase_coherence_signal.
Tests: partial_weights_are_renormalized_not_scale_mixed,
partial_weights_fusion_is_weighted_average (both fail on old code).

§A3 (stability): the IIR resonator pole radius r = 1 - bw/2 diverges when the
pole MAGNITUDE |r| >= 1 (i.e. bw >= 4: a very low fs relative to band width) --
NOT merely when r is negative, as the research report stated (a negative r with
|r| < 1 is still stable; the comments/tests are corrected accordingly). On
divergence the filter overflows to +/-inf within ~600 frames, NaN-poisons acf0,
and the extractor stalls permanently. Clamp r to [0, 0.9999] AND finite-guard
the filter output before the history push (defense-in-depth, mirrors ADR-154 §3).
Applied to both heartrate.rs and breathing.rs. Tests:
{heartrate,breathing}::low_sample_rate_filter_stays_finite (fs=0.5, 0.1-0.9 Hz
band, 600-frame unit step -> all-finite; both panic on old code).

These files also carry the §A1 VecDeque window conversion (bit-identical).

Co-Authored-By: claude-flow <ruv@ruv.net>
2026-06-11 21:00:19 -04:00
ruv a7f7adfabc perf(vitals,wifiscan): O(1) VecDeque sliding windows + vitals bench (ADR-157 §A1/§D1)
Replace Vec::remove(0) (O(n) per-sample buffer shift -> O(n^2) full-window
sweep) with VecDeque push_back/pop_front (O(1) eviction) in the fixed-length
sliding/ring buffers of the vital-sign and wifiscan extractors. Where the
autocorrelation / zero-crossing / Pearson loop needs a contiguous slice,
make_contiguous() is called once per extract(), matching the idiom already used
in wifiscan/pipeline/orchestrator.rs. Output is bit-identical.

Sites: anomaly.rs (rr/hr history), store.rs (readings ring; history() now takes
&mut self to hand back a contiguous slice, no external callers), wifiscan
breathing_extractor.rs (filtered history), wifiscan correlator.rs (per-BSSID
histories -> Vec<VecDeque<f32>>). (heartrate.rs/breathing.rs windows land with
the §A2/§A3 fixes in a separate commit.)

New criterion bench crates/wifi-densepose-vitals/benches/vitals_bench.rs drives
each extractor over a full-window fill. Honest MEASURED result: end-to-end win
is NULL within noise at realistic ESP32 window sizes (1500-3000) because the
per-frame DSP dominates the eviction (heartrate 42.8ms->44.4ms, breathing
7.95ms->7.86ms, overlapping CIs). In isolation the eviction collapses O(n^2)
-> O(n) (34.6x at window=3000, 3158x at window=100000); A1 lands as the correct
data structure removing a latent O(n^2), NOT a claimed hot-path speedup.

Reproduce: cargo bench -p wifi-densepose-vitals --bench vitals_bench

Co-Authored-By: claude-flow <ruv@ruv.net>
2026-06-11 20:59:57 -04:00
ruv 0ce2ac6440 docs(adr): ADR-156 RuVector/Fusion beyond-SOTA sweep — Milestone 2
Documents Milestone 2 of the beyond-SOTA sweep on the cross-viewpoint fusion
path: four correctness/integrity/security fixes (each pinned by a bug-catching
test), one MEASURED hot-path perf win, and the ANN/fusion SOTA landscape graded
MEASURED/CLAIMED/data-gated.

- Integrity: honest dimensionless GDOP (was RMSE mislabelled); canonical wrapped
  angular distance (disclosed numeric no-op under cos kernel — landed for
  contract/single-source-of-truth, not claimed as a behaviour change).
- Security: crafted-index/zero-bin DoS panics closed on the multistatic path.
- Perf: fuse() double-clone eliminated, ~2.17x on marshalling (MEASURED).
- SOTA landscape: SymphonyQG (#1, CLAIMED — reproduction deferred) +
  multi-bit/Extended RaBitQ (#2, accepted near-term, the sketch.rs Pass-2);
  GraphPose-Fi learned fusion head documented ACCEPTED-FUTURE, data-gated per
  ADR-152 (b); CRB/sensor-placement investigated, no action (already SOTA).
- Deferred backlog (§8): nothing silently dropped.

Validation: cargo test --workspace --no-default-features = 3050 passed / 0
failed; python verify.py = VERDICT PASS.

Co-Authored-By: claude-flow <ruv@ruv.net>
2026-06-11 20:23:43 -04:00
ruv a92b043143 perf(ruvector): eliminate fuse() double-clone (~2.17x marshalling) + bench (ADR-156 §2.4, §4)
MultistaticArray::fuse / fuse_ungated cloned every viewpoint embedding twice per
fusion (once into `extracted`, again when building the attention input). Now the
embeddings are MOVED out of `extracted` (one clone per viewpoint instead of two),
capturing geometry/ids by Copy in the same pass. Correctness-neutral — all 100
viewpoint/mat lib tests pass unchanged.

MEASURED (new benches/fusion_bench.rs, embedding_extract A/B, 8 vp x 128-d):
  before_double_clone 1.0029 us -> after_single_clone 461.6 ns  (~2.17x)
End-to-end fusion_pipeline (8 vp): 202 us — marshalling is <1% of fusion
(n*n attention dominates), so end-to-end win is modest; the A/B isolates the
clone elimination. Reproduce:
  cargo bench -p wifi-densepose-ruvector --bench fusion_bench

Co-Authored-By: claude-flow <ruv@ruv.net>
2026-06-11 20:23:27 -04:00
ruv a2daa2e443 fix(ruvector): crafted-input DoS — no panic on out-of-range indices (ADR-156 §2.2)
Security fix: two functions on a fusion/localisation path that can carry
network-sourced multistatic frames panicked on crafted input (remote DoS).

- triangulation::solve_triangulation indexed ap_positions[0] (empty table) and
  ap_positions[i]/[j] (crafted out-of-range AP index in a TDoA tuple). Now uses
  .first()? / .get(i)? / .get(j)? — returns None, never panics.
- heartbeat::band_power computed n_freq_bins-1 (usize underflow on a zero-bin
  spectrogram) and did not clamp low_bin. Now guards n_freq_bins==0 and clamps
  both bounds into [0,last]; returns 0.0 for empty/inverted ranges.

Tests (each panics on old code, verified by revert):
triangulation_out_of_range_index_returns_none_no_panic,
triangulation_empty_ap_positions_returns_none_no_panic,
heartbeat_band_power_zero_bins_no_panic,
heartbeat_band_power_out_of_range_bounds_no_panic.

Co-Authored-By: claude-flow <ruv@ruv.net>
2026-06-11 20:23:12 -04:00
ruv 5b3e337c6d fix(ruvector): honest GDOP + canonical wrapped angular distance (ADR-156 §2.1, §2.3)
Two correctness/integrity fixes on the cross-viewpoint fusion geometry path,
each pinned by a regression test that fails on the old code.

- GDOP mislabel (§2.3): CramerRaoBound.gdop was `sqrt(crb_x+crb_y)` — identical
  to rmse_lower_bound (metres, noise-dependent), NOT a dimensionless GDOP. Now
  computes true GDOP = sqrt(trace(G^-1)) on the unit-variance bearing geometry,
  in both estimate() and estimate_regularised(); INFINITY (not NaN) for
  degenerate collinear geometry. Test gdop_is_dimensionless_and_noise_independent
  asserts GDOP is unchanged under 10x noise while RMSE scales 10x (old code
  failed: it scaled with noise, proving it was RMSE).

- Angular wrap (§2.1): GeometricBias::build_matrix used raw |delta-azimuth|
  (can exceed pi, mis-states the 0/2pi seam) instead of the wrapped distance.
  angular_distance made pub and reused as the single canonical helper. HONEST:
  under the current cos() kernel this is a NUMERIC NO-OP (cos is even/periodic,
  cos(raw)==cos(wrapped)); landed for contract correctness + single-source-of-
  truth + future non-even kernels, not as a behaviour change. Tests pin the
  contract (wrapped value in [0,pi], seam symmetry).

ruvector lib tests: 100 passed / 0 failed (+ new tests).

Co-Authored-By: claude-flow <ruv@ruv.net>
2026-06-11 20:22:59 -04:00
ruv ea5ead7fb7 docs(adr): ADR-155 NN/training beyond-SOTA sweep — Milestone 1
Records the integrity-critical fixes (unified canonical metric, leak-free
subject-disjoint split + synthetic-val disclosure, rapid_adapt real gradients,
proof margin + committed-hash rigor), the Tier-2 correctness/security fixes, the
measured Tier-3 perf win, the NN SOTA landscape graded MEASURED/CLAIMED/
THEORETICAL (GraphPose-Fi as top ACCEPTED-future candidate; INT4; CSI-JEPA-vs-MAE
with the honest "no JEPA/MAE-on-WiFi-pose yet" caveat; "Mamba-CSI-pose does not
exist"), and the ~45-finding deferred backlog. Discloses the libtorch/tch-gating
limitation and that the Rust proof is honestly in SKIP until a baseline is
committed.

Co-Authored-By: claude-flow <ruv@ruv.net>
2026-06-11 19:57:54 -04:00
ruv 5cacb5fe0a perf(nn): zero-copy ORT input (~1.48x) + dynamic-dim guard + concurrency bench (ADR-155 §Tier-3)
- onnx.rs ORT input: arr.as_slice() single-memcpy fast path with iterator
  fallback for strided views. MEASURED [1,256,64,64]: 1.972ms -> 1.336ms
  (~1.48x). Repro: cargo bench -p wifi-densepose-nn --no-default-features
  --features onnx --bench onnx_bench -- onnx_input_copy
- onnx.rs checked_output_dims: reject ONNX dim <= 0 (incl. unresolved -1) before
  allocation (config-OOM class) + test.
- onnx_concurrency bench: empirically proves the per-inference write lock
  serializes (throughput drops with more threads). The intended read-lock win is
  NOT landable on ort 2.0.0-rc.11 (safe Session::run is &mut self, verified) and
  is deferred to the backlog with the upgrade path documented in-code.

New committed fixture tests/fixtures/tiny_conv.onnx (666 B, not gitignored).

Co-Authored-By: claude-flow <ruv@ruv.net>
2026-06-11 19:57:53 -04:00
ruv aa3a6725a6 fix(train,nn): Tier-2 correctness/security — metric scale, OOM bounds, panics (ADR-155 §Tier-2)
Each fix ships a test that would have caught the bug:
- ruview_metrics OKS: derive scale from GT extent (no s=1.0 fake-Gold), reject
  s<=0, bound the loop to array extents (no panic on short/adversarial input).
- config.validate(): UPPER bounds on window_frames/subcarriers/backbone_channels/
  heatmap_size/keypoints/body_parts/batch_size + reject negative gpu_device_id
  (closes the config-OOM class); defaults+presets still validate.
- subcarrier.rs: graceful fallback instead of panic on non-contiguous input.
- ablation.rs latency_percentiles: total_cmp + NaN guard (no partial_cmp unwrap).
- tensor.rs softmax(axis): normalize per-lane along the given axis (was whole-
  tensor), out-of-range axis -> NnError; fixes densepose per-pixel probs.
- translator.rs apply_attention: real scaled-dot-product attention (was a
  uniform 1/seq_len stub that made any "with attention" ablation == without);
  mis-shaped checkpoint projections rejected.

Co-Authored-By: claude-flow <ruv@ruv.net>
2026-06-11 19:57:32 -04:00
ruv 84e2c920fd fix(train): proof margin + committed-hash requirement (ADR-155 §Tier-1.4)
The deterministic proof self-certified: PASS on any loss decrease (incl. 1e-9
noise) and a missing expected hash defaulted to PASS.

- MIN_LOSS_DECREASE=1e-4: a run counts as learning only above float noise; a
  noise-only pipeline now FAILS.
- is_pass() requires hash_matches==Some(true); no-hash -> SKIP (exit 2), never
  PASS. verify-training fails fast on a sub-margin loss before the hash compare,
  so a missing baseline cannot mask a non-learning pipeline.

Documented honestly: the proof certifies reproducibility/determinism on a
synthetic dataset, NOT that real data produced the weights nor that any accuracy
claim is met. Tests: no_committed_hash_is_skip_not_pass,
submargin_loss_change_fails_even_without_hash,
committed_matching_hash_with_real_decrease_passes.

Co-Authored-By: claude-flow <ruv@ruv.net>
2026-06-11 19:57:16 -04:00
ruv 7fb3e33557 fix(train): rapid_adapt real finite-difference gradients, not a fake step (ADR-155 §Tier-1.3)
contrastive_step/entropy_step wrote a fake gradient (grad += v*0.01) unrelated
to the stated objective, so any "TTA improves the metric" was unsupported. The
*_loss functions are now pure evaluators of the real objective; adapt() descends
them with a central finite-difference gradient of that exact loss, so "the
adaptation loss decreases" is now a real, reproducible measurement.

Honest scope caveat (documented): this minimizes a self-supervised proxy over a
LoRA bottleneck on raw CSI; it is NOT wired to the pose model and there is NO
measured end-to-end PCK gain on WiFi pose from this path.

Tests: contrastive_loss_decreases, entropy_loss_decreases (real gradient steps
don't increase the loss), reported_loss_is_the_real_objective_not_a_placeholder.

Co-Authored-By: claude-flow <ruv@ruv.net>
2026-06-11 19:57:15 -04:00
ruv 2a2a2c5b06 fix(train): leak-free subject-disjoint split + synthetic-val disclosure (ADR-155 §Tier-1.2)
MM-Fi windows are stride-1 (~99% overlap), so an index-level split leaks; and
bin/train.rs validated real training against a SYNTHETIC val set, making any
printed PCK meaningless on two counts.

- MmFiDataset::subject_disjoint_split partitions whole subjects -> the two views
  share no subject and no window (leak-free by construction, deterministic per
  seed). assert_split_leak_free verifies subject- AND window-disjointness and is
  called inside the split so a leaky split is never handed out.
- bin/train.rs now prefers the real split; the synthetic path is a labelled
  run_smoke_test ("[SMOKE-TEST] DO NOT REPORT") reachable only as a fallback.
- New DatasetError::InvalidSplit.

Tests prove disjointness, determinism, single-subject/bad-fraction rejection,
and that the validator catches an injected subject leak.

Co-Authored-By: claude-flow <ruv@ruv.net>
2026-06-11 19:56:57 -04:00
ruv 50b657459f fix(train): unify 7 divergent PCK/OKS into one canonical metric (ADR-155 §Tier-1.1)
Collapse the four PCK and three OKS implementations into a single source of
truth — pck_canonical (torso hip↔hip, COCO/ADR-152 convention validated at
~96% PCK@20 in benchmarks/wiflow-std) and oks_canonical (scale from GT pose
extent). MetricsAccumulator, compute_pck/_per_joint/_oks, aggregate_metrics and
the deprecated *_v2 path all route through them, so Trainer::evaluate() and the
bench definition agree.

Fixes two claim-inflating bugs, each pinned by a regression test:
- zero-visible-joint PCK was 1.0 (false-perfect) -> now 0.0
- OKS s=1.0 on normalized coords made OKS~=1.0 for any pose ("fake Gold tier")
  -> scale now derived from the pose; a 3x-torso-wrong pose yields OKS<0.2

Divergent local kernels (training_bench raw-threshold, sensing-server
torso-height) annotated "DO NOT USE for reported metrics". Legitimately changed
test expectations (all-coincident "perfect" fixtures are correctly unscoreable;
all-invisible -> 0.0) updated with comments citing the finding.

Co-Authored-By: claude-flow <ruv@ruv.net>
2026-06-11 19:56:44 -04:00