# ADR-158: MAT / World-Model Cluster — Beyond-SOTA Sweep, Anti-"AI-Slop" Hardening

- **Status**: accepted
- **Date**: 2026-06-11
- **Deciders**: ruv
- **Tags**: mat, life-safety, localization, triage, worldmodel, worldgraph, geo, engine, prove-everything

## Context

This ADR records the beyond-SOTA sweep over the MAT / world-model cluster
(`wifi-densepose-mat`, `-worldmodel`, `-worldgraph`, `-geo`, `-engine`), executed
under the project's **prove-everything / anti-"AI-slop"** directive: every stub is
either implemented with real logic or replaced by an honest typed error; no
output is fake, always-empty, or random; tests pass on real behaviour; and results
are graded **MEASURED** (reproduced here with the command recorded), **CLAIMED**,
**DATA-GATED** (real code path present, needs hardware/data we lack), or
**NO-ACTION** (already-SOTA, cited as a positive).

The Mass Casualty Assessment Tool touches life-safety. A triage metric that is
disconnected from the decision it gates, or a survivor count that inflates, is the
worst class of slop: it produces confident, wrong rescue prioritisation. An audit
against live code found six concrete defects, four of which were silent
correctness bugs (not missing features) in the triage → gate → record path and in
the localization/dedup path.

Grading vocabulary follows ADR-152 (F-evidence grades) and the sweep convention:

- **MEASURED**: reproduced in this worktree, command recorded below.
- **DATA-GATED**: real code path implemented; returns a typed error / honest
  provenance flag where hardware or labelled data is genuinely absent.
- **NO-ACTION (already-SOTA)**: audited, found correct, cited as a positive.
- **ACCEPTED-FUTURE**: deliberately deferred, nothing dropped.

## Graded SOTA Landscape

| Capability | Grade | Note |
|------------|-------|------|
| RF-through-rubble survivor detection | **DATA-GATED** | Real detection + triage + localization code paths run end-to-end on real CSI bytes; field detection *accuracy* is unproven without instrumented rubble trials and is **not fabricated** here. |
| OccWorld occupancy architecture (`-worldmodel`) | **NO-ACTION (current)** | `occupancy.rs` voxel mapping is clamp-proven bounds-safe; converts WorldGraph person positions to a 200×200×16 grid with no out-of-bounds path. |
| WorldGraph provenance / privacy / pruning (`-worldgraph`) | **NO-ACTION (already-SOTA)** | `graph.rs` implements append-with-provenance (`DerivedFrom`), deterministic LRU pruning, and a privacy rollup (`PrivacyLimitedBy`). Cited as a positive; no changes needed. |
| Point-cloud parser bounds-safety (`-pointcloud`) | **NO-ACTION (already-SOTA)** | Another agent's crate, cited only; its parser is bounds-checked. Out of scope for this ADR's edits. |
| Learned multi-person counter | **DATA-GATED** | Deferred; requires labelled multi-occupant CSI. The zone+vitals-signature dedup (below) is the honest non-learned stand-in. |
| RF point-cloud generation | **ACCEPTED-FUTURE** | Not dropped; tracked as future work. |

## Decision — Fixes Landed (MEASURED)
### §1 Unify the two divergent triage engines (CRITICAL)

**Was:** `EnsembleClassifier::determine_triage` (ensemble gate) and
`TriageCalculator::calculate` (survivor record) were two different START-protocol
approximations with different rate bands and movement handling. The pipeline
gated on the ensemble's confidence (`lib.rs:489`), discarded the ensemble triage
(`lib.rs:524`, `_ensemble`), and recomputed via `TriageCalculator` in
`Survivor::new` (`survivor.rs:194`). A survivor could be admitted at one priority
and recorded at another.

**Now:** `determine_triage` delegates to `TriageCalculator`, the **single source
of truth** used by both the gate and the survivor record. The only
ensemble-specific behaviour retained is the confidence gate: low confidence →
`Unknown`, except `Immediate`, which is never suppressed because a missed survivor
in distress is costlier than a false positive. Rate bands follow START
(<10 or >30 bpm → Immediate).

**Failing-on-old test:** `detection::ensemble::tests::test_divergent_boundary_28bpm_tremor_gate_equals_survivor`
exercises 28 bpm Normal + Tremor. Old gate → Delayed, old survivor record →
Immediate (divergent). Unified result: gate == survivor == **Immediate**.
Companion tests (`test_no_vitals_is_unknown_canonical`,
`test_normal_breathing_no_movement_is_immediate_canonical`, and the updated
`integration_adr001::test_ensemble_classifier_triage_logic`) assert
gate-vs-record equality on every boundary.

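
The delegation pattern described in §1 can be sketched as follows. This is a minimal illustration, not the crate's code: the `Triage`, `Movement`, and `Vitals` types, the free-function form of `calculate`, and the `min_confidence` parameter are all assumptions, and the in-band movement rule is a placeholder chosen only to agree with the test names quoted above. Only the <10 / >30 band and the "never suppress `Immediate`" rule come from this ADR.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Triage {
    Immediate,
    Delayed,
    Unknown,
}

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Movement {
    None,
    Tremor,
    Gross,
}

#[derive(Debug, Clone, Copy)]
pub struct Vitals {
    /// Breathing rate in breaths per minute; `None` when no vitals were detected.
    pub breathing_bpm: Option<f32>,
    pub movement: Movement,
}

/// Hypothetical stand-in for `TriageCalculator::calculate`: the single
/// source of truth for the priority.
pub fn calculate(v: &Vitals) -> Triage {
    match v.breathing_bpm {
        None => Triage::Unknown,
        // START rate band from the ADR: <10 or >30 is Immediate.
        Some(rate) if rate < 10.0 || rate > 30.0 => Triage::Immediate,
        // Placeholder in-band rule (assumption, not the real calculator).
        Some(_) => match v.movement {
            Movement::Gross => Triage::Delayed,
            Movement::None | Movement::Tremor => Triage::Immediate,
        },
    }
}

/// Ensemble gate: delegates to `calculate` and adds only the confidence
/// gate. `Immediate` is never downgraded to `Unknown`.
pub fn determine_triage(v: &Vitals, confidence: f32, min_confidence: f32) -> Triage {
    let triage = calculate(v);
    if confidence < min_confidence && triage != Triage::Immediate {
        Triage::Unknown
    } else {
        triage
    }
}
```

Because the gate calls the same function the survivor record uses, the two can only differ through the explicit confidence check, which is visible in one place.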
### §2 Real RSSI/ToA localization + kill count-inflation (HIGH)

**Was:** `fusion.rs:79 simulate_rssi_measurements` always returned `vec![]`, so
every survivor got `location: None`, which in turn disabled spatial dedup
(`disaster_event.rs:285`, which only fired on `Some` location). One trapped person
re-detected across N scan cycles became **N survivors**: a fabricated
mass-casualty count.

**Now, two real mechanisms:**

1. **Real RSSI source:** `SensorPosition` gains an optional `last_rssi`
   (populated by the hardware layer from actual signal-strength readings).
   `collect_rssi_measurements` reads only real per-sensor RSSI and feeds the
   existing triangulator; it **never fabricates** a value. With `< min_sensors`
   real readings, `estimate_position` returns `None` (honest).
2. **Zone + vitals-signature dedup:** when no usable location exists,
   `record_detection` matches an existing *active, un-located* survivor in the
   same zone whose latest vital signature (breathing presence + START rate band,
   heartbeat presence, movement class) is compatible. This collapses repeat
   detections of one person while keeping genuinely distinct survivors separate.

**MEASURED:** `test_identical_vitals_no_location_dedup_to_one` feeds 3×
identical-vitals / `None`-location detections and gets **1 survivor** (old code: 3).
`test_distinct_vitals_no_location_stay_separate` keeps two distinct survivors at 2
(no under-count). `test_estimate_position_uses_real_rssi` yields a position from
3 real-RSSI sensors; `test_estimate_position_none_without_real_rssi` yields `None`
(no fabrication).

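
A rough sketch of the zone + vitals-signature dedup from §2 is shown here. The struct layouts, the `rate_band` helper, and the `record_detection` signature are hypothetical, and "compatible" is approximated as exact signature equality, which may be stricter than what the crate actually does.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum RateBand {
    Low,
    Normal,
    High,
}

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum MovementClass {
    None,
    Tremor,
    Gross,
}

/// Coarse vital signature: breathing presence + rate band, heartbeat
/// presence, and movement class.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct VitalSignature {
    pub breathing: Option<RateBand>,
    pub heartbeat_present: bool,
    pub movement: MovementClass,
}

#[derive(Debug, Clone)]
pub struct Survivor {
    pub zone_id: u32,
    pub active: bool,
    pub location: Option<(f64, f64)>,
    pub signature: VitalSignature,
}

/// Maps a breathing rate onto the START band used in this ADR (<10 / >30).
pub fn rate_band(breaths_per_min: f32) -> RateBand {
    if breaths_per_min < 10.0 {
        RateBand::Low
    } else if breaths_per_min > 30.0 {
        RateBand::High
    } else {
        RateBand::Normal
    }
}

/// Records an un-located detection. Returns the index of the survivor it
/// was attributed to; a new survivor is created only when no active,
/// un-located survivor in the same zone has a matching signature.
pub fn record_detection(
    survivors: &mut Vec<Survivor>,
    zone_id: u32,
    signature: VitalSignature,
) -> usize {
    let existing = survivors.iter().position(|s| {
        s.active && s.location.is_none() && s.zone_id == zone_id && s.signature == signature
    });
    match existing {
        Some(index) => index,
        None => {
            survivors.push(Survivor { zone_id, active: true, location: None, signature });
            survivors.len() - 1
        }
    }
}
```

Matching on a coarse signature rather than raw rates is what lets small scan-to-scan jitter collapse into one survivor while a different band, heartbeat state, or movement class still produces a separate record.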
### §3 Real ESP32/UDP/PCAP CSI ingest; honest typed errors elsewhere (HIGH)

**Was:** `hardware_adapter.rs read_esp32_csi` / `read_udp_csi` / `read_pcap_csi`
returned "not yet implemented", even though `csi_receiver.rs` already contained a
working `CsiParser` (ESP32 CSV, JSON, Intel5300/Atheros/Nexmon byte decoders) and a
real `PcapCsiReader`.

**Now:**

- **UDP**: binds, receives one datagram, parses (auto-detect) → `CsiReadings`.
  End-to-end test sends a real JSON datagram on the wire.
- **PCAP**: `load` + `read_next` + parse. End-to-end test writes a real
  little-endian `.pcap` with one record and reads it back.
- **ESP32**: parses `CSI_DATA` CSV via the real parser. Live serial byte I/O is
  behind an optional `serial` cargo feature (native `serialport` kept off the
  default / aarch64 appliance build); with the feature off, live reads return a
  typed `UnsupportedAdapter` while the byte parser still works.
- **Intel 5300 / Atheros / PicoScenes**: return typed
  `AdapterError::HardwareUnavailable` / `UnsupportedAdapter` (no device, no
  driver, or no validatable format here). **Never fake CSI.** New error variants
  were added so the gating is typed rather than a `String` "Hardware" soup.

**MEASURED:** `test_esp32_bytes_parse_end_to_end`, `test_udp_read_end_to_end`,
`test_pcap_read_end_to_end`, `test_intel_and_atheros_are_honestly_unavailable`.

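
To illustrate the "real data or typed error" contract from §3, here is a hedged sketch. The variant names `HardwareUnavailable` and `UnsupportedAdapter` appear in this ADR; everything else is assumed: the `Parse` variant, the `AdapterKind` enum, the `read_csi` function, the shape of `CsiReadings`, and which unavailable adapter maps to which variant. The CSV layout is a toy `CSI_DATA,<v1>,<v2>,...` line, not the real ESP32 record format.

```rust
use std::fmt;

#[derive(Debug, Clone, PartialEq, Eq)]
pub enum AdapterError {
    /// No device or driver is present on this host.
    HardwareUnavailable(String),
    /// The adapter or format cannot be handled in this build.
    UnsupportedAdapter(String),
    /// Bytes were supplied but could not be decoded (hypothetical variant).
    Parse(String),
}

impl fmt::Display for AdapterError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            AdapterError::HardwareUnavailable(why) => write!(f, "hardware unavailable: {why}"),
            AdapterError::UnsupportedAdapter(why) => write!(f, "unsupported adapter: {why}"),
            AdapterError::Parse(why) => write!(f, "parse error: {why}"),
        }
    }
}

impl std::error::Error for AdapterError {}

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum AdapterKind {
    Esp32,
    Intel5300,
    Atheros,
    PicoScenes,
}

#[derive(Debug, Clone, PartialEq)]
pub struct CsiReadings {
    pub amplitudes: Vec<f32>,
}

/// Parses the toy `CSI_DATA,<v1>,<v2>,...` line (illustrative format only).
fn parse_toy_csv(line: &str) -> Result<CsiReadings, AdapterError> {
    let rest = line
        .trim()
        .strip_prefix("CSI_DATA,")
        .ok_or_else(|| AdapterError::Parse("missing CSI_DATA prefix".into()))?;
    let amplitudes = rest
        .split(',')
        .map(|field| {
            field
                .trim()
                .parse::<f32>()
                .map_err(|e| AdapterError::Parse(format!("bad field {field:?}: {e}")))
        })
        .collect::<Result<Vec<_>, _>>()?;
    Ok(CsiReadings { amplitudes })
}

/// Either decodes real bytes or returns a typed reason; never invents CSI.
pub fn read_csi(kind: AdapterKind, bytes: &[u8]) -> Result<CsiReadings, AdapterError> {
    match kind {
        AdapterKind::Esp32 => {
            let text = std::str::from_utf8(bytes).map_err(|e| AdapterError::Parse(e.to_string()))?;
            parse_toy_csv(text)
        }
        AdapterKind::Intel5300 | AdapterKind::Atheros => Err(AdapterError::HardwareUnavailable(
            "no patched driver on this host".into(),
        )),
        AdapterKind::PicoScenes => Err(AdapterError::UnsupportedAdapter(
            "container format cannot be validated here".into(),
        )),
    }
}
```

Callers can then `match` on the variant to decide whether to retry, skip the sensor, or surface the reason to an operator, which a bare `String` error would not allow.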
### §4 Real parabolic peak interpolation in `find_dominant_frequency` (MED)

**Was:** the comment at `breathing.rs:243` claimed interpolation, but the function
returned the bin center, capping breathing-rate resolution at ±half a bin.

**Now:** 3-point parabolic (quadratic) peak interpolation,
`δ = 0.5·(yL − yR)/(yL − 2y0 + yR)`, where `y0` is the peak-bin magnitude and
`yL` / `yR` are its left and right neighbours. `δ` is clamped to `[-0.5, 0.5]`,
with an edge fallback to bin center.

**MEASURED:** `test_find_dominant_frequency_parabolic_interpolation` checks that
for a parabola-shaped peak at true bin 10.4 the recovery is exact (δ = 0.4); the
test asserts the result lands within half a bin of truth and strictly beats the
old bin-center estimate.

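
The interpolation formula in §4 can be written out as a short sketch. The function signatures here are assumptions (the real `find_dominant_frequency` may take different arguments); `bin_hz` is a hypothetical parameter giving the frequency width of one bin.

```rust
/// Refines the integer peak bin `k` with 3-point parabolic interpolation:
/// delta = 0.5 * (yL - yR) / (yL - 2*y0 + yR), clamped to [-0.5, 0.5].
/// Falls back to the bin center at the spectrum edges or on a flat peak.
pub fn interpolate_peak(magnitudes: &[f64], k: usize) -> f64 {
    if k == 0 || k + 1 >= magnitudes.len() {
        return k as f64; // edge fallback: no neighbour on one side
    }
    let (y_left, y_peak, y_right) = (magnitudes[k - 1], magnitudes[k], magnitudes[k + 1]);
    let denom = y_left - 2.0 * y_peak + y_right;
    if denom.abs() < f64::EPSILON {
        return k as f64; // flat top: curvature is zero, nothing to fit
    }
    let delta = (0.5 * (y_left - y_right) / denom).clamp(-0.5, 0.5);
    k as f64 + delta
}

/// Returns the interpolated dominant frequency in Hz, or `None` for an
/// empty spectrum.
pub fn find_dominant_frequency(magnitudes: &[f64], bin_hz: f64) -> Option<f64> {
    let (peak_bin, _) = magnitudes
        .iter()
        .enumerate()
        .max_by(|a, b| a.1.total_cmp(b.1))?;
    Some(interpolate_peak(magnitudes, peak_bin) * bin_hz)
}
```

For magnitudes sampled from `100 − (i − 10.4)²`, the three points around bin 10 give `yL − yR = −1.6` and `yL − 2y0 + yR = −2.0`, so `δ = 0.5·(−1.6)/(−2.0) = 0.4`, matching the exact recovery the test describes.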
### §5 GDOP honesty (LOW)

**Was:** `triangulation.rs:248 estimate_gdop` returned an ad-hoc average-pair-angle
factor *labelled* GDOP (the same defect class ADR-156 §2.3 fixed elsewhere).

**Now:** real, dimensionless **GDOP = √(trace((HᵀH)⁻¹))** from the range-measurement
Jacobian `H` (unit target→sensor bearings), returning `None` for singular
(collinear) geometry, which the caller treats as factor 1.0 (no fabrication).

**MEASURED:** `test_gdop_is_real_dilution` shows a well-spread array gives a lower
GDOP than a near-collinear one, cross-checked against the closed form;
`test_gdop_singular_collinear_is_none` confirms singular geometry returns `None`.

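
As a worked illustration of the §5 formula, the sketch below computes √(trace((HᵀH)⁻¹)) in two dimensions, where `HᵀH` is a 2×2 matrix that can be inverted in closed form. The 2D restriction, the function name `gdop_2d`, and the singularity thresholds are assumptions made for brevity; the crate's `estimate_gdop` may differ.

```rust
/// Computes sqrt(trace((H^T H)^-1)) for range measurements in 2D.
/// Each row of H is the unit vector from the target to one sensor.
/// Returns `None` when the geometry is singular (for example, all sensors
/// collinear with the target) or a sensor coincides with the target.
pub fn gdop_2d(target: (f64, f64), sensors: &[(f64, f64)]) -> Option<f64> {
    // Accumulate H^T H = [[a, b], [b, c]].
    let (mut a, mut b, mut c) = (0.0_f64, 0.0_f64, 0.0_f64);
    for &(sx, sy) in sensors {
        let (dx, dy) = (sx - target.0, sy - target.1);
        let range = dx.hypot(dy);
        if range < 1e-9 {
            return None; // bearing is undefined at zero range
        }
        let (ux, uy) = (dx / range, dy / range);
        a += ux * ux;
        b += ux * uy;
        c += uy * uy;
    }
    let det = a * c - b * b;
    if det < 1e-12 {
        return None; // singular: H^T H is not invertible
    }
    // For a 2x2 matrix, trace(inverse) = (a + c) / det.
    Some(((a + c) / det).sqrt())
}
```

With two sensors at right angles around the target, `HᵀH` is the identity and the result is √2; as bearings approach a common line the determinant shrinks and the value grows, which is the dilution the test compares.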
### §6 OccWorld trajectory-prior consumer honesty (fail-safe)

**Finding:** `wifi-densepose-mat` does **not** consume OccWorld trajectory priors
and has no `-worldmodel`/`-worldgraph`/occworld dependency (grep-verified: zero
hits across `crates/wifi-densepose-mat/`). There is therefore no random-derived
prior being consumed. **No code change** is warranted; the fail-safe (ignore
priors until a typed `weights_complete`/`stubbed` flag exists) is already the
status quo by absence. Recorded here so a future consumer wires the flag rather
than re-introducing the risk.

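
For a future consumer, the fail-safe in §6 might look something like the sketch below. No such type exists today according to this ADR, so every name here (`PriorProvenance`, `TrajectoryPrior`, `usable_prior`) is hypothetical; only the `weights_complete` / `stubbed` distinction comes from the text above.

```rust
/// Hypothetical typed flag describing where a trajectory prior came from.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum PriorProvenance {
    /// Produced by a model whose weights are fully loaded.
    WeightsComplete,
    /// Produced by a stubbed or randomly initialised path.
    Stubbed,
}

#[derive(Debug, Clone, PartialEq)]
pub struct TrajectoryPrior {
    pub provenance: PriorProvenance,
    pub predicted_xy: (f64, f64),
}

/// Fail-safe consumer: a prior is used only when it is explicitly flagged
/// as coming from complete weights. Missing or stubbed priors are ignored.
pub fn usable_prior(prior: Option<&TrajectoryPrior>) -> Option<(f64, f64)> {
    match prior {
        Some(p) if p.provenance == PriorProvenance::WeightsComplete => Some(p.predicted_xy),
        _ => None,
    }
}
```

Defaulting to "ignore" means a forgotten flag degrades to today's behaviour rather than letting an unvetted prior influence localization.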
## Negative Results (Confirmed — NO-ACTION)

These were audited and found genuinely correct; they are cited as positives, not
edited:

- **`worldgraph` provenance / privacy / pruning** (`graph.rs`): append-with-provenance
  (`add_semantic_state` + `DerivedFrom`), deterministic LRU pruning
  (`prune_semantic_states`, with `prune_is_deterministic_for_equal_timestamps`),
  and a privacy rollup (`apply_privacy_mode` → `PrivacyLimitedBy`). Already-SOTA.
- **`worldmodel` occupancy clamp** (`occupancy.rs:74–125`): `to_voxel_xy` /
  `to_voxel_z` `.clamp()` voxel indices into `[0, GRID-1]`; the flat index is
  always in-bounds. No out-of-bounds / fabrication path.
- **`pointcloud` parser bounds-safety**: another agent's crate, cited only; its
  parser is bounds-checked.

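
The clamp argument for the occupancy grid can be illustrated with a small sketch. The 200×200×16 grid size is from this ADR and the function names echo `to_voxel_xy` / `to_voxel_z`, but the signatures, the `origin` and `voxel_size` parameters, and the flat-index layout are assumptions.

```rust
pub const GRID_XY: i64 = 200;
pub const GRID_Z: i64 = 16;

/// Maps a world coordinate to a horizontal voxel index in [0, GRID_XY - 1].
/// Float-to-integer `as` casts saturate (and map NaN to 0), so the clamp
/// always receives a valid integer.
pub fn to_voxel_xy(coord: f64, origin: f64, voxel_size: f64) -> usize {
    let raw = ((coord - origin) / voxel_size).floor() as i64;
    raw.clamp(0, GRID_XY - 1) as usize
}

/// Maps a world height to a vertical voxel index in [0, GRID_Z - 1].
pub fn to_voxel_z(height: f64, origin: f64, voxel_size: f64) -> usize {
    let raw = ((height - origin) / voxel_size).floor() as i64;
    raw.clamp(0, GRID_Z - 1) as usize
}

/// Flat index into a grid of GRID_XY * GRID_XY * GRID_Z cells
/// (z-major layout, chosen here for illustration).
pub fn flat_index(x: usize, y: usize, z: usize) -> usize {
    (z * GRID_XY as usize + y) * GRID_XY as usize + x
}
```

Since each component is clamped before the flat index is formed, the largest possible index is `(15·200 + 199)·200 + 199 = 639 999`, one below the 640 000-cell grid size.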
## Deferred Backlog (Nothing Dropped)

- **Learned multi-person counter**: DATA-GATED on labelled multi-occupant CSI.
  The zone+vitals-signature dedup (§2) is the honest non-learned stand-in until
  then.
- **RF point-cloud generation**: ACCEPTED-FUTURE.
- **PicoScenes container decode**: DATA-GATED; needs matching NIC/plugin to
  validate against. Returns `UnsupportedAdapter` today.
- **Intel 5300 / Atheros live capture**: DATA-GATED on patched drivers; byte
  parsers exist and are exercised on supplied bytes.

## Consequences

- Triage is now a single auditable function; gate and survivor record can never
  diverge.
- Survivor counts cannot inflate from repeat detection of one un-located person.
- The CSI ingest layer either produces real data or fails with a typed error that
  names *why*; no path silently substitutes simulated/fabricated CSI.
- `SensorPosition` grows an optional `last_rssi` field (serde-`default`,
  non-breaking for deserialisation; 7 constructors updated).
- A new optional `serial` feature isolates the native `serialport` dependency from
  the default / appliance builds.

## Reproduction (MEASURED)

```bash
cd v2
# MAT, default features (181 unit + 6 + 3[3 ignored] integration)
cargo test -p wifi-densepose-mat
# MAT, all features (same counts; exercises ruvector + api + serde paths)
cargo test -p wifi-densepose-mat --all-features
# MAT, serial feature compiles (native serialport path)
cargo check -p wifi-densepose-mat --features serial
# Sibling crates (cited NO-ACTION; confirmed green)
cargo test -p wifi-densepose-worldmodel   # 12 + 1
cargo test -p wifi-densepose-worldgraph   # 9
cargo test -p wifi-densepose-geo          # 9 + 8
cargo test -p wifi-densepose-engine       # 27
```

Result at time of writing: MAT **181 passed; 0 failed** (default and all-features);
worldmodel **13**, worldgraph **9**, geo **17**, engine **27**, all with 0 failed.