security(core,cli): pin CSI-deserialiser DoS-resistance + ADR-172 (clean-with-evidence) (#1091)

* test(core,cli): pin DoS-resistance of CSI deserialisers (ADR-127 security review)

Beyond-SOTA security review of wifi-densepose-core + wifi-densepose-cli.
Load-bearing-question verdict: the NaN-state-poisoning bug class does NOT
originate in core — core exposes no stateful accumulator (no Welford,
von-Mises, IIR, voxel grid, running mean); each downstream crate rolls its
own, so each fix is correctly local. Both crates confirmed clean on every
reviewed dimension (panic-on-adversarial-input, NaN handling, unbounded
memory, path traversal, secrets) — no production code changed.

Adds 4 regression pins locking in two existing-but-untested DoS guards:
- core: from_canonical_bytes shape guard (Vec::with_capacity bound) — proven
  to fail with `capacity overflow` when the saturating-mul guard is removed.
- core: canonical decoder never panics on arbitrary/truncated bytes.
- cli: parse_csi_packet rejects an oversized n_antennas*n_subcarriers claim
  before Array2 allocation (33 MB claim in a 2 KB datagram -> None).
- cli: parse_csi_packet never panics on arbitrary UDP bytes.

core: 35 -> 37 lib tests; cli: 24 -> 26 tests; 0 failed. Python proof
unchanged (f8e76f21…46f7a — off the signal path).

Co-Authored-By: claude-flow <ruv@ruv.net>

* docs(adr): ADR-172 — wifi-densepose-cli + core CSI-deserialiser security review

Records the clean-with-evidence verdict + 4 DoS-resistance regression pins
(test-only, committed in a1051607d). Documents the load-bearing finding:
the NaN-state-poisoning bug class does NOT originate in a shared core
primitive (core exposes no stateful accumulator — MEASURED via grep), so
the 3 prior downstream-local fixes are complete. Gives the wifi-densepose-cli
review its own ADR slot (core portion cross-refs ADR-127 §9).

Co-Authored-By: claude-flow <ruv@ruv.net>
This commit is contained in:
rUv 2026-06-14 23:58:09 -04:00 committed by GitHub
parent 71e8756051
commit cfd0ad76cf
No known key found for this signature in database
GPG Key ID: B5690EEEBB952194
4 changed files with 227 additions and 0 deletions

File diff suppressed because one or more lines are too long

View File

@ -0,0 +1,117 @@
# ADR-172: `wifi-densepose-cli` + `wifi-densepose-core` CSI-Deserialiser Security Review
| Field | Value |
|-------|-------|
| **Status** | Accepted — clean-with-evidence, 4 regression pins added |
| **Date** | 2026-06-15 |
| **Deciders** | ruv |
| **Codename** | **CSI-DESERIALISER-HARDENING** |
| **Supersedes / amends** | none (records review; references ADR-127 §9 for the `core` portion, ADR-136 for the pre-existing DoS ACs) |
## Context
The beyond-SOTA security sweep (branch `feat/v2-beyond-sota-sweep`) reviewed each
`v2/` crate for real, reproducible defects. Two crates had no prior dedicated
security ADR:
- **`wifi-densepose-core`** — the dependency root for all 12 downstream crates
(types, traits, error types, CSI frame primitives). A defect here is a
force-multiplier: every consumer inherits it.
- **`wifi-densepose-cli`** — the user-facing entrypoint
(`calibrate`/`calibrate-serve`/`enroll`/`train-room`/`room-watch` + MAT-gated),
which parses untrusted UDP CSI packets and operator-supplied paths.
A **specific hypothesis** motivated the core review. Three earlier reviews in
this campaign found a systemic **NaN-state-poisoning bug class** in crates that
depend on core (`wifi-densepose-calibration`, `-vitals`, `-geo`): a non-finite
(NaN/Inf) input latched into persistent filter/accumulator state (IIR `y1/y2`,
running mean, Welford/von-Mises accumulator, voxel grid) → silent **permanent**
feature failure. The load-bearing question for this review: **does that bug class
originate in a shared `wifi-densepose-core` primitive** (making the right fix a
single root fix), or was it independently re-implemented in each downstream
crate (making the three existing local fixes complete)?
## Decision
Record the review outcome and lock in the existing DoS guards with regression
tests. **No production code is changed** — both crates were already hardened
(ADR-136 acceptance criteria + `sanitize_room_id`); the gap was *untested*
guards, which a future refactor could silently remove.
### Load-bearing question — VERDICT: **NO** (the NaN class does not live in core)
`wifi-densepose-core` exposes **no stateful accumulator of any kind** — no
Welford/running-mean, no von-Mises/circular-mean, no IIR/biquad filter state, no
voxel grid.
- **MEASURED:** `grep` over `core/src` for
`welford|von_mises|biquad|y1|y2|running_mean|accumulat|voxel|self.*+=` matched
only the `InvalidState` *error* enum variant, "reset state" doc comments, and a
test-only LCG — **zero** stateful logic. The only float math in core is
construction-time projection (`CsiFrame::new` → amplitude/phase via `mapv`) and
pure stateless `utils` functions; nothing persists across frames.
- **Corroboration:** `wifi-densepose-calibration::Features::from_series`
(`extract.rs:103133`) already filters non-finite samples → `Features::ZERO`.
The downstream fixes are independently re-implemented, confirming each crate
rolls its own accumulator and each local fix is correct and complete. **A fix
in core would be a no-op (there is nothing to fix).**
Consequence: the NaN-state-poisoning class is a *downstream-local* pattern, not a
core-rooted defect. No hidden fourth instance exists in the shared primitive.
### Findings (all pins — guards already present, now tested)
| # | Location | Guard (pre-existing) | Regression pin | Evidence (MEASURED) |
|---|----------|----------------------|----------------|---------------------|
| 1 | `core` `types.rs:801` `from_canonical_bytes` | `saturating_mul` shape-vs-length check before `Vec::with_capacity(rows*cols)` | `canonical_decode_oversized_shape_is_bounded_not_allocated` | With guard removed: **panics `capacity overflow` at `types.rs:801`**; with guard: passes |
| 2 | `core` `types.rs` decoder | typed `CanonicalDecodeError`, never panics | `canonical_decode_never_panics_on_arbitrary_bytes` (fuzz sweep) | panic-free on arbitrary bytes |
| 3 | `cli` `calibrate.rs:276291` | length check `buf.len() < 20 + n_pairs*2` before `Array2::zeros(n_antennas*n_subcarriers)` | `test_parse_csi_packet_oversized_claim_is_rejected_not_allocated` | 255×65535 claim in a 2 KB packet → `None` (no allocation) |
| 4 | `cli` `calibrate.rs` parser | `None`-returning on malformed input | `test_parse_csi_packet_never_panics_on_arbitrary_bytes` (fuzz sweep) | panic-free on arbitrary UDP bytes |
### Dimensions confirmed clean (with evidence)
1. **Panic-on-adversarial-input = 0**`from_canonical_bytes` returns a typed
error for every malformed class; `parse_csi_packet` returns `None`. Both
fuzz-swept panic-free.
2. **NaN handling**`Confidence::new` rejects NaN
(`!(0.0..=1.0).contains(&NaN)` ⇒ `Err`); `compute_bounding_box` /
`to_flat_array` are NaN-tolerant (f32 min/max ignore NaN).
3. **Empty-frame safety**`amplitude_variance` / `mean_amplitude` are
panic-free on an empty `Array2` (ndarray 0.17 returns finite / `None`).
4. **Unbounded-memory DoS** — bounded in both deserialisers (findings 1 & 3).
5. **Path traversal**`calibrate-serve` defends every client-supplied
`room_id`/`bank`/`baseline` via `sanitize_room_id` (`[A-Za-z0-9_-]`, 64-char
cap) with existing tests; bearer-auth gate + non-loopback-bind warning present.
`mat export` writes to an operator-supplied `PathBuf` (acceptable CLI behavior).
6. **Secrets**`--token` is read from `CALIBRATE_TOKEN` env, never embedded.
## Validation
- `cargo test -p wifi-densepose-core`**35 → 37** lib passed, 0 failed (+3 doctests)
- `cargo test -p wifi-densepose-cli --no-default-features`**24 → 26** passed, 0 failed
- `cargo test --workspace --no-default-features`**exit 0**, 0 failed
- `python archive/v1/data/proof/verify.py`**VERDICT: PASS**, hash
`f8e76f21a0f9852b70b6d9dd5318239f6b20cbcb4cdd995863263cecdc446f7a` **unchanged**
(core/cli are off the signal proof path — confirms no pipeline alteration)
## Consequences
### Positive
- Two CSI deserialisers (the untrusted-input boundary of both the library root
and the network-facing CLI) now have their DoS guards pinned against
regression — a future refactor that drops a length check fails CI.
- The NaN-state-poisoning class is settled as downstream-local; reviewers no
longer need to suspect a shared-root defect, and the three prior local fixes
are confirmed complete.
### Negative
- None. Test-only change; no behavior or API change.
### Neutral
- The `core` portion is also noted in ADR-127 §9 (shared security-review log);
this ADR is the canonical record for the `wifi-densepose-cli` review.
## Links
- ADR-127 — HOMECORE state machine (shared security-review log, §9)
- ADR-136 — pre-existing CSI deserialiser DoS acceptance criteria
- ADR-151 — per-room calibration (`calibrate`/`calibrate-serve` surfaces)

View File

@ -471,6 +471,54 @@ mod tests {
assert!(ht.record(&f).is_err());
}
/// Security pin (review 2026-06, ADR-127): the UDP parser is the CLI's
/// widest attack surface — `calibrate` / `enroll` / `room-watch` bind it to
/// 0.0.0.0 by default, so any host on the LAN can send arbitrary bytes. A
/// header that *claims* a huge `n_antennas * n_subcarriers` must be rejected
/// by the length check BEFORE the `Array2::zeros` allocation, so a single
/// small datagram can never trigger a multi-MB allocation (unbounded-memory
/// DoS). The largest possible claim (255 × 65535 pairs ≈ 33 MB of IQ) inside
/// a RECV_BUF-sized (2048-byte) datagram parses to `None`, never OOMs.
#[test]
fn test_parse_csi_packet_oversized_claim_is_rejected_not_allocated() {
let mut buf = vec![0u8; RECV_BUF];
buf[0..4].copy_from_slice(&0xC511_0001u32.to_le_bytes());
buf[4] = 1; // node_id
buf[5] = 255; // n_antennas (max)
buf[6..8].copy_from_slice(&65535u16.to_le_bytes()); // n_subcarriers (max)
buf[8..12].copy_from_slice(&2432u32.to_le_bytes());
// n_pairs = 255 * 65535 = 16_711_425 → needs ~33 MB of IQ bytes that a
// 2048-byte datagram cannot carry → length check fails → None.
assert!(parse_csi_packet(&buf, "ht20").is_none());
}
/// Security pin (review 2026-06): the parser must never panic on ANY byte
/// string — truncated headers, lying length fields, odd sizes. IQ-loop
/// indexing is guarded by the length check; this sweeps a spread of
/// adversarial inputs to lock in panic-on-adversarial-input = 0.
#[test]
fn test_parse_csi_packet_never_panics_on_arbitrary_bytes() {
let mut st = 0x1234_5678u64;
let mut next = move || {
st = st
.wrapping_mul(6_364_136_223_846_793_005)
.wrapping_add(1_442_695_040_888_963_407);
(st >> 33) as u8
};
for len in 0..600usize {
let buf: Vec<u8> = (0..len).map(|_| next()).collect();
for tier in ["ht20", "he20", "garbage"] {
let _ = parse_csi_packet(&buf, tier);
}
}
// Valid magic, lying n_subcarriers, no payload → None (not a panic).
let mut buf = vec![0u8; 20];
buf[0..4].copy_from_slice(&0xC511_0001u32.to_le_bytes());
buf[5] = 3;
buf[6..8].copy_from_slice(&500u16.to_le_bytes());
assert!(parse_csi_packet(&buf, "ht20").is_none());
}
#[test]
fn test_freq_to_channel_24ghz() {
assert_eq!(freq_mhz_to_channel(2437), 6);

View File

@ -1636,6 +1636,67 @@ mod tests {
}
}
/// Security pin (review 2026-06, ADR-127) — `from_canonical_bytes` is a
/// deserialisation boundary for replayed/forwarded captures. A forged header
/// advertising an enormous `rows × cols` must be rejected by the
/// shape-vs-length check (`expect` uses saturating multiplies) BEFORE the
/// `Vec::with_capacity(rows * cols)` allocation — otherwise an attacker could
/// drive a multi-GB allocation from a few header bytes (unbounded-memory
/// DoS). The check guarantees `rows*cols*16 <= bytes.len()`, so the capacity
/// is bounded by the input the caller already holds. This must not OOM.
#[test]
fn canonical_decode_oversized_shape_is_bounded_not_allocated() {
use ndarray::Array2;
let meta = CsiMetadata::new(DeviceId::new("n"), FrequencyBand::Band2_4GHz, 1);
let data = Array2::from_shape_fn((1, 2), |(_, c)| Complex64::new(c as f64, 0.0));
let mut bytes = CsiFrame::new(meta, data).to_canonical_bytes();
// The (rows, cols) u32 pair is the last 8 bytes before the payload.
// Overwrite with a maximal claim (u32::MAX × u32::MAX) and lop off the
// payload so the buffer is tiny but the header lies enormously.
let shape_off = bytes.len() - 8 - 2 * 16; // 2 samples × 16 bytes payload
bytes[shape_off..shape_off + 4].copy_from_slice(&u32::MAX.to_le_bytes());
bytes[shape_off + 4..shape_off + 8].copy_from_slice(&u32::MAX.to_le_bytes());
bytes.truncate(shape_off + 8); // drop the real payload
// expect = MAX*MAX*16 (saturated) > found → PayloadMismatch, no alloc.
assert!(matches!(
CsiFrame::from_canonical_bytes(&bytes),
Err(CanonicalDecodeError::PayloadMismatch { .. })
));
}
/// Security pin (review 2026-06) — the decoder must never panic on arbitrary
/// bytes: every malformed input is a typed `CanonicalDecodeError`, never an
/// unwinding panic (panic-on-adversarial-input = 0). Sweep truncations and a
/// deterministic fuzz spread.
#[test]
fn canonical_decode_never_panics_on_arbitrary_bytes() {
use ndarray::Array2;
let mut meta = CsiMetadata::new(DeviceId::new("node"), FrequencyBand::Band5GHz, 36);
meta.antenna_config.spacing_mm = Some(50.0);
let data = Array2::from_shape_fn((2, 8), |(r, c)| Complex64::new(r as f64, c as f64));
let good = CsiFrame::new(meta, data).to_canonical_bytes();
// Every prefix of a valid encoding must decode without panicking.
for n in 0..good.len() {
let _ = CsiFrame::from_canonical_bytes(&good[..n]);
}
// Deterministic LCG fuzz over varied lengths.
let mut st = 0xDEAD_BEEFu64;
for len in 0..400usize {
let buf: Vec<u8> = (0..len)
.map(|_| {
st = st
.wrapping_mul(6_364_136_223_846_793_005)
.wrapping_add(1_442_695_040_888_963_407);
(st >> 33) as u8
})
.collect();
let _ = CsiFrame::from_canonical_bytes(&buf);
}
}
/// AC8c (review finding 7) — `Some(Uuid::nil())` calibration is an
/// encoding error: nil is the wire sentinel for `None`, so encoding it
/// would alias two distinct frames to one byte string (and one witness).