security(core,cli): pin CSI-deserialiser DoS-resistance + ADR-172 (clean-with-evidence) (#1091)
* test(core,cli): pin DoS-resistance of CSI deserialisers (ADR-127 security review)
Beyond-SOTA security review of wifi-densepose-core + wifi-densepose-cli.
Load-bearing-question verdict: the NaN-state-poisoning bug class does NOT
originate in core — core exposes no stateful accumulator (no Welford,
von-Mises, IIR, voxel grid, running mean); each downstream crate rolls its
own, so each fix is correctly local. Both crates confirmed clean on every
reviewed dimension (panic-on-adversarial-input, NaN handling, unbounded
memory, path traversal, secrets) — no production code changed.
Adds 4 regression pins locking in two existing-but-untested DoS guards:
- core: from_canonical_bytes shape guard (Vec::with_capacity bound) — proven
to fail with `capacity overflow` when the saturating-mul guard is removed.
- core: canonical decoder never panics on arbitrary/truncated bytes.
- cli: parse_csi_packet rejects an oversized n_antennas*n_subcarriers claim
before Array2 allocation (33 MB claim in a 2 KB datagram -> None).
- cli: parse_csi_packet never panics on arbitrary UDP bytes.
core: 35 -> 37 lib tests; cli: 24 -> 26 tests; 0 failed. Python proof
unchanged (f8e76f21…46f7a — off the signal path).
Co-Authored-By: claude-flow <ruv@ruv.net>
* docs(adr): ADR-172 — wifi-densepose-cli + core CSI-deserialiser security review
Records the clean-with-evidence verdict + 4 DoS-resistance regression pins
(test-only, committed in a1051607d). Documents the load-bearing finding:
the NaN-state-poisoning bug class does NOT originate in a shared core
primitive (core exposes no stateful accumulator — MEASURED via grep), so
the 3 prior downstream-local fixes are complete. Gives the wifi-densepose-cli
review its own ADR slot (core portion cross-refs ADR-127 §9).
Co-Authored-By: claude-flow <ruv@ruv.net>
This commit is contained in:
parent
71e8756051
commit
cfd0ad76cf
File diff suppressed because one or more lines are too long
|
|
@ -0,0 +1,117 @@
|
|||
# ADR-172: `wifi-densepose-cli` + `wifi-densepose-core` CSI-Deserialiser Security Review
|
||||
|
||||
| Field | Value |
|
||||
|-------|-------|
|
||||
| **Status** | Accepted — clean-with-evidence, 4 regression pins added |
|
||||
| **Date** | 2026-06-15 |
|
||||
| **Deciders** | ruv |
|
||||
| **Codename** | **CSI-DESERIALISER-HARDENING** |
|
||||
| **Supersedes / amends** | none (records review; references ADR-127 §9 for the `core` portion, ADR-136 for the pre-existing DoS ACs) |
|
||||
|
||||
## Context
|
||||
|
||||
The beyond-SOTA security sweep (branch `feat/v2-beyond-sota-sweep`) reviewed each
|
||||
`v2/` crate for real, reproducible defects. Two crates had no prior dedicated
|
||||
security ADR:
|
||||
|
||||
- **`wifi-densepose-core`** — the dependency root for all 12 downstream crates
|
||||
(types, traits, error types, CSI frame primitives). A defect here is a
|
||||
force-multiplier: every consumer inherits it.
|
||||
- **`wifi-densepose-cli`** — the user-facing entrypoint
|
||||
(`calibrate`/`calibrate-serve`/`enroll`/`train-room`/`room-watch` + MAT-gated),
|
||||
which parses untrusted UDP CSI packets and operator-supplied paths.
|
||||
|
||||
A **specific hypothesis** motivated the core review. Three earlier reviews in
|
||||
this campaign found a systemic **NaN-state-poisoning bug class** in crates that
|
||||
depend on core (`wifi-densepose-calibration`, `-vitals`, `-geo`): a non-finite
|
||||
(NaN/Inf) input latched into persistent filter/accumulator state (IIR `y1/y2`,
|
||||
running mean, Welford/von-Mises accumulator, voxel grid) → silent **permanent**
|
||||
feature failure. The load-bearing question for this review: **does that bug class
|
||||
originate in a shared `wifi-densepose-core` primitive** (making the right fix a
|
||||
single root fix), or was it independently re-implemented in each downstream
|
||||
crate (making the three existing local fixes complete)?
|
||||
|
||||
## Decision
|
||||
|
||||
Record the review outcome and lock in the existing DoS guards with regression
|
||||
tests. **No production code is changed** — both crates were already hardened
|
||||
(ADR-136 acceptance criteria + `sanitize_room_id`); the gap was *untested*
|
||||
guards, which a future refactor could silently remove.
|
||||
|
||||
### Load-bearing question — VERDICT: **NO** (the NaN class does not live in core)
|
||||
|
||||
`wifi-densepose-core` exposes **no stateful accumulator of any kind** — no
|
||||
Welford/running-mean, no von-Mises/circular-mean, no IIR/biquad filter state, no
|
||||
voxel grid.
|
||||
|
||||
- **MEASURED:** `grep` over `core/src` for
|
||||
`welford|von_mises|biquad|y1|y2|running_mean|accumulat|voxel|self.*+=` matched
|
||||
only the `InvalidState` *error* enum variant, "reset state" doc comments, and a
|
||||
test-only LCG — **zero** stateful logic. The only float math in core is
|
||||
construction-time projection (`CsiFrame::new` → amplitude/phase via `mapv`) and
|
||||
pure stateless `utils` functions; nothing persists across frames.
|
||||
- **Corroboration:** `wifi-densepose-calibration::Features::from_series`
|
||||
(`extract.rs:103–133`) already filters non-finite samples → `Features::ZERO`.
|
||||
The downstream fixes are independently re-implemented, confirming each crate
|
||||
rolls its own accumulator and each local fix is correct and complete. **A fix
|
||||
in core would be a no-op (there is nothing to fix).**
|
||||
|
||||
Consequence: the NaN-state-poisoning class is a *downstream-local* pattern, not a
|
||||
core-rooted defect. No hidden fourth instance exists in the shared primitive.
|
||||
|
||||
### Findings (all pins — guards already present, now tested)
|
||||
|
||||
| # | Location | Guard (pre-existing) | Regression pin | Evidence (MEASURED) |
|
||||
|---|----------|----------------------|----------------|---------------------|
|
||||
| 1 | `core` `types.rs:801` `from_canonical_bytes` | `saturating_mul` shape-vs-length check before `Vec::with_capacity(rows*cols)` | `canonical_decode_oversized_shape_is_bounded_not_allocated` | With guard removed: **panics `capacity overflow` at `types.rs:801`**; with guard: passes |
|
||||
| 2 | `core` `types.rs` decoder | typed `CanonicalDecodeError`, never panics | `canonical_decode_never_panics_on_arbitrary_bytes` (fuzz sweep) | panic-free on arbitrary bytes |
|
||||
| 3 | `cli` `calibrate.rs:276–291` | length check `buf.len() < 20 + n_pairs*2` before `Array2::zeros(n_antennas*n_subcarriers)` | `test_parse_csi_packet_oversized_claim_is_rejected_not_allocated` | 255×65535 claim in a 2 KB packet → `None` (no allocation) |
|
||||
| 4 | `cli` `calibrate.rs` parser | `None`-returning on malformed input | `test_parse_csi_packet_never_panics_on_arbitrary_bytes` (fuzz sweep) | panic-free on arbitrary UDP bytes |
|
||||
|
||||
### Dimensions confirmed clean (with evidence)
|
||||
|
||||
1. **Panic-on-adversarial-input = 0** — `from_canonical_bytes` returns a typed
|
||||
error for every malformed class; `parse_csi_packet` returns `None`. Both
|
||||
fuzz-swept panic-free.
|
||||
2. **NaN handling** — `Confidence::new` rejects NaN
|
||||
(`!(0.0..=1.0).contains(&NaN)` ⇒ `Err`); `compute_bounding_box` /
|
||||
`to_flat_array` are NaN-tolerant (f32 min/max ignore NaN).
|
||||
3. **Empty-frame safety** — `amplitude_variance` / `mean_amplitude` are
|
||||
panic-free on an empty `Array2` (ndarray 0.17 returns finite / `None`).
|
||||
4. **Unbounded-memory DoS** — bounded in both deserialisers (findings 1 & 3).
|
||||
5. **Path traversal** — `calibrate-serve` defends every client-supplied
|
||||
`room_id`/`bank`/`baseline` via `sanitize_room_id` (`[A-Za-z0-9_-]`, 64-char
|
||||
cap) with existing tests; bearer-auth gate + non-loopback-bind warning present.
|
||||
`mat export` writes to an operator-supplied `PathBuf` (acceptable CLI behavior).
|
||||
6. **Secrets** — `--token` is read from `CALIBRATE_TOKEN` env, never embedded.
|
||||
|
||||
## Validation
|
||||
|
||||
- `cargo test -p wifi-densepose-core` → **35 → 37** lib passed, 0 failed (+3 doctests)
|
||||
- `cargo test -p wifi-densepose-cli --no-default-features` → **24 → 26** passed, 0 failed
|
||||
- `cargo test --workspace --no-default-features` → **exit 0**, 0 failed
|
||||
- `python archive/v1/data/proof/verify.py` → **VERDICT: PASS**, hash
|
||||
`f8e76f21a0f9852b70b6d9dd5318239f6b20cbcb4cdd995863263cecdc446f7a` **unchanged**
|
||||
(core/cli are off the signal proof path — confirms no pipeline alteration)
|
||||
|
||||
## Consequences
|
||||
|
||||
### Positive
|
||||
- Two CSI deserialisers (the untrusted-input boundary of both the library root
|
||||
and the network-facing CLI) now have their DoS guards pinned against
|
||||
regression — a future refactor that drops a length check fails CI.
|
||||
- The NaN-state-poisoning class is settled as downstream-local; reviewers no
|
||||
longer need to suspect a shared-root defect, and the three prior local fixes
|
||||
are confirmed complete.
|
||||
|
||||
### Negative
|
||||
- None. Test-only change; no behavior or API change.
|
||||
|
||||
### Neutral
|
||||
- The `core` portion is also noted in ADR-127 §9 (shared security-review log);
|
||||
this ADR is the canonical record for the `wifi-densepose-cli` review.
|
||||
|
||||
## Links
|
||||
- ADR-127 — HOMECORE state machine (shared security-review log, §9)
|
||||
- ADR-136 — pre-existing CSI deserialiser DoS acceptance criteria
|
||||
- ADR-151 — per-room calibration (`calibrate`/`calibrate-serve` surfaces)
|
||||
|
|
@ -471,6 +471,54 @@ mod tests {
|
|||
assert!(ht.record(&f).is_err());
|
||||
}
|
||||
|
||||
/// Security pin (review 2026-06, ADR-127): the UDP parser is the CLI's
|
||||
/// widest attack surface — `calibrate` / `enroll` / `room-watch` bind it to
|
||||
/// 0.0.0.0 by default, so any host on the LAN can send arbitrary bytes. A
|
||||
/// header that *claims* a huge `n_antennas * n_subcarriers` must be rejected
|
||||
/// by the length check BEFORE the `Array2::zeros` allocation, so a single
|
||||
/// small datagram can never trigger a multi-MB allocation (unbounded-memory
|
||||
/// DoS). The largest possible claim (255 × 65535 pairs ≈ 33 MB of IQ) inside
|
||||
/// a RECV_BUF-sized (2048-byte) datagram parses to `None`, never OOMs.
|
||||
#[test]
|
||||
fn test_parse_csi_packet_oversized_claim_is_rejected_not_allocated() {
|
||||
let mut buf = vec![0u8; RECV_BUF];
|
||||
buf[0..4].copy_from_slice(&0xC511_0001u32.to_le_bytes());
|
||||
buf[4] = 1; // node_id
|
||||
buf[5] = 255; // n_antennas (max)
|
||||
buf[6..8].copy_from_slice(&65535u16.to_le_bytes()); // n_subcarriers (max)
|
||||
buf[8..12].copy_from_slice(&2432u32.to_le_bytes());
|
||||
// n_pairs = 255 * 65535 = 16_711_425 → needs ~33 MB of IQ bytes that a
|
||||
// 2048-byte datagram cannot carry → length check fails → None.
|
||||
assert!(parse_csi_packet(&buf, "ht20").is_none());
|
||||
}
|
||||
|
||||
/// Security pin (review 2026-06): the parser must never panic on ANY byte
|
||||
/// string — truncated headers, lying length fields, odd sizes. IQ-loop
|
||||
/// indexing is guarded by the length check; this sweeps a spread of
|
||||
/// adversarial inputs to lock in panic-on-adversarial-input = 0.
|
||||
#[test]
|
||||
fn test_parse_csi_packet_never_panics_on_arbitrary_bytes() {
|
||||
let mut st = 0x1234_5678u64;
|
||||
let mut next = move || {
|
||||
st = st
|
||||
.wrapping_mul(6_364_136_223_846_793_005)
|
||||
.wrapping_add(1_442_695_040_888_963_407);
|
||||
(st >> 33) as u8
|
||||
};
|
||||
for len in 0..600usize {
|
||||
let buf: Vec<u8> = (0..len).map(|_| next()).collect();
|
||||
for tier in ["ht20", "he20", "garbage"] {
|
||||
let _ = parse_csi_packet(&buf, tier);
|
||||
}
|
||||
}
|
||||
// Valid magic, lying n_subcarriers, no payload → None (not a panic).
|
||||
let mut buf = vec![0u8; 20];
|
||||
buf[0..4].copy_from_slice(&0xC511_0001u32.to_le_bytes());
|
||||
buf[5] = 3;
|
||||
buf[6..8].copy_from_slice(&500u16.to_le_bytes());
|
||||
assert!(parse_csi_packet(&buf, "ht20").is_none());
|
||||
}
|
||||
|
||||
#[test]
|
||||
fn test_freq_to_channel_24ghz() {
|
||||
assert_eq!(freq_mhz_to_channel(2437), 6);
|
||||
|
|
|
|||
|
|
@ -1636,6 +1636,67 @@ mod tests {
|
|||
}
|
||||
}
|
||||
|
||||
/// Security pin (review 2026-06, ADR-127) — `from_canonical_bytes` is a
|
||||
/// deserialisation boundary for replayed/forwarded captures. A forged header
|
||||
/// advertising an enormous `rows × cols` must be rejected by the
|
||||
/// shape-vs-length check (`expect` uses saturating multiplies) BEFORE the
|
||||
/// `Vec::with_capacity(rows * cols)` allocation — otherwise an attacker could
|
||||
/// drive a multi-GB allocation from a few header bytes (unbounded-memory
|
||||
/// DoS). The check guarantees `rows*cols*16 <= bytes.len()`, so the capacity
|
||||
/// is bounded by the input the caller already holds. This must not OOM.
|
||||
#[test]
|
||||
fn canonical_decode_oversized_shape_is_bounded_not_allocated() {
|
||||
use ndarray::Array2;
|
||||
let meta = CsiMetadata::new(DeviceId::new("n"), FrequencyBand::Band2_4GHz, 1);
|
||||
let data = Array2::from_shape_fn((1, 2), |(_, c)| Complex64::new(c as f64, 0.0));
|
||||
let mut bytes = CsiFrame::new(meta, data).to_canonical_bytes();
|
||||
|
||||
// The (rows, cols) u32 pair is the last 8 bytes before the payload.
|
||||
// Overwrite with a maximal claim (u32::MAX × u32::MAX) and lop off the
|
||||
// payload so the buffer is tiny but the header lies enormously.
|
||||
let shape_off = bytes.len() - 8 - 2 * 16; // 2 samples × 16 bytes payload
|
||||
bytes[shape_off..shape_off + 4].copy_from_slice(&u32::MAX.to_le_bytes());
|
||||
bytes[shape_off + 4..shape_off + 8].copy_from_slice(&u32::MAX.to_le_bytes());
|
||||
bytes.truncate(shape_off + 8); // drop the real payload
|
||||
|
||||
// expect = MAX*MAX*16 (saturated) > found → PayloadMismatch, no alloc.
|
||||
assert!(matches!(
|
||||
CsiFrame::from_canonical_bytes(&bytes),
|
||||
Err(CanonicalDecodeError::PayloadMismatch { .. })
|
||||
));
|
||||
}
|
||||
|
||||
/// Security pin (review 2026-06) — the decoder must never panic on arbitrary
|
||||
/// bytes: every malformed input is a typed `CanonicalDecodeError`, never an
|
||||
/// unwinding panic (panic-on-adversarial-input = 0). Sweep truncations and a
|
||||
/// deterministic fuzz spread.
|
||||
#[test]
|
||||
fn canonical_decode_never_panics_on_arbitrary_bytes() {
|
||||
use ndarray::Array2;
|
||||
let mut meta = CsiMetadata::new(DeviceId::new("node"), FrequencyBand::Band5GHz, 36);
|
||||
meta.antenna_config.spacing_mm = Some(50.0);
|
||||
let data = Array2::from_shape_fn((2, 8), |(r, c)| Complex64::new(r as f64, c as f64));
|
||||
let good = CsiFrame::new(meta, data).to_canonical_bytes();
|
||||
|
||||
// Every prefix of a valid encoding must decode without panicking.
|
||||
for n in 0..good.len() {
|
||||
let _ = CsiFrame::from_canonical_bytes(&good[..n]);
|
||||
}
|
||||
// Deterministic LCG fuzz over varied lengths.
|
||||
let mut st = 0xDEAD_BEEFu64;
|
||||
for len in 0..400usize {
|
||||
let buf: Vec<u8> = (0..len)
|
||||
.map(|_| {
|
||||
st = st
|
||||
.wrapping_mul(6_364_136_223_846_793_005)
|
||||
.wrapping_add(1_442_695_040_888_963_407);
|
||||
(st >> 33) as u8
|
||||
})
|
||||
.collect();
|
||||
let _ = CsiFrame::from_canonical_bytes(&buf);
|
||||
}
|
||||
}
|
||||
|
||||
/// AC8c (review finding 7) — `Some(Uuid::nil())` calibration is an
|
||||
/// encoding error: nil is the wire sentinel for `None`, so encoding it
|
||||
/// would alias two distinct frames to one byte string (and one witness).
|
||||
|
|
|
|||
Loading…
Reference in New Issue