security(occworld-candle): int32-checkpoint crash + degenerate-input guards + ADR-179 (closes Milestone #9) (#1101)
* fix(occworld-candle): security review fixes — int32 checkpoint crash + predict input validation
Beyond-SOTA security + correctness review of wifi-densepose-occworld-candle
(Milestone #9, crate 4/4 — the last ungated crate).
Findings fixed:
1. HIGH (MEASURED) — checkpoint-load crash on any int32 tensor.
model.rs mapped safetensors I32 -> candle DType::I64 and passed the raw
int32 byte buffer (4 bytes/elem) to Tensor::from_raw_buffer(.., I64, ..).
Candle derives elem_count = data.len() / dtype.size(), so the I64 path
halved the count while keeping the original shape -> a tensor whose shape
claims 2x its storage. Reading it PANICS (slice OOB: "range end index 6
out of range for slice of length 3") on any checkpoint containing an int32
tensor. Fixed: I32 -> DType::I32, I16 -> DType::I16 (both first-class
candle dtypes). Reproduced on old code; pinned in tests/checkpoint_loading.rs.
2. LOW (MEASURED) — predict() lacked frame/batch validation at the input
boundary. f_in > num_frames*2 over-indexed the temporal embedding (cryptic
candle "gather" error); a zero frame or batch count fed a zero-element tensor
in. Both are now rejected with a clear ShapeMismatch. Pinned in
tests/input_validation.rs.
3. LOW (MEASURED) — divide-by-zero panic in the public VQCodebook::encode on a
rank-0 / empty-last-dim tensor (last == 0). Now fails closed with a clear
error. Pinned in vqvae.rs unit tests.
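The arithmetic behind finding 1 can be reproduced without candle. The sketch below is illustrative only: ElemType, elem_count and shape_matches_storage are hypothetical names, not the crate's or candle's API. It assumes only what the finding states, namely that the element count is derived as data.len() / dtype.size().

```rust
/// Hypothetical stand-in for a tensor dtype; only the element size matters here.
#[derive(Clone, Copy, Debug, PartialEq)]
enum ElemType {
    I32,
    I64,
}

impl ElemType {
    fn size_in_bytes(self) -> usize {
        match self {
            ElemType::I32 => 4,
            ElemType::I64 => 8,
        }
    }
}

/// Number of elements a raw buffer can back when interpreted as `dtype`.
fn elem_count(raw: &[u8], dtype: ElemType) -> usize {
    raw.len() / dtype.size_in_bytes()
}

/// True when the buffer holds exactly as many elements as `shape` claims.
fn shape_matches_storage(raw: &[u8], dtype: ElemType, shape: &[usize]) -> bool {
    let claimed: usize = shape.iter().product();
    elem_count(raw, dtype) == claimed
}

fn main() {
    // Six int32 values as a checkpoint would store them: 24 bytes.
    let raw: Vec<u8> = (0..6i32).flat_map(|v| v.to_le_bytes()).collect();
    let shape = [2, 3];

    // Old mapping (I32 -> I64): storage for 3 elements, shape claims 6.
    assert_eq!(elem_count(&raw, ElemType::I64), 3);
    assert!(!shape_matches_storage(&raw, ElemType::I64, &shape));

    // Fixed mapping (I32 -> I32): storage and shape agree.
    assert_eq!(elem_count(&raw, ElemType::I32), 6);
    assert!(shape_matches_storage(&raw, ElemType::I32, &shape));
}
```

The 6-versus-3 mismatch here has the same shape as the quoted panic message ("range end index 6 out of range for slice of length 3").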
Dimensions confirmed clean with evidence: panic surface (no unwrap/expect/
panic in prod paths), NaN-state-poisoning (N/A — stateless engine, u8 input),
unbounded-alloc/shape-data mismatch (defended upstream by safetensors::
validate), secrets (none). unsafe_code = forbid.
Validation (MEASURED, Windows): crate 31/31 pass; workspace 0 failed (lone
desktop api_integration "Access is denied" file-lock flake passes 21/21 in
isolation); Python proof VERDICT PASS, hash f8e76f21…446f7a unchanged.
Warrants ADR slot 179 (parent to author).
Co-Authored-By: claude-flow <ruv@ruv.net>
* docs(adr): ADR-179 — occworld-candle checkpoint-load hardening (closes Milestone #9)
Records the HIGH int32-checkpoint crash fix (I32→I64 dtype-widening → slice-OOB
panic on load = DoS) + 2 LOW degenerate-input fixes from 5e77f47e5. Stateless
engine (NaN-poisoning N/A), unsafe forbidden, safetensors validate() defends
against unbounded allocation and shape/data mismatch upstream. occworld 31/31.
Final ungated crate, so Milestone #9 is complete.
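For readers of the ADR, the two LOW degenerate-input guards amount to plain shape checks at the input boundary. This is a minimal sketch under stated assumptions, not the crate's code: GuardError, check_predict_input and rows_for_encode are hypothetical names, and the sketch assumes the encode path divides the total element count by the last dimension.

```rust
/// Hypothetical error type; the commit only names a `ShapeMismatch` error.
#[derive(Debug, PartialEq)]
enum GuardError {
    ShapeMismatch(String),
}

/// Reject zero-sized or over-long inputs before any tensor op runs.
/// The upper bound num_frames * 2 is the temporal-embedding limit named in the commit.
fn check_predict_input(batch: usize, f_in: usize, num_frames: usize) -> Result<(), GuardError> {
    let max_frames = num_frames.saturating_mul(2);
    if batch == 0 || f_in == 0 {
        return Err(GuardError::ShapeMismatch(format!(
            "batch ({batch}) and frame count ({f_in}) must be non-zero"
        )));
    }
    if f_in > max_frames {
        return Err(GuardError::ShapeMismatch(format!(
            "frame count {f_in} exceeds temporal embedding size {max_frames}"
        )));
    }
    Ok(())
}

/// Number of vectors to encode, failing closed on a rank-0 shape or a
/// zero-sized last dimension instead of dividing by zero.
fn rows_for_encode(shape: &[usize]) -> Result<usize, GuardError> {
    let last = match shape.last() {
        Some(&d) if d > 0 => d,
        _ => {
            return Err(GuardError::ShapeMismatch(
                "encode input needs a non-empty last dimension".to_string(),
            ))
        }
    };
    let total: usize = shape.iter().product();
    Ok(total / last)
}

fn main() {
    // In-range input passes; the boundary value num_frames * 2 is allowed.
    assert!(check_predict_input(1, 4, 2).is_ok());
    // Over-long, zero-frame and zero-batch inputs are all rejected.
    assert!(check_predict_input(1, 5, 2).is_err());
    assert!(check_predict_input(1, 0, 2).is_err());
    assert!(check_predict_input(0, 1, 2).is_err());

    // A [2, 3, 4] input holds 6 vectors of width 4.
    assert_eq!(rows_for_encode(&[2, 3, 4]), Ok(6));
    // Rank-0 and empty-last-dim shapes fail closed.
    assert!(rows_for_encode(&[]).is_err());
    assert!(rows_for_encode(&[2, 0]).is_err());
}
```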
Co-Authored-By: claude-flow <ruv@ruv.net>