The model.safetensors file currently published at
huggingface.co/ruvnet/wifi-densepose-pretrained has a malformed header:
the 8-byte u64 declares 1464 header bytes, the JSON document ends at
byte 1461, and the last 3 bytes of the header zone are literal 0x00
padding instead of the spec-required 0x20 spaces. Strict safetensors
readers (the Rust safetensors crate, Candle, safetensors.torch.load_file)
reject the file with 'SafetensorError: trailing characters at line 1 column 1462'.
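
The byte layout described above can be checked with a small sketch like the
following. It is an illustration only, not code from this repository:
`inspect_header` is a hypothetical helper name, and it assumes the standard
safetensors layout of a little-endian u64 length prefix followed by a JSON
header that starts at the first byte of the header zone.

```python
import json
import struct


def inspect_header(blob: bytes) -> dict:
    """Report the declared header length, where the JSON ends, and the padding."""
    # First 8 bytes: little-endian u64 giving the size of the header zone.
    (declared,) = struct.unpack("<Q", blob[:8])
    zone = blob[8:8 + declared]
    # latin-1 maps every byte to exactly one character, so the index returned
    # by raw_decode is also a byte offset into the header zone.
    _, json_end = json.JSONDecoder().raw_decode(zone.decode("latin-1"))
    padding = zone[json_end:]
    return {
        "declared": declared,
        "json_end": json_end,
        "padding": padding,
        # Strict readers only tolerate 0x20 (space) after the JSON document.
        "strict_ok": all(b == 0x20 for b in padding),
    }
```

If the figures above hold, running this on the published file's bytes should
show a declared length of 1464 and three 0x00 bytes of padding, with
`strict_ok` false.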
This commit:
- adds docs/huggingface/SAFETENSORS-HEADER-BUG.md with byte-level
  evidence, spec citation, source-of-bug location (the SafeTensorsWriter
  in vendor/ruvector/.../export.js — separate repo at ruvnet/ruvector),
  list of three trainer scripts that go through this path
  (train-wiflow.js, train-ruvllm.js, train-camera-free.js), table of
  affected vs lenient consumers, 10-line strict-reader repro that
  reproduces the exact error class against a synthetic file, proposed
  upstream fix (0x20 padding or no padding), and a follow-ups checklist
  including the need to re-train/re-export and re-upload the HF artifact
- flags the bundle as needing republish under [Unreleased] in CHANGELOG.md
- updates the HF model section of docs/user-guide.md so the load example
  now patches the header with scripts/fix-safetensors-header.py before
  calling safetensors.torch.load_file (which would otherwise crash on
  the current bundle), and flips the Python/PyTorch row of the
  consumer-status table from 'Works' to 'Broken header — strict readers
  reject; patch with scripts/fix-safetensors-header.py'
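
For reference, the kind of in-place repair the bullets above describe can be
sketched as follows. This is a minimal illustration under stated assumptions,
not the contents of scripts/fix-safetensors-header.py:
`pad_header_with_spaces` is a hypothetical name, and the sketch assumes the
only defect is trailing 0x00 bytes inside the header zone.

```python
import struct


def pad_header_with_spaces(blob: bytes) -> bytes:
    """Replace trailing 0x00 bytes in the header zone with 0x20 spaces."""
    # Little-endian u64 length prefix, then the header zone of that size.
    (declared,) = struct.unpack("<Q", blob[:8])
    zone = blob[8:8 + declared]
    stripped = zone.rstrip(b"\x00")
    # Pad back to the original size so the declared length and every tensor
    # offset after the header stay valid.
    fixed = stripped + b" " * (len(zone) - len(stripped))
    return blob[:8] + fixed + blob[8 + declared:]
```

Keeping the header zone the same size means the 8-byte prefix and the tensor
data are untouched; only the padding bytes change.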
Addresses three findings from the 2026-05-11 training-pipeline audit:
#1/#2 — `wifi-densepose-signal` was a phantom dependency of `wifi-densepose-train`
(listed in Cargo.toml, never imported), and vitals/CSI signal features were
absent from the pipeline. New module `wifi_densepose_train::signal_features`:
`extract_signal_features(&Array4<f32>, &Array4<f32>) -> Array1<f32>` (and the
convenience method `CsiSample::signal_features()`) runs a windowed observation's
centre frame through `wifi_densepose_signal::features::FeatureExtractor`,
producing a fixed-length (FEATURE_LEN=12) amplitude / phase-coherence / PSD
feature vector — the hook for a future vitals / multi-task supervision head
(breathing- and heart-rate-band power are read off the PSD summary). The vector
is produced on demand and is not yet fed back into the loss; wiring it as a
training target is the documented follow-up. `wifi-densepose-signal` is now an
actually-used dependency. 5 new tests (2 unit in signal_features.rs, 3
integration in tests/test_dataset.rs); existing wifi-densepose-train tests
unchanged and green.
#3 — `docs/huggingface/MODEL_CARD.md` presented PIR/BME280 environmental-sensor
weak-label fine-tuning as a current capability; there is no env-sensor
ingestion in the training pipeline. Marked that path as planned/not-implemented
in the training-steps list and the data-provenance section.
(#5 — README's "92.9% PCK@20" overclaim — fixed separately in PR #535.)
CHANGELOG updated.
- publish-huggingface.sh: retrieves HF token from GCloud Secrets,
  uploads models to ruvnet/wifi-densepose-pretrained
- publish-huggingface.py: Python alternative with --dry-run support
- docs/huggingface/MODEL_CARD.md: beginner-friendly model card with
  WiFi sensing explanation, quick start code, hardware BOM, and citation
GCloud Secret: HUGGINGFACE_API_KEY in project cognitum-20260110
Co-Authored-By: claude-flow <ruv@ruv.net>