Commit Graph

3 Commits

Author SHA1 Message Date
lockewerks 67d186549a docs(huggingface): document safetensors header padding bug + workaround
The model.safetensors file currently published at
huggingface.co/ruvnet/wifi-densepose-pretrained has a malformed header:
the 8-byte u64 declares 1464 header bytes, the JSON document ends at
byte 1461, and the last 3 bytes of the header zone are literal 0x00
padding instead of the spec-required 0x20 spaces. Strict safetensors
readers (the Rust safetensors crate, Candle, safetensors.torch.load_file)
reject the file with
'SafetensorError: trailing characters at line 1 column 1462'.
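
The layout described above can be checked by hand. A minimal Python sketch, run against a synthetic blob rather than the published file (the helper names `build_blob` and `inspect_header` and the single-tensor header are illustrative assumptions, not code from this repository):

```python
import json
import struct


def build_blob(pad_byte: bytes, pad_len: int = 3) -> bytes:
    """Build a tiny synthetic safetensors-style blob with a chosen padding byte."""
    header_json = json.dumps(
        {"w": {"dtype": "F32", "shape": [1], "data_offsets": [0, 4]}},
        separators=(",", ":"),
    ).encode("utf-8")
    header = header_json + pad_byte * pad_len
    # Layout: 8-byte little-endian u64 length, the header zone, then tensor bytes.
    return struct.pack("<Q", len(header)) + header + b"\x00" * 4


def inspect_header(blob: bytes) -> dict:
    """Report the declared header length, where the JSON ends, and the padding."""
    (declared,) = struct.unpack("<Q", blob[:8])
    header = blob[8 : 8 + declared]
    text = header.decode("utf-8")
    # raw_decode parses one JSON value and returns the index where it stopped.
    _, end_chars = json.JSONDecoder().raw_decode(text)
    json_end = len(text[:end_chars].encode("utf-8"))  # convert chars to bytes
    padding = header[json_end:]
    return {
        "declared_len": declared,
        "json_end": json_end,
        "padding": padding,
        "spec_compliant": all(b == 0x20 for b in padding),
    }


bad = inspect_header(build_blob(b"\x00"))
good = inspect_header(build_blob(b" "))
print(bad["padding"], bad["spec_compliant"])
print(good["padding"], good["spec_compliant"])
```

For a file with the defect described here, `padding` would hold three 0x00 bytes and `spec_compliant` would be false.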

This commit:
- adds docs/huggingface/SAFETENSORS-HEADER-BUG.md, containing:
  - byte-level evidence and a spec citation
  - the source-of-bug location (the SafeTensorsWriter in
    vendor/ruvector/.../export.js, which lives in the separate
    ruvnet/ruvector repo)
  - a list of three trainer scripts that go through this path
    (train-wiflow.js, train-ruvllm.js, train-camera-free.js)
  - a table of affected vs lenient consumers
  - a 10-line strict-reader repro that reproduces the exact error class
    against a synthetic file
  - a proposed upstream fix (0x20 padding or no padding)
  - a follow-ups checklist, including the need to re-train/re-export and
    re-upload the HF artifact
- flags the bundle as needing republish under [Unreleased] in CHANGELOG.md
- updates the HF model section of docs/user-guide.md so the load example
  now patches the header with scripts/fix-safetensors-header.py before
  calling safetensors.torch.load_file (which would otherwise crash on
  the current bundle), and flips the Python/PyTorch row of the
  consumer-status table from 'Works' to 'Broken header — strict readers
  reject; patch with scripts/fix-safetensors-header.py'
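
The patch step amounts to a length-preserving rewrite of the header padding. The following is a hypothetical sketch of that idea on a synthetic blob; it is not the contents of scripts/fix-safetensors-header.py, and the function name `patch_header_padding` is an assumption:

```python
import json
import struct


def patch_header_padding(blob: bytes) -> bytes:
    """Replace trailing NUL padding in the header zone with 0x20 spaces."""
    (declared,) = struct.unpack("<Q", blob[:8])
    header = blob[8 : 8 + declared]
    # Drop trailing NULs/spaces after the JSON document, then re-pad with spaces
    # so the header zone keeps exactly the declared length.
    body = header.rstrip(b"\x00 ")
    fixed = body + b" " * (len(header) - len(body))
    return blob[:8] + fixed + blob[8 + declared :]


# Synthetic example: a one-tensor header followed by three NUL padding bytes.
header = b'{"w":{"dtype":"F32","shape":[1],"data_offsets":[0,4]}}' + b"\x00" * 3
blob = struct.pack("<Q", len(header)) + header + b"\x01\x02\x03\x04"

patched = patch_header_padding(blob)
parsed = json.loads(patched[8 : 8 + len(header)])  # trailing spaces are valid JSON whitespace
print(parsed["w"]["shape"])
```

Because the declared length and total byte count are unchanged, the tensor data offsets stay valid after patching.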
2026-05-25 17:03:42 -06:00
rUv eaedfded6f
fix(train): wire wifi-densepose-signal into the pipeline; correct MODEL_CARD env-sensor claim (#536)
Addresses three findings from the 2026-05-11 training-pipeline audit:

#1/#2 — `wifi-densepose-signal` was a phantom dependency of `wifi-densepose-train`
(listed in Cargo.toml, never imported), and vitals/CSI signal features were
absent from the pipeline. New module `wifi_densepose_train::signal_features`:
`extract_signal_features(&Array4<f32>, &Array4<f32>) -> Array1<f32>` (and the
convenience method `CsiSample::signal_features()`) runs a windowed observation's
centre frame through `wifi_densepose_signal::features::FeatureExtractor`,
producing a fixed-length (FEATURE_LEN=12) amplitude / phase-coherence / PSD
feature vector — the hook for a future vitals / multi-task supervision head
(breathing- and heart-rate-band power are read off the PSD summary). The vector
is produced on demand and is not yet fed back into the loss; wiring it as a
training target is the documented follow-up. `wifi-densepose-signal` is now an
actually-used dependency. 5 new tests (2 unit in signal_features.rs, 3
integration in tests/test_dataset.rs); existing wifi-densepose-train tests
unchanged and green.
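
To illustrate what reading band power off a PSD means, here is a rough Python sketch. It is not the `FeatureExtractor` implementation or its API; the band edges (0.1 to 0.5 Hz for breathing, 0.8 to 2.0 Hz for heart rate), the 10 Hz sample rate, and the function names are all assumptions chosen for the example:

```python
import cmath
import math


def periodogram(signal, sample_rate_hz):
    """Return (freqs, power) for a one-sided DFT periodogram of a real signal."""
    n = len(signal)
    mean = sum(signal) / n
    centred = [x - mean for x in signal]  # remove DC so it does not dominate
    freqs, power = [], []
    for k in range(n // 2 + 1):
        acc = sum(
            x * cmath.exp(-2j * math.pi * k * i / n) for i, x in enumerate(centred)
        )
        freqs.append(k * sample_rate_hz / n)
        power.append(abs(acc) ** 2 / n)
    return freqs, power


def band_power(freqs, power, lo_hz, hi_hz):
    """Sum the periodogram bins whose frequency falls inside [lo_hz, hi_hz]."""
    return sum(p for f, p in zip(freqs, power) if lo_hz <= f <= hi_hz)


# Synthetic amplitude trace: a 0.25 Hz oscillation sampled at 10 Hz for 20 s.
fs = 10.0
trace = [math.sin(2 * math.pi * 0.25 * i / fs) for i in range(200)]

freqs, power = periodogram(trace, fs)
breathing = band_power(freqs, power, 0.1, 0.5)  # assumed breathing band
heart = band_power(freqs, power, 0.8, 2.0)      # assumed heart-rate band
print(breathing > heart)
```

With a slow oscillation as input, nearly all of the power lands in the assumed breathing band, which is the kind of summary a vitals head could later consume.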

#3 — `docs/huggingface/MODEL_CARD.md` presented PIR/BME280 environmental-sensor
weak-label fine-tuning as a current capability; there is no env-sensor
ingestion in the training pipeline. Marked that path as planned/not-implemented
in the training-steps list and the data-provenance section.

(#5 — README's "92.9% PCK@20" overclaim — fixed separately in PR #535.)

CHANGELOG updated.
2026-05-11 23:40:55 -04:00
ruv 9a2bc1839a feat: HuggingFace model publishing pipeline + model card
- publish-huggingface.sh: retrieves HF token from GCloud Secrets,
  uploads models to ruvnet/wifi-densepose-pretrained
- publish-huggingface.py: Python alternative with --dry-run support
- docs/huggingface/MODEL_CARD.md: beginner-friendly model card with
  WiFi sensing explanation, quick start code, hardware BOM, and citation

GCloud Secret: HUGGINGFACE_API_KEY in project cognitum-20260110

Co-Authored-By: claude-flow <ruv@ruv.net>
2026-04-02 22:04:16 -04:00