Pose Estimation Cog
17-keypoint COCO pose estimation from WiFi CSI, deployed as a Cognitum Cog.
What it does
Subscribes to the local sensing-server's CSI stream, runs each window through a contrastive encoder (initialised from ruvnet/wifi-densepose-pretrained) and a 17-keypoint regression head, and emits one pose.frame event per inferred window on stdout. The appliance's cog-gateway picks up those events and routes them to the dashboard.
Inputs
- [56 subcarriers × 20 frames] CSI windows (matches the [56, 20] shape produced by scripts/align-ground-truth.js).
- Sensing-server frame poll URL configured via config.json (sensing_url, default loopback).
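As a rough illustration of the input contract, the sketch below validates that a flat buffer holds exactly 56 × 20 values before it is treated as a CSI window. The `CsiWindow` type, its row-major [subcarrier][frame] layout, and the method names are assumptions made for this example, not the cog's actual types.

```rust
/// Window dimensions taken from the README: 56 subcarriers × 20 frames.
const SUBCARRIERS: usize = 56;
const FRAMES: usize = 20;

/// Hypothetical container for one CSI window (not the cog's real type).
/// Assumes row-major storage: index = subcarrier * FRAMES + frame.
#[derive(Debug, Clone, PartialEq)]
pub struct CsiWindow {
    data: Vec<f32>,
}

impl CsiWindow {
    /// Builds a window from a flat buffer, rejecting any length other than 56 * 20.
    pub fn from_flat(data: Vec<f32>) -> Result<Self, String> {
        let expected = SUBCARRIERS * FRAMES;
        if data.len() != expected {
            return Err(format!("expected {} values, got {}", expected, data.len()));
        }
        Ok(Self { data })
    }

    /// Returns the value at (subcarrier, frame), or None if either index is out of range.
    pub fn get(&self, subcarrier: usize, frame: usize) -> Option<f32> {
        if subcarrier >= SUBCARRIERS || frame >= FRAMES {
            return None;
        }
        Some(self.data[subcarrier * FRAMES + frame])
    }
}
```

Rejecting malformed windows at the boundary keeps shape errors out of the forward pass, where they would be harder to diagnose.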
Outputs
{"ts": 1779210883.444, "level": "info", "event": "pose.frame",
"fields": {
"tick": 12345,
"n_persons": 1,
"persons": [{"keypoints": [[0.48, 0.31], ...], "confidence": 0.81}]
}}
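A minimal sketch of how an event with this shape could be serialised as one line for stdout, using only the standard library. The `Person` struct and `pose_frame_json` function are hypothetical names for illustration; the real cog may well use a JSON library instead of hand-formatting.

```rust
/// One detected person: 17 (x, y) keypoints plus a scalar confidence.
/// Hypothetical type for this example only.
pub struct Person {
    pub keypoints: [[f32; 2]; 17],
    pub confidence: f32,
}

/// Formats a `pose.frame` event as a single JSON line matching the field
/// names shown above. Assumes all floats are finite (NaN would not be valid JSON).
pub fn pose_frame_json(ts: f64, tick: u64, persons: &[Person]) -> String {
    let persons_json: Vec<String> = persons
        .iter()
        .map(|p| {
            // Each keypoint becomes a two-element JSON array.
            let kps: Vec<String> = p
                .keypoints
                .iter()
                .map(|[x, y]| format!("[{}, {}]", x, y))
                .collect();
            format!(
                "{{\"keypoints\": [{}], \"confidence\": {}}}",
                kps.join(", "),
                p.confidence
            )
        })
        .collect();
    format!(
        "{{\"ts\": {}, \"level\": \"info\", \"event\": \"pose.frame\", \"fields\": {{\"tick\": {}, \"n_persons\": {}, \"persons\": [{}]}}}}",
        ts,
        tick,
        persons.len(),
        persons_json.join(", ")
    )
}
```

Deriving `n_persons` from the slice length means the count cannot drift from the `persons` array.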
Status — v0.0.1
Pipeline scaffold + a first-cut trained model. The model is stored at cog/artifacts/pose_v1.safetensors (507 KB) and trained from data/paired/wiflow-p7-1779210883.paired.jsonl (1,077 samples, avg conf 0.44) using candle-core 0.9 on an RTX 5080 — see the full training-result dump at cog/artifacts/train_results.json.
Measured accuracy (validation set, 217 held-out samples)
Overall: PCK@20 = 3.0%   PCK@50 = 18.5%   MPJPE (normalized) = 0.0931

Per-joint    PCK@20  PCK@50    Per-joint  PCK@20  PCK@50
───────────  ──────  ──────    ─────────  ──────  ──────
nose           0.5%    5.1%    l_hip        0.0%   27.3%
l_eye          2.8%    8.3%    r_hip       25.0%   76.9%  ← strongest signal
r_eye          1.9%   15.7%    l_knee       2.3%   20.8%
l_ear          0.0%    3.2%    r_knee       0.9%   35.2%
r_ear          1.9%    9.7%    l_ankle      1.4%    7.9%
l_shoulder     4.6%    8.8%    r_ankle      0.9%    9.3%
r_shoulder     1.9%   19.9%    l_elbow      1.9%   26.4%
l_wrist        3.2%   24.1%    r_elbow      0.0%    4.2%
r_wrist        1.4%   12.0%
Loss curve: 0.181 (epoch 0) → 0.014 (epoch 399), eval loss 0.010. 400 epochs in 2.1 s on the RTX 5080 (~5 ms/epoch full-batch).
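For readers unfamiliar with the metrics above, here is a hedged sketch of how PCK and MPJPE are commonly computed. The README does not state how the PCK threshold is normalised, so this example assumes keypoints in normalised coordinates and takes the distance threshold as an explicit parameter (for instance 0.20 for PCK@20); the function names are illustrative, not taken from the training script.

```rust
/// Euclidean distance between two 2-D points.
fn dist(a: [f32; 2], b: [f32; 2]) -> f32 {
    ((a[0] - b[0]).powi(2) + (a[1] - b[1]).powi(2)).sqrt()
}

/// PCK: fraction of joints whose prediction lies within `threshold` of the
/// ground truth. The threshold's unit is an assumption (normalised coordinates).
pub fn pck(pred: &[[f32; 2]], truth: &[[f32; 2]], threshold: f32) -> f32 {
    assert_eq!(pred.len(), truth.len(), "joint counts must match");
    if pred.is_empty() {
        return 0.0;
    }
    let hits = pred
        .iter()
        .zip(truth)
        .filter(|(p, t)| dist(**p, **t) <= threshold)
        .count();
    hits as f32 / pred.len() as f32
}

/// MPJPE: mean Euclidean error over all joints, in the same units as the inputs.
pub fn mpjpe(pred: &[[f32; 2]], truth: &[[f32; 2]]) -> f32 {
    assert_eq!(pred.len(), truth.len(), "joint counts must match");
    if pred.is_empty() {
        return 0.0;
    }
    let total: f32 = pred.iter().zip(truth).map(|(p, t)| dist(*p, *t)).sum();
    total / pred.len() as f32
}
```

Because PCK is a thresholded count and MPJPE is an average, a model can have a small mean error and still score poorly at a tight threshold, which is consistent with the gap between the PCK@20 and PCK@50 columns.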
Honest reading
- The model learns coarse body structure: r_hip 77% PCK@50, r_knee 35%, and l_elbow 26% all show real signal. PCK@50 = 18.5% averaged across joints is well above the random-baseline 0% that the pure-JS SPSA training produced.
- It is below the ADR-079 target of PCK@20 ≥ 35%. The bottleneck is data quality and quantity, not infra. The single 30-min seated-at-desk recording produced 1,077 paired samples at avg confidence 0.44; the strong asymmetry between the left and right sides (r_hip 77% vs l_hip 27%) reflects the camera framing more than any model defect.
- Distal joints (wrists, ankles) and face joints are still near-random: 56-subcarrier CSI at our 20-frame window doesn't carry enough fine-grained spatial information.
Next-iteration plan (tracked in #645)
- Multi-session, multi-room recordings with full-body framing (target ≥ 30K paired samples at conf ≥ 0.7).
- Re-train with the same Candle pipeline (already validated to converge in seconds on RTX 5080).
- Hailo HEF export via the Dataflow Compiler on a self-hosted runner.
The cog's runtime inference path now loads pose_v1.safetensors directly through Candle in src/inference.rs — see InferenceEngine::with_weights and the weights_load_and_forward_produces_seventeen_keypoint_pairs test in the same file. The forward pass mirrors the training script (Conv1d 56→64→128→128 encoder with dilations [1, 2, 4], GlobalMeanPool, Linear 128→256→34, sigmoid) and emits [17, 2] keypoints with the published confidence = 0.185 (PCK@50). If the safetensors file is missing on disk, the engine logs a tracing::warn! and falls back to the centred-skeleton stub (confidence = 0) so the runtime contract is preserved and the dashboard surfaces "no model yet" instead of crashing. The 3% PCK@20 / 18.5% PCK@50 numbers above remain the right way to read this model — wiring the weights does not improve accuracy, only replaces the placeholder output with the trained values.
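The fallback behaviour and output shaping described above can be sketched without Candle as follows. This is an illustration under stated assumptions, not the code in src/inference.rs: `Mode`, `select_mode`, `to_keypoints`, and `confidence` are hypothetical names, `eprintln!` stands in for `tracing::warn!` to keep the example std-only, and the interleaved (x0, y0, x1, y1, ...) ordering of the 34 head outputs is assumed.

```rust
use std::path::Path;

/// Hypothetical stand-in for the engine's two operating modes.
#[derive(Debug, PartialEq)]
pub enum Mode {
    Trained,
    Stub,
}

/// Chooses the trained path only when the weights file exists on disk;
/// otherwise warns and falls back to the stub.
pub fn select_mode(weights: &Path) -> Mode {
    if weights.is_file() {
        Mode::Trained
    } else {
        // The README says the real engine logs via tracing::warn! here.
        eprintln!("warn: {} not found, using stub", weights.display());
        Mode::Stub
    }
}

/// Confidence reported per mode: the published PCK@50 value, or 0 for the stub.
pub fn confidence(mode: &Mode) -> f32 {
    match mode {
        Mode::Trained => 0.185,
        Mode::Stub => 0.0,
    }
}

fn sigmoid(x: f32) -> f32 {
    1.0 / (1.0 + (-x).exp())
}

/// Applies a sigmoid to the 34 raw head outputs and groups them into
/// 17 (x, y) pairs. Interleaved ordering is an assumption of this sketch.
pub fn to_keypoints(logits: &[f32; 34]) -> [[f32; 2]; 17] {
    let mut out = [[0.0f32; 2]; 17];
    for (i, kp) in out.iter_mut().enumerate() {
        kp[0] = sigmoid(logits[2 * i]);
        kp[1] = sigmoid(logits[2 * i + 1]);
    }
    out
}
```

Keeping the stub's confidence at 0 gives downstream consumers a simple way to distinguish placeholder output from trained output without changing the event schema.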
See also
- ADR-100: Cognitum Cog Packaging Specification.
- ADR-101: Pose Estimation Cog (the design behind this directory).
- ADR-079: Camera-supervised pose training pipeline.
- v0-appliance companion crate: cognitum-pose-estimation (Hailo HEF runtime).