wifi-densepose/v2/crates/cog-pose-estimation/cog
rUv fcb6f4bf12
feat(cog-pose-estimation): x86_64 release v0.0.1 — parallel to arm (#643)
Adds the x86_64-unknown-linux-gnu binary uploaded to
gs://cognitum-apps/cogs/x86_64/, signed with the same Ed25519
COGNITUM_OWNER_SIGNING_KEY as the arm release. Together with the
already-shipped arm artifact, the cog now ships natively for both
target architectures the Cognitum fleet supports.

x86_64 release:
  sha256:    a434739a24415b34e1aff50e5e1c3c32e568db96af473bbb3e5ecc9b95fe71fa
  signature: pNNuxhgM18PztN8BSZdfw5oAShG2pV3na5T/q2QdlJWX/5FJgo4QTiUCbcTAxI2Uiva8VURSOlRzMU3xoQPqCQ==
  size:      4,548,856 bytes
  cold-start: 5.4 ms / invocation on ruvultra (RTX 5080, NVMe)

Reorganizes manifests under cog/artifacts/manifests/{arm,x86_64}/
so each arch carries its own manifest with the matching binary_sha256
and signature — same layout the release pipeline will use for the
future hailo8 / hailo10 variants.

Updates docs/benchmarks/pose-estimation-cog.md with the cross-arch
cold-start table:

  Windows (x86_64)    76.2 ms
  ruvultra (x86_64)    5.4 ms   <- this release
  Pi 5 (aarch64)       8.4 ms

Verified via anonymous GCS download + SHA round-trip — identical to
local build.

Hailo HEF remains the only pending arch, still blocked on Hailo SDK
provisioning to a self-hosted runner.
2026-05-19 17:08:23 -04:00
artifacts feat(cog-pose-estimation): x86_64 release v0.0.1 — parallel to arm (#643) 2026-05-19 17:08:23 -04:00
Makefile feat(cog-pose-estimation): scaffold first Cog from this repo (ADR-100 + ADR-101) (#642) 2026-05-19 17:03:09 -04:00
README.md feat(cog-pose-estimation): scaffold first Cog from this repo (ADR-100 + ADR-101) (#642) 2026-05-19 17:03:09 -04:00
config.schema.json feat(cog-pose-estimation): scaffold first Cog from this repo (ADR-100 + ADR-101) (#642) 2026-05-19 17:03:09 -04:00
manifest.template.json feat(cog-pose-estimation): scaffold first Cog from this repo (ADR-100 + ADR-101) (#642) 2026-05-19 17:03:09 -04:00

README.md

Pose Estimation Cog

17-keypoint COCO pose estimation from WiFi CSI, deployed as a Cognitum Cog.

What it does

Subscribes to the local sensing-server's CSI stream, runs each window through a contrastive encoder (initialised from ruvnet/wifi-densepose-pretrained) and a 17-keypoint regression head, and emits one pose.frame event per inferred window on stdout. The appliance's cog-gateway picks up those events and routes them to the dashboard.

Inputs

  • [56 subcarriers × 20 frames] CSI windows (matches the [56, 20] shape produced by scripts/align-ground-truth.js).
  • Sensing-server frame poll URL configured via config.json (sensing_url, default loopback).
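A minimal sketch of the shape check implied by the input contract above, written in Python for brevity (the cog itself is a Rust crate). The function name `is_valid_csi_window` and the nested-list representation are assumptions for illustration; only the 56 × 20 dimensions come from this README.

```python
import math
from typing import Sequence

N_SUBCARRIERS = 56  # rows, per the Inputs section
N_FRAMES = 20       # columns, per the Inputs section


def is_valid_csi_window(window: Sequence[Sequence[float]]) -> bool:
    """Return True if `window` is a [56, 20] grid of finite numbers.

    Hypothetical helper: the real cog may represent windows differently.
    """
    if len(window) != N_SUBCARRIERS:
        return False
    for row in window:
        if len(row) != N_FRAMES:
            return False
        for value in row:
            # Reject NaN/inf and non-numeric entries.
            if not isinstance(value, (int, float)) or not math.isfinite(value):
                return False
    return True


if __name__ == "__main__":
    window = [[0.0] * N_FRAMES for _ in range(N_SUBCARRIERS)]
    print(is_valid_csi_window(window))
```

Rejecting malformed windows before inference keeps a short or corrupted poll response from reaching the encoder.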

Outputs

{"ts": 1779210883.444, "level": "info", "event": "pose.frame",
 "fields": {
   "tick": 12345,
   "n_persons": 1,
   "persons": [{"keypoints": [[0.48, 0.31], ...], "confidence": 0.81}]
 }}
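As a rough illustration of how a consumer might read these events, the sketch below parses one stdout line and pulls out the tick and the person list. It is written in Python for brevity and is not the cog-gateway's actual code; `parse_pose_frame` is a hypothetical name, and the sample keypoint values are placeholders rather than real model output.

```python
import json
from typing import List, Optional, Tuple


def parse_pose_frame(line: str) -> Optional[Tuple[int, List[dict]]]:
    """Return (tick, persons) for a pose.frame line, or None otherwise.

    Hypothetical helper; field names follow the sample event above.
    """
    try:
        record = json.loads(line)
    except json.JSONDecodeError:
        return None  # not a JSON log line
    if not isinstance(record, dict) or record.get("event") != "pose.frame":
        return None  # some other event type
    fields = record.get("fields")
    if not isinstance(fields, dict):
        return None
    return fields.get("tick"), fields.get("persons", [])


if __name__ == "__main__":
    # Placeholder event: 17 identical keypoints, not real output.
    sample = json.dumps({
        "ts": 1779210883.444, "level": "info", "event": "pose.frame",
        "fields": {
            "tick": 12345,
            "n_persons": 1,
            "persons": [{"keypoints": [[0.5, 0.5]] * 17, "confidence": 0.81}],
        },
    })
    tick, persons = parse_pose_frame(sample)
    print(tick, len(persons), len(persons[0]["keypoints"]))
```

Returning `None` for anything that is not a `pose.frame` record lets the same reader skip other log lines the cog may write to stdout.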

Status — v0.0.1

Pipeline scaffold + a first-cut trained model. The model is stored at cog/artifacts/pose_v1.safetensors (507 KB) and trained from data/paired/wiflow-p7-1779210883.paired.jsonl (1,077 samples, avg conf 0.44) using candle-core 0.9 on an RTX 5080 — see the full training-result dump at cog/artifacts/train_results.json.

Measured accuracy (validation set, 217 held-out samples)

                Overall:   PCK@20 = 3.0%   PCK@50 = 18.5%   MPJPE (normalized) = 0.0931

   Per-joint    PCK@20   PCK@50      Per-joint   PCK@20   PCK@50
   ─────────   ──────   ──────      ─────────   ──────   ──────
   nose          0.5%     5.1%      l_hip         0.0%    27.3%
   l_eye         2.8%     8.3%      r_hip        25.0%    76.9%   ← strongest signal
   r_eye         1.9%    15.7%      l_knee        2.3%    20.8%
   l_ear         0.0%     3.2%      r_knee        0.9%    35.2%
   r_ear         1.9%     9.7%      l_ankle       1.4%     7.9%
   l_shoulder    4.6%     8.8%      r_ankle       0.9%     9.3%
   r_shoulder    1.9%    19.9%      l_elbow       1.9%    26.4%
   l_wrist       3.2%    24.1%      r_elbow       0.0%     4.2%
   r_wrist       1.4%    12.0%
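For readers unfamiliar with the metrics, here is a hedged Python sketch of how PCK and MPJPE are commonly computed. This README does not state how the threshold is normalized, so the `threshold` and `scale` parameters are assumptions; the actual evaluation code behind train_results.json may differ.

```python
import math
from typing import Sequence, Tuple

Point = Tuple[float, float]
Pose = Sequence[Point]


def pck(pred: Sequence[Pose], truth: Sequence[Pose],
        threshold: float, scale: float = 1.0) -> float:
    """Fraction of keypoints whose error is <= threshold * scale.

    `scale` is an assumed reference length (e.g. torso or box size).
    """
    hits = total = 0
    for p_pose, t_pose in zip(pred, truth):
        for (px, py), (tx, ty) in zip(p_pose, t_pose):
            total += 1
            if math.hypot(px - tx, py - ty) <= threshold * scale:
                hits += 1
    return hits / total if total else 0.0


def mpjpe(pred: Sequence[Pose], truth: Sequence[Pose]) -> float:
    """Mean per-joint position error: average Euclidean distance."""
    dists = [math.hypot(px - tx, py - ty)
             for p_pose, t_pose in zip(pred, truth)
             for (px, py), (tx, ty) in zip(p_pose, t_pose)]
    return sum(dists) / len(dists) if dists else 0.0


if __name__ == "__main__":
    # Toy two-joint pose: one joint is close, one is far off.
    pred = [[(0.0, 0.0), (1.0, 0.0)]]
    truth = [[(0.0, 0.1), (0.0, 0.0)]]
    print(pck(pred, truth, threshold=0.2), mpjpe(pred, truth))
```

When every joint is evaluated on the same number of samples, the overall PCK is the mean of the per-joint values, which is consistent with the 3.0% and 18.5% overall figures above.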

Loss curve: 0.181 (epoch 0) → 0.014 (epoch 399), eval loss 0.010. 400 epochs in 2.1 s on the RTX 5080 (~5 ms/epoch full-batch).

Honest reading

  • The model learns coarse body structure: r_hip 77% PCK@50, r_knee 35%, and l_elbow 26% all show real signal. PCK@50 = 18.5% averaged across joints is well above the random-baseline 0% that the pure-JS SPSA training produced.
  • It is below the ADR-079 target of PCK@20 ≥ 35%. The bottleneck is data quality and quantity, not infra. The single 30-min seated-at-desk recording produced 1,077 paired samples at avg confidence 0.44; the strong left/right asymmetry (r_hip 77% vs l_hip 27%) reflects the camera framing more than any model defect.
  • Distal joints (wrists, ankles) and face joints are still near-random: 56-subcarrier CSI at our 20-frame window doesn't carry enough fine-grained spatial information.

Next-iteration plan (tracked in #640)

  • Multi-session, multi-room recordings with full-body framing (target ≥ 30K paired samples at conf ≥ 0.7).
  • Re-train with the same Candle pipeline (already validated to converge in seconds on RTX 5080).
  • Hailo HEF export via the Dataflow Compiler on a self-hosted runner.
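The confidence target in the first item could be applied as a simple filter over the paired JSONL data. The Python sketch below is illustrative only: the key name `conf` is hypothetical (this README does not show the schema of the .paired.jsonl files), and `filter_confident` is not a function from this repository.

```python
import json
from typing import Iterable, Iterator


def filter_confident(lines: Iterable[str], min_conf: float = 0.7,
                     conf_key: str = "conf") -> Iterator[dict]:
    """Yield samples whose confidence is at least `min_conf`.

    `conf_key` is an assumed field name; adjust it to the real schema.
    Samples without the key are treated as confidence 0.0 and dropped.
    """
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        sample = json.loads(line)
        if float(sample.get(conf_key, 0.0)) >= min_conf:
            yield sample


if __name__ == "__main__":
    # Toy records standing in for paired samples.
    lines = ['{"conf": 0.44}', '{"conf": 0.81}', '', '{"conf": 0.7}']
    kept = list(filter_confident(lines))
    print(len(kept))
```

Counting the samples that survive the filter gives a direct check against the ≥ 30K target before a re-training run.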

The cog's runtime inference path is currently a centred-skeleton stub returning confidence=0. Wiring the pose_v1.safetensors weights into src/inference.rs is the next code change — separate PR.

See also

  • ADR-100: Cognitum Cog Packaging Specification.
  • ADR-101: Pose Estimation Cog (the design behind this directory).
  • ADR-079: Camera-supervised pose training pipeline.
  • v0-appliance companion crate: cognitum-pose-estimation (Hailo HEF runtime).