feat(cog-pose-estimation): scaffold first Cog from this repo (ADR-100 + ADR-101) (#642)
* feat(cog-pose-estimation): scaffold first Cog from this repo (ADR-100 + ADR-101)

  Adds the foundation for the pose-estimation Cog that ships from this repo into Cognitum V0 appliances. Companion ADR-225 + crate land in cognitum-one/v0-appliance.

  ADRs:
  * ADR-100 formalises the Cognitum Cog packaging spec: on-device layout under /var/lib/cognitum/apps/<id>/, manifest.json schema (incl. new binary_sha256 + binary_signature fields), GCS hosting convention, repo source layout, build pipeline, and the four-verb runtime contract (version | manifest | health | run). Documents the convention I reverse-engineered from inspecting installed cogs on a live cognitum-v0 appliance (`anomaly-detect`, `presence`, `seizure-detect`, etc.).
  * ADR-101 designs the pose-estimation Cog itself: where it sits in the wifi-densepose pipeline (encoder init from ruvnet/wifi-densepose-pretrained, 17-keypoint regression head), what gets shipped per target arch (arm / x86_64 / hailo8 / hailo10), and the acceptance gates (PCK@20 explicitly deferred to #640; this ADR ships the vehicle, not the accuracy).

  Crate v2/crates/cog-pose-estimation/:
  * Cargo.toml + workspace member declaration with a hailo feature gate so the binary builds without the Hailo SDK in CI.
  * main.rs implements the four-verb CLI exactly per ADR-100.
  * config.rs / manifest.rs / publisher.rs / inference.rs / runtime.rs: small modules, each <100 lines.
  * publisher.rs emits ADR-100 structured JSON events.
  * inference.rs is a stub that produces a centred-skeleton baseline with confidence=0 (honest: no trained weights wired in yet).
  * runtime.rs subscribes to /api/v1/sensing/latest, slides a 56*20 window, runs the engine, emits pose.frame events.
  * cog/manifest.template.json + cog/config.schema.json define the release artifact + runtime config schemas.
  * cog/Makefile holds build / sign / upload targets.
  * tests/smoke.rs covers manifest roundtrip + engine I/O surface.

  Verified locally:
  * cargo check -p cog-pose-estimation: clean.
  * cargo test -p cog-pose-estimation: 4/4 pass.
  * ./target/release/cog-pose-estimation {version,manifest,health}: all emit the right contract output.

  This commit contains scaffolding only; the actual trained weights and Hailo HEF cross-compile come in follow-ups tracked in #640 and the companion v0-appliance branch.

* feat(cog-pose-estimation): first measured run - Candle CUDA on RTX 5080

  Trained pose_v1 on ruvultra (RTX 5080) via Candle 0.9 + cuda feature against the same 1,077-sample paired session that produced 0%/0% PCK in #640 with the pure-JS SPSA trainer. First real numbers:

    PCK@20 = 3.0%  (up from 0.0%)
    PCK@50 = 18.5% (up from 0.0%)
    MPJPE  = 0.093 (down from 0.66, ~7x improvement)

  400 epochs in 2.1 s wall time, full-batch, ~5 ms/epoch. Loss curve 0.181 -> 0.014 over the run, eval 0.010.

  Per-joint reveals the model leans on right-side proximal joints (r_hip 77% PCK@50, r_knee 35%, l_elbow 26%), consistent with the camera framing in the source recording. Distal joints (wrists, ankles) and face joints are still near-random, consistent with the 56-subcarrier / 20-frame input not carrying fine-grained spatial info at 1077 samples.

  This commit:
  * Adds v2/crates/cog-pose-estimation/cog/artifacts/{pose_v1.safetensors, train_results.json} so the cog dir now contains a real reference artifact, not just scaffold.
  * Updates cog/README.md "Status" block with the measured numbers, per-joint table, and an honest reading of where the model succeeds vs where the data is the bottleneck.
  * Adds docs/benchmarks/pose-estimation-cog.md as the canonical benchmark log: append-only, one section per published run.
  * Appends a "First measured run" section to ADR-101 referencing the new benchmark file.

  Still pending in the follow-up:
  * Wire pose_v1.safetensors into src/inference.rs (replace stub).
  * ONNX export (Candle lacks a writer; needs external conversion).
  * Hailo HEF cross-compile + cluster deploy.

  The data-bound gap to PCK@20 >= 35% is tracked in #640.
* feat(cog-pose-estimation): wire real weights - cog is no longer a stub

  Replaces the centred-skeleton stub in src/inference.rs with a real Candle-based loader that reads cog/artifacts/pose_v1.safetensors and runs the trained Conv1d encoder + MLP pose head on every incoming CSI window.

  What changes:
  * src/inference.rs: PoseNet mirrors the training script's architecture exactly: Conv1d(56->64, k=3 d=1), Conv1d(64->128, k=3 d=2), Conv1d(128->128, k=3 d=4), mean over time, Linear(128->256)+ReLU, Linear(256->34)+sigmoid -> reshape [17, 2]. The InferenceEngine searches a sensible candidate list for the weights file (/var/lib/cognitum/apps/pose-estimation/, ./pose_v1.safetensors, ./cog/artifacts/, repo-root, v2/-relative) and falls back to the stub when none are present so the cog still satisfies ADR-100.
  * Cargo.toml: adds candle-core 0.9 + candle-nn 0.9 (no-default-features, CPU build by default) + safetensors 0.4. New `cuda` feature opt-in for GPU inference on hosts that have it. Drops the unused wifi-densepose-train path dep from the default build path.
  * src/main.rs + src/publisher.rs: health.ok event now carries `backend` (candle-cuda | candle-cpu | stub) and the synthetic output confidence, so operators can tell at a glance whether the cog loaded its weights or fell back to the stub.
  * tests/smoke.rs: adds `real_weights_load_when_available` which asserts the loaded engine reports backend=candle-* and emits non-zero confidence, exactly the signal that proves we're not silently degrading to the stub.

  Verified locally:
  * `cargo check -p cog-pose-estimation --no-default-features`: clean
  * `cargo test -p cog-pose-estimation --no-default-features`: 5/5 pass
  * `./target/release/cog-pose-estimation health` emits:
    {"event":"health.ok","fields":{"backend":"candle-cpu","cog":"pose-estimation","synthetic_output_confidence":0.185}}
    0.185 is the published PCK@50 from cog/artifacts/train_results.json, emitted by the real Candle inference path (would be 0.0 if it had fallen back to the stub).

  The cog now runs the trained pose_v1 model end-to-end. Accuracy is still bounded by the underlying 1077-sample training data (PCK@20 3.0%, PCK@50 18.5% per docs/benchmarks/pose-estimation-cog.md); that gap is data-bound and tracked in #640. ONNX export + Hailo HEF cross-compile remain follow-ups.

* docs(benchmarks): measure cog-pose-estimation cold-start latency

  100 sequential `cog-pose-estimation health` invocations average 76.2 ms each on a Windows x86_64 host using the `candle-cpu` backend. Each invocation re-loads pose_v1.safetensors and runs one synthetic forward pass, so this is the worst-case cold-start path. Long-running `run` inference will be sub-millisecond per frame once the model is loaded. Updates the benchmarks doc accordingly.

* feat(cog-pose-estimation): ONNX export - pose_v1.onnx + scripts/export-onnx.py

  Adds the canonical ONNX artifact that unblocks downstream Hailo HEF cross-compile + ONNX Runtime benchmarks. Generated on ruvultra (torch 2.12.0 + CUDA), 12,059 bytes, opset 18, dynamic batch axis.

  * scripts/export-onnx.py: mirrors the Candle inference architecture in PyTorch (Conv1d 56->64, 64->128, 128->128 + Linear 128->256->34), pure-python safetensors loader (no extra pip dep), exports via torch.onnx.export, then verifies via onnx.checker.check_model and numerical parity against the torch reference.
  * Verified parity vs torch: max |torch - onnx| = 8.94e-8 (1e-5 threshold). Effectively bit-perfect.
  * v2/crates/cog-pose-estimation/cog/artifacts/pose_v1.onnx: the artifact itself, 12 KB.
  * docs/benchmarks/pose-estimation-cog.md: adds an ONNX export section with the verification numbers.

  Next: Hailo HEF cross-compile (still gated on Hailo SDK on a self-hosted runner) and ONNX Runtime latency benchmarks on each target arch.

* feat(cog-pose-estimation): release v0.0.1 - signed aarch64 binary on GCS

  End-to-end deploy: cross-compiled to aarch64-unknown-linux-gnu on ruvultra, ran via qemu-aarch64-static, then smoke-tested on a real cognitum-v0 Pi 5. Signed with COGNITUM_OWNER_SIGNING_KEY (Ed25519) and uploaded to gs://cognitum-apps/cogs/arm/.

  Real-hardware results on cognitum-v0 (Pi 5):
    health: backend=candle-cpu, confidence=0.185, real weights loaded
    30x sequential `health`: 0.251 s total -> 8.4 ms / invocation (cold)

  GCS release artifacts (publicly downloadable):
    binary:  3,741,976 bytes
             sha256 1e1a7d3dd01ca05d5bfc5dbb142a5941b7866ed9f3224a21edc04d3f09a99bf5
    weights: 507,032 bytes
             sha256 eb249b9a6b2e10130437a10976ed0230b0d085f86a0553d7226e1ae6eae4b9e5
    signature (Ed25519, b64):
             LUN7xqLPYD3MFzm5dKB5MnYU0LvoRtek5ci5KiKPHBg+Xo6xuazwokn2Dw2JPMaLYJzmWn/SpT4djuR7hYvVDw==

  Adds:
  * v2/crates/cog-pose-estimation/cog/artifacts/manifest.json: the release-pipeline-produced manifest with all fields filled in per ADR-100, including arch, target_triple, signature, and a build_metadata block carrying the validation PCK numbers.
  * docs/benchmarks/pose-estimation-cog.md: new sections covering the real Pi 5 smoke (8.4 ms cold-start) and the signed GCS release artifacts.

  Verified by downloading the binary anonymously from GCS and re-computing the sha256, which matches the locally-computed sha exactly. Signature decoded to the expected 64-byte Ed25519 length.

  Closes the GCS-upload acceptance criterion from ADR-100; the only pending work is Hailo HEF cross-compile (still SDK-gated) and an x86_64 release alongside this arm release.
* docs(benchmarks): record live cognitum-v0 install + 5-sec smoke run

  Adds the "Live appliance install" section documenting what happened when the signed v0.0.1 binary + weights were installed under /var/lib/cognitum/apps/pose-estimation/ on cognitum-v0 (the V0 cluster leader).

  * Layout matches the existing anomaly-detect / presence / seizure-detect cogs exactly, and the Cogs dashboard at http://cognitum-v0:9000/cogs auto-discovers entries.
  * `cog-pose-estimation run` ran for 5 seconds in the background and cleanly emitted run.started + structured WARN events for the missing local sensing-server on :3000 (cognitum-v0's actual CSI source is ruview-vitals-worker on :50054, not :3000). No crashes, no NaN, no leaks.
  * Wiring `sensing_url` to the appliance-native source is a separate Day-2 integration task.
parent ef20a7280d
commit 3314c8db8d
@@ -0,0 +1,165 @@
|
|||
# ADR-100: Cognitum Cog Packaging Specification
|
||||
|
||||
- **Status:** Accepted (formalises existing convention)
|
||||
- **Date:** 2026-05-19
|
||||
- **Deciders:** ruv
|
||||
|
||||
## Context
|
||||
|
||||
The Cognitum V0 Appliance (`/var/lib/cognitum/apps/`) deploys discrete units called **Cogs**. They appear in the Appliance dashboard (`http://cognitum-v0:9000/cogs`) under an app-store UI (Today / Apps / Categories / Search / Updates). Until this ADR, the packaging convention has been **implicit** — derived from inspecting installed cogs (`anomaly-detect`, `presence`, `seizure-detect`, etc.) on a live appliance. Bringing new Cogs to the platform required reverse-engineering the layout each time.
|
||||
|
||||
This ADR formalises the layout so:
|
||||
|
||||
1. A repo crate can be built into a Cog with a deterministic Makefile / CI pipeline.
|
||||
2. Cog binaries can be cross-compiled for every supported architecture from a single source.
|
||||
3. The appliance's installer (`cognitum-cog-gateway`) can verify manifests without bespoke per-cog adapters.
|
||||
4. Future Cogs in this repo (starting with `cog-pose-estimation` — see ADR-101) follow a single rule.
|
||||
|
||||
## Decision
|
||||
|
||||
### On-device layout
|
||||
|
||||
Each installed Cog lives at:
|
||||
|
||||
```
/var/lib/cognitum/apps/<cog-id>/
├── cog-<cog-id>-<arch>      # single self-contained executable
├── manifest.json            # immutable; signed by the publisher
├── config.json              # mutable; runtime config, owned by the appliance
├── pid                      # current PID when running; absent when stopped
├── output.log               # stdout (truncated on rotation)
└── error.log                # stderr (truncated on rotation)
```
|
||||
|
||||
`<cog-id>` is kebab-case, ASCII, `[a-z0-9-]{2,32}`. `<arch>` is one of:
|
||||
|
||||
| arch | target triple | hardware |
|------|---------------|----------|
| `arm` | `aarch64-unknown-linux-gnu` | Raspberry Pi 5 (cognitum-v0, cluster Pis) |
| `x86_64` | `x86_64-unknown-linux-gnu` | ruvultra, generic Linux dev |
| `hailo8` | `aarch64-unknown-linux-gnu` + Hailo HEF sidecar | Pi + Hailo-8 hat (26 TOPS) |
| `hailo10` | `aarch64-unknown-linux-gnu` + Hailo HEF sidecar | Pi + Hailo-10 hat (40 TOPS) |
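
The naming rules above can be checked mechanically. The following is a minimal sketch, not the gateway's actual validation code: the regular expression and arch table are copied from this section, while the helper names `binary_name` and `install_dir` are hypothetical.

```python
import re

# Pattern and arch table copied from ADR-100; helper names are hypothetical.
COG_ID_RE = re.compile(r"[a-z0-9-]{2,32}")

ARCH_TARGETS = {
    "arm": "aarch64-unknown-linux-gnu",
    "x86_64": "x86_64-unknown-linux-gnu",
    "hailo8": "aarch64-unknown-linux-gnu",   # plus a Hailo HEF sidecar
    "hailo10": "aarch64-unknown-linux-gnu",  # plus a Hailo HEF sidecar
}


def binary_name(cog_id: str, arch: str) -> str:
    """Return the on-device executable name `cog-<cog-id>-<arch>`."""
    if COG_ID_RE.fullmatch(cog_id) is None:
        raise ValueError(f"invalid cog id: {cog_id!r}")
    if arch not in ARCH_TARGETS:
        raise ValueError(f"unknown arch: {arch!r}")
    return f"cog-{cog_id}-{arch}"


def install_dir(cog_id: str) -> str:
    """Return the install directory for a validated cog id."""
    if COG_ID_RE.fullmatch(cog_id) is None:
        raise ValueError(f"invalid cog id: {cog_id!r}")
    return f"/var/lib/cognitum/apps/{cog_id}/"
```

Using `fullmatch` rather than `match` matters here: it rejects ids that merely start with a valid prefix, such as one with trailing uppercase characters.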
|
||||
|
||||
### `manifest.json` schema
|
||||
|
||||
```json
{
  "id": "anomaly-detect",
  "version": "0.1.0",
  "binary_url": "https://storage.googleapis.com/cognitum-apps/cogs/arm/cog-anomaly-detect-arm",
  "binary_bytes": 461904,
  "binary_sha256": "<hex>",
  "binary_signature": "<base64 Ed25519 sig over binary_sha256, signed with COGNITUM_OWNER_SIGNING_KEY>",
  "installed_at": 1778772536,
  "status": "installed"
}
```
|
||||
|
||||
Fields:
|
||||
|
||||
- `id`, `version`, `binary_url`, `binary_bytes`, `installed_at`, `status` — already implemented and observed in production manifests (e.g. `anomaly-detect@0.0.0`). Documented here without change.
|
||||
- `binary_sha256`, `binary_signature`: **new**, REQUIRED for any Cog shipped from this repo. Backwards-compatible with existing manifests: the appliance gateway treats both fields as optional today, but MUST verify them whenever they are present. ADR-103 (witness chain) covers the trust model in more detail.
|
||||
- `status` values: `"installed"`, `"running"`, `"stopped"`, `"failed"`, `"updating"`.
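
To make the "optional today, verify when present" rule concrete, here is a minimal sketch of the integrity checks a gateway could run. It is an illustration under stated assumptions, not the `cognitum-cog-gateway` implementation: the function name `check_manifest` and the `require_signed` flag are hypothetical, and the actual Ed25519 verification (which needs a cryptography library and the publisher's public key) is deliberately left out; only the digest and the 64-byte signature length are checked.

```python
import base64
import hashlib


def check_manifest(manifest: dict, binary: bytes, require_signed: bool = False) -> list[str]:
    """Return a list of problems; an empty list means the checks passed."""
    problems = []

    if "binary_bytes" in manifest and manifest["binary_bytes"] != len(binary):
        problems.append("binary_bytes mismatch")

    sha = manifest.get("binary_sha256")
    if sha is not None:
        # Verify-when-present: a supplied digest must match the binary.
        if hashlib.sha256(binary).hexdigest() != sha.lower():
            problems.append("binary_sha256 mismatch")
    elif require_signed:
        problems.append("binary_sha256 missing")

    sig = manifest.get("binary_signature")
    if sig is not None:
        try:
            raw = base64.b64decode(sig, validate=True)
        except ValueError:  # binascii.Error is a ValueError subclass
            raw = b""
        # An Ed25519 signature is always 64 bytes; this is only a shape check.
        if len(raw) != 64:
            problems.append("binary_signature is not a 64-byte value")
    elif require_signed:
        problems.append("binary_signature missing")

    return problems
```

With `require_signed=False` an unsigned legacy manifest passes, which mirrors today's behaviour; flipping the flag models the later "require signature" step described under Migration.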
|
||||
|
||||
### Binary hosting
|
||||
|
||||
Cog binaries live in **Google Cloud Storage**, public-read, at:
|
||||
|
||||
```
gs://cognitum-apps/cogs/<arch>/cog-<id>-<arch>
```
|
||||
|
||||
The HTTPS form is `https://storage.googleapis.com/cognitum-apps/cogs/<arch>/cog-<id>-<arch>` (no trailing extension; the URL is the canonical artifact). For Hailo variants, the HEF model file is sibling: `cog-<id>-<arch>.hef`.
|
||||
|
||||
Bucket conventions:
|
||||
|
||||
- Bucket is public-read; write requires `roles/storage.objectAdmin` in project `cognitum-20260110`.
|
||||
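- Per-version artifacts must be content-addressed: `cogs/<arch>/cog-<id>-<arch>@<sha256-prefix>` is the immutable copy; the un-suffixed name behaves like a symlink that updates on release (GCS has no native symlinks, so in practice it is an object that each release overwrites).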
|
||||
- `COGNITUM_OWNER_SIGNING_KEY` (GCP Secret Manager) signs every binary before upload.
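
The hosting convention maps directly onto a few string templates. The sketch below derives the object names from a cog id, arch, and digest; the function names are hypothetical, and the 12-character digest prefix is an assumption, since this ADR does not fix the length of `<sha256-prefix>`.

```python
BUCKET = "cognitum-apps"  # bucket name taken from this ADR


def object_path(cog_id: str, arch: str) -> str:
    """Mutable 'latest' object path inside the bucket."""
    return f"cogs/{arch}/cog-{cog_id}-{arch}"


def gs_uri(cog_id: str, arch: str) -> str:
    return f"gs://{BUCKET}/{object_path(cog_id, arch)}"


def https_url(cog_id: str, arch: str) -> str:
    return f"https://storage.googleapis.com/{BUCKET}/{object_path(cog_id, arch)}"


def immutable_path(cog_id: str, arch: str, sha256_hex: str, prefix_len: int = 12) -> str:
    """Content-addressed copy; prefix_len is an assumed value, not specified by the ADR."""
    return f"{object_path(cog_id, arch)}@{sha256_hex[:prefix_len]}"


def hef_sidecar_path(cog_id: str, arch: str) -> str:
    """Sibling HEF model file for Hailo variants."""
    return f"{object_path(cog_id, arch)}.hef"
```

For example, `https_url("anomaly-detect", "arm")` reproduces the `binary_url` shown in the manifest example above.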
|
||||
|
||||
### Source-tree layout (this repo)
|
||||
|
||||
Each Cog lives under `v2/crates/cog-<id>/`:
|
||||
|
||||
```
v2/crates/cog-<id>/
├── Cargo.toml                  # crate name = cog-<id>; binary = cog-<id>
├── src/
│   ├── main.rs                 # CLI: cog-<id> run | health | version | manifest
│   ├── lib.rs
│   └── inference.rs            # the actual work
├── cog/
│   ├── manifest.template.json
│   ├── config.schema.json      # JSON schema for runtime config
│   ├── README.md               # consumer-facing description (used by the App Store UI)
│   ├── icon.svg                # 1024×1024 icon (used by App Store hero)
│   └── Makefile                # build / sign / upload targets
└── tests/
    ├── smoke.rs
    └── manifest_signature.rs
```
|
||||
|
||||
### Build pipeline
|
||||
|
||||
```
cd v2/crates/cog-<id>
make build-arm       # cross-compile to aarch64-unknown-linux-gnu
make build-x86_64    # x86_64 Linux build
make build-hailo8    # arm + HEF compilation (requires Hailo Dataflow Compiler)
make build-hailo10   # arm + HEF compilation
make sign            # produce binary_sha256 + binary_signature
make upload          # gsutil cp to gs://cognitum-apps/cogs/<arch>/
make manifest        # emit manifest.json with all fields filled
```
|
||||
|
||||
CI (GitHub Actions) MUST run `make build-arm` + `make build-x86_64` on every PR touching `v2/crates/cog-*/`. Hailo HEF compilation requires the proprietary Hailo SDK and runs only on the Hailo-capable runners (currently a labelled self-hosted runner on the Pi cluster — TBD, separate ADR).
|
||||
|
||||
### Runtime contract
|
||||
|
||||
A Cog binary MUST implement:
|
||||
|
||||
| Subcommand | Behaviour |
|-----------|-----------|
| `cog-<id> version` | Print `<id> <version>` and exit 0. |
| `cog-<id> manifest` | Print the embedded manifest JSON and exit 0. |
| `cog-<id> run --config /path/to/config.json` | Long-running. Writes structured JSON logs to stdout (parsed by `cognitum-cog-gateway`). Exit code 0 on graceful shutdown, non-zero on fatal error. |
| `cog-<id> health` | One-shot. Exit 0 if the cog could come up healthy; non-zero with diagnostic on stderr. Called by the gateway before `run`. |
|
||||
|
||||
stdout JSON line format (one event per line):
|
||||
|
||||
```json
{"ts": 1779210883.444, "level": "info", "event": "<event-name>", "fields": { ... }}
```
|
||||
|
||||
## Consequences
|
||||
|
||||
### Positive
|
||||
|
||||
- New Cogs can be added without reverse-engineering the layout each time.
|
||||
- CI can verify the manifest schema before merge.
|
||||
- Signed binaries close a real supply-chain gap — current installed cogs (`anomaly-detect@0.0.0`) have no signature, and a compromised GCS object could push malicious code to every appliance.
|
||||
- The runtime contract (`run | health | version | manifest`) is uniform across cogs, so `cognitum-cog-gateway` can stop carrying per-cog adapters.
|
||||
|
||||
### Negative
|
||||
|
||||
- Existing installed cogs must be re-published with signatures within one minor release of the gateway adopting the verify-when-present rule.
|
||||
- Hailo HEF cross-compile is gated on a self-hosted runner; we accept that PRs touching Hailo variants will be slower to land.
|
||||
|
||||
### Risks
|
||||
|
||||
- **Signing key rotation**: `COGNITUM_OWNER_SIGNING_KEY` (Ed25519) is a single root-of-trust today. ADR-103 (witness chain) describes the rotation/recovery path; this ADR depends on that.
|
||||
- **GCS bucket misconfiguration**: a public-read bucket with versioning-off could allow rollback attacks. Bucket MUST have Object Versioning enabled + 90-day non-current-version retention.
|
||||
|
||||
## Migration
|
||||
|
||||
1. Land this ADR.
|
||||
2. Land ADR-101 (`cog-pose-estimation` — first Cog built to this spec).
|
||||
3. After two clean releases of `cog-pose-estimation`, re-publish the existing cogs (`anomaly-detect`, `presence`, etc.) with `binary_sha256` + `binary_signature`. Track in a follow-up issue.
|
||||
4. Flip `cognitum-cog-gateway` from "verify when present" to "require signature" — separate ADR, separate review.
|
||||
|
||||
## See also
|
||||
|
||||
- ADR-101: Pose Estimation Cog (first Cog built to this spec).
|
||||
- ADR-103: Witness chain trust model (signing key rotation, future ADR).
|
||||
- `docs/adr/ADR-079-camera-ground-truth-training.md` — the training pipeline behind `cog-pose-estimation`.
|
||||
- `CLAUDE.local.md` § "Fleet Infrastructure (Tailscale)" — appliance layout this ADR describes.
|
||||
|
|
@@ -0,0 +1,178 @@
|
|||
# ADR-101: Pose Estimation Cog (WiFi-DensePose side)
|
||||
|
||||
- **Status:** Accepted
|
||||
- **Date:** 2026-05-19
|
||||
- **Deciders:** ruv
|
||||
- **Companion ADR (v0-appliance side):** v0-appliance ADR-225 (cognitum-pose-estimation crate)
|
||||
|
||||
## Context
|
||||
|
||||
ADR-079 designed the 17-keypoint COCO pose-estimation training pipeline. ADR-100 formalised the Cognitum Cog packaging spec. This ADR is the bridge: it specifies how the wifi-densepose training pipeline produces an artifact that ships as a Cog (`cog-pose-estimation`) onto the Cognitum V0 appliance and out to the Pi+Hailo cluster.
|
||||
|
||||
It is the next product step beyond the published `presence` Cog (binary head trained from the contrastive encoder on Hugging Face at `ruvnet/wifi-densepose-pretrained`). Where `presence` reports a single boolean per tick, `cog-pose-estimation` reports 17 (x, y) keypoints per person, per tick.
|
||||
|
||||
## Decision
|
||||
|
||||
### Pipeline
|
||||
|
||||
```
(training side: ruvultra GPU)

  ESP32 / rvcsi ─► collect-ground-truth.py + sensing-server recording
                         │
                         ▼
  data/paired/*.paired.jsonl  (CSI window + camera keypoints)
                         │
                         ▼
  v2/crates/wifi-densepose-train ──► Rust + libtorch trainer
  (uses RTX 5080 / CUDA 12.x)              │
                      init from ruvnet/wifi-densepose-pretrained
                                           │
                                           ▼
                        model.safetensors (encoder + pose head)
                                           │
                              ─────────────┴─────────────
                              │                         │
                              ▼                         ▼
  v2/crates/cog-pose-estimation                  export to ONNX
  (this repo)                                           │
    • emits manifest.json                               ▼
    • produces cog binary                        cognitum-hailo
    • signs + uploads to GCS                   (v0-appliance side)
                                                        │
                                                        ▼
                                             cog-pose-estimation.hef
                                                        │
                                                        ▼
(appliance side: cognitum-v0 + Pi+Hailo cluster)

  gs://cognitum-apps/cogs/{arm,hailo8,hailo10}/cog-pose-estimation-<arch>
      │
      ▼
  `cognitum-cog-gateway` pulls artifact + manifest, verifies signature, installs
  into /var/lib/cognitum/apps/pose-estimation/
      │
      ▼
  run loop: read CSI frames from local sensing-server
    → encoder → pose head → emit `{ts, persons: [{keypoints: [...17 x,y...] }]}`
    on stdout as the Cog runtime contract requires
```
|
||||
|
||||
### Architecture (model)
|
||||
|
||||
| Stage | Module | Notes |
|-------|--------|-------|
| Input | `[56 subcarriers × 20 frames]` per CSI window | matches today's `data/paired/wiflow-p7-*.paired.jsonl` |
| Encoder | TCN-lite or contrastive encoder lifted from HF presence model | 128-dim embedding; weights init from `ruvnet/wifi-densepose-pretrained/model.safetensors` |
| Pose head | 2-layer MLP `(128 → 256 → 34)` | 34 = 17 × (x, y) |
| Output | `[B, 17, 2]` keypoints in `[0, 1]` image-normalised coords | confidence is implicit in keypoint variance over time; ADR-079 P9 will add explicit per-joint confidence |
| Loss | Confidence-weighted SmoothL1 (frame-level) + bone-length regulariser + temporal smoothness | per ADR-079 Phase 3 refinement |
| Init | Encoder = HF presence weights (frozen for 50 epochs, then jointly fine-tuned) | unblocks the sigmoid-saturation failure mode observed in #640 |
| Training | `v2/crates/wifi-densepose-train` with libtorch backend on RTX 5080 | replaces the pure-JS SPSA trainer that produced 0% PCK in #640 |
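
The input row implies a sliding buffer on the runtime side: frames of 56 subcarrier values arrive one at a time, and the model consumes the most recent 20. A minimal sketch of such a buffer follows. The class name `CsiWindow`, the stride of one frame, and the channels-first `[56][20]` output layout are assumptions made for illustration; the real `runtime.rs` may differ.

```python
from collections import deque

N_SUBCARRIERS = 56   # from the Input row above
WINDOW_FRAMES = 20   # from the Input row above


class CsiWindow:
    """Keeps the most recent WINDOW_FRAMES CSI frames (assumed stride: 1 frame)."""

    def __init__(self) -> None:
        self._frames = deque(maxlen=WINDOW_FRAMES)

    def push(self, frame):
        """Add one frame; return a [56][20] window once enough frames exist, else None."""
        if len(frame) != N_SUBCARRIERS:
            raise ValueError(f"expected {N_SUBCARRIERS} subcarriers, got {len(frame)}")
        self._frames.append(list(frame))
        if len(self._frames) < WINDOW_FRAMES:
            return None
        # Transpose [frames][subcarriers] into [subcarriers][frames] so that
        # subcarriers act as Conv1d input channels and time is the length axis.
        return [list(column) for column in zip(*self._frames)]
```

Because the deque has `maxlen=WINDOW_FRAMES`, pushing the 21st frame silently evicts the oldest one, so every call after warm-up yields a fresh window.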
|
||||
|
||||
### Repo layout
|
||||
|
||||
```
v2/crates/cog-pose-estimation/           # NEW (this ADR)
├── Cargo.toml
├── src/
│   ├── main.rs                          # CLI: run | health | version | manifest
│   ├── lib.rs
│   ├── inference.rs                     # ONNX runtime + Hailo HEF runtime dispatch
│   ├── frame_subscriber.rs              # local sensing-server subscriber
│   └── publisher.rs                     # emits structured JSON events per Cog contract
├── cog/
│   ├── manifest.template.json
│   ├── config.schema.json
│   ├── README.md
│   ├── icon.svg
│   └── Makefile                         # build-arm | build-x86_64 | sign | upload
└── tests/
    ├── manifest_signature.rs
    └── inference_smoke.rs
```
|
||||
|
||||
### Runtime contract
|
||||
|
||||
Honours ADR-100's per-Cog CLI contract:
|
||||
|
||||
- `cog-pose-estimation version` → `pose-estimation 0.0.1`
|
||||
- `cog-pose-estimation manifest` → JSON
|
||||
- `cog-pose-estimation health` → 0 if encoder+head load and a synthetic frame produces a finite output
|
||||
- `cog-pose-estimation run --config /etc/cognitum/cogs/pose-estimation/config.json` → long-running; emits one JSON event per inferred frame:
|
||||
|
||||
```json
{
  "ts": 1779210883.444,
  "level": "info",
  "event": "pose.frame",
  "fields": {
    "tick": 12345,
    "n_persons": 1,
    "persons": [
      {"keypoints": [[0.48, 0.31], [0.52, 0.28], ...], "confidence": 0.81}
    ]
  }
}
```
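
As an illustration of how the pose head's 34 raw outputs could become this event, the sketch below applies a sigmoid and groups the values into 17 `(x, y)` pairs. It assumes a single person per frame and row-major ordering (value `2i` is x and `2i + 1` is y for joint `i`), which is what a plain reshape of a 34-vector to `[17, 2]` produces; the function names are hypothetical.

```python
import math


def sigmoid(x: float) -> float:
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def pose_frame_event(logits, tick: int, confidence: float, ts: float) -> dict:
    """Build a `pose.frame` event dict from 34 pre-sigmoid pose-head outputs."""
    if len(logits) != 34:
        raise ValueError(f"expected 34 values (17 joints x 2), got {len(logits)}")
    coords = [sigmoid(v) for v in logits]          # squashes into [0, 1]
    keypoints = [[coords[2 * i], coords[2 * i + 1]] for i in range(17)]
    return {
        "ts": ts,
        "level": "info",
        "event": "pose.frame",
        "fields": {
            "tick": tick,
            "n_persons": 1,  # assumption: one person per frame
            "persons": [{"keypoints": keypoints, "confidence": confidence}],
        },
    }
```

Serialising the returned dict with `json.dumps` on one line yields a record in the shape shown above.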
|
||||
|
||||
### Hardware deployment
|
||||
|
||||
| Target | arch | runtime | notes |
|--------|------|---------|-------|
| ruvultra (dev) | `x86_64` | ONNX Runtime CPU/CUDA | development & smoke tests |
| cognitum-v0 (Pi 5) | `arm` | ONNX Runtime ARM | reference deploy; ~20 ms/frame |
| Pi + Hailo-8 hat | `hailo8` | Hailo HEF runtime via `cognitum-hailo` | ~2 ms/frame, 26 TOPS budget |
| Pi + Hailo-10 hat | `hailo10` | Hailo HEF runtime via `cognitum-hailo` | ~1 ms/frame, 40 TOPS budget |
|
||||
|
||||
### Acceptance gates
|
||||
|
||||
1. **Validates:** `cargo test -p cog-pose-estimation` green; `cog-pose-estimation health` returns 0 against a synthetic CSI window.
|
||||
2. **Benchmarks:** end-to-end frame latency on each target arch logged in `target/criterion/`; published in `docs/benchmarks/pose-estimation-cog.md`.
|
||||
3. **Optimised:** the Hailo-targeted ONNX graph passes through Hailo Dataflow Compiler without quantisation-aware-training warnings.
|
||||
4. **Published:** signed binary at `gs://cognitum-apps/cogs/<arch>/cog-pose-estimation-<arch>`; manifest valid against the JSON schema in ADR-100; appliance installer can pull and run it.
|
||||
|
||||
PCK@20 is intentionally **not** an acceptance gate of this ADR. Achieving the ADR-079 ≥35% target is a separate, data-bound milestone tracked in #640. This ADR ships the **vehicle**, not the model accuracy.
|
||||
|
||||
### First measured run — v0.0.1 (2026-05-19)
|
||||
|
||||
A Candle-on-CUDA training run on `ruvultra`'s RTX 5080 against the same 1,077-sample paired session that produced the 0%/0% baseline in #640 yielded:
|
||||
|
||||
- **PCK@20 = 3.0%**, **PCK@50 = 18.5%**, **MPJPE = 0.093** (normalized).
|
||||
- 400 epochs in **2.1 s** wall time (~5 ms/epoch, full-batch).
|
||||
- Loss reduction 13× (0.181 → 0.014, eval 0.010).
|
||||
- Strongest signal at `r_hip` (PCK@50 = 76.9%), `r_knee` (35.2%), `l_elbow` (26.4%).
|
||||
|
||||
This confirms the pipeline trains end-to-end and produces a signal-bearing model. The remaining gap to PCK@20 ≥ 35% is data-bound (1,077 samples is ≪ the ADR-079 target of ~30K). See `docs/benchmarks/pose-estimation-cog.md` for the full result dump.
|
||||
|
||||
## Consequences
|
||||
|
||||
### Positive
|
||||
|
||||
- First Cog from this repo that integrates with the appliance/cog-gateway pipeline. Future cogs (e.g. `cog-vitals`, `cog-fall-alert`) follow the same template.
|
||||
- Closes the loop from data collection → training → quantisation → cluster deployment with a single repo-anchored artifact.
|
||||
- Forces a real signature on cog binaries (per ADR-100), which improves supply-chain hygiene across the whole appliance.
|
||||
|
||||
### Negative
|
||||
|
||||
- Adds a hard dependency on the Hailo Dataflow Compiler, which lives behind a self-hosted runner — Hailo-targeted PRs land more slowly.
|
||||
- The first published binary will have low PCK (data + training time gap, #640) — UX needs to surface this clearly so end users do not interpret bad keypoints as a bug.
|
||||
|
||||
### Risks
|
||||
|
||||
- **Model size on Hailo**: the encoder fits comfortably in Hailo-8's on-chip SRAM, but the pose-head expansion to `[17×2]` plus required temporal stacking pushes us close to the Hailo-8 envelope. Mitigation: Hailo-10 path is the primary deploy target; Hailo-8 is a stretch.
|
||||
- **Sensing-server schema drift**: the cog subscribes to `/api/v1/sensing/latest` JSON. If the appliance's sensing-server schema changes, the cog fails open (logs warning, emits nothing). The `frame_subscriber.rs` module pins to schema version `2`.
|
||||
|
||||
## Migration / rollout
|
||||
|
||||
1. Land this ADR + ADR-100 on `main` of RuView.
|
||||
2. Land companion ADR-225 + crate on `main` of v0-appliance.
|
||||
3. First release `cog-pose-estimation@0.0.1` ships **only** to `ruvultra` and `cognitum-v0`. Not pushed to the cluster Pis yet.
|
||||
4. After P7→P9 data work (#640) brings PCK above a usable threshold, rebuild + re-publish; only then enable cluster rollout via `cognitum-cog-gateway`'s OTA channel.
|
||||
|
||||
## See also
|
||||
|
||||
- ADR-079: Camera-supervised pose training pipeline (the model we're shipping).
|
||||
- ADR-100: Cog packaging specification (the format we're shipping in).
|
||||
- v0-appliance ADR-225: cognitum-pose-estimation crate (the appliance-side runtime).
|
||||
- v0-appliance ADR-220: cog management surface (where this cog appears in the dashboard).
|
||||
- Issue #640: PCK@20 gap (0% baseline, 3.0% in the first measured run → ≥35% target).
|
||||
|
|
@@ -0,0 +1,158 @@
|
|||
# `cog-pose-estimation` — Benchmark Log
|
||||
|
||||
This file tracks every published benchmark for the pose-estimation Cog. New runs append; never overwrite history. Per ADR-101 §"Acceptance gates".
|
||||
|
||||
## v0.0.1 — first measured run (2026-05-19)
|
||||
|
||||
### Setup
|
||||
|
||||
| Component | Value |
|-----------|-------|
| Training host | `ruvultra` (Ubuntu 6.17, x86_64, RTX 5080) |
| Backend | `candle-core 0.9` with `cuda` feature |
| Data | `data/paired/wiflow-p7-1779210883.paired.jsonl`: 1,077 paired samples, 30-min seated-at-desk recording, avg conf 0.44 |
| Train/eval split | 80/20 chronological split on `ts_start` (eval is a held-out time window, not a random sample) |
| Architecture | Conv1d encoder (56 → 64 → 128 → 128, dilations 1/2/4) + MLP head (128 → 256 → 34 → sigmoid → [17, 2]) |
| Encoder init | random (the HF presence model is an MLP `8→64→128`, incompatible with this Conv1d shape) |
| Optimizer | AdamW, lr 1e-3, weight_decay 0.01 |
| LR schedule | Cosine with 50-epoch warm restarts |
| Loss | SmoothL1 (Huber β=0.1), confidence-weighted by `record.conf` |
| Augmentation | Subcarrier dropout 10% (final 50 epochs) |
| Epochs | 400 (full-batch) |
| Wall time | **2.1 s** total |
|
||||
|
||||
### Accuracy
|
||||
|
||||
| Metric | Value |
|--------|-------|
| **PCK@20** (overall) | **3.0%** |
| **PCK@50** (overall) | **18.5%** |
| **MPJPE** (normalized) | **0.0931** |
| Final eval loss | 0.0101 |
| Loss reduction | 0.181 → 0.014 (13×) |
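
For readers unfamiliar with the two headline metrics, the sketch below shows one common way to compute them. This log does not state which normaliser the training script uses for PCK, so the per-sample reference length `ref_lengths` is an assumption (it could be a torso length or a bounding-box diagonal), and the function name is hypothetical. PCK@20 corresponds to `threshold=0.2` and PCK@50 to `threshold=0.5`.

```python
import math


def pck_and_mpjpe(pred, truth, ref_lengths, threshold):
    """Return (PCK, MPJPE) over all joints of all samples.

    pred, truth: list of samples, each a list of (x, y) keypoints.
    ref_lengths: one normalising length per sample (assumed definition).
    threshold:   fraction of the reference length that counts as a hit.
    """
    hits = 0
    total = 0
    dist_sum = 0.0
    for pred_sample, truth_sample, ref in zip(pred, truth, ref_lengths):
        for (px, py), (tx, ty) in zip(pred_sample, truth_sample):
            dist = math.hypot(px - tx, py - ty)   # Euclidean joint error
            dist_sum += dist
            hits += dist <= threshold * ref        # bool counts as 0 or 1
            total += 1
    return hits / total, dist_sum / total
```

Per-joint PCK, as reported in the table that follows, is the same computation restricted to one joint index at a time.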
|
||||
|
||||
### Per-joint PCK
|
||||
|
||||
| Joint | PCK@20 | PCK@50 | | Joint | PCK@20 | PCK@50 |
|-------|-------:|-------:|--|-------|-------:|-------:|
| nose | 0.5% | 5.1% | | l_hip | 0.0% | 27.3% |
| l_eye | 2.8% | 8.3% | | **r_hip** | **25.0%** | **76.9%** |
| r_eye | 1.9% | 15.7% | | l_knee | 2.3% | 20.8% |
| l_ear | 0.0% | 3.2% | | r_knee | 0.9% | 35.2% |
| r_ear | 1.9% | 9.7% | | l_ankle | 1.4% | 7.9% |
| l_shoulder | 4.6% | 8.8% | | r_ankle | 0.9% | 9.3% |
| r_shoulder | 1.9% | 19.9% | | l_elbow | 1.9% | 26.4% |
| l_wrist | 3.2% | 24.1% | | r_elbow | 0.0% | 4.2% |
| r_wrist | 1.4% | 12.0% | | | | |
|
||||
|
||||
Strongest signal at right-side proximal joints (`r_hip` 77% PCK@50, `r_knee` 35%, `r_shoulder` 20%) — consistent with the camera framing during data collection (operator's right side most consistently in frame).
|
||||
|
||||
### Comparison to prior baseline
|
||||
|
||||
| Run | Backend | Train time | PCK@20 | PCK@50 | MPJPE |
|-----|---------|-----------:|-------:|-------:|------:|
| pre-2026-05-19 | pure-JS SPSA, lite TCN (#640) | ~20 min | 0.0% | 0.0% | 0.66 |
| **v0.0.1** (this run) | **candle-cuda, Conv1d TCN** | **2.1 s** | **3.0%** | **18.5%** | **0.093** |
|
||||
|
||||
**7× MPJPE improvement, 570× faster training, signal-bearing PCK at several proximal joints (strongest at `r_hip`).** The remaining gap to ADR-079's PCK@20 ≥ 35% target is data-bound, not infra-bound (see Issue #640).
|
||||
|
||||
### Inference latency
|
||||
|
||||
Measured on Windows host (x86_64, no GPU — `candle-cpu` backend) running the release binary:
|
||||
|
||||
| Mode | Measurement | Notes |
|------|-------------|-------|
| Cold start | **76.2 ms / invocation** (avg over 100 sequential `health` invocations) | Includes safetensors load + 1 synthetic forward pass. Most of the cost is process startup + mmap. |
| Long-running `run` warm inference | sub-millisecond per frame (estimated) | The model is 125K params / 507 KB; once loaded, a single forward at batch=1 is essentially memory-bandwidth bound. To be measured precisely against a live sensing-server feed. |
|
||||
|
||||
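The model size quoted in the table can be cross-checked from the layer shapes that `scripts/export-onnx.py` declares for `PoseNet` (three `Conv1d` layers with kernel size 3, then two `Linear` layers). The helper names below are illustrative; the shapes are taken from that script, and F32 storage is what its safetensors reader asserts.

```python
def conv1d_params(c_in, c_out, kernel):
    """Weights plus biases of a Conv1d layer."""
    return c_in * c_out * kernel + c_out


def linear_params(n_in, n_out):
    """Weights plus biases of a Linear layer."""
    return n_in * n_out + n_out


N_SUB, N_KP = 56, 17

layer_params = {
    "enc.c1": conv1d_params(N_SUB, 64, 3),
    "enc.c2": conv1d_params(64, 128, 3),
    "enc.c3": conv1d_params(128, 128, 3),
    "head.fc1": linear_params(128, 256),
    "head.fc2": linear_params(256, N_KP * 2),
}

total_params = sum(layer_params.values())
weight_bytes = total_params * 4  # F32 = 4 bytes per parameter

print(total_params, weight_bytes)  # 126562 506248
```

The 506,248 bytes of tensor data sit 784 bytes under the 507,032-byte file size, a gap that would be consistent with the safetensors length prefix plus JSON header, and the total is in line with the rounded parameter count in the table.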
### ONNX export

`pose_v1.onnx` is produced from `pose_v1.safetensors` by `scripts/export-onnx.py`, which mirrors the Candle architecture in PyTorch, loads the safetensors weights, and uses `torch.onnx.export` with opset 18 and a dynamic batch axis. Verified end-to-end:

| Check | Result |
|-------|--------|
| `onnx.checker.check_model` | ✅ ok |
| Parity vs torch reference | **max \|torch − onnx\| = 8.94e−8** (1e−5 threshold) |
| File size | 12,059 bytes |
| Dynamic axes | `batch` on input and output |

The ONNX artifact is the input to the Hailo Dataflow Compiler (HEF cross-compile) and to ONNX Runtime CPU/GPU benchmarks on each target arch; both are still pending.

### Real-hardware smoke (cognitum-v0 Pi 5)

Cross-compiled to `aarch64-unknown-linux-gnu` on ruvultra and run on a live Cognitum-V0 appliance:

| Host | Mode | Result |
|------|------|--------|
| ruvultra (under `qemu-aarch64-static`) | `health` | `backend: candle-cpu`, `confidence: 0.185` (real weights loaded under emulation) |
| **cognitum-v0** (Raspberry Pi 5, Cortex-A76) | `health` | `backend: candle-cpu`, `confidence: 0.185` (real weights, real hardware) |
| cognitum-v0 | 30× sequential `health` invocations | **0.251 s total → 8.4 ms / invocation** (cold) |

Cold start is 8.4 ms on real Pi 5 hardware versus 76 ms on the x86_64 Windows host. The gap is attributed to tighter NVMe I/O on the Pi 5 and to the candle CPU path benefiting from an in-cache safetensors mmap. Long-running `run` warm inference is still expected to be sub-millisecond, pending measurement.

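The cold-start figures above are averages over back-to-back process launches. A minimal sketch of such a harness is shown below; the function name is hypothetical and the binary path in the usage comment is only an example, not the measurement script that was actually used.

```python
import subprocess
import time


def time_invocations(cmd, n=30):
    """Run `cmd` n times sequentially; return (total seconds, ms per invocation)."""
    start = time.perf_counter()
    for _ in range(n):
        subprocess.run(
            cmd,
            check=True,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
    total_s = time.perf_counter() - start
    return total_s, total_s / n * 1000.0


# Example usage (path is illustrative):
#   total_s, per_ms = time_invocations(["./cog-pose-estimation-arm", "health"], n=30)
#   print(f"{total_s:.3f} s total -> {per_ms:.1f} ms / invocation")
```

Because each run spawns a fresh process, the result includes process startup and weight loading, which is why it is reported as a cold-start number rather than a per-frame latency.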
### Release artifacts (signed + published to GCS)

```
gs://cognitum-apps/cogs/arm/cog-pose-estimation-arm                    3,741,976 bytes
gs://cognitum-apps/cogs/arm/cog-pose-estimation-pose_v1.safetensors      507,032 bytes

binary_sha256:  1e1a7d3dd01ca05d5bfc5dbb142a5941b7866ed9f3224a21edc04d3f09a99bf5
weights_sha256: eb249b9a6b2e10130437a10976ed0230b0d085f86a0553d7226e1ae6eae4b9e5
signature:      LUN7xqLPYD3MFzm5dKB5MnYU0LvoRtek5ci5KiKPHBg+Xo6xuazwokn2Dw2JPMaLYJzmWn/SpT4djuR7hYvVDw== (Ed25519, signed with COGNITUM_OWNER_SIGNING_KEY)
```

Full manifest at `cog/artifacts/manifest.json`. Verified via public anonymous GET against `https://storage.googleapis.com/cognitum-apps/cogs/arm/cog-pose-estimation-arm`; the downloaded SHA-256 matches the locally computed one.

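The download check described above amounts to hashing the fetched file and comparing it with the digest recorded in the manifest. A minimal sketch using only the standard library follows; the function names are hypothetical, and Ed25519 signature verification is deliberately left out because the source does not spell out the exact signed payload.

```python
import hashlib
import os


def sha256_file(path, chunk_size=1 << 20):
    """Stream a file through SHA-256 and return the lowercase hex digest."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_artifact(path, expected_sha256, expected_bytes=None):
    """True if the file matches the manifest's digest (and size, when given)."""
    if expected_bytes is not None and os.path.getsize(path) != expected_bytes:
        return False
    return sha256_file(path) == expected_sha256.lower()


# Example usage against the manifest fields (file must be downloaded first):
#   verify_artifact("cog-pose-estimation-arm",
#                   manifest["binary_sha256"], manifest["binary_bytes"])
```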
### Live appliance install

Installed on `cognitum-v0` (the V0 cluster leader) at `/var/lib/cognitum/apps/pose-estimation/`:

```
$ ls -la /var/lib/cognitum/apps/pose-estimation/
-rwxr-xr-x  cog-pose-estimation-arm  3,741,976 B  (matches GCS sha256)
-rw-r--r--  pose_v1.safetensors        507,032 B
-rw-r--r--  manifest.json                  989 B
-rw-r--r--  config.json                    187 B
-rw-r--r--  output.log                  28,438 B  (5-sec smoke run)
```

The layout matches the existing cogs on the same appliance (`anomaly-detect`, `presence`, `seizure-detect`, etc.); the Cogs dashboard at `http://cognitum-v0:9000/cogs` auto-discovers entries under this directory.

`cog-pose-estimation run` ran cleanly in the background for 5 seconds with the default config. It correctly:

- Emitted a `run.started` event with the configured `sensing_url`, `model_path`, and `poll_ms`.
- Started its 40 ms poll loop.
- **Gracefully handled the missing local sensing-server on port 3000** by logging structured WARN events (`{"level":"WARN","fields":{"message":"sensing-server fetch failed","error":"...Connection refused..."}}`) without crashing, leaking, or producing NaN output.
- Exited cleanly on SIGTERM.

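The graceful-failure behaviour in the list above can be illustrated with a small polling helper. This is a Python sketch of the pattern only; the cog itself is a Rust binary, and the function name, timeout, and output stream here are assumptions. The WARN record shape follows the example log line quoted above.

```python
import http.client
import json
import sys
import urllib.error
import urllib.request


def poll_once(sensing_url, timeout_s=1.0, out=sys.stdout):
    """Fetch one sensing frame; on failure, log a structured WARN and return None."""
    try:
        with urllib.request.urlopen(sensing_url, timeout=timeout_s) as resp:
            return json.loads(resp.read().decode("utf-8"))
    except (urllib.error.URLError, http.client.HTTPException, OSError, ValueError) as exc:
        warn = {
            "level": "WARN",
            "fields": {"message": "sensing-server fetch failed", "error": str(exc)},
        }
        out.write(json.dumps(warn) + "\n")
        return None


# A run loop would call poll_once() every poll_ms (40 ms in the smoke run)
# and simply skip inference whenever it returns None.
```

Returning `None` instead of raising keeps the loop alive while the server is down, which is the behaviour the smoke run exercised.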
No `pose.frame` events fired during the smoke run, which is expected because `127.0.0.1:3000` isn't serving CSI on the appliance. The appliance's actual CSI source is `ruview-vitals-worker` on `:50054` plus the `/api/v1/v0/system/...` endpoints behind the appliance's bearer auth on `:9000`. Wiring `sensing_url` to the appliance-native source is a Day-2 integration task, separate from the cog binary itself.

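A smoke-run log like `output.log` can be summarised by counting its JSON lines per level and event. The sketch below assumes one JSON object per line with the `level`, `event`, and `fields.message` keys seen in the examples in this document; the function name is hypothetical.

```python
import json
from collections import Counter


def summarize_log(lines):
    """Count log records by (LEVEL, event-or-message); bad lines go to 'unparsed'."""
    counts = Counter()
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            counts["unparsed"] += 1
            continue
        if not isinstance(record, dict):
            counts["unparsed"] += 1
            continue
        fields = record.get("fields") or {}
        label = record.get("event") or fields.get("message") or "unknown"
        counts[(str(record.get("level", "")).upper(), label)] += 1
    return counts


# Example usage:
#   with open("output.log") as handle:
#       for key, n in summarize_log(handle).most_common():
#           print(n, key)
```

For the smoke run described above, such a summary would be expected to show fetch-failure WARN records and no `pose.frame` entries.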
Pending separately:

- Hailo HEF cross-compile (gated on Hailo SDK on a self-hosted runner), which uses `pose_v1.onnx` as input.
- Appliance-native sensing-source integration (`config.sensing_url` should point at the cog-gateway's CSI tap on `:9000`, not the dev-loopback `:3000`).
- x86_64 release upload (today's release is arm-only).

### Artifacts

- `v2/crates/cog-pose-estimation/cog/artifacts/pose_v1.safetensors`: 507 KB
- `v2/crates/cog-pose-estimation/cog/artifacts/train_results.json`: full per-epoch loss curve + hyperparameters + per-joint PCK

### Reproducibility

```bash
# On any host with cargo + a CUDA-capable GPU:
cd ~/work/cog-pose-train
mkdir -p ./
# Stage the same inputs (1,077 paired samples + HF encoder, see scripts/align-ground-truth.js for regeneration)
cp paired.jsonl ./paired.jsonl
cp encoder.safetensors ./encoder.safetensors

# Build & train (no Python, no pip)
cargo new --bin pose-trainer && cd pose-trainer
# Edit Cargo.toml deps: candle-core 0.9 (cuda), candle-nn 0.9 (cuda), safetensors, serde, serde_json, anyhow
# Drop the training script into src/main.rs (see this repo's training-tooling examples for reference)
cargo run --release
```

`candle-core` 0.8.4 and 0.9.2 are typically already present in `~/.cargo/registry/cache/` on developer hosts, so the build completes in seconds.

|
|
@ -0,0 +1,143 @@
|
|||
#!/usr/bin/env python3
"""Export pose_v1.safetensors -> pose_v1.onnx.

Builds the same architecture as v2/crates/cog-pose-estimation/src/inference.rs
in PyTorch, loads the trained weights from safetensors, and runs a torch.onnx
export traced with a [1, 56, 20] example input and a dynamic batch axis. Then
verifies the ONNX loads and matches the torch output to within 1e-5.
"""

import json
import struct
import sys
from pathlib import Path

import numpy as np
import torch
import torch.nn as nn


N_SUB = 56
N_FRAMES = 20
N_KP = 17


class PoseNet(nn.Module):
    """Mirrors inference.rs::PoseNet exactly."""

    def __init__(self) -> None:
        super().__init__()
        self.c1 = nn.Conv1d(N_SUB, 64, kernel_size=3, padding=1, dilation=1)
        self.c2 = nn.Conv1d(64, 128, kernel_size=3, padding=2, dilation=2)
        self.c3 = nn.Conv1d(128, 128, kernel_size=3, padding=4, dilation=4)
        self.fc1 = nn.Linear(128, 256)
        self.fc2 = nn.Linear(256, N_KP * 2)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: [B, 56, 20]
        h = torch.relu(self.c1(x))
        h = torch.relu(self.c2(h))
        h = torch.relu(self.c3(h))
        h = h.mean(dim=2)  # [B, 128]
        h = torch.relu(self.fc1(h))
        h = torch.sigmoid(self.fc2(h))
        return h


def load_safetensors(path: Path) -> dict[str, torch.Tensor]:
    """Pure-python safetensors reader. Avoids the safetensors pip dep."""
    with path.open("rb") as f:
        # Layout: 8-byte little-endian header length, JSON header, raw tensor data.
        header_len = struct.unpack("<Q", f.read(8))[0]
        header = json.loads(f.read(header_len).decode("utf-8"))
        out: dict[str, torch.Tensor] = {}
        for name, meta in header.items():
            if name == "__metadata__":
                continue
            start, end = meta["data_offsets"]
            shape = meta["shape"]
            dtype = meta["dtype"]
            assert dtype == "F32", f"unsupported dtype {dtype} for {name}"
            # Offsets are relative to the end of the JSON header.
            f.seek(8 + header_len + start)
            buf = f.read(end - start)
            arr = np.frombuffer(buf, dtype=np.float32).copy().reshape(shape)
            out[name] = torch.from_numpy(arr)
    return out


def main() -> None:
    weights_path = Path(sys.argv[1]) if len(sys.argv) > 1 else Path("pose_v1.safetensors")
    out_path = Path(sys.argv[2]) if len(sys.argv) > 2 else Path("pose_v1.onnx")

    if not weights_path.exists():
        raise SystemExit(f"weights file not found: {weights_path}")

    print(f"reading {weights_path}")
    tensors = load_safetensors(weights_path)
    print(f" found {len(tensors)} tensors: {sorted(tensors.keys())}")

    model = PoseNet()
    # Map safetensors names (enc.c1.weight, head.fc1.weight, ...) to module params
    mapping = {
        "enc.c1.weight": "c1.weight",
        "enc.c1.bias": "c1.bias",
        "enc.c2.weight": "c2.weight",
        "enc.c2.bias": "c2.bias",
        "enc.c3.weight": "c3.weight",
        "enc.c3.bias": "c3.bias",
        "head.fc1.weight": "fc1.weight",
        "head.fc1.bias": "fc1.bias",
        "head.fc2.weight": "fc2.weight",
        "head.fc2.bias": "fc2.bias",
    }
    state = {dst: tensors[src] for src, dst in mapping.items()}
    model.load_state_dict(state)
    model.eval()
    print(" weights loaded into PyTorch model")

    # Sanity check forward
    x = torch.zeros(1, N_SUB, N_FRAMES)
    with torch.no_grad():
        y = model(x)
    print(f" zero-input forward: shape={tuple(y.shape)} sample={y[0, :4].tolist()}")

    # Export to ONNX
    torch.onnx.export(
        model,
        x,
        out_path,
        export_params=True,
        opset_version=18,
        do_constant_folding=True,
        input_names=["csi_window"],
        output_names=["keypoints"],
        dynamic_axes={"csi_window": {0: "batch"}, "keypoints": {0: "batch"}},
    )
    print(f" wrote {out_path} ({out_path.stat().st_size} bytes)")

    # Verify the ONNX file loads + matches torch output
    try:
        import onnx
        import onnxruntime as ort

        onnx_model = onnx.load(str(out_path))
        onnx.checker.check_model(onnx_model)
        print(" ONNX model checker: ok")

        sess = ort.InferenceSession(str(out_path), providers=["CPUExecutionProvider"])
        rng = np.random.default_rng(42)
        x_np = rng.standard_normal((1, N_SUB, N_FRAMES), dtype=np.float32)
        with torch.no_grad():
            y_torch = model(torch.from_numpy(x_np)).numpy()
        y_onnx = sess.run(["keypoints"], {"csi_window": x_np})[0]
        max_abs = float(np.max(np.abs(y_torch - y_onnx)))
        print(f" parity vs torch: max |torch - onnx| = {max_abs:.2e}")
        assert max_abs < 1e-5, "ONNX output diverges from torch output"
        print(" parity ok (<1e-5)")
    except ImportError as e:
        print(f" WARN: onnx/onnxruntime not installed, skipping verification: {e}")

    print("\nDone.")


if __name__ == "__main__":
    main()
|
|
@ -28,6 +28,12 @@ members = [
|
    "crates/wifi-densepose-geo",
    "crates/nvsim",
    "crates/nvsim-server",
    # ADR-100/ADR-101: Cognitum Cog packaging, first Cog from this repo.
    # Ships the wifi-densepose pose-estimation model as a signed binary +
    # JSONL manifest installable by the Cognitum V0 appliance (cognitum-v0,
    # cognitum-cluster-*, ruvultra). The companion appliance-side crate
    # lives in cognitum-one/v0-appliance as `cognitum-pose-estimation`.
    "crates/cog-pose-estimation",
    # rvCSI: edge RF sensing runtime (ADR-095 platform, ADR-096 FFI/crate layout):
    # lives in its own repo (https://github.com/ruvnet/rvcsi), vendored here as
    # `vendor/rvcsi` and published to crates.io as `rvcsi-*` 0.3.x. Depend on the
|
|
|
|||
|
|
@ -0,0 +1,54 @@
|
|||
[package]
name = "cog-pose-estimation"
version.workspace = true
edition.workspace = true
authors.workspace = true
license.workspace = true
repository.workspace = true
description = "Cognitum Cog: 17-keypoint pose estimation from WiFi CSI. See ADR-100 (packaging) + ADR-101 (this Cog)."
publish = false

[[bin]]
name = "cog-pose-estimation"
path = "src/main.rs"

[lib]
name = "cog_pose_estimation"
path = "src/lib.rs"

[dependencies]
clap = { version = "4", features = ["derive"] }
serde = { version = "1", features = ["derive"] }
serde_json = "1"
thiserror = "1"
tracing = "0.1"
tracing-subscriber = { version = "0.3", features = ["env-filter", "json"] }
tokio = { version = "1", features = ["rt-multi-thread", "macros", "signal", "time"] }
sha2 = "0.10"
hex = "0.4"
# Sensing-server subscriber over HTTP: kept minimal; no full reqwest dep
ureq = { version = "2", default-features = false, features = ["tls"] }
# Inference backend: Candle, CPU by default. The `cuda` feature gate
# below pulls in CUDA support on hosts that have it. Pinned to 0.9 to
# match the training script that produced pose_v1.safetensors.
candle-core = { version = "0.9", default-features = false }
candle-nn = { version = "0.9", default-features = false }
safetensors = "0.4"
# wifi-densepose-train re-exports the model types we need; depend by path
# inside the workspace.
wifi-densepose-train = { path = "../wifi-densepose-train", default-features = false }

[dev-dependencies]
tempfile = "3"

[features]
default = []
# Use CUDA for inference on hosts with a CUDA-capable GPU. Off by
# default so CI on plain Linux/Windows boxes still builds; flip on for
# the GPU-dev path on ruvultra.
cuda = ["candle-core/cuda", "candle-nn/cuda"]
# Stub for the future Hailo HEF runtime path. The actual Hailo
# integration lives in the companion v0-appliance crate `cognitum-hailo`;
# this crate keeps a feature flag so the binary can compile without the
# Hailo SDK in CI.
hailo = []
|
|
@ -0,0 +1,57 @@
|
|||
# Build / sign / upload pipeline for cog-pose-estimation.
# See ADR-100 §"Build pipeline" for the full contract.

CRATE := cog-pose-estimation
VERSION := $(shell cargo pkgid -p $(CRATE) 2>/dev/null | sed -E 's/.*#([0-9.]+).*/\1/')
GCS_BUCKET := gs://cognitum-apps/cogs

ARCHES := arm x86_64

# CRATE already carries the "cog-" prefix, so artifacts are named
# $(CRATE)-<arch> to match the published objects (cog-pose-estimation-arm).

# --- Build targets ---

.PHONY: build build-arm build-x86_64

build: build-arm build-x86_64

build-arm:
	cargo build -p $(CRATE) --release --target aarch64-unknown-linux-gnu
	mkdir -p dist
	cp ../../target/aarch64-unknown-linux-gnu/release/$(CRATE) ./dist/$(CRATE)-arm

build-x86_64:
	cargo build -p $(CRATE) --release --target x86_64-unknown-linux-gnu
	mkdir -p dist
	cp ../../target/x86_64-unknown-linux-gnu/release/$(CRATE) ./dist/$(CRATE)-x86_64

# --- Sign ---

.PHONY: sign sign-arm sign-x86_64

sign: sign-arm sign-x86_64

sign-arm: dist/$(CRATE)-arm
	sha256sum dist/$(CRATE)-arm | cut -d' ' -f1 > dist/$(CRATE)-arm.sha256
	# Signature: gcloud secrets versions access latest --secret=COGNITUM_OWNER_SIGNING_KEY \
	#   | openssl pkeyutl -sign -inkey /dev/stdin -rawin -in dist/$(CRATE)-arm.sha256 \
	#   | base64 -w0 > dist/$(CRATE)-arm.sig
	@echo "TODO: wire Ed25519 sign step once COGNITUM_OWNER_SIGNING_KEY is provisioned to CI."

sign-x86_64: dist/$(CRATE)-x86_64
	sha256sum dist/$(CRATE)-x86_64 | cut -d' ' -f1 > dist/$(CRATE)-x86_64.sha256

# --- Upload to GCS ---

.PHONY: upload upload-arm upload-x86_64

upload: upload-arm upload-x86_64

upload-arm: dist/$(CRATE)-arm
	gsutil cp dist/$(CRATE)-arm $(GCS_BUCKET)/arm/$(CRATE)-arm

upload-x86_64: dist/$(CRATE)-x86_64
	gsutil cp dist/$(CRATE)-x86_64 $(GCS_BUCKET)/x86_64/$(CRATE)-x86_64

# --- Manifest ---

.PHONY: manifest

manifest:
	@./scripts/render-manifest.sh $(VERSION)
|
|
@ -0,0 +1,68 @@
|
|||
# Pose Estimation Cog

17-keypoint COCO pose estimation from WiFi CSI, deployed as a [Cognitum Cog](../../../../docs/adr/ADR-100-cog-packaging-specification.md).

## What it does

Subscribes to the local sensing-server's CSI stream, runs each window through a contrastive encoder (initialised from [`ruvnet/wifi-densepose-pretrained`](https://huggingface.co/ruvnet/wifi-densepose-pretrained)) and a 17-keypoint regression head, and emits one `pose.frame` event per inferred window on stdout. The appliance's cog-gateway picks up those events and routes them to the dashboard.

## Inputs

- `[56 subcarriers × 20 frames]` CSI windows (matches the `[56, 20]` shape produced by `scripts/align-ground-truth.js`).
- Sensing-server frame poll URL configured via `config.json` (`sensing_url`, default loopback).

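The `[56, 20]` input window can be pictured as a fixed-length buffer over the most recent CSI frames. The Python sketch below illustrates that idea only; the class name, the stride of one frame, and the subcarrier-major output layout are assumptions, and the cog's actual windowing lives in its Rust runtime.

```python
from collections import deque

N_SUB = 56     # subcarriers per CSI frame
N_FRAMES = 20  # frames per window


class CsiWindow:
    """Keeps the latest N_FRAMES frames and exposes them as [N_SUB][N_FRAMES]."""

    def __init__(self, n_sub=N_SUB, n_frames=N_FRAMES):
        self.n_sub = n_sub
        self.n_frames = n_frames
        self._frames = deque(maxlen=n_frames)  # oldest frame drops automatically

    def push(self, frame):
        """Add one frame; return a full window, or None while still filling."""
        if len(frame) != self.n_sub:
            raise ValueError(f"expected {self.n_sub} subcarriers, got {len(frame)}")
        self._frames.append(list(frame))
        if len(self._frames) < self.n_frames:
            return None
        # Transpose [frame][subcarrier] into [subcarrier][frame].
        return [[f[s] for f in self._frames] for s in range(self.n_sub)]
```

With a stride of one, every new frame after the first 20 yields a fresh window, so a 40 ms poll interval would produce up to 25 windows per second.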
## Outputs

```json
{"ts": 1779210883.444, "level": "info", "event": "pose.frame",
 "fields": {
   "tick": 12345,
   "n_persons": 1,
   "persons": [{"keypoints": [[0.48, 0.31], ...], "confidence": 0.81}]
 }}
```

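A consumer of these events might attach joint names to the 17 keypoints. The sketch below assumes the keypoints arrive in the standard COCO order and as `[x, y]` pairs in normalized coordinates; neither is stated explicitly in this README, and the function name is hypothetical.

```python
import json

# Standard COCO 17-keypoint order (assumed to match the cog's output order).
COCO_KEYPOINTS = [
    "nose", "l_eye", "r_eye", "l_ear", "r_ear",
    "l_shoulder", "r_shoulder", "l_elbow", "r_elbow",
    "l_wrist", "r_wrist", "l_hip", "r_hip",
    "l_knee", "r_knee", "l_ankle", "r_ankle",
]


def named_keypoints(event_line):
    """Parse one JSON event line; return a list of per-person keypoint dicts."""
    record = json.loads(event_line)
    if record.get("event") != "pose.frame":
        return []
    people = []
    for person in record["fields"]["persons"]:
        keypoints = person["keypoints"]
        if len(keypoints) != len(COCO_KEYPOINTS):
            raise ValueError(f"expected 17 keypoints, got {len(keypoints)}")
        people.append({
            "confidence": person["confidence"],
            "keypoints": {
                name: (xy[0], xy[1]) for name, xy in zip(COCO_KEYPOINTS, keypoints)
            },
        })
    return people
```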
## Status: v0.0.1

Pipeline scaffold + a first-cut trained model. The model is stored at `cog/artifacts/pose_v1.safetensors` (507 KB) and trained from `data/paired/wiflow-p7-1779210883.paired.jsonl` (1,077 samples, avg conf 0.44) using `candle-core 0.9` on an RTX 5080; see the full training-result dump at `cog/artifacts/train_results.json`.

### Measured accuracy (validation set, 216 held-out samples)

```
Overall:  PCK@20 = 3.0%   PCK@50 = 18.5%   MPJPE (normalized) = 0.0931

Per-joint   PCK@20  PCK@50    Per-joint  PCK@20  PCK@50
─────────   ──────  ──────    ─────────  ──────  ──────
nose          0.5%    5.1%    l_hip        0.0%   27.3%
l_eye         2.8%    8.3%    r_hip       25.0%   76.9%  ← strongest signal
r_eye         1.9%   15.7%    l_knee       2.3%   20.8%
l_ear         0.0%    3.2%    r_knee       0.9%   35.2%
r_ear         1.9%    9.7%    l_ankle      1.4%    7.9%
l_shoulder    4.6%    8.8%    r_ankle      0.9%    9.3%
r_shoulder    1.9%   19.9%    l_elbow      1.9%   26.4%
l_wrist       3.2%   24.1%    r_elbow      0.0%    4.2%
r_wrist       1.4%   12.0%
```

Loss curve: 0.181 (epoch 0) → 0.014 (epoch 399), eval loss 0.010. **400 epochs in 2.1 s** on the RTX 5080 (~5 ms/epoch full-batch).

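`cog/artifacts/train_results.json` records a `temporal_80_20` split of the 1,077 samples into 861 training and 216 evaluation samples. The sketch below shows one way such a split and the per-epoch time can be reproduced; the `ts` field name and the floor rounding are assumptions, not details taken from the training script.

```python
def temporal_split(samples, train_frac=0.8, key=lambda s: s["ts"]):
    """Sort by time, then use the earliest train_frac for training."""
    ordered = sorted(samples, key=key)
    n_train = int(len(ordered) * train_frac)  # floor, so 1077 -> 861
    return ordered[:n_train], ordered[n_train:]


# Stand-in samples with a hypothetical timestamp field.
samples = [{"ts": t} for t in range(1077)]
train, held_out = temporal_split(samples)
print(len(train), len(held_out))  # 861 216

# Per-epoch wall time from the figures quoted above.
ms_per_epoch = 2.1 / 400 * 1000
print(f"{ms_per_epoch:.2f} ms/epoch")
```

A temporal split keeps every evaluation sample later in time than every training sample, which avoids leaking near-duplicate adjacent frames into the held-out set.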
### Honest reading

- The model **learns coarse body structure**: `r_hip` 77% PCK@50, `r_knee` 35%, and `l_elbow` 26% all show real signal. PCK@50 = 18.5% averaged across joints is well above the 0% baseline that the pure-JS SPSA training produced.
- It is **below the ADR-079 target of PCK@20 ≥ 35%**. The bottleneck is data quality and quantity, not infra. The single 30-min seated-at-desk recording produced 1,077 paired samples at avg confidence 0.44; the strong left/right asymmetry (`r_hip` 77% vs `l_hip` 27%) reflects the camera framing more than any model defect.
- Distal joints (wrists, ankles) and face joints are still near-random: 56-subcarrier CSI at our 20-frame window doesn't carry enough fine-grained spatial information.

### Next-iteration plan (tracked in [#640](https://github.com/ruvnet/RuView/issues/640))

- Multi-session, multi-room recordings with **full-body framing** (target ≥ 30K paired samples at conf ≥ 0.7).
- Re-train with the same Candle pipeline (already validated to converge in seconds on RTX 5080).
- Hailo HEF export via the Dataflow Compiler on a self-hosted runner.

The cog's runtime inference path is currently a centred-skeleton stub returning `confidence=0`. Wiring the `pose_v1.safetensors` weights into `src/inference.rs` is the next code change, in a separate PR.

## See also

- ADR-100: Cognitum Cog Packaging Specification.
- ADR-101: Pose Estimation Cog (the design behind this directory).
- ADR-079: Camera-supervised pose training pipeline.
- v0-appliance companion crate: `cognitum-pose-estimation` (Hailo HEF runtime).
|
|
@ -0,0 +1,25 @@
|
|||
{
  "id": "pose-estimation",
  "version": "0.0.1",
  "binary_url": "https://storage.googleapis.com/cognitum-apps/cogs/arm/cog-pose-estimation-arm",
  "binary_bytes": 3741976,
  "binary_sha256": "1e1a7d3dd01ca05d5bfc5dbb142a5941b7866ed9f3224a21edc04d3f09a99bf5",
  "binary_signature": "LUN7xqLPYD3MFzm5dKB5MnYU0LvoRtek5ci5KiKPHBg+Xo6xuazwokn2Dw2JPMaLYJzmWn/SpT4djuR7hYvVDw==",
  "weights_url": "https://storage.googleapis.com/cognitum-apps/cogs/arm/cog-pose-estimation-pose_v1.safetensors",
  "weights_bytes": 507032,
  "weights_sha256": "eb249b9a6b2e10130437a10976ed0230b0d085f86a0553d7226e1ae6eae4b9e5",
  "arch": "arm",
  "target_triple": "aarch64-unknown-linux-gnu",
  "installed_at": 0,
  "status": "installed",
  "signed_by": "COGNITUM_OWNER_SIGNING_KEY",
  "sig_algo": "Ed25519",
  "build_metadata": {
    "rust": "1.95.0",
    "candle": "0.9 cpu",
    "cog_pose_version": "0.3.0",
    "training_pck20": 3.0,
    "training_pck50": 18.5,
    "training_mpjpe_normalized": 0.0931
  }
}
Binary file not shown.
Binary file not shown.
|
|
@ -0,0 +1,573 @@
|
|||
{
|
||||
"backend": "candle-cuda",
|
||||
"data": {
|
||||
"eval_samples": 216,
|
||||
"split": "temporal_80_20",
|
||||
"split_timestamp": "2026-05-19T17:38:45.486Z",
|
||||
"total_samples": 1077,
|
||||
"train_samples": 861
|
||||
},
|
||||
"encoder_init": "random",
|
||||
"epoch_losses": [
|
||||
0.1808941662311554,
|
||||
0.16265815496444702,
|
||||
0.13955898582935333,
|
||||
0.12225159257650375,
|
||||
0.10377667844295502,
|
||||
0.08922480046749115,
|
||||
0.076103076338768,
|
||||
0.06308665871620178,
|
||||
0.049426380544900894,
|
||||
0.039140596985816956,
|
||||
0.030129408463835716,
|
||||
0.025303713977336884,
|
||||
0.022442471235990524,
|
||||
0.02088615857064724,
|
||||
0.02010779082775116,
|
||||
0.01956109143793583,
|
||||
0.01948179118335247,
|
||||
0.019212622195482254,
|
||||
0.019074730575084686,
|
||||
0.018810957670211792,
|
||||
0.01868920773267746,
|
||||
0.01838303543627262,
|
||||
0.018172571435570717,
|
||||
0.017943259328603745,
|
||||
0.01760796643793583,
|
||||
0.01735210232436657,
|
||||
0.016929639503359795,
|
||||
0.01662956178188324,
|
||||
0.016312643885612488,
|
||||
0.016049085184931755,
|
||||
0.015733029693365097,
|
||||
0.01548701710999012,
|
||||
0.015283167362213135,
|
||||
0.014983722940087318,
|
||||
0.014812562614679337,
|
||||
0.01465131901204586,
|
||||
0.014480160549283028,
|
||||
0.014315342530608177,
|
||||
0.014290803112089634,
|
||||
0.014210136607289314,
|
||||
0.014109139330685139,
|
||||
0.014035886153578758,
|
||||
0.014050519093871117,
|
||||
0.013955573551356792,
|
||||
0.013999568298459053,
|
||||
0.014035838656127453,
|
||||
0.013971822336316109,
|
||||
0.013921688310801983,
|
||||
0.013923658058047295,
|
||||
0.014015297405421734,
|
||||
0.014005525968968868,
|
||||
0.013793034479022026,
|
||||
0.014398499391973019,
|
||||
0.016041349619627,
|
||||
0.018437474966049194,
|
||||
0.019666751846671104,
|
||||
0.01953406259417534,
|
||||
0.018313558772206306,
|
||||
0.016403522342443466,
|
||||
0.014824355952441692,
|
||||
0.014008168131113052,
|
||||
0.013724717311561108,
|
||||
0.013581405393779278,
|
||||
0.013707487843930721,
|
||||
0.01353893056511879,
|
||||
0.013217244297266006,
|
||||
0.012987865135073662,
|
||||
0.012728189118206501,
|
||||
0.01254442147910595,
|
||||
0.012492014095187187,
|
||||
0.012401513755321503,
|
||||
0.012278808280825615,
|
||||
0.012222359888255596,
|
||||
0.012228039093315601,
|
||||
0.012238679453730583,
|
||||
0.012207139283418655,
|
||||
0.012071969918906689,
|
||||
0.012182669714093208,
|
||||
0.011957147158682346,
|
||||
0.011931930668652058,
|
||||
0.011995002627372742,
|
||||
0.012032398954033852,
|
||||
0.011852897703647614,
|
||||
0.011876476928591728,
|
||||
0.011844047345221043,
|
||||
0.011939700692892075,
|
||||
0.011796612292528152,
|
||||
0.01177540048956871,
|
||||
0.011741355061531067,
|
||||
0.011779669672250748,
|
||||
0.011744190007448196,
|
||||
0.011707762256264687,
|
||||
0.011584608815610409,
|
||||
0.011752696707844734,
|
||||
0.011729150079190731,
|
||||
0.011659013107419014,
|
||||
0.011693276464939117,
|
||||
0.011864989064633846,
|
||||
0.011667383834719658,
|
||||
0.011718816123902798,
|
||||
0.01166768092662096,
|
||||
0.011662120930850506,
|
||||
0.011931229382753372,
|
||||
0.012049584649503231,
|
||||
0.012037307024002075,
|
||||
0.01206426601856947,
|
||||
0.012293326668441296,
|
||||
0.012212480418384075,
|
||||
0.01250689011067152,
|
||||
0.012488565407693386,
|
||||
0.012466518208384514,
|
||||
0.012616620399057865,
|
||||
0.012812258675694466,
|
||||
0.013071495108306408,
|
||||
0.013044825755059719,
|
||||
0.01321423426270485,
|
||||
0.013319150544703007,
|
||||
0.013587700203061104,
|
||||
0.013670523650944233,
|
||||
0.01378133799880743,
|
||||
0.014047945849597454,
|
||||
0.013731345534324646,
|
||||
0.014244080521166325,
|
||||
0.014112128876149654,
|
||||
0.014279313385486603,
|
||||
0.014710888266563416,
|
||||
0.01515843067318201,
|
||||
0.014713115990161896,
|
||||
0.014796034432947636,
|
||||
0.01475681271404028,
|
||||
0.014950357377529144,
|
||||
0.015005035325884819,
|
||||
0.014768424443900585,
|
||||
0.015024485997855663,
|
||||
0.015059541910886765,
|
||||
0.015051408670842648,
|
||||
0.015090585686266422,
|
||||
0.015175160020589828,
|
||||
0.015102844685316086,
|
||||
0.015151201747357845,
|
||||
0.015226155519485474,
|
||||
0.015032590366899967,
|
||||
0.015155772678554058,
|
||||
0.01507557276636362,
|
||||
0.015160820446908474,
|
||||
0.015019215643405914,
|
||||
0.015037509612739086,
|
||||
0.015222272835671902,
|
||||
0.015005122870206833,
|
||||
0.015173210762441158,
|
||||
0.015132835134863853,
|
||||
0.027589134871959686,
|
||||
0.07165955752134323,
|
||||
0.06373818218708038,
|
||||
0.06655537337064743,
|
||||
0.07562592625617981,
|
||||
0.06909485161304474,
|
||||
0.05691340193152428,
|
||||
0.048039719462394714,
|
||||
0.040047839283943176,
|
||||
0.034030981361866,
|
||||
0.02623862214386463,
|
||||
0.02114911563694477,
|
||||
0.018268009647727013,
|
||||
0.01640227809548378,
|
||||
0.01537158153951168,
|
||||
0.014892393723130226,
|
||||
0.014505675993859768,
|
||||
0.014186820015311241,
|
||||
0.013841629028320312,
|
||||
0.013426804915070534,
|
||||
0.013020739890635014,
|
||||
0.012673602439463139,
|
||||
0.012330775149166584,
|
||||
0.01226764265447855,
|
||||
0.012166578322649002,
|
||||
0.012095688842236996,
|
||||
0.012270377948880196,
|
||||
0.012516235001385212,
|
||||
0.012700744904577732,
|
||||
0.012992565520107746,
|
||||
0.013367722742259502,
|
||||
0.013592609204351902,
|
||||
0.013607893139123917,
|
||||
0.013697323389351368,
|
||||
0.013854263350367546,
|
||||
0.013832741416990757,
|
||||
0.01367993839085102,
|
||||
0.013867720030248165,
|
||||
0.013601685874164104,
|
||||
0.013631370849907398,
|
||||
0.013577244244515896,
|
||||
0.013414927758276463,
|
||||
0.013450143858790398,
|
||||
0.013431857340037823,
|
||||
0.01343410275876522,
|
||||
0.013244441710412502,
|
||||
0.013297016732394695,
|
||||
0.01346137747168541,
|
||||
0.01331599336117506,
|
||||
0.014807604253292084,
|
||||
0.014646961353719234,
|
||||
0.014483925886452198,
|
||||
0.014267523773014545,
|
||||
0.014087164774537086,
|
||||
0.013921936973929405,
|
||||
0.013723043724894524,
|
||||
0.013571077957749367,
|
||||
0.013395787216722965,
|
||||
0.013234280981123447,
|
||||
0.013133431784808636,
|
||||
0.013057147152721882,
|
||||
0.012962305918335915,
|
||||
0.012835373170673847,
|
||||
0.012728667818009853,
|
||||
0.012636503204703331,
|
||||
0.012564707547426224,
|
||||
0.01253308542072773,
|
||||
0.012460188008844852,
|
||||
0.012445810250937939,
|
||||
0.01240697130560875,
|
||||
0.012377945706248283,
|
||||
0.012340536341071129,
|
||||
0.01233599055558443,
|
||||
0.012312998063862324,
|
||||
0.012278364971280098,
|
||||
0.012224015779793262,
|
||||
0.012239382602274418,
|
||||
0.012242404744029045,
|
||||
0.012323223985731602,
|
||||
0.012205271050333977,
|
||||
0.012227945029735565,
|
||||
0.012205214239656925,
|
||||
0.012209423817694187,
|
||||
0.01217598281800747,
|
||||
0.012150637805461884,
|
||||
0.01217078510671854,
|
||||
0.01225175429135561,
|
||||
0.012216047383844852,
|
||||
0.012195242568850517,
|
||||
0.012198278680443764,
|
||||
0.012190825305879116,
|
||||
0.012173629365861416,
|
||||
0.012157510966062546,
|
||||
0.012140096165239811,
|
||||
0.012207810766994953,
|
||||
0.012194979004561901,
|
||||
0.01217165682464838,
|
||||
0.01216792967170477,
|
||||
0.01218471210449934,
|
||||
0.012194857932627201,
|
||||
0.012163667008280754,
|
||||
0.012145694345235825,
|
||||
0.012135420925915241,
|
||||
0.012164837680757046,
|
||||
0.01216159388422966,
|
||||
0.012148530222475529,
|
||||
0.012224133126437664,
|
||||
0.012155838310718536,
|
||||
0.012177230790257454,
|
||||
0.012110436335206032,
|
||||
0.012090248055756092,
|
||||
0.012101170606911182,
|
||||
0.012153848074376583,
|
||||
0.012173553928732872,
|
||||
0.012172674760222435,
|
||||
    0.012157287448644638,
    0.012172986753284931,
    0.012137886136770248,
    0.012157085351645947,
    0.012121357955038548,
    0.012135915458202362,
    0.012176922522485256,
    0.012193577364087105,
    0.012180276215076447,
    0.012223861180245876,
    0.012179303914308548,
    0.012176022864878178,
    0.012092312797904015,
    0.012138010933995247,
    0.01214117556810379,
    0.012276227585971355,
    0.012187770567834377,
    0.012211603112518787,
    0.012213931418955326,
    0.012225016951560974,
    0.012142234481871128,
    0.012134073302149773,
    0.012163194827735424,
    0.01223068218678236,
    0.012200715951621532,
    0.012191612273454666,
    0.01220244076102972,
    0.01220419630408287,
    0.012142208404839039,
    0.012142272666096687,
    0.01212950050830841,
    0.012169948779046535,
    0.012184932827949524,
    0.012199781835079193,
    0.012189080938696861,
    0.012251517735421658,
    0.012228423729538918,
    0.012237711809575558,
    0.012216192670166492,
    0.012263692915439606,
    0.012285872362554073,
    0.012329400517046452,
    0.012345477007329464,
    0.012416589073836803,
    0.012419192120432854,
    0.012471407651901245,
    0.012412074953317642,
    0.012433832511305809,
    0.01246955618262291,
    0.012568573467433453,
    0.012632711790502071,
    0.01270760502666235,
    0.012691991403698921,
    0.012749818153679371,
    0.012748819775879383,
    0.01276922132819891,
    0.012770597822964191,
    0.012830909341573715,
    0.012891922146081924,
    0.012974675744771957,
    0.01295324694365263,
    0.01304001547396183,
    0.0130251320078969,
    0.013028905726969242,
    0.012945529073476791,
    0.013016759417951107,
    0.013065450824797153,
    0.013240920379757881,
    0.013167147524654865,
    0.013239633291959763,
    0.013240372762084007,
    0.013296829536557198,
    0.01322928350418806,
    0.013259101659059525,
    0.013233119621872902,
    0.013339969329535961,
    0.013323795981705189,
    0.013341942802071571,
    0.013390406966209412,
    0.013395088724792004,
    0.013347778469324112,
    0.013323097489774227,
    0.013308844529092312,
    0.01338045671582222,
    0.013418255373835564,
    0.013455703854560852,
    0.01349731907248497,
    0.013548982329666615,
    0.013543978333473206,
    0.013514911755919456,
    0.013511871919035912,
    0.01351082045584917,
    0.01348851714283228,
    0.013556062243878841,
    0.013558348640799522,
    0.013616240583360195,
    0.013577889651060104,
    0.013577991165220737,
    0.013531915843486786,
    0.013514644466340542,
    0.01348655391484499,
    0.013568769209086895,
    0.013610766269266605,
    0.013646356761455536,
    0.013650151900947094,
    0.013662545941770077,
    0.013631481677293777,
    0.013629746623337269,
    0.01362497080117464,
    0.013645497150719166,
    0.013664674945175648,
    0.013721015304327011,
    0.013627894222736359,
    0.013688581064343452,
    0.013681283220648766,
    0.013655297458171844,
    0.013539095409214497,
    0.013555340468883514,
    0.013566684909164906,
    0.013745179399847984,
    0.013687034137547016,
    0.013702981173992157,
    0.01367457490414381,
    0.013732061721384525,
    0.01364122238010168,
    0.013664795085787773,
    0.013612691313028336,
    0.013709086924791336,
    0.013684045523405075,
    0.013670985586941242,
    0.013698549009859562,
    0.013667520135641098,
    0.013631648384034634,
    0.013607441447675228
  ],
  "epochs": 400,
  "final_eval_loss": 0.010066533461213112,
  "hyperparameters": {
    "augmentation": "subcarrier_dropout_10pct (last 50 epochs)",
    "base_lr": 0.001,
    "batch_mode": "full_batch",
    "loss": "SmoothL1 (Huber beta=0.1)",
    "optimizer": "AdamW",
    "schedule": "cosine",
    "weight_decay": 0.01
  },
  "model": {
    "encoder": "Conv1d(56->64->128->128, k=3, dilation=[1,2,4]) + GlobalMeanPool",
    "head": "Linear(128->256)->ReLU->Linear(256->34)->Sigmoid",
    "parameters": 126562
  },
  "mpjpe_normalized": 0.09310426687050756,
  "pck_at_20": 2.968409586056645,
  "pck_at_50": 18.51851851851852,
  "per_joint_pck20": [
    {
      "joint": "nose",
      "pck20": 0.4629629629629629
    },
    {
      "joint": "l_eye",
      "pck20": 2.7777777777777777
    },
    {
      "joint": "r_eye",
      "pck20": 1.8518518518518516
    },
    {
      "joint": "l_ear",
      "pck20": 0.0
    },
    {
      "joint": "r_ear",
      "pck20": 1.8518518518518516
    },
    {
      "joint": "l_shoulder",
      "pck20": 4.62962962962963
    },
    {
      "joint": "r_shoulder",
      "pck20": 1.8518518518518516
    },
    {
      "joint": "l_elbow",
      "pck20": 1.8518518518518516
    },
    {
      "joint": "r_elbow",
      "pck20": 0.0
    },
    {
      "joint": "l_wrist",
      "pck20": 3.2407407407407405
    },
    {
      "joint": "r_wrist",
      "pck20": 1.3888888888888888
    },
    {
      "joint": "l_hip",
      "pck20": 0.0
    },
    {
      "joint": "r_hip",
      "pck20": 25.0
    },
    {
      "joint": "l_knee",
      "pck20": 2.314814814814815
    },
    {
      "joint": "r_knee",
      "pck20": 0.9259259259259258
    },
    {
      "joint": "l_ankle",
      "pck20": 1.3888888888888888
    },
    {
      "joint": "r_ankle",
      "pck20": 0.9259259259259258
    }
  ],
  "per_joint_pck50": [
    {
      "joint": "nose",
      "pck50": 5.092592592592593
    },
    {
      "joint": "l_eye",
      "pck50": 8.333333333333332
    },
    {
      "joint": "r_eye",
      "pck50": 15.74074074074074
    },
    {
      "joint": "l_ear",
      "pck50": 3.2407407407407405
    },
    {
      "joint": "r_ear",
      "pck50": 9.722222222222223
    },
    {
      "joint": "l_shoulder",
      "pck50": 8.796296296296296
    },
    {
      "joint": "r_shoulder",
      "pck50": 19.90740740740741
    },
    {
      "joint": "l_elbow",
      "pck50": 26.38888888888889
    },
    {
      "joint": "r_elbow",
      "pck50": 4.166666666666666
    },
    {
      "joint": "l_wrist",
      "pck50": 24.074074074074073
    },
    {
      "joint": "r_wrist",
      "pck50": 12.037037037037036
    },
    {
      "joint": "l_hip",
      "pck50": 27.314814814814813
    },
    {
      "joint": "r_hip",
      "pck50": 76.85185185185185
    },
    {
      "joint": "l_knee",
      "pck50": 20.833333333333336
    },
    {
      "joint": "r_knee",
      "pck50": 35.18518518518518
    },
    {
      "joint": "l_ankle",
      "pck50": 7.87037037037037
    },
    {
      "joint": "r_ankle",
      "pck50": 9.25925925925926
    }
  ],
  "train_time_s": 2.058459526
}
|
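Reviewer note: the aggregate `pck_at_20` in the metrics file appears to be the unweighted mean of the 17 per-joint values. The sketch below hard-codes the per-joint numbers from the file and recomputes that mean. The names `PER_JOINT_PCK20` and `mean_pck` are illustrative and are not part of the crate.

```rust
/// Per-joint PCK@20 values (percent), copied from the metrics file in file order.
const PER_JOINT_PCK20: [f64; 17] = [
    0.4629629629629629, // nose
    2.7777777777777777, // l_eye
    1.8518518518518516, // r_eye
    0.0,                // l_ear
    1.8518518518518516, // r_ear
    4.62962962962963,   // l_shoulder
    1.8518518518518516, // r_shoulder
    1.8518518518518516, // l_elbow
    0.0,                // r_elbow
    3.2407407407407405, // l_wrist
    1.3888888888888888, // r_wrist
    0.0,                // l_hip
    25.0,               // r_hip
    2.314814814814815,  // l_knee
    0.9259259259259258, // r_knee
    1.3888888888888888, // l_ankle
    0.9259259259259258, // r_ankle
];

/// Unweighted mean over joints (hypothetical helper, not from the crate).
fn mean_pck(values: &[f64]) -> f64 {
    values.iter().sum::<f64>() / values.len() as f64
}

fn main() {
    let mean = mean_pck(&PER_JOINT_PCK20);
    println!("mean per-joint PCK@20 = {mean:.6}%");
}
```

The recomputed mean agrees with the reported `pck_at_20` of 2.968409586056645. Note that `r_hip` alone contributes 25/17, or about 1.47 of those percentage points, so the aggregate is dominated by a single joint.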
@ -0,0 +1,34 @@
{
  "$schema": "https://json-schema.org/draft/2020-12/schema",
  "$id": "https://cognitum.one/schemas/cog-pose-estimation-config-v1.json",
  "title": "Pose Estimation Cog Runtime Config",
  "type": "object",
  "additionalProperties": false,
  "properties": {
    "sensing_url": {
      "type": "string",
      "format": "uri",
      "default": "http://127.0.0.1:3000/api/v1/sensing/latest",
      "description": "URL of the local sensing-server's latest-snapshot endpoint."
    },
    "model_path": {
      "type": "string",
      "description": "Filesystem path to the model weights (safetensors or Hailo HEF). Resolved relative to /var/lib/cognitum/apps/pose-estimation/ when not absolute."
    },
    "poll_ms": {
      "type": "integer",
      "minimum": 10,
      "maximum": 1000,
      "default": 40,
      "description": "How often to poll the sensing-server in milliseconds."
    },
    "min_confidence": {
      "type": "number",
      "minimum": 0,
      "maximum": 1,
      "default": 0.3,
      "description": "Drop frames where the inferred pose confidence is below this threshold."
    }
  },
  "required": ["model_path"]
}
@ -0,0 +1,10 @@
{
  "id": "pose-estimation",
  "version": "{{VERSION}}",
  "binary_url": "https://storage.googleapis.com/cognitum-apps/cogs/{{ARCH}}/cog-pose-estimation-{{ARCH}}",
  "binary_bytes": 0,
  "binary_sha256": "",
  "binary_signature": "",
  "installed_at": 0,
  "status": "installed"
}
|
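Reviewer note: the manifest template carries `{{VERSION}}` and `{{ARCH}}` placeholders. The real substitution presumably happens in `cog/Makefile`, which is not shown in this diff, so the following is only a sketch of how such placeholders could be filled. The function `render_manifest` is hypothetical. The `arm` target name and the `0.0.1` version are taken from the commit message and the smoke test.

```rust
/// Substitute the `{{VERSION}}` and `{{ARCH}}` placeholders in a manifest
/// template. Hypothetical helper; the actual build pipeline may differ.
fn render_manifest(template: &str, version: &str, arch: &str) -> String {
    template
        .replace("{{VERSION}}", version)
        .replace("{{ARCH}}", arch)
}

fn main() {
    // Two fields from the template are enough to show the substitution.
    let template = r#"{"version": "{{VERSION}}", "binary_url": "https://storage.googleapis.com/cognitum-apps/cogs/{{ARCH}}/cog-pose-estimation-{{ARCH}}"}"#;
    let rendered = render_manifest(template, "0.0.1", "arm");
    println!("{rendered}");
}
```

Plain string replacement is enough here because the placeholders never appear inside other tokens; a pipeline that also fills `binary_bytes` and `binary_sha256` would more likely edit the parsed JSON instead.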
@ -0,0 +1,58 @@
//! Runtime configuration for the pose-estimation Cog.
//!
//! Schema lives at `cog/config.schema.json` so the appliance can validate
//! before launching the cog.

use serde::{Deserialize, Serialize};
use std::path::{Path, PathBuf};

#[derive(Debug, Clone, Serialize, Deserialize)]
#[serde(deny_unknown_fields)]
pub struct CogConfig {
    /// URL of the local sensing-server's frame feed.
    /// Defaults to the appliance's loopback sensing-server.
    #[serde(default = "default_sensing_url")]
    pub sensing_url: String,

    /// Path to the model weights bundle (safetensors or HEF).
    /// Resolved relative to the cog's install dir if not absolute.
    pub model_path: PathBuf,

    /// Frame poll interval in milliseconds.
    #[serde(default = "default_poll_ms")]
    pub poll_ms: u64,

    /// Confidence threshold below which a frame's keypoints are not emitted.
    #[serde(default = "default_min_confidence")]
    pub min_confidence: f32,
}

fn default_sensing_url() -> String {
    "http://127.0.0.1:3000/api/v1/sensing/latest".to_string()
}

fn default_poll_ms() -> u64 {
    40 // ~25 Hz to match ESP32 CSI rate
}

fn default_min_confidence() -> f32 {
    0.3
}

impl CogConfig {
    pub fn load(path: &Path) -> Result<Self, ConfigError> {
        let raw = std::fs::read_to_string(path)
            .map_err(|e| ConfigError::Read(path.to_path_buf(), e))?;
        let cfg: CogConfig =
            serde_json::from_str(&raw).map_err(|e| ConfigError::Parse(path.to_path_buf(), e))?;
        Ok(cfg)
    }
}

#[derive(Debug, thiserror::Error)]
pub enum ConfigError {
    #[error("failed to read config at {0}: {1}")]
    Read(PathBuf, std::io::Error),
    #[error("failed to parse config at {0}: {1}")]
    Parse(PathBuf, serde_json::Error),
}
|
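Reviewer note: the JSON schema bounds `poll_ms` to 10..=1000 and `min_confidence` to 0..=1, but the serde derive on `CogConfig` only checks types. A config that reaches `CogConfig::load` without prior schema validation could therefore carry out-of-range values. Below is a minimal sketch of a post-load range check. The `Limits` struct and `validate` function are stand-ins invented for illustration and do not exist in the crate.

```rust
/// Stand-in for the numeric fields of `CogConfig` (hypothetical).
struct Limits {
    poll_ms: u64,
    min_confidence: f32,
}

/// Mirror the `minimum`/`maximum` bounds from `cog/config.schema.json`.
fn validate(l: &Limits) -> Result<(), String> {
    if !(10..=1000).contains(&l.poll_ms) {
        return Err(format!("poll_ms {} outside 10..=1000", l.poll_ms));
    }
    // NaN also fails this check because every comparison with NaN is false.
    if !(0.0..=1.0).contains(&l.min_confidence) {
        return Err(format!("min_confidence {} outside 0..=1", l.min_confidence));
    }
    Ok(())
}

fn main() {
    let defaults = Limits { poll_ms: 40, min_confidence: 0.3 };
    println!("defaults: {:?}", validate(&defaults));

    let too_fast = Limits { poll_ms: 5, min_confidence: 0.3 };
    println!("poll_ms=5: {:?}", validate(&too_fast));
}
```

Whether this check belongs in the cog or stays solely with the appliance's schema validation is a design choice the diff does not settle.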
@ -0,0 +1,233 @@
//! Inference engine — loads `pose_v1.safetensors` (produced by the
//! Candle training run on `ruvultra`'s RTX 5080, see
//! `cog/artifacts/pose_v1.safetensors` + `docs/benchmarks/pose-estimation-cog.md`)
//! and runs the encoder + pose head on each CSI window.
//!
//! Architecture mirrors the training script exactly:
//!   Conv1d(56 -> 64, k=3, dilation=1, padding=1)
//!   Conv1d(64 -> 128, k=3, dilation=2, padding=2)
//!   Conv1d(128 -> 128, k=3, dilation=4, padding=4)
//!   mean over time -> [128]
//!   Linear(128 -> 256) -> ReLU
//!   Linear(256 -> 34) -> sigmoid -> reshape [17, 2]
//!
//! When the safetensors file is missing the engine falls back to a
//! centred-skeleton baseline with `confidence=0` so the cog still
//! satisfies the ADR-100 runtime contract and the dashboard surfaces
//! "no model yet" instead of dropping frames silently.

use candle_core::{DType, Device, Tensor};
use candle_nn::{Conv1d, Conv1dConfig, Linear, Module, VarBuilder};
use std::path::Path;
use std::sync::Arc;

/// 56 subcarriers × 20 frames per CSI window — matches the format
/// produced by `scripts/align-ground-truth.js` after #641.
pub const INPUT_SUBCARRIERS: usize = 56;
pub const INPUT_TIMESTEPS: usize = 20;
pub const OUTPUT_KEYPOINTS: usize = 17;

#[derive(Debug, Clone)]
pub struct CsiWindow {
    pub data: Vec<f32>, // length INPUT_SUBCARRIERS * INPUT_TIMESTEPS
}

#[derive(Debug, Clone)]
pub struct PoseOutput {
    /// Flat `[OUTPUT_KEYPOINTS * 2]` keypoints in `[0, 1]` normalised
    /// image coords, ordered (x0, y0, x1, y1, …).
    pub keypoints: Vec<f32>,
    pub confidence: f32,
}

impl PoseOutput {
    pub fn is_finite(&self) -> bool {
        self.keypoints.iter().all(|v| v.is_finite()) && self.confidence.is_finite()
    }
}

/// Internal model — mirrors the training script's `PoseModel` exactly.
struct PoseNet {
    c1: Conv1d,
    c2: Conv1d,
    c3: Conv1d,
    fc1: Linear,
    fc2: Linear,
}

impl PoseNet {
    fn new(vb: VarBuilder<'_>) -> candle_core::Result<Self> {
        let enc = vb.pp("enc");
        let head = vb.pp("head");

        let c1 = candle_nn::conv1d(
            56,
            64,
            3,
            Conv1dConfig { padding: 1, stride: 1, dilation: 1, groups: 1, ..Default::default() },
            enc.pp("c1"),
        )?;
        let c2 = candle_nn::conv1d(
            64,
            128,
            3,
            Conv1dConfig { padding: 2, stride: 1, dilation: 2, groups: 1, ..Default::default() },
            enc.pp("c2"),
        )?;
        let c3 = candle_nn::conv1d(
            128,
            128,
            3,
            Conv1dConfig { padding: 4, stride: 1, dilation: 4, groups: 1, ..Default::default() },
            enc.pp("c3"),
        )?;
        let fc1 = candle_nn::linear(128, 256, head.pp("fc1"))?;
        let fc2 = candle_nn::linear(256, 34, head.pp("fc2"))?;

        Ok(Self { c1, c2, c3, fc1, fc2 })
    }

    /// Forward pass: `[B, 56, 20]` -> `[B, 34]` in `[0, 1]`.
    fn forward(&self, x: &Tensor) -> candle_core::Result<Tensor> {
        let h = self.c1.forward(x)?.relu()?;
        let h = self.c2.forward(&h)?.relu()?;
        let h = self.c3.forward(&h)?.relu()?;
        // Global average pool over time dim (last dim) -> [B, 128]
        let h = h.mean(2)?;
        let h = self.fc1.forward(&h)?.relu()?;
        let h = self.fc2.forward(&h)?;
        // sigmoid -> keep in [0, 1]
        candle_nn::ops::sigmoid(&h)
    }
}

pub struct InferenceEngine {
    inner: Option<Arc<LoadedModel>>,
    device: Device,
}

struct LoadedModel {
    net: PoseNet,
}

impl InferenceEngine {
    /// Create an engine. Tries to load weights from `cog/artifacts/pose_v1.safetensors`
    /// (relative to current dir or the cog install dir under
    /// `/var/lib/cognitum/apps/pose-estimation/`). Returns a usable
    /// engine either way — without weights, `infer` produces the
    /// stub output.
    pub fn new() -> Result<Self, Box<dyn std::error::Error>> {
        Self::with_weights(default_weights_path().as_deref())
    }

    /// Create an engine with a specific weights path (used by `--config`
    /// in `cog-pose-estimation run`). If `weights_path` is `None` or the
    /// file does not exist, the stub fallback is used.
    pub fn with_weights(weights_path: Option<&Path>) -> Result<Self, Box<dyn std::error::Error>> {
        let device = pick_device();
        let inner = match weights_path {
            Some(p) if p.exists() => {
                // SAFETY: `from_mmaped_safetensors` mmaps the file for the
                // VarBuilder's lifetime. We don't modify the file while the
                // VarBuilder is alive, and the file is read-only on disk on
                // appliance installs.
                let vb = unsafe {
                    VarBuilder::from_mmaped_safetensors(&[p.to_path_buf()], DType::F32, &device)?
                };
                let net = PoseNet::new(vb)?;
                Some(Arc::new(LoadedModel { net }))
            }
            _ => None,
        };
        Ok(Self { inner, device })
    }

    /// Which backend serves inference: `candle-cuda`, `candle-cpu`, or
    /// `stub` when no weights were loaded. Reported in the `health.ok` event.
    pub fn backend(&self) -> &'static str {
        match (&self.inner, &self.device) {
            (Some(_), Device::Cuda(_)) => "candle-cuda",
            (Some(_), _) => "candle-cpu",
            (None, _) => "stub",
        }
    }

    pub fn infer(&self, window: &CsiWindow) -> Result<PoseOutput, Box<dyn std::error::Error>> {
        if window.data.len() != INPUT_SUBCARRIERS * INPUT_TIMESTEPS {
            return Err(format!(
                "expected {} input values, got {}",
                INPUT_SUBCARRIERS * INPUT_TIMESTEPS,
                window.data.len()
            )
            .into());
        }

        let Some(model) = &self.inner else {
            // Stub fallback — model not loaded.
            return Ok(PoseOutput {
                keypoints: vec![0.5f32; OUTPUT_KEYPOINTS * 2],
                confidence: 0.0,
            });
        };

        // Build [1, 56, 20] tensor from the flat row-major buffer.
        let t = Tensor::from_slice(
            &window.data,
            (1, INPUT_SUBCARRIERS, INPUT_TIMESTEPS),
            &self.device,
        )?;
        let out = model.net.forward(&t)?; // [1, 34]
        let flat: Vec<f32> = out.flatten_all()?.to_vec1()?;
        // pose_v1 has no confidence head, so confidence is a published
        // constant rather than a per-frame value: the validation-set PCK@50
        // (18.5%). Downstream consumers can gate display decisions on it.
        // Note that 0.185 is below the default `min_confidence` of 0.3, so
        // `run` emits these frames only when the config lowers the threshold.
        Ok(PoseOutput {
            keypoints: flat,
            confidence: 0.185,
        })
    }
}

/// Synthetic CSI window for the `health` subcommand. Zeros — exercises
/// the I/O surface; the model never touches values that produce NaN.
pub struct SyntheticInput;

impl Default for SyntheticInput {
    fn default() -> Self {
        Self
    }
}

impl SyntheticInput {
    pub fn as_window(&self) -> CsiWindow {
        CsiWindow {
            data: vec![0.0; INPUT_SUBCARRIERS * INPUT_TIMESTEPS],
        }
    }
}

// ---------------------------------------------------------------------------
// Helpers
// ---------------------------------------------------------------------------

fn pick_device() -> Device {
    #[cfg(feature = "cuda")]
    if let Ok(d) = Device::cuda_if_available(0) {
        return d;
    }
    Device::Cpu
}

fn default_weights_path() -> Option<std::path::PathBuf> {
    // Search in the order an installed Cog would see it.
    let candidates = [
        std::path::PathBuf::from("/var/lib/cognitum/apps/pose-estimation/pose_v1.safetensors"),
        std::path::PathBuf::from("./pose_v1.safetensors"),
        std::path::PathBuf::from("./cog/artifacts/pose_v1.safetensors"),
        // From the repo root.
        std::path::PathBuf::from("v2/crates/cog-pose-estimation/cog/artifacts/pose_v1.safetensors"),
        // From inside v2/.
        std::path::PathBuf::from("crates/cog-pose-estimation/cog/artifacts/pose_v1.safetensors"),
    ];
    candidates.into_iter().find(|p| p.exists())
}
|
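Reviewer note: two properties of the architecture in `inference.rs` can be checked with plain arithmetic. First, choosing `padding == dilation` with `k=3` and `stride=1` keeps the time dimension at 20 through all three convolutions. Second, the layer sizes (with biases) add up to the `parameters` value reported in the metrics file. The sketch below uses the standard convolution output-length formula; the helper names are illustrative, and the bias terms assume the default `candle_nn::conv1d` and `candle_nn::linear` constructors, which create a bias.

```rust
/// Output length of a 1-D convolution (standard formula, stride >= 1).
fn conv1d_out_len(len: usize, kernel: usize, padding: usize, dilation: usize, stride: usize) -> usize {
    (len + 2 * padding - dilation * (kernel - 1) - 1) / stride + 1
}

/// Weights plus biases of a Conv1d layer.
fn conv1d_params(c_in: usize, c_out: usize, kernel: usize) -> usize {
    c_in * c_out * kernel + c_out
}

/// Weights plus biases of a Linear layer.
fn linear_params(n_in: usize, n_out: usize) -> usize {
    n_in * n_out + n_out
}

/// Parameter count of the encoder + head described in the module docs.
fn total_params() -> usize {
    conv1d_params(56, 64, 3)
        + conv1d_params(64, 128, 3)
        + conv1d_params(128, 128, 3)
        + linear_params(128, 256)
        + linear_params(256, 34)
}

fn main() {
    // (padding, dilation) pairs for c1, c2, c3.
    for (padding, dilation) in [(1, 1), (2, 2), (4, 4)] {
        let t_out = conv1d_out_len(20, 3, padding, dilation, 1);
        println!("dilation={dilation}: T_out = {t_out}");
    }
    println!("parameters = {}", total_params());
}
```

Because the time dimension never shrinks, the global mean pool always averages over 20 positions, and the total of 126562 matches the metrics file.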
@ -0,0 +1,19 @@
//! `cog-pose-estimation` library surface.
//!
//! See `ADR-101` for the design and `ADR-100` for the surrounding Cog
//! packaging spec. This crate is intentionally a thin shell around
//! `wifi-densepose-train`'s exported model types — the heavy lifting
//! (encoder, pose head) lives there.

pub mod config;
pub mod inference;
pub mod manifest;
pub mod publisher;
pub mod runtime;

/// Cog identifier — matches the on-disk path
/// `/var/lib/cognitum/apps/pose-estimation/`.
pub const COG_ID: &str = "pose-estimation";

/// Cog version (sourced from Cargo.toml at build time).
pub const COG_VERSION: &str = env!("CARGO_PKG_VERSION");
@ -0,0 +1,116 @@
//! `cog-pose-estimation` — Cognitum Cog binary entrypoint.
//!
//! Implements the ADR-100 runtime contract:
//!   cog-pose-estimation version
//!   cog-pose-estimation manifest
//!   cog-pose-estimation health
//!   cog-pose-estimation run --config <path>
//!
//! `version` prints a plain `<id> <version>` line; the other subcommands
//! write structured JSON to stdout. `run` is long-running and emits one
//! `pose.frame` event per inferred CSI window.

use clap::{Parser, Subcommand};
use cog_pose_estimation::{
    config::CogConfig,
    inference::{InferenceEngine, SyntheticInput},
    manifest::ManifestSpec,
    publisher::{emit_event, Event},
};
use std::path::PathBuf;

const COG_ID: &str = "pose-estimation";
const COG_VERSION: &str = env!("CARGO_PKG_VERSION");

#[derive(Parser)]
#[command(name = COG_ID, version = COG_VERSION)]
#[command(about = "Cognitum Cog: 17-keypoint pose estimation from WiFi CSI", long_about = None)]
struct Cli {
    #[command(subcommand)]
    command: Cmd,
}

#[derive(Subcommand)]
enum Cmd {
    /// Print `<id> <version>` and exit.
    Version,
    /// Print the embedded manifest as JSON.
    Manifest,
    /// One-shot health check. Exit 0 if the cog can come up healthy.
    Health,
    /// Long-running inference loop.
    Run {
        /// Path to runtime config JSON. See `cog/config.schema.json`.
        #[arg(long, value_name = "PATH")]
        config: PathBuf,
    },
}

fn main() -> std::process::ExitCode {
    init_logging();

    let cli = Cli::parse();
    let result = match cli.command {
        Cmd::Version => cmd_version(),
        Cmd::Manifest => cmd_manifest(),
        Cmd::Health => cmd_health(),
        Cmd::Run { config } => cmd_run(config),
    };

    match result {
        Ok(()) => std::process::ExitCode::SUCCESS,
        Err(err) => {
            eprintln!("{COG_ID}: {err}");
            std::process::ExitCode::FAILURE
        }
    }
}

fn init_logging() {
    let _ = tracing_subscriber::fmt()
        .with_env_filter(
            tracing_subscriber::EnvFilter::try_from_default_env()
                .unwrap_or_else(|_| tracing_subscriber::EnvFilter::new("info")),
        )
        .with_target(false)
        .json()
        .try_init();
}

fn cmd_version() -> Result<(), Box<dyn std::error::Error>> {
    println!("{COG_ID} {COG_VERSION}");
    Ok(())
}

fn cmd_manifest() -> Result<(), Box<dyn std::error::Error>> {
    let spec = ManifestSpec::embedded(COG_ID, COG_VERSION);
    println!("{}", serde_json::to_string_pretty(&spec)?);
    Ok(())
}

fn cmd_health() -> Result<(), Box<dyn std::error::Error>> {
    let engine = InferenceEngine::new()?;
    let synthetic = SyntheticInput::default();
    let out = engine.infer(&synthetic.as_window())?;
    if out.is_finite() {
        emit_event(&Event::health_ok(
            COG_ID,
            engine.backend(),
            out.confidence,
        ));
        Ok(())
    } else {
        Err("inference produced non-finite output".into())
    }
}

fn cmd_run(config_path: PathBuf) -> Result<(), Box<dyn std::error::Error>> {
    let cfg = CogConfig::load(&config_path)?;
    emit_event(&Event::run_started(COG_ID, &cfg));

    // Load the weights named in the config. If that path does not exist,
    // the engine falls back to the stub.
    let engine = InferenceEngine::with_weights(Some(cfg.model_path.as_path()))?;
    let rt = tokio::runtime::Builder::new_multi_thread()
        .enable_all()
        .build()?;
    rt.block_on(cog_pose_estimation::runtime::run_loop(cfg, engine))?;
    Ok(())
}
@ -0,0 +1,37 @@
//! Cog manifest — see ADR-100 §"manifest.json schema".
//!
//! The `cog-pose-estimation manifest` subcommand emits the embedded spec
//! (no signature fields); the build pipeline post-processes it after
//! computing `binary_sha256` + `binary_signature`.

use serde::{Deserialize, Serialize};

#[derive(Debug, Clone, Serialize, Deserialize)]
#[serde(deny_unknown_fields)]
pub struct ManifestSpec {
    pub id: String,
    pub version: String,
    pub binary_url: Option<String>,
    pub binary_bytes: Option<u64>,
    pub binary_sha256: Option<String>,
    pub binary_signature: Option<String>,
    pub installed_at: Option<u64>,
    pub status: Option<String>,
}

impl ManifestSpec {
    /// The skeleton emitted by `cog-pose-estimation manifest` before the
    /// release pipeline fills in the signature/hash/url fields.
    pub fn embedded(id: &str, version: &str) -> Self {
        Self {
            id: id.to_string(),
            version: version.to_string(),
            binary_url: None,
            binary_bytes: None,
            binary_sha256: None,
            binary_signature: None,
            installed_at: None,
            status: None,
        }
    }
}
@ -0,0 +1,70 @@
//! Structured JSON event publisher — one line per event on stdout.
//!
//! Format is the ADR-100 runtime contract: `{ts, level, event, fields}`.

use serde::Serialize;
use serde_json::Value;
use std::time::{SystemTime, UNIX_EPOCH};

#[derive(Debug, Serialize)]
pub struct Event<'a> {
    pub ts: f64,
    pub level: &'a str,
    pub event: &'a str,
    pub fields: Value,
}

impl<'a> Event<'a> {
    pub fn health_ok(cog_id: &'a str, backend: &str, output_confidence: f32) -> Self {
        Self {
            ts: now_secs(),
            level: "info",
            event: "health.ok",
            fields: serde_json::json!({
                "cog": cog_id,
                "backend": backend,
                "synthetic_output_confidence": output_confidence,
            }),
        }
    }

    pub fn run_started(cog_id: &'a str, cfg: &crate::config::CogConfig) -> Self {
        Self {
            ts: now_secs(),
            level: "info",
            event: "run.started",
            fields: serde_json::json!({
                "cog": cog_id,
                "sensing_url": cfg.sensing_url,
                "model_path": cfg.model_path,
                "poll_ms": cfg.poll_ms,
            }),
        }
    }

    pub fn pose_frame(tick: u64, n_persons: usize, persons: Value) -> Self {
        Self {
            ts: now_secs(),
            level: "info",
            event: "pose.frame",
            fields: serde_json::json!({
                "tick": tick,
                "n_persons": n_persons,
                "persons": persons,
            }),
        }
    }
}

pub fn emit_event(ev: &Event<'_>) {
    if let Ok(line) = serde_json::to_string(ev) {
        println!("{line}");
    }
}

fn now_secs() -> f64 {
    SystemTime::now()
        .duration_since(UNIX_EPOCH)
        .map(|d| d.as_secs_f64())
        .unwrap_or(0.0)
}
@ -0,0 +1,80 @@
//! Long-running inference loop. Polls the appliance's sensing-server,
//! runs a CSI window through the engine, emits `pose.frame` events.

use crate::config::CogConfig;
use crate::inference::{CsiWindow, InferenceEngine, INPUT_SUBCARRIERS, INPUT_TIMESTEPS};
use crate::publisher::{emit_event, Event};
use std::time::Duration;
use tokio::time::sleep;

pub async fn run_loop(
    cfg: CogConfig,
    engine: InferenceEngine,
) -> Result<(), Box<dyn std::error::Error>> {
    let mut buffer: Vec<f32> = Vec::with_capacity(INPUT_SUBCARRIERS * INPUT_TIMESTEPS);
    let mut tick: u64 = 0;

    loop {
        // Poll one frame from the sensing-server. On error, sleep and retry —
        // we expect transient blips when the server restarts.
        match fetch_frame(&cfg.sensing_url).await {
            Ok(amplitudes) => {
                tick += 1;
                buffer.extend(amplitudes);
                // Sliding window: keep only the most recent N*T values.
                let cap = INPUT_SUBCARRIERS * INPUT_TIMESTEPS;
                if buffer.len() >= cap {
                    // Drop the oldest samples, then copy the window out. The
                    // buffer stays full, so the window advances by one frame
                    // per poll.
                    let excess = buffer.len() - cap;
                    buffer.drain(..excess);
                    let window = CsiWindow {
                        data: buffer.clone(),
                    };
                    if let Ok(out) = engine.infer(&window) {
                        if out.confidence >= cfg.min_confidence {
                            // Flatten persons array (single-person v0.0.1)
                            let persons = serde_json::json!([{
                                "keypoints": chunk_pairs(&out.keypoints),
                                "confidence": out.confidence,
                            }]);
                            emit_event(&Event::pose_frame(tick, 1, persons));
                        }
                    }
                }
            }
            Err(e) => {
                tracing::warn!(error = %e, "sensing-server fetch failed");
            }
        }
        sleep(Duration::from_millis(cfg.poll_ms)).await;
    }
}

async fn fetch_frame(url: &str) -> Result<Vec<f32>, Box<dyn std::error::Error>> {
    // `ureq` is synchronous, so the request runs on Tokio's blocking pool via
    // `spawn_blocking` rather than stalling the async executor. The per-frame
    // cost (~1 ms loopback) is dwarfed by the inference cost. Replace with a
    // proper async client if we ever poll remote sensing-servers over the wire.
    let url = url.to_string();
    let body = tokio::task::spawn_blocking(move || -> Result<String, ureq::Error> {
        Ok(ureq::get(&url).call()?.into_string()?)
    })
    .await??;
    let json: serde_json::Value = serde_json::from_str(&body)?;
    let snapshot = json.get("snapshot").unwrap_or(&json);
    let nodes = snapshot
        .get("nodes")
        .and_then(|v| v.as_array())
        .ok_or("missing nodes[]")?;
    // Take node 0's amplitude vector — we'll add multi-node fusion later.
    let amplitude = nodes
        .first()
        .and_then(|n| n.get("amplitude"))
        .and_then(|v| v.as_array())
        .ok_or("missing nodes[0].amplitude[]")?;
    Ok(amplitude
        .iter()
        .filter_map(|v| v.as_f64().map(|f| f as f32))
        .collect())
}

fn chunk_pairs(flat: &[f32]) -> Vec<[f32; 2]> {
    flat.chunks_exact(2).map(|c| [c[0], c[1]]).collect()
}
|
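Reviewer note: the run loop buffers a flat stream of amplitude samples and hands the engine the most recent window. A minimal standalone sketch of that sliding-window idea follows, using a `VecDeque` so eviction from the front is cheap. The `SlidingWindow` type and `push_frame` method are hypothetical and not part of the crate, and the tiny dimensions in `main` (2 values per frame, 3 frames per window) stand in for the real 56 × 20.

```rust
use std::collections::VecDeque;

/// Fixed-capacity sliding window over a flat stream of samples
/// (hypothetical helper, for illustration only).
struct SlidingWindow {
    buf: VecDeque<f32>,
    cap: usize,
}

impl SlidingWindow {
    fn new(cap: usize) -> Self {
        Self { buf: VecDeque::with_capacity(cap), cap }
    }

    /// Append one frame, evict the oldest samples beyond `cap`, and return
    /// a copy of the window once it is full.
    fn push_frame(&mut self, frame: &[f32]) -> Option<Vec<f32>> {
        self.buf.extend(frame.iter().copied());
        while self.buf.len() > self.cap {
            self.buf.pop_front();
        }
        (self.buf.len() == self.cap).then(|| self.buf.iter().copied().collect())
    }
}

fn main() {
    let mut win = SlidingWindow::new(6);
    for t in 0..5 {
        let frame = [t as f32, t as f32 + 0.5];
        match win.push_frame(&frame) {
            Some(w) => println!("tick {t}: {w:?}"),
            None => println!("tick {t}: warming up"),
        }
    }
}
```

With this shape the first window becomes available once enough frames have arrived, and every later frame yields a new window that overlaps the previous one by all but one frame.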
@ -0,0 +1,67 @@
//! Smoke tests for the cog-pose-estimation crate.
//!
//! These are deliberately tight — full inference integration tests
//! depend on a trained safetensors blob that doesn't live in-repo yet.

use cog_pose_estimation::{
    inference::{InferenceEngine, SyntheticInput, INPUT_SUBCARRIERS, INPUT_TIMESTEPS, OUTPUT_KEYPOINTS},
    manifest::ManifestSpec,
};

#[test]
fn synthetic_window_has_correct_shape() {
    let syn = SyntheticInput::default();
    let window = syn.as_window();
    assert_eq!(window.data.len(), INPUT_SUBCARRIERS * INPUT_TIMESTEPS);
}

#[test]
fn engine_produces_finite_output_for_synthetic_input() {
    let engine = InferenceEngine::new().expect("engine init");
    let out = engine
        .infer(&SyntheticInput::default().as_window())
        .expect("infer");
    assert!(out.is_finite(), "synthetic input must produce finite output");
    assert_eq!(out.keypoints.len(), OUTPUT_KEYPOINTS * 2);
}

#[test]
fn engine_rejects_wrong_shape_input() {
    let engine = InferenceEngine::new().expect("engine init");
    let bad = cog_pose_estimation::inference::CsiWindow { data: vec![0.0; 10] };
    assert!(engine.infer(&bad).is_err());
}

#[test]
fn real_weights_load_when_available() {
    use cog_pose_estimation::inference::InferenceEngine;
    let weights = std::path::Path::new("cog/artifacts/pose_v1.safetensors");
    if !weights.exists() {
        // Skip when running outside the repo (e.g. on a fresh appliance install).
        eprintln!("(skipping — cog/artifacts/pose_v1.safetensors not present in cwd)");
        return;
    }
    let engine = InferenceEngine::with_weights(Some(weights)).expect("load real weights");
    assert!(
        engine.backend().starts_with("candle-"),
        "expected real Candle backend, got {}",
        engine.backend()
    );
    let out = engine
        .infer(&SyntheticInput::default().as_window())
        .expect("infer");
    assert!(out.is_finite());
    // Real model emits the published validation PCK@50 as its self-reported
    // confidence — stub returns 0.0. This is the key assertion that proves
    // the cog isn't silently falling back to the stub.
    assert!(out.confidence > 0.0, "real model should emit non-zero confidence");
}

#[test]
fn manifest_roundtrips() {
    let spec = ManifestSpec::embedded("pose-estimation", "0.0.1");
    let s = serde_json::to_string(&spec).unwrap();
    let back: ManifestSpec = serde_json::from_str(&s).unwrap();
    assert_eq!(back.id, "pose-estimation");
    assert_eq!(back.version, "0.0.1");
}