From 4b1a83510775d312cba3c670a050117948895c7b Mon Sep 17 00:00:00 2001
From: rUv
Date: Tue, 19 May 2026 17:18:05 -0400
Subject: [PATCH] docs: repoint #640 references to #645 (original deleted, replaced) (#646)
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Issue #640 (PCK gap follow-up) was deleted upstream after the cog v0.0.1
PRs landed today. Re-opened as #645 with the same context plus the new
measured v0.0.1 numbers (PCK@20 3.0%, PCK@50 18.5%, MPJPE 0.093). This
patch updates the three files in main that still pointed at the dead
#640 to point at #645 instead — ADR-101, the cog README, and the
benchmark log.
---
 docs/adr/ADR-101-pose-estimation-cog.md     | 16 ++++++++--------
 docs/benchmarks/pose-estimation-cog.md      |  4 ++--
 v2/crates/cog-pose-estimation/cog/README.md |  2 +-
 3 files changed, 11 insertions(+), 11 deletions(-)

diff --git a/docs/adr/ADR-101-pose-estimation-cog.md b/docs/adr/ADR-101-pose-estimation-cog.md
index 918d10ce..815ca5b2 100644
--- a/docs/adr/ADR-101-pose-estimation-cog.md
+++ b/docs/adr/ADR-101-pose-estimation-cog.md
@@ -66,8 +66,8 @@ ESP32 / rvcsi ─► collect-ground-truth.py + sensing-server recording
 | Pose head | 2-layer MLP `(128 → 256 → 34)` | 34 = 17 × (x, y) |
 | Output | `[B, 17, 2]` keypoints in `[0, 1]` image-normalised coords | confidence is implicit in keypoint variance over time; ADR-079 P9 will add explicit per-joint confidence |
 | Loss | Confidence-weighted SmoothL1 (frame-level) + bone-length regulariser + temporal smoothness | per ADR-079 Phase 3 refinement |
-| Init | Encoder = HF presence weights (frozen for 50 epochs, then jointly fine-tuned) | unblocks the sigmoid-saturation failure mode observed in #640 |
-| Training | `v2/crates/wifi-densepose-train` with libtorch backend on RTX 5080 | replaces the pure-JS SPSA trainer that produced 0% PCK in #640 |
+| Init | Encoder = HF presence weights (frozen for 50 epochs, then jointly fine-tuned) | unblocks the sigmoid-saturation failure mode observed in #645 |
+| Training | `v2/crates/wifi-densepose-train` with libtorch backend on RTX 5080 | replaces the pure-JS SPSA trainer that produced 0% PCK in #645 |
 
 ### Repo layout
 
@@ -131,11 +131,11 @@ Honours ADR-100's per-Cog CLI contract:
 3. **Optimised:** the Hailo-targeted ONNX graph passes through Hailo Dataflow Compiler without quantisation-aware-training warnings.
 4. **Published:** signed binary at `gs://cognitum-apps/cogs//cog-pose-estimation-`; manifest valid against the JSON schema in ADR-100; appliance installer can pull and run it.
 
-PCK@20 is intentionally **not** an acceptance gate of this ADR. Achieving the ADR-079 ≥35% target is a separate, data-bound milestone tracked in #640. This ADR ships the **vehicle**, not the model accuracy.
+PCK@20 is intentionally **not** an acceptance gate of this ADR. Achieving the ADR-079 ≥35% target is a separate, data-bound milestone tracked in #645. This ADR ships the **vehicle**, not the model accuracy.
 
 ### First measured run — v0.0.1 (2026-05-19)
 
-A Candle-on-CUDA training run on `ruvultra`'s RTX 5080 against the same 1,077-sample paired session that produced the 0%/0% baseline in #640 yielded:
+A Candle-on-CUDA training run on `ruvultra`'s RTX 5080 against the same 1,077-sample paired session that produced the 0%/0% baseline in #645 yielded:
 
 - **PCK@20 = 3.0%**, **PCK@50 = 18.5%**, **MPJPE = 0.093** (normalized).
 - 400 epochs in **2.1 s** wall time (~5 ms/epoch, full-batch).
@@ -155,7 +155,7 @@ This confirms the pipeline trains end-to-end and produces a signal-bearing model
 ### Negative
 
 - Adds a hard dependency on the Hailo Dataflow Compiler, which lives behind a self-hosted runner — Hailo-targeted PRs land more slowly.
-- The first published binary will have low PCK (data + training time gap, #640) — UX needs to surface this clearly so end users do not interpret bad keypoints as a bug.
+- The first published binary will have low PCK (data + training time gap, #645) — UX needs to surface this clearly so end users do not interpret bad keypoints as a bug.
 
 ### Risks
 
@@ -167,7 +167,7 @@ This confirms the pipeline trains end-to-end and produces a signal-bearing model
 1. Land this ADR + ADR-100 on `main` of RuView.
 2. Land companion ADR-225 + crate on `main` of v0-appliance.
 3. First release `cog-pose-estimation@0.0.1` ships **only** to `ruvultra` and `cognitum-v0`. Not pushed to the cluster Pis yet.
-4. After P7→P9 data work (#640) brings PCK above a usable threshold, rebuild + re-publish; only then enable cluster rollout via `cognitum-cog-gateway`'s OTA channel.
+4. After P7→P9 data work (#645) brings PCK above a usable threshold, rebuild + re-publish; only then enable cluster rollout via `cognitum-cog-gateway`'s OTA channel.
 
 ## v0.0.1 shipping status — 2026-05-19
 
@@ -196,7 +196,7 @@ PRs `#642` (scaffold + arm release + ONNX + live install) and `#643` (x86_64 rel
 Open follow-ups carried forward from this ADR's "Acceptance gates" section:
 
 - **Hailo HEF cross-compile** — `pose_v1.onnx` is ready; still gated on Hailo Dataflow Compiler + self-hosted runner provisioning. Tracked separately.
-- **PCK@20 ≥ 35%** — explicitly not an acceptance gate of this ADR, but the limiting factor on practical usefulness. Tracked in [#640](https://github.com/ruvnet/RuView/issues/640): needs ~30× more paired samples + multi-room camera framing. Today's seated-desk session is the demonstrated bottleneck.
+- **PCK@20 ≥ 35%** — explicitly not an acceptance gate of this ADR, but the limiting factor on practical usefulness. Tracked in [#645](https://github.com/ruvnet/RuView/issues/645): needs ~30× more paired samples + multi-room camera framing. Today's seated-desk session is the demonstrated bottleneck.
 
 ## See also
 
@@ -204,5 +204,5 @@ Open follow-ups carried forward from this ADR's "Acceptance gates" section:
 - ADR-100: Cog packaging specification (the format we're shipping in).
 - v0-appliance ADR-225: cognitum-pose-estimation crate (the appliance-side runtime).
 - v0-appliance ADR-220: cog management surface (where this cog appears in the dashboard).
-- Issue #640: PCK gap (current 3% / 18.5% → ≥35% target).
+- Issue #645: PCK gap (current 3% / 18.5% → ≥35% target).
 - `docs/benchmarks/pose-estimation-cog.md`: full benchmark log, all measured numbers.
diff --git a/docs/benchmarks/pose-estimation-cog.md b/docs/benchmarks/pose-estimation-cog.md
index 8c1b8733..9fd9b70a 100644
--- a/docs/benchmarks/pose-estimation-cog.md
+++ b/docs/benchmarks/pose-estimation-cog.md
@@ -51,10 +51,10 @@ Strongest signal at right-side proximal joints (`r_hip` 77% PCK@50, `r_knee` 35%
 
 | Run | Backend | Train time | PCK@20 | PCK@50 | MPJPE |
 |-----|---------|-----------:|-------:|-------:|------:|
-| pre-2026-05-19 | pure-JS SPSA, lite TCN (#640) | ~20 min | 0.0% | 0.0% | 0.66 |
+| pre-2026-05-19 | pure-JS SPSA, lite TCN (#645) | ~20 min | 0.0% | 0.0% | 0.66 |
 | **v0.0.1** (this run) | **candle-cuda, Conv1d TCN** | **2.1 s** | **3.0%** | **18.5%** | **0.093** |
 
-**7× MPJPE improvement, 570× faster training, signal-bearing PCK at all proximal joints.** The remaining gap to ADR-079's PCK@20 ≥ 35% target is data-bound, not infra-bound (see Issue #640).
+**7× MPJPE improvement, 570× faster training, signal-bearing PCK at all proximal joints.** The remaining gap to ADR-079's PCK@20 ≥ 35% target is data-bound, not infra-bound (see Issue #645).
 
 ### Inference latency
 
diff --git a/v2/crates/cog-pose-estimation/cog/README.md b/v2/crates/cog-pose-estimation/cog/README.md
index 4abb5388..b17e07fa 100644
--- a/v2/crates/cog-pose-estimation/cog/README.md
+++ b/v2/crates/cog-pose-estimation/cog/README.md
@@ -52,7 +52,7 @@ Loss curve: 0.181 (epoch 0) → 0.014 (epoch 399), eval loss 0.010. **400 epochs
 - It is **below the ADR-079 target of PCK@20 ≥ 35%**. The bottleneck is data quality and quantity, not infra. The single 30-min seated-at-desk recording produced 1,077 paired samples at avg confidence 0.44 — strong asymmetry between left/right side (r_hip 77% vs l_hip 27%) reflects the camera framing more than any model defect.
 - Distal joints (wrists, ankles) and face joints are still near-random: 56-subcarrier CSI at our 20-frame window doesn't carry enough fine-grained spatial information.
 
-### Next-iteration plan (tracked in [#640](https://github.com/ruvnet/RuView/issues/640))
+### Next-iteration plan (tracked in [#645](https://github.com/ruvnet/RuView/issues/645))
 
 - Multi-session, multi-room recordings with **full-body framing** (target ≥ 30K paired samples at conf ≥ 0.7).
 - Re-train with the same Candle pipeline (already validated to converge in seconds on RTX 5080).