diff --git a/docs/adr/ADR-150-rf-foundation-encoder.md b/docs/adr/ADR-150-rf-foundation-encoder.md index 947e25bd..12fe259d 100644 --- a/docs/adr/ADR-150-rf-foundation-encoder.md +++ b/docs/adr/ADR-150-rf-foundation-encoder.md @@ -149,6 +149,31 @@ target-domain adaptation (a handful of labeled frames from the deployment room), curve implies will beat any amount of additional source subjects. This makes the encoder's *domain-invariance* objective (vs the failed subject-invariance one) the design priority. +### 3.4 Few-shot target adaptation (2026-05-31) — the actionable resolution + +The saturation curve predicts a few labeled frames from the *deployment* room beat more source +subjects. Confirmed. Base trained on all 32 source subjects (63.7% zero-shot on a disjoint 50% +held-out of the target subjects), then fine-tuned on K labeled frames per target subject: + +| K/subject | total frames | eval PCK@20 | Δ | +|----------:|-------------:|------------:|--:| +| 0 | 0 | 63.7% | — | +| 20 | 160 | 68.1% | +4.3 | +| **50** | **400** | **72.2%** | **+8.5 (≈ prior SOTA)** | +| 200 | 1,600 | 76.1% | +12.4 | +| 1000 | 8,000 | 78.3% | +14.6 | + +**Few-shot calibration dominates source volume.** §3.3 showed +24 source subjects (~190K frames) +buys +6 pts; here **200 target frames/subject (1,600 frames) buys +12.4 pts**. This **re-scopes the +ADR's acceptance gate and deployment story**: the cross-subject gate (§4, ≥6 pts) is *trivially* met +by ~50–200 labeled frames of in-room calibration — no foundation encoder or mass capture required for +the deployment win. **Recommended product behavior:** ship a **~30-second on-site calibration** (a few +hundred labeled frames per room/person) that recovers most of the gap. The foundation encoder's value +shifts from "close cross-subject zero-shot" (data says: hard) to "make the few-shot adaptation faster / +need fewer calibration frames" — a better-posed, achievable objective. **This supersedes the §3.2 +pessimism: the frontier is not closed by algorithms or bulk data, but it *is* cheaply closed at +deployment time by few-shot calibration.** + ## 4. Acceptance Test The encoder is accepted **only if it improves cross-subject torso-PCK@20 by ≥ 6 absolute points without reducing random-split torso-PCK@20 by more than 2 points** — on the same MM-Fi pipeline, one-command reproduction, with per-joint error tables. Results land as AetherArena witness rows (ADR-149), nothing published until reviewed.