docs(adr-150): subject-scaling study — capture diversity, not volume

Measured cross-subject PCK vs N training subjects: 4->8 = +21pts, but 24->32 = +0.45pt. Saturates ~64%, ~19pt below in-domain. Correction to 'more data': subject-count returns vanish past ~16-20; the residual is device/room/protocol shift. Re-scope phase-1 capture around DIVERSITY (rooms/devices/protocols) + few-shot target adaptation, not headcount. Co-Authored-By: claude-flow <ruv@ruv.net>
2026-05-31 01:43:49 -04:00 · 2026-05-31 01:43:49 -04:00 · 70bf9e41fe
parent 96ccfa58fb
commit 70bf9e41fe
1 changed files with 19 additions and 0 deletions
--- a/docs/adr/ADR-150-rf-foundation-encoder.md
+++ b/docs/adr/ADR-150-rf-foundation-encoder.md
@ -130,6 +130,25 @@ cross-subject acceptance gate (§4, ≥6 pts) is **unlikely to be met without ne
 capture** (fleet: `cognitum-seed-1` + multi-room, see `CLAUDE.local.md`). Recommend re-scoping
 phase 1 around data collection before further loss-stack engineering.

+### 3.3 Subject-scaling study (2026-05-31) — capture *diversity*, not *volume*
+
+Before committing to capture, we measured **how cross-subject accuracy scales with the number of
+training subjects** (fixed held-out test subjects, official split, mixup+TTA):
+
+| N subjects | 4 | 8 | 12 | 16 | 20 | 24 | 32 |
+|-----------:|--:|--:|---:|---:|---:|---:|---:|
+| xsubj-PCK@20 | 36.7 | 57.7 | 58.3 | 61.1 | 62.7 | 63.3 | **63.7** |
+
+The curve **saturates**: 4→8 subjects = **+21 pts**, but 24→32 = **+0.45 pts**. Asymptote ≈ 64–65%,
+still ~19 pts under in-domain. **Key correction to the "more data" recommendation:** simply capturing
+*more people from the same distribution* will **not** close the gap — subject-count returns vanish
+past ~16–20 subjects. The residual is **device/room/protocol shift** (MM-Fi's cross-subject split is
+partly cross-environment by construction). **Re-scoped phase-1 capture target: maximize DIVERSITY
+(rooms, devices, antenna geometries, traffic protocols), not headcount** — and pair it with few-shot
+target-domain adaptation (a handful of labeled frames from the deployment room), which the saturation
+curve implies will beat any amount of additional source subjects. This makes the encoder's
+*domain-invariance* objective (vs the failed subject-invariance one) the design priority.
+
 ## 4. Acceptance Test

 The encoder is accepted **only if it improves cross-subject torso-PCK@20 by ≥ 6 absolute points without reducing random-split torso-PCK@20 by more than 2 points** — on the same MM-Fi pipeline, one-command reproduction, with per-joint error tables. Results land as AetherArena witness rows (ADR-149), nothing published until reviewed.