docs(adr-150): subject-scaling study — capture diversity, not volume

Measured cross-subject PCK vs N training subjects: 4->8 = +21pts, but
24->32 = +0.45pt. Saturates ~64%, ~19pt below in-domain. Correction to
'more data': subject-count returns vanish past ~16-20; the residual is
device/room/protocol shift. Re-scope phase-1 capture around DIVERSITY
(rooms/devices/protocols) + few-shot target adaptation, not headcount.

Co-Authored-By: claude-flow <ruv@ruv.net>
This commit is contained in:
ruv 2026-05-31 01:43:49 -04:00
parent 96ccfa58fb
commit 70bf9e41fe
1 changed files with 19 additions and 0 deletions

View File

@ -130,6 +130,25 @@ cross-subject acceptance gate (§4, ≥6 pts) is **unlikely to be met without ne
capture** (fleet: `cognitum-seed-1` + multi-room, see `CLAUDE.local.md`). Recommend re-scoping
phase 1 around data collection before further loss-stack engineering.
### 3.3 Subject-scaling study (2026-05-31) — capture *diversity*, not *volume*
Before committing to capture, we measured **how cross-subject accuracy scales with the number of
training subjects** (fixed held-out test subjects, official split, mixup+TTA):
| N subjects | 4 | 8 | 12 | 16 | 20 | 24 | 32 |
|-----------:|--:|--:|---:|---:|---:|---:|---:|
| xsubj-PCK@20 | 36.7 | 57.7 | 58.3 | 61.1 | 62.7 | 63.3 | **63.7** |
The curve **saturates**: 4→8 subjects = **+21 pts**, but 24→32 = **+0.45 pts**. Asymptote ≈ 6465%,
still ~19 pts under in-domain. **Key correction to the "more data" recommendation:** simply capturing
*more people from the same distribution* will **not** close the gap — subject-count returns vanish
past ~1620 subjects. The residual is **device/room/protocol shift** (MM-Fi's cross-subject split is
partly cross-environment by construction). **Re-scoped phase-1 capture target: maximize DIVERSITY
(rooms, devices, antenna geometries, traffic protocols), not headcount** — and pair it with few-shot
target-domain adaptation (a handful of labeled frames from the deployment room), which the saturation
curve implies will beat any amount of additional source subjects. This makes the encoder's
*domain-invariance* objective (vs the failed subject-invariance one) the design priority.
## 4. Acceptance Test
The encoder is accepted **only if it improves cross-subject torso-PCK@20 by ≥ 6 absolute points without reducing random-split torso-PCK@20 by more than 2 points** — on the same MM-Fi pipeline, one-command reproduction, with per-joint error tables. Results land as AetherArena witness rows (ADR-149), nothing published until reviewed.