From 70bf9e41fe7206e3759e4b6d080142ffb28c2813 Mon Sep 17 00:00:00 2001 From: ruv Date: Sun, 31 May 2026 01:43:49 -0400 Subject: [PATCH] =?UTF-8?q?docs(adr-150):=20subject-scaling=20study=20?= =?UTF-8?q?=E2=80=94=20capture=20diversity,=20not=20volume?= MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Measured cross-subject PCK vs N training subjects: 4->8 = +21pts, but 24->32 = +0.45pt. Saturates ~64%, ~19pt below in-domain. Correction to 'more data': subject-count returns vanish past ~16-20; the residual is device/room/protocol shift. Re-scope phase-1 capture around DIVERSITY (rooms/devices/protocols) + few-shot target adaptation, not headcount. Co-Authored-By: claude-flow --- docs/adr/ADR-150-rf-foundation-encoder.md | 19 +++++++++++++++++++ 1 file changed, 19 insertions(+) diff --git a/docs/adr/ADR-150-rf-foundation-encoder.md b/docs/adr/ADR-150-rf-foundation-encoder.md index 124682dd..947e25bd 100644 --- a/docs/adr/ADR-150-rf-foundation-encoder.md +++ b/docs/adr/ADR-150-rf-foundation-encoder.md @@ -130,6 +130,25 @@ cross-subject acceptance gate (§4, ≥6 pts) is **unlikely to be met without ne capture** (fleet: `cognitum-seed-1` + multi-room, see `CLAUDE.local.md`). Recommend re-scoping phase 1 around data collection before further loss-stack engineering. +### 3.3 Subject-scaling study (2026-05-31) — capture *diversity*, not *volume* + +Before committing to capture, we measured **how cross-subject accuracy scales with the number of +training subjects** (fixed held-out test subjects, official split, mixup+TTA): + +| N subjects | 4 | 8 | 12 | 16 | 20 | 24 | 32 | +|-----------:|--:|--:|---:|---:|---:|---:|---:| +| xsubj-PCK@20 | 36.7 | 57.7 | 58.3 | 61.1 | 62.7 | 63.3 | **63.7** | + +The curve **saturates**: 4→8 subjects = **+21 pts**, but 24→32 = **+0.45 pts**. Asymptote ≈ 64–65%, +still ~19 pts under in-domain. **Key correction to the "more data" recommendation:** simply capturing +*more people from the same distribution* will **not** close the gap — subject-count returns vanish +past ~16–20 subjects. The residual is **device/room/protocol shift** (MM-Fi's cross-subject split is +partly cross-environment by construction). **Re-scoped phase-1 capture target: maximize DIVERSITY +(rooms, devices, antenna geometries, traffic protocols), not headcount** — and pair it with few-shot +target-domain adaptation (a handful of labeled frames from the deployment room), which the saturation +curve implies will beat any amount of additional source subjects. This makes the encoder's +*domain-invariance* objective (vs the failed subject-invariance one) the design priority. + ## 4. Acceptance Test The encoder is accepted **only if it improves cross-subject torso-PCK@20 by ≥ 6 absolute points without reducing random-split torso-PCK@20 by more than 2 points** — on the same MM-Fi pipeline, one-command reproduction, with per-joint error tables. Results land as AetherArena witness rows (ADR-149), nothing published until reviewed.