From c95dd308fd8a9158d4e6193243eaa164ba1e0520 Mon Sep 17 00:00:00 2001
From: ruv <ruv@ruv.net>
Date: Sun, 31 May 2026 03:37:19 -0400
Subject: [PATCH] docs(study): cross-dataset confirmed on harder NTU-Fi-HumanID
 task

Re-ran transfer on 14-class person-ID (harder than 6-activity HAR): same
null-transfer result (MM-Fi pretrain 91.7% ≈ random 92.8%). Unified root
cause: in-domain CSI classification lives in the target-trained readout
(a random projection is already linearly separable); learned reps don't
transfer across subjects/rooms/datasets. WiFi-CSI is distribution-locked.
Addresses the 'HAR too easy' caveat.

Co-Authored-By: claude-flow <ruv@ruv.net>
---
 docs/benchmarks/mmfi-wifi-sensing-study.md | 10 ++++++++--
 1 file changed, 8 insertions(+), 2 deletions(-)

diff --git a/docs/benchmarks/mmfi-wifi-sensing-study.md b/docs/benchmarks/mmfi-wifi-sensing-study.md
index 41655ecd..e8e512ea 100644
--- a/docs/benchmarks/mmfi-wifi-sensing-study.md
+++ b/docs/benchmarks/mmfi-wifi-sensing-study.md
@@ -117,8 +117,14 @@ architecture-agnostic LoRA on the pose head, tested).
   probe. CSI representations are **distribution-locked** (same root cause as the within-MM-Fi
   cross-subject/-environment collapse); the practical answer is on-target training/few-shot, not
   transferable zero-shot features. Caveat: NTU-Fi's 6 coarse activities are an *easy* target (random
-  features → 93%), so it weakly stresses representation quality. A harder cross-dataset pose benchmark
-  remains open.
+  features → 93%), so it only weakly stresses representation quality. However, re-running on the harder
+  **NTU-Fi-HumanID** task (14-class gait person-ID, chance 7.1%) gave the *same* result (MM-Fi
+  pretrain 91.7% ≈ random 92.8%). **Unified root cause:** for CSI, in-domain classification lives in
+  the *target-trained readout* (a random 256-d projection of 3,420-d CSI is already linearly
+  separable), while the *learned representation* fails to transfer across subjects, rooms, and
+  datasets alike. WiFi-CSI sensing is **distribution-locked**; the answer is on-target few-shot
+  calibration, not transferable features. A harder cross-dataset *pose* benchmark (as opposed to
+  classification) remains the one open variant.
 - Random-split numbers are reported only to compare to prior work on the same protocol; they are
   in-domain and partly leaky. The cross-subject / cross-environment numbers are the honest ones.
 - Action-recognition accuracy is window-level (MM-Fi's own HAR experiment is clip-level); not directly