docs(adr-027): re-scope MERIDIAN to MAE foundation pre-training (§2.0, #68)
Adds §2.0 — the primary MERIDIAN path is now a three-stage pipeline:
1. pre-train a CIG-MAE-style dual-stream (amplitude+phase) masked autoencoder
on heterogeneous CSI (data breadth > pose-net capacity — arXiv:2511.18792);
2. fine-tune the existing §2.1–§2.6 heads (17-kpt/DensePose, AETHER, domain-
adversarial, geometry-conditioned) on top of the pre-trained encoder;
3. adapt per-room with source-free unsupervised domain adaptation behind
coherence_gate.rs::Recalibrate (separate ADR).
§2.1+ is retained but re-framed as the fine-tune-stage head, not a from-scratch
design. Adds the supporting references (2511.18792, 2512.04723, 2605.01369,
2506.12052, ACM TOSN 10.1145/3715130) and points at the 2026-Q2 SOTA survey.
Co-Authored-By: claude-flow <ruv@ruv.net>
parent 8b2a2d94e8
commit 1d4f23bd41
@@ -60,8 +60,41 @@ Five concurrent lines of research have converged on the domain generalization pr
## 2. Decision
### 2.0 — 2026-Q2 Re-scope: MERIDIAN-MAE foundation pre-training (primary path)
> **Status of this subsection:** Active. Supersedes the *training strategy* of §2.1–§2.6 (the dual-path / domain-adversarial / geometry-conditioned *architecture* is retained — it becomes the **fine-tune-stage head** on top of a pre-trained encoder, not a from-scratch network).
> **Driver:** `docs/research/sota/2026-Q2-agentic-ai-and-edge-for-ruview.md` (§B1) and the 2025→2026 evidence below.
**What changed.** The 2025–2026 WiFi-sensing literature converged on a single result: **masked-autoencoder (MAE) pre-training on large, heterogeneous CSI pools beats supervised baselines on cross-domain tasks, and the bottleneck is data breadth, not model capacity.**
- *Scale What Counts, Mask What Matters* (arXiv:2511.18792): pre-trains/evaluates across **14 datasets, >1.3 M CSI samples, 4 device types, 2.4/5/6 GHz**; cross-domain gains grow **log-linearly** with the amount of pre-training data (+2.2 % to +15.7 % over supervised), while bigger models yield only **marginal** gains.
- **CIG-MAE** (arXiv:2512.04723): dual-stream MAE reconstructing **both amplitude and phase**, with information-guided masking — phase reconstruction is now SOTA-competitive (historically the hard part).
- **AM-FM** (2026; arXiv:2602.11200, already cited in §1.2): ~9.2 M samples, ~20 device types — the data-breadth thesis at scale.
- *A Tutorial-cum-Survey on SSL for Wi-Fi Sensing* (arXiv:2506.12052) and ACM TOSN (10.1145/3715130): MAE is the consistently strongest SSL choice for CSI.
**Revised decision.** The primary MERIDIAN program is now a **three-stage** pipeline:
1. **Pre-train** a CIG-MAE-style **dual-stream (amplitude + phase) masked autoencoder** on every CSI source RuView can reach — own recordings (`data/recordings/`, overnight captures), MM-Fi + Wi-Pose (ADR-015), public CSI corpora, and the multi-band virtual-subcarrier streams from `ruvsense/multiband.rs`. Thesis: *data breadth > pose-net capacity*.
2. **Fine-tune** the existing MERIDIAN heads — the 17-keypoint / DensePose-UV regression heads, the AETHER contrastive embedding (ADR-024), and the domain-adversarial / geometry-conditioned layers of §2.1–§2.6 — on top of the **frozen-then-unfrozen** pre-trained encoder. The §2.x machinery is now *regularisation on a good representation* rather than the load-bearing structure.
3. **Adapt** per room with **source-free unsupervised domain adaptation** (MU-SHOT-Fi, arXiv:2605.01369; Wi-SFDAGR) wired behind `ruvsense/coherence_gate.rs::Recalibrate` — a bounded MicroLoRA-delta + EWC++ pass on the head, triggered by the coherence z-score, logged via the witness chain. (Tracked separately; see the companion ADR referenced in the survey's Part C #2.)
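
Stage 3 is specified here only as being "triggered by the coherence z-score". As a rough illustration of what such a gate might look like, the sketch below keeps a running baseline of a scalar coherence metric and fires when a new sample deviates by more than a set number of standard deviations. `ZScoreTrigger`, its fields, the warm-up length, and the threshold are all hypothetical; this is not the API of `coherence_gate.rs`, and the actual design is deferred to the companion ADR.

```rust
/// Hypothetical z-score trigger (illustration only; not `coherence_gate.rs`).
/// Tracks a running mean/variance with Welford's algorithm.
struct ZScoreTrigger {
    n: u64,         // samples folded into the baseline so far
    mean: f64,      // running mean of the coherence metric
    m2: f64,        // running sum of squared deviations from the mean
    threshold: f64, // assumed |z| threshold for recalibration
    warmup: u64,    // assumed number of samples before any trigger is allowed
}

impl ZScoreTrigger {
    fn new(threshold: f64, warmup: u64) -> Self {
        Self { n: 0, mean: 0.0, m2: 0.0, threshold, warmup }
    }

    /// Returns `true` when `coherence` is an outlier against the baseline.
    /// Outliers are not folded into the baseline, so a drifted room keeps
    /// triggering until something resets the gate.
    fn observe(&mut self, coherence: f64) -> bool {
        if self.n >= self.warmup.max(2) {
            let std = (self.m2 / (self.n - 1) as f64).sqrt();
            if std > 0.0 && ((coherence - self.mean) / std).abs() > self.threshold {
                return true;
            }
        }
        // Welford update of mean and sum of squared deviations.
        self.n += 1;
        let delta = coherence - self.mean;
        self.mean += delta / self.n as f64;
        self.m2 += delta * (coherence - self.mean);
        false
    }
}
```

In this sketch a caller would feed one coherence value per window into `observe` and start the bounded adaptation pass whenever it returns `true`. Whether the real gate uses a two-sided test, a decaying baseline, or hysteresis is not stated in this section.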
**Why this is better than from-scratch (§2.1 as the primary path).** A model trained from scratch on one or two single-environment datasets *cannot* see enough multipath/hardware diversity to learn an environment-agnostic representation — that's the layout-overfitting / multipath-memorisation failure in §1.1. A pre-trained encoder front-loads that diversity, so the SISO-multistatic ESP32 input (§B3) has to carry far less, and the per-room work shrinks to adaptation (stage 3), not retraining.
**Token convention (implementation).** A CSI window `[T, tx, rx, sub]` → a sequence of `N = T·tx·rx` tokens, each a `sub`-dim *channel snapshot* — the same `[B, T·tx·rx, sub]` layout `model.rs::ModalityTranslator` already consumes. Amplitude and phase share the token grid, so one mask drives both streams.
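
The token convention above can be sketched in dependency-free Rust as follows. This is a minimal illustration under stated assumptions, not the crate's `mask_csi_window`: the function names `tokenize`, `random_mask`, and `visible`, the row-major `[T, tx, rx, sub]` memory order, the xorshift PRNG, and the uniform-random masking strategy are all hypothetical choices made for the example.

```rust
/// Hypothetical helper: split a row-major `[T, tx, rx, sub]` window into
/// `N = T*tx*rx` tokens, each a `sub`-dimensional channel snapshot.
fn tokenize(window: &[f32], t: usize, tx: usize, rx: usize, sub: usize) -> Vec<Vec<f32>> {
    assert_eq!(window.len(), t * tx * rx * sub, "window shape mismatch");
    window.chunks(sub).map(|c| c.to_vec()).collect()
}

/// Tiny xorshift64 PRNG so the sketch stays deterministic and dependency-free.
struct XorShift64(u64);

impl XorShift64 {
    fn next(&mut self) -> u64 {
        let mut x = self.0;
        x ^= x << 13;
        x ^= x >> 7;
        x ^= x << 17;
        self.0 = x;
        x
    }
}

/// Mark `round(ratio * n)` token positions as masked, chosen with a
/// partial Fisher-Yates shuffle. The same seed always gives the same mask.
fn random_mask(n: usize, ratio: f32, seed: u64) -> Vec<bool> {
    let mut rng = XorShift64(seed.max(1)); // xorshift state must be non-zero
    let k = (((n as f32) * ratio).round() as usize).min(n);
    let mut idx: Vec<usize> = (0..n).collect();
    for i in 0..k {
        let j = i + (rng.next() as usize) % (n - i);
        idx.swap(i, j);
    }
    let mut mask = vec![false; n];
    for &i in &idx[..k] {
        mask[i] = true;
    }
    mask
}

/// Keep only the unmasked tokens of one stream.
fn visible<'a>(tokens: &'a [Vec<f32>], mask: &[bool]) -> Vec<&'a Vec<f32>> {
    tokens
        .iter()
        .zip(mask)
        .filter(|(_, m)| !**m)
        .map(|(tok, _)| tok)
        .collect()
}
```

Because amplitude and phase share the token grid, a caller would tokenize both windows with the same shape, draw a single mask with `random_mask`, and pass that one mask to `visible` for each stream, so the encoder sees the same positions in both.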
**Implementation status & plan.**
- ✅ **Iteration 1** (this ADR revision): `wifi-densepose-train::csi_mae` — `MaeConfig` (+`validate`), `MaskStrategy`, `TokenLayout`, deterministic `mask_csi_window` / `reassemble_tokens` (pure Rust, dependency-free PRNG, 8 unit tests, builds & tests under `cargo test --no-default-features`); a re-scoped ADR (this section); a `model` submodule skeleton (v0 stub, gated behind `tch-backend`).
- ◻ **Iteration 2**: the tch encoder/decoder (dual-stream → shared latent → narrow decoder over all positions with learned mask tokens → reconstruct amp+phase), `reconstruction_loss`, `pretrain_step`, a `pretrain-mae` binary driving `SyntheticCsiDataset` / `MmFiDataset`; information-guided masking; a "loss decreases over N steps on synthetic data" gated test.
- ◻ **Iteration 3+**: pool & ingest heterogeneous CSI; real pre-train run (needs GPU — `scripts/gcloud-train.sh` / the cognitum project); fine-tune the §2.x heads on top; cross-domain eval (§4.6 protocol); ship the encoder as an RVF segment (§4.7).
- ⏸ **Out of scope here**: the per-room SFDA adaptation (stage 3) — its own ADR.
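
Iteration 2 lists a `reconstruction_loss` over amplitude and phase but does not define it. For orientation, here is one plausible shape in plain Rust rather than tch: mean squared error computed over masked tokens only (the usual MAE objective), with the phase error wrapped into [-π, π) so that a prediction just across the ±π boundary is not over-penalised. The signature, the `phase_weight` parameter, and the wrapping choice are assumptions for illustration, not the planned implementation.

```rust
use std::f32::consts::PI;

/// Wrap an angle difference into [-PI, PI).
fn wrap_phase(d: f32) -> f32 {
    (d + PI).rem_euclid(2.0 * PI) - PI
}

/// Hypothetical masked reconstruction loss for a dual-stream MAE.
/// All slices are indexed `[token][subcarrier]`; `mask[i] == true`
/// means token `i` was hidden from the encoder and must be reconstructed.
fn masked_reconstruction_loss(
    amp_pred: &[Vec<f32>],
    amp_true: &[Vec<f32>],
    phase_pred: &[Vec<f32>],
    phase_true: &[Vec<f32>],
    mask: &[bool],
    phase_weight: f32, // assumed relative weight of the phase term
) -> f32 {
    let mut amp_sum = 0.0f32;
    let mut phase_sum = 0.0f32;
    let mut count = 0usize;
    for (i, &masked) in mask.iter().enumerate() {
        if !masked {
            continue; // visible tokens do not contribute to the loss
        }
        for s in 0..amp_true[i].len() {
            let da = amp_pred[i][s] - amp_true[i][s];
            let dp = wrap_phase(phase_pred[i][s] - phase_true[i][s]);
            amp_sum += da * da;
            phase_sum += dp * dp;
            count += 1;
        }
    }
    if count == 0 {
        return 0.0; // nothing masked, nothing to reconstruct
    }
    (amp_sum + phase_weight * phase_sum) / count as f32
}
```

A scalar version like this could also serve as a reference when checking a tensor implementation on small synthetic inputs, which fits the "loss decreases over N steps on synthetic data" test planned for the same iteration.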
The remainder of this ADR (§2.1 onward) describes the **fine-tune-stage architecture** — read it as "the head and regularisers that sit on top of the §2.0 pre-trained encoder", not as a from-scratch design.
### 2.1 Architecture: Environment-Disentangled Dual-Path Transformer
> *(Now the fine-tune-stage head — see §2.0.)*
MERIDIAN adds a domain generalization layer between the CSI encoder and the pose/embedding heads. The core insight is explicit factorization: decompose the latent representation into a **pose-relevant** component (invariant across environments) and an **environment** component (captures room geometry, hardware, layout):
```
@@ -546,3 +579,12 @@ ADR-011 Proof-of-Reality ──→ ⏳ Independent (Python v1 issue, high pr
8. Ramesh, S. et al. (2025). "LatentCSI: High-resolution efficient image generation from WiFi CSI using a pretrained latent diffusion model." arXiv:2506.10605. https://arxiv.org/abs/2506.10605
9. Ganin, Y. et al. (2016). "Domain-Adversarial Training of Neural Networks." JMLR 17(59):1-35. https://jmlr.org/papers/v17/15-239.html
10. Perez, E. et al. (2018). "FiLM: Visual Reasoning with a General Conditioning Layer." AAAI 2018. arXiv:1709.07871. https://arxiv.org/abs/1709.07871
**2026-Q2 re-scope (§2.0) — masked-autoencoder foundation pre-training:**
11. "Scale What Counts, Mask What Matters: Evaluating Foundation Models for Zero-Shot Cross-Domain Wi-Fi Sensing." arXiv:2511.18792. https://arxiv.org/html/2511.18792 — 14 datasets / >1.3 M CSI samples; data-breadth > model-capacity.
12. "CIG-MAE: Cross-Modal Information-Guided Masked Autoencoder for Self-Supervised WiFi Sensing." arXiv:2512.04723. https://arxiv.org/html/2512.04723v1 — dual-stream amplitude+phase MAE, information-guided masking.
13. "MU-SHOT-Fi: Self-Supervised Multi-User Wi-Fi Sensing with Source-free Unsupervised Domain Adaptation." arXiv:2605.01369. https://arxiv.org/html/2605.01369 — per-room SFDA (MERIDIAN stage 3).
14. "A Tutorial-cum-Survey on Self-Supervised Learning for Wi-Fi Sensing: Trends, Challenges, and Outlook." arXiv:2506.12052. https://arxiv.org/html/2506.12052
15. "Evaluating Self-Supervised Learning for WiFi CSI-Based Human Activity Recognition." ACM Trans. Sensor Networks. https://dl.acm.org/doi/10.1145/3715130
16. RuView 2026-Q2 SOTA survey — `docs/research/sota/2026-Q2-agentic-ai-and-edge-for-ruview.md` (§B1, Part C #1).