diff --git a/README.md b/README.md index 1bb22602..64a057a1 100644 --- a/README.md +++ b/README.md @@ -50,7 +50,7 @@ docker run -p 3000:3000 ruvnet/wifi-densepose:latest | [User Guide](docs/user-guide.md) | Step-by-step guide: installation, first run, API usage, hardware setup, training | | [Build Guide](docs/build-guide.md) | Building from source (Rust and Python) | | [Architecture Decisions](docs/adr/README.md) | 44 ADRs — why each technical choice was made, organized by domain (hardware, signal processing, ML, platform, infrastructure) | -| [Domain Models](docs/ddd/README.md) | 3 DDD models (RuvSense, WiFi-Mat, CHCI) — bounded contexts, aggregates, domain events, and ubiquitous language | +| [Domain Models](docs/ddd/README.md) | 7 DDD models (RuvSense, Signal Processing, Training Pipeline, Hardware Platform, Sensing Server, WiFi-Mat, CHCI) — bounded contexts, aggregates, domain events, and ubiquitous language | --- diff --git a/docs/ddd/README.md b/docs/ddd/README.md index 8b4b043d..a96a9b57 100644 --- a/docs/ddd/README.md +++ b/docs/ddd/README.md @@ -9,8 +9,12 @@ DDD organizes the codebase around the problem being solved — not around techni | Model | What it covers | Bounded Contexts | |-------|---------------|------------------| | [RuvSense](ruvsense-domain-model.md) | Multistatic WiFi sensing, pose tracking, vital signs, edge intelligence | 7 contexts: Sensing, Coherence, Tracking, Field Model, Longitudinal, Spatial Identity, Edge Intelligence | -| [WiFi-Mat](wifi-mat-domain-model.md) | Disaster response: survivor detection, START triage, mass casualty assessment | Scan Zone, Survivor Tracking, Triage | -| [CHCI](chci-domain-model.md) | Coherent Human Channel Imaging: sub-millimeter body surface reconstruction | Sounding, Channel Estimation, Imaging | +| [Signal Processing](signal-processing-domain-model.md) | SOTA signal processing: phase cleaning, feature extraction, motion analysis | 3 contexts: CSI Preprocessing, Feature Extraction, Motion Analysis | +| [Training Pipeline](training-pipeline-domain-model.md) | ML training: datasets, model architecture, embeddings, domain generalization | 4 contexts: Dataset Management, Model Architecture, Training Orchestration, Embedding & Transfer | +| [Hardware Platform](hardware-platform-domain-model.md) | ESP32 firmware, edge intelligence, WASM runtime, aggregation, provisioning | 5 contexts: Sensor Node, Edge Processing, WASM Runtime, Aggregation, Provisioning | +| [Sensing Server](sensing-server-domain-model.md) | Single-binary Axum server: CSI ingestion, model management, recording, training, visualization | 5 contexts: CSI Ingestion, Model Management, CSI Recording, Training Pipeline, Visualization | +| [WiFi-Mat](wifi-mat-domain-model.md) | Disaster response: survivor detection, START triage, mass casualty assessment | 3 contexts: Detection, Localization, Alerting | +| [CHCI](chci-domain-model.md) | Coherent Human Channel Imaging: sub-millimeter body surface reconstruction | 3 contexts: Sounding, Channel Estimation, Imaging | ## How to read these diff --git a/docs/ddd/hardware-platform-domain-model.md b/docs/ddd/hardware-platform-domain-model.md new file mode 100644 index 00000000..c667161e --- /dev/null +++ b/docs/ddd/hardware-platform-domain-model.md @@ -0,0 +1,1338 @@ +# Hardware Platform Domain Model + +The Hardware Platform domain covers everything from the ESP32-S3 silicon to the server-side aggregator: collecting raw CSI, processing it on-device, running programmable WASM modules at the edge, and provisioning fleets of sensor nodes. It is the physical foundation that all higher-level domains (RuvSense, WiFi-Mat, Pose Tracking) depend on for real radio data. + +This document defines the system using [Domain-Driven Design](https://martinfowler.com/bliki/DomainDrivenDesign.html) (DDD): bounded contexts that own their data and rules, aggregate roots that enforce invariants, value objects that carry meaning, and domain events that connect everything. The goal is to make the firmware and hardware layer's structure match the electronics it controls -- so that anyone reading the code (or an AI agent modifying it) understands *why* each piece exists, not just *what* it does. + +**Bounded Contexts:** + +| # | Context | Responsibility | Key ADRs | Code | +|---|---------|----------------|----------|------| +| 1 | [Sensor Node](#1-sensor-node-context) | WiFi CSI collection, channel hopping, TDM scheduling, UDP streaming | [ADR-012](../adr/ADR-012-esp32-csi-sensor-mesh.md), [ADR-018](../adr/ADR-018-dev-implementation.md) | `firmware/esp32-csi-node/main/{csi_collector,stream_sender,nvs_config}.c` | +| 2 | [Edge Processing](#2-edge-processing-context) | On-device DSP pipeline (Tiers 0-2): phase unwrap, presence, vitals, fall detection | [ADR-039](../adr/ADR-039-esp32-edge-intelligence.md) | `firmware/esp32-csi-node/main/edge_processing.c` | +| 3 | [WASM Runtime](#3-wasm-runtime-context) | Tier 3 programmable sensing: module management, host API, budget control, RVF containers | [ADR-040](../adr/ADR-040-wasm-programmable-sensing.md), [ADR-041](../adr/ADR-041-wasm-module-collection.md) | `firmware/esp32-csi-node/main/{wasm_runtime,wasm_upload,rvf_parser}.c` | +| 4 | [Aggregation](#4-aggregation-context) | Server-side CSI frame reception, timestamp alignment, multi-node feature fusion | [ADR-012](../adr/ADR-012-esp32-csi-sensor-mesh.md) | `crates/wifi-densepose-hardware/src/esp32/` | +| 5 | [Provisioning](#5-provisioning-context) | NVS configuration, firmware lifecycle, fleet management, deployment presets | [ADR-044](../adr/ADR-044-provisioning-tool-enhancements.md) | `firmware/esp32-csi-node/provision.py` | + +All firmware paths are relative to the repository root. Rust crate paths are relative to `rust-port/wifi-densepose-rs/`. + +--- + +## Domain-Driven Design Specification + +### Ubiquitous Language + +| Term | Definition | +|------|------------| +| **Sensor Node** | An ESP32-S3 device that captures WiFi CSI frames and streams them to an aggregator via UDP | +| **CSI Frame** | A snapshot of Channel State Information: amplitude and phase per subcarrier, extracted from WiFi preambles | +| **Subcarrier** | One of 52-56 OFDM frequency bins whose complex response encodes the radio channel; the atomic unit of CSI | +| **Edge Tier** | Processing level on the ESP32: 0 = raw passthrough, 1 = basic DSP, 2 = vitals pipeline, 3 = WASM programmable | +| **Core 0 / Core 1** | The two Xtensa LX7 cores on ESP32-S3; Core 0 runs WiFi + CSI callback, Core 1 runs the DSP pipeline | +| **SPSC Ring Buffer** | Single-producer single-consumer lock-free queue between Core 0 (CSI callback) and Core 1 (DSP task) | +| **Vitals Packet** | 32-byte UDP packet (magic `0xC5110002`) containing presence, breathing BPM, heart rate BPM, fall flag | +| **Compressed Frame** | Delta-compressed CSI frame (magic `0xC5110003`) using XOR + RLE for 30-50% bandwidth reduction | +| **WASM Module** | A `no_std` Rust program compiled to `wasm32-unknown-unknown`, executed on-device via WASM3 interpreter | +| **Module Slot** | One of 4 pre-allocated PSRAM arenas (160 KB each) that host a WASM module instance | +| **Host API** | 12 functions in the `csi` namespace that WASM modules call to read sensor data and emit events | +| **RVF Container** | Signed binary envelope (192-byte overhead) wrapping a WASM payload with manifest, capabilities, and Ed25519 signature | +| **Budget Guard** | Per-frame execution time limit (default 10 ms); modules exceeding 10 consecutive faults are auto-stopped | +| **Adaptive Budget** | Mincut-eigenvalue-gap-driven compute allocation: scene complexity drives how much CPU time WASM modules get | +| **Aggregator** | Server (laptop, RPi, or cloud) that receives UDP streams from all nodes, aligns timestamps, and fuses features | +| **Feature-Level Fusion** | Combining per-node extracted features (not raw I/Q) to avoid cross-node clock synchronization | +| **Fused Frame** | Aggregated observation from all nodes for one time window, with cross-node correlation and fused motion energy | +| **NVS** | Non-Volatile Storage on ESP32 flash; stores runtime configuration (WiFi creds, edge tier, TDM slot, etc.) | +| **Provisioning** | Writing NVS key-value pairs to a device without recompiling firmware | +| **TDM Slot** | Time-Division Multiplexing slot assignment for coordinated multi-node transmission | +| **Channel Hopping** | Switching the ESP32 radio across WiFi channels (e.g., 1, 6, 11) for multi-band CSI diversity | +| **OTA Update** | Over-the-air firmware update via HTTP endpoint on port 8032 | + +--- + +## Bounded Contexts + +### 1. Sensor Node Context + +**Responsibility:** Capture raw WiFi CSI frames via the ESP-IDF CSI API, serialize them into the ADR-018 binary format, and stream to the aggregator over UDP. Handle channel hopping, TDM scheduling, and rate limiting. + +``` ++--------------------------------------------------------------+ +| Sensor Node Context | ++--------------------------------------------------------------+ +| | +| +----------------+ +----------------+ | +| | CSI Collector | | NVS Config | | +| | (promiscuous | | (20+ keys: | | +| | mode, I/Q | | ssid, ip, | | +| | extraction) | | tier, tdm...) | | +| +-------+--------+ +-------+--------+ | +| | | | +| | CSI callback | Boot config | +| | (Core 0, 50 Hz | | +| | rate limit) | | +| v | | +| +----------------+ | | +| | Stream Sender |<-----------+ | +| | (UDP to agg, | | +| | seq numbers, | | +| | ENOMEM | | +| | backoff) |---> UDP frames (magic 0xC5110001) | +| +-------+--------+ | +| | | +| | SPSC ring buffer (to Core 1) | +| v | +| [Edge Processing Context] | +| | ++--------------------------------------------------------------+ +``` + +**Aggregates:** +- `SensorNode` (Aggregate Root) + +**Value Objects:** +- `CsiFrame` +- `NodeIdentity` +- `NvsConfig` +- `TdmSchedule` +- `ChannelHopConfig` + +**Domain Services:** +- `CsiCollectionService` -- Registers ESP-IDF CSI callback, extracts I/Q, enforces 50 Hz rate limit +- `StreamSendService` -- Serializes frames to ADR-018 binary format, sends UDP with sequence numbers +- `NvsConfigService` -- Reads 20+ NVS keys at boot, provides typed config to all firmware components + +--- + +### 2. Edge Processing Context + +**Responsibility:** On-device signal processing pipeline running on Core 1. Implements Tiers 0-2: phase extraction, Welford running statistics, top-K subcarrier selection, bandpass filtering, BPM estimation, presence detection, and fall detection. + +``` ++--------------------------------------------------------------+ +| Edge Processing Context | ++--------------------------------------------------------------+ +| | +| SPSC ring buffer (from Core 0) | +| | | +| v | +| +----------------+ | +| | Phase Extract | Tier 1 | +| | + Unwrap | | +| +-------+--------+ | +| | | +| v | +| +----------------+ +----------------+ | +| | Welford Stats | | Top-K Select | | +| | (per-subcarrier| | (by variance) | | +| | running var) | +-------+--------+ | +| +-------+--------+ | | +| | | | +| +----------+----------+ | +| | | +| v | +| +------------------+-----------+ Tier 2 | +| | Biquad IIR Bandpass Filters | | +| | breathing: 0.1-0.5 Hz | | +| | heart rate: 0.8-2.0 Hz | | +| +-------+----------------------+ | +| | | +| v | +| +----------------+ +----------------+ | +| | Zero-Crossing | | Presence | | +| | BPM Estimator | | Detector | | +| | | | (adaptive | | +| | | | threshold, | | +| | | | 3-sigma cal) | | +| +-------+--------+ +-------+--------+ | +| | | | +| +----------+----------+ | +| | | +| v | +| +------------------+--------+ | +| | Fall Detector | | +| | (phase acceleration | | +| | threshold) | | +| +------------------+--------+ | +| | | +| v | +| +------------------+--------+ | +| | Multi-Person Clustering | | +| | (subcarrier groups, <=4) |----> VitalsPacket (0xC5110002) | +| +---------------------------+----> CompressedFrame (0xC5110003)| +| | ++--------------------------------------------------------------+ +``` + +**Aggregates:** +- `EdgeProcessingState` (Aggregate Root) + +**Value Objects:** +- `VitalsPacket` +- `CompressedFrame` +- `PresenceState` +- `BpmEstimate` +- `FallAlert` +- `EdgeTier` + +**Domain Services:** +- `PhaseExtractionService` -- Converts raw I/Q to amplitude + phase, applies unwrapping +- `WelfordStatsService` -- Maintains per-subcarrier running mean and variance +- `TopKSelectionService` -- Selects K subcarriers with highest variance for downstream processing +- `BandpassFilterService` -- Biquad IIR filters for breathing and heart rate frequency bands +- `PresenceDetectionService` -- Adaptive threshold with 1200-frame, 3-sigma calibration +- `FallDetectionService` -- Phase acceleration exceeding configurable threshold (default 2.0 rad/s^2) +- `DeltaCompressionService` -- XOR + RLE delta encoding for 30-50% bandwidth reduction + +--- + +### 3. WASM Runtime Context + +**Responsibility:** Manage the Tier 3 WASM programmable sensing layer. Load, validate, execute, and monitor WASM modules compiled from Rust. Enforce budget guards, handle RVF container verification, expose Host API, and provide HTTP management endpoints. + +``` ++--------------------------------------------------------------+ +| WASM Runtime Context | ++--------------------------------------------------------------+ +| | +| +--------------------+ +--------------------+ | +| | Module Manager | | RVF Verifier | | +| | (4 slots, load/ | | (Ed25519 sig, | | +| | unload/start/ | | SHA-256 hash, | | +| | stop lifecycle) | | host API compat) | | +| +--------+-----------+ +--------+-----------+ | +| | | | +| +----------+--------------+ | +| | | +| v | +| +-------------------+------------------+ | +| | WASM3 Interpreter | | +| | +-----------+ +-----------+ | | +| | | Slot 0 | | Slot 1 | ...x4 | | +| | | 160 KB | | 160 KB | | | +| | | arena | | arena | | | +| | +-----------+ +-----------+ | | +| +-------------------+------------------+ | +| | | +| v | +| +-------------------+------------------+ | +| | Host API (12 funcs) | | +| | csi_get_phase, csi_get_amplitude, | | +| | csi_get_variance, csi_get_bpm_*, | | +| | csi_emit_event, csi_log, ... | | +| +-------------------+------------------+ | +| | | +| v | +| +-------------------+------------------+ | +| | Budget Controller | | +| | B = clamp(B0 + k1*dL + k2*A | | +| | - k3*T - k4*P, | | +| | B_min, B_max) | | +| | 10 consecutive faults -> auto-stop | | +| +-------------------+------------------+ | +| | | +| +----> WASM events (magic 0xC5110004) | +| | +| +--------------------+ | +| | HTTP Upload Server | | +| | (port 8032) | | +| | POST /wasm/upload | | +| | GET /wasm/list | | +| | POST /wasm/start/N | | +| | POST /wasm/stop/N | | +| | DELETE /wasm/N | | +| +--------------------+ | +| | ++--------------------------------------------------------------+ +``` + +**Aggregates:** +- `WasmModuleSlot` (Aggregate Root) + +**Value Objects:** +- `RvfContainer` +- `RvfManifest` +- `WasmTelemetry` +- `HostApiVersion` +- `CapabilityBitmask` +- `BudgetAllocation` +- `ModuleState` + +**Domain Services:** +- `RvfVerificationService` -- Parses RVF header, verifies SHA-256 hash and Ed25519 signature +- `ModuleLifecycleService` -- Handles load -> start -> run -> stop -> unload transitions +- `BudgetControllerService` -- Computes per-frame budget from mincut eigenvalue gap, thermal, and battery pressure +- `HostApiBindingService` -- Links 12 host functions to WASM3 imports in the "csi" namespace +- `WasmUploadService` -- HTTP server on port 8032 for module management endpoints + +--- + +### 4. Aggregation Context + +**Responsibility:** Receive UDP CSI streams from multiple ESP32 nodes on the server side. Align timestamps across nodes (without cross-node phase synchronization), compute cross-node correlations, and produce fused feature frames for downstream pipeline consumption. + +``` ++--------------------------------------------------------------+ +| Aggregation Context | ++--------------------------------------------------------------+ +| | +| UDP socket (:5005) | +| | | | | +| v v v | +| +--------+ +--------+ +--------+ | +| | Node 0 | | Node 1 | | Node 2 | ... (up to 6) | +| | State | | State | | State | | +| | (ring | | (ring | | (ring | | +| | buf, | | buf, | | buf, | | +| | drift)| | drift)| | drift)| | +| +---+----+ +---+----+ +---+----+ | +| | | | | +| +-----+----+-----+---+ | +| | | | +| v v | +| +--------------------+--+ +-----------------------+ | +| | Timestamp Aligner | | Cross-Node Correlator | | +| | (per-node monotonic, | | (amplitude ratios, | | +| | no NTP needed) | | fused motion energy) | | +| +-----------+-----------+ +----------+------------+ | +| | | | +| +----------+----------------+ | +| | | +| v | +| +----------------------+-----+ | +| | Fused Frame | | +| | per_node_features[] | | +| | cross_node_correlation |--> pipeline_tx (mpsc channel) | +| | fused_motion_energy | | +| | fused_breathing_band | | +| +----------------------------+ | +| | ++--------------------------------------------------------------+ +``` + +**Aggregates:** +- `Esp32Aggregator` (Aggregate Root) + +**Value Objects:** +- `FusedFrame` +- `NodeState` +- `CrossNodeCorrelation` +- `FusedMotionEnergy` + +**Domain Services:** +- `UdpReceiverService` -- Listens on UDP port 5005, demuxes by magic number and node ID +- `TimestampAlignmentService` -- Maps per-node monotonic timestamps to aggregator-local time +- `FeatureFusionService` -- Computes cross-node correlation, fused motion (max across nodes), fused breathing (highest SNR) +- `PipelineBridgeService` -- Feeds fused frames into the wifi-densepose Rust pipeline via mpsc channel + +--- + +### 5. Provisioning Context + +**Responsibility:** Configure ESP32 sensor nodes by writing NVS key-value pairs without recompiling firmware. Support fleet provisioning via config files, deployment presets, read-back verification, and auto-detection of connected devices. + +``` ++--------------------------------------------------------------+ +| Provisioning Context | ++--------------------------------------------------------------+ +| | +| +--------------------+ +--------------------+ | +| | CLI Interface | | Config File Loader | | +| | (--ssid, --port, | | (JSON mesh config, | | +| | --edge-tier, | | common + per-node | | +| | --preset, ...) | | settings) | | +| +--------+-----------+ +--------+-----------+ | +| | | | +| +----------+--------------+ | +| | | +| v | +| +-------------------+------------------+ | +| | Preset Resolver | | +| | basic, vitals, mesh-3, | | +| | mesh-6-vitals | | +| +-------------------+------------------+ | +| | | +| v | +| +-------------------+------------------+ | +| | NVS Writer | | +| | esptool partition write | | +| | 20+ keys: ssid, password, | | +| | target_ip, edge_tier, tdm_slot, | | +| | hop_count, wasm_max, ... | | +| +-------------------+------------------+ | +| | | +| v | +| +-------------------+------------------+ | +| | Verifier (optional) | | +| | serial monitor for 5s, | | +| | check for "CSI streaming active" | | +| +--------------------------------------+ | +| | +| +--------------------+ | +| | Read-Back | | +| | (--read: dump NVS | | +| | partition, parse | | +| | key-value pairs) | | +| +--------------------+ | +| | +| +--------------------+ | +| | Auto-Detect | | +| | (scan serial ports | | +| | for ESP32-S3) | | +| +--------------------+ | +| | ++--------------------------------------------------------------+ +``` + +**Aggregates:** +- `ProvisioningSession` (Aggregate Root) + +**Value Objects:** +- `NvsConfig` +- `DeploymentPreset` +- `MeshConfig` +- `PortIdentity` +- `VerificationResult` + +**Domain Services:** +- `NvsWriteService` -- Writes typed NVS key-value pairs to the ESP32 flash partition via esptool +- `PresetResolverService` -- Maps named presets (basic, vitals, mesh-3, mesh-6-vitals) to NVS key sets +- `MeshProvisionerService` -- Iterates over nodes in a config file, computing TDM slots automatically +- `ReadBackService` -- Reads NVS partition, parses binary format, returns typed config +- `BootVerificationService` -- Opens serial monitor post-provision, checks for expected log lines + +--- + +## Aggregates + +### SensorNode (Aggregate Root) + +```rust +/// A physical ESP32-S3 device configured for CSI collection. +/// Owns its identity, configuration, firmware version, and current edge tier. +pub struct SensorNode { + /// Unique node identifier (0-255, assigned during provisioning) + node_id: u8, + /// WiFi MAC address of the ESP32-S3 + mac: MacAddress, + /// Current WiFi channel + channel: u8, + /// Firmware version string (e.g., "1.2.0") + firmware_version: FirmwareVersion, + /// Current edge processing tier (0-3) + edge_tier: EdgeTier, + /// Full NVS configuration snapshot + config: NvsConfig, + /// TDM slot assignment (None if standalone) + tdm_slot: Option, + /// Channel hopping configuration + hop_config: Option, + /// Current operational status + status: NodeStatus, + /// Monotonic boot timestamp (ms since power-on) + uptime_ms: u64, +} + +impl SensorNode { + /// Invariant: node_id must be unique within a mesh deployment + /// Invariant: edge_tier 3 requires WASM runtime to be initialized + /// Invariant: tdm_slot.slot < tdm_slot.total_nodes + pub fn new(node_id: u8, mac: MacAddress, config: NvsConfig) -> Self { /* ... */ } + + pub fn transition_tier(&mut self, new_tier: EdgeTier) -> Result<(), TierError> { + // Cannot go to Tier 3 if WASM runtime is not available + // Cannot downgrade while WASM modules are running + /* ... */ + } +} +``` + +### EdgeProcessingState (Aggregate Root) + +```rust +/// Maintains the full on-device DSP pipeline state for one sensor node. +/// Runs exclusively on Core 1. +pub struct EdgeProcessingState { + /// Current processing tier + tier: EdgeTier, + /// Per-subcarrier running statistics (Welford) + subcarrier_stats: [WelfordAccumulator; 56], + /// Top-K selected subcarrier indices + top_k_indices: Vec, + /// Biquad IIR filter states + breathing_filter: BiquadState, + heartrate_filter: BiquadState, + /// Current presence detection state + presence: PresenceState, + /// Latest BPM estimates + breathing_bpm: Option, + heartrate_bpm: Option, + /// Fall detection state + fall_detector: FallDetectorState, + /// Multi-person clustering state (up to 4 persons) + person_clusters: Vec, + /// Calibration state (1200-frame adaptive threshold) + calibration: CalibrationState, +} + +impl EdgeProcessingState { + /// Invariant: Only processes frames on Core 1 (never Core 0) + /// Invariant: Tier 0 performs no processing (passthrough only) + /// Invariant: Tier 2 includes all of Tier 1 processing + /// Invariant: person_clusters.len() <= 4 + pub fn process_frame(&mut self, frame: &RawCsiFrame) -> ProcessingResult { /* ... */ } +} +``` + +### WasmModuleSlot (Aggregate Root) + +```rust +/// One of 4 pre-allocated WASM execution slots on the ESP32-S3. +/// Each slot owns its PSRAM arena, WASM3 runtime instance, and telemetry. +pub struct WasmModuleSlot { + /// Slot index (0-3) + slot_id: u8, + /// Pre-allocated PSRAM arena (160 KB, fixed at boot) + arena: FixedArena, + /// Loaded module metadata (None if slot is empty) + module: Option, + /// Current slot state + state: ModuleState, + /// Per-module telemetry counters + telemetry: WasmTelemetry, + /// Budget allocation for this slot (microseconds per frame) + budget_us: u32, +} + +/// Metadata for a loaded WASM module +pub struct LoadedModule { + /// Module name from RVF manifest (up to 32 chars) + name: String, + /// SHA-256 hash of the WASM payload + build_hash: [u8; 32], + /// Declared capability bitmask + capabilities: CapabilityBitmask, + /// Author string from manifest + author: String, + /// WASM3 function pointers for lifecycle + fn_on_init: WasmFunction, + fn_on_frame: WasmFunction, + fn_on_timer: WasmFunction, +} + +impl WasmModuleSlot { + /// Invariant: Arena is pre-allocated at boot and never freed (prevents fragmentation) + /// Invariant: Module auto-stopped after 10 consecutive budget faults + /// Invariant: RVF signature must be verified before loading (when wasm_verify=1) + /// Invariant: Module binary + WASM3 heap must fit within 160 KB arena + pub fn load(&mut self, rvf: &RvfContainer) -> Result<(), WasmLoadError> { /* ... */ } + + pub fn on_frame(&mut self, n_sc: i32) -> Result, WasmExecError> { + // Measure execution time + // Record telemetry + // Check budget guard + /* ... */ + } +} +``` + +### Esp32Aggregator (Aggregate Root) + +```rust +/// Server-side aggregator that receives CSI streams from multiple ESP32 nodes, +/// aligns timestamps, and produces fused feature frames. +pub struct Esp32Aggregator { + /// UDP socket listening for node streams (port 5005) + socket: UdpSocket, + /// Per-node state: ring buffer, last timestamp, drift estimate + nodes: HashMap, + /// Ring buffer of fused feature frames + fused_buffer: VecDeque, + /// Channel to downstream pipeline + pipeline_tx: mpsc::Sender, + /// Configuration + config: AggregatorConfig, +} + +impl Esp32Aggregator { + /// Invariant: Fuses features, never raw phases (clock drift makes cross-node + /// phase alignment impossible with 20-50 ppm crystal oscillators) + /// Invariant: Handles missing nodes gracefully (partial fused frames are valid) + /// Invariant: Sequence number gaps < 100ms are interpolated, not dropped + pub fn receive_and_fuse(&mut self) -> Result { /* ... */ } +} +``` + +### ProvisioningSession (Aggregate Root) + +```rust +/// A provisioning session that configures one or more ESP32 nodes. +/// Tracks which nodes have been provisioned and their verification status. +pub struct ProvisioningSession { + /// Session identifier + session_id: SessionId, + /// Common configuration shared across all nodes + common_config: CommonConfig, + /// Per-node provisioning state + node_results: Vec, + /// Preset used (if any) + preset: Option, + /// Mesh configuration (if provisioning multiple nodes) + mesh_config: Option, +} + +impl ProvisioningSession { + /// Invariant: WiFi credentials must be non-empty + /// Invariant: target_ip must be a valid IPv4 address + /// Invariant: TDM slot indices must be unique and contiguous within a mesh + /// Invariant: hop_count must match the length of the channel list + pub fn provision_node(&mut self, port: &PortIdentity) -> Result<(), ProvisionError> { + /* ... */ + } +} +``` + +--- + +## Value Objects + +### CsiFrame + +```rust +/// A single CSI observation from one ESP32 node. +/// Immutable snapshot of the radio channel at one instant. +pub struct CsiFrame { + /// Monotonic timestamp (ms since node boot) + timestamp_ms: u32, + /// Source node identifier + node_id: u8, + /// RSSI in dBm (typically -90 to -20) + rssi: i8, + /// WiFi channel number (1-13) + channel: u8, + /// Per-subcarrier amplitude (|CSI|, 52-56 values) + amplitude: Vec, + /// Per-subcarrier phase (arg(CSI), 52-56 values, radians) + phase: Vec, + /// Sequence number for loss detection + seq_num: u32, +} +``` + +### VitalsPacket + +```rust +/// 32-byte Tier 2 output packet sent at configurable intervals. +/// Contains all vital sign estimates from on-device processing. +pub struct VitalsPacket { + /// Presence state + presence: PresenceState, + /// Motion score (0-255, higher = more motion) + motion_score: u8, + /// Breathing rate estimate (BPM, None if not confident) + breathing_bpm: Option, + /// Heart rate estimate (BPM, None if not confident) + heart_rate_bpm: Option, + /// Fall detected flag + fall_flag: bool, + /// Number of detected persons (0-8) + n_persons: u8, + /// Motion energy scalar + motion_energy: f32, + /// Presence confidence score + presence_score: f32, + /// RSSI at time of measurement + rssi: i8, + /// Timestamp (ms since boot) + timestamp_ms: u32, +} +``` + +### EdgeTier + +```rust +/// Processing tier on the ESP32-S3. Each tier includes all functionality +/// of lower tiers. +#[derive(Clone, Copy, PartialEq, Eq, PartialOrd, Ord)] +pub enum EdgeTier { + /// Tier 0: Raw CSI passthrough (magic 0xC5110001). No on-device processing. + Disabled = 0, + /// Tier 1: Phase unwrap, Welford stats, top-K selection, delta compression. + /// Adds ~30 KB binary overhead. + BasicDsp = 1, + /// Tier 2: All of Tier 1 + biquad bandpass, BPM estimation, presence, + /// fall detection, multi-person clustering. Adds ~3 KB over Tier 1. + FullPipeline = 2, + /// Tier 3: All of Tier 2 + WASM3 runtime for programmable sensing modules. + /// Adds ~100 KB binary (WASM3 interpreter). + WasmProgrammable = 3, +} +``` + +### NvsConfig + +```rust +/// Complete NVS configuration for one ESP32 sensor node. +/// Covers all 20+ firmware-readable keys. +pub struct NvsConfig { + // -- Network -- + pub ssid: String, + pub password: String, + pub target_ip: Ipv4Addr, + pub target_port: u16, // default: 5005 + pub node_id: u8, // default: 0 + + // -- TDM -- + pub tdm_slot: u8, // default: 0 + pub tdm_total: u8, // default: 1 (no TDM) + + // -- Channel Hopping -- + pub hop_count: u8, // default: 1 (no hop) + pub chan_list: Vec, // default: [1, 6, 11] + pub dwell_ms: u32, // default: 100 + + // -- Edge Processing -- + pub edge_tier: EdgeTier, // default: Tier 2 + pub pres_thresh: u16, // default: 0 (auto-calibrate) + pub fall_thresh: u16, // default: 2000 (2.0 rad/s^2) + pub vital_win: u16, // default: 256 + pub vital_int: u16, // default: 1000 ms + pub subk_count: u8, // default: 8 + + // -- Power -- + pub power_duty: u8, // default: 100 (always on) + + // -- WASM -- + pub wasm_max: u8, // default: 4 + pub wasm_verify: bool, // default: true (secure-by-default) + pub wasm_pubkey: Option<[u8; 32]>, // Ed25519 public key + + // -- MAC Filter -- + pub filter_mac: Option, +} +``` + +### RvfContainer + +```rust +/// RVF (RuVector Format) container for signed WASM deployment. +/// Total overhead: 192 bytes (32 header + 96 manifest + 64 signature). +pub struct RvfContainer { + /// Format version (currently 1) + pub format_version: u16, + /// Feature flags (bit 0: has_signature, bit 1: has_test_vectors) + pub flags: u16, + /// Module manifest + pub manifest: RvfManifest, + /// Raw WASM payload (starts with "\0asm" magic) + pub wasm_payload: Vec, + /// Ed25519 signature over header + manifest + payload (64 bytes) + pub signature: Option<[u8; 64]>, + /// Optional test vectors for self-verification + pub test_vectors: Option>, +} + +/// 96-byte packed manifest describing the WASM module. +pub struct RvfManifest { + pub module_name: String, // up to 32 chars + pub required_host_api: u16, // version (1 = current) + pub capabilities: CapabilityBitmask, + pub max_frame_us: u32, // requested per-frame budget + pub max_events_per_sec: u16, // rate limit + pub memory_limit_kb: u16, // max WASM heap + pub event_schema_version: u16, + pub build_hash: [u8; 32], // SHA-256 of WASM payload + pub min_subcarriers: u16, + pub max_subcarriers: u16, + pub author: String, // up to 10 chars +} +``` + +### WasmTelemetry + +```rust +/// Per-module execution telemetry, exposed via /wasm/list endpoint. +pub struct WasmTelemetry { + /// Total on_frame() calls since module start + pub frame_count: u32, + /// Total csi_emit_event() calls + pub event_count: u32, + /// WASM3 runtime errors + pub error_count: u32, + /// Cumulative execution time (microseconds) + pub total_us: u32, + /// Worst-case single-frame execution time (microseconds) + pub max_us: u32, + /// Number of times frame budget was exceeded + pub budget_faults: u32, +} +``` + +### FusedFrame + +```rust +/// Aggregated observation from all nodes for one time window. +/// Product of feature-level fusion (not signal-level). +pub struct FusedFrame { + /// Aggregator-local monotonic timestamp + timestamp: Instant, + /// Per-node features (None if node dropped frames) + node_features: Vec>, + /// Cross-node correlation matrix (N x N) + cross_node_correlation: Array2, + /// Fused motion energy (max across all nodes) + fused_motion_energy: f64, + /// Fused breathing band (coherent sum from highest-SNR node) + fused_breathing_band: f64, +} +``` + +### DeploymentPreset + +```rust +/// Named provisioning presets for common deployment scenarios. +pub enum DeploymentPreset { + /// Single node, Tier 0, no TDM, no hopping + Basic, + /// Single node, Tier 2, vital_int=1000, subk_count=32 + Vitals, + /// 3-node TDM, Tier 1, hop_count=3, channels=[1,6,11] + Mesh3, + /// 6-node TDM, Tier 2, hop_count=3, channels=[1,6,11], vital_int=500 + Mesh6Vitals, +} +``` + +### PresenceState + +```rust +/// Tri-state presence classification from the edge DSP pipeline. +pub enum PresenceState { + /// No motion detected; room appears empty + Empty, + /// Static human presence (breathing motion only) + Present, + /// Active motion detected + Moving, +} +``` + +### ModuleState + +```rust +/// Lifecycle state of a WASM module slot. +pub enum ModuleState { + /// Slot is empty (arena allocated but no module loaded) + Empty, + /// Module loaded into arena but not yet started + Loaded, + /// Module running: on_frame() called per CSI frame, on_timer() at interval + Running, + /// Module explicitly stopped by user + Stopped, + /// Module auto-stopped due to error (10 consecutive budget faults or runtime error) + Error, +} +``` + +--- + +## Domain Events + +### Sensor Node Events + +```rust +/// Emitted when an ESP32 node completes boot and begins CSI collection. +pub struct NodeBooted { + pub node_id: u8, + pub mac: MacAddress, + pub firmware_version: FirmwareVersion, + pub edge_tier: EdgeTier, + pub uptime_ms: u64, + pub timestamp: DateTime, +} + +/// Emitted each time a CSI frame is received by the aggregator. +pub struct CsiFrameReceived { + pub node_id: u8, + pub seq_num: u32, + pub subcarrier_count: u8, + pub rssi: i8, + pub channel: u8, + pub timestamp: DateTime, +} +``` + +### Edge Processing Events + +```rust +/// Emitted when presence detection state transitions. +pub struct PresenceChanged { + pub node_id: u8, + pub previous: PresenceState, + pub current: PresenceState, + pub motion_energy: f32, + pub timestamp: DateTime, +} + +/// Emitted at each vitals interval (default 1 Hz) with latest estimates. +pub struct VitalsUpdated { + pub node_id: u8, + pub breathing_bpm: Option, + pub heart_rate_bpm: Option, + pub n_persons: u8, + pub timestamp: DateTime, +} + +/// Emitted when phase acceleration exceeds the fall detection threshold. +pub struct FallDetected { + pub node_id: u8, + pub motion_energy: f32, + pub phase_acceleration: f32, + pub threshold: f32, + pub timestamp: DateTime, +} +``` + +### WASM Runtime Events + +```rust +/// Emitted when a WASM module is loaded into a slot and passes verification. +pub struct WasmModuleLoaded { + pub slot_id: u8, + pub module_name: String, + pub build_hash: [u8; 32], + pub capabilities: CapabilityBitmask, + pub author: String, + pub timestamp: DateTime, +} + +/// Emitted when a WASM module is auto-stopped or encounters a runtime error. +pub struct WasmModuleFaulted { + pub slot_id: u8, + pub module_name: String, + pub fault_type: WasmFaultType, + pub fault_count: u32, + pub telemetry: WasmTelemetry, + pub timestamp: DateTime, +} + +pub enum WasmFaultType { + /// Exceeded per-frame budget 10 consecutive times + BudgetExhausted, + /// WASM3 runtime trap (stack overflow, OOB memory, etc.) + RuntimeTrap, + /// Module called an unavailable host API function + MissingImport, + /// RVF signature verification failed + SignatureInvalid, +} + +/// Emitted when a WASM module calls csi_emit_event(). +pub struct WasmEventEmitted { + pub slot_id: u8, + pub module_name: String, + pub event_type: u8, + pub value: f32, + pub timestamp: DateTime, +} +``` + +### Provisioning Events + +```rust +/// Emitted when a node's NVS configuration has been successfully written. +pub struct NodeProvisioningComplete { + pub node_id: u8, + pub port: String, + pub config_keys: Vec, + pub preset: Option, + pub verified: bool, + pub timestamp: DateTime, +} + +/// Emitted when mesh provisioning completes for all nodes in a config file. +pub struct MeshProvisioningComplete { + pub session_id: SessionId, + pub node_count: usize, + pub failed_nodes: Vec, + pub timestamp: DateTime, +} +``` + +--- + +## Invariants + +### Firmware Architecture Invariants + +| # | Invariant | Rationale | Enforcement | +|---|-----------|-----------|-------------| +| 1 | Core 0 handles WiFi + CSI callback only; Core 1 handles all DSP | Prevents WiFi stack corruption from compute-heavy DSP. CSI callback runs in ISR context on Core 0. | FreeRTOS task pinning: `xTaskCreatePinnedToCore(..., 1)` for DSP task | +| 2 | SPSC ring buffer between cores prevents memory contention | Lock-free single-producer single-consumer avoids mutexes between ISR and task contexts | `csi_spsc_ring` implementation with atomic read/write indices | +| 3 | CSI callback rate-limited to 50 Hz | Prevents lwIP pbuf exhaustion at high CSI rates (100-500 Hz in promiscuous mode). Issue #127 root cause. | 20 ms minimum interval check in `csi_collector.c` | +| 4 | `sendto()` uses 100 ms ENOMEM backoff | UDP sends can fail when lwIP pbuf pool is temporarily exhausted; immediate retry amplifies the problem | `stream_sender.c` checks `errno == ENOMEM` and delays | +| 5 | Binary must fit within 1 MB OTA partition | ESP32-S3 partition table allocates 1 MB for the factory app. Exceeding this prevents OTA updates. | CI size gate at 950 KB in `firmware-ci.yml` | + +### WASM Runtime Invariants + +| # | Invariant | Rationale | Enforcement | +|---|-----------|-----------|-------------| +| 6 | WASM modules get max 10 ms per frame | Prevents a runaway module from blocking the Tier 2 DSP pipeline and missing CSI frames | `esp_timer_get_time()` measurement + budget fault counter | +| 7 | Auto-stop after 10 consecutive budget faults | Graceful degradation: faulted module is stopped, Tier 2 pipeline continues unaffected | Fault counter in `WasmModuleSlot`, state transition to `Error` | +| 8 | RVF signature verification enabled by default | WASM upload is remote code execution; signatures ensure authenticity | `wasm_verify=1` default in Kconfig and NVS fallback | +| 9 | WASM arenas are pre-allocated at boot (640 KB PSRAM) | Dynamic malloc/free cycles fragment PSRAM over days of continuous operation | Fixed 160 KB arenas per slot, zeroed on unload but never freed | +| 10 | Maximum 4 concurrent WASM module slots | Bounds PSRAM usage and prevents compute exhaustion | `WASM_MAX_SLOTS` constant, validated at load time | + +### Aggregation Invariants + +| # | Invariant | Rationale | Enforcement | +|---|-----------|-----------|-------------| +| 11 | Feature-level fusion only (never raw phase alignment) | ESP32 crystal drift of 20-50 ppm makes cross-node phase coherence impossible | Aggregator extracts per-node features independently, then correlates | +| 12 | Missing nodes produce partial fused frames, not errors | Nodes may drop offline; the system must degrade gracefully | `Option` per node in `FusedFrame.node_features` | +| 13 | Sequence number gaps < 100 ms are interpolated | Brief UDP losses should not create discontinuities in downstream processing | Gap detection + linear interpolation in `NodeState` ring buffer | + +### Provisioning Invariants + +| # | Invariant | Rationale | Enforcement | +|---|-----------|-----------|-------------| +| 14 | WiFi credentials must never appear in tracked files | Prevents credential leakage via git history | `.gitignore` for `sdkconfig`, `provision.py` writes to NVS only | +| 15 | TDM slot indices must be unique within a mesh | Duplicate slots cause transmission collisions | Validation in `MeshProvisionerService`, config file schema check | +| 16 | `hop_count` must equal `chan_list.len()` | Mismatch causes firmware to read uninitialized channel values | CLI validation + NVS write-time assertion | + +--- + +## Domain Services + +### CsiCollectionService + +Manages the ESP-IDF CSI API lifecycle on Core 0. + +```rust +pub trait CsiCollectionService { + /// Register CSI callback, configure promiscuous mode, set channel. + /// Rate-limits callback invocations to 50 Hz. + fn start_collection(&mut self, config: &NvsConfig) -> Result<(), CsiError>; + + /// Stop CSI collection and deregister callback. + fn stop_collection(&mut self) -> Result<(), CsiError>; + + /// Get current collection statistics (frames/sec, drops, errors). + fn stats(&self) -> CollectionStats; +} +``` + +### EdgeProcessingPipeline + +Orchestrates the Tier 1-2 DSP chain on Core 1. + +```rust +pub trait EdgeProcessingPipeline { + /// Process a single CSI frame through the configured tier pipeline. + /// Returns vitals packet (if Tier 2 interval elapsed) and/or compressed frame. + fn process_frame( + &mut self, + raw: &RawCsiFrame, + ) -> Result; + + /// Reconfigure the pipeline tier at runtime (e.g., via NVS update). + fn set_tier(&mut self, tier: EdgeTier) -> Result<(), TierError>; + + /// Get calibration status (0.0-1.0, 1.0 = fully calibrated after 1200 frames). + fn calibration_progress(&self) -> f32; +} + +pub struct ProcessingOutput { + pub vitals: Option, + pub compressed: Option, + pub events: Vec, +} +``` + +### WasmModuleManager + +Manages the lifecycle of WASM modules across 4 slots. + +```rust +pub trait WasmModuleManager { + /// Load an RVF container into the next available slot. + /// Verifies signature (if wasm_verify=1), checks host API compatibility, + /// validates binary fits within arena. + fn load_module(&mut self, rvf: &RvfContainer) -> Result; + + /// Start a loaded module (calls on_init()). + fn start_module(&mut self, slot_id: u8) -> Result<(), WasmExecError>; + + /// Stop a running module. + fn stop_module(&mut self, slot_id: u8) -> Result<(), WasmExecError>; + + /// Unload a module from its slot (zeroes the arena). + fn unload_module(&mut self, slot_id: u8) -> Result<(), WasmLoadError>; + + /// Get telemetry for all slots. + fn list_modules(&self) -> Vec<(u8, Option<&LoadedModule>, &ModuleState, &WasmTelemetry)>; + + /// Execute on_frame() for all running modules within the budget. + fn dispatch_frame(&mut self, n_sc: i32) -> Vec; +} +``` + +### FeatureFusionService + +Server-side fusion of per-node features into a coherent multi-node observation. + +```rust +pub trait FeatureFusionService { + /// Fuse features from N nodes for one time window. + /// - Motion energy: max across nodes + /// - Breathing band: highest-SNR node as primary + /// - Location: cross-node amplitude ratios + fn fuse( + &self, + node_features: &[Option], + ) -> Result; + + /// Compute cross-node correlation matrix. + fn cross_correlate( + &self, + features: &[Option], + ) -> Array2; +} +``` + +### ProvisioningService + +Orchestrates the full provisioning workflow for individual nodes and meshes. + +```rust +pub trait ProvisioningService { + /// Provision a single node with the given configuration. + fn provision_node( + &mut self, + port: &PortIdentity, + config: &NvsConfig, + ) -> Result; + + /// Provision all nodes defined in a mesh config file. + fn provision_mesh( + &mut self, + mesh: &MeshConfig, + ) -> Result, ProvisionError>; + + /// Read back the current NVS configuration from a connected device. + fn read_config( + &self, + port: &PortIdentity, + ) -> Result; + + /// Verify a provisioned node booted successfully. + fn verify_boot( + &self, + port: &PortIdentity, + timeout_secs: u32, + ) -> Result; + + /// Auto-detect connected ESP32-S3 devices. + fn detect_ports(&self) -> Vec; +} +``` + +--- + +## Context Map + +``` ++------------------------------------------------------------------+ +| Hardware Platform Domain | ++------------------------------------------------------------------+ +| | +| +------------------+ | +| | Provisioning | | +| | Context |--(writes NVS)---+ | +| | (provision.py) | | | +| +------------------+ | | +| v | +| +------------------+ SPSC +------------------+ | +| | Sensor Node |---------->| Edge Processing | | +| | Context | ring buf | Context | | +| | (Core 0) | | (Core 1) | | +| +--------+---------+ +--------+----------+ | +| | | | +| | UDP raw (0x01) | feeds CSI data | +| | v | +| | +------------------+ | +| | | WASM Runtime | | +| | | Context | | +| | | (Tier 3, Core 1)| | +| | +--------+---------+ | +| | | | +| | UDP raw UDP vitals (0x02) | UDP events (0x04) | +| | (0x01) UDP compressed | | +| | (0x03) | | +| +----------+------------------+ | +| | | +| v | +| +------------------+ | +| | Aggregation | | +| | Context | | +| | (Server-side) | | +| +--------+---------+ | +| | | +| | mpsc channel | +| v | ++------------------------------------------------------------------+ +| DOWNSTREAM (Customer/Supplier) | +| +-----------------+ +-----------------+ +-----------------+ | +| | wifi-densepose | | wifi-densepose | | wifi-densepose | | +| | -signal | | -nn | | -mat | | +| | (RuvSense) | | (Inference) | | (Disaster) | | +| +-----------------+ +-----------------+ +-----------------+ | ++------------------------------------------------------------------+ +``` + +**Relationship Types:** + +| Upstream | Downstream | Relationship | Description | +|----------|------------|-------------|-------------| +| Provisioning | Sensor Node | **Customer/Supplier** | Provisioning writes NVS config that the node reads at boot | +| Sensor Node | Edge Processing | **Partnership** | Tightly coupled via SPSC ring buffer on the same chip | +| Edge Processing | WASM Runtime | **Customer/Supplier** | Edge pipeline feeds CSI data to WASM modules via Host API | +| Sensor Node | Aggregation | **Published Language** | ADR-018 binary wire format (magic bytes, fixed offsets) | +| Edge Processing | Aggregation | **Published Language** | Vitals (0xC5110002) and compressed (0xC5110003) wire formats | +| WASM Runtime | Aggregation | **Published Language** | WASM events (0xC5110004) wire format | +| Aggregation | Downstream crates | **Customer/Supplier** | Aggregator produces `FusedFrame` consumed by signal/nn/mat | + +--- + +## Anti-Corruption Layers + +### Aggregator-to-Pipeline ACL + +The aggregator translates between the hardware-specific ESP32 binary wire format and the wifi-densepose Rust pipeline types. + +```rust +/// Adapts raw ESP32 UDP packets to the wifi-densepose-signal CsiData type. +pub struct Esp32ToPipelineAdapter { + /// Maps ADR-018 magic bytes to frame type + frame_parser: FrameParser, + /// Converts ESP32 I/Q byte pairs to f32 amplitude/phase + iq_converter: IqConverter, +} + +impl Esp32ToPipelineAdapter { + /// Parse a raw UDP datagram from an ESP32 node into a pipeline-ready frame. + /// Handles magic byte demuxing: + /// 0xC5110001 -> raw CSI frame + /// 0xC5110002 -> vitals packet + /// 0xC5110003 -> compressed frame (decompress first) + /// 0xC5110004 -> WASM event packet + pub fn parse_datagram( + &self, + data: &[u8], + src_addr: SocketAddr, + ) -> Result { + /* ... */ + } +} + +pub enum ParsedFrame { + RawCsi(CsiFrame), + Vitals(VitalsPacket), + CompressedCsi(CsiFrame), // decompressed + WasmEvent(WasmEventEmitted), +} +``` + +### WASM Host API ACL + +The Host API acts as an anti-corruption layer between the WASM module world (no_std, wasm32 ABI) and the firmware's C data structures. + +```rust +/// Translates between WASM3 linear memory and firmware C structs. +/// Each Host API function validates indices, clamps values, and converts types. +pub struct WasmHostApiAdapter { + /// Pointer to current CSI frame data (set before each on_frame dispatch) + current_frame: *const EdgeProcessingState, + /// Event buffer for this dispatch cycle + event_buffer: Vec, +} + +impl WasmHostApiAdapter { + /// csi_get_phase(sc_idx: i32) -> f32 + /// Validates sc_idx is within [0, n_subcarriers), returns 0.0 if out of bounds. + pub fn get_phase(&self, sc_idx: i32) -> f32 { /* ... */ } + + /// csi_emit_event(event_type: i32, value: f32) -> void + /// Validates event_type is within the module's declared event ID range. + /// Applies dead-band filter: suppresses if |value - last_emitted| < threshold. + pub fn emit_event(&mut self, event_type: i32, value: f32) { /* ... */ } +} +``` + +### Provisioning-to-NVS ACL + +The provisioning tool translates between human-readable CLI arguments / JSON config files and the ESP-IDF NVS binary format. + +```rust +/// Adapts CLI/JSON configuration to ESP32 NVS binary partition format. +/// Handles type conversions, validation, and encoding. +pub struct ProvisioningAdapter { + /// Maps CLI flag names to NVS key names and types + key_registry: HashMap, +} + +pub struct NvsKeySpec { + pub nvs_key: String, // e.g., "edge_tier" + pub nvs_type: NvsType, // u8, u16, u32, string, blob + pub default: Option, + pub validator: Box bool>, +} + +impl ProvisioningAdapter { + /// Convert a typed NvsConfig struct to a list of NVS binary writes. + pub fn to_nvs_entries(&self, config: &NvsConfig) -> Vec { /* ... */ } + + /// Parse an NVS binary partition dump into a typed NvsConfig. + pub fn from_nvs_partition(&self, data: &[u8]) -> Result { /* ... */ } +} +``` + +--- + +## Wire Protocol Summary + +All ESP32 UDP packets share a 4-byte magic prefix for demuxing at the aggregator. + +| Magic | Name | Source | Size | Rate | Description | +|-------|------|--------|------|------|-------------| +| `0xC5110001` | Raw CSI | Tier 0+ | ~128-404 B | 20-28.5 Hz | Full I/Q per subcarrier | +| `0xC5110002` | Vitals | Tier 2+ | 32 B | 1 Hz (configurable) | Presence, BPM, fall flag | +| `0xC5110003` | Compressed | Tier 1+ | variable | 20-28.5 Hz | XOR+RLE delta-compressed CSI | +| `0xC5110004` | WASM Events | Tier 3 | variable | event-driven | Module event_type + value tuples | + +--- + +## Hardware Constraints and Measured Performance + +| Metric | Value | Source | +|--------|-------|--------| +| CSI frame rate | 28.5 Hz (measured) | ADR-039 hardware benchmark | +| Boot to ready | 3.9 s | WiFi connect dominates | +| Binary size | 925 KB (10% free in 1 MB) | Includes full WASM3 runtime | +| WASM init time | 106 ms | 4 slots, 160 KB arenas | +| WASM binary size (7 modules) | 13.8 KB | wasm32-unknown-unknown release | +| Internal RAM available | 316 KiB | No PSRAM on test board | +| Crystal drift | 20-50 ppm | 72-180 ms divergence per hour | +| BOM (3-node starter kit) | $54 | ADR-012 bill of materials | + +--- + +## References + +- [ADR-012: ESP32 CSI Sensor Mesh](../adr/ADR-012-esp32-csi-sensor-mesh.md) -- Hardware selection, mesh architecture, BOM +- [ADR-018: Dev Implementation](../adr/ADR-018-dev-implementation.md) -- Binary frame format, ADR-018 wire protocol +- [ADR-039: ESP32-S3 Edge Intelligence](../adr/ADR-039-esp32-edge-intelligence.md) -- Tiered processing, DSP pipeline, hardware benchmarks +- [ADR-040: WASM Programmable Sensing](../adr/ADR-040-wasm-programmable-sensing.md) -- WASM3 runtime, Host API, RVF container, adaptive budget +- [ADR-041: WASM Module Collection](../adr/ADR-041-wasm-module-collection.md) -- 60-module catalog, event ID registry, budget tiers +- [ADR-044: Provisioning Tool Enhancements](../adr/ADR-044-provisioning-tool-enhancements.md) -- NVS coverage, presets, mesh config, read-back +- [RuvSense Domain Model](ruvsense-domain-model.md) -- Upstream signal processing domain +- [WiFi-Mat Domain Model](wifi-mat-domain-model.md) -- Downstream disaster response domain diff --git a/docs/ddd/sensing-server-domain-model.md b/docs/ddd/sensing-server-domain-model.md new file mode 100644 index 00000000..18d02690 --- /dev/null +++ b/docs/ddd/sensing-server-domain-model.md @@ -0,0 +1,842 @@ +# Sensing Server Domain Model + +The Sensing Server is the single-binary deployment surface of WiFi-DensePose. It receives raw CSI frames from ESP32 nodes, processes them into sensing features, streams live data to a web UI, and provides a self-contained workflow for recording data, training models, and running inference -- all without external dependencies. + +This document defines the system using [Domain-Driven Design](https://martinfowler.com/bliki/DomainDrivenDesign.html) (DDD): bounded contexts that own their data and rules, aggregate roots that enforce invariants, value objects that carry meaning, and domain events that connect everything. The server is implemented as a single Axum binary (`wifi-densepose-sensing-server`) with all state managed through `Arc>`. + +**Bounded Contexts:** + +| # | Context | Responsibility | Key ADRs | Code | +|---|---------|----------------|----------|------| +| 1 | [CSI Ingestion](#1-csi-ingestion-context) | Receive, decode, and feature-extract CSI frames from ESP32 UDP | [ADR-019](../adr/ADR-019-sensing-only-ui-mode.md), [ADR-035](../adr/ADR-035-live-sensing-ui-accuracy.md) | `sensing-server/src/main.rs` | +| 2 | [Model Management](#2-model-management-context) | Load, unload, list RVF models; LoRA profile activation | [ADR-043](../adr/ADR-043-sensing-server-ui-api-completion.md) | `sensing-server/src/model_manager.rs` | +| 3 | [CSI Recording](#3-csi-recording-context) | Record CSI frames to .jsonl files, manage recording sessions | [ADR-043](../adr/ADR-043-sensing-server-ui-api-completion.md) | `sensing-server/src/recording.rs` | +| 4 | [Training Pipeline](#4-training-pipeline-context) | Background training runs, progress streaming, contrastive pretraining | [ADR-043](../adr/ADR-043-sensing-server-ui-api-completion.md) | `sensing-server/src/training_api.rs` | +| 5 | [Visualization](#5-visualization-context) | WebSocket streaming to web UI, Gaussian splat rendering, data transparency | [ADR-019](../adr/ADR-019-sensing-only-ui-mode.md), [ADR-035](../adr/ADR-035-live-sensing-ui-accuracy.md) | `ui/` | + +All code paths shown are relative to `rust-port/wifi-densepose-rs/crates/wifi-densepose-` unless otherwise noted. + +--- + +## Domain-Driven Design Specification + +### Ubiquitous Language + +| Term | Definition | +|------|------------| +| **Sensing Update** | A complete JSON message broadcast to WebSocket clients each tick, containing node data, features, classification, signal field, and optional vital signs | +| **Tick** | One processing cycle of the sensing loop (default 100ms = 10 fps, configurable via `--tick-ms`) | +| **Data Source** | Origin of CSI data: `esp32` (UDP port 5005), `wifi` (Windows RSSI), `simulated` (synthetic), or `auto` (try ESP32 then fall back) | +| **RVF Model** | A `.rvf` container file holding trained weights, manifest metadata, optional LoRA adapters, and vital sign configuration | +| **LoRA Profile** | A lightweight adapter applied on top of a base RVF model for environment-specific fine-tuning without retraining the full model | +| **Recording Session** | A period during which CSI frames are appended to a `.csi.jsonl` file, identified by a session ID and optional activity label | +| **Training Run** | A background task that loads recorded CSI data, extracts features, trains a regularised linear model, and exports a `.rvf` container | +| **Frame History** | A circular buffer of the last 100 CSI amplitude vectors used for temporal analysis (sliding-window variance, Goertzel breathing estimation) | +| **Goertzel Filter** | A frequency-domain estimator applied to the frame history to detect breathing rate (0.1--0.5 Hz) via a 9-candidate filter bank | +| **Signal Field** | A 20x1x20 grid of interpolated signal intensity values rendered as Gaussian splats in the UI | +| **Pose Source** | Whether pose keypoints are `signal_derived` (analytical from CSI features) or `model_inference` (from a loaded RVF model) | +| **Progressive Loader** | A two-layer model loading strategy: Layer A loads instantly for basic inference, Layer B loads in background for full accuracy | +| **Sensing-Only Mode** | UI mode when the DensePose backend is unavailable; suppresses DensePose tabs, shows only sensing and signal visualization | +| **AppStateInner** | The single shared state struct holding all server state, accessed via `Arc>` | +| **PCK Score** | Percentage of Correct Keypoints -- the primary accuracy metric for pose estimation models | +| **Contrastive Pretraining** | Self-supervised training on unlabeled CSI data that learns signal representations before supervised fine-tuning (ADR-024) | + +--- + +## Bounded Contexts + +### 1. CSI Ingestion Context + +**Responsibility:** Receive raw CSI frames from ESP32 nodes via UDP (port 5005), decode the binary protocol, extract temporal and frequency-domain features, and produce a `SensingUpdate` each tick. + +``` ++------------------------------------------------------------+ +| CSI Ingestion Context | ++------------------------------------------------------------+ +| | +| +----------------+ +----------------+ | +| | UDP Listener | | Data Source | | +| | (port 5005) | | Selector | | +| | Esp32Frame | | (auto/esp32/ | | +| | parser | | wifi/sim) | | +| +-------+--------+ +-------+--------+ | +| | | | +| +----------+----------+ | +| v | +| +-------------------+ | +| | Frame History | | +| | Buffer | | +| | (VecDeque, | | +| | 100 frames) | | +| +--------+----------+ | +| v | +| +-------------------+ | +| | Feature | | +| | Extractor | | +| | (Welford stats, | | +| | Goertzel FFT, | | +| | L2 motion) | | +| +--------+----------+ | +| v | +| +-------------------+ | +| | Vital Sign | | +| | Detector |---> SensingUpdate | +| | (HR, RR, | | +| | breathing) | | +| +-------------------+ | +| | ++------------------------------------------------------------+ +``` + +**Aggregates:** + +```rust +/// Aggregate Root: The central shared state of the sensing server. +/// All mutations go through RwLock. All handler functions receive +/// State>>. +pub struct AppStateInner { + /// Most recent sensing update broadcast to clients. + latest_update: Option, + /// RSSI history for sparkline display. + rssi_history: VecDeque, + /// Circular buffer of recent CSI amplitude vectors (100 frames). + frame_history: VecDeque>, + /// Monotonic tick counter. + tick: u64, + /// Active data source identifier ("esp32", "wifi", "simulated"). + source: String, + /// Broadcast channel for WebSocket fan-out. + tx: broadcast::Sender, + /// Vital sign detector instance. + vital_detector: VitalSignDetector, + /// Most recent vital signs reading. + latest_vitals: VitalSigns, + /// Smoothed person count (EMA) for hysteresis. + smoothed_person_score: f64, + // ... model, recording, training fields (see other contexts) +} +``` + +**Value Objects:** + +```rust +/// A complete sensing update broadcast to WebSocket clients each tick. +pub struct SensingUpdate { + pub msg_type: String, // always "sensing_update" + pub timestamp: f64, // Unix timestamp with ms precision + pub source: String, // "esp32" | "wifi" | "simulated" + pub tick: u64, // monotonic tick counter + pub nodes: Vec, // per-node CSI data + pub features: FeatureInfo, // extracted signal features + pub classification: ClassificationInfo, + pub signal_field: SignalField, + pub vital_signs: Option, + pub persons: Option>, + pub estimated_persons: Option, +} + +/// Per-node CSI data received from one ESP32. +pub struct NodeInfo { + pub node_id: u8, + pub rssi_dbm: f64, + pub position: [f64; 3], + pub amplitude: Vec, + pub subcarrier_count: usize, +} + +/// Extracted signal features from the frame history buffer. +pub struct FeatureInfo { + pub mean_rssi: f64, + pub variance: f64, + pub motion_band_power: f64, + pub breathing_band_power: f64, + pub dominant_freq_hz: f64, + pub change_points: usize, + pub spectral_power: f64, +} + +/// Motion classification derived from features. +pub struct ClassificationInfo { + pub motion_level: String, // "empty" | "static" | "active" + pub presence: bool, + pub confidence: f64, +} + +/// Interpolated signal field for Gaussian splat visualization. +pub struct SignalField { + pub grid_size: [usize; 3], // [20, 1, 20] + pub values: Vec, +} + +/// ESP32 binary CSI frame (ADR-018 protocol, 20-byte header). +pub struct Esp32Frame { + pub magic: u32, // 0xC5100001 + pub node_id: u8, + pub n_antennas: u8, + pub n_subcarriers: u8, + pub freq_mhz: u16, + pub sequence: u32, + pub rssi: i8, + pub noise_floor: i8, + pub amplitudes: Vec, + pub phases: Vec, +} + +/// Data source selection enum. +pub enum DataSource { + Esp32Udp, // Real ESP32 CSI via UDP port 5005 + WindowsRssi, // Windows WiFi RSSI via netsh + Simulated, // Synthetic sine-wave data + Auto, // Try ESP32, fall back to Windows, then simulated +} +``` + +**Domain Services:** +- `FeatureExtractionService` -- Computes temporal variance (Welford), Goertzel breathing estimation (9-band filter bank), L2 frame-to-frame motion score, SNR-based signal quality +- `VitalSignDetectionService` -- Estimates breathing rate, heart rate, and confidence from CSI phase history +- `DataSourceSelectionService` -- Probes UDP port 5005 for ESP32 frames; falls back through Windows RSSI then simulation + +**Invariants:** +- Frame history buffer never exceeds 100 entries (oldest dropped on push) +- Goertzel breathing estimate requires 3x SNR above noise to be reported +- Source type is determined once at startup and does not change during runtime + +--- + +### 2. Model Management Context + +**Responsibility:** Discover `.rvf` model files from `data/models/`, load weights into memory for inference, manage the active model lifecycle, and support LoRA profile activation. + +``` ++------------------------------------------------------------+ +| Model Management Context | ++------------------------------------------------------------+ +| | +| +----------------+ +----------------+ | +| | Model Scanner | | RVF Reader | | +| | (data/models/ | | (parse .rvf | | +| | *.rvf enum) | | manifest) | | +| +-------+--------+ +-------+--------+ | +| | | | +| +----------+----------+ | +| v | +| +-------------------+ | +| | Model Registry | | +| | (Vec) | | +| +--------+----------+ | +| v | +| +-------------------+ | +| | Model Loader | | +| | (RvfReader -> |---> LoadedModelState | +| | weights, | | +| | LoRA profiles) | | +| +--------+----------+ | +| v | +| +-------------------+ | +| | LoRA Activator | | +| | (profile switch) | | +| +-------------------+ | +| | ++------------------------------------------------------------+ +``` + +**Aggregates:** + +```rust +/// Aggregate Root: Runtime state for a loaded RVF model. +/// At most one LoadedModelState exists at any time. +pub struct LoadedModelState { + /// Model identifier (derived from filename without .rvf extension). + pub model_id: String, + /// Original filename on disk. + pub filename: String, + /// Version string from the RVF manifest. + pub version: String, + /// Description from the RVF manifest. + pub description: String, + /// LoRA profiles available in this model. + pub lora_profiles: Vec, + /// Currently active LoRA profile (if any). + pub active_lora_profile: Option, + /// Model weights (f32 parameters). + pub weights: Vec, + /// Number of frames processed since load. + pub frames_processed: u64, + /// Cumulative inference time for avg calculation. + pub total_inference_ms: f64, + /// When the model was loaded. + pub loaded_at: Instant, +} +``` + +**Value Objects:** + +```rust +/// Summary information for a model discovered on disk. +pub struct ModelInfo { + pub id: String, + pub filename: String, + pub version: String, + pub description: String, + pub size_bytes: u64, + pub created_at: String, + pub pck_score: Option, + pub has_quantization: bool, + pub lora_profiles: Vec, + pub segment_count: usize, +} + +/// Information about the currently loaded model with runtime stats. +pub struct ActiveModelInfo { + pub model_id: String, + pub filename: String, + pub version: String, + pub description: String, + pub avg_inference_ms: f64, + pub frames_processed: u64, + pub pose_source: String, // "model_inference" + pub lora_profiles: Vec, + pub active_lora_profile: Option, +} + +/// Request to load a model by ID. +pub struct LoadModelRequest { + pub model_id: String, +} + +/// Request to activate a LoRA profile. +pub struct ActivateLoraRequest { + pub model_id: String, + pub profile_name: String, +} +``` + +**Domain Services:** +- `ModelScanService` -- Scans `data/models/` at startup for `.rvf` files, parses each with `RvfReader` to extract manifest metadata +- `ModelLoadService` -- Reads model weights from an RVF container into memory, sets `model_loaded = true` +- `LoraActivationService` -- Switches the active LoRA adapter on a loaded model without full reload + +**Invariants:** +- Only one model can be loaded at a time; loading a new model implicitly unloads the previous one +- A model must be loaded before a LoRA profile can be activated +- The `active_lora_profile` must be one of the model's declared `lora_profiles` +- Model deletion is refused if the model is currently loaded (must unload first) +- `data/models/` directory is created at startup if it does not exist + +--- + +### 3. CSI Recording Context + +**Responsibility:** Capture CSI frames to `.csi.jsonl` files during active recording sessions, manage session lifecycle, and provide download/delete operations on stored recordings. + +``` ++------------------------------------------------------------+ +| CSI Recording Context | ++------------------------------------------------------------+ +| | +| +----------------+ +----------------+ | +| | Start/Stop | | Auto-Stop | | +| | Controller | | Timer | | +| | (REST API) | | (duration_ | | +| | | | secs check) | | +| +-------+--------+ +-------+--------+ | +| | | | +| +----------+----------+ | +| v | +| +-------------------+ | +| | Recording State | | +| | (session_id, | | +| | frame_count, | | +| | file_path) | | +| +--------+----------+ | +| v | +| +-------------------+ | +| | Frame Writer | | +| | (maybe_record_ |---> .csi.jsonl file | +| | frame on each | | +| | tick) | | +| +--------+----------+ | +| v | +| +-------------------+ | +| | Metadata Writer | | +| | (.meta.json on | | +| | stop) | | +| +-------------------+ | +| | ++------------------------------------------------------------+ +``` + +**Aggregates:** + +```rust +/// Aggregate Root: Runtime state for the active CSI recording session. +/// At most one RecordingState can be active at any time. +pub struct RecordingState { + /// Whether a recording is currently active. + pub active: bool, + /// Session ID of the active recording. + pub session_id: String, + /// Session display name. + pub session_name: String, + /// Optional label / activity tag (e.g., "walking", "standing"). + pub label: Option, + /// Path to the JSONL file being written. + pub file_path: PathBuf, + /// Number of frames written so far. + pub frame_count: u64, + /// When the recording started (monotonic clock). + pub start_time: Instant, + /// ISO-8601 start timestamp for metadata. + pub started_at: String, + /// Optional auto-stop duration in seconds. + pub duration_secs: Option, +} +``` + +**Value Objects:** + +```rust +/// Metadata for a completed or active recording session. +pub struct RecordingSession { + pub id: String, + pub name: String, + pub label: Option, + pub started_at: String, + pub ended_at: Option, + pub frame_count: u64, + pub file_size_bytes: u64, + pub file_path: String, +} + +/// A single recorded CSI frame line (JSONL format). +pub struct RecordedFrame { + pub timestamp: f64, + pub subcarriers: Vec, + pub rssi: f64, + pub noise_floor: f64, + pub features: serde_json::Value, +} + +/// Request to start a new recording session. +pub struct StartRecordingRequest { + pub session_name: String, + pub label: Option, + pub duration_secs: Option, +} +``` + +**Domain Services:** +- `RecordingLifecycleService` -- Creates a new `.csi.jsonl` file, generates session ID, manages start/stop transitions +- `FrameWriterService` -- Called on each tick via `maybe_record_frame()`, appends a `RecordedFrame` JSON line to the active file +- `AutoStopService` -- Checks elapsed time against `duration_secs` on each tick; triggers stop when exceeded +- `RecordingScanService` -- Enumerates `data/recordings/` for `.csi.jsonl` files and reads companion `.meta.json` for session metadata + +**Invariants:** +- Only one recording session can be active at a time; starting a new recording while one is active returns HTTP 409 Conflict +- Recording with `duration_secs` set auto-stops after the specified elapsed time +- A `.meta.json` companion file is written when a recording stops, capturing final frame count and duration +- `data/recordings/` directory is created at startup if it does not exist +- Frame writer acquires a read lock on `AppStateInner` per tick; stop acquires a write lock + +--- + +### 4. Training Pipeline Context + +**Responsibility:** Run background training against recorded CSI data, stream epoch-level progress via WebSocket, and export trained models as `.rvf` containers. Supports supervised training, contrastive pretraining (ADR-024), and LoRA fine-tuning. + +``` ++------------------------------------------------------------+ +| Training Pipeline Context | ++------------------------------------------------------------+ +| | +| +----------------+ +----------------+ | +| | Training API | | WebSocket | | +| | (start/stop/ | | Progress | | +| | status) | | Streamer | | +| +-------+--------+ +-------+--------+ | +| | ^ | +| v | | +| +-------------------+ | | +| | Training | | | +| | Orchestrator +--------+ | +| | (tokio::spawn) | broadcast::Sender | +| +--------+----------+ | +| v | +| +-------------------+ | +| | Feature | | +| | Extractor | | +| | (subcarrier var, | | +| | Goertzel power, | | +| | temporal grad) | | +| +--------+----------+ | +| v | +| +-------------------+ | +| | Gradient Descent | | +| | Trainer | | +| | (batch SGD, |---> TrainingProgress | +| | early stopping, | | +| | warmup) | | +| +--------+----------+ | +| v | +| +-------------------+ | +| | RVF Exporter | | +| | (RvfBuilder -> |---> data/models/*.rvf | +| | .rvf container) | | +| +-------------------+ | +| | ++------------------------------------------------------------+ +``` + +**Aggregates:** + +```rust +/// Aggregate Root: Runtime training state stored in AppStateInner. +/// At most one training run can be active at any time. +pub struct TrainingState { + /// Current status snapshot. + pub status: TrainingStatus, + /// Handle to the background training task (for cancellation). + pub task_handle: Option>, +} +``` + +**Value Objects:** + +```rust +/// Current training status (returned by GET /api/v1/train/status). +pub struct TrainingStatus { + pub active: bool, + pub epoch: u32, + pub total_epochs: u32, + pub train_loss: f64, + pub val_pck: f64, // Percentage of Correct Keypoints + pub val_oks: f64, // Object Keypoint Similarity + pub lr: f64, // current learning rate + pub best_pck: f64, + pub best_epoch: u32, + pub patience_remaining: u32, + pub eta_secs: Option, + pub phase: String, // "idle" | "training" | "complete" | "failed" +} + +/// Progress update sent over WebSocket to connected UI clients. +pub struct TrainingProgress { + pub epoch: u32, + pub batch: u32, + pub total_batches: u32, + pub train_loss: f64, + pub val_pck: f64, + pub val_oks: f64, + pub lr: f64, + pub phase: String, +} + +/// Training configuration submitted with a start request. +pub struct TrainingConfig { + pub epochs: u32, // default: 100 + pub batch_size: u32, // default: 8 + pub learning_rate: f64, // default: 0.001 + pub weight_decay: f64, // default: 1e-4 + pub early_stopping_patience: u32, // default: 20 + pub warmup_epochs: u32, // default: 5 + pub pretrained_rvf: Option, + pub lora_profile: Option, +} + +/// Request to start supervised training. +pub struct StartTrainingRequest { + pub dataset_ids: Vec, // recording session IDs + pub config: TrainingConfig, +} + +/// Request to start contrastive pretraining (ADR-024). +pub struct PretrainRequest { + pub dataset_ids: Vec, + pub epochs: u32, // default: 50 + pub lr: f64, // default: 0.001 +} + +/// Request to start LoRA fine-tuning. +pub struct LoraTrainRequest { + pub base_model_id: String, + pub dataset_ids: Vec, + pub profile_name: String, + pub rank: u8, // default: 8 + pub epochs: u32, // default: 30 +} +``` + +**Domain Services:** +- `TrainingOrchestrationService` -- Spawns a background `tokio::task`, loads recorded frames, runs feature extraction, executes gradient descent with early stopping and warmup +- `FeatureExtractionService` -- Computes per-subcarrier sliding-window variance, temporal gradients, Goertzel frequency-domain power across 9 bands, and 3 global scalar features (mean amplitude, std, motion score) +- `ProgressBroadcastService` -- Sends `TrainingProgress` messages through a `broadcast::Sender` channel that WebSocket handlers subscribe to +- `RvfExportService` -- Uses `RvfBuilder` to write the best checkpoint as a `.rvf` container to `data/models/` + +**Invariants:** +- Only one training run can be active at a time; starting training while one is running returns HTTP 409 Conflict +- Training requires at least one recording with a minimum frame count before starting +- Early stopping halts training after `patience` epochs with no improvement in `val_pck` +- Learning rate warmup ramps linearly from 0 to `learning_rate` over `warmup_epochs` +- On completion, the best model (by `val_pck`) is automatically exported as `.rvf` +- Training status phase transitions: `idle` -> `training` -> `complete` | `failed` -> `idle` +- Stopping an active training run aborts the background task via `JoinHandle::abort()` and resets phase to `idle` + +--- + +### 5. Visualization Context + +**Responsibility:** Stream sensing data to web UI clients via WebSocket, render Gaussian splat visualizations, display data source transparency indicators, and manage UI mode (full vs. sensing-only). + +``` ++------------------------------------------------------------+ +| Visualization Context | ++------------------------------------------------------------+ +| | +| +----------------+ +----------------+ | +| | WebSocket | | Sensing | | +| | Hub | | Service (JS) | | +| | (/ws/sensing) | | (client-side | | +| | broadcast:: | | reconnect + | | +| | Receiver | | sim fallback)| | +| +-------+--------+ +-------+--------+ | +| | | | +| +----------+----------+ | +| v | +| +----------------------------------------------+ | +| | UI Components | | +| | | | +| | +----------+ +----------+ +----------+ | | +| | | Sensing | | Live | | Models | | | +| | | Tab | | Demo Tab | | Tab | | | +| | | (splats) | | (pose) | | (manage) | | | +| | +----------+ +----------+ +----------+ | | +| | +----------+ +----------+ | | +| | | Recording| | Training | | | +| | | Tab | | Tab | | | +| | | (capture)| | (train) | | | +| | +----------+ +----------+ | | +| +----------------------------------------------+ | +| | ++------------------------------------------------------------+ +``` + +**Value Objects:** + +```rust +/// Data source indicator shown in the UI (ADR-035). +pub enum DataSourceIndicator { + LiveEsp32, // Green banner: "LIVE - ESP32" + Reconnecting, // Yellow banner: "RECONNECTING..." + Simulated, // Red banner: "SIMULATED DATA" +} + +/// Pose estimation mode badge (ADR-035). +pub enum EstimationMode { + SignalDerived, // Green badge: analytical pose from CSI features + ModelInference, // Blue badge: neural network inference from loaded RVF +} + +/// Render mode for pose visualization (ADR-035). +pub enum RenderMode { + Skeleton, // Green lines connecting joints + red keypoint dots + Keypoints, // Large colored dots with glow and labels + Heatmap, // Gaussian radial blobs per keypoint, faint skeleton overlay + Dense, // Body region segmentation with colored filled polygons +} +``` + +**Domain Services:** +- `WebSocketBroadcastService` -- Subscribes to `broadcast::Sender`, forwards each `SensingUpdate` JSON to all connected WebSocket clients +- `SensingServiceJS` -- Client-side JavaScript that manages WebSocket connection, tracks `dataSource` state, falls back to simulation after 5 failed reconnect attempts (~30s delay) +- `GaussianSplatRenderer` -- Custom GLSL `ShaderMaterial` rendering point-cloud splats on a 20x20 floor grid, colored by signal intensity +- `PoseRenderer` -- Renders skeleton, keypoints, heatmap, or dense body segmentation modes +- `BackendDetector` -- Auto-detects whether the full DensePose backend is available; sets `sensingOnlyMode = true` if unreachable + +**Invariants:** +- WebSocket sensing service is started on application init, not lazily on tab visit (ADR-043 fix) +- Simulation fallback is delayed to 5 failed reconnect attempts (~30 seconds) to avoid premature synthetic data +- `pose_source` field is passed through data conversion so the Estimation Mode badge displays correctly +- Dashboard and Live Demo tabs read `sensingService.dataSource` at load time -- the service must already be connected + +--- + +## Domain Events + +| Event | Published By | Consumed By | Payload | +|-------|-------------|-------------|---------| +| `ServerStarted` | CSI Ingestion | Visualization | `{ http_port, udp_port, source_type }` | +| `CsiFrameIngested` | CSI Ingestion | Recording, Visualization | `{ source, node_id, subcarrier_count, tick }` | +| `SensingUpdateBroadcast` | CSI Ingestion | Visualization (WebSocket) | Full `SensingUpdate` JSON | +| `ModelLoaded` | Model Management | CSI Ingestion (inference path) | `{ model_id, weight_count, version }` | +| `ModelUnloaded` | Model Management | CSI Ingestion | `{ model_id }` | +| `LoraProfileActivated` | Model Management | CSI Ingestion | `{ model_id, profile_name }` | +| `RecordingStarted` | Recording | Visualization | `{ session_id, session_name, file_path }` | +| `RecordingStopped` | Recording | Visualization | `{ session_id, frame_count, duration_secs }` | +| `TrainingStarted` | Training Pipeline | Visualization | `{ run_id, config, recording_ids }` | +| `TrainingEpochComplete` | Training Pipeline | Visualization (WebSocket) | `{ epoch, total_epochs, train_loss, val_pck, lr }` | +| `TrainingComplete` | Training Pipeline | Model Management, Visualization | `{ run_id, final_pck, model_path }` | +| `TrainingFailed` | Training Pipeline | Visualization | `{ run_id, error_message }` | +| `WebSocketClientConnected` | Visualization | -- | `{ endpoint, client_addr }` | +| `WebSocketClientDisconnected` | Visualization | -- | `{ endpoint, client_addr }` | + +In the current implementation, events are realized through two mechanisms: +1. **`broadcast::Sender`** for WebSocket fan-out of sensing updates +2. **`broadcast::Sender`** for training progress streaming +3. **State mutations via RwLock** where other contexts read state changes on their next tick + +--- + +## Context Map + +``` ++-------------------+ +---------------------+ +| CSI Ingestion |--------->| Visualization | +| (produces | publish | (WebSocket | +| SensingUpdate) | -------> | consumers) | ++--------+----------+ +----------+----------+ + | | + | maybe_record_frame() | reads dataSource + v | ++-------------------+ | +| CSI Recording | | +| (hooks into | | +| tick loop) | | ++--------+----------+ | + | | + | provides dataset_ids | + v | ++-------------------+ +----------+----------+ +| Training Pipeline |--------->| Model Management | +| (reads .jsonl, | exports | (loads .rvf for | +| trains model) | .rvf --> | inference) | ++-------------------+ +----------+----------+ + | + | model weights + v + +----------+----------+ + | CSI Ingestion | + | (inference path | + | uses loaded model)| + +----------------------+ +``` + +**Relationships:** + +| Upstream | Downstream | Relationship | Mechanism | +|----------|-----------|--------------|-----------| +| CSI Ingestion | Visualization | Published Language | `broadcast::Sender` with `SensingUpdate` JSON schema | +| CSI Ingestion | CSI Recording | Shared Kernel | `maybe_record_frame()` called from the ingestion tick loop | +| CSI Recording | Training Pipeline | Conformist | Training reads `.csi.jsonl` files produced by recording; no negotiation on format | +| Training Pipeline | Model Management | Supplier-Consumer | Training exports `.rvf` to `data/models/`; Model Management scans and loads | +| Model Management | CSI Ingestion | Shared Kernel | Loaded weights stored in `AppStateInner`; ingestion reads them for inference | +| Training Pipeline | Visualization | Published Language | `broadcast::Sender` with progress JSON schema | + +--- + +## Anti-Corruption Layers + +### ESP32 Binary Protocol ACL + +The ESP32 sends CSI frames using a compact binary protocol (ADR-018): 20-byte header with magic `0xC5100001`, followed by amplitude and phase arrays. The `Esp32Frame` parser in the ingestion context decodes this binary format into domain value objects (`NodeInfo`, amplitude/phase vectors) before any downstream processing. No other context handles raw UDP bytes. + +### RVF Container ACL + +The `.rvf` container format encapsulates model weights, manifest metadata, vital sign configuration, and optional LoRA adapters. The `RvfReader` and `RvfBuilder` types in the `rvf_container` module provide the anti-corruption layer between the on-disk binary format and the domain types (`ModelInfo`, `LoadedModelState`). The training pipeline writes through `RvfBuilder`; the model management context reads through `RvfReader`. + +### Sensing-Only Mode ACL (Client-Side) + +When the DensePose backend (port 8000) is unreachable, the client-side `BackendDetector` sets `sensingOnlyMode = true`. The `ApiService.request()` method short-circuits all requests to the DensePose backend, returning empty responses instead of `ERR_CONNECTION_REFUSED`. This prevents DensePose-specific concerns from leaking into the sensing UI. + +### JSONL Recording Format ACL + +CSI frames are recorded as newline-delimited JSON (`.csi.jsonl`). The `RecordedFrame` struct defines the schema: `{timestamp, subcarriers, rssi, noise_floor, features}`. The training pipeline reads through this schema, extracting subcarrier arrays for feature computation. If the internal sensing representation changes, only the `maybe_record_frame()` serializer needs updating -- the training pipeline depends only on the `RecordedFrame` contract. + +--- + +## REST API Surface + +All endpoints share `AppStateInner` via `Arc>`. + +### CSI Ingestion & Sensing + +| Method | Path | Context | Description | +|--------|------|---------|-------------| +| GET | `/api/v1/sensing/latest` | Ingestion | Latest sensing update | +| WS | `/ws/sensing` | Visualization | Streaming sensing updates | + +### Model Management + +| Method | Path | Context | Description | +|--------|------|---------|-------------| +| GET | `/api/v1/models` | Model Mgmt | List all discovered `.rvf` models | +| GET | `/api/v1/models/:id` | Model Mgmt | Detailed info for a specific model | +| GET | `/api/v1/models/active` | Model Mgmt | Active model with runtime stats | +| POST | `/api/v1/models/load` | Model Mgmt | Load model weights into memory | +| POST | `/api/v1/models/unload` | Model Mgmt | Unload the active model | +| DELETE | `/api/v1/models/:id` | Model Mgmt | Delete a model file from disk | +| GET | `/api/v1/models/lora/profiles` | Model Mgmt | List LoRA profiles for active model | +| POST | `/api/v1/models/lora/activate` | Model Mgmt | Activate a LoRA adapter | + +### CSI Recording + +| Method | Path | Context | Description | +|--------|------|---------|-------------| +| POST | `/api/v1/recording/start` | Recording | Start a new recording session | +| POST | `/api/v1/recording/stop` | Recording | Stop the active recording | +| GET | `/api/v1/recording/list` | Recording | List all recording sessions | +| GET | `/api/v1/recording/download/:id` | Recording | Download a `.csi.jsonl` file | +| DELETE | `/api/v1/recording/:id` | Recording | Delete a recording | + +### Training Pipeline + +| Method | Path | Context | Description | +|--------|------|---------|-------------| +| POST | `/api/v1/train/start` | Training | Start supervised training | +| POST | `/api/v1/train/stop` | Training | Stop the active training run | +| GET | `/api/v1/train/status` | Training | Current training phase and metrics | +| POST | `/api/v1/train/pretrain` | Training | Start contrastive pretraining | +| POST | `/api/v1/train/lora` | Training | Start LoRA fine-tuning | +| WS | `/ws/train/progress` | Training | Streaming training progress | + +--- + +## File Layout + +``` +data/ ++-- models/ # RVF model files +| +-- wifi-densepose-v1.rvf # Trained model container +| +-- wifi-densepose-field-v2.rvf # Environment-calibrated model ++-- recordings/ # CSI recording sessions + +-- walking-20260303_140000.csi.jsonl # Raw CSI frames (JSONL) + +-- walking-20260303_140000.csi.meta.json # Session metadata + +-- standing-20260303_141500.csi.jsonl + +-- standing-20260303_141500.csi.meta.json + +crates/wifi-densepose-sensing-server/ ++-- src/ + +-- main.rs # Server entry, CLI args, AppStateInner, sensing loop + +-- model_manager.rs # Model Management bounded context + +-- recording.rs # CSI Recording bounded context + +-- training_api.rs # Training Pipeline bounded context + +-- rvf_container.rs # RVF format ACL (RvfReader, RvfBuilder) + +-- rvf_pipeline.rs # Progressive loader for model inference + +-- vital_signs.rs # Vital sign detection from CSI phase + +-- dataset.rs # Dataset loading for training + +-- trainer.rs # Core training loop implementation + +-- embedding.rs # Contrastive embedding extraction + +-- graph_transformer.rs # Graph transformer architecture + +-- sona.rs # SONA self-optimizing profile + +-- sparse_inference.rs # Sparse inference engine + +-- lib.rs # Public module re-exports +``` + +--- + +## Related + +- [ADR-019: Sensing-Only UI Mode](../adr/ADR-019-sensing-only-ui-mode.md) -- Decoupled sensing UI, Gaussian splats, Python WebSocket bridge +- [ADR-035: Live Sensing UI Accuracy](../adr/ADR-035-live-sensing-ui-accuracy.md) -- Data transparency, Goertzel breathing estimation, signal-responsive pose +- [ADR-043: Sensing Server UI API Completion](../adr/ADR-043-sensing-server-ui-api-completion.md) -- Model, recording, training endpoints; single-binary deployment +- [RuvSense Domain Model](ruvsense-domain-model.md) -- Upstream signal processing domain (multistatic sensing, coherence, tracking) +- [WiFi-Mat Domain Model](wifi-mat-domain-model.md) -- Downstream disaster response domain diff --git a/docs/ddd/signal-processing-domain-model.md b/docs/ddd/signal-processing-domain-model.md new file mode 100644 index 00000000..319d27c4 --- /dev/null +++ b/docs/ddd/signal-processing-domain-model.md @@ -0,0 +1,663 @@ +# Signal Processing Domain Model + +## Domain-Driven Design Specification + +Based on ADR-014 (SOTA Signal Processing) and the `wifi-densepose-signal` crate. + +### Ubiquitous Language + +| Term | Definition | +|------|------------| +| **CsiFrame** | A single CSI measurement: amplitude + phase per antenna per subcarrier at one timestamp | +| **Conjugate Multiplication** | `H_ref[k] * conj(H_target[k])` — cancels CFO/SFO/PDD, isolating environment-induced phase | +| **CSI Ratio** | The complex result of conjugate multiplication between two antenna streams | +| **Hampel Filter** | Running median +/- scaled MAD outlier detector; resists up to 50% contamination | +| **Phase Sanitization** | Pipeline of unwrapping, outlier removal, smoothing, and noise filtering on raw CSI phase | +| **Spectrogram** | 2D time-frequency matrix from STFT, standard CNN input for WiFi activity recognition | +| **Subcarrier Sensitivity** | Variance ratio (motion var / static var) ranking how responsive a subcarrier is to motion | +| **Body Velocity Profile (BVP)** | Doppler-derived velocity x time 2D matrix; domain-independent motion representation | +| **Fresnel Zone** | Ellipsoidal region between TX and RX where signal reflection/diffraction occurs | +| **Breathing Estimate** | BPM + amplitude + confidence derived from Fresnel zone boundary crossings | +| **Motion Score** | Composite (0.0-1.0) from variance, correlation, phase, and optional Doppler components | +| **Presence State** | Binary detection result: human present/absent with smoothed confidence | +| **Calibration** | Recording baseline variance during a known-empty period for adaptive detection | + +--- + +## Bounded Contexts + +### 1. CSI Preprocessing Context + +**Responsibility**: Produce clean, hardware-artifact-free CSI data from raw measurements. + +``` ++-----------------------------------------------------------+ +| CSI Preprocessing Context | ++-----------------------------------------------------------+ +| | +| +--------------+ +--------------+ +------------+ | +| | Conjugate | | Hampel | | Phase | | +| | Multiplication| | Filter | | Sanitizer | | +| +------+-------+ +------+-------+ +-----+------+ | +| | | | | +| v v v | +| +------+-------+ +------+-------+ +-----+------+ | +| | CsiRatio | | HampelResult | | Sanitized | | +| | (clean phase)| |(outlier-free)| | Phase | | +| +--------------+ +--------------+ +------------+ | +| | | | | +| +-------------------+------------------+ | +| | | +| v | +| +-------+--------+ | +| | CsiProcessor |--> CleanedCsiData | +| +----------------+ | +| | ++-----------------------------------------------------------+ +``` + +**Aggregates**: `CsiProcessor` (Aggregate Root) + +**Value Objects**: `CsiData`, `CsiRatio`, `HampelResult`, `HampelConfig`, `PhaseSanitizerConfig` + +**Domain Services**: `CsiPreprocessor`, `PhaseSanitizer` + +--- + +### 2. Feature Extraction Context + +**Responsibility**: Transform clean CSI data into ML-ready feature representations. + +``` ++-----------------------------------------------------------+ +| Feature Extraction Context | ++-----------------------------------------------------------+ +| | +| +--------------+ +--------------+ +------------+ | +| | STFT | | Subcarrier | | Doppler | | +| | Spectrogram | | Selection | | BVP Engine | | +| +------+-------+ +------+-------+ +-----+------+ | +| | | | | +| v v v | +| +------+-------+ +------+-------+ +-----+------+ | +| | Spectrogram | | Subcarrier | | BodyVel | | +| | (2D TF) | | Selection | | Profile | | +| +--------------+ +--------------+ +------------+ | +| | | | | +| +-------------------+------------------+ | +| | | +| v | +| +----------+----------+ | +| | FeatureExtractor |--> CsiFeatures | +| +---------------------+ | +| | ++-----------------------------------------------------------+ +``` + +**Aggregates**: `FeatureExtractor` (Aggregate Root) + +**Value Objects**: `Spectrogram`, `SubcarrierSelection`, `BodyVelocityProfile`, `CsiFeatures` + +**Domain Services**: `SpectrogramConfig`, `SubcarrierSelectionConfig`, `BvpConfig` + +--- + +### 3. Motion Analysis Context + +**Responsibility**: Detect and classify human motion and vital signs from CSI features. + +``` ++-----------------------------------------------------------+ +| Motion Analysis Context | ++-----------------------------------------------------------+ +| | +| +--------------+ +--------------+ | +| | Motion | | Fresnel | | +| | Detector | | Breathing | | +| +------+-------+ +------+-------+ | +| | | | +| v v | +| +------+-------+ +------+-------+ | +| | MotionScore | | Breathing | | +| |+ Detection | | Estimate | | +| +--------------+ +--------------+ | +| | | | +| +-------------------+ | +| | | +| v | +| +--------+--------+ | +| | HumanDetection |--> PresenceState | +| | Result | | +| +-----------------+ | +| | ++-----------------------------------------------------------+ +``` + +**Aggregates**: `MotionDetector` (Aggregate Root) + +**Value Objects**: `MotionScore`, `MotionAnalysis`, `HumanDetectionResult`, `BreathingEstimate`, `FresnelGeometry` + +**Domain Services**: `FresnelBreathingEstimator` + +--- + +## Aggregates + +### CsiProcessor (CSI Preprocessing Root) + +```rust +pub struct CsiProcessor { + config: CsiProcessorConfig, + preprocessor: CsiPreprocessor, + history: VecDeque, + previous_detection_confidence: f64, + statistics: ProcessingStatistics, +} + +impl CsiProcessor { + /// Create with validated configuration + pub fn new(config: CsiProcessorConfig) -> Result; + + /// Full preprocessing pipeline: noise removal -> windowing -> normalization + pub fn preprocess(&self, csi_data: &CsiData) -> Result; + + /// Maintain temporal history for downstream feature extraction + pub fn add_to_history(&mut self, csi_data: CsiData); + + /// Apply exponential moving average to detection confidence + pub fn apply_temporal_smoothing(&mut self, raw_confidence: f64) -> f64; +} +``` + +### FeatureExtractor (Feature Extraction Root) + +```rust +pub struct FeatureExtractor { + config: FeatureExtractorConfig, +} + +impl FeatureExtractor { + /// Extract all feature types from a single CsiData snapshot + pub fn extract(&self, csi_data: &CsiData) -> CsiFeatures; +} +``` + +### MotionDetector (Motion Analysis Root) + +```rust +pub struct MotionDetector { + config: MotionDetectorConfig, + previous_confidence: f64, + motion_history: VecDeque, + baseline_variance: Option, +} + +impl MotionDetector { + /// Analyze motion from extracted features + pub fn analyze_motion(&self, features: &CsiFeatures) -> MotionAnalysis; + + /// Full detection pipeline: analyze -> score -> smooth -> threshold + pub fn detect_human(&mut self, features: &CsiFeatures) -> HumanDetectionResult; + + /// Record baseline variance for adaptive detection + pub fn calibrate(&mut self, features: &CsiFeatures); +} +``` + +--- + +## Value Objects + +### CsiData + +```rust +pub struct CsiData { + pub timestamp: DateTime, + pub amplitude: Array2, // (num_antennas x num_subcarriers) + pub phase: Array2, // (num_antennas x num_subcarriers), radians + pub frequency: f64, // center frequency in Hz + pub bandwidth: f64, // bandwidth in Hz + pub num_subcarriers: usize, + pub num_antennas: usize, + pub snr: f64, // signal-to-noise ratio in dB + pub metadata: CsiMetadata, +} +``` + +### Spectrogram + +```rust +pub struct Spectrogram { + pub data: Array2, // (n_freq x n_time) power/magnitude + pub n_freq: usize, // frequency bins (window_size/2 + 1) + pub n_time: usize, // time frames + pub freq_resolution: f64, // Hz per bin + pub time_resolution: f64, // seconds per frame +} +``` + +### SubcarrierSelection + +```rust +pub struct SubcarrierSelection { + pub selected_indices: Vec, // ranked by sensitivity, descending + pub sensitivity_scores: Vec, // variance ratio for ALL subcarriers + pub selected_data: Option>, // filtered matrix (optional) +} +``` + +### BodyVelocityProfile + +```rust +pub struct BodyVelocityProfile { + pub data: Array2, // (n_velocity_bins x n_time_frames) + pub velocity_bins: Vec, // velocity value for each row (m/s) + pub n_time: usize, + pub time_resolution: f64, // seconds per frame + pub velocity_resolution: f64, // m/s per bin +} +``` + +### BreathingEstimate + +```rust +pub struct BreathingEstimate { + pub rate_bpm: f64, // breaths per minute + pub confidence: f64, // combined confidence (0.0-1.0) + pub period_seconds: f64, // estimated breathing period + pub autocorrelation_peak: f64, // periodicity quality + pub fresnel_confidence: f64, // Fresnel model match + pub amplitude_variation: f64, // observed amplitude variation +} +``` + +### MotionScore + +```rust +pub struct MotionScore { + pub total: f64, // weighted composite (0.0-1.0) + pub variance_component: f64, + pub correlation_component: f64, + pub phase_component: f64, + pub doppler_component: Option, +} +``` + +### HampelResult + +```rust +pub struct HampelResult { + pub filtered: Vec, // outliers replaced with local median + pub outlier_indices: Vec, + pub medians: Vec, // local median at each sample + pub sigma_estimates: Vec, // estimated local sigma at each sample +} +``` + +### FresnelGeometry + +```rust +pub struct FresnelGeometry { + pub d_tx_body: f64, // TX to body distance (meters) + pub d_body_rx: f64, // body to RX distance (meters) + pub frequency: f64, // carrier frequency (Hz) +} + +impl FresnelGeometry { + pub fn wavelength(&self) -> f64; + pub fn fresnel_radius(&self, n: u32) -> f64; + pub fn phase_change(&self, displacement_m: f64) -> f64; + pub fn expected_amplitude_variation(&self, displacement_m: f64) -> f64; +} +``` + +--- + +## Domain Events + +### Preprocessing Events + +```rust +pub enum PreprocessingEvent { + /// Raw CSI frame cleaned through the full pipeline + FrameCleaned { + timestamp: DateTime, + num_antennas: usize, + num_subcarriers: usize, + noise_filtered: bool, + windowed: bool, + normalized: bool, + }, + + /// Outliers detected and replaced by Hampel filter + OutliersDetected { + subcarrier_indices: Vec, + replacement_values: Vec, + contamination_ratio: f64, + }, + + /// Phase sanitization completed + PhaseSanitized { + method: UnwrappingMethod, + outliers_removed: usize, + smoothing_applied: bool, + }, +} +``` + +### Feature Extraction Events + +```rust +pub enum FeatureExtractionEvent { + /// Spectrogram computed from temporal CSI stream + SpectrogramGenerated { + n_time: usize, + n_freq: usize, + window_size: usize, + window_fn: WindowFunction, + }, + + /// Top-K sensitive subcarriers selected + SubcarriersSelected { + top_k_indices: Vec, + sensitivity_scores: Vec, + min_sensitivity_threshold: f64, + }, + + /// Body Velocity Profile extracted + BvpExtracted { + n_velocity_bins: usize, + n_time_frames: usize, + max_velocity: f64, + carrier_frequency: f64, + }, +} +``` + +### Motion Analysis Events + +```rust +pub enum MotionAnalysisEvent { + /// Human motion detected above threshold + MotionDetected { + score: MotionScore, + confidence: f64, + threshold: f64, + timestamp: DateTime, + }, + + /// Breathing detected via Fresnel zone model + BreathingDetected { + rate_bpm: f64, + amplitude_variation: f64, + fresnel_confidence: f64, + autocorrelation_peak: f64, + }, + + /// Presence state changed (entered or left) + PresenceChanged { + previous: bool, + current: bool, + smoothed_confidence: f64, + timestamp: DateTime, + }, + + /// Detector calibrated with baseline variance + BaselineCalibrated { + baseline_variance: f64, + timestamp: DateTime, + }, +} +``` + +--- + +## Invariants + +### CSI Preprocessing Invariants + +1. **Conjugate multiplication requires >= 2 antenna elements.** `compute_ratio_matrix` returns `CsiRatioError::InsufficientAntennas` if `n_ant < 2`. Without two antennas, there is no pair to cancel common-mode offsets. + +2. **Hampel filter window must be >= 1 (half_window > 0).** A zero-width window cannot compute a local median. Enforced by `HampelError::InvalidWindow`. + +3. **Phase data must be within configured range before sanitization.** Default range is `[-pi, pi]`. Enforced by `PhaseSanitizer::validate_phase_data`. + +4. **Antenna stream lengths must match for conjugate multiplication.** `conjugate_multiply` returns `CsiRatioError::LengthMismatch` if `h_ref.len() != h_target.len()`. + +### Feature Extraction Invariants + +5. **Spectrogram window size must be > 0 and signal must be >= window_size samples.** Enforced by `SpectrogramError::SignalTooShort` and `SpectrogramError::InvalidWindowSize`. + +6. **Subcarrier selection must receive matching subcarrier counts.** Motion and static data must have the same number of columns. Enforced by `SelectionError::SubcarrierCountMismatch`. + +7. **BVP requires >= window_size temporal samples.** Insufficient history prevents STFT computation. Enforced by `BvpError::InsufficientSamples`. + +8. **BVP carrier frequency must be > 0 for wavelength calculation.** Zero frequency would produce a division-by-zero in the Doppler-to-velocity mapping. + +### Motion Analysis Invariants + +9. **Fresnel geometry requires positive distances (d_tx_body > 0, d_body_rx > 0).** Zero or negative distances are physically impossible. Enforced by `FresnelError::InvalidDistance`. + +10. **Fresnel frequency must be positive.** Required for wavelength computation. Enforced by `FresnelError::InvalidFrequency`. + +11. **Breathing estimation requires >= 10 amplitude samples.** Fewer samples cannot support autocorrelation analysis. Enforced by `FresnelError::InsufficientData`. + +12. **Motion detector history does not exceed configured max size.** Oldest entries are evicted via `VecDeque::pop_front` when capacity is reached. + +--- + +## Domain Services + +### CsiPreprocessor + +Orchestrates the cleaning pipeline for a single CSI frame. + +```rust +pub struct CsiPreprocessor { + noise_threshold: f64, +} + +impl CsiPreprocessor { + /// Remove subcarriers below noise floor (amplitude in dB < threshold) + pub fn remove_noise(&self, csi_data: &CsiData) -> Result; + + /// Apply Hamming window to reduce spectral leakage + pub fn apply_windowing(&self, csi_data: &CsiData) -> Result; + + /// Normalize amplitude to unit variance + pub fn normalize_amplitude(&self, csi_data: &CsiData) -> Result; +} +``` + +### PhaseSanitizer + +Full phase cleaning pipeline: unwrap -> outlier removal -> smoothing -> noise filtering. + +```rust +pub struct PhaseSanitizer { + config: PhaseSanitizerConfig, + statistics: SanitizationStatistics, +} + +impl PhaseSanitizer { + /// Complete sanitization pipeline (all four stages) + pub fn sanitize_phase( + &mut self, + phase_data: &Array2, + ) -> Result, PhaseSanitizationError>; +} +``` + +### FresnelBreathingEstimator + +Physics-based breathing detection using Fresnel zone geometry. + +```rust +pub struct FresnelBreathingEstimator { + geometry: FresnelGeometry, + min_displacement: f64, // 3mm default + max_displacement: f64, // 15mm default +} + +impl FresnelBreathingEstimator { + /// Check if amplitude variation matches Fresnel breathing model + pub fn breathing_confidence(&self, observed_amplitude_variation: f64) -> f64; + + /// Estimate breathing rate via autocorrelation + Fresnel validation + pub fn estimate_breathing_rate( + &self, + amplitude_signal: &[f64], + sample_rate: f64, + ) -> Result; +} +``` + +--- + +## Context Map + +``` ++--------------------------------------------------------------+ +| Signal Processing System | ++--------------------------------------------------------------+ +| | +| +----------------+ Published +------------------+ | +| | CSI | Language | Feature | | +| | Preprocessing |------------>| Extraction | | +| | Context | CsiData | Context | | +| +-------+--------+ +--------+---------+ | +| | | | +| | Publishes | Publishes | +| | CleanedCsiData | CsiFeatures | +| v v | +| +-------+-------------------------------+---------+ | +| | Event Bus (Domain Events) | | +| +---------------------------+---------------------+ | +| | | +| | Subscribes | +| v | +| +---------+---------+ | +| | Motion | | +| | Analysis | | +| | Context | | +| +-------------------+ | +| | ++---------------------------------------------------------------+ +| DOWNSTREAM (Customer/Supplier) | +| +-----------------+ +------------------+ +--------------+ | +| | wifi-densepose | | wifi-densepose | |wifi-densepose| | +| | -nn | | -mat | | -train | | +| | (consumes | | (consumes | |(consumes | | +| | CsiFeatures, | | BreathingEst, | | CsiFeatures) | | +| | Spectrogram) | | MotionScore) | | | | +| +-----------------+ +------------------+ +--------------+ | ++---------------------------------------------------------------+ +| UPSTREAM (Conformist) | +| +-----------------+ +------------------+ | +| | wifi-densepose | | wifi-densepose | | +| | -core | | -hardware | | +| | (CsiFrame | | (ESP32 raw CSI | | +| | primitives) | | data ingestion) | | +| +-----------------+ +------------------+ | ++---------------------------------------------------------------+ +``` + +**Relationship Types**: +- Preprocessing -> Feature Extraction: **Published Language** (CsiData is the shared contract) +- Preprocessing -> Motion Analysis: **Customer/Supplier** (Preprocessing supplies cleaned data) +- Feature Extraction -> Motion Analysis: **Customer/Supplier** (Features supplies CsiFeatures) +- Signal -> wifi-densepose-nn: **Customer/Supplier** (Signal publishes Spectrogram, BVP) +- Signal -> wifi-densepose-mat: **Customer/Supplier** (Signal publishes BreathingEstimate, MotionScore) +- Signal <- wifi-densepose-core: **Conformist** (Signal adapts to core CsiFrame types) +- Signal <- wifi-densepose-hardware: **Conformist** (Signal adapts to raw ESP32 CSI format) + +--- + +## Anti-Corruption Layers + +### Hardware ACL (Upstream) + +Translates raw ESP32 CSI packets into the signal crate's `CsiData` value object, normalizing hardware-specific quirks (LLTF/HT-LTF format differences, antenna mapping, null subcarrier handling). + +```rust +/// Normalizes vendor-specific CSI frames to canonical CsiData +pub struct HardwareNormalizer { + hardware_type: HardwareType, +} + +impl HardwareNormalizer { + /// Convert raw hardware bytes to canonical CsiData + pub fn normalize( + &self, + raw_csi: &[u8], + hardware_type: HardwareType, + ) -> Result; +} + +pub enum HardwareType { + Esp32S3, + Intel5300, + AtherosAr9580, + Simulation, +} +``` + +### Neural Network ACL (Downstream) + +Adapts signal processing outputs (Spectrogram, BVP, CsiFeatures) into tensor formats expected by the `wifi-densepose-nn` crate. This boundary prevents neural network model details from leaking into the signal processing domain. + +```rust +/// Adapts signal crate types to neural network tensor format +pub struct SignalToTensorAdapter; + +impl SignalToTensorAdapter { + /// Convert Spectrogram to CNN-ready 2D tensor + pub fn spectrogram_to_tensor(spec: &Spectrogram) -> Array2 { + spec.data.mapv(|v| v as f32) + } + + /// Convert BVP to domain-independent velocity tensor + pub fn bvp_to_tensor(bvp: &BodyVelocityProfile) -> Array2 { + bvp.data.mapv(|v| v as f32) + } + + /// Convert selected subcarrier data to reduced-dimension input + pub fn selected_csi_to_tensor( + selection: &SubcarrierSelection, + data: &Array2, + ) -> Result, SelectionError> { + let extracted = extract_selected(data, selection)?; + Ok(extracted.mapv(|v| v as f32)) + } +} +``` + +### MAT ACL (Downstream) + +Adapts motion analysis outputs for the Mass Casualty Assessment Tool, translating domain-generic motion scores and breathing estimates into disaster-context vital signs. + +```rust +/// Adapts signal processing outputs for disaster assessment +pub struct SignalToMatAdapter; + +impl SignalToMatAdapter { + /// Convert BreathingEstimate to MAT-domain BreathingPattern + pub fn to_breathing_pattern(est: &BreathingEstimate) -> BreathingPattern { + BreathingPattern { + rate_bpm: est.rate_bpm as f32, + amplitude: est.amplitude_variation as f32, + regularity: est.autocorrelation_peak as f32, + pattern_type: classify_breathing_type(est.rate_bpm), + } + } + + /// Convert MotionScore to MAT-domain presence indicator + pub fn to_presence_indicator(score: &MotionScore) -> PresenceIndicator { + PresenceIndicator { + detected: score.total > 0.3, + confidence: score.total, + motion_level: classify_motion_level(score), + } + } +} +``` diff --git a/docs/ddd/training-pipeline-domain-model.md b/docs/ddd/training-pipeline-domain-model.md new file mode 100644 index 00000000..57a4aef4 --- /dev/null +++ b/docs/ddd/training-pipeline-domain-model.md @@ -0,0 +1,1121 @@ +# Training Pipeline Domain Model + +The Training & ML Pipeline is the subsystem of WiFi-DensePose that turns raw public CSI datasets into a trained pose estimation model and its downstream derivatives: contrastive embeddings, domain-generalized weights, and deterministic proof bundles. It is the bridge between research data and deployable inference. + +This document defines the system using [Domain-Driven Design](https://martinfowler.com/bliki/DomainDrivenDesign.html) (DDD): bounded contexts that own their data and rules, aggregate roots that enforce invariants, value objects that carry meaning, and domain events that connect everything. The goal is to make the pipeline's structure match the physics and mathematics it implements -- so that anyone reading the code (or an AI agent modifying it) understands *why* each piece exists, not just *what* it does. + +**Bounded Contexts:** + +| # | Context | Responsibility | Key ADRs | Code | +|---|---------|----------------|----------|------| +| 1 | [Dataset Management](#1-dataset-management-context) | Load, validate, normalize, and preprocess training data from MM-Fi and Wi-Pose | [ADR-015](../adr/ADR-015-public-dataset-training-strategy.md) | `train/src/dataset.rs`, `train/src/subcarrier.rs` | +| 2 | [Model Architecture](#2-model-architecture-context) | Define the neural network, forward pass, attention mechanisms, and spatial decoding | [ADR-016](../adr/ADR-016-ruvector-integration.md), [ADR-020](../adr/ADR-020-rust-ruvector-ai-model-migration.md) | `train/src/model.rs`, `train/src/graph_transformer.rs` | +| 3 | [Training Orchestration](#3-training-orchestration-context) | Run the training loop, compute composite loss, checkpoint, and verify deterministic proofs | [ADR-015](../adr/ADR-015-public-dataset-training-strategy.md), [ADR-016](../adr/ADR-016-ruvector-integration.md) | `train/src/trainer.rs`, `train/src/losses.rs`, `train/src/metrics.rs`, `train/src/proof.rs` | +| 4 | [Embedding & Transfer](#4-embedding--transfer-context) | Produce AETHER contrastive embeddings, MERIDIAN domain-generalized features, and LoRA adapters | [ADR-024](../adr/ADR-024-contrastive-csi-embedding-model.md), [ADR-027](../adr/ADR-027-cross-environment-domain-generalization.md) | `train/src/embedding.rs`, `train/src/domain.rs`, `train/src/sona.rs` | + +All code paths shown are relative to `rust-port/wifi-densepose-rs/crates/wifi-densepose-` unless otherwise noted. + +--- + +## Domain-Driven Design Specification + +### Ubiquitous Language + +| Term | Definition | +|------|------------| +| **Training Run** | A complete training session: configuration, epoch loop, checkpoint history, and final model weights | +| **Epoch** | One full pass through the training dataset; produces train loss and validation metrics | +| **Checkpoint** | A snapshot of model weights at a given epoch, identified by SHA-256 hash and validation PCK | +| **CSI Sample** | A single observation: amplitude + phase tensors, ground-truth keypoints, and visibility flags | +| **Subcarrier Interpolation** | Resampling CSI from source subcarrier count to the canonical 56 (114->56 for MM-Fi, 30->56 for Wi-Pose) | +| **Teacher-Student** | Training regime where a camera-based RGB model generates pseudo-labels; at inference the camera is removed | +| **Pseudo-Label** | DensePose UV surface coordinates generated by Detectron2 from paired RGB frames | +| **PCK@0.2** | Percentage of Correct Keypoints within 20% of torso diameter; primary accuracy metric | +| **OKS** | Object Keypoint Similarity; per-keypoint Gaussian-weighted distance used in COCO evaluation | +| **MPJPE** | Mean Per Joint Position Error in millimeters; 3D accuracy metric | +| **Hungarian Assignment** | Bipartite matching of predicted persons to ground-truth using min-cost assignment | +| **Dynamic Min-Cut** | Subpolynomial O(n^1.5 log n) person-to-GT assignment maintained across frames | +| **Compressed CSI Buffer** | Tiered-quantization temporal window: hot frames at 8-bit, warm at 5/7-bit, cold at 3-bit | +| **Proof Verification** | Deterministic check: fixed seed -> N training steps -> loss decreases AND SHA-256 hash matches | +| **AETHER Embedding** | 128-dim L2-normalized contrastive vector from the CsiToPoseTransformer backbone | +| **InfoNCE Loss** | Contrastive loss that pushes same-identity embeddings together and different-identity apart | +| **HNSW Index** | Hierarchical Navigable Small World graph for approximate nearest-neighbor embedding search | +| **Domain Factorizer** | Splits latent features into pose-invariant (h_pose) and environment-specific (h_env) components | +| **Gradient Reversal Layer** | Identity in forward pass; multiplies gradient by -lambda in backward pass to force domain invariance | +| **GRL Lambda** | Adversarial weight annealed from 0.0 to 1.0 over the first 20 epochs | +| **FiLM Conditioning** | Feature-wise Linear Modulation: gamma * features + beta, conditioned on geometry encoding | +| **Hardware Normalizer** | Resamples CSI from any chipset to canonical 56 subcarriers with z-score amplitude normalization | +| **LoRA Adapter** | Low-Rank Adaptation weights (rank r, alpha) for few-shot environment-specific fine-tuning | +| **Rapid Adaptation** | 10-second unlabeled calibration producing a per-room LoRA adapter via contrastive test-time training | + +--- + +## Bounded Contexts + +### 1. Dataset Management Context + +**Responsibility:** Load raw CSI data from public datasets (MM-Fi, Wi-Pose), validate structural invariants, resample subcarriers to the canonical 56, apply phase sanitization, and present typed samples to the training loop. Memory efficiency via tiered temporal compression. + +``` ++----------------------------------------------------------+ +| Dataset Management Context | ++----------------------------------------------------------+ +| | +| +---------------+ +---------------+ | +| | MM-Fi Loader | | Wi-Pose | | +| | (.npy files, | | Loader | | +| | 114 sub, | | (.mat files, | | +| | 40 subjects)| | 30 sub, | | +| +-------+-------+ | 12 subjects)| | +| | +-------+-------+ | +| | | | +| +--------+-----------+ | +| v | +| +----------------+ | +| | Subcarrier | | +| | Interpolator | | +| | (114->56 or | | +| | 30->56) | | +| +--------+-------+ | +| v | +| +----------------+ | +| | Phase | | +| | Sanitizer | | +| | (SOTA algs | | +| | from signal) | | +| +--------+-------+ | +| v | +| +----------------+ | +| | Compressed CSI |--> CsiSample | +| | Buffer | | +| | (tiered quant) | | +| +----------------+ | +| | ++----------------------------------------------------------+ +``` + +**Aggregates:** +- `MmFiDataset` (Aggregate Root) -- Manages the MM-Fi data lifecycle +- `WiPoseDataset` (Aggregate Root) -- Manages the Wi-Pose data lifecycle + +**Value Objects:** +- `CsiSample` -- Single observation with amplitude, phase, keypoints, visibility +- `SubcarrierConfig` -- Source count, target count, interpolation method +- `DatasetSplit` -- Train / Validation / Test subject partitioning +- `CompressedCsiBuffer` -- Tiered temporal window backed by `TemporalTensorCompressor` + +**Domain Services:** +- `SubcarrierInterpolationService` -- Resamples subcarriers via sparse least-squares or linear fallback +- `PhaseSanitizationService` -- Applies SpotFi / MUSIC phase correction from `wifi-densepose-signal` +- `TeacherLabelService` -- Runs Detectron2 on paired RGB frames to produce DensePose UV pseudo-labels +- `HardwareNormalizerService` -- Z-score normalization + chipset-invariant phase sanitization + +**RuVector Integration:** +- `ruvector-solver` -> `NeumannSolver` for sparse O(sqrt(n)) subcarrier interpolation (114->56) +- `ruvector-temporal-tensor` -> `TemporalTensorCompressor` for 50-75% memory reduction in CSI windows + +--- + +### 2. Model Architecture Context + +**Responsibility:** Define the WiFiDensePoseModel: CSI embedding, cross-attention between keypoint queries and CSI features, GNN message passing, attention-gated modality fusion, and spatial decoding heads for keypoints and DensePose UV. + +``` ++----------------------------------------------------------+ +| Model Architecture Context | ++----------------------------------------------------------+ +| | +| +---------------+ +---------------+ | +| | CSI Embed | | Keypoint | | +| | (Linear | | Queries | | +| | 56 -> d) | | (17 learned | | +| +-------+-------+ | embeddings) | | +| | +-------+-------+ | +| | | | +| +--------+-----------+ | +| v | +| +----------------+ | +| | Cross-Attention| | +| | (Q=queries, | | +| | K,V=csi) | | +| +--------+-------+ | +| v | +| +----------------+ | +| | GNN Stack | | +| | (2-layer GCN | | +| | skeleton | | +| | adjacency) | | +| +--------+-------+ | +| v | +| body_part_features [17 x d_model] | +| | | +| +-------+--------+--------+ | +| v v v v | +| +----------+ +------+ +-----+ +-------+ | +| | Modality | | xyz | | UV | |Spatial| | +| | Transl. | | Head | | Head| |Attn | | +| | (attn | | | | | |Decoder| | +| | mincut) | | | | | | | | +| +----------+ +------+ +-----+ +-------+ | +| | ++----------------------------------------------------------+ +``` + +**Aggregates:** +- `WiFiDensePoseModel` (Aggregate Root) -- The complete model graph + +**Entities:** +- `ModalityTranslator` -- Attention-gated CSI fusion using min-cut +- `CsiToPoseTransformer` -- Cross-attention + GNN backbone +- `KeypointHead` -- Regresses 17 x (x, y, z, confidence) from body_part_features +- `DensePoseHead` -- Predicts body part labels and UV surface coordinates + +**Value Objects:** +- `ModelConfig` -- Architecture hyperparameters (d_model, n_heads, n_gnn_layers) +- `AttentionOutput` -- Attended values + gating result from min-cut attention +- `BodyPartFeatures` -- [17 x d_model] intermediate representation + +**Domain Services:** +- `AttentionGatingService` -- Applies `attn_mincut` to prune irrelevant antenna paths +- `SpatialDecodingService` -- Graph-based spatial attention among feature map locations + +**RuVector Integration:** +- `ruvector-attn-mincut` -> `attn_mincut` for antenna-path gating in ModalityTranslator +- `ruvector-attention` -> `ScaledDotProductAttention` for spatial decoder long-range dependencies + +--- + +### 3. Training Orchestration Context + +**Responsibility:** Run the training loop across epochs, compute the composite loss (keypoint MSE + DensePose part CE + UV Smooth L1 + transfer MSE), evaluate validation metrics (PCK@0.2, OKS, MPJPE), manage checkpoints, and verify deterministic proof correctness. + +``` ++----------------------------------------------------------+ +| Training Orchestration Context | ++----------------------------------------------------------+ +| | +| +---------------+ +---------------+ | +| | Training Loop | | Loss Computer | | +| | (epoch iter, | | (composite: | | +| | batch fwd/ | | kp_mse + | | +| | bwd, optim) | | part_ce + | | +| +-------+-------+ | uv_l1 + | | +| | | transfer) | | +| | +-------+-------+ | +| +--------+-----------+ | +| v | +| +----------------+ | +| | Metric | | +| | Evaluator | | +| | (PCK, OKS, | | +| | MPJPE, | | +| | Hungarian) | | +| +--------+-------+ | +| v | +| +-------------+-------------+ | +| v v | +| +----------------+ +----------------+ | +| | Checkpoint | | Proof Verifier | | +| | Manager | | (fixed seed, | | +| | (best-by-PCK, | | 50 steps, | | +| | SHA-256 hash) | | loss + hash) | | +| +----------------+ +----------------+ | +| | ++----------------------------------------------------------+ +``` + +**Aggregates:** +- `TrainingRun` (Aggregate Root) -- The complete training session + +**Entities:** +- `CheckpointManager` -- Persists and selects model snapshots +- `ProofVerifier` -- Deterministic verification against stored hashes + +**Value Objects:** +- `TrainingConfig` -- Epochs, batch_size, learning_rate, loss_weights, optimizer params +- `Checkpoint` -- Epoch number, model weights SHA-256, validation PCK at that epoch +- `LossWeights` -- Relative weights for each loss component +- `CompositeTrainingLoss` -- Combined scalar loss with per-component breakdown +- `OksScore` -- Per-keypoint Object Keypoint Similarity with sigma values +- `PckScore` -- Percentage of Correct Keypoints at threshold 0.2 +- `MpjpeScore` -- Mean Per Joint Position Error in millimeters +- `ProofResult` -- Seed, steps, loss_decreased flag, hash_matches flag + +**Domain Services:** +- `LossComputationService` -- Computes composite loss from model outputs and ground truth +- `MetricEvaluationService` -- Computes PCK, OKS, MPJPE over validation set +- `HungarianAssignmentService` -- Bipartite matching for multi-person evaluation +- `DynamicPersonMatcherService` -- Frame-persistent assignment via `ruvector-mincut` +- `ProofVerificationService` -- Fixed-seed training + SHA-256 verification + +**RuVector Integration:** +- `ruvector-mincut` -> `DynamicMinCut` for O(n^1.5 log n) multi-person assignment in metrics +- Original `hungarian_assignment` kept for single-frame static matching in proof verification + +--- + +### 4. Embedding & Transfer Context + +**Responsibility:** Produce AETHER contrastive embeddings from the model backbone, train domain-adversarial features via MERIDIAN, manage the HNSW embedding index for re-ID and fingerprinting, and generate LoRA adapters for few-shot environment adaptation. + +``` ++----------------------------------------------------------+ +| Embedding & Transfer Context | ++----------------------------------------------------------+ +| | +| body_part_features [17 x d_model] | +| | | +| +--------+-----------+ | +| v v | +| +---------------+ +---------------+ | +| | AETHER | | MERIDIAN | | +| | Projection | | Domain | | +| | Head | | Factorizer | | +| | (MeanPool -> | | (PoseEncoder | | +| | fc -> 128d) | | + EnvEncoder)| | +| +-------+-------+ +-------+-------+ | +| | | | +| v v | +| +---------------+ +---------------+ | +| | InfoNCE Loss | | Gradient | | +| | + Hard Neg | | Reversal | | +| | Mining (HNSW) | | Layer (GRL) | | +| +-------+-------+ +-------+-------+ | +| | | | +| v v | +| +---------------+ +---------------+ | +| | Embedding | | Geometry | | +| | Index (HNSW) | | Encoder + | | +| | (fingerprint | | FiLM Cond. | | +| | store) | | (zero-shot) | | +| +---------------+ +-------+-------+ | +| | | +| v | +| +---------------+ | +| | Rapid Adapt. | | +| | (LoRA + TTT, | | +| | 10-sec cal.) | | +| +---------------+ | +| | ++----------------------------------------------------------+ +``` + +**Aggregates:** +- `EmbeddingIndex` (Aggregate Root) -- HNSW-indexed store of AETHER fingerprints +- `DomainAdaptationState` (Aggregate Root) -- Tracks GRL lambda, domain classifier accuracy, factorization quality + +**Entities:** +- `ProjectionHead` -- MLP mapping body_part_features to 128-dim embedding space +- `DomainFactorizer` -- Splits features into h_pose and h_env +- `DomainClassifier` -- Classifies domain from h_pose (trained adversarially via GRL) +- `GeometryEncoder` -- Fourier positional encoding + DeepSets for AP positions +- `LoraAdapter` -- Low-rank adaptation weights for environment-specific fine-tuning + +**Value Objects:** +- `AetherEmbedding` -- 128-dim L2-normalized contrastive vector +- `FingerprintType` -- ReIdentification / RoomFingerprint / PersonFingerprint +- `DomainLabel` -- Environment identifier for adversarial training +- `GrlSchedule` -- Lambda annealing parameters (max_lambda, warmup_epochs) +- `GeometryInput` -- AP positions in meters relative to room origin +- `FilmParameters` -- Gamma (scale) and beta (shift) vectors from geometry conditioning +- `LoraConfig` -- Rank, alpha, target layers +- `AdaptationLoss` -- ContrastiveTTT / EntropyMin / Combined + +**Domain Services:** +- `ContrastiveLossService` -- Computes InfoNCE loss with temperature scaling +- `HardNegativeMiningService` -- HNSW k-NN search for difficult negative pairs +- `DomainAdversarialService` -- Manages GRL annealing and domain classification +- `GeometryConditioningService` -- Encodes AP layout and produces FiLM parameters +- `VirtualDomainAugmentationService` -- Generates synthetic environment shifts for training diversity +- `RapidAdaptationService` -- Produces LoRA adapter from 10-second unlabeled calibration + +--- + +## Core Domain Entities + +### TrainingRun (Aggregate Root) + +```rust +pub struct TrainingRun { + /// Unique run identifier + pub id: TrainingRunId, + /// Full training configuration + pub config: TrainingConfig, + /// Datasets loaded for this run + pub datasets: Vec, + /// Ordered history of per-epoch metrics + pub epoch_history: Vec, + /// Best checkpoint by validation PCK + pub best_checkpoint: Option, + /// Current epoch (0-indexed) + pub current_epoch: usize, + /// Run status + pub status: RunStatus, + /// Proof verification result (if run) + pub proof_result: Option, +} + +pub enum RunStatus { + Initializing, + Training, + Completed, + Failed { reason: String }, + ProofVerified, +} +``` + +**Invariants:** +- Must have at least 1 dataset loaded before transitioning to `Training` +- `best_checkpoint` is updated only when a new epoch's validation PCK exceeds all prior epochs +- `proof_result` can only be set once and is immutable after verification + +### MmFiDataset (Aggregate Root) + +```rust +pub struct MmFiDataset { + /// Root directory containing .npy files + pub data_root: PathBuf, + /// Subject IDs in this split + pub subject_ids: Vec, + /// Number of action classes + pub n_actions: usize, // 27 + /// Source subcarrier count + pub source_subcarriers: usize, // 114 + /// Target subcarrier count after interpolation + pub target_subcarriers: usize, // 56 + /// Antenna configuration: 1 TX x 3 RX + pub antenna_pairs: usize, // 3 + /// Sampling rate in Hz + pub sample_rate_hz: f32, // 100.0 + /// Temporal window size (frames per sample) + pub window_frames: usize, // 10 + /// Compressed buffer for memory-efficient storage + pub buffer: CompressedCsiBuffer, + /// Total loaded samples + pub n_samples: usize, +} +``` + +### WiPoseDataset (Aggregate Root) + +```rust +pub struct WiPoseDataset { + /// Root directory containing .mat files + pub data_root: PathBuf, + /// Subject IDs in this split + pub subject_ids: Vec, + /// Source subcarrier count + pub source_subcarriers: usize, // 30 + /// Target subcarrier count after zero-padding + pub target_subcarriers: usize, // 56 + /// Antenna configuration: 3 TX x 3 RX + pub antenna_pairs: usize, // 9 + /// Keypoint count (18 AlphaPose, mapped to 17 COCO) + pub source_keypoints: usize, // 18 + /// Compressed buffer + pub buffer: CompressedCsiBuffer, + /// Total loaded samples + pub n_samples: usize, +} +``` + +### WiFiDensePoseModel (Aggregate Root) + +```rust +pub struct WiFiDensePoseModel { + /// CSI embedding layer: Linear(56, d_model) + pub csi_embed: Linear, + /// Learned keypoint query embeddings [17 x d_model] + pub keypoint_queries: Tensor, + /// Cross-attention: Q=queries, K,V=csi_embed + pub cross_attention: MultiHeadAttention, + /// GNN message passing on skeleton graph + pub gnn_stack: GnnStack, + /// Modality translator with attention-gated fusion + pub modality_translator: ModalityTranslator, + /// Keypoint regression head + pub keypoint_head: KeypointHead, + /// DensePose UV prediction head + pub densepose_head: DensePoseHead, + /// Spatial attention decoder + pub spatial_decoder: SpatialAttentionDecoder, + /// Model dimensionality + pub d_model: usize, // 64 +} +``` + +### EmbeddingIndex (Aggregate Root) + +```rust +pub struct EmbeddingIndex { + /// HNSW graph for approximate nearest-neighbor search + pub hnsw: HnswIndex, + /// Stored embeddings with metadata + pub entries: Vec, + /// Embedding dimensionality + pub dim: usize, // 128 + /// Number of indexed embeddings + pub count: usize, + /// HNSW construction parameters + pub ef_construction: usize, // 200 + pub m_connections: usize, // 16 +} + +pub struct EmbeddingEntry { + pub id: EmbeddingId, + pub embedding: Vec, // [128], L2-normalized + pub fingerprint_type: FingerprintType, + pub source_domain: Option, + pub created_at: u64, +} + +pub enum FingerprintType { + ReIdentification, + RoomFingerprint, + PersonFingerprint, +} +``` + +--- + +## Value Objects + +### CsiSample + +```rust +pub struct CsiSample { + /// Amplitude tensor [n_antenna_pairs x n_subcarriers x n_time_frames] + pub amplitude: Vec, + /// Phase tensor [n_antenna_pairs x n_subcarriers x n_time_frames] + pub phase: Vec, + /// Ground-truth 3D keypoints [17 x 3] (x, y, z in meters) + pub keypoints: [[f32; 3]; 17], + /// Per-keypoint visibility flags + pub visibility: [f32; 17], + /// DensePose UV pseudo-labels (optional, from teacher model) + pub densepose_uv: Option, + /// Domain label for adversarial training + pub domain_label: Option, + /// Hardware source type + pub hardware_type: HardwareType, +} +``` + +### TrainingConfig + +```rust +pub struct TrainingConfig { + /// Number of training epochs + pub epochs: usize, + /// Mini-batch size + pub batch_size: usize, + /// Initial learning rate + pub learning_rate: f64, // 1e-3 + /// Learning rate schedule: step decay at these epochs + pub lr_decay_epochs: Vec, // [40, 80] + /// Learning rate decay factor + pub lr_decay_factor: f64, // 0.1 + /// Loss component weights + pub loss_weights: LossWeights, + /// Optimizer (Adam) + pub optimizer: OptimizerConfig, + /// Validation subject IDs (MM-Fi: 33-40) + pub val_subjects: Vec, + /// Random seed for reproducibility + pub seed: u64, + /// Enable MERIDIAN domain-adversarial training + pub meridian_enabled: bool, + /// Enable AETHER contrastive learning + pub aether_enabled: bool, +} + +pub struct LossWeights { + /// Keypoint heatmap MSE weight + pub keypoint_mse: f32, // 1.0 + /// DensePose body part cross-entropy weight + pub densepose_part_ce: f32, // 0.5 + /// DensePose UV Smooth L1 weight + pub uv_smooth_l1: f32, // 0.5 + /// Teacher-student transfer MSE weight + pub transfer_mse: f32, // 0.2 + /// AETHER contrastive loss weight (ADR-024) + pub contrastive: f32, // 0.1 + /// MERIDIAN domain adversarial weight (ADR-027) + pub domain_adversarial: f32, // annealed 0.0 -> 1.0 +} +``` + +### Checkpoint + +```rust +pub struct Checkpoint { + /// Epoch at which this checkpoint was saved + pub epoch: usize, + /// SHA-256 hash of serialized model weights + pub weights_hash: String, + /// Validation PCK@0.2 at this epoch + pub validation_pck: f64, + /// Validation OKS at this epoch + pub validation_oks: f64, + /// File path to saved weights + pub path: PathBuf, + /// Timestamp + pub created_at: u64, +} +``` + +### ProofResult + +```rust +pub struct ProofResult { + /// Seed used for model initialization + pub model_seed: u64, // MODEL_SEED = 0 + /// Seed used for proof data generation + pub proof_seed: u64, // PROOF_SEED = 42 + /// Number of training steps in proof + pub steps: usize, // 50 + /// Whether loss decreased monotonically + pub loss_decreased: bool, + /// Whether final weights hash matches stored expected hash + pub hash_matches: bool, + /// The computed SHA-256 hash + pub computed_hash: String, + /// The expected SHA-256 hash (from file) + pub expected_hash: String, +} +``` + +### LoraAdapter + +```rust +pub struct LoraAdapter { + /// Low-rank decomposition rank + pub rank: usize, // 4 + /// LoRA alpha scaling factor + pub alpha: f32, // 1.0 + /// Per-layer weight matrices (A and B for each adapted layer) + pub weights: Vec, + /// Source domain this adapter was calibrated for + pub source_domain: DomainLabel, + /// Calibration duration in seconds + pub calibration_duration_secs: f32, + /// Number of calibration frames used + pub calibration_frames: usize, +} + +pub struct LoraLayerWeights { + /// Layer name in the model + pub layer_name: String, + /// Down-projection: [d_model x rank] + pub a: Vec, + /// Up-projection: [rank x d_model] + pub b: Vec, +} +``` + +--- + +## Domain Events + +### Dataset Events + +```rust +pub enum DatasetEvent { + /// Dataset loaded and validated + DatasetLoaded { + dataset_type: DatasetType, + n_samples: usize, + n_subjects: u32, + source_subcarriers: usize, + timestamp: u64, + }, + + /// Subcarrier interpolation completed for a dataset + SubcarrierInterpolationComplete { + dataset_type: DatasetType, + source_count: usize, + target_count: usize, + method: InterpolationMethod, + timestamp: u64, + }, + + /// Teacher pseudo-labels generated for a batch + PseudoLabelsGenerated { + n_samples: usize, + n_with_uv: usize, + timestamp: u64, + }, +} + +pub enum DatasetType { + MmFi, + WiPose, + Synthetic, +} + +pub enum InterpolationMethod { + /// ruvector-solver NeumannSolver sparse least-squares + SparseNeumannSolver, + /// Fallback linear interpolation + LinearInterpolation, + /// Wi-Pose zero-padding + ZeroPad, +} +``` + +### Training Events + +```rust +pub enum TrainingEvent { + /// One epoch of training completed + EpochCompleted { + epoch: usize, + train_loss: f64, + val_pck: f64, + val_oks: f64, + val_mpjpe_mm: f64, + learning_rate: f64, + grl_lambda: f32, + timestamp: u64, + }, + + /// New best checkpoint saved + CheckpointSaved { + epoch: usize, + weights_hash: String, + validation_pck: f64, + path: String, + timestamp: u64, + }, + + /// Deterministic proof verification completed + ProofVerified { + model_seed: u64, + proof_seed: u64, + steps: usize, + loss_decreased: bool, + hash_matches: bool, + timestamp: u64, + }, + + /// Training run completed or failed + TrainingRunFinished { + run_id: String, + status: RunStatus, + total_epochs: usize, + best_pck: f64, + best_oks: f64, + timestamp: u64, + }, +} +``` + +### Embedding Events + +```rust +pub enum EmbeddingEvent { + /// New AETHER embedding indexed + EmbeddingIndexed { + embedding_id: String, + fingerprint_type: FingerprintType, + nearest_neighbor_distance: f32, + index_size: usize, + timestamp: u64, + }, + + /// Hard negative pair discovered during mining + HardNegativeFound { + anchor_id: String, + negative_id: String, + similarity: f32, + timestamp: u64, + }, + + /// Domain adaptation completed for a target environment + DomainAdaptationComplete { + source_domain: String, + target_domain: String, + pck_before: f64, + pck_after: f64, + adaptation_method: String, + timestamp: u64, + }, + + /// LoRA adapter generated via rapid calibration + LoraAdapterGenerated { + domain: String, + rank: usize, + calibration_frames: usize, + calibration_seconds: f32, + timestamp: u64, + }, +} +``` + +--- + +## Invariants + +### Dataset Management +- MM-Fi samples must be interpolated from 114 to 56 subcarriers before use in training +- Wi-Pose samples must be zero-padded from 30 to 56 subcarriers before use in training +- Wi-Pose keypoints must be mapped from 18 (AlphaPose) to 17 (COCO) by dropping neck index 1 +- All CSI amplitudes must be finite and non-negative after loading +- Phase values must be in [-pi, pi] after sanitization +- Validation subjects (MM-Fi: 33-40) must never appear in the training split +- `CompressedCsiBuffer` must preserve signal fidelity within quantization error bounds (hot: <1% error) + +### Model Architecture +- `csi_embed` input dimension must equal the canonical 56 subcarriers +- `keypoint_queries` must have exactly 17 entries (one per COCO keypoint) +- `attn_mincut` seq_len must equal n_antenna_pairs * n_time_frames +- GNN adjacency matrix must encode the human skeleton topology (17 nodes, 16 edges) +- Spatial attention decoder must preserve spatial resolution (no information loss in reshape) + +### Training Orchestration +- TrainingRun must have at least 1 dataset loaded before `start()` is called +- Proof verification requires fixed seeds: MODEL_SEED=0, PROOF_SEED=42 +- Proof verification uses exactly 50 training steps on deterministic SyntheticDataset +- Loss must decrease over proof steps (otherwise proof fails) +- SHA-256 hash of final weights must match stored expected hash (otherwise proof fails) +- `best_checkpoint` is updated if and only if current val_pck > all previous val_pck values +- Learning rate decays by factor 0.1 at epochs 40 and 80 (step schedule) +- Hungarian assignment for static single-frame matching must use the deterministic implementation (not DynamicMinCut) during proof verification + +### Embedding & Transfer +- AETHER embeddings must be L2-normalized (unit norm) before indexing in HNSW +- InfoNCE temperature must be > 0 (typically 0.07) +- HNSW index ef_search must be >= k for k-NN queries +- MERIDIAN GRL lambda must anneal from 0.0 to 1.0 over the first 20 epochs using the schedule: lambda(p) = 2 / (1 + exp(-10 * p)) - 1, where p = epoch / 20 +- GRL lambda must not exceed 1.0 at any epoch +- `DomainFactorizer` output dimensions: h_pose = [17 x 64], h_env = [32] +- `GeometryEncoder` must be permutation-invariant with respect to AP ordering (DeepSets guarantee) +- LoRA adapter rank must be <= d_model / 4 (default rank=4 for d_model=64) +- Rapid adaptation requires at least 200 CSI frames (10 seconds at 20 Hz) + +--- + +## Domain Services + +### SubcarrierInterpolationService + +Resamples CSI subcarriers from source to target count using physically-motivated sparse interpolation. + +```rust +pub trait SubcarrierInterpolationService { + /// Sparse interpolation via NeumannSolver (O(sqrt(n)), preferred) + fn interpolate_sparse( + &self, + source: &[f32], + source_count: usize, + target_count: usize, + tolerance: f64, + ) -> Result, InterpolationError>; + + /// Linear interpolation fallback (O(n)) + fn interpolate_linear( + &self, + source: &[f32], + source_count: usize, + target_count: usize, + ) -> Vec; + + /// Zero-pad for Wi-Pose (30 -> 56) + fn zero_pad( + &self, + source: &[f32], + target_count: usize, + ) -> Vec; +} +``` + +### LossComputationService + +Computes the composite training loss from model outputs and ground truth. + +```rust +pub trait LossComputationService { + /// Compute composite loss with per-component breakdown + fn compute( + &self, + predictions: &ModelOutput, + targets: &GroundTruth, + weights: &LossWeights, + ) -> CompositeTrainingLoss; +} + +pub struct CompositeTrainingLoss { + /// Total weighted loss (scalar for backprop) + pub total: f64, + /// Keypoint heatmap MSE component + pub keypoint_mse: f64, + /// DensePose body part cross-entropy component + pub densepose_part_ce: f64, + /// DensePose UV Smooth L1 component + pub uv_smooth_l1: f64, + /// Teacher-student transfer MSE component + pub transfer_mse: f64, + /// AETHER contrastive loss (if enabled) + pub contrastive: Option, + /// MERIDIAN domain adversarial loss (if enabled) + pub domain_adversarial: Option, +} +``` + +### MetricEvaluationService + +Evaluates model accuracy on the validation set using standard pose estimation metrics. + +```rust +pub trait MetricEvaluationService { + /// PCK@0.2: fraction of keypoints within 20% of torso diameter + fn compute_pck(&self, predictions: &[PosePrediction], targets: &[PoseTarget], threshold: f64) -> PckScore; + + /// OKS: Object Keypoint Similarity with per-keypoint sigmas + fn compute_oks(&self, predictions: &[PosePrediction], targets: &[PoseTarget]) -> OksScore; + + /// MPJPE: Mean Per Joint Position Error in millimeters + fn compute_mpjpe(&self, predictions: &[PosePrediction], targets: &[PoseTarget]) -> MpjpeScore; + + /// Multi-person assignment via Hungarian (static, deterministic) + fn assign_hungarian(&self, pred: &[PosePrediction], gt: &[PoseTarget]) -> Vec<(usize, usize)>; + + /// Multi-person assignment via DynamicMinCut (persistent, O(n^1.5 log n)) + fn assign_dynamic(&mut self, pred: &[PosePrediction], gt: &[PoseTarget]) -> Vec<(usize, usize)>; +} +``` + +### DomainAdversarialService + +Manages the MERIDIAN gradient reversal training regime. + +```rust +pub trait DomainAdversarialService { + /// Compute GRL lambda for the current epoch + fn grl_lambda(&self, epoch: usize, max_warmup_epochs: usize) -> f32; + + /// Forward pass through domain classifier with gradient reversal + fn classify_domain( + &self, + h_pose: &Tensor, + lambda: f32, + ) -> Tensor; + + /// Compute domain adversarial loss (cross-entropy on domain logits) + fn domain_loss( + &self, + domain_logits: &Tensor, + domain_labels: &Tensor, + ) -> f64; +} +``` + +--- + +## Context Map + +``` ++------------------------------------------------------------------+ +| Training Pipeline System | ++------------------------------------------------------------------+ +| | +| +------------------+ CsiSample +------------------+ | +| | Dataset |-------------->| Training | | +| | Management | | Orchestration | | +| | Context | | Context | | +| +--------+---------+ +--------+-----------+ | +| | | | +| | Publishes | Publishes | +| | DatasetEvent | TrainingEvent | +| v v | +| +------------------------------------------------------+ | +| | Event Bus (Domain Events) | | +| +------------------------------------------------------+ | +| | | | +| v v | +| +------------------+ +------------------+ | +| | Model |<-------------| Embedding & | | +| | Architecture | body_part_ | Transfer | | +| | Context | features | Context | | +| +------------------+ +------------------+ | +| | ++------------------------------------------------------------------+ +| UPSTREAM (Conformist) | +| +--------------+ +--------------+ +--------------+ | +| |wifi-densepose| |wifi-densepose| |wifi-densepose| | +| | -signal | | -nn | | -core | | +| | (phase algs,| | (ONNX, | | (CsiFrame, | | +| | SpotFi) | | Candle) | | error) | | +| +--------------+ +--------------+ +--------------+ | +| | ++------------------------------------------------------------------+ +| SIBLING (Partnership) | +| +--------------+ +--------------+ +--------------+ | +| | RuvSense | | MAT | | Sensing | | +| | (pose | | (triage, | | Server | | +| | tracker, | | survivor) | | (inference | | +| | field | | | | deployment) | | +| | model) | | | | | | +| +--------------+ +--------------+ +--------------+ | +| | ++------------------------------------------------------------------+ +| EXTERNAL (Published Language) | +| +--------------+ +--------------+ +--------------+ | +| | MM-Fi | | Wi-Pose | | Detectron2 | | +| | (NeurIPS | | (NjtechCV | | (teacher | | +| | dataset) | | dataset) | | labels) | | +| +--------------+ +--------------+ +--------------+ | ++------------------------------------------------------------------+ +``` + +**Relationship Types:** +- Dataset Management -> Training Orchestration: **Customer/Supplier** (Dataset produces CsiSamples; Orchestration consumes) +- Model Architecture -> Training Orchestration: **Partnership** (tight bidirectional coupling: Orchestration drives forward/backward; Architecture defines the computation graph) +- Model Architecture -> Embedding & Transfer: **Customer/Supplier** (Architecture produces body_part_features; Embedding consumes for contrastive/adversarial heads) +- Embedding & Transfer -> Training Orchestration: **Partnership** (contrastive and adversarial losses feed into composite loss) +- Training Pipeline -> Upstream crates: **Conformist** (adapts to wifi-densepose-signal, -nn, -core types) +- Training Pipeline -> RuvSense/MAT/Server: **Partnership** (trained model weights flow downstream) +- Training Pipeline -> External datasets: **Anti-Corruption Layer** (dataset loaders translate external formats to domain types) + +--- + +## Anti-Corruption Layer + +### MM-Fi Adapter (Dataset Management -> External MM-Fi format) + +```rust +/// Translates raw MM-Fi numpy files into domain CsiSample values. +/// Handles the 114->56 subcarrier interpolation and 1TX/3RX antenna layout. +pub struct MmFiAdapter { + /// Subcarrier interpolation service + interpolator: Box, + /// Phase sanitizer from wifi-densepose-signal + phase_sanitizer: PhaseSanitizer, + /// Hardware normalizer for z-score normalization + normalizer: HardwareNormalizer, +} + +impl MmFiAdapter { + /// Load a single MM-Fi sample from .npy tensors and produce a CsiSample. + /// Steps: + /// 1. Read amplitude [3, 114, 10] and phase [3, 114, 10] + /// 2. Interpolate 114 -> 56 subcarriers per antenna pair + /// 3. Sanitize phase (remove linear offset, unwrap) + /// 4. Z-score normalize amplitude per frame + /// 5. Read 17-keypoint COCO annotations + pub fn adapt(&self, raw: &MmFiRawSample) -> Result; +} +``` + +### Wi-Pose Adapter (Dataset Management -> External Wi-Pose format) + +```rust +/// Translates Wi-Pose .mat files into domain CsiSample values. +/// Handles 30->56 zero-padding and 18->17 keypoint mapping. +pub struct WiPoseAdapter { + /// Zero-padding service + interpolator: Box, + /// Phase sanitizer + phase_sanitizer: PhaseSanitizer, +} + +impl WiPoseAdapter { + /// Load a Wi-Pose sample from .mat format and produce a CsiSample. + /// Steps: + /// 1. Read CSI [9, 30] (3x3 antenna pairs, 30 subcarriers) + /// 2. Zero-pad 30 -> 56 subcarriers (high-frequency padding) + /// 3. Sanitize phase + /// 4. Map 18 AlphaPose keypoints -> 17 COCO (drop neck, index 1) + pub fn adapt(&self, raw: &WiPoseRawSample) -> Result; +} +``` + +### Teacher Model Adapter (Dataset Management -> Detectron2) + +```rust +/// Adapts Detectron2 DensePose outputs into domain DensePoseLabels. +/// Used during teacher-student pseudo-label generation. +pub struct TeacherModelAdapter; + +impl TeacherModelAdapter { + /// Run Detectron2 DensePose on an RGB frame and produce pseudo-labels. + /// Output: (part_labels [H x W], u_coords [H x W], v_coords [H x W]) + pub fn generate_pseudo_labels( + &self, + rgb_frame: &RgbFrame, + ) -> Result; +} +``` + +### RuVector Adapter (Model Architecture -> ruvector crates) + +```rust +/// Adapts ruvector-attn-mincut API to the model's tensor format. +/// Handles the Tensor <-> Vec conversion overhead per batch element. +pub struct AttnMinCutAdapter; + +impl AttnMinCutAdapter { + /// Apply min-cut gated attention to antenna-path features. + /// Converts [B, n_ant, n_sc] tensor to flat Vec per batch element, + /// calls attn_mincut, and reshapes output back to tensor. + pub fn apply( + &self, + features: &Tensor, + n_antenna_paths: usize, + n_subcarriers: usize, + lambda: f32, + ) -> Result; +} +``` + +--- + +## Repository Interfaces + +```rust +/// Persists and retrieves training run state +pub trait TrainingRunRepository { + fn save(&self, run: &TrainingRun) -> Result<(), RepositoryError>; + fn find_by_id(&self, id: &TrainingRunId) -> Result, RepositoryError>; + fn find_latest(&self) -> Result, RepositoryError>; + fn list_completed(&self) -> Result, RepositoryError>; +} + +/// Persists model checkpoints +pub trait CheckpointRepository { + fn save(&self, checkpoint: &Checkpoint) -> Result<(), RepositoryError>; + fn find_best(&self, run_id: &TrainingRunId) -> Result, RepositoryError>; + fn find_by_epoch(&self, run_id: &TrainingRunId, epoch: usize) -> Result, RepositoryError>; + fn list_all(&self, run_id: &TrainingRunId) -> Result, RepositoryError>; +} + +/// Persists AETHER embedding index +pub trait EmbeddingRepository { + fn save_index(&self, index: &EmbeddingIndex) -> Result<(), RepositoryError>; + fn load_index(&self) -> Result, RepositoryError>; + fn add_entry(&self, entry: &EmbeddingEntry) -> Result<(), RepositoryError>; + fn search_knn(&self, query: &[f32], k: usize) -> Result, RepositoryError>; +} + +/// Persists LoRA adapters for environment-specific fine-tuning +pub trait LoraRepository { + fn save(&self, adapter: &LoraAdapter) -> Result<(), RepositoryError>; + fn find_by_domain(&self, domain: &DomainLabel) -> Result, RepositoryError>; + fn list_all(&self) -> Result, RepositoryError>; +} +``` + +--- + +## References + +- ADR-015: Public Dataset Strategy (MM-Fi, Wi-Pose, teacher-student training) +- ADR-016: RuVector Integration (5 crate integration points in training pipeline) +- ADR-020: Rust Migration (training pipeline in wifi-densepose-train crate) +- ADR-024: AETHER Contrastive CSI Embeddings (128-dim fingerprints, InfoNCE, HNSW) +- ADR-027: MERIDIAN Cross-Environment Domain Generalization (GRL, FiLM, LoRA) +- Yang et al., "MM-Fi: Multi-Modal Non-Intrusive 4D Human Dataset" (NeurIPS 2023) +- NjtechCVLab, "Wi-Pose Dataset" (CSI-Former, MDPI Entropy 2023) +- Geng et al., "DensePose From WiFi" (CMU, arXiv:2301.00250, 2023) +- Ganin et al., "Domain-Adversarial Training of Neural Networks" (JMLR 2016) +- Perez et al., "FiLM: Visual Reasoning with a General Conditioning Layer" (AAAI 2018)