ADR-057: Desktop App Training & RuVector Integration

Field      Value
Status     Proposed
Date       2026-03-10
Authors    RuView Team
Reviewers  -
Related    ADR-016, ADR-017, ADR-024, ADR-027

Context

The RuView desktop application currently provides device discovery, firmware flashing, OTA updates, and real-time sensing visualization. However, users cannot train models or configure RuVector signal processing modules directly from the desktop app.

The following crates exist in the workspace but are not exposed in the desktop UI:

Training Crate (wifi-densepose-train)

  • Dataset management (MM-Fi, Wi-Pose formats)
  • Model architectures (CSI encoder, pose decoder)
  • Training loops with metrics tracking
  • Checkpoint save/load
  • ruview_metrics integration

RuVector Crates (5 modules)

  1. ruvector-mincut - Graph-based person segmentation, DynamicPersonMatcher
  2. ruvector-attn-mincut - Attention-weighted antenna selection
  3. ruvector-temporal-tensor - Temporal CSI compression, breathing detection
  4. ruvector-solver - Sparse interpolation, triangulation
  5. ruvector-attention - Spatial attention, BVP extraction

Decision

Add a new "Training" page to the desktop application with tabbed navigation:

Tab Structure

┌─────────────────────────────────────────────────────────────┐
│  Training & Models                                          │
├──────────┬──────────┬──────────┬──────────┬────────────────┤
│ Datasets │  Models  │ Training │ RuVector │    Metrics     │
└──────────┴──────────┴──────────┴──────────┴────────────────┘

Tab 1: Datasets

  • Download standard datasets (MM-Fi, Wi-Pose)
  • Import custom CSI recordings
  • Preview dataset samples (CSI heatmaps, labels)
  • Split into train/val/test sets
  • Statistics - sample counts, class distribution
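
The train/val/test split can be sketched as a seeded shuffle followed by two cuts. This is a minimal illustration rather than the training crate's implementation: `split_indices`, `Split`, and the small built-in generator are hypothetical names chosen so the example needs no external crate.

```rust
/// Index sets for one dataset split (hypothetical type).
#[derive(Debug)]
pub struct Split {
    pub train: Vec<usize>,
    pub val: Vec<usize>,
    pub test: Vec<usize>,
}

/// Shuffle `0..n` with a seeded Fisher-Yates pass, then cut into train/val/test.
/// `train_frac + val_frac` must not exceed 1.0; the remainder becomes the test set.
pub fn split_indices(n: usize, train_frac: f64, val_frac: f64, seed: u64) -> Split {
    assert!(train_frac >= 0.0 && val_frac >= 0.0 && train_frac + val_frac <= 1.0);
    let mut idx: Vec<usize> = (0..n).collect();

    // Small linear congruential generator: deterministic for a given seed.
    let mut state = seed;
    let mut next = || {
        state = state
            .wrapping_mul(6364136223846793005)
            .wrapping_add(1442695040888963407);
        (state >> 33) as usize
    };
    for i in (1..n).rev() {
        let j = next() % (i + 1);
        idx.swap(i, j);
    }

    // Clamp the cut points so rounding can never push them past `n`.
    let n_train = ((n as f64 * train_frac) as usize).min(n);
    let n_val = ((n as f64 * val_frac) as usize).min(n - n_train);
    let test = idx.split_off(n_train + n_val);
    let val = idx.split_off(n_train);
    Split { train: idx, val, test }
}
```

Using a fixed seed keeps the split reproducible across runs, which matters when metrics from different training jobs are compared against the same validation set.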

Tab 2: Models

  • Browse available architectures:
    • CSI Encoder (CNN, Transformer)
    • Pose Decoder (LSTM, GRU)
    • AETHER embedding network (ADR-024)
    • MERIDIAN domain adaptor (ADR-027)
  • Load checkpoints from disk
  • View model summary (params, layers, memory)
  • Export to ONNX/TorchScript

Tab 3: Training

  • Configure training:
    • Learning rate, batch size, epochs
    • Optimizer (Adam, SGD, AdamW)
    • Loss function selection
    • Data augmentation toggles
  • GPU Detection - CUDA/Metal availability
  • Start/Stop training jobs
  • Progress - live loss curves, ETA
  • Checkpointing - auto-save best model
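
The configuration options above could be carried in a single validated struct before a job starts. The sketch below is an assumption about how that might look: `TrainingConfig`, its field names, and `validate` are hypothetical and do not come from `wifi-densepose-train`.

```rust
/// Optimizers listed in the Training tab.
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum Optimizer {
    Adam,
    Sgd,
    AdamW,
}

/// Hypothetical shape of the form state handed to `start_training`.
#[derive(Debug, Clone)]
pub struct TrainingConfig {
    pub learning_rate: f32,
    pub batch_size: u32,
    pub epochs: u32,
    pub optimizer: Optimizer,
    pub augment: bool,
}

impl TrainingConfig {
    /// Collect every problem so the UI can flag all invalid fields at once.
    pub fn validate(&self) -> Result<(), Vec<String>> {
        let mut errors = Vec::new();
        if !(self.learning_rate.is_finite() && self.learning_rate > 0.0) {
            errors.push("learning_rate must be a positive, finite number".to_string());
        }
        if self.batch_size == 0 {
            errors.push("batch_size must be at least 1".to_string());
        }
        if self.epochs == 0 {
            errors.push("epochs must be at least 1".to_string());
        }
        if errors.is_empty() {
            Ok(())
        } else {
            Err(errors)
        }
    }
}
```

Returning all errors together, instead of stopping at the first, lets the form highlight every invalid field in one pass.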

Tab 4: RuVector

  • Module Configuration:
    • MinCut graph parameters
    • Attention weights
    • Temporal compression ratio
    • Solver interpolation settings
  • Live Testing - apply to real-time CSI stream
  • Comparison - A/B test configurations
  • Export - save optimal config

Tab 5: Metrics

  • Loss Curves - training/validation over epochs
  • Evaluation - PCK, mAP, IoU scores
  • Confusion Matrix - per-joint accuracy
  • Export - CSV, JSON, TensorBoard format
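
For reference, PCK (Percentage of Correct Keypoints) counts a joint as correct when its error is within a fraction of a reference length. The sketch below shows the standard definition for 2D keypoints; the function name `pck` and the choice of reference length are assumptions, since this ADR does not say which normalization the project uses.

```rust
/// Percentage of Correct Keypoints (PCK) for one pose.
/// A joint is correct when its Euclidean error is at most `alpha * norm_len`,
/// where `norm_len` is a reference length such as torso or bounding-box size.
/// Returns `None` for mismatched or empty inputs, or a non-positive `norm_len`.
pub fn pck(pred: &[(f32, f32)], truth: &[(f32, f32)], norm_len: f32, alpha: f32) -> Option<f32> {
    if pred.len() != truth.len() || pred.is_empty() || norm_len <= 0.0 {
        return None;
    }
    let threshold = alpha * norm_len;
    let correct = pred
        .iter()
        .zip(truth)
        .filter(|(p, t)| {
            let (dx, dy) = (p.0 - t.0, p.1 - t.1);
            (dx * dx + dy * dy).sqrt() <= threshold
        })
        .count();
    Some(correct as f32 / pred.len() as f32)
}
```

Keeping a per-joint correct/incorrect flag instead of only the count would give the per-joint accuracy breakdown mentioned above.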

Architecture

Backend (Rust/Tauri)

wifi-densepose-desktop/
├── src/
│   ├── commands/
│   │   ├── training.rs      # NEW: Training job management
│   │   ├── datasets.rs      # NEW: Dataset download/import
│   │   ├── models.rs        # NEW: Model loading/export
│   │   ├── ruvector.rs      # NEW: RuVector config
│   │   └── metrics.rs       # NEW: Metrics retrieval
│   └── domain/
│       ├── training.rs      # Training state machine
│       └── ruvector.rs      # RuVector config types
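
The training state machine in `domain/training.rs` is not specified further in this ADR. One possible shape, with hypothetical state and event names, is an enum with an explicit transition function that rejects invalid moves:

```rust
/// Hypothetical lifecycle states for a training job.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum TrainingState {
    Idle,
    Running,
    Stopping,
    Completed,
    Failed,
}

/// Hypothetical events that drive the state machine.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum TrainingEvent {
    Start,
    StopRequested,
    Finished,
    Errored,
    Reset,
}

impl TrainingState {
    /// Apply one event; invalid transitions are reported instead of ignored.
    pub fn on(self, event: TrainingEvent) -> Result<TrainingState, String> {
        use TrainingEvent::*;
        use TrainingState::*;
        match (self, event) {
            (Idle, Start) => Ok(Running),
            (Running, StopRequested) => Ok(Stopping),
            (Running, Finished) => Ok(Completed),
            // A job that winds down after a stop request returns to Idle.
            (Stopping, Finished) => Ok(Idle),
            (Running, Errored) | (Stopping, Errored) => Ok(Failed),
            (Completed, Reset) | (Failed, Reset) => Ok(Idle),
            (state, event) => Err(format!("{event:?} is not valid in state {state:?}")),
        }
    }
}
```

Making the transition table explicit means a second `start_training` call while a job is running yields an error the UI can display, instead of silently starting a parallel job.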

Frontend (React/TypeScript)

ui/src/pages/
├── Training/
│   ├── index.tsx            # Tab container
│   ├── DatasetsTab.tsx      # Dataset management
│   ├── ModelsTab.tsx        # Model browser
│   ├── TrainingTab.tsx      # Training control
│   ├── RuVectorTab.tsx      # Signal processing config
│   └── MetricsTab.tsx       # Visualization

Tauri Commands

Command              Description
list_datasets        Get available datasets
download_dataset     Download standard dataset
import_dataset       Import custom recordings
list_models          Get model architectures
load_checkpoint      Load model weights
export_model         Export to ONNX
detect_gpu           Check CUDA/Metal
start_training       Begin training job
stop_training        Cancel training
training_progress    Get current status
get_ruvector_config  Load RuVector settings
set_ruvector_config  Update settings
test_ruvector_live   Apply to live CSI
get_metrics          Retrieve training metrics

Event System

Training progress updates are pushed to the frontend via Tauri events:

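use serde::Serialize;
use tauri::Emitter; // Assumes Tauri 2, where `emit` is provided by the `Emitter` trait

/// Payload of the `training:progress` event.
#[derive(Serialize, Clone)]
pub struct TrainingProgress {
    pub epoch: u32,
    pub total_epochs: u32,
    pub batch: u32,
    pub total_batches: u32,
    pub train_loss: f32,
    pub val_loss: Option<f32>,      // None until a validation pass has run
    pub learning_rate: f32,
    pub eta_secs: u64,
    pub gpu_memory_mb: Option<u64>, // None when training on CPU
}

// Emit every batch (`app` is the Tauri app handle)
app.emit("training:progress", progress)?;

// Emit on completion
app.emit("training:complete", result)?;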
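
The `eta_secs` field has to be derived from something. A simple approach, shown here as an assumption and not as the project's method, is to scale the average time per completed batch by the number of batches left. `estimate_eta_secs` is a hypothetical helper, and both batch counts are taken over the whole job (all epochs), not a single epoch.

```rust
use std::time::Duration;

/// Estimate remaining seconds from the average time per completed batch.
/// Returns `None` until at least one batch has finished, or if the counts
/// are inconsistent.
pub fn estimate_eta_secs(elapsed: Duration, done_batches: u64, total_batches: u64) -> Option<u64> {
    if done_batches == 0 || done_batches > total_batches {
        return None;
    }
    let per_batch = elapsed.as_secs_f64() / done_batches as f64;
    let remaining = (total_batches - done_batches) as f64;
    Some((per_batch * remaining).round() as u64)
}
```

For example, 10 of 50 batches finished in 30 seconds gives 3 seconds per batch and 40 batches left, so the estimate is 120 seconds. A moving average over recent batches would react faster to changes in throughput, at the cost of a noisier display.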

Implementation Plan

Phase 1: Foundation (Weeks 1-2)

  1. Create Training page skeleton with tabs
  2. Implement detect_gpu command
  3. Add dataset listing/download commands
  4. Design TypeScript types for all entities

Phase 2: Dataset Management (Week 3)

  1. MM-Fi dataset downloader
  2. Wi-Pose dataset downloader
  3. Custom dataset import (CSV/NPZ)
  4. Dataset preview component

Phase 3: Model Management (Week 4)

  1. Model architecture browser
  2. Checkpoint loading
  3. Model summary display
  4. ONNX export

Phase 4: Training Loop (Weeks 5-6)

  1. Training configuration UI
  2. Background training thread
  3. Progress event emission
  4. Checkpoint auto-save
  5. Training history persistence
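
The background thread and progress emission could be wired together roughly as follows. This is a std-only sketch under stated assumptions: `spawn_training`, `TrainingJob`, and the trimmed `Progress` message are hypothetical, the loop body is a placeholder, and in the real app the receiving side would forward each message as a Tauri event.

```rust
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::mpsc::{self, Receiver};
use std::sync::Arc;
use std::thread::{self, JoinHandle};

/// Trimmed-down progress message (hypothetical).
#[derive(Debug, Clone, PartialEq)]
pub struct Progress {
    pub epoch: u32,
    pub batch: u32,
}

/// Handles the UI side keeps for a running job (hypothetical).
pub struct TrainingJob {
    pub stop: Arc<AtomicBool>,
    pub progress: Receiver<Progress>,
    pub handle: JoinHandle<u32>,
}

/// Spawn a worker off the UI thread that reports after each batch.
/// Joining the handle yields the number of batches that actually ran.
pub fn spawn_training(epochs: u32, batches_per_epoch: u32) -> TrainingJob {
    let stop = Arc::new(AtomicBool::new(false));
    let (tx, rx) = mpsc::channel();
    let stop_flag = Arc::clone(&stop);
    let handle = thread::spawn(move || {
        let mut ran: u32 = 0;
        'outer: for epoch in 0..epochs {
            for batch in 0..batches_per_epoch {
                // Cooperative cancellation: checked between batches.
                if stop_flag.load(Ordering::Relaxed) {
                    break 'outer;
                }
                // A real loop would run the forward/backward pass here.
                ran += 1;
                // Ignore send errors: the receiver may already be gone.
                let _ = tx.send(Progress { epoch, batch });
            }
        }
        ran
    });
    TrainingJob { stop, progress: rx, handle }
}
```

A `stop_training` command would then set `stop` to `true`; because the flag is only checked between batches, cancellation latency is bounded by the duration of one batch.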

Phase 5: RuVector Integration (Week 7)

  1. RuVector config UI
  2. Live CSI testing
  3. A/B comparison mode
  4. Config export/import

Phase 6: Metrics & Polish (Week 8)

  1. Loss curve visualization (Chart.js/Recharts)
  2. Evaluation metrics display
  3. Export functionality
  4. Error handling & edge cases

Risks & Mitigations

Risk                     Probability  Impact  Mitigation
No GPU available         Medium       High    CPU fallback with warning
Large dataset downloads  High         Medium  Resume support, progress UI
Training crashes         Medium       High    Checkpoint recovery, error reporting
Memory exhaustion        Low          High    Batch size auto-tuning
UI blocking              Medium       High    All training in background thread
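
The ADR does not say how batch size auto-tuning would work. One common strategy, assumed here purely for illustration, is to halve the batch size after each out-of-memory failure until a trial step succeeds. `auto_tune_batch_size` and `StepError` are hypothetical, and `try_step` stands in for a real trial training step.

```rust
/// Failure modes of one trial training step (hypothetical).
#[derive(Debug, PartialEq)]
pub enum StepError {
    OutOfMemory,
    Other(String),
}

/// Halve the batch size after each out-of-memory failure until a step
/// succeeds or `min_batch` is reached. Any other error aborts immediately.
pub fn auto_tune_batch_size<F>(start: u32, min_batch: u32, mut try_step: F) -> Result<u32, StepError>
where
    F: FnMut(u32) -> Result<(), StepError>,
{
    let floor = min_batch.max(1);
    let mut batch = start.max(floor);
    loop {
        match try_step(batch) {
            Ok(()) => return Ok(batch),
            // Only retry while there is still room to shrink.
            Err(StepError::OutOfMemory) if batch > floor => {
                batch = (batch / 2).max(floor);
            }
            Err(e) => return Err(e),
        }
    }
}
```

Starting from 128 with a memory limit that fits at most 24 samples, the trials run at 128, 64, 32, and 16, and the function settles on 16. If even `min_batch` does not fit, the out-of-memory error is returned so the UI can report it.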

Success Criteria

  1. User can download MM-Fi dataset from UI
  2. User can start training with GPU detection
  3. Live progress updates without UI freeze
  4. Training can be stopped and later resumed from the last saved checkpoint
  5. RuVector config changes apply to live CSI
  6. Metrics display updates in real-time
  7. Models can be exported to ONNX

Alternatives Considered

1. Separate Training App

  • Rejected: Fragments user experience, duplicates code

2. Web-based Training Dashboard

  • Rejected: Requires server, no offline support

3. CLI-only Training

  • Rejected: Poor UX for non-technical users

References