wifi-densepose/vendor/sublinear-time-solver/crates/temporal-compare/RESULTS.md

2.2 KiB

Temporal-Compare Benchmark Results

Test Configuration

  • Dataset: Synthetic temporal data with Gaussian noise
  • Window size: 32 time steps
  • Task: Time-R1 style temporal prediction

Results Summary

Regression Task (MSE - Lower is Better)

Backend Train Size Epochs MSE (Val) MSE (Test)
Baseline N/A N/A N/A 0.1120
MLP 2000 15 0.1375 0.1281
MLP 5000 20 0.1722 0.1424

Classification Task (Accuracy - Higher is Better)

Backend Train Size Epochs Accuracy
Baseline N/A N/A 0.6467
MLP 2000 15 0.3700
MLP 1000 10 0.1667

Key Observations

  1. Baseline Performance: The naive baseline (predicting last value in window) performs surprisingly well:

    • MSE: ~0.11
    • Accuracy: ~65-70%
  2. MLP Challenges: The simplified MLP without full backpropagation shows:

    • Regression: Competitive with baseline (MSE: 0.128 vs 0.112)
    • Classification: Underperforms baseline significantly (37% vs 65%)
  3. Training Dynamics:

    • Lower learning rates (0.001) improve stability
    • More epochs don't always improve performance
    • The simplified SGD approach limits learning capacity

Architecture Details

MLP Implementation

  • Architecture: Input(32) → Hidden(64) → Output(1 or 3)
  • Activation: ReLU
  • Training: Simplified SGD with numerical gradient approximation
  • Weight Init: Xavier/He initialization

Baseline

  • Strategy: Returns last value in temporal window
  • Classification: Maps continuous values to 3 classes via thresholds

Compilation Features

Successfully builds with all backends:

  • baseline: Always available
  • mlp: Native Rust implementation
  • ruv-fann: Feature-gated, compiles successfully

Future Improvements

  1. Implement full backpropagation for better gradient flow
  2. Add momentum and adaptive learning rates
  3. Implement proper cross-entropy loss for classification
  4. Add validation-based early stopping
  5. Integrate actual ruv-fann backend implementation