wifi-densepose/vendor/sublinear-time-solver/crates/temporal-compare/RESULTS.md

62 lines
2.2 KiB
Markdown

# Temporal-Compare Benchmark Results
## Test Configuration
- Dataset: Synthetic temporal data with Gaussian noise
- Window size: 32 time steps
- Task: Time-R1 style temporal prediction
## Results Summary
### Regression Task (MSE - Lower is Better)
| Backend | Train Size | Epochs | MSE (Val) | MSE (Test) |
|----------|------------|--------|-----------|------------|
| Baseline | N/A | N/A | N/A | 0.1120 |
| MLP | 2000 | 15 | 0.1375 | 0.1281 |
| MLP | 5000 | 20 | 0.1722 | 0.1424 |
### Classification Task (Accuracy - Higher is Better)
| Backend | Train Size | Epochs | Accuracy |
|----------|------------|--------|-----------|
| Baseline | N/A | N/A | 0.6467 |
| MLP | 2000 | 15 | 0.3700 |
| MLP | 1000 | 10 | 0.1667 |
## Key Observations
1. **Baseline Performance**: The naive baseline (predicting last value in window) performs surprisingly well:
- MSE: ~0.11
- Accuracy: ~65-70%
2. **MLP Challenges**: The simplified MLP without full backpropagation shows:
- Regression: Competitive with baseline (MSE: 0.128 vs 0.112)
- Classification: Underperforms baseline significantly (37% vs 65%)
3. **Training Dynamics**:
- Lower learning rates (0.001) improve stability
- More epochs don't always improve performance
- The simplified SGD approach limits learning capacity
## Architecture Details
### MLP Implementation
- Architecture: Input(32) → Hidden(64) → Output(1 or 3)
- Activation: ReLU
- Training: Simplified SGD with numerical gradient approximation
- Weight Init: Xavier/He initialization
### Baseline
- Strategy: Returns last value in temporal window
- Classification: Maps continuous values to 3 classes via thresholds
## Compilation Features
✅ Successfully builds with all backends:
- `baseline`: Always available
- `mlp`: Native Rust implementation
- `ruv-fann`: Feature-gated, compiles successfully
## Future Improvements
1. Implement full backpropagation for better gradient flow
2. Add momentum and adaptive learning rates
3. Implement proper cross-entropy loss for classification
4. Add validation-based early stopping
5. Integrate actual ruv-fann backend implementation