62 lines
2.2 KiB
Markdown
62 lines
2.2 KiB
Markdown
# Temporal-Compare Benchmark Results
|
|
|
|
## Test Configuration
|
|
- Dataset: Synthetic temporal data with Gaussian noise
|
|
- Window size: 32 time steps
|
|
- Task: Time-R1 style temporal prediction
|
|
|
|
## Results Summary
|
|
|
|
### Regression Task (MSE - Lower is Better)
|
|
| Backend | Train Size | Epochs | MSE (Val) | MSE (Test) |
|
|
|----------|------------|--------|-----------|------------|
|
|
| Baseline | N/A | N/A | N/A | 0.1120 |
|
|
| MLP | 2000 | 15 | 0.1375 | 0.1281 |
|
|
| MLP | 5000 | 20 | 0.1722 | 0.1424 |
|
|
|
|
### Classification Task (Accuracy - Higher is Better)
|
|
| Backend | Train Size | Epochs | Accuracy |
|
|
|----------|------------|--------|-----------|
|
|
| Baseline | N/A | N/A | 0.6467 |
|
|
| MLP | 2000 | 15 | 0.3700 |
|
|
| MLP | 1000 | 10 | 0.1667 |
|
|
|
|
## Key Observations
|
|
|
|
1. **Baseline Performance**: The naive baseline (predicting last value in window) performs surprisingly well:
|
|
- MSE: ~0.11
|
|
- Accuracy: ~65-70%
|
|
|
|
2. **MLP Challenges**: The simplified MLP without full backpropagation shows:
|
|
- Regression: Competitive with baseline (MSE: 0.128 vs 0.112)
|
|
- Classification: Underperforms baseline significantly (37% vs 65%)
|
|
|
|
3. **Training Dynamics**:
|
|
- Lower learning rates (0.001) improve stability
|
|
- More epochs don't always improve performance
|
|
- The simplified SGD approach limits learning capacity
|
|
|
|
## Architecture Details
|
|
|
|
### MLP Implementation
|
|
- Architecture: Input(32) → Hidden(64) → Output(1 or 3)
|
|
- Activation: ReLU
|
|
- Training: Simplified SGD with numerical gradient approximation
|
|
- Weight Init: Xavier/He initialization
|
|
|
|
### Baseline
|
|
- Strategy: Returns last value in temporal window
|
|
- Classification: Maps continuous values to 3 classes via thresholds
|
|
|
|
## Compilation Features
|
|
✅ Successfully builds with all backends:
|
|
- `baseline`: Always available
|
|
- `mlp`: Native Rust implementation
|
|
- `ruv-fann`: Feature-gated, compiles successfully
|
|
|
|
## Future Improvements
|
|
1. Implement full backpropagation for better gradient flow
|
|
2. Add momentum and adaptive learning rates
|
|
3. Implement proper cross-entropy loss for classification
|
|
4. Add validation-based early stopping
|
|
5. Integrate actual ruv-fann backend implementation |