# Comprehensive Benchmark Guide
This guide covers all benchmarks for the Midstream workspace's 6 production crates.
## Overview
All benchmarks use Criterion.rs for statistical analysis and HTML report generation. Each crate has comprehensive benchmarks targeting specific performance goals.
## Running Benchmarks
### Run All Benchmarks
```bash
cargo bench
```
### Run Specific Crate Benchmarks
```bash
# Temporal Compare (DTW, LCS, Edit Distance)
cargo bench --bench temporal_bench
# Nanosecond Scheduler
cargo bench --bench scheduler_bench
# Temporal Attractor Studio
cargo bench --bench attractor_bench
# Temporal Neural Solver
cargo bench --bench solver_bench
# Strange Loop (Meta-Learning)
cargo bench --bench meta_bench
# QUIC Multistream
cargo bench --bench quic_bench
```
### Run Specific Benchmark Groups
```bash
# DTW performance tests
cargo bench --bench temporal_bench dtw
# Scheduler overhead tests
cargo bench --bench scheduler_bench overhead
# Phase space embedding
cargo bench --bench attractor_bench embedding
```
## Performance Targets
### 1. Temporal Compare (`temporal_bench.rs`)
**Targets:**
- DTW n=100: <10ms
- LCS n=100: <5ms
- Edit distance n=100: <3ms
- Cache hit: <1μs
**Benchmark Groups:**
```rust
dtw_benches // DTW performance across sizes
lcs_benches // LCS algorithm performance
edit_benches // Edit distance operations
cache_benches // Cache hit/miss scenarios
memory_benches // Memory allocation patterns
```
**Key Metrics:**
- Throughput (elements/second)
- Mean execution time
- Standard deviation
- Memory allocations
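To make the `DTW n=100` target concrete, the following is a minimal sketch of the kind of O(n·m) dynamic program such a benchmark exercises. It is not the crate's implementation: the function name `dtw_distance` and the absolute-difference local cost are assumptions made for illustration.

```rust
/// Dynamic time warping distance between two series.
/// Hypothetical reference version; local cost is |a - b| (an assumption).
pub fn dtw_distance(a: &[f64], b: &[f64]) -> f64 {
    if a.is_empty() || b.is_empty() {
        return f64::INFINITY;
    }
    // Two rolling rows of the cost matrix keep memory at O(m) instead of O(n*m).
    let mut prev = vec![f64::INFINITY; b.len() + 1];
    let mut curr = vec![f64::INFINITY; b.len() + 1];
    prev[0] = 0.0;
    for &x in a {
        curr[0] = f64::INFINITY;
        for (j, &y) in b.iter().enumerate() {
            let cost = (x - y).abs();
            // Cheapest predecessor: diagonal match, step in `a`, or step in `b`.
            let best = prev[j].min(prev[j + 1]).min(curr[j]);
            curr[j + 1] = cost + best;
        }
        std::mem::swap(&mut prev, &mut curr);
    }
    prev[b.len()]
}
```

With two inputs of length 100 the inner loop body runs 10,000 times per call, which is the work the `<10ms` target bounds.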
### 2. Nanosecond Scheduler (`scheduler_bench.rs`)
**Targets:**
- Schedule overhead: <100ns
- Task execution: <1μs
- Stats calculation: <10μs
- Multi-threaded scaling
**Benchmark Groups:**
```rust
overhead_benches // Schedule operation overhead
latency_benches // Task execution latency
queue_benches // Priority queue operations
stats_benches // Statistics calculation
threading_benches // Multi-threaded scenarios
```
**Key Scenarios:**
- High/low contention
- Priority variations
- Batch operations
- Concurrent scheduling
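Targets such as `<100ns` sit close to the resolution of the system clock, so a single call cannot be timed directly; the cost has to be averaged over many iterations. The sketch below shows that idea with the standard library only. The helper names `mean_ns_per_op` and `heap_push_pop_ns` are hypothetical, and a `BinaryHeap` stands in for the scheduler's priority queue, whose real type is not shown in this guide.

```rust
use std::collections::BinaryHeap;
use std::hint::black_box;
use std::time::Instant;

/// Time `iters` calls of `op` and return the mean cost per call in nanoseconds.
pub fn mean_ns_per_op<T>(iters: u32, mut op: impl FnMut() -> T) -> f64 {
    assert!(iters > 0, "need at least one iteration");
    let start = Instant::now();
    for _ in 0..iters {
        // black_box keeps the optimizer from discarding the result.
        black_box(op());
    }
    start.elapsed().as_nanos() as f64 / f64::from(iters)
}

/// Mean cost of one push followed by one pop on a queue holding 1024 entries.
/// The BinaryHeap is a stand-in, not the scheduler crate's queue.
pub fn heap_push_pop_ns(iters: u32) -> f64 {
    let mut queue: BinaryHeap<u64> = (0..1024).collect();
    let mut next = 0u64;
    mean_ns_per_op(iters, || {
        // Cheap pseudo-random priorities so the heap does real reordering work.
        next = next
            .wrapping_mul(6364136223846793005)
            .wrapping_add(1442695040888963407);
        queue.push(black_box(next >> 32));
        queue.pop()
    })
}
```

Criterion applies the same averaging idea with warm-up and statistical analysis on top, so this hand-rolled loop is only useful as a quick sanity check.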
### 3. Temporal Attractor Studio (`attractor_bench.rs`)
**Targets:**
- Phase space embedding: <20ms (n=1000)
- Lyapunov calculation: <500ms
- Attractor detection: <100ms
- Dimension estimation
**Benchmark Groups:**
```rust
embedding_benches // Phase space reconstruction
lyapunov_benches // Lyapunov exponent calculation
detection_benches // Attractor type detection
trajectory_benches // Trajectory analysis
dimension_benches // Dimension estimation
chaos_benches // Chaos detection
pipeline_benches // Complete analysis pipeline
```
**Test Attractors:**
- Lorenz attractor
- Rössler attractor
- Hénon map
- Periodic signals
- Random data
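As an illustration of how such test data can be produced, here is a minimal sketch that generates a Lorenz x-series and reconstructs a phase space from it with time-delay embedding. The function names, the fixed-step Euler integrator, the classic parameters (σ=10, ρ=28, β=8/3), and the starting point are all assumptions for the example; the benchmark suite may generate its inputs differently.

```rust
/// x-coordinate of the Lorenz system, integrated with a fixed-step Euler scheme.
/// Parameters and the initial state (1, 1, 1) are illustrative choices.
pub fn lorenz_x(n: usize, dt: f64) -> Vec<f64> {
    let (sigma, rho, beta) = (10.0, 28.0, 8.0 / 3.0);
    let (mut x, mut y, mut z) = (1.0_f64, 1.0_f64, 1.0_f64);
    let mut out = Vec::with_capacity(n);
    for _ in 0..n {
        out.push(x);
        let dx = sigma * (y - x);
        let dy = x * (rho - z) - y;
        let dz = x * y - beta * z;
        x += dt * dx;
        y += dt * dy;
        z += dt * dz;
    }
    out
}

/// Time-delay embedding: row i is [s[i], s[i + tau], ..., s[i + (dim - 1) * tau]].
/// Returns an empty result when the series is too short or a parameter is zero.
pub fn delay_embed(series: &[f64], dim: usize, tau: usize) -> Vec<Vec<f64>> {
    if dim == 0 || tau == 0 {
        return Vec::new();
    }
    let span = (dim - 1) * tau;
    if series.len() <= span {
        return Vec::new();
    }
    (0..series.len() - span)
        .map(|i| (0..dim).map(|k| series[i + k * tau]).collect())
        .collect()
}
```

Embedding a series of length n with dimension d and delay τ yields n − (d − 1)·τ points, so the n=1000 embedding target operates on slightly fewer than 1000 vectors.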
### 4. Temporal Neural Solver (`solver_bench.rs`)
**Targets:**
- Formula encoding: <10ms
- Verification: <100ms
- Parsing: <5ms
- State checking: <1μs
**Benchmark Groups:**
```rust
encoding_benches // LTL formula encoding
parsing_benches // Formula parsing
verification_benches // Trace verification
state_benches // State operations
neural_benches // Neural verification
operator_benches // Temporal operators
pipeline_benches // Complete pipeline
```
**LTL Operations:**
- Next (X)
- Globally (G)
- Finally (F)
- Until (U)
- Boolean combinations
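For readers unfamiliar with these operators, the sketch below evaluates them over a finite trace. The `Ltl` type and `holds` function are hypothetical and are not the solver crate's API; the finite-trace semantics (for example, `Next` is false at the last state) are an assumption chosen to keep the example small.

```rust
/// Hypothetical formula type for illustration only.
#[derive(Debug, Clone)]
pub enum Ltl {
    /// Index of a boolean proposition within each state.
    Atom(usize),
    Not(Box<Ltl>),
    And(Box<Ltl>, Box<Ltl>),
    Next(Box<Ltl>),
    Globally(Box<Ltl>),
    Finally(Box<Ltl>),
    Until(Box<Ltl>, Box<Ltl>),
}

/// Check whether `f` holds at position `i` of a finite trace.
/// Each state is a vector of proposition values; missing entries count as false.
pub fn holds(f: &Ltl, trace: &[Vec<bool>], i: usize) -> bool {
    match f {
        Ltl::Atom(p) => trace
            .get(i)
            .map_or(false, |state| state.get(*p).copied().unwrap_or(false)),
        Ltl::Not(g) => !holds(g, trace, i),
        Ltl::And(a, b) => holds(a, trace, i) && holds(b, trace, i),
        // Strong next: there must be a successor state.
        Ltl::Next(g) => i + 1 < trace.len() && holds(g, trace, i + 1),
        Ltl::Globally(g) => (i..trace.len()).all(|k| holds(g, trace, k)),
        Ltl::Finally(g) => (i..trace.len()).any(|k| holds(g, trace, k)),
        // `a U b`: b holds at some k >= i and a holds at every position before k.
        Ltl::Until(a, b) => (i..trace.len())
            .any(|k| holds(b, trace, k) && (i..k).all(|j| holds(a, trace, j))),
    }
}
```

This naive evaluator re-checks subformulas at every position, so nested temporal operators multiply the cost by the trace length at each level. That growth is the kind of cost the verification benchmarks are meant to track.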
### 5. Strange Loop (`meta_bench.rs`)
**Targets:**
- Meta-learning iteration: <50ms
- Pattern extraction: <20ms
- Integration overhead: <100ms
- Recursive optimization
**Benchmark Groups:**
```rust
learning_benches // Meta-learning iteration
pattern_benches // Pattern extraction/matching
hierarchy_benches // Multi-level learning
integration_benches // Cross-crate integration
recursive_benches // Self-referential operations
pipeline_benches // Complete meta-learning cycle
```
**Integration Tests:**
- With temporal-compare (DTW)
- With nanosecond-scheduler
- With attractor-studio
- Cross-crate overhead
### 6. QUIC Multistream (`quic_bench.rs`)
**Targets:**
- Stream establishment: <1ms
- Multiplexing overhead: <100μs
- Throughput: >1GB/s
- Connection setup: <10ms
## Benchmark Configuration
### Criterion Settings
Each benchmark group uses optimized Criterion configuration:
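```rust
use criterion::{criterion_group, criterion_main, Criterion};
use std::time::Duration;

criterion_group! {
    name = benches;
    config = Criterion::default()
        .sample_size(100)                           // Statistical samples
        .measurement_time(Duration::from_secs(10))  // Per benchmark
        .warm_up_time(Duration::from_secs(3));      // Warmup period
    targets = ... // the crate's benchmark functions
}
criterion_main!(benches);
```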
### Custom Configurations
**Fast benchmarks** (overhead, parsing):
- sample_size: 500-1000
- measurement_time: 5s
**Slow benchmarks** (neural, integration):
- sample_size: 30-50
- measurement_time: 15s
## Understanding Results
### HTML Reports
After running benchmarks, view results at:
```
target/criterion/[benchmark_name]/report/index.html
```
### Key Metrics
1. **Mean**: Average execution time
2. **Std Dev**: Consistency indicator
3. **Median**: Central tendency
4. **MAD**: Median Absolute Deviation
5. **Throughput**: Operations per second
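The definitions behind these numbers are short enough to write out. The sketch below computes them from a slice of per-iteration timings using the textbook formulas; the function names are hypothetical, and reporting tools may apply extra steps (such as bootstrapping or a scaling constant on the MAD) that are not shown here.

```rust
use std::time::Duration;

/// Arithmetic mean. Expects a non-empty slice.
pub fn mean(xs: &[f64]) -> f64 {
    xs.iter().sum::<f64>() / xs.len() as f64
}

/// Sample standard deviation (n - 1 denominator). Expects at least two samples.
pub fn std_dev(xs: &[f64]) -> f64 {
    let m = mean(xs);
    let var = xs.iter().map(|x| (x - m).powi(2)).sum::<f64>() / (xs.len() - 1) as f64;
    var.sqrt()
}

/// Median of a non-empty slice; averages the two middle values for even lengths.
pub fn median(xs: &[f64]) -> f64 {
    let mut v = xs.to_vec();
    v.sort_by(f64::total_cmp);
    let mid = v.len() / 2;
    if v.len() % 2 == 0 {
        (v[mid - 1] + v[mid]) / 2.0
    } else {
        v[mid]
    }
}

/// Median absolute deviation: the median of |x - median(xs)|, unscaled.
pub fn mad(xs: &[f64]) -> f64 {
    let m = median(xs);
    let deviations: Vec<f64> = xs.iter().map(|x| (x - m).abs()).collect();
    median(&deviations)
}

/// Throughput in elements per second for a given mean time per iteration.
pub fn throughput(elements: u64, mean_time: Duration) -> f64 {
    elements as f64 / mean_time.as_secs_f64()
}
```

For the samples `[1, 2, 3, 4, 100]` the mean is 22 while the median is 3 and the MAD is 1, which shows why the median and MAD are the more robust pair when a run contains a single slow outlier.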
### Regression Detection
Criterion compares each run against the previous run, or against a named baseline when one is given, and reports the change:
- Green: Performance improved
- Yellow: Within noise threshold
- Red: Performance regressed
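The three outcomes can be pictured as a threshold test on the relative change in mean time. The sketch below is a deliberately simplified model with hypothetical names (`Verdict`, `classify`); Criterion's own analysis also involves resampling and a significance test, which are omitted here.

```rust
/// Outcome of comparing a new measurement with a baseline.
#[derive(Debug, PartialEq, Eq, Clone, Copy)]
pub enum Verdict {
    Improved,
    WithinNoise,
    Regressed,
}

/// Classify a change by its relative size.
/// `noise_threshold` is a fraction, e.g. 0.01 for 1 percent.
pub fn classify(baseline_ns: f64, current_ns: f64, noise_threshold: f64) -> Verdict {
    let relative_change = (current_ns - baseline_ns) / baseline_ns;
    if relative_change < -noise_threshold {
        Verdict::Improved // faster by more than the threshold
    } else if relative_change > noise_threshold {
        Verdict::Regressed // slower by more than the threshold
    } else {
        Verdict::WithinNoise
    }
}
```

A small threshold makes the check sensitive but noisy on shared machines, so the value is worth tuning to the variance actually observed on the benchmark host.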
## Profiling Integration
### With perf
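```bash
# --profile-time runs each benchmark for about 10s and skips Criterion's analysis,
# so the recorded profile is dominated by the benchmarked code
perf record -g cargo bench --bench temporal_bench -- --profile-time 10
perf report
```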
### With flamegraph
```bash
cargo install flamegraph
cargo flamegraph --bench temporal_bench
```
### With valgrind (memory)
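```bash
# Build the benchmark binary without running it; Cargo prints the executable path
cargo bench --bench temporal_bench --no-run
# Bench binaries live under target/release/deps/ with a hash suffix;
# --bench makes the Criterion harness run in benchmark mode
valgrind --tool=cachegrind target/release/deps/temporal_bench-<hash> --bench --profile-time 10
```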
## Best Practices
### 1. Consistent Environment
- Close other applications
- Disable CPU frequency scaling
- Use consistent power settings
- Run multiple times
### 2. Baseline Establishment
```bash
# Create baseline
cargo bench -- --save-baseline main
# Compare against baseline
git checkout feature-branch
cargo bench -- --baseline main
```
### 3. Statistical Validity
- Use at least 30 samples as a rule of thumb; fewer samples give wider confidence intervals
- Watch for outliers (high std dev)
- Multiple runs for consistency
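One common way to watch for outliers is Tukey's fence rule: a sample is suspicious when it lies more than 1.5 interquartile ranges outside the middle half of the data. The sketch below is a minimal version with hypothetical names (`percentile`, `tukey_outliers`) and linear-interpolation quartiles, which is only one of several quartile conventions.

```rust
/// Percentile of an already sorted, non-empty slice using linear interpolation.
fn percentile(sorted: &[f64], p: f64) -> f64 {
    let rank = p * (sorted.len() - 1) as f64;
    let lo = rank.floor() as usize;
    let hi = rank.ceil() as usize;
    let frac = rank - lo as f64;
    sorted[lo] + (sorted[hi] - sorted[lo]) * frac
}

/// Return the samples outside [Q1 - 1.5 * IQR, Q3 + 1.5 * IQR].
/// Fewer than four samples give no meaningful quartiles, so nothing is flagged.
pub fn tukey_outliers(samples: &[f64]) -> Vec<f64> {
    if samples.len() < 4 {
        return Vec::new();
    }
    let mut sorted = samples.to_vec();
    sorted.sort_by(f64::total_cmp);
    let q1 = percentile(&sorted, 0.25);
    let q3 = percentile(&sorted, 0.75);
    let iqr = q3 - q1;
    let (low, high) = (q1 - 1.5 * iqr, q3 + 1.5 * iqr);
    samples
        .iter()
        .copied()
        .filter(|x| *x < low || *x > high)
        .collect()
}
```

A run in which a noticeable share of samples is flagged usually points to environmental noise rather than to the code under test, and is a good reason to repeat the measurement.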
### 4. Realistic Data
- Use production-like data sizes
- Include edge cases
- Test boundary conditions
- Vary input patterns
## CI/CD Integration
### GitHub Actions
```yaml
- name: Run benchmarks
  run: cargo bench --no-fail-fast
- name: Upload benchmark results
  uses: actions/upload-artifact@v4
  with:
    name: benchmark-results
    path: target/criterion/
```
### Performance Tracking
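Store baseline results in the repo. Cargo's `target/` directory is usually listed in `.gitignore`, so the files have to be force-added:
```bash
git add -f target/criterion/*/base/
git commit -m "Update benchmark baselines"
```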
## Optimization Workflow
1. **Identify bottlenecks**: Run benchmarks, check reports
2. **Profile**: Use flamegraph/perf for hotspots
3. **Optimize**: Make targeted improvements
4. **Verify**: Re-run benchmarks
5. **Compare**: Check against baseline
6. **Document**: Update if targets change
## Common Issues
### High Variance
- System load too high
- Thermal throttling
- Background processes
- Insufficient samples
**Solution**: Increase sample size, close applications, check CPU frequency.
### Unexpected Regressions
- Compiler version changes
- Dependency updates
- System configuration
- Measurement noise
**Solution**: Compare multiple runs, check git diff, validate hardware.
### Memory Benchmarks Inconsistent
- GC timing (if applicable)
- Allocator behavior
- Page faults
- Cache effects
**Solution**: Increase warmup time, use fixed heap size, minimize allocations.
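To see whether a code path allocates at all, allocation counts can be observed directly with a counting global allocator. The sketch below uses only the standard library; `CountingAlloc`, `ALLOCATIONS`, and `count_allocations` are hypothetical names, and the counter is process-wide, so allocations made by other threads during the measured closure are included.

```rust
use std::alloc::{GlobalAlloc, Layout, System};
use std::sync::atomic::{AtomicUsize, Ordering};

/// Wraps the system allocator and counts every allocation request.
pub struct CountingAlloc;

/// Process-wide number of allocations made so far.
pub static ALLOCATIONS: AtomicUsize = AtomicUsize::new(0);

unsafe impl GlobalAlloc for CountingAlloc {
    unsafe fn alloc(&self, layout: Layout) -> *mut u8 {
        ALLOCATIONS.fetch_add(1, Ordering::Relaxed);
        // Delegate the actual work to the system allocator.
        unsafe { System.alloc(layout) }
    }

    unsafe fn dealloc(&self, ptr: *mut u8, layout: Layout) {
        unsafe { System.dealloc(ptr, layout) }
    }
}

#[global_allocator]
static GLOBAL: CountingAlloc = CountingAlloc;

/// Run `f` and return its result together with the number of allocations
/// observed while it ran.
pub fn count_allocations<T>(f: impl FnOnce() -> T) -> (T, usize) {
    let before = ALLOCATIONS.load(Ordering::Relaxed);
    let out = f();
    let after = ALLOCATIONS.load(Ordering::Relaxed);
    (out, after - before)
}
```

A count that differs between otherwise identical runs is a hint that the variance comes from allocation behavior rather than from the algorithm itself.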
## Future Enhancements
- [ ] Continuous benchmark tracking
- [ ] Performance regression alerts
- [ ] Cross-platform comparison
- [ ] Memory profiling integration
- [ ] Automated optimization suggestions
- [ ] Benchmark result visualization
- [ ] Historical trend analysis
## Resources
- [Criterion.rs Documentation](https://bheisler.github.io/criterion.rs/book/)
- [Rust Performance Book](https://nnethercote.github.io/perf-book/)
- [Linux perf Tutorial](https://perf.wiki.kernel.org/index.php/Tutorial)
---
**Summary**: All 6 crates now have comprehensive benchmarks covering core functionality, edge cases, and integration scenarios. Total ~2,800 lines of benchmark code targeting specific performance goals for each crate.