wifi-densepose/vendor/midstream/benches/README.md

410 lines
9.9 KiB
Markdown
Raw Blame History

This file contains ambiguous Unicode characters

This file contains Unicode characters that might be confused with other characters. If you think that this is intentional, you can safely ignore this warning. Use the Escape button to reveal them.

# Midstream Benchmarks
Comprehensive performance benchmarks for all 6 crates in the Midstream workspace.
## 📊 Quick Start
### Run All Benchmarks
```bash
./scripts/run_benchmarks.sh
```
### Run Individual Benchmarks
```bash
# Temporal Compare (DTW, LCS, Edit Distance)
cargo bench --bench temporal_bench
# Nanosecond Scheduler
cargo bench --bench scheduler_bench
# Temporal Attractor Studio
cargo bench --bench attractor_bench
# Temporal Neural Solver
cargo bench --bench solver_bench
# Strange Loop (Meta-Learning)
cargo bench --bench meta_bench
# QUIC Multistream
cargo bench --bench quic_bench
```
### Compare Branches
```bash
./scripts/benchmark_comparison.sh main feature-branch
```
## 📁 Benchmark Files
### 1. `temporal_bench.rs` - Temporal Compare Benchmarks
**Coverage:**
- DTW (Dynamic Time Warping) performance
- LCS (Longest Common Subsequence) algorithms
- Edit distance calculations
- Cache hit/miss scenarios
- Memory allocation patterns
**Performance Targets:**
- DTW n=100: <10ms
- LCS n=100: <5ms
- Edit distance n=100: <3ms
- Cache hit: <1μs
**Test Scenarios:**
```
dtw_performance/ # DTW across various sizes (10-1000)
dtw_similarity/ # Similarity variations (50%-99%)
lcs_performance/ # LCS with identical/similar/different sequences
edit_distance_ops/ # Insertions, deletions, substitutions
cache_scenarios/ # Cache hits, misses, eviction
memory_allocation/ # Small, large, repeated allocations
```
**Lines:** ~450
---
### 2. `scheduler_bench.rs` - Nanosecond Scheduler Benchmarks
**Coverage:**
- Schedule operation overhead
- Task execution latency
- Priority queue operations
- Statistics calculation
- Multi-threaded scheduling
- Contention scenarios
**Performance Targets:**
- Schedule overhead: <100ns
- Task execution: <1μs
- Stats calculation: <10μs
**Test Scenarios:**
```
schedule_overhead/ # Single, batch, priority scheduling
schedule_priorities/ # Critical, high, normal, low
execution_latency/ # Minimal, light, medium, heavy compute
execution_throughput/ # 10-1000 tasks
priority_queue/ # Push, pop, mixed operations
statistics/ # Stats collection with varying history
multithreaded/ # 1, 2, 4, 8 threads
contention/ # High/low contention scenarios
```
**Lines:** ~520
---
### 3. `attractor_bench.rs` - Temporal Attractor Studio Benchmarks
**Coverage:**
- Phase space embedding
- Lyapunov exponent calculation
- Attractor detection
- Trajectory analysis
- Dimension estimation
- Chaos detection
**Performance Targets:**
- Phase space n=1000: <20ms
- Lyapunov calculation: <500ms
- Attractor detection: <100ms
**Test Scenarios:**
```
phase_space_embedding/ # Dimensions 2, 3, 5 with varying delays
embedding_delays/ # Delays 1-50
lyapunov_exponent/ # Lorenz, Rössler, periodic signals
attractor_detection/ # Known attractors, varying sizes
trajectory_analysis/ # Reconstruction, distances, neighbors
dimension_estimation/ # Correlation dimension, varying samples
chaos_detection/ # Chaotic, periodic, random signals
complete_pipeline/ # Full analysis workflow
```
**Test Attractors:**
- Lorenz attractor
- Rössler attractor
- Hénon map
- Periodic signals
- Random data
**Lines:** ~480
---
### 4. `solver_bench.rs` - Temporal Neural Solver Benchmarks
**Coverage:**
- LTL formula encoding
- Formula parsing
- Trace verification
- State operations
- Neural network verification
- Temporal logic operators
**Performance Targets:**
- Formula encoding: <10ms
- Verification: <100ms
- Formula parsing: <5ms
- State checking: <1μs
**Test Scenarios:**
```
formula_encoding/ # Simple, complex, safety, liveness, nested
formula_parsing/ # Various LTL formula types
trace_verification/ # Simple/complex formulas, varying trace lengths
verification_outcomes/ # Satisfying vs violating traces
state_operations/ # Creation, checking, comparison, trace ops
neural_verification/ # Encoding, inference, training overhead
temporal_operators/ # Next, Globally, Finally, Until
complete_pipeline/ # Parse → Encode → Verify
```
**LTL Operators:**
- Next (X)
- Globally (G)
- Finally (F)
- Until (U)
- Boolean combinations (∧, , →)
**Lines:** ~490
---
### 5. `meta_bench.rs` - Strange Loop Benchmarks
**Coverage:**
- Meta-learning iterations
- Pattern extraction
- Multi-level learning hierarchies
- Cross-crate integration
- Self-referential operations
- Recursive optimization
**Performance Targets:**
- Meta-learning iteration: <50ms
- Pattern extraction: <20ms
- Integration overhead: <100ms
**Test Scenarios:**
```
meta_learning/ # Simple, complex, varying batch sizes
incremental_learning/ # Progressive, with forgetting
pattern_extraction/ # Simple/complex patterns, varying sizes
pattern_matching/ # Single, batch matching
multi_level_learning/ # 2-5 level hierarchies
level_transition/ # Bottom-up, top-down propagation
cross_crate/ # Integration with other crates
self_referential/ # Self-improvement, meta-patterns
recursive_opt/ # Varying recursion depths
complete_pipeline/ # Full meta-learning cycle
```
**Integration Tests:**
- temporal-compare (DTW)
- nanosecond-scheduler
- attractor-studio
- Cross-crate overhead measurement
**Lines:** ~500
---
### 6. `quic_bench.rs` - QUIC Multistream Benchmarks
**Coverage:**
- Stream establishment
- Multiplexing operations
- Connection setup
- Throughput measurement
- Concurrent streams
- Error handling
**Performance Targets:**
- Stream establishment: <1ms
- Multiplexing overhead: <100μs
- Throughput: >1GB/s ✓
- Connection setup: <10ms
**Test Scenarios:**
```
stream_establishment/ # Single, concurrent streams
multiplexing/ # Overhead, fairness, priority
connection_setup/ # Handshake, TLS, QUIC params
throughput/ # Small, large, streaming data
concurrent_streams/ # 1-100 concurrent streams
error_scenarios/ # Timeout, disconnect, recovery
```
**Lines:** ~420 (already created)
---
## 🎯 Performance Summary
| Benchmark | Target | Status |
|-----------|--------|--------|
| **Temporal Compare** | | |
| DTW n=100 | <10ms | |
| LCS n=100 | <5ms | |
| Edit distance n=100 | <3ms | |
| **Scheduler** | | |
| Schedule overhead | <100ns | |
| Task execution | <1μs | |
| Stats calculation | <10μs | |
| **Attractor Studio** | | |
| Phase space n=1000 | <20ms | |
| Lyapunov calc | <500ms | |
| Detection | <100ms | |
| **Neural Solver** | | |
| Formula encoding | <10ms | |
| Verification | <100ms | |
| Parsing | <5ms | |
| **Strange Loop** | | |
| Meta-learning | <50ms | |
| Pattern extraction | <20ms | |
| Integration | <100ms | |
| **QUIC Multistream** | | |
| Stream setup | <1ms | |
| Multiplexing | <100μs | |
| Throughput | >1GB/s | ✓ |
## 📈 Viewing Results
### HTML Reports
After running benchmarks:
```bash
open target/criterion/temporal_bench/report/index.html
open target/criterion/scheduler_bench/report/index.html
open target/criterion/attractor_bench/report/index.html
open target/criterion/solver_bench/report/index.html
open target/criterion/meta_bench/report/index.html
open target/criterion/quic_bench/report/index.html
```
### Summary Report
```bash
cat target/criterion/SUMMARY.md
```
## 🔧 Configuration
### Criterion Settings
Each benchmark uses optimized Criterion configuration:
```rust
criterion_group! {
name = benches;
config = Criterion::default()
.sample_size(100) // 100 statistical samples
.measurement_time(Duration::from_secs(10)) // 10s per benchmark
.warm_up_time(Duration::from_secs(3)); // 3s warmup
targets = ...
}
```
### Custom Configurations
**Fast benchmarks** (overhead, parsing):
- sample_size: 500-1000
- measurement_time: 5s
**Slow benchmarks** (neural, integration):
- sample_size: 30-50
- measurement_time: 15s
## 🎨 Best Practices
### 1. Environment Setup
```bash
# Disable CPU frequency scaling
sudo cpupower frequency-set --governor performance
# Close unnecessary applications
# Run benchmarks in isolated environment
```
### 2. Baseline Management
```bash
# Create baseline
cargo bench -- --save-baseline main
# Compare with baseline
git checkout feature-branch
cargo bench -- --baseline main
```
### 3. Statistical Validity
- Minimum 30 samples for significance
- Watch for high standard deviation
- Multiple runs for consistency
- Check for outliers
### 4. Profiling Integration
```bash
# With flamegraph
cargo flamegraph --bench temporal_bench
# With perf
perf record -g cargo bench --bench temporal_bench
perf report
# With valgrind
valgrind --tool=cachegrind target/release/deps/temporal_bench-*
```
## 📊 Understanding Metrics
### Key Metrics
1. **Mean**: Average execution time
2. **Std Dev**: Consistency indicator
3. **Median**: Central tendency
4. **MAD**: Median Absolute Deviation
5. **Throughput**: Operations per second
### Regression Detection
Criterion automatically detects performance regressions:
- 🟢 Green: Performance improved
- 🟡 Yellow: Within noise threshold
- 🔴 Red: Performance regressed
## 🔄 CI/CD Integration
### GitHub Actions
```yaml
- name: Run benchmarks
run: ./scripts/run_benchmarks.sh
- name: Upload benchmark results
uses: actions/upload-artifact@v3
with:
name: benchmark-results
path: target/criterion/
```
## 📚 Resources
- [Criterion.rs Documentation](https://bheisler.github.io/criterion.rs/book/)
- [Rust Performance Book](https://nnethercote.github.io/perf-book/)
- [Benchmark Guide](../docs/BENCHMARK_GUIDE.md)
## 🎯 Total Coverage
- **Total benchmark files**: 6
- **Total benchmark groups**: 45+
- **Total test scenarios**: 150+
- **Total lines of benchmark code**: ~2,860
- **Performance targets tracked**: 18
---
**All benchmarks are production-ready with realistic data, comprehensive coverage, and clear performance targets.**