wifi-densepose/vendor/sublinear-time-solver/plans/02-algorithms-implementatio...

25 KiB

Algorithm Implementation Plan - Sublinear Time Solver

Overview

This document outlines the implementation strategy for three complementary algorithmic techniques in the sublinear-time solver: Neumann Series Expansion, Forward & Backward Push Methods, and Hybrid Random-Walk Estimation.

1. Neumann Series Implementation

1.1 Core Algorithm Structure

function neumannSeries(M: Matrix, b: Vector, tolerance: f64) -> Vector:
    // Pre-process: Ensure ||M|| < 1 for convergence
    scaling_factor = 1.0 / spectralRadius(M)
    M_scaled = M * scaling_factor
    b_scaled = b * scaling_factor

    result = b_scaled.clone()
    power_term = b_scaled.clone()
    residual_norm = inf
    iteration = 0

    while residual_norm > tolerance and iteration < MAX_ITERATIONS:
        power_term = M_scaled * power_term
        result += power_term

        // Compute residual: ||b - (I - M)x||
        residual = b_scaled - (power_term - M_scaled * result)
        residual_norm = residual.l2_norm()

        iteration += 1

        // Early termination check
        if power_term.l2_norm() < tolerance * 1e-3:
            break

    return result

1.2 Matrix Scaling Strategies

Spectral Radius Estimation:

  • Power iteration method for dominant eigenvalue
  • Gershgorin circle theorem for bounds
  • Adaptive scaling based on matrix structure

Implementation Details:

pub struct ScalingStrategy {
    method: ScalingMethod,
    max_iterations: usize,
    tolerance: f64,
}

enum ScalingMethod {
    SpectralRadius,
    GershgorinBounds,
    FrobeniusNorm,
    Adaptive,
}

impl ScalingStrategy {
    pub fn compute_scaling_factor(&self, matrix: &SparseMatrix) -> f64 {
        match self.method {
            ScalingMethod::SpectralRadius => self.power_iteration(matrix),
            ScalingMethod::GershgorinBounds => self.gershgorin_estimate(matrix),
            ScalingMethod::FrobeniusNorm => 1.0 / matrix.frobenius_norm(),
            ScalingMethod::Adaptive => self.adaptive_scaling(matrix),
        }
    }
}

1.3 Series Truncation Logic

Convergence Criteria:

  1. Residual-based: ||r_k|| < tolerance
  2. Term magnitude: ||M^k b|| < epsilon * ||b||
  3. Relative improvement: ||x_{k+1} - x_k|| / ||x_k|| < delta

Adaptive Truncation:

function adaptiveTruncation(term_sequence: Iterator<Vector>) -> usize:
    terms = []
    for (i, term) in term_sequence.enumerate():
        terms.push(term)

        if i >= 3:  // Need minimum terms for analysis
            // Check for geometric decay
            ratio = terms[i].norm() / terms[i-1].norm()
            if ratio > 0.95:  // Poor convergence
                return i

            // Richardson extrapolation for acceleration
            if i % 3 == 0:
                extrapolated = richardsonExtrapolation(terms.last_n(3))
                if extrapolated.converged():
                    return i

    return MAX_TERMS

1.4 Vectorized Operations Design

SIMD Optimization:

use std::simd::{f64x4, Simd};

pub fn vectorized_axpy(alpha: f64, x: &[f64], y: &mut [f64]) {
    let alpha_vec = Simd::splat(alpha);

    for (x_chunk, y_chunk) in x.chunks_exact(4).zip(y.chunks_exact_mut(4)) {
        let x_vec = f64x4::from_slice(x_chunk);
        let y_vec = f64x4::from_slice(y_chunk);
        let result = alpha_vec * x_vec + y_vec;
        y_chunk.copy_from_slice(result.as_array());
    }
}

1.5 Error Bound Calculations

A Posteriori Error Estimates:

function errorBounds(x_approx: Vector, M: Matrix, b: Vector) -> ErrorBounds:
    residual = b - (I - M) * x_approx
    residual_norm = residual.l2_norm()

    // Condition number estimate
    condition_estimate = estimateConditionNumber(I - M)

    error_bound = ErrorBounds {
        absolute: residual_norm / (1 - spectralRadius(M)),
        relative: residual_norm / (condition_estimate * b.l2_norm()),
        backward: residual_norm / b.l2_norm(),
    }

    return error_bound

2. Push Methods (Forward/Backward)

2.1 Graph Representation for Push

Sparse Matrix Structure:

pub struct PushGraph {
    adjacency: CompressedSparseRow<f64>,
    reverse_adjacency: CompressedSparseRow<f64>,  // For backward push
    degrees: Vec<f64>,
    reverse_degrees: Vec<f64>,
}

impl PushGraph {
    pub fn from_matrix(matrix: &SparseMatrix) -> Self {
        let adjacency = matrix.to_csr();
        let reverse_adjacency = adjacency.transpose();

        Self {
            degrees: adjacency.row_sums(),
            reverse_degrees: reverse_adjacency.row_sums(),
            adjacency,
            reverse_adjacency,
        }
    }
}

2.2 Residual Vector Management

Forward Push Algorithm:

function forwardPush(graph: PushGraph, source: NodeId, alpha: f64, epsilon: f64) -> (Vector, Vector):
    n = graph.num_nodes()
    estimate = Vector::zeros(n)
    residual = Vector::zeros(n)
    residual[source] = 1.0

    work_queue = PriorityQueue::new()
    work_queue.push(source, residual[source])

    while let Some((node, _)) = work_queue.pop():
        if residual[node] < epsilon * graph.degrees[node]:
            continue

        push_amount = alpha * residual[node]
        estimate[node] += push_amount
        residual[node] -= push_amount

        remaining = (1.0 - alpha) * residual[node]
        residual[node] = 0.0

        // Distribute to neighbors
        for (neighbor, weight) in graph.adjacency.row(node):
            delta = remaining * weight / graph.degrees[node]
            residual[neighbor] += delta

            if residual[neighbor] >= epsilon * graph.degrees[neighbor]:
                work_queue.push(neighbor, residual[neighbor])

    return (estimate, residual)

2.3 Work Queue Optimization

Priority-Based Processing:

pub struct WorkQueue {
    heap: BinaryHeap<WorkItem>,
    in_queue: BitSet,
    threshold: f64,
}

#[derive(PartialEq, PartialOrd)]
struct WorkItem {
    priority: OrderedFloat<f64>,
    node_id: usize,
}

impl WorkQueue {
    pub fn push_if_threshold(&mut self, node: usize, residual: f64, degree: f64) {
        let priority = residual / degree;
        if priority >= self.threshold && !self.in_queue.contains(node) {
            self.heap.push(WorkItem {
                priority: OrderedFloat(priority),
                node_id: node,
            });
            self.in_queue.insert(node);
        }
    }

    pub fn adaptive_threshold(&mut self, queue_size: usize) {
        // Increase threshold if queue too large
        if queue_size > MAX_QUEUE_SIZE {
            self.threshold *= 1.1;
        } else if queue_size < MIN_QUEUE_SIZE {
            self.threshold *= 0.9;
        }
    }
}

2.4 Single-Index vs Full-Solution Modes

Mode Selection Strategy:

pub enum PushMode {
    SingleSource { source: usize, target: Option<usize> },
    MultiSource { sources: Vec<usize> },
    FullSolution,
}

impl PushSolver {
    pub fn solve(&self, mode: PushMode) -> SolutionResult {
        match mode {
            PushMode::SingleSource { source, target } => {
                let (estimate, residual) = self.forward_push(source);
                if let Some(t) = target {
                    SolutionResult::SingleValue(estimate[t])
                } else {
                    SolutionResult::SparseVector(estimate)
                }
            },
            PushMode::FullSolution => {
                self.solve_all_sources()
            },
            PushMode::MultiSource { sources } => {
                self.solve_multiple_sources(&sources)
            }
        }
    }
}

2.5 Visited Node Tracking

Efficient Set Operations:

pub struct VisitedTracker {
    visited: BitSet,
    visit_order: Vec<usize>,
    timestamps: Vec<u32>,
    current_time: u32,
}

impl VisitedTracker {
    pub fn mark_visited(&mut self, node: usize) -> bool {
        if !self.visited.contains(node) {
            self.visited.insert(node);
            self.visit_order.push(node);
            self.timestamps[node] = self.current_time;
            true
        } else {
            false
        }
    }

    pub fn reset_for_new_query(&mut self) {
        self.current_time += 1;
        if self.current_time == u32::MAX {
            self.full_reset();
        }
    }
}

3. Hybrid Random-Walk

3.1 Random Walk Simulation Engine

Core Random Walk Implementation:

function randomWalk(graph: Graph, start: NodeId, max_steps: usize, restart_prob: f64) -> WalkResult:
    current = start
    steps = 0
    path = [start]

    rng = RandomGenerator::new()

    while steps < max_steps:
        if rng.uniform() < restart_prob:
            return WalkResult::Restart(steps, path)

        neighbors = graph.neighbors(current)
        if neighbors.is_empty():
            return WalkResult::Sink(steps, path)

        // Weighted random selection
        weights = graph.edge_weights(current)
        next_node = weightedRandomChoice(neighbors, weights, rng)

        current = next_node
        path.push(current)
        steps += 1

    return WalkResult::MaxSteps(steps, path)

3.2 Sampling Strategies

Adaptive Sampling:

pub struct AdaptiveSampler {
    base_samples: usize,
    variance_threshold: f64,
    max_samples: usize,
    confidence_level: f64,
}

impl AdaptiveSampler {
    pub fn sample_until_converged<F>(&self, mut sampler: F) -> SamplingResult
    where
        F: FnMut() -> f64,
    {
        let mut samples = Vec::with_capacity(self.base_samples);
        let mut sum = 0.0;
        let mut sum_squares = 0.0;

        // Initial batch
        for _ in 0..self.base_samples {
            let sample = sampler();
            samples.push(sample);
            sum += sample;
            sum_squares += sample * sample;
        }

        loop {
            let n = samples.len() as f64;
            let mean = sum / n;
            let variance = (sum_squares - sum * sum / n) / (n - 1.0);
            let std_error = (variance / n).sqrt();

            // Check convergence using confidence interval
            let margin = self.confidence_level * std_error;
            if margin / mean.abs() < self.variance_threshold || samples.len() >= self.max_samples {
                break;
            }

            // Add more samples
            let batch_size = (samples.len() / 4).max(10);
            for _ in 0..batch_size {
                let sample = sampler();
                samples.push(sample);
                sum += sample;
                sum_squares += sample * sample;
            }
        }

        SamplingResult {
            estimate: sum / samples.len() as f64,
            variance: sum_squares / samples.len() as f64 - (sum / samples.len() as f64).powi(2),
            num_samples: samples.len(),
        }
    }
}

3.3 Push-Walk Coordination

Hybrid Strategy:

function hybridSolver(graph: Graph, source: NodeId, target: NodeId, epsilon: f64) -> f64:
    // Phase 1: Forward push to reduce problem size
    (push_estimate, residual) = forwardPush(graph, source, alpha=0.2, epsilon)

    // Phase 2: Random walks from high-residual nodes
    high_residual_nodes = residual.nonzero_indices_above(epsilon)

    walk_contribution = 0.0
    for node in high_residual_nodes:
        // Estimate transition probability from node to target
        prob_estimate = estimateTransitionProbability(graph, node, target, residual[node])
        walk_contribution += prob_estimate

    // Phase 3: Backward push from target (if beneficial)
    if shouldUseBackwardPush(graph, target, high_residual_nodes):
        (backward_estimate, _) = backwardPush(graph, target, alpha=0.2, epsilon)

        // Combine estimates using residual weights
        combined = combineBidirectionalEstimates(push_estimate, walk_contribution, backward_estimate, residual)
        return combined

    return push_estimate[target] + walk_contribution

3.4 Bidirectional Exploration

Meet-in-the-Middle Strategy:

pub struct BidirectionalWalker {
    forward_frontier: HashMap<usize, f64>,
    backward_frontier: HashMap<usize, f64>,
    meeting_probability: f64,
}

impl BidirectionalWalker {
    pub fn explore(&mut self, graph: &Graph, source: usize, target: usize) -> f64 {
        let forward_steps = self.forward_walk(graph, source);
        let backward_steps = self.backward_walk(graph, target);

        // Find intersection points
        let mut total_probability = 0.0;
        for (&node, &forward_prob) in &self.forward_frontier {
            if let Some(&backward_prob) = self.backward_frontier.get(&node) {
                total_probability += forward_prob * backward_prob;
            }
        }

        total_probability
    }

    fn should_meet(&self, forward_depth: usize, backward_depth: usize) -> bool {
        // Use graph diameter estimate to decide when frontiers should meet
        forward_depth + backward_depth >= self.estimated_diameter()
    }
}

3.5 Stochastic Error Estimates

Confidence Intervals:

function computeConfidenceInterval(samples: Vec<f64>, confidence: f64) -> (f64, f64):
    n = samples.len()
    mean = samples.mean()
    variance = samples.variance()
    std_error = sqrt(variance / n)

    // Use t-distribution for small samples, normal for large
    if n < 30:
        t_value = tDistributionQuantile(confidence, n - 1)
        margin = t_value * std_error
    else:
        z_value = normalQuantile(confidence)
        margin = z_value * std_error

    return (mean - margin, mean + margin)

4. Algorithm Selection Logic

4.1 Condition Number Estimation

Fast Condition Number Bounds:

pub fn estimate_condition_number(matrix: &SparseMatrix) -> f64 {
    // Use power iteration for largest eigenvalue
    let lambda_max = power_iteration(matrix, 50);

    // Use inverse power iteration for smallest eigenvalue
    let lambda_min = inverse_power_iteration(matrix, 50);

    lambda_max / lambda_min.abs()
}

pub fn power_iteration(matrix: &SparseMatrix, max_iter: usize) -> f64 {
    let n = matrix.nrows();
    let mut v = vec![1.0 / (n as f64).sqrt(); n];
    let mut lambda = 0.0;

    for _ in 0..max_iter {
        let w = matrix * &v;
        lambda = v.dot(&w);
        let norm = w.l2_norm();
        v = w / norm;
    }

    lambda
}

4.2 Method Auto-Selection Heuristics

Decision Tree:

pub enum SolverMethod {
    NeumannSeries,
    ForwardPush,
    BackwardPush,
    HybridRandomWalk,
    DirectSolver,
}

pub struct MethodSelector {
    matrix_analyzer: MatrixAnalyzer,
    performance_history: PerformanceTracker,
}

impl MethodSelector {
    pub fn select_method(&self, problem: &LinearSystemProblem) -> SolverMethod {
        let analysis = self.matrix_analyzer.analyze(&problem.matrix);

        // Decision criteria
        if analysis.condition_number < 10.0 {
            return SolverMethod::DirectSolver;
        }

        if analysis.sparsity > 0.99 && problem.query_type == QueryType::SingleEntry {
            return SolverMethod::ForwardPush;
        }

        if analysis.spectral_radius < 0.5 {
            return SolverMethod::NeumannSeries;
        }

        if problem.precision_requirement < 1e-6 {
            return SolverMethod::HybridRandomWalk;
        }

        // Default to adaptive hybrid
        SolverMethod::HybridRandomWalk
    }
}

4.3 Adaptive Switching During Solve

Dynamic Method Switching:

function adaptiveSolve(problem: LinearSystemProblem) -> Solution:
    current_method = selectInitialMethod(problem)
    solution_state = SolutionState::new()

    while not solution_state.converged():
        progress = executeMethod(current_method, problem, solution_state)

        if progress.stagnated():
            // Switch to different method
            candidates = alternativeMethods(current_method, problem)
            current_method = selectBestCandidate(candidates, solution_state)

        if progress.error_increased():
            // Fallback to more stable method
            current_method = conservativeFallback(problem)

        updateSolutionState(solution_state, progress)

    return solution_state.extract_solution()

4.4 Performance Profiling Hooks

Profiling Framework:

pub struct PerformanceProfiler {
    timers: HashMap<String, Instant>,
    counters: HashMap<String, u64>,
    memory_tracker: MemoryTracker,
}

impl PerformanceProfiler {
    pub fn profile<F, R>(&mut self, name: &str, f: F) -> R
    where
        F: FnOnce() -> R,
    {
        let start = Instant::now();
        let start_memory = self.memory_tracker.current_usage();

        let result = f();

        let duration = start.elapsed();
        let memory_delta = self.memory_tracker.current_usage() - start_memory;

        self.record_timing(name, duration);
        self.record_memory(name, memory_delta);

        result
    }
}

5. Numerical Stability

5.1 Precision Management (f32/f64)

Mixed Precision Strategy:

pub enum PrecisionMode {
    Single,    // f32
    Double,    // f64
    Mixed,     // f32 for bulk operations, f64 for critical computations
    Adaptive,  // Switch based on condition number
}

pub struct PrecisionManager {
    mode: PrecisionMode,
    promotion_threshold: f64,
    demotion_threshold: f64,
}

impl PrecisionManager {
    pub fn should_promote(&self, condition_number: f64) -> bool {
        condition_number > self.promotion_threshold
    }

    pub fn execute_with_precision<F, R>(&self, cond_num: f64, f: F) -> R
    where
        F: FnOnce(PrecisionLevel) -> R,
    {
        let precision = if self.should_promote(cond_num) {
            PrecisionLevel::Double
        } else {
            PrecisionLevel::Single
        };

        f(precision)
    }
}

5.2 Overflow/Underflow Handling

Safe Arithmetic Operations:

pub trait SafeArithmetic {
    fn safe_add(&self, other: &Self) -> Result<Self, ArithmeticError>
    where
        Self: Sized;

    fn safe_multiply(&self, other: &Self) -> Result<Self, ArithmeticError>
    where
        Self: Sized;
}

impl SafeArithmetic for f64 {
    fn safe_add(&self, other: &f64) -> Result<f64, ArithmeticError> {
        let result = self + other;
        if result.is_infinite() {
            Err(ArithmeticError::Overflow)
        } else if result == 0.0 && (*self != 0.0 || *other != 0.0) {
            Err(ArithmeticError::Underflow)
        } else {
            Ok(result)
        }
    }
}

5.3 Ill-Conditioned System Detection

Early Warning System:

function detectIllConditioning(matrix: Matrix) -> ConditioningReport:
    // Quick checks
    diagonal_dominance = checkDiagonalDominance(matrix)
    if diagonal_dominance < 0.1:
        return ConditioningReport::PoorlyConditioned

    // Eigenvalue spread estimation
    eigenvalue_ratio = estimateEigenvalueRatio(matrix)
    if eigenvalue_ratio > 1e12:
        return ConditioningReport::IllConditioned

    // Numerical rank estimation
    rank_deficiency = estimateRankDeficiency(matrix)
    if rank_deficiency > 0:
        return ConditioningReport::RankDeficient

    return ConditioningReport::WellConditioned

5.4 Residual Computation Strategies

High-Precision Residual:

pub fn compute_residual_extended_precision(
    matrix: &SparseMatrix<f64>,
    solution: &Vector<f64>,
    rhs: &Vector<f64>
) -> Vector<f64> {
    // Use extended precision for critical computation
    let matrix_ext: SparseMatrix<f128> = matrix.cast();
    let solution_ext: Vector<f128> = solution.cast();
    let rhs_ext: Vector<f128> = rhs.cast();

    let residual_ext = &rhs_ext - &matrix_ext * &solution_ext;

    // Cast back to working precision
    residual_ext.cast()
}

6. Incremental Updates

6.1 Delta Cost Propagation

Incremental Matrix Updates:

function incrementalUpdate(solver_state: SolverState, delta_matrix: SparseMatrix) -> SolverState:
    // Sherman-Morrison-Woodbury formula for low-rank updates
    if delta_matrix.rank() <= MAX_RANK_UPDATE:
        return shermanMorrisonUpdate(solver_state, delta_matrix)

    // Incremental push updates for localized changes
    affected_nodes = delta_matrix.nonzero_pattern()
    if affected_nodes.len() < solver_state.num_nodes() * 0.1:
        return localizedPushUpdate(solver_state, delta_matrix, affected_nodes)

    // Full recomputation for major changes
    return fullRecompute(solver_state.matrix + delta_matrix)

6.2 Partial Recomputation Logic

Smart Invalidation:

pub struct IncrementalSolver {
    cached_solutions: HashMap<QueryKey, CachedSolution>,
    dependency_graph: DependencyGraph,
    invalidation_frontier: BitSet,
}

impl IncrementalSolver {
    pub fn update_matrix(&mut self, changes: &MatrixDelta) {
        let affected_queries = self.dependency_graph.find_dependent_queries(changes);

        for query in affected_queries {
            self.invalidate_cached_solution(query);
            self.invalidation_frontier.insert(query.node_id);
        }

        // Propagate invalidation using push-based approach
        self.propagate_invalidation();
    }

    fn propagate_invalidation(&mut self) {
        while let Some(node) = self.invalidation_frontier.pop_first() {
            let dependents = self.dependency_graph.dependents(node);
            for dependent in dependents {
                if self.should_invalidate(dependent, node) {
                    self.invalidation_frontier.insert(dependent);
                }
            }
        }
    }
}

6.3 State Caching Mechanisms

Multi-Level Cache:

pub struct SolutionCache {
    l1_cache: LruCache<QueryKey, Vector<f64>>,  // Recent exact solutions
    l2_cache: LruCache<QueryKey, ApproximateSolution>,  // Approximate solutions
    l3_cache: PersistentCache<QueryKey, CompressedSolution>,  // Compressed historical data
}

impl SolutionCache {
    pub fn get_cached_solution(&self, query: &QueryKey) -> Option<CachedSolution> {
        // Check L1 first
        if let Some(exact) = self.l1_cache.get(query) {
            return Some(CachedSolution::Exact(exact.clone()));
        }

        // Check L2 for approximation
        if let Some(approx) = self.l2_cache.get(query) {
            if approx.meets_tolerance(query.tolerance) {
                return Some(CachedSolution::Approximate(approx.clone()));
            }
        }

        // Check L3 for warm start
        if let Some(compressed) = self.l3_cache.get(query) {
            let warm_start = compressed.decompress();
            return Some(CachedSolution::WarmStart(warm_start));
        }

        None
    }
}

6.4 Update Verification

Correctness Checking:

function verifyIncrementalUpdate(
    original_solution: Vector,
    updated_solution: Vector,
    matrix_delta: SparseMatrix,
    tolerance: f64
) -> VerificationResult:
    // Check solution validity
    original_residual = computeResidual(original_matrix, original_solution)
    updated_residual = computeResidual(updated_matrix, updated_solution)

    if updated_residual.norm() > original_residual.norm() * 1.1:
        return VerificationResult::ResidualIncreased

    // Check incremental consistency
    expected_change = estimateExpectedChange(matrix_delta, original_solution)
    actual_change = updated_solution - original_solution

    relative_error = (actual_change - expected_change).norm() / expected_change.norm()

    if relative_error < tolerance:
        return VerificationResult::Verified
    else:
        return VerificationResult::SuspiciousChange(relative_error)

Complexity Analysis

Time Complexity Summary

Algorithm Single Query Full Solution Space
Neumann Series O(k·nnz) O(k·n²) O(n)
Forward Push O(1/ε) O(n/ε) O(n)
Backward Push O(1/ε) O(n/ε) O(n)
Hybrid Random-Walk O(√n/ε) O(n√n/ε) O(√n)

Where:

  • k = number of series terms
  • nnz = number of non-zeros
  • ε = accuracy parameter
  • n = matrix dimension

Space-Time Tradeoffs

Memory-Efficient Mode:

  • Streaming computation for large matrices
  • On-demand residual computation
  • Compressed intermediate storage

Speed-Optimized Mode:

  • Full matrix precomputation
  • Aggressive caching
  • Parallel execution of multiple queries

Implementation Priority

  1. Phase 1: Core Neumann Series (Week 1-2)
  2. Phase 2: Forward Push Method (Week 3-4)
  3. Phase 3: Random Walk Engine (Week 5-6)
  4. Phase 4: Algorithm Selection & Hybrid Methods (Week 7-8)
  5. Phase 5: Numerical Stability & Incremental Updates (Week 9-10)
  6. Phase 6: Performance Optimization & Benchmarking (Week 11-12)

Testing Strategy

  • Unit tests for each algorithm component
  • Integration tests for method selection
  • Property-based testing for numerical stability
  • Benchmarking against reference implementations
  • Stress testing with ill-conditioned matrices
  • Performance regression testing

This implementation plan provides a comprehensive roadmap for building a robust, efficient, and numerically stable sublinear-time linear system solver.