Sublinear-time sparse solvers -- O(log n) PageRank, spectral methods, and linear systems in Rust and WASM.
Most numerical libraries use dense solvers that slow down dramatically as data grows. ruvector-solver provides seven specialized sparse algorithms that run in O(log n) to O(sqrt(n)) time, automatically picks the best one for your problem, and works in native Rust or in the browser via WebAssembly. It powers the graph analytics and AI coherence layers inside RuVector.
| Dense Solvers (e.g. nalgebra) | ruvector-solver | |
|---|---|---|
| Speed at scale | O(n^3) -- slows fast | O(nnz * log n) to O(log n) -- stays fast |
| Memory | Stores full n*n matrix | Only stores non-zero entries (CSR) |
| SIMD acceleration | Partial | AVX2 8-wide + fused kernels |
| Algorithm selection | Manual | Automatic router picks the best of 7 |
| PageRank | Not available | 3 sublinear algorithms built in |
| Browser / WASM | No | Full wasm-bindgen bindings |
[dependencies]
ruvector-solver = "0.1"use ruvector_solver::types::CsrMatrix;
use ruvector_solver::neumann::NeumannSolver;
// Build a sparse 3x3 system and solve it
let a = CsrMatrix::<f32>::from_coo(3, 3, vec![
(0, 0, 2.0_f32), (0, 1, -0.5),
(1, 0, -0.5), (1, 1, 2.0), (1, 2, -0.5),
(2, 1, -0.5), (2, 2, 2.0),
]);
let b = vec![1.0_f32, 0.0, 1.0];
let solver = NeumannSolver::new(1e-6, 500);
let result = solver.solve(&a, &b).unwrap();
println!("solution: {:?}", result.solution);
println!("iterations: {}", result.iterations);
println!("residual: {:.2e}", result.residual_norm);Or let the router pick the best algorithm automatically:
use ruvector_solver::router::{SolverRouter, QueryType};
use ruvector_solver::types::{CsrMatrix, ComputeBudget};
let router = SolverRouter::new();
let (algo, result) = router.solve(&matrix, &rhs, &ComputeBudget::default(), QueryType::LinearSystem).unwrap();
println!("Router selected: {:?}", algo);| Feature | What It Does | Why It Matters |
|---|---|---|
| 7 specialized algorithms | Neumann, CG, Forward Push, Backward Push, Random Walk, TRUE, BMSSP | Each tuned for a specific problem class |
| Automatic routing | SolverRouter analyzes matrix structure and picks the optimal algorithm | No need to be a numerical methods expert |
| Fallback chain | If the selected algorithm fails, tries CG, then dense | Robust convergence in production |
| AVX2 SIMD SpMV | 8-wide vectorized sparse matrix-vector multiply | Maximizes throughput on x86_64 |
| Fused residual + norm | Computes residual and norm in one pass instead of three | 3x less memory traffic per iteration |
| Arena allocator | Bump allocation for scratch buffers, O(1) reset | Zero heap allocation inside solve loops |
| WASM support | Full wasm-bindgen bindings | Run solvers in the browser |
| ComputeBudget | Set max time, iterations, and tolerance | Predictable resource usage |
| Algorithm | Module | Complexity | Applicable to |
|---|---|---|---|
| Jacobi-preconditioned Neumann series | neumann |
O(nnz * log(1/eps)) | Diagonally dominant Ax = b |
| Conjugate Gradient (Hestenes-Stiefel) | cg |
O(nnz * sqrt(kappa)) | Symmetric positive-definite Ax = b |
| Forward Push (Andersen-Chung-Lang) | forward_push |
O(1/epsilon) | Single-source Personalized PageRank |
| Backward Push | backward_push |
O(1/epsilon) | Reverse relevance / target-centric PPR |
| Hybrid Random Walk | random_walk |
O(sqrt(n)/epsilon) | Large-graph PPR with push initialisation |
| TRUE (JL + sparsification + Neumann) | true_solver |
O(nnz * log n) | Batch linear systems with shared A |
| BMSSP Multigrid (V-cycle + Jacobi) | bmssp |
O(n log n) | Ill-conditioned / graph Laplacian systems |
| Feature | Default | Description |
|---|---|---|
neumann |
Yes | Jacobi-preconditioned Neumann series solver |
cg |
Yes | Conjugate Gradient (Hestenes-Stiefel) solver |
forward-push |
Yes | Forward push for single-source PPR |
backward-push |
No | Backward push for reverse relevance computation |
hybrid-random-walk |
No | Hybrid random walk with push initialisation (enables getrandom) |
true-solver |
No | TRUE batch solver (implies neumann) |
bmssp |
No | BMSSP multigrid solver (V-cycle with Jacobi smoothing) |
all-algorithms |
No | Enables every algorithm above |
simd |
No | AVX2 SIMD-accelerated SpMV (x86_64 only) |
wasm |
No | WebAssembly target support |
parallel |
No | Multi-threaded SpMV and solver loops (enables rayon, crossbeam) |
full |
No | All algorithms + parallel + nalgebra-backend |
Enable all algorithms:
[dependencies]
ruvector-solver = { version = "0.1", features = ["all-algorithms"] }The inner SpMV loop is the single hottest path in every iterative solver. The
spmv_unchecked method on CsrMatrix<f32> and CsrMatrix<f64> uses raw
pointers to eliminate per-element bounds checks, relying on a one-time CSR
structural validation (validation::validate_csr_matrix) performed before
entering the solve loop.
Standard implementations compute the residual r = b - Ax and its squared norm
||r||^2 in separate passes (SpMV, vector subtraction, dot product -- three
full memory traversals). fused_residual_norm_sq computes both in a single
pass, reducing memory traffic by roughly 3x per iteration.
When the simd feature is enabled on x86_64, spmv_simd dispatches to an
AVX2 kernel that processes 8 f32 values per instruction using _mm256
intrinsics with a horizontal sum reduction at the end of each row. Falls back
to a portable scalar loop on other architectures.
The Neumann iteration's update step x[j] += d_inv[j] * r[j] is manually
unrolled 4-wide for instruction-level parallelism, with a scalar remainder loop
for dimensions not divisible by 4.
SolverArena provides bump allocation for per-solve scratch buffers. All
temporary vectors are allocated from a single contiguous backing buffer and
reclaimed in O(1) via arena.reset(), eliminating heap allocation overhead
inside the iteration loop.
+-------------------+
| SolverRouter |
| (algorithm select)|
+--------+----------+
|
+----------+-----------+-----------+----------+
| | | | |
+----v---+ +----v---+ +----v------+ +--v----+ +---v----+
|Neumann | | CG | |ForwardPush| | TRUE | | BMSSP |
|Solver | | Solver | | Solver | |Solver | |Solver |
+----+---+ +----+---+ +-----+-----+ +--+----+ +---+----+
| | | | |
+-----+----+-----+-----+-----+----+-----+---+
| | | |
+----v---+ +---v----+ +----v----+ +----v-----+
|types.rs| |simd.rs | |arena.rs | |budget.rs |
|CsrMatrix| |AVX2 | |Bump | |ComputeBudget|
+--------+ |SpMV | |Alloc | |enforcement|
+-------+ +--------+ +----------+
| | |
+----v---+ +---v------+ +--v---------+
|traits.rs| |validate.rs| |error.rs |
|SolverEngine| |CSR check| |SolverError |
+--------+ +---------+ +-----------+
The SolverRouter analyses the matrix SparsityProfile and QueryType
to select the optimal algorithm. When the selected algorithm fails,
SolverOrchestrator::solve_with_fallback tries a deterministic fallback
chain: selected -> CG -> Dense.
| Type | Description |
|---|---|
CsrMatrix<T> |
Compressed Sparse Row matrix with spmv, spmv_unchecked, from_coo, transpose |
SolverResult |
Solution vector, iteration count, residual norm, wall time, convergence history |
ComputeBudget |
Maximum time, max iterations, target tolerance |
Algorithm |
Enum of all solver algorithms (Neumann, CG, ForwardPush, ...) |
SparsityProfile |
Matrix structural analysis (density, diagonal dominance, spectral radius estimate) |
QueryType |
What the caller wants to solve (LinearSystem, PageRankSingle, Batch, ...) |
ComplexityEstimate |
Predicted flops, iterations, memory, and complexity class |
| Trait | Description |
|---|---|
SolverEngine |
Core trait: solve(matrix, rhs, budget) -> SolverResult |
SparseLaplacianSolver |
Extension for graph Laplacian systems and effective resistance |
SublinearPageRank |
Extension for sublinear PPR: ppr(matrix, source, alpha, epsilon) |
| Error | Cause |
|---|---|
SolverError::NonConvergence |
Iteration budget exhausted without reaching tolerance |
SolverError::NumericalInstability |
NaN/Inf or residual growth > 2x detected |
SolverError::SpectralRadiusExceeded |
Spectral radius >= 1.0 (Neumann series would diverge) |
SolverError::BudgetExhausted |
Wall-clock time limit exceeded |
SolverError::InvalidInput |
Dimension mismatch, non-finite values, index out of bounds |
SolverError::BackendError |
Backend-specific failure (nalgebra, BLAS) |
| Module | Description |
|---|---|
router.rs |
SolverRouter for automatic algorithm selection; SolverOrchestrator with fallback |
simd.rs |
AVX2-accelerated SpMV with runtime feature detection |
validation.rs |
CSR structural validation (index bounds, monotonic row_ptr, NaN/Inf) |
arena.rs |
SolverArena bump allocator for zero per-iteration heap allocation |
budget.rs |
ComputeBudget enforcement during solve |
audit.rs |
Audit logging for solver invocations |
events.rs |
Event system for solver lifecycle hooks |
The crate includes 177 tests (138 unit tests + 39 integration/doctests):
# Run all tests
cargo test -p ruvector-solver
# Run tests with all algorithms enabled
cargo test -p ruvector-solver --features all-algorithms
# Run a specific test module
cargo test -p ruvector-solver -- neumann::testsFive Criterion benchmark groups are provided:
# Run all benchmarks
cargo bench -p ruvector-solver
# Run a specific benchmark
cargo bench -p ruvector-solver --bench solver_neumann| Benchmark | Description |
|---|---|
solver_baseline |
Baseline SpMV and vector operations |
solver_neumann |
Neumann solver convergence on tridiagonal systems |
solver_cg |
Conjugate Gradient on SPD matrices |
solver_push |
Forward/backward push on graph adjacency matrices |
solver_e2e |
End-to-end solve through the router with algorithm selection |
Tutorial: Solving a Sparse Linear System
use ruvector_solver::types::CsrMatrix;
// 4x4 tridiagonal system (diagonally dominant)
let a = CsrMatrix::<f32>::from_coo(4, 4, vec![
(0, 0, 3.0), (0, 1, -1.0),
(1, 0, -1.0), (1, 1, 3.0), (1, 2, -1.0),
(2, 1, -1.0), (2, 2, 3.0), (2, 3, -1.0),
(3, 2, -1.0), (3, 3, 3.0),
]);
let b = vec![2.0f32, 1.0, 1.0, 2.0];use ruvector_solver::neumann::NeumannSolver;
let solver = NeumannSolver::new(1e-6, 500);
let result = solver.solve(&a, &b).unwrap();
println!("Solution: {:?}", result.solution);
println!("Iterations: {}", result.iterations);
println!("Residual: {:.2e}", result.residual_norm);use ruvector_solver::router::{SolverRouter, QueryType};
use ruvector_solver::types::{CsrMatrix, ComputeBudget};
let a64 = CsrMatrix::<f64>::from_coo(4, 4, vec![/* ... */]);
let b64 = vec![2.0, 1.0, 1.0, 2.0];
let budget = ComputeBudget::default();
let router = SolverRouter::new();
let (algo, result) = router.solve(&a64, &b64, &budget, QueryType::LinearSystem).unwrap();
println!("Router selected: {:?}", algo);use ruvector_solver::validation::validate_csr_matrix;
let errors = validate_csr_matrix(&a);
assert!(errors.is_empty(), "CSR validation failed: {:?}", errors);cargo bench -p ruvector-solver --bench solver_neumann
cargo bench -p ruvector-solver --bench solver_e2eTutorial: PageRank with Forward Push
use ruvector_solver::forward_push::ForwardPushSolver;
use ruvector_solver::types::CsrMatrix;
// Build adjacency matrix for a small graph
let adj = CsrMatrix::<f32>::from_coo(4, 4, vec![
(0, 1, 1.0), (1, 0, 1.0),
(1, 2, 1.0), (2, 1, 1.0),
(2, 3, 1.0), (3, 2, 1.0),
(0, 3, 1.0), (3, 0, 1.0),
]);
let solver = ForwardPushSolver::new(0.85, 1e-6); // alpha=0.85
let ppr = solver.ppr(&adj, 0); // PPR from node 0
println!("PPR scores: {:?}", ppr);| Crate | Role |
|---|---|
ruvector-attn-mincut |
Min-cut gating using graph solvers |
ruvector-coherence |
Coherence metrics for attention comparison |
ruvector-profiler |
Benchmarking memory, power, latency |
Rust 1.77 or later.
Licensed under the MIT License.
Part of RuVector -- the self-learning vector database.