
rvf-solver-wasm

Self-learning temporal reasoning engine compiled to WebAssembly -- Thompson Sampling, three-loop adaptive solver, and cryptographic witness chains in ~160 KB.

Overview

rvf-solver-wasm compiles the complete AGI temporal puzzle solver to wasm32-unknown-unknown for use in browsers, Node.js, and edge runtimes. It is a no_std + alloc crate (same architecture as rvf-wasm) with a pure C ABI export surface -- no wasm-bindgen required.

The solver learns which solving strategy works best for each problem context using Thompson Sampling, compiles successful patterns into a signature cache, and proves its learning through a three-mode ablation test with SHAKE-256 witness chains.

Key Design Choices

Choice	Rationale
no_std + alloc	Matches `rvf-wasm` pattern; runs in any WASM runtime
Pure-integer `Date` type	Howard Hinnant algorithm replaces `chrono`; no std required
`BTreeMap` over `HashMap`	Available in `alloc`; deterministic iteration order
`libm` for float math	`sqrt`, `log`, `cos`, `pow` -- pure Rust, no_std compatible
xorshift64 RNG	Deterministic, zero dependencies, identical to the RNG used by the benchmarks
C ABI exports	Maximum compatibility -- works with any WASM host
Handle-based API	Up to 8 concurrent solver instances
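
The pure-integer `Date` row refers to Howard Hinnant's civil-date algorithms. As a rough illustration of why neither `chrono` nor `std` is needed, the JavaScript sketch below ports his `days_from_civil` conversion plus a weekday lookup. The names `daysFromCivil` and `weekdayFromDays` are illustrative and not taken from the crate, whose Rust implementation may differ in detail.

```javascript
// Days since 1970-01-01 for a proleptic Gregorian date, after Howard
// Hinnant's days_from_civil. Only integer arithmetic is involved.
function daysFromCivil(year, month, day) {
  const y = month <= 2 ? year - 1 : year;        // Jan/Feb count as the end of the previous year
  const era = Math.floor(y / 400);               // 400-year Gregorian cycle
  const yoe = y - era * 400;                     // year of era, 0..399
  const mp = month > 2 ? month - 3 : month + 9;  // March = 0 ... February = 11
  const doy = Math.floor((153 * mp + 2) / 5) + day - 1;  // day of the shifted year
  const doe = yoe * 365 + Math.floor(yoe / 4) - Math.floor(yoe / 100) + doy;
  return era * 146097 + doe - 719468;            // move the epoch to 1970-01-01
}

// Weekday for a day count: 0 = Sunday ... 6 = Saturday.
function weekdayFromDays(days) {
  return (((days + 4) % 7) + 7) % 7;             // 1970-01-01 was a Thursday (4)
}
```

With this in place, a `DayOfWeek` check reduces to `weekdayFromDays(daysFromCivil(y, m, d))`, and date differences are plain integer subtraction.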

Build

# Build the WASM module
cargo build --target wasm32-unknown-unknown --release -p rvf-solver-wasm

# Optimize with wasm-opt (optional, ~80-100 KB output)
wasm-opt -Oz target/wasm32-unknown-unknown/release/rvf_solver_wasm.wasm \
  -o rvf_solver_wasm.opt.wasm

Binary Size

Build	Size
Release (`wasm32-unknown-unknown`)	~160 KB
After `wasm-opt -Oz`	~80-100 KB

Architecture

Three-Loop Adaptive Solver

The engine uses a three-loop architecture where each loop operates on a different timescale:

 Fast loop (per puzzle)          Medium loop (per batch)       Slow loop (per cycle)
 ┌──────────────────────┐       ┌──────────────────────┐     ┌──────────────────────┐
 │ Constraint propagation│──────▶│ PolicyKernel selects  │────▶│ ReasoningBank tracks │
 │ Range narrowing       │       │ skip mode via Thompson│     │ trajectories         │
 │ Date enumeration      │       │ Sampling (two-signal) │     │ KnowledgeCompiler    │
 │ Solution validation   │       │ Speculative dual-path │     │ compiles patterns    │
 └──────────────────────┘       └──────────────────────┘     │ Checkpoint/rollback  │
                                                              └──────────────────────┘

Loop	Frequency	What it does
Fast	Every puzzle	Constraint propagation, range narrowing, date enumeration, solution check
Medium	Every puzzle	Thompson Sampling selects `None`/`Weekday`/`Hybrid` skip mode per context bucket
Slow	Per training cycle	ReasoningBank promotes successful trajectories; KnowledgeCompiler caches signatures

Five AGI Capabilities

#	Capability	Description
1	Thompson Sampling	Two-signal model: Beta posterior for safety (correct + no early-commit) + EMA for cost
2	18 Context Buckets	3 range (small/medium/large) x 3 distractor (clean/some/heavy) x 2 noise = 18 independent bandits
3	Speculative Dual-Path	When top-2 arms within delta 0.15 and variance > 0.02, speculatively execute secondary arm
4	KnowledgeCompiler	Constraint signature cache (`v1:{difficulty}:{sorted_types}`); compiled skip-mode, step budget, confidence
5	Acceptance Test	Multi-cycle training/holdout with A/B/C ablation and checkpoint/rollback on regression
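
Capability 4 keys its cache on a signature of the form `v1:{difficulty}:{sorted_types}`. The sketch below shows one way such a key could be built and used. The comma separator, the string type names, and the cached values are assumptions made for illustration; the crate's actual encoding is not specified here.

```javascript
// Build a cache key from a puzzle's difficulty and its constraint type names.
// Sorting makes the key independent of constraint order.
// Assumption: types are joined with commas and duplicates are kept.
function constraintSignature(difficulty, constraintTypes) {
  const sorted = [...constraintTypes].sort();
  return `v1:${difficulty}:${sorted.join(',')}`;
}

// A signature cache maps each key to a compiled configuration.
// The numbers below are invented purely to show the shape of an entry.
const signatureCache = new Map();
signatureCache.set(constraintSignature(3, ['InMonth', 'After', 'DayOfWeek']), {
  skipMode: 'Weekday',
  stepBudget: 120,
  confidence: 0.9,
});

// The same constraint types in a different order hit the same entry.
const hit = signatureCache.get(constraintSignature(3, ['DayOfWeek', 'InMonth', 'After']));
```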

Ablation Modes

Mode	Compiler	Router	Purpose
A (Baseline)	Off	Off	Fixed heuristic policy; establishes cost/accuracy baseline
B (Compiler)	On	Off	KnowledgeCompiler active; must show >= 15% cost decrease vs A
C (Full)	On	On	Thompson Sampling + speculation; must show robustness gain vs B

WASM Export Surface

Memory Management (2 exports)

Export	Signature	Description
`rvf_solver_alloc`	`(size: i32) -> i32`	Allocate WASM memory; returns pointer or 0
`rvf_solver_free`	`(ptr: i32, size: i32)`	Free previously allocated memory

Lifecycle (2 exports)

Export	Signature	Description
`rvf_solver_create`	`() -> i32`	Create solver instance; returns handle (>0) or -1
`rvf_solver_destroy`	`(handle: i32) -> i32`	Destroy solver; returns 0 on success

Training (1 export)

Export	Signature	Description
`rvf_solver_train`	`(handle, count, min_diff, max_diff, seed_lo, seed_hi) -> i32`	Train on `count` generated puzzles using three-loop learning; returns correct count

Parameters:

Parameter	Type	Description
`handle`	`i32`	Solver instance handle
`count`	`i32`	Number of puzzles to generate and solve
`min_diff`	`i32`	Minimum puzzle difficulty (1-10)
`max_diff`	`i32`	Maximum puzzle difficulty (1-10)
`seed_lo`	`i32`	Lower 32 bits of RNG seed
`seed_hi`	`i32`	Upper 32 bits of RNG seed
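
Because the seed is passed as two `i32` halves, a host holding a 64-bit seed has to split it first. The helper below is a minimal sketch of that split using `BigInt`. `splitSeed` and `joinSeed` are hypothetical names, and the recombination shown in `joinSeed` (`seed_hi` in the upper 32 bits) is inferred from the parameter descriptions rather than confirmed against the crate.

```javascript
// Split a 64-bit seed into the two signed 32-bit halves expected by
// rvf_solver_train / rvf_solver_acceptance.
function splitSeed(seed) {
  const s = BigInt.asUintN(64, BigInt(seed));
  return {
    lo: Number(BigInt.asIntN(32, s)),         // lower 32 bits as a signed i32
    hi: Number(BigInt.asIntN(32, s >> 32n)),  // upper 32 bits as a signed i32
  };
}

// Inverse operation, shown only to make the assumed layout explicit.
function joinSeed(lo, hi) {
  return (BigInt.asUintN(32, BigInt(hi)) << 32n) | BigInt.asUintN(32, BigInt(lo));
}
```

For example, `splitSeed(42n)` yields `{ lo: 42, hi: 0 }`, the pair used in the JavaScript usage example of this README.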

Acceptance Test (1 export)

Export	Signature	Description
`rvf_solver_acceptance`	`(handle, holdout, training, cycles, budget, seed_lo, seed_hi) -> i32`	Run full A/B/C ablation test; returns 1 = passed, 0 = failed, -1 = error

Parameters:

Parameter	Type	Description
`handle`	`i32`	Solver instance handle
`holdout`	`i32`	Number of holdout puzzles per evaluation
`training`	`i32`	Training puzzles per cycle
`cycles`	`i32`	Number of training/evaluation cycles
`budget`	`i32`	Maximum steps per puzzle solve
`seed_lo`	`i32`	Lower 32 bits of RNG seed
`seed_hi`	`i32`	Upper 32 bits of RNG seed

Result / Policy / Witness Reads (6 exports)

Export	Signature	Description
`rvf_solver_result_len`	`(handle: i32) -> i32`	Byte length of last result JSON
`rvf_solver_result_read`	`(handle: i32, out_ptr: i32) -> i32`	Copy result JSON to `out_ptr`; returns bytes written
`rvf_solver_policy_len`	`(handle: i32) -> i32`	Byte length of policy state JSON
`rvf_solver_policy_read`	`(handle: i32, out_ptr: i32) -> i32`	Copy policy JSON to `out_ptr`; returns bytes written
`rvf_solver_witness_len`	`(handle: i32) -> i32`	Byte length of witness chain (73 bytes/entry)
`rvf_solver_witness_read`	`(handle: i32, out_ptr: i32) -> i32`	Copy raw witness chain to `out_ptr`; returns bytes written
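
All six read exports follow the same sequence: query the length, allocate, read, copy out, free. A small wrapper can hide that sequence. The sketch below is one possible shape; `readBuffer` is a hypothetical helper name, and treating a negative return value from a `*_read` export as an error is an assumption, since the table only documents "bytes written".

```javascript
// Read one of the solver's buffers ('result', 'policy' or 'witness') into a
// fresh Uint8Array owned by JavaScript. `exports` is instance.exports.
function readBuffer(exports, handle, kind) {
  const len = exports[`rvf_solver_${kind}_len`](handle);
  if (len <= 0) return new Uint8Array(0);

  const ptr = exports.rvf_solver_alloc(len);
  if (ptr === 0) throw new Error('rvf_solver_alloc returned 0');

  try {
    const written = exports[`rvf_solver_${kind}_read`](handle, ptr);
    if (written < 0) throw new Error(`read failed with code ${written}`);
    // Create the view only after allocating: growing WASM memory detaches
    // views created earlier. slice() copies the bytes out of WASM memory.
    return new Uint8Array(exports.memory.buffer, ptr, Math.min(written, len)).slice();
  } finally {
    exports.rvf_solver_free(ptr, len);  // always release the scratch buffer
  }
}
```

A JSON buffer can then be decoded with `JSON.parse(new TextDecoder().decode(readBuffer(wasm, handle, 'result')))`.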

Usage from JavaScript

Node.js / Browser

// Node.js is shown here; in a browser, obtain the bytes with fetch() instead of readFile
import { readFile } from 'fs/promises';

// Load WASM module
const wasmBytes = await readFile('rvf_solver_wasm.wasm');
const { instance } = await WebAssembly.instantiate(wasmBytes);
const wasm = instance.exports;

// Create a solver instance
const handle = wasm.rvf_solver_create();
console.log('Solver handle:', handle); // 1

// Train on 500 puzzles (difficulty 1-8, seed 42)
const correct = wasm.rvf_solver_train(handle, 500, 1, 8, 42, 0);
console.log(`Training: ${correct}/500 correct`);

// Run full acceptance test (A/B/C ablation)
const passed = wasm.rvf_solver_acceptance(
  handle,
  100,  // holdout puzzles
  100,  // training per cycle
  5,    // cycles
  400,  // step budget
  42, 0 // seed
);
// 1 = passed, 0 = failed, -1 = error
console.log('Acceptance test:', passed === 1 ? 'PASSED' : passed === 0 ? 'FAILED' : 'ERROR');

// Read the result manifest (JSON)
const resultLen = wasm.rvf_solver_result_len(handle);
const resultPtr = wasm.rvf_solver_alloc(resultLen);
wasm.rvf_solver_result_read(handle, resultPtr);
const resultJson = new TextDecoder().decode(
  new Uint8Array(wasm.memory.buffer, resultPtr, resultLen)
);
const manifest = JSON.parse(resultJson);
console.log('Mode A accuracy:', manifest.mode_a.cycles.at(-1).accuracy);
console.log('Mode B accuracy:', manifest.mode_b.cycles.at(-1).accuracy);
console.log('Mode C accuracy:', manifest.mode_c.cycles.at(-1).accuracy);
wasm.rvf_solver_free(resultPtr, resultLen);

// Read policy state (Thompson Sampling internals)
const policyLen = wasm.rvf_solver_policy_len(handle);
const policyPtr = wasm.rvf_solver_alloc(policyLen);
wasm.rvf_solver_policy_read(handle, policyPtr);
const policyJson = new TextDecoder().decode(
  new Uint8Array(wasm.memory.buffer, policyPtr, policyLen)
);
const policy = JSON.parse(policyJson);
console.log('Context buckets:', Object.keys(policy.context_stats).length);
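const earlyCommitRate = policy.early_commits_total > 0
  ? (policy.early_commits_wrong / policy.early_commits_total * 100).toFixed(1) + '%'
  : 'n/a'; // avoids NaN when no early commits were recorded
console.log('Early commit rate:', earlyCommitRate);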
wasm.rvf_solver_free(policyPtr, policyLen);

// Read witness chain (verifiable by rvf-wasm)
const witnessLen = wasm.rvf_solver_witness_len(handle);
const witnessPtr = wasm.rvf_solver_alloc(witnessLen);
wasm.rvf_solver_witness_read(handle, witnessPtr);
const witnessChain = new Uint8Array(
  wasm.memory.buffer, witnessPtr, witnessLen
).slice(); // copy out of WASM memory
console.log('Witness entries:', witnessLen / 73);
wasm.rvf_solver_free(witnessPtr, witnessLen);

// Clean up
wasm.rvf_solver_destroy(handle);
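
Since the witness chain is a flat byte array of fixed-size entries, a host may want to handle it entry by entry. The sketch below only splits the chain at the documented 73-byte stride. It does not decode the fields inside an entry, because their layout is not described in this README. `splitWitnessEntries` is a hypothetical helper name.

```javascript
const WITNESS_ENTRY_SIZE = 73;  // bytes per entry, as documented for rvf_solver_witness_len

// Split a copied-out witness chain into per-entry views (no copying).
function splitWitnessEntries(chain, entrySize = WITNESS_ENTRY_SIZE) {
  if (chain.length % entrySize !== 0) {
    throw new Error(`chain length ${chain.length} is not a multiple of ${entrySize}`);
  }
  const entries = [];
  for (let offset = 0; offset < chain.length; offset += entrySize) {
    entries.push(chain.subarray(offset, offset + entrySize));
  }
  return entries;
}
```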

Verify Witness Chain with rvf-wasm

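// Load both WASM modules. solverWasmBytes and rvfWasmBytes hold the raw bytes of the
// rvf-solver-wasm and rvf-wasm binaries, loaded as in the previous example.
const solver = await WebAssembly.instantiate(solverWasmBytes);
const verifier = await WebAssembly.instantiate(rvfWasmBytes);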

// Run acceptance test in solver
const handle = solver.instance.exports.rvf_solver_create();
solver.instance.exports.rvf_solver_acceptance(handle, 100, 100, 5, 400, 42, 0);

// Extract witness chain
const wLen = solver.instance.exports.rvf_solver_witness_len(handle);
const wPtr = solver.instance.exports.rvf_solver_alloc(wLen);
solver.instance.exports.rvf_solver_witness_read(handle, wPtr);
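const chain = new Uint8Array(solver.instance.exports.memory.buffer, wPtr, wLen).slice();
solver.instance.exports.rvf_solver_free(wPtr, wLen); // chain is a copy, so the WASM buffer can be released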

// Copy into verifier memory and verify
const vPtr = verifier.instance.exports.rvf_alloc(wLen);
new Uint8Array(verifier.instance.exports.memory.buffer, vPtr, wLen).set(chain);
const entryCount = verifier.instance.exports.rvf_witness_verify(vPtr, wLen);

if (entryCount > 0) {
  console.log(`Witness chain verified: ${entryCount} entries`);
} else {
  console.error('Witness chain verification failed:', entryCount);
  // -2 = truncated, -3 = hash mismatch
}

verifier.instance.exports.rvf_free(vPtr, wLen);
solver.instance.exports.rvf_solver_destroy(handle);

Module Structure

crates/rvf/rvf-solver-wasm/
├── Cargo.toml           # no_std + alloc, dlmalloc, libm, serde_json
├── README.md            # This file
└── src/
    ├── lib.rs           # 12 WASM exports, instance registry, panic handler
    ├── alloc_setup.rs   # dlmalloc global allocator, rvf_solver_alloc/free
    ├── types.rs         # Date arithmetic, Constraint, Puzzle, Rng64
    ├── policy.rs        # PolicyKernel, Thompson Sampling, KnowledgeCompiler
    └── engine.rs        # AdaptiveSolver, ReasoningBank, PuzzleGenerator, acceptance test

File	Lines	Purpose
`types.rs`	239	Pure-integer date math (Howard Hinnant algorithm), 10 constraint types, puzzle checking, xorshift64 RNG
`policy.rs`	505	Thompson Sampling two-signal model, Marsaglia gamma sampling, 18 context buckets, KnowledgeCompiler signature cache
`engine.rs`	690	Three-loop solver, constraint propagation, ReasoningBank trajectory tracking, PuzzleGenerator, acceptance test runner
`lib.rs`	396	12 C ABI WASM exports, handle-based registry (8 slots), SHAKE-256 witness chain, panic handler
`alloc_setup.rs`	45	dlmalloc global allocator, `rvf_solver_alloc`/`rvf_solver_free` interop

Temporal Constraint Types

The solver handles 10 constraint types for temporal puzzle solving:

Constraint	Example	Description
`Exact(date)`	`2025-03-15`	Must be this exact date
`After(date)`	`> 2025-01-01`	Must be strictly after date
`Before(date)`	`< 2025-12-31`	Must be strictly before date
`Between(a, b)`	`2025-01-01..2025-06-30`	Must fall within range (inclusive)
`DayOfWeek(w)`	`Monday`	Must fall on this weekday
`DaysAfter(ref, n)`	`5 days after "meeting"`	Relative to named reference date
`DaysBefore(ref, n)`	`3 days before "deadline"`	Relative to named reference date
`InMonth(m)`	`March`	Must be in this month
`InYear(y)`	`2025`	Must be in this year
`DayOfMonth(d)`	`15th`	Must be this day of month
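
To make the semantics in the table concrete, here is a hedged JavaScript sketch of a checker for the ten constraint types. The object shapes (`type`, `date`, `a`, `b`, `weekday`, `ref`, `n`, `month`, `year`, `day`) are invented for this example, and `DaysAfter`/`DaysBefore` are read as "exactly n days from the named reference", which is an interpretation rather than something the table states.

```javascript
const DAY_MS = 86400000;
const WEEKDAYS = ['Sunday', 'Monday', 'Tuesday', 'Wednesday', 'Thursday', 'Friday', 'Saturday'];

// Whole days since 1970-01-01 for { y, m, d } (intended for years >= 100).
const toDays = ({ y, m, d }) => Date.UTC(y, m - 1, d) / DAY_MS;

// Does `date` satisfy constraint `c`? `refs` maps reference names to dates.
function satisfies(date, c, refs = {}) {
  const n = toDays(date);
  switch (c.type) {
    case 'Exact':      return n === toDays(c.date);
    case 'After':      return n > toDays(c.date);                      // strict
    case 'Before':     return n < toDays(c.date);                      // strict
    case 'Between':    return n >= toDays(c.a) && n <= toDays(c.b);    // inclusive
    case 'DayOfWeek':  return WEEKDAYS[new Date(n * DAY_MS).getUTCDay()] === c.weekday;
    case 'DaysAfter':  return n === toDays(refs[c.ref]) + c.n;
    case 'DaysBefore': return n === toDays(refs[c.ref]) - c.n;
    case 'InMonth':    return date.m === c.month;
    case 'InYear':     return date.y === c.year;
    case 'DayOfMonth': return date.d === c.day;
    default: throw new Error(`unknown constraint type: ${c.type}`);
  }
}

// A candidate date solves a puzzle when every constraint holds.
const satisfiesAll = (date, constraints, refs) =>
  constraints.every((c) => satisfies(date, c, refs));
```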

Thompson Sampling Details

Two-Signal Model

Each skip-mode arm (None, Weekday, Hybrid) maintains two signals per context bucket:

Signal	Distribution	Update Rule
Safety	Beta(alpha, beta)	alpha += 1 on correct & no early-commit; beta += 1 on failure, beta += 1.5 on early-commit wrong
Cost	EMA (alpha = 0.1)	Normalized step count (steps / 200), exponentially weighted

Composite score: sample_beta(alpha, beta) - 0.3 * cost_ema
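
The two-signal model and the composite score can be sketched as follows. This is an illustrative JavaScript rendering, not the crate's code. It assumes a Beta(1, 1) prior, a cost EMA starting at 0, that a correct answer reached with an early commit leaves the safety counts unchanged, and that the 1.5 penalty replaces the ordinary failure increment rather than adding to it. Beta draws use Marsaglia-Tsang gamma sampling with Box-Muller normals, matching the functions the README lists for `libm`.

```javascript
// Standard normal via Box-Muller; rand() returns uniforms in [0, 1).
function sampleNormal(rand) {
  const u1 = 1 - rand();  // in (0, 1], keeps Math.log finite
  return Math.sqrt(-2 * Math.log(u1)) * Math.cos(2 * Math.PI * rand());
}

// Gamma(shape, 1) via Marsaglia-Tsang (squeeze step omitted for brevity).
function sampleGamma(shape, rand) {
  if (shape < 1) {  // boost small shapes: Gamma(a) = Gamma(a + 1) * U^(1/a)
    return sampleGamma(shape + 1, rand) * Math.pow(1 - rand(), 1 / shape);
  }
  const d = shape - 1 / 3;
  const c = 1 / Math.sqrt(9 * d);
  for (;;) {
    const x = sampleNormal(rand);
    const v = Math.pow(1 + c * x, 3);
    if (v <= 0) continue;
    const u = 1 - rand();
    if (Math.log(u) < 0.5 * x * x + d * (1 - v + Math.log(v))) return d * v;
  }
}

// Beta(alpha, beta) as a ratio of two gamma draws.
function sampleBeta(alpha, beta, rand) {
  const x = sampleGamma(alpha, rand);
  const y = sampleGamma(beta, rand);
  return x / (x + y);
}

const newArm = () => ({ alpha: 1, beta: 1, costEma: 0 });  // assumed starting state

// Apply the update rules from the table to one arm after a puzzle.
function updateArm(arm, { correct, earlyCommit, steps }) {
  if (correct && !earlyCommit) arm.alpha += 1;
  else if (!correct && earlyCommit) arm.beta += 1.5;  // early commit that was wrong
  else if (!correct) arm.beta += 1;                   // ordinary failure
  arm.costEma = 0.9 * arm.costEma + 0.1 * (steps / 200);  // EMA with alpha = 0.1
}

// Composite score: sample_beta(alpha, beta) - 0.3 * cost_ema.
const compositeScore = (arm, rand) =>
  sampleBeta(arm.alpha, arm.beta, rand) - 0.3 * arm.costEma;

// Thompson Sampling step: draw a score per arm and pick the largest.
function selectArm(arms, rand = Math.random) {
  let best = null;
  let bestScore = -Infinity;
  for (const [name, arm] of Object.entries(arms)) {
    const score = compositeScore(arm, rand);
    if (score > bestScore) { best = name; bestScore = score; }
  }
  return best;
}
```

A host-side simulation would keep one `{ None, Weekday, Hybrid }` set of arms per context bucket and call `selectArm` before each puzzle, then `updateArm` afterwards.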

Context Bucketing

Dimension	Levels	Thresholds
Range	small, medium, large	0-60, 61-180, 181+ days
Distractors	clean, some, heavy	0, 1, 2+ duplicate constraint types
Noise	clean, noisy	Whether puzzle has injected noise

Total: 3 x 3 x 2 = 18 independent bandit contexts
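
The thresholds above translate directly into a bucketing function. The sketch below is illustrative only: `contextBucket` and the `range:distractors:noise` key format are assumptions, not the crate's representation.

```javascript
// Map puzzle features to one of the 18 context buckets.
function contextBucket(rangeDays, duplicateTypes, noisy) {
  const range = rangeDays <= 60 ? 'small' : rangeDays <= 180 ? 'medium' : 'large';
  const distractors = duplicateTypes === 0 ? 'clean' : duplicateTypes === 1 ? 'some' : 'heavy';
  const noise = noisy ? 'noisy' : 'clean';
  return `${range}:${distractors}:${noise}`;
}

// Enumerate one representative per level to confirm the bucket count.
const allBuckets = new Set();
for (const days of [30, 120, 365]) {
  for (const dups of [0, 1, 2]) {
    for (const noisy of [false, true]) {
      allBuckets.add(contextBucket(days, dups, noisy));
    }
  }
}
```

Each key would index its own set of bandit arms, so learning in one bucket does not leak into another.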

Speculative Dual-Path

When the top-2 arms are within delta 0.15 of each other and the leading arm's variance exceeds 0.02, the solver speculatively executes the secondary arm. This accelerates convergence in uncertain contexts.
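
The trigger condition can be written down compactly. In the sketch below, "within delta 0.15" is taken to mean that the gap between the two highest scores is strictly below 0.15, and "variance" is taken to be the variance of the leading arm's Beta posterior. Both readings are assumptions, as is the `shouldSpeculate` name.

```javascript
// Variance of a Beta(alpha, beta) distribution.
function betaVariance(alpha, beta) {
  const n = alpha + beta;
  return (alpha * beta) / (n * n * (n + 1));
}

// arms: [{ name, score, alpha, beta }]. Returns the name of the secondary arm
// to run speculatively, or null when no speculation is warranted.
function shouldSpeculate(arms, delta = 0.15, minVariance = 0.02) {
  if (arms.length < 2) return null;
  const [first, second] = [...arms].sort((a, b) => b.score - a.score);
  const closeRace = first.score - second.score < delta;
  const uncertain = betaVariance(first.alpha, first.beta) > minVariance;
  return closeRace && uncertain ? second.name : null;
}
```

A fresh Beta(1, 1) arm has variance 1/12 (about 0.083), so speculation is likely early on and fades as the leading arm accumulates observations.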

Integration with RVF Ecosystem

┌──────────────────────┐         ┌──────────────────────┐
│   rvf-solver-wasm    │         │     rvf-wasm         │
│   (self-learning     │ ──────▶ │   (verification)     │
│    AGI engine)       │ witness │                      │
│                      │ chain   │ rvf_witness_verify   │
│ rvf_solver_train     │         │ rvf_witness_count    │
│ rvf_solver_acceptance│         │                      │
│ rvf_solver_witness_* │         │ rvf_store_*          │
└──────────┬───────────┘         └──────────────────────┘
           │ uses
    ┌──────▼──────┐
    │  rvf-crypto  │
    │  SHAKE-256   │
    │  witness     │
    │  chain       │
    └─────────────┘

rvf-solver-wasm produces witness chains via rvf-crypto::create_witness_chain
rvf-wasm verifies those chains via rvf_witness_verify (73 bytes per entry)
Both modules run in the browser -- no backend required

Dependencies

Crate	Version	Purpose
`rvf-types`	0.1.0	Shared RVF type definitions
`rvf-crypto`	0.1.0	SHAKE-256 hashing and witness chain creation
`dlmalloc`	0.2	Global allocator for WASM heap
`libm`	0.2	`no_std` float math (`sqrt`, `log`, `cos`, `pow`)
`serde`	1.0	Serialization (no_std, alloc features)
`serde_json`	1.0	JSON output for result/policy manifests (no_std, alloc)

Determinism

Given identical seeds, the WASM module produces identical results:

Same seed produces same puzzles (xorshift64 RNG)
Same puzzles produce same learning trajectory
Same trajectory produces same witness chain hashes

This guarantee applies to repeated runs of the same build. Between native and WASM builds, minor float precision differences (libm versus std f64 methods) may cause Thompson Sampling trajectories to diverge over many iterations, although acceptance test outcomes are expected to agree.
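
The first link in that chain, same seed to same puzzles, rests on the RNG being a pure function of its state. The sketch below shows a 64-bit xorshift generator in JavaScript using `BigInt`. The shift triple (13, 7, 17) is Marsaglia's commonly used one and is an assumption here, as is the handling of a zero seed; the crate's `Rng64` may use different constants.

```javascript
const MASK64 = (1n << 64n) - 1n;

// Returns a function that yields successive 64-bit values as BigInt.
function xorshift64(seed) {
  let state = BigInt.asUintN(64, BigInt(seed));
  if (state === 0n) state = 1n;  // an all-zero state would never change (assumed handling)
  return () => {
    state ^= (state << 13n) & MASK64;  // mask keeps the state within 64 bits
    state ^= state >> 7n;
    state ^= (state << 17n) & MASK64;
    return state;
  };
}

// Two generators with the same seed produce the same sequence.
const rngA = xorshift64(42n);
const rngB = xorshift64(42n);
const sameSequence = Array.from({ length: 5 }, () => rngA() === rngB()).every(Boolean);
```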

Benchmarks

Run the native reference benchmark:

cargo run --bin wasm-solver-bench -- --holdout 50 --training 50 --cycles 3

Reference results (native):

Mode	Accuracy	Cost/Solve	Noise Accuracy	Pass
A (baseline)	100%	~43	~100%	PASS
B (compiler)	100%	~10	~100%	PASS
C (learned)	100%	~10	~100%	PASS

B vs A cost decrease: ~76% (threshold: >= 15%)
Thompson Sampling converges across 13+ context buckets with 3 unique skip modes
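
The B versus A figure can be reproduced from the table. With the rounded costs shown (about 43 and about 10 steps per solve) the relative decrease comes out near 0.77, in line with the roughly 76% reported and well above the 0.15 threshold. The helper names below are illustrative.

```javascript
// Relative cost decrease of a candidate mode against a baseline mode.
function costDecrease(baselineCost, candidateCost) {
  return (baselineCost - candidateCost) / baselineCost;
}

// Mode B passes its gate when it is at least 15% cheaper than Mode A.
function compilerGatePasses(costA, costB, threshold = 0.15) {
  return costDecrease(costA, costB) >= threshold;
}

const decrease = costDecrease(43, 10);  // rounded table values for A and B
```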

Related ADRs

ADR-032 -- RVF WASM integration
ADR-037 -- Publishable RVF acceptance test
ADR-038 -- npx/rvlite witness verification
ADR-039 -- RVF solver WASM AGI integration (this crate)

License

MIT OR Apache-2.0