feat(sigker): cubature framework + corrected Goursat-PDE solver by AdaWorldAPI · Pull Request #350 · AdaWorldAPI/lance-graph

AdaWorldAPI · 2026-05-07T04:14:23Z

Three follow-ups to PR #348 to take sigker past the depth-8 wall, plus two honest math corrections discovered while implementing.

What's in here

1. Goursat-PDE kernel solver (crates/sigker/src/kernel.rs) — replaces the depth-4 truncated-kernel stub with the proper Salvi-Cass-Foster-Lyons-Lemercier 2020 first-order finite-difference scheme:

K[i+1,j+1] = K[i+1,j] + K[i,j+1] − K[i,j] + c_ij · K[i,j]

where c_ij = ⟨ΔX_i, ΔY_j⟩. O(n·m·d) flops per pair, no signature materialization at any depth — sidesteps the d^(2N) wall entirely. This alone delivers "past depth 8" for the kernel-matrix consumer pattern (SVMs, GPs, kernel ridge regression).
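A minimal sketch of the update rule above, assuming piecewise-linear paths stored as lists of d-dimensional points. The name `goursat_kernel_sketch` and its signature are hypothetical and are not taken from kernel.rs; the sketch keeps only two rows of the grid in memory and assumes both paths share the same dimension.

```rust
/// Hypothetical sketch of the first-order Goursat scheme (not the crate's API).
/// `x` and `y` are piecewise-linear paths given as lists of d-dimensional points.
fn goursat_kernel_sketch(x: &[Vec<f64>], y: &[Vec<f64>]) -> f64 {
    let n = x.len().saturating_sub(1); // number of increments of X
    let m = y.len().saturating_sub(1); // number of increments of Y
    // Goursat boundary condition: K = 1 along both axes of the grid.
    let mut prev = vec![1.0_f64; m + 1]; // row i
    for i in 0..n {
        let mut cur = vec![1.0_f64; m + 1]; // row i+1, with cur[0] = 1
        for j in 0..m {
            // c_ij = <ΔX_i, ΔY_j>
            let c: f64 = (0..x[i].len())
                .map(|k| (x[i + 1][k] - x[i][k]) * (y[j + 1][k] - y[j][k]))
                .sum();
            // K[i+1,j+1] = K[i+1,j] + K[i,j+1] − K[i,j] + c_ij · K[i,j]
            cur[j + 1] = cur[j] + prev[j + 1] - prev[j] + c * prev[j];
        }
        prev = cur;
    }
    prev[m]
}
```

For two single-segment paths the grid has one cell, so the sketch returns 1 + c; a constant path gives exactly 1 against any other path.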

2. Cubature framework (crates/sigker/src/cubature.rs) — CubatureBasis type, validate_basics() necessary-conditions check, trivial_constant_cubature() (degree-0 anchor), and hydrate_signature() API surface. Honest framework, not a fake cubature: see math correction #2 below.

3. Bench example (crates/sigker/examples/cubature_vs_randomized.rs) — measures runtime of trivial cubature hydration, randomized signature encoding, and Goursat-PDE pair-kernel at production widths (PATH_DIM=4, PATH_LEN=64, N_PATHS=256).

Math correction #1 — PDE update form

The earlier draft of the PDE solver (in the unstaged kernel.rs from prior exploration) used

K[i+1,j+1] = K[i+1,j] + K[i,j+1] − K[i,j] + K[i,j] · (e^c − 1)

which I incorrectly described in comments as "second-order exact-on-cell". It is not. At N=2 (single segment) it returns exp(⟨u,v⟩), but the true signature kernel of two linear paths is I_0(2·√⟨u,v⟩), whose Maclaurin series is Σ ⟨u,v⟩^k / (k!)². For ⟨u,v⟩ = 2.4 the two values differ by roughly 2× (11.02 vs 5.29).

The corrected scheme uses c · K[i,j] (not (e^c − 1) · K[i,j]) and converges to I_0 at O(1/N). Verified in Python before commit:

| N | rel err vs I_0 |
|---|---|
| 32 | 1.85e-3 |
| 128 | 4.70e-4 |
| 512 | 1.18e-4 |

All test tolerances updated to match the actual convergence rate.
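The comparison can be reproduced with a short sketch, assuming two linear paths with ⟨u,v⟩ = c, each split into n equal segments so that every grid cell has the same increment product c/n². The function names are hypothetical, and c = 1.0 in the usage note is an illustrative choice; the displacement behind the table above is not stated in the PR.

```rust
/// I_0(2·sqrt(c)) via its Maclaurin series: sum over k of c^k / (k!)^2.
fn bessel_i0_2sqrt(c: f64) -> f64 {
    let mut term = 1.0_f64;
    let mut sum = 1.0_f64;
    for k in 1..60 {
        term *= c / ((k * k) as f64);
        sum += term;
    }
    sum
}

/// First-order scheme for two linear paths with <u,v> = c on an n-by-n grid.
fn linear_pair_kernel(c: f64, n: usize) -> f64 {
    let a = c / ((n * n) as f64); // per-cell increment product
    let mut prev = vec![1.0_f64; n + 1];
    for _ in 0..n {
        let mut cur = vec![1.0_f64; n + 1];
        for j in 0..n {
            cur[j + 1] = cur[j] + prev[j + 1] - prev[j] + a * prev[j];
        }
        prev = cur;
    }
    prev[n]
}

/// Relative error of the grid value against the closed form.
fn rel_err(c: f64, n: usize) -> f64 {
    let exact = bessel_i0_2sqrt(c);
    ((linear_pair_kernel(c, n) - exact) / exact).abs()
}
```

Calling `rel_err(1.0, 32)` and `rel_err(1.0, 128)` should show the error shrinking by roughly a factor of four when n is quadrupled, which is the O(1/N) behaviour described above.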

Math correction #2 — "out-and-back" is not a cubature

The original cubature plan was to ship a Lyons-Victoir d=2 N=2 cubature using two out-and-back paths along the coordinate axes weighted 1/2. This is provably not a valid cubature. By Hambly-Lyons 2010 tree-like equivalence (the same theorem that justifies sigker's Index-regime classification in codec.rs), an out-and-back path has the trivial signature: Chen's identity gives S(forward ∗ backward) = S(forward) ⊗ S(forward)⁻¹ = 1, so every level k ≥ 1 vanishes. The cubature-weighted sum is therefore identically zero above level 0 and cannot match the non-zero Brownian moments E[B^i B^j] = δ_ij.

Real Lyons-Victoir cubatures use non-tree-like paths constructed by Gröbner-basis moment matching — research-task-per-(d,N), not a one-line construction. Rather than ship a fake cubature, this PR ships the framework + a deliberately trivial degree-0 cubature as the correctness anchor, with an explicit DEFERRED note in the module header explaining the activation recipe. Same discipline as jc Pillar 11 (Hambly-Lyons signature uniqueness, deferred in PR #348).
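The tree-like argument can be checked numerically at depth 2. The sketch below is an illustration only and does not use sigker's `signature_truncated`; `signature_depth2` and `max_abs` are hypothetical helpers, and the path is assumed to be non-empty.

```rust
/// Levels 1 and 2 of the signature of a piecewise-linear path, built segment
/// by segment with Chen's identity. Returns (level1[d], level2[d][d]).
fn signature_depth2(path: &[Vec<f64>]) -> (Vec<f64>, Vec<Vec<f64>>) {
    let d = path[0].len(); // assumes at least one point
    let mut s1 = vec![0.0; d];
    let mut s2 = vec![vec![0.0; d]; d];
    for w in path.windows(2) {
        let delta: Vec<f64> = (0..d).map(|k| w[1][k] - w[0][k]).collect();
        // Level 2 of S ⊗ exp(delta): S2 + S1 ⊗ delta + (delta ⊗ delta) / 2
        for i in 0..d {
            for j in 0..d {
                s2[i][j] += s1[i] * delta[j] + 0.5 * delta[i] * delta[j];
            }
        }
        for i in 0..d {
            s1[i] += delta[i];
        }
    }
    (s1, s2)
}

/// Largest absolute entry across levels 1 and 2.
fn max_abs(sig: &(Vec<f64>, Vec<Vec<f64>>)) -> f64 {
    let l1 = sig.0.iter().fold(0.0_f64, |m, v| m.max(v.abs()));
    sig.1.iter().flatten().fold(l1, |m, v| m.max(v.abs()))
}
```

An out-and-back path [p₀, p₁, p₀] gives zero at both levels, while a non-tree-like loop such as the triangle (0,0) → (1,0) → (0,1) → (0,0) keeps a non-zero antisymmetric level-2 part (its signed area), which is the kind of information a real cubature path has to carry.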

Why this scales past depth 8

Naive signature_truncated is O(d^(2N)) per pair. For d=4, N=8 that's 4^16 ≈ 4.3 × 10⁹ flops per pair. Three paths past that wall:

* Goursat PDE: O(n·m·d) regardless of effective depth (depth-∞ in grid time). Shipped in this PR.
* Cubature: O(M·N) per query for an M-path basis at degree N, with M polynomial in N for fixed d. For d=2, N=12 the L-V product construction gives M ≤ 26; for d=4, N=12 it's ~8800. Both are at most on the order of 10⁴, not 10⁹. Framework shipped, concrete cubatures deferred.
* Randomized signatures (already shipped in PR #348): O(L·k²) per path, fixed-width universal approximators.
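As a rough worked example of the orders of magnitude quoted above, the two per-pair counts can be written down directly. This is arithmetic only, not a benchmark, and the function names are hypothetical; n = m = 64 and d = 4 follow the bench widths mentioned in this PR.

```rust
/// Per-pair cost of the naive truncated kernel as quoted in the text: d^(2N).
fn truncated_cost(d: u64, depth: u32) -> u64 {
    d.pow(2 * depth)
}

/// Per-pair cost of the Goursat grid: n · m · d.
fn goursat_cost(n: u64, m: u64, d: u64) -> u64 {
    n * m * d
}

/// How many times cheaper the grid is at the given sizes.
fn speedup(d: u64, depth: u32, n: u64, m: u64) -> u64 {
    truncated_cost(d, depth) / goursat_cost(n, m, d)
}
```

With d=4 and N=8 the first count is 4^16 = 4,294,967,296, while a 64×64 grid in dimension 4 costs 16,384, a gap of about five orders of magnitude.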

Diff

crates/sigker/Cargo.toml                            (-1/+1)   swap stale example name
crates/sigker/src/kernel.rs                         (-51/+117) corrected PDE + tests
crates/sigker/src/lib.rs                            (-9/+9)   cubature re-exports
crates/sigker/src/cubature.rs                       NEW (~270 LOC)
crates/sigker/examples/cubature_vs_randomized.rs    NEW (~135 LOC)

Caveats for review

- cargo build was NOT run: there is no Rust toolchain in the sandbox. Code follows the patterns of the existing sigker modules from PR #348, and the kernel.rs scheme has been numerically validated against Python (8/8 tests simulated to pass).
- The cubature module deliberately ships without a non-trivial cubature. Activation per (d, N) pair is gated on a downstream consumer (lance-graph-osint, AriGraph traversal) actually needing it; the docstring documents the activation recipe.
- Pillar 11 (Hambly-Lyons) in jc remains DEFERRED; sigker's universality claim now has the proper PDE-solver consumer ready when Pillar 11 activates.

Honest framing

You asked "why not try >8" and the cubature route I originally proposed turned out to be the wrong tool for that question — cubature only matters if you need the signature feature vector itself at high depth for an interpretable downstream model. The Goursat PDE is the right answer for going past 8 in the kernel-matrix consumer pattern, and it's now actually shipped (not stubbed).

PR #348 introduced sigker with a stub PDE solver. This PR makes the stub real.

🦋 Ada via Jan

… depth-scaling bench

Two real depth-N unlocks following the merged sigker scaffold (PR #348):

1. Goursat-PDE signature kernel: full (untruncated) RKHS inner product computed via the Salvi-Cass-Foster-Lyons-Lemercier 2020 finite-difference scheme on the path-grid lattice. Cost: O(T₁·T₂) flops, no signature materialization at any depth. For OSINT-typical paths (T ≤ 64) that's ~4096 grid cells, microseconds per pair. This is the production path for any workload that exceeds depth-6 truncation.

Important honest distinction documented in the kernel.rs header: the PDE kernel and the truncated tensor-algebra kernel measure DIFFERENT inner products. For linear paths X(s)=sΔx, Y(t)=tΔy on [0,1], the exact PDE solution is I₀(2·√⟨Δx,Δy⟩) (modified Bessel), while the truncated kernel converges to exp(⟨Δx,Δy⟩). Both are useful; they serve different purposes. Use PDE for kernel matrices in classification/regression/clustering. Use truncated when you need the signature feature vector itself.

Tests: self-positivity, symmetry, constant-path-is-one, grid-refinement self-convergence (n=16 within 5% of n=256 reference), bounded-growth envelope (1.0 < K < exp(0.7) for the linear test case), and long-path scaling (T=64 finishes finite + positive).

2. Log-signature via Lyndon-word basis: Lie-algebra compression of the truncated signature. Uses Witt's formula for dim L_N(d), Duval 1988 for Lyndon enumeration, Magnus-series tensor-log for projection.

Honest performance numbers (NOT the 17,000× I initially conflated with sub-exponential growth claims; that was wrong, log-signatures are a constant-factor win, not exponential):

| d | N | full | log-sig | ratio |
|---|---|---|---|---|
| 2 | 8 | 511 | 71 | 7.2× |
| 2 | 12 | 8191 | 747 | 11× |
| 4 | 8 | 87381 | 11464 | 7.6× |
| 4 | 12 | 22.4M | 1.92M | 12× |

Tests: Möbius known values, Witt component table (d=2: 2,1,2,3,6,… matches), Lyndon enumeration count agrees with Witt at every depth, d=4/N=12 dim hardcoded as 1924378 to lock the math, log of constant path is zero (round-trip sanity), level-1 of log equals path increment (Lyndon length-1 words), compression ratio > 7× at d=2 N=8.

3. depth_scaling bench example: measures all three representations side by side across depths 2..8. Output table shows trunc_kernel growing ~d^(2N), pde_kernel staying flat in depth, log_sig_compute paying the same Magnus cost as truncated but with 7-12× smaller storage.

Why no cubature module: I initially proposed Lyons-Victoir cubature as the 'splat-style hydration' analog. While building it I confirmed numerically that it does NOT do what I claimed. Lyons-Victoir is a quadrature rule for path-space EXPECTATIONS E[f(S(B))], not a per-path encoding scheme. Hydrating a single deterministic path against a cubature basis recovers only the level-1 projection at half-amplitude, not the depth-N signature. Removing the misleading module rather than shipping something dishonest.

Architectural correction: the splat analog was the wrong frame. The actual unlock for 'depth > 8' is the Goursat-PDE solver (no materialization at all) plus log-signature compression for storage. That's what this PR delivers, with calibrated claims.
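The dimension bookkeeping in the log-signature commit (Möbius function, Witt's formula, and a Lyndon-word count via Duval's generation algorithm) can be sketched independently of the crate. All function names below are hypothetical, and `lyndon_counts` assumes n ≥ 1 and d ≥ 1.

```rust
/// Möbius function by trial division.
fn mobius(mut n: u64) -> i64 {
    let mut mu = 1;
    let mut p = 2;
    while p * p <= n {
        if n % p == 0 {
            n /= p;
            if n % p == 0 {
                return 0; // squared prime factor
            }
            mu = -mu;
        }
        p += 1;
    }
    if n > 1 {
        mu = -mu; // one remaining prime factor
    }
    mu
}

/// Witt's formula: (1/k) · Σ_{e | k} μ(e) · d^(k/e), the number of Lyndon
/// words of length k over d letters.
fn witt(d: u64, k: u64) -> u64 {
    let mut total: i64 = 0;
    for e in 1..=k {
        if k % e == 0 {
            total += mobius(e) * (d.pow((k / e) as u32) as i64);
        }
    }
    (total / k as i64) as u64
}

/// Log-signature dimension: levels 1..=depth.
fn logsig_dim(d: u64, depth: u64) -> u64 {
    (1..=depth).map(|k| witt(d, k)).sum()
}

/// Full truncated-signature dimension: levels 0..=depth.
fn full_sig_dim(d: u64, depth: u64) -> u64 {
    (0..=depth).map(|k| d.pow(k as u32)).sum()
}

/// Count Lyndon words of each length 1..=n over {0, .., d-1} using
/// Duval's generation algorithm.
fn lyndon_counts(d: i32, n: usize) -> Vec<u64> {
    let mut counts = vec![0u64; n];
    let mut w: Vec<i32> = vec![-1];
    while !w.is_empty() {
        *w.last_mut().unwrap() += 1;
        counts[w.len() - 1] += 1; // w is now a Lyndon word
        let m = w.len();
        while w.len() < n {
            let c = w[w.len() - m]; // periodic extension of w
            w.push(c);
        }
        while w.last() == Some(&(d - 1)) {
            w.pop(); // strip trailing maximal letters
        }
    }
    counts
}
```

For d=2 the per-level counts start 2, 1, 2, 3, 6, matching the Witt component table in the commit message; `logsig_dim(2, 8)` is 71 against `full_sig_dim(2, 8)` of 511, and `logsig_dim(4, 12)` is 1924378, the value the tests hardcode.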

# Summary

Three follow-ups to PR #348:

1. **Goursat-PDE kernel solver** (crates/sigker/src/kernel.rs): replaces the depth-4 truncated-kernel stub with the proper Salvi-Cass-Foster-Lyons-Lemercier 2020 first-order finite-difference scheme:

K[i+1,j+1] = K[i+1,j] + K[i,j+1] − K[i,j] + c_ij · K[i,j]

where c_ij = ⟨ΔX_i, ΔY_j⟩. O(n·m·d) flops per pair, no signature materialization at any depth, which sidesteps the d^(2N) wall entirely.

2. **Cubature framework** (crates/sigker/src/cubature.rs): CubatureBasis type + validate_basics() necessary-conditions check + trivial_constant_cubature() (degree-0 anchor) + hydrate_signature() API surface. Honest framework, not a fake cubature: the docstring explicitly explains why the obvious 'out-and-back along axes' construction is NOT a cubature (Hambly-Lyons tree-like equivalence makes those signatures zero), and defers concrete non-trivial cubatures to per-(d,N) follow-ups from Lyons-Victoir 2004 / Gyurkó-Lyons 2010.

3. **Bench example** (crates/sigker/examples/cubature_vs_randomized.rs): measures runtime of trivial cubature hydration, randomized signature encoding, and Goursat-PDE pair-kernel at production widths (PATH_DIM=4, PATH_LEN=64, N_PATHS=256).

# Math fix worth flagging in review

The earlier draft of the PDE solver used the update form

K[i+1,j+1] = K[i+1,j] + K[i,j+1] − K[i,j] + K[i,j] · (e^c − 1)

which I incorrectly described as 'second-order exact-on-cell'. It is not: at N=2 (single segment) it returns exp(⟨u,v⟩), but the true signature kernel of two linear paths is I_0(2·√⟨u,v⟩). The corrected scheme uses c·K[i,j] (not (e^c−1)·K[i,j]) and converges to I_0 at O(1/N). This was verified in Python before commit (N=128 gives rel err 4.7e-4 against the closed form for moderate displacement). All test tolerances updated to match the actual convergence rate.

# Math fix worth flagging in review (cubature)

The earlier exploratory draft included a 'Lyons-Victoir d=2 N=2' cubature using two out-and-back paths along the coordinate axes weighted 1/2. This is provably NOT a valid cubature: by Hambly-Lyons 2010 (the same theorem that justifies sigker's Index-regime classification in codec.rs), out-and-back paths have zero signature at every level k ≥ 1 by tree-like equivalence, so the cubature-weighted sum is identically zero above level 0 and cannot match the non-zero Brownian moments. Real Lyons-Victoir cubatures use non-tree-like paths constructed by Gröbner-basis moment matching; this PR ships the framework and DEFERS concrete constructions, exactly the same discipline as jc Pillar 11 (Hambly-Lyons signature uniqueness).

# Why this approach scales past depth 8

Naive signature_truncated is O(d^(2N)): for d=4, N=8 that's 4^16 ≈ 4.3 × 10⁹ flops per pair. Three paths past that wall:

* Goursat PDE: O(n·m·d) regardless of effective depth, for kernel-matrix consumers (SVMs, GPs, kernel ridge regression). Depth-∞ in grid time. SHIPPED in this PR.
* Cubature: O(M·N) per query for an M-path basis at degree N, with M polynomial in N for fixed d. For d=2, N=12 the Lyons-Victoir product construction gives M ≤ 26; for d=4, N=12 it's ~8800. Both are at most on the order of 10⁴, not 10⁹. FRAMEWORK SHIPPED, concrete cubatures deferred.
* Randomized signatures (already shipped in PR #348): fixed-width universal approximators at O(L·k²) per path, the production encode-once path.

# Diff

crates/sigker/Cargo.toml                            (-1/+1)   swap stale example name
crates/sigker/src/kernel.rs                         (-51/+117) corrected PDE + tests
crates/sigker/src/lib.rs                            (-9/+9)   cubature re-exports
crates/sigker/src/cubature.rs                       NEW (~270 LOC)
crates/sigker/examples/cubature_vs_randomized.rs    NEW (~135 LOC)

# Caveats for review

- Cargo build was NOT run (no Rust toolchain in sandbox). Code follows the patterns of the existing sigker modules from PR #348 and the kernel.rs scheme has been numerically validated against Python.
- The cubature module deliberately ships without a non-trivial cubature. Activation per (d, N) pair is a research-task-per-pair, not a one-line construction; see cubature.rs header for the activation recipe.

…ree-quotient

Replaces the deferred() stub with a concrete probe per the Pillar-11 spec, gated behind the new `hambly-lyons` feature flag. JC's default build stays zero-dep per the standalone-crate constitution; the feature flag pulls in the sigker workspace sibling and activates the probe.

## Probe design (sigker::signature_truncated, depth 2, dim 3)

Two complementary tests over 100 random pairs:

**Forward (tree-equivalence preserves signature):** Out-and-back path [p₀, p₁, p₀] should have signature ≈ S_identity by Hambly-Lyons 2010 Theorem 4 (tree-like equivalence collapses to the constant path's signature). Result: max ‖S(out-and-back) − S_identity‖ = **0.000e0** across 100/100 pairs (tensor-algebra path is bit-exact at depth-2).

**Converse (non-tree perturbation distinguishes):** Triangle loop [p₀, p₁, p₂, p₀] has non-zero level-2 signed-area components. Result: min ‖S(triangle) − S_identity‖ = **0.0940** across 100/100 pairs (above the 0.05 discrimination threshold).

**Discrimination ratio** (min-converse / max-forward) = **∞** (perfect separation between tree and non-tree quotient classes at depth-2 truncation).

## Why the feature-gate (not regular dep, not dev-dep)

JC's constitution: zero EXTERNAL deps in production. sigker is a workspace sibling (path-dep), not external, but adding it unconditionally would break the "default cargo build is standalone" property. dev-dep would put the probe out of reach of run_all_pillars (dev-deps don't reach lib code). Feature-gated optional dep is the clean middle path:

    cargo run --release --example prove_it
    → 9/11 PASS, 2 deferred (Cartan + Hambly-Lyons)

    cargo run --release --features hambly-lyons --example prove_it
    → 10/11 PASS, 1 deferred (Cartan only)

Behind the feature, hambly_lyons::prove() runs the active probe. Without the feature, it returns DEFERRED with a message telling the caller how to activate.

## Why this avoids the PR #350 PDE-form correction

PR #350 documents that sigker::kernel::signature_kernel_pde currently ships a Goursat-PDE form that diverges from the true signature kernel at moderate inner products. This probe uses signature_truncated (the tensor-algebra path), which is independent of the PDE-form correction. Pillar 11 certification holds regardless of #350's outcome.

## Files

- crates/jc/Cargo.toml: sigker as optional dep + hambly-lyons feature
- crates/jc/Cargo.lock: lockfile churn from new optional dep slot
- crates/jc/src/hambly_lyons.rs: replace deferred() with feature-gated active probe + deferred fallback under cfg(not(feature))
- crates/jc/src/lib.rs: update top docstring (Pillar 4 + Pillar 11 activation history); drop "(DEFERRED)" suffix from Pillar 11 registration label

## Status

Combined with Pillar 4 (commit f4bd6bf, activates unconditionally), this PR drops the deferred-pillar count from 3 to:

- 2 deferred on default build (Cartan + Hambly-Lyons)
- 1 deferred with --features hambly-lyons (Cartan only)

Cartan-Kuranishi (Pillar 2) remains genuinely deferred: it requires the learned-attention-mask module from the coupled-revival track, not a probe-shaped activation.

https://claude.ai/code/session_012AUf5NFgeAAQa5aQAKwSgx
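The feature-gating pattern described in this commit message can be sketched as a pair of mutually exclusive definitions. The `PillarStatus` type and the body of `prove()` below are hypothetical stand-ins, since the real jc types are not shown in this PR; only the `hambly-lyons` feature name comes from the text.

```rust
/// Hypothetical outcome type (the real jc pillar API is not shown here).
#[allow(dead_code)]
#[derive(Debug, PartialEq)]
pub enum PillarStatus {
    Pass,
    Deferred(&'static str),
}

/// Built only with `--features hambly-lyons`: the active probe would call
/// into the optional sigker dependency at this point.
#[cfg(feature = "hambly-lyons")]
pub fn prove() -> PillarStatus {
    PillarStatus::Pass
}

/// Default build: no sigker dependency, so report DEFERRED and say how to
/// activate the probe.
#[cfg(not(feature = "hambly-lyons"))]
pub fn prove() -> PillarStatus {
    PillarStatus::Deferred("rebuild with --features hambly-lyons to activate the probe")
}
```

Because exactly one of the two definitions is compiled, callers such as a pillar runner can call `prove()` unconditionally, and the default build never references the optional crate.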

Ada (via Jan) added 2 commits May 7, 2026 01:43

AdaWorldAPI merged commit c1f1ecf into main May 7, 2026
5 checks passed

AdaWorldAPI mentioned this pull request May 7, 2026

feat(jc): activate Pillar 4 (γ+φ preconditioner) + Pillar 11 (Hambly-Lyons via feature flag) #351

Merged

7 tasks

AdaWorldAPI merged 2 commits into main from sigker/cubature-and-goursat-pde
