knowledge: neurosymbolic + RLVR + causal learning-layer curriculum (v1) — 8 papers, 5-PR roadmap, governance only by AdaWorldAPI · Pull Request #373 · AdaWorldAPI/lance-graph

AdaWorldAPI · 2026-05-15T19:13:19Z

Summary

8-paper curriculum + 5-PR roadmap for the stack's missing self-improvement loop. No code changes — plan + board hygiene per CLAUDE.md Mandatory Board-Hygiene Rule (new integration plan → INTEGRATION_PLANS.md PREPEND + .claude/knowledge/<name>-v<N>.md + EPIPHANIES.md PREPEND).

What this composes

The substrate landed in PR #372 (causaledge64-mailbox-rename-soa-v1) — AriGraph SPO-G + CausalEdge64 v2 + Σ-tier router + MailboxSoA. What this curriculum adds is the learning loop on top of that substrate — the verbs that turn the existing Think struct into a system that trains itself.

8 papers in 4 reading tiers (~6 hours total):

| Tier | Papers | Stack verb |
| --- | --- | --- |
| 0 (Doctrinal) | Causal de Finetti · Executable CFG | ICM grouping, Pearl 2³ trainable |
| 1 (Method) | LINC · GRPO | Σ9-Σ10 prover dispatch, RLVR algorithm |
| 2 (Generation) | LPN · TextGrad · Opt-Sym | StyleVector test-time grad, hybrid prompt opt, symbolic data gen |
| 3 (Safety) | Conformal CFG | Calibrated bounds for L4 outputs |

Stack mapping (5 live, 4 missing)

Live (PR #372): AriGraph SPO-G quads · StyleVectors · Σ9-Σ10 → L4 shell · MUL gate · Pearl 2³ in NarsEngine

Missing (this curriculum's PR roadmap fills): NARS Intervention/Counterfactual verbs · ICM-invariance column · TextGrad-style style_synthesize · GRPO trainer · LINC bridge + conformal CFG

5-PR sequencing

| # | Scope | LOC | Risk |
| --- | --- | --- | --- |
| PR-LL-1 | NARS Intervention/Counterfactual verbs + `AriGraph::intervene_on` | ~200 | Low |
| PR-LL-2 | ICM-invariance column + `lance-graph-planner::data_gen` (Opt-Sym generator) | ~800 | Med |
| PR-LL-3 | Hybrid TextGrad/LPN `style_synthesize` | ~400 | Med |
| PR-LL-4 | `crates/lance-graph-trainer/` (GRPO loop) | ~800 | High |
| PR-LL-5 | `crates/linc-bridge/` (Z3 prover + conformal CFG) | ~600 | Med |

The PRs land sequentially. PR-LL-4 needs about 2 weeks of separate prep work for the Qwen3-head-via-candle wiring.
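To make the PR-LL-1 row concrete: an intervention in Pearl's sense is usually modelled as graph surgery, where `do(X)` removes every edge pointing into `X` and leaves the rest of the graph intact. The sketch below shows that operation on a toy edge list. `CausalGraph`, `Edge`, `edge`, and `parents` are hypothetical names for illustration only; the real `AriGraph::intervene_on` signature is not specified in this PR and may differ.

```rust
/// Hypothetical directed causal edge (illustration only, not the stack's SPO-G quad).
#[derive(Clone, Debug, PartialEq)]
pub struct Edge {
    pub cause: String,
    pub effect: String,
}

/// Hypothetical toy graph; stands in for AriGraph in this sketch.
pub struct CausalGraph {
    pub edges: Vec<Edge>,
}

/// Convenience constructor for an edge.
pub fn edge(cause: &str, effect: &str) -> Edge {
    Edge { cause: cause.to_string(), effect: effect.to_string() }
}

impl CausalGraph {
    /// Graph surgery for do(target): return a copy in which every edge
    /// pointing INTO `target` is removed. Outgoing edges are kept, so
    /// downstream effects of the intervened node are still reachable.
    pub fn intervene_on(&self, target: &str) -> CausalGraph {
        CausalGraph {
            edges: self
                .edges
                .iter()
                .filter(|e| e.effect.as_str() != target)
                .cloned()
                .collect(),
        }
    }

    /// Direct causes of `node` in this graph.
    pub fn parents(&self, node: &str) -> Vec<&str> {
        self.edges
            .iter()
            .filter(|e| e.effect.as_str() == node)
            .map(|e| e.cause.as_str())
            .collect()
    }
}
```

Returning a new graph rather than mutating in place keeps the observational graph available for comparison with the intervened one, which is one plausible design; the source does not say which the stack will choose.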

Doctrinal claims worth flagging

The stack's NARS verifier is strictly stronger than Opt-Sym's LLM verifier: graded confidence ∈ [0,1] is a better GRPO reward signal than binary commit/reject, so the stack gets a quality improvement over the published method at no extra cost.
The stack's StyleVectors are already an LPN-style continuous latent space; the missing piece is the gradient-at-inference operator, not the representation.
The MUL gate is already the LINC dispatch shape: LINC fills the "what does L4 actually compute" slot in the existing Σ9-Σ10 escalation path.
Conformal CFG uses Jirak bounds, not Berry-Esseen: counterfactual rollouts share a latent abduction step, so they are correlated, and classical bounds that assume independence underestimate the variance. The stack's I-NOISE-FLOOR-JIRAK iron rule already encodes this.
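One way to see why a graded reward can carry more signal than a binary one is to look at the group-relative advantage used in GRPO, where each sampled output's reward is centred on the group mean and scaled by the group standard deviation. The sketch below is a minimal illustration under assumptions: `group_relative_advantages`, `binarize`, the `eps` guard, and the threshold are hypothetical and not taken from the planned `lance-graph-trainer` crate.

```rust
/// Group-relative advantages: (r_i - mean) / (std + eps) over one sampled group.
/// `eps` is an assumed guard against a zero-variance group.
pub fn group_relative_advantages(rewards: &[f64], eps: f64) -> Vec<f64> {
    if rewards.is_empty() {
        return Vec::new();
    }
    let n = rewards.len() as f64;
    let mean = rewards.iter().sum::<f64>() / n;
    // Population variance of the group.
    let var = rewards.iter().map(|r| (r - mean).powi(2)).sum::<f64>() / n;
    let std = var.sqrt();
    rewards.iter().map(|r| (r - mean) / (std + eps)).collect()
}

/// Hypothetical binary verifier: commit (1.0) when the confidence clears
/// `threshold`, reject (0.0) otherwise.
pub fn binarize(confidences: &[f64], threshold: f64) -> Vec<f64> {
    confidences
        .iter()
        .map(|&c| if c >= threshold { 1.0 } else { 0.0 })
        .collect()
}
```

For a group with confidences `[0.9, 0.7, 0.6]` and a commit threshold of 0.5, the binarized rewards are all 1.0, so every advantage is zero and the group contributes no gradient. The graded rewards still rank the three outputs.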

6 Open Questions (ratify before sprint fan-out)

OQ-LL-1 reward shape (graded NARS confidence vs binary) — recommendation: graded
OQ-LL-2 TextGrad optimizer location (local Qwen3 vs frontier API) — recommendation: local with frontier fallback
OQ-LL-3 prover choice (Z3 vs Prover9 vs HOL Light) — recommendation: Z3 default, HOL Light for verified-code consumers
OQ-LL-4 style-pool location (contract vs separate learned_styles) — recommendation: separate pool, contract exposes StylePoolProvider trait
OQ-LL-5 ICM-invariance update protocol — recommendation: clear bit on counterfactual contradiction
OQ-LL-6 Σ-tier-as-difficulty probe (hot-path latency)
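The OQ-LL-5 recommendation (clear the bit on counterfactual contradiction) can be sketched as a one-bit-per-row column whose bits start set and only ever clear. This is a hypothetical illustration: `InvarianceBits` and its methods are invented for this sketch, it is not the planned `IcmInvarianceColumn`, and it does not model the `CollapseGate::bundle` write path that rule I1 requires. The batched update only gestures at that gating.

```rust
/// Hypothetical bit-packed column: one invariance bit per row.
pub struct InvarianceBits {
    words: Vec<u64>,
    rows: usize,
}

impl InvarianceBits {
    /// Every row starts as "assumed invariant" (bit set).
    pub fn new_all_set(rows: usize) -> Self {
        let mut words = vec![u64::MAX; (rows + 63) / 64];
        // Mask off unused bits in the last word so counts stay exact.
        let tail = rows % 64;
        if tail != 0 {
            if let Some(last) = words.last_mut() {
                *last = (1u64 << tail) - 1;
            }
        }
        Self { words, rows }
    }

    /// True if `row` exists and is still marked invariant.
    pub fn is_invariant(&self, row: usize) -> bool {
        row < self.rows && (self.words[row / 64] >> (row % 64)) & 1 == 1
    }

    /// Clear the bit for every row whose counterfactual contradicted the
    /// invariance assumption. Out-of-range and already-cleared rows are
    /// ignored. Returns how many bits were newly cleared.
    pub fn clear_contradicted(&mut self, contradicted: &[usize]) -> usize {
        let mut cleared = 0;
        for &row in contradicted {
            if row >= self.rows {
                continue;
            }
            let mask = 1u64 << (row % 64);
            if self.words[row / 64] & mask != 0 {
                self.words[row / 64] &= !mask;
                cleared += 1;
            }
        }
        cleared
    }

    /// Number of rows still marked invariant.
    pub fn count_invariant(&self) -> usize {
        self.words.iter().map(|w| w.count_ones() as usize).sum()
    }
}
```

Because bits only clear, the update is monotone: applying the same batch twice is a no-op. Whether a cleared bit can ever be restored is part of what OQ-LL-5 leaves open.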

Iron rule audit (all 6 satisfied)

| Rule | Status |
| --- | --- |
| I-SUBSTRATE-MARKOV | All synthesized trajectories pass Chapman-Kolmogorov test |
| I-NOISE-FLOOR-JIRAK | Conformal calibration uses Jirak bounds, not Berry-Esseen |
| I-VSA-IDENTITIES | `style_synthesize` produces identity fingerprints, not content |
| I1 BindSpace read-only | `IcmInvarianceColumn` writes via `CollapseGate::bundle` |
| Method-on-carrier | All 4 new capabilities are methods on existing carriers |
| AGI-as-glove SoA | New styles land in `StyleColumn` extension, no new layer |
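For the I-SUBSTRATE-MARKOV row, the Chapman-Kolmogorov condition for a time-homogeneous Markov chain says the two-step transition matrix must equal the one-step matrix multiplied by itself. A minimal check, assuming square row-stochastic matrices of equal size estimated from trajectories, might look like the sketch below. The function names and the tolerance a caller would apply are assumptions; the PR does not describe how the stack's verify step is implemented.

```rust
/// Plain dense matrix product; assumes `a` has as many columns as `b` has rows.
pub fn mat_mul(a: &[Vec<f64>], b: &[Vec<f64>]) -> Vec<Vec<f64>> {
    let n = a.len();
    let m = b.first().map_or(0, |row| row.len());
    let mut out = vec![vec![0.0; m]; n];
    for i in 0..n {
        for k in 0..b.len() {
            for j in 0..m {
                out[i][j] += a[i][k] * b[k][j];
            }
        }
    }
    out
}

/// Largest absolute entry-wise difference between the observed two-step
/// matrix and the one predicted by Chapman-Kolmogorov (`one_step` squared).
/// A value near zero is consistent with the Markov property at lag 2.
pub fn chapman_kolmogorov_gap(one_step: &[Vec<f64>], two_step: &[Vec<f64>]) -> f64 {
    let predicted = mat_mul(one_step, one_step);
    predicted
        .iter()
        .zip(two_step)
        .flat_map(|(p, q)| p.iter().zip(q).map(|(x, y)| (x - y).abs()))
        .fold(0.0, f64::max)
}
```

A caller would compare the returned gap against a tolerance chosen for the sample size. A lag-2 check is a necessary condition only and does not rule out longer-range memory.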

Blast radius

2 new crates: lance-graph-trainer + linc-bridge (~1400 LOC)
3 crates modified: lance-graph-planner, causal-edge, lance-graph-contract
Zone 3 surface UNCHANGED
ndarray side UNCHANGED (curriculum stays thinking-side of the doctrinal split)
External deps gated: z3-rs (PR-LL-5), candle/burn (PR-LL-4)

Files changed

NEW: .claude/knowledge/neurosymbolic-rlvr-causal-curriculum-v1.md (~620 lines, 12 sections)
PREPEND: .claude/board/EPIPHANIES.md (E-LL-CURRICULUM-1)
PREPEND: .claude/board/INTEGRATION_PLANS.md (plan-index entry)

3 files changed, +617 lines. Zero code.

Test plan

Knowledge doc renders cleanly (markdown)
EPIPHANIES.md PREPEND format matches existing entries (append-only governance preserved)
INTEGRATION_PLANS.md PREPEND format matches existing entries
No code in this PR — nothing to compile or test
Reviewer ratifies §7 OQ-LL-1 through OQ-LL-6 before sprint-11 worker fan-out
Reviewer confirms §5 stack-substrate alignment table accurately reflects current state after PR #372 (specs(sprint-10): 12-worker CCA2A fleet + meta-review (governance))

What this PR does NOT touch

No code. Zero .rs / Cargo.toml changes.
No new crates. The PR-LL-* sub-PRs describe future work, including the two new crates; this PR creates none of it.
No ndarray changes. Curriculum is doctrinally thinking-side; ndarray stays as hardware substrate.
PR-LL-1 through PR-LL-5 implementation workers — owned by the next sprint kick-off after this curriculum merges and OQs are ratified.
Pre-training of foundation models — outside scope. Stack uses Jina v5 / Qwen3 / ModernBERT / etc. as frozen encoders; PR-LL-4 fine-tunes a head, not the foundation.

Predecessor

PR #372 (causaledge64-mailbox-rename-soa-v1) — the substrate this curriculum builds the learning loop on top of. Without PR #372's SPO-G quads, MailboxSoA, and Σ-tier router, the curriculum has nothing to wire into.

Generated by Claude Code

8-paper synthesis composing Schölkopf SCMs + MIT BPL + LINC + GRPO into a 5-PR roadmap for the stack's missing self-improvement loop. Governance only: no code changes. Doc + 2 board PREPENDs per CLAUDE.md Mandatory Board-Hygiene Rule.

## What this composes

Eight papers, four tiers, ~6 hours total reading load:

Tier 0 (doctrinal frame):
- Causal de Finetti (Guo+Schölkopf 2022, arXiv:2203.15756): ICM principle, AriGraph SPO-G grouping doctrine
- Executable Counterfactuals (Vashishtha 2025, arXiv:2510.01539): Pearl 2³ trainable verbs, RL>SFT for OOD

Tier 1 (method substrate):
- LINC (Olausson+Solar-Lezama+Tenenbaum 2023, arXiv:2310.15164): Σ9-Σ10 classical-prover dispatch
- GRPO/DeepSeekMath (Shao 2024, arXiv:2402.03300): RLVR algorithm spec

Tier 2 (closed-loop generation):
- LPN (Bonnet 2024, arXiv:2411.08706): StyleVectors test-time gradient
- TextGrad (Yuksekgonul 2024, ~arXiv:2406.07496): textual-gradient prompt optimizer
- Opt-Sym (Yeo+Solar-Lezama 2026, ID pending): symbolic-space adaptive data generation

Tier 3 (safety/calibration):
- Conformal CFG (Farzaneh 2026, arXiv:2601.20090): calibrated bounds for L4 planner outputs

## Stack mapping (5 live components, 4 missing modules)

Live: AriGraph SPO-G quads (PR #372) · StyleVectors (cache::triple_model) · Σ9-Σ10 → L4 dispatch shell · MUL gate · Pearl 2³ in nars_engine

Missing: NARS Intervention/Counterfactual verbs · ICM-invariance column · TextGrad-style style_synthesize · GRPO trainer · LINC bridge

## 5-PR roadmap (each ~200-800 LOC, sequential, PR-LL-1..LL-5)

PR-LL-1: NARS Intervention/Counterfactual InferenceType variants + AriGraph::intervene_on. Closes the Pearl 2³ dispatch gap.

PR-LL-2: ICM-invariance BindSpace column (1 bit/row, gated through CollapseGate per I1) + lance-graph-planner::data_gen module (Opt-Sym generator targeting Σ-tier as difficulty axis, with NARS+Chapman-Kolmogorov deterministic verification stronger than Opt-Sym's LLM verifier).

PR-LL-3: Hybrid TextGrad/LPN style_synthesize: numerical autograd on StyleVector + textual gradient on rendering prompt. Closes THINKING_ORCHESTRATION_WIRING.md Gap 1 (12 vs 36 ThinkingStyle).

PR-LL-4: crates/lance-graph-trainer/: GRPO loop with NARS confidence ∈ [0,1] as graded reward (strictly stronger than Opt-Sym binary). Backed by candle or burn for the Qwen3 head fine-tune. ~2 weeks separate prep work for the head-via-candle wiring.

PR-LL-5: crates/linc-bridge/: Z3 SMT prover + Farzaneh-style conformal counterfactual wrap. SMT theories (arithmetic, bitvectors, arrays) match stack queries better than pure FOL. Required for MedCare-rs / q2 high-stakes safety.

## 6 Open Questions (ratify before sprint fan-out)

- OQ-LL-1 reward shape (graded NARS confidence vs binary)
- OQ-LL-2 TextGrad optimizer location (local Qwen3 vs frontier API)
- OQ-LL-3 prover choice (Z3 vs Prover9 vs HOL Light)
- OQ-LL-4 style-pool location (contract vs separate learned_styles pool)
- OQ-LL-5 ICM-invariance update protocol (when does invariance bit clear?)
- OQ-LL-6 Σ-tier-as-difficulty probe (hot-path latency check)

## Iron rule audit (all 6 satisfied)

- I-SUBSTRATE-MARKOV: synthesized trajectories pass Chapman-Kolmogorov in PR-LL-2 verify step
- I-NOISE-FLOOR-JIRAK: PR-LL-5 conformal calibration uses Jirak bounds (not classical Berry-Esseen); counterfactual rollouts share latent abduction, classical bounds underestimate variance
- I-VSA-IDENTITIES: style_synthesize produces identity fingerprints, not content
- I1: IcmInvarianceColumn writes through CollapseGate::bundle, never raw assignment
- Method-on-carrier: all 4 new capabilities are methods on existing carriers (AriGraph, StyleVector, Student, Query)
- AGI-as-glove SoA: synthesized styles land in StyleColumn extension; no new layer

## Blast radius

- 2 new crates: lance-graph-trainer + linc-bridge (~1400 LOC total)
- 3 crates modified: lance-graph-planner, causal-edge, lance-graph-contract
- Zone 3 surface UNCHANGED
- ndarray side UNCHANGED (curriculum stays thinking-side of doctrinal split)
- External deps gated behind features: z3-rs (PR-LL-5), candle/burn (PR-LL-4)

## Files changed

- NEW: .claude/knowledge/neurosymbolic-rlvr-causal-curriculum-v1.md (~620 lines, 12 sections, 5-PR roadmap + 6 OQs + iron-rule audit)
- PREPEND: .claude/board/EPIPHANIES.md (E-LL-CURRICULUM-1)
- PREPEND: .claude/board/INTEGRATION_PLANS.md (plan-index entry)

3 files changed. Pure plan + board hygiene. No code.

## Predecessor / successor

Predecessor: PR #371/#372 (causaledge64-mailbox-rename-soa-v1) substrate. This curriculum is the learning loop on top of that substrate.

Successor: PR-LL-1 through PR-LL-5 (the curriculum is the spec for the sequential implementation wave; each PR fans out via the established CCA2A 12-worker pattern)

AdaWorldAPI merged commit 7a66513 into main May 15, 2026

This was referenced May 15, 2026

gov: retire 4 superseded orphan branches + audit-trail entry #379

Merged

plan: cognitive-substrate-convergence-v1 (i4 mantissa + gapless baton + active inference) #380

Merged

AdaWorldAPI merged 1 commit into main from claude/learning-layer-curriculum
