D5 Trajectory + MarkovBundler + board hygiene for categorical-algebraic inference #243
Three final epiphanies before cutting code:
1. The 8-step wiring sequence (encoder migration → MarkovBundler →
Trajectory → pipeline → AriGraph commit → global context →
awareness revision → KL feedback). Dependency chain: 1→2→3→4→
{5,6,7}→8. Three PRs close the loop. "Wire first, optimize later."
2. COCA 4096 + 20K scientific + spider NER = no vocabulary blocker.
The three loop-closing PRs are the ONLY critical path.
3. The AGI test is one measurement: does coreference accuracy rise
from chapter 1 to chapter 10 of Animal Farm with no parameter
change? Rising curve = AGI. Flat curve = broken wire.
https://claude.ai/code/session_01SbYsmmbPf9YQuYbHZN52Zh
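One way to turn the "rising curve vs. flat curve" criterion in point 3 into a single number is a least-squares slope of per-chapter accuracy. The sketch below is only an illustration: `ChapterScore` and `accuracy_slope` are hypothetical names that do not appear in this PR, and using a fitted slope (rather than, say, comparing chapter 10 to chapter 1 directly) is an assumption.

```rust
/// Per-chapter coreference tally (hypothetical type, for illustration only).
#[derive(Clone, Copy)]
struct ChapterScore {
    correct: u32,
    total: u32,
}

impl ChapterScore {
    /// Fraction of coreference decisions that were correct in this chapter.
    fn accuracy(self) -> f64 {
        if self.total == 0 {
            0.0
        } else {
            self.correct as f64 / self.total as f64
        }
    }
}

/// Least-squares slope of accuracy against the 0-based chapter index.
/// Returns `None` when fewer than two chapters are supplied.
fn accuracy_slope(chapters: &[ChapterScore]) -> Option<f64> {
    let n = chapters.len();
    if n < 2 {
        return None;
    }
    let n_f = n as f64;
    let mean_x = (n_f - 1.0) / 2.0;
    let mean_y = chapters.iter().map(|c| c.accuracy()).sum::<f64>() / n_f;

    let mut cov = 0.0;
    let mut var = 0.0;
    for (i, c) in chapters.iter().enumerate() {
        let dx = i as f64 - mean_x;
        cov += dx * (c.accuracy() - mean_y);
        var += dx * dx;
    }
    Some(cov / var)
}
```

A positive slope with parameters held fixed would correspond to the rising curve; a slope near zero would correspond to the flat curve that the message reads as a broken wire.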
…wiring)

Three new modules in deepnsm under the `grammar-10k` feature gate (pulls in lance-graph-contract for RoleKey, Vsa10k, FreeEnergy):

1. `content_fp.rs`: 10K-dim content fingerprints from COCA vocab ranks via SplitMix64 spread. Deterministic, balanced (~50% bits), slack bits above dim 10000 zeroed. 5 tests.
2. `markov_bundle.rs`: MarkovBundler ring buffer of 11 Vsa10k. Each sentence: bind tokens per role key (SUBJECT/PREDICATE/OBJECT/MODIFIER/TEMPORAL), XOR-bundle into one Vsa10k. Per window: braid via vsa_permute(sentence, distance * 64) per ±5 position, XOR-superpose. WeightingKernel selects MexicanHat/Uniform/Gaussian. 8 tests including permute invertibility and braided-vs-unbraided.
3. `trajectory.rs`: Trajectory carrier (the Think struct). Holds `bundle: Vsa10k` + `global_context: Vsa10k`. Methods: role_bundle (unbind one role), mean_recovery_margin (likelihood term), ambient_similarity (global context alignment), free_energy (compose likelihood + KL), resolve (rank hypotheses by F, return Resolution). 4 tests: subject recovery, F-ordering, resolve commits best, bundled-trajectory-through-bundler.

Verification:

    cargo test --manifest-path crates/deepnsm/Cargo.toml --features grammar-10k
    # 63 passed (0 failed), all existing + 17 new

This is steps 1-3 of the 8-step wiring sequence. Steps 4-8 (pipeline, AriGraph commit, global context, awareness revision, KL feedback) close the loop in the next PRs.

https://claude.ai/code/session_01SbYsmmbPf9YQuYbHZN52Zh
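The fingerprint scheme described for `content_fp.rs` (seed SplitMix64 with the vocabulary rank, fill the vector, zero the slack bits above dimension 10000) can be sketched as follows. This is not the module's code: `content_fp_sketch`, the `[u64; 157]` layout, and the choice to seed directly with the rank are all assumptions made for illustration. Only the SplitMix64 constants are the standard published ones.

```rust
const DIM: usize = 10_000;
const WORDS: usize = (DIM + 63) / 64; // 157 words = 10,048 bits, 48 of them slack

/// Standard SplitMix64 step: advance the state and return a mixed 64-bit value.
fn splitmix64(state: &mut u64) -> u64 {
    *state = state.wrapping_add(0x9E37_79B9_7F4A_7C15);
    let mut z = *state;
    z = (z ^ (z >> 30)).wrapping_mul(0xBF58_476D_1CE4_E5B9);
    z = (z ^ (z >> 27)).wrapping_mul(0x94D0_49BB_1331_11EB);
    z ^ (z >> 31)
}

/// Hypothetical fingerprint: deterministic in `rank`, roughly half the bits set,
/// and every bit at index >= DIM cleared.
fn content_fp_sketch(rank: u32) -> [u64; WORDS] {
    let mut state = rank as u64; // assumption: the rank itself is the seed
    let mut fp = [0u64; WORDS];
    for word in fp.iter_mut() {
        *word = splitmix64(&mut state);
    }
    let used_in_last = DIM % 64; // 16 live bits in the final word
    if used_in_last != 0 {
        fp[WORDS - 1] &= (1u64 << used_in_last) - 1;
    }
    fp
}

/// Number of set bits, useful for checking the ~50% balance claim.
fn popcount(fp: &[u64; WORDS]) -> u32 {
    fp.iter().map(|w| w.count_ones()).sum()
}
```

Zeroing the slack bits matters because any Hamming-style similarity computed over whole words would otherwise count 48 bits that are not part of the 10K-dimensional space.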
Updates all governance files in a single commit per mandatory rule:

- LATEST_STATE.md: updated to 2026-04-21, PR #243 row added, grammar/ inventory expanded with all new types (RoleKey::bind, Vsa10k, GrammarStyleAwareness, FreeEnergy, Hypothesis, Resolution).
- PR_ARC_INVENTORY.md: PR #243 entry prepended with Added/Locked/Deferred/Docs sections. 8 architectural invariants locked.
- STATUS_BOARD.md: D5 and D7 flipped to Shipped with LOC/test counts. D2/D8/D10 descriptions updated with wiring-step cross-references.
- session-2026-04-21-categorical-click.md: full handover knowledge doc with shipped code table, shipped docs table, 8-step wiring sequence, 12 critical insights, 7 anti-patterns.

https://claude.ai/code/session_01SbYsmmbPf9YQuYbHZN52Zh
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 0dacb4790b
    let distance = if i <= MARKOV_RADIUS {
        MARKOV_RADIUS - i
    } else {
        i - MARKOV_RADIUS
    };

Encode signed temporal offsets in braiding
The bundle currently computes distance as an absolute offset from the focal index, so positions -d and +d are permuted identically before XOR superposition. Because XOR is commutative, swapping a sentence from before the focal with one after it can produce the same trajectory, which drops before/after information and breaks chronology-sensitive disambiguation.
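A minimal sketch of what a signed-offset braid could look like, so that positions -d and +d receive different permutations. It uses a plain `Vec<bool>` and a cyclic rotation as a stand-in for the crate's `Vsa10k` and `vsa_permute`, whose real signatures are not shown in this PR; `permute_signed`, `braid`, and `random_hv` are hypothetical names, and the stride of 64 is taken from the commit message.

```rust
const DIM: usize = 10_000;
const RADIUS: isize = 5; // window covers offsets -5..=5
const STRIDE: isize = 64; // rotation per position, as in the commit message

type Hv = Vec<bool>; // toy stand-in for Vsa10k

/// Rotate by a signed offset: negative and positive offsets map to
/// different cyclic shifts, so before/after is preserved.
fn permute_signed(v: &Hv, offset: isize) -> Hv {
    let shift = (offset * STRIDE).rem_euclid(DIM as isize) as usize;
    let mut out = v.clone();
    out.rotate_right(shift);
    out
}

/// XOR-superpose a full window, permuting each sentence by its signed
/// offset from the focal position (index RADIUS).
fn braid(window: &[Hv]) -> Hv {
    assert_eq!(window.len(), (2 * RADIUS + 1) as usize);
    let mut acc = vec![false; DIM];
    for (i, sentence) in window.iter().enumerate() {
        let offset = i as isize - RADIUS;
        let permuted = permute_signed(sentence, offset);
        for (a, b) in acc.iter_mut().zip(&permuted) {
            *a ^= *b;
        }
    }
    acc
}

/// Pseudo-random test vector built from SplitMix64 output bits.
fn random_hv(seed: u64) -> Hv {
    let mut state = seed;
    (0..DIM)
        .map(|_| {
            state = state.wrapping_add(0x9E37_79B9_7F4A_7C15);
            let mut z = state;
            z = (z ^ (z >> 30)).wrapping_mul(0xBF58_476D_1CE4_E5B9);
            z = (z ^ (z >> 27)).wrapping_mul(0x94D0_49BB_1331_11EB);
            (z ^ (z >> 31)) & 1 == 1
        })
        .collect()
}
```

With this mapping, swapping the sentence at offset -1 with the one at offset +1 changes the bundle, which is the property the review comment asks for. The permutation also stays invertible, since rotating by `offset` and then by `-offset` returns the original vector.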
    let expected = crate::content_fp::content_fp(
        filler_rank_from_label(_filler_label),
    );

Use real filler ranks when scoring hypotheses
mean_recovery_margin compares unbound role content to content_fp(filler_rank_from_label(...)), but that helper hashes the filler text modulo 4096 instead of using the rank that was actually bound into the sentence vector. In realistic hypotheses (labels or token strings), this makes the likelihood term unrelated to the underlying content and can cause resolve to choose or reject hypotheses incorrectly.
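One possible shape for the fix, sketched under assumptions: carry the vocabulary rank alongside the label so that scoring fingerprints the same rank that was bound into the sentence vector. `Vocab`, `RoleFiller`, and `filler_from_label` are hypothetical names invented for this illustration; the PR's actual `Hypothesis` type and vocabulary lookup are not shown here.

```rust
use std::collections::HashMap;

/// Hypothetical vocabulary table: token text -> COCA rank.
struct Vocab {
    ranks: HashMap<String, u32>,
}

impl Vocab {
    /// Look up the rank that the encoder would have bound for this token.
    fn rank_of(&self, token: &str) -> Option<u32> {
        self.ranks.get(token).copied()
    }
}

/// Hypothetical filler that keeps the rank next to its display label,
/// so scoring never has to re-derive a rank from text.
#[derive(Debug, PartialEq)]
struct RoleFiller {
    label: String,
    rank: u32,
}

/// Resolve a label through the vocabulary instead of hashing it.
/// Returns `None` for out-of-vocabulary labels rather than inventing a rank.
fn filler_from_label(vocab: &Vocab, label: &str) -> Option<RoleFiller> {
    vocab.rank_of(label).map(|rank| RoleFiller {
        label: label.to_string(),
        rank,
    })
}
```

Under this sketch, the expected fingerprint in the likelihood term would be computed from `filler.rank`, and an out-of-vocabulary label surfaces as an explicit `None` instead of a plausible-looking but unrelated rank.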
    let weight = self.kernel.weight(distance);
    if weight <= 0.0 {
        continue;

Apply kernel magnitude during bundle composition
The kernel weight is calculated but only used as a sign check; positive weights are all treated the same. That means Gaussian (all positive) degenerates to Uniform output, so MarkovPolicy.kernel silently loses its intended effect on bundle construction and downstream free-energy scoring.
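Plain XOR superposition has no place to put a magnitude, so honoring the kernel weight means changing how the bundle is composed. The sketch below swaps in a different technique, a per-bit weighted majority vote, purely to show how weights could influence the result; it is not what `markov_bundle.rs` does. `gaussian_weight` and `weighted_bundle` are hypothetical names, and `Vec<bool>` again stands in for `Vsa10k`.

```rust
type Hv = Vec<bool>; // toy stand-in for Vsa10k

/// Gaussian kernel over absolute distance from the focal sentence.
fn gaussian_weight(distance: usize, sigma: f32) -> f32 {
    let d = distance as f32;
    (-(d * d) / (2.0 * sigma * sigma)).exp()
}

/// Weighted majority bundling: each vector votes +w for a set bit and -w
/// for a clear bit; the output bit is set when the signed sum is positive.
/// Ties (sum == 0) resolve to a clear bit.
fn weighted_bundle(items: &[(Hv, f32)], dim: usize) -> Hv {
    let mut acc = vec![0.0f32; dim];
    for (hv, w) in items {
        for (a, &bit) in acc.iter_mut().zip(hv.iter()) {
            *a += if bit { *w } else { -*w };
        }
    }
    acc.into_iter().map(|a| a > 0.0).collect()
}
```

With this composition a Gaussian kernel and a Uniform kernel produce different bundles, because nearer sentences outvote farther ones. The trade-off is that majority bundling is not self-inverse the way XOR is, so any unbinding logic that relies on XOR would need to be revisited.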
Summary
- `content_fp.rs`: 10K-dim content fingerprints from COCA ranks via SplitMix64. 98 LOC, 5 tests. Feature-gated: `grammar-10k`.
- `markov_bundle.rs`: `MarkovBundler` ±5 ring buffer with role-key bind + braiding via `vsa_permute` (Shaw's ρ operator) + XOR-superpose + `WeightingKernel`. 250 LOC, 8 tests.
- `trajectory.rs`: `Trajectory` (Think carrier): `role_bundle`, `mean_recovery_margin`, `ambient_similarity`, `free_energy`, `resolve`. 298 LOC, 4 tests.
- `session-2026-04-21-categorical-click.md`: session handover with 12 critical insights + 7 anti-patterns table.

What's Next (steps 4-8 close the loop)
The AGI test: Animal Farm end-to-end. Chapter-10 coreference accuracy > chapter-1 accuracy with no parameter change = loop works.
Verification
Test plan
https://claude.ai/code/session_01SbYsmmbPf9YQuYbHZN52Zh