docs: ledger efficiency + phases + HDC substrate corrections + cognitive refactor scope by AdaWorldAPI · Pull Request #214 · AdaWorldAPI/lance-graph

AdaWorldAPI · 2026-04-19T15:43:12Z

Summary

EPIPHANIES: 10⁷× find-code via prompt↔PR ledger, 30-50% ambient arc-knowledge loss quantified, vector (10⁴) vs matrix (10⁸) category distinction, L3 working-set invariant (both superseded as prior-art reminders)
PHASES: Phase 6 (grammar feat: grammar/crystal contract + AriGraph episodic unbundling #208-210) + Phase 7 (CCA2A feat(infra): CCA2A bootstrap — append-only governance, agent ensemble, SessionStart hooks #211-213) + Phase 8 (elegant-herding-rocket D2+) added. Open-prompt adjacency map: 45 open → 15 implicitly resolved, 15 live, 15 stale
IDEAS: FP_WORDS=256 → CORRECTION (HDC = 16384-D float BF16, not binary) → REFINEMENT (scope ban to 3 exception zones) → REFINEMENT-2 (FP16/BF16 not FP32, LanceDB-only viable home for HDC); lance-graph-cognitive refactor scope (keep/merge/inspect/excise)
TECH_DEBT: Vsa10k→Vsa16k rename, 10k×10k glitch matrix (CORRECTION: single allocation not population), ladybug-rs retired (ada-rs + lance-graph exclusively)
SKILL: procedure-bookkeeping.md — three-pass Haiku→Opus→main-thread recipe for code awareness in ~90s

Test plan

All commits are docs-only (zero crates/ changes, zero Cargo.toml)
Append-only governance respected (cat >> on all board files)
SKILL.md link to procedure-bookkeeping.md resolves
Phases doc new sections don't conflict with existing Phase 0-5

https://claude.ai/code/session_01SbYsmmbPf9YQuYbHZN52Zh

Append FINDING to EPIPHANIES.md. PR #213 + ndarray PR #110 demonstrated the dumb-bookkeeper pattern: ~90 seconds, Haiku, enumerate+match+append. Result is a grep-addressable index of every shipped artifact keyed by the prompt-file brief that birthed it. For every future "what did we ship about X" query the ledger replaces a full-codebase grep with a single line — ~25 tokens vs ~25M tokens. Seven orders of magnitude cheaper. https://claude.ai/code/session_01SbYsmmbPf9YQuYbHZN52Zh

Add Phase 6 (grammar PR #208-210) and Phase 7 (governance PR #211-213) to integration_phases.md. Phase 8 queued (elegant-herding-rocket v1 D2+). Also classify the 45 "none" rows from PROMPTS_VS_PRS.md into three groups the Haiku couldn't distinguish: - Live — open prompt aligned to a named live phase (work to pick up). - Implicitly resolved / superseded — shipped under an overlapping PR title (Haiku's literal filename match missed the semantic overlap). Queues a Tier-2 Opus meta-pass to annotate the ledger with `superseded by #N` without re-reading code. - Genuinely stale — no active adjacency; archive-candidate. The ledger still saves ~10^7× over code grep even accounting for Pass 2's semantic-refinement overhead — Pass 2 reads ~10 KB of ledger text, not megabytes of Rust. https://claude.ai/code/session_01SbYsmmbPf9YQuYbHZN52Zh

…ambient) Third finding appended to EPIPHANIES.md — orthogonal to cold-start tax (20-30 turns) and find-code discount (10⁷×). This is the AMBIENT channel: 30-50% of every session's token budget burns on rediscovering what code paths exist / what was tried / why code is shaped the way it is. The prompt↔PR ledger collapses all three channels to two text-file reads: - Cold-start: 20-30 turns → 3-5 turns (~6×) - Find-code: ~25M tokens → ~25 tokens (10⁷×) - Ambient: 30-50% → 0% (2×-eternal) https://claude.ai/code/session_01SbYsmmbPf9YQuYbHZN52Zh

Per user (2026-04-19): Container<[u64; 256]> is already 16,384 bits in LanceDB storage, so padding 157 (or even 160) fingerprints into 256 wastes 62% (resp. 37%) of every row. Unifying FP_WORDS with the Container primitive means: - zero padding, zero remainder loops at any SIMD level - cache-line perfect (2 KB / 64 B = 32 cache lines, clean every level) - +62% VSA capacity for free (Plate's bound ~1,500 → ~2,400 items) - No rebake of stored fingerprints — Container was 256 already No LanceDB patch needed — FixedSizeList<UInt8, 2048> is native. Supersedes the tech-debt entry targeting FP_WORDS = 160. https://claude.ai/code/session_01SbYsmmbPf9YQuYbHZN52Zh

Concrete three-pass recipe added to the cca2a skill: Pass 1 — Haiku bookkeeper (~90 s, mechanical): enumerate prompt files, match against git log, append one ledger line per pair. Pass 2 — Opus meta-synthesizer (read-only inputs): annotate the "none" rows from Pass 1 with superseded/open/stale classification. Pass 3 — Main thread consumer (sub-second per query): grep the ledger for every "what's open / shipped / about X" question. Closes three token-waste channels simultaneously: - cold-start (20-30 turns → 3-5 turns) - find-code (~25M tokens → ~25 tokens, 10⁷×) - ambient arc knowledge (30-50% → 0%) First deployments: - PR #213 (lance-graph, 41 prompts mapped, 90 s) - PR #110 (ndarray, 25 prompts mapped, 90 s) Linked from SKILL.md under "What to read when". https://claude.ai/code/session_01SbYsmmbPf9YQuYbHZN52Zh

CORRECTION to prior IDEA "FP_WORDS = 256" and TECH_DEBT entry for the Vsa10k* → Vsa16k* rename sweep. Two distinct substrates that must NOT be collapsed: 1. Hamming binary fingerprint — Container<[u64; 256]> = 16,384 bits = 2 KB. Popcount metric. NOT VSA. 2. VSA (= HDC, High-Dimensional Computing) substrate — 16,384 DIMENSIONS × float. bind / bundle / permute. Canonical per Plate / Kanerva / arxiv 2111.06077. Never binary, never 10k. Size per VSA fingerprint: - Vsa16kF32 (lossless): 64 KB - Vsa16kBF16: 32 KB - Vsa16k u8 × 5-lane: 80 KB - Vsa16k BF16 × 5-lane: 160 KB Ban "10,000-D binary VSA" framing workspace-wide. When writing about binary fingerprints say "16,384-bit Hamming fingerprint" / "Container" — never "VSA". When writing about the HDC substrate say "16,384-D float VSA" — never "binary", never "10k". Follow-up PR will rename CrystalFingerprint::Vsa10kF32 → Vsa16kF32, re-address role-key slices from [0..10000) → [0..16384), and sweep ~28 files for legacy 10k mentions. https://claude.ai/code/session_01SbYsmmbPf9YQuYbHZN52Zh

Per user clarification (2026-04-19): REFINEMENT to prior IDEA CORRECTION-OF — the "no 10000-D VSA" ban is NOT workspace-wide. Three scopes legitimately preserve 10k until the coordinated rename PR: 1. Grammar prototype (role_keys + ContextChain, shipped at 10k in #210) 2. Quantum prototype (Vsa10kF32 holographic residual) 3. Ladybug-rs / bighorn imports (PRs #200-#203 cognitive stack) Elsewhere: strip 10k mentions. Files in-scope vs out-of-scope enumerated in the IDEAS entry. TECH_DEBT for the ladybug memory pathology: - Observed 700-1,100 MB runtime after #200-#203 imports at 10k - 16k rename WORSENS per-row cost 40 KB → 64 KB at f32 - Fix requires LanceDB mmap zero-copy + working-set cache policy, not wider substrate alone - Gate the 16k rename on peak-RAM measurement against Animal Farm D10 - Sparse-encoding candidate (Structured5x5 cells only) for common case https://claude.ai/code/session_01SbYsmmbPf9YQuYbHZN52Zh

REFINEMENT-2 correcting prior size math. Key corrections: - HDC superposition runs at FP16 / BF16, not f32. f32 naming (Vsa10kF32) reflects legacy compute precision, not storage; half-precision is sufficient for bundle accumulation. - Per-row size table: 1024-D Jina v3: 2 KB 1536-D OpenAI small: 3 KB 3072-D Upstash max: 6 KB 10000-D HDC FP16: 20 KB (current target) 16384-D HDC FP16: 32 KB (rename target) 16384-D × 5-lane u8: 80 KB 16384-D × 5-lane BF16: 160 KB - The ladybug 700-1100 MB blowup was at f32 40 KB/row. Had it been FP16 (20 KB/row) memory would have been 350-550 MB. 16k × FP16 (32 KB/row) at the same population = 560-880 MB — cheaper than the current f32 state. - The 16k rename is memory-positive ONLY if paired with f32 → FP16 migration. Without it, 16k × f32 (64 KB/row) inflates the problem. - Architectural constraint: commercial vector DBs (Upstash, Pinecone, Qdrant) cap at ≤ 3072 dims. HDC at 16384-D can only live in LanceDB FixedSizeList<BFloat16, 16384>. That's why lance-graph is The Spine. Rename scope: Vsa10kF32 → Vsa16kBF16 (not Vsa16kF32); role-key slices [0..10000) → [0..16384); storage contract = FixedSizeList<BFloat16, 16384>; f32 compute preserved internally only for numerically-sensitive hot paths. https://claude.ai/code/session_01SbYsmmbPf9YQuYbHZN52Zh

CORRECTION-OF the 2026-04-19 "Ladybug 700-1100 MB memory blowup" entry. Per user: there is no 10,000 × 10,000 matrix we actually want. The blowup was a glitch — a dense 10k × 10k structure imported from outdated ladybug-rs / bighorn (PRs #200-#203) that ended up in the binary by accident. Math: - 10,000 × 10,000 × f32 = 400 MB (single allocation) - Plus cognitive-stack state → 700-1,100 MB total observed Fix: identify and DELETE the glitch allocation. Not a migration. Candidates: token-token distance matrix, co-occurrence matrix, dense attention matrix, K=10000 CLAM centroid table. High-probability locations: cognitive crate, CognitiveShader, BindSpace, CollapseGate, adaptive codecs imported from ladybug-rs without trimming. This invalidates: - "16k rename makes memory worse" — the per-row math was sound but irrelevant to this specific blowup. - Mmap zero-copy requirement — still good hygiene, not the fix here. - Sparse encoding dependency — still architecturally useful, unrelated to the glitch. 16k rename + f32 → BF16 migration proceed independently of this P0 deletion. https://claude.ai/code/session_01SbYsmmbPf9YQuYbHZN52Zh

…late Lock the categorical distinction: - 16K-D wire VECTOR (intentional): 1 × 16,384 = 10⁴ cells, 32 KB BF16. One fingerprint, lossless round-trip, LanceDB native. - 10K × 10K glitch MATRIX (unintentional): 10,000 × 10,000 = 10⁸ cells, 200 MB BF16 / 400 MB f32. Zero purpose, imported debris from outdated ladybug-rs / bighorn. Four orders of magnitude apart. They share a numeric coincidence only. Future docs describing 10k-D HDC must say VECTOR explicitly — plain "10,000-D HDC" is ambiguous and preserves the category error. The rename (Vsa10kF32 → Vsa16kBF16) is about the vector; the matrix deletion is a separate P0 cleanup. https://claude.ai/code/session_01SbYsmmbPf9YQuYbHZN52Zh

Typical server L3 = 32-96 MB. Hot-path structures exceeding this take ~8× DRAM latency penalty per access, regardless of storage capacity. Codec stack implication: dense square matrices capped at sqrt(L3 / cell_size). At 32 MB budget with BF16 cells that's ~4000 × 4000. 10k × 10k at f32 (400 MB) is 12× over and architecturally disqualified. This is the deep reason the full codec chain exists (planes → ZeckBF17 → Base17 → Palette → CAM-PQ → Scent): each layer picks a density that keeps the hot table L3-resident. 1-D vectors are cheap — a 16K-D BF16 row is 32 KB, thousands fit L3-resident. The L3 bound binds on 2-D dense, not 1-D. That's why the rename (Vsa10kF32 → Vsa16kBF16) is safe; the 10k × 10k matrix is not. https://claude.ai/code/session_01SbYsmmbPf9YQuYbHZN52Zh

The "vector vs matrix" + "L3 working-set cap" entries from earlier this session aren't new findings — both are long-established invariants of the workspace (codec chain design, L3-budget sizing). Appending a supersede entry that downgrades both to SUPERSEDED so the FINDING log stays useful. The 10k × 10k matrix is legacy-hygiene debt from the stone-age ladybug-rs / bighorn import. Nobody re-validated it against workspace invariants because the imports were expected to be rewritten or deleted; touching the code was migration desperation, not design. Delete-on-touch, not a principle waiting to be learned. https://claude.ai/code/session_01SbYsmmbPf9YQuYbHZN52Zh

Per user (2026-04-19): ladybug-rs is archived, not maintained. Migration target is ada-rs + lance-graph going forward. Ladybug becomes read-only reference quarry for patterns (Fingerprint, BindSpace, CognitiveShader, adaptive codecs, CollapseGate). Consequences for prior ledger entries: - Glitch-matrix deletion downgrades P0 → P2 (goes away with archival unless it's currently linked into built binaries). - Vsa10k → Vsa16k rename scope tightens to ada-rs + lance-graph; no ladybug touches. - Refactor-resistance audit for ladybug imports is obsolete — harvest patterns clean, don't port code. - CLAUDE.md "ladybug-rs = The Brain" line is stale; ada-rs takes that slot. Architecture diagram update in follow-up PR. Five-repo stack becomes: ndarray = The Foundation lance-graph = The Spine ada-rs = The Brain (was ladybug-rs) crewai-rust = The Agents n8n-rs = The Orchestrator https://claude.ai/code/session_01SbYsmmbPf9YQuYbHZN52Zh

…xcise Not a wholesale deletion target — it's the complementary cognitive layer above SPO store + contract primitives. Targeted refactor: Keep: grammar/Triangle, spo/Crystal-layer, spectroscopy (unique) Merge: search/temporal → planner::strategy; cypher_bridge audit Inspect: fabric/, world/, container_bs/ → DTO-move or excise Excise: learning/ stub, wip-gated non-compiling modules, core_full/ catch-all decomposition ~1 week refactor, zero functional change, tightens contract adoption per CLAUDE.md §Current Status In-Progress rule. Plan D3 depends on GrammarTriangle staying in lance-graph-cognitive, so no rename or cross-crate move for that module. https://claude.ai/code/session_01SbYsmmbPf9YQuYbHZN52Zh

Strip ada-rs mentions from the lance-graph-cognitive refactor entry. The cognitive DTO contract surface (state classification pillars + shader-driver endpoints) already shipped in PR #206 under Pumpkin NPC framing. The refactor is dedup/merge/excise against that existing contract, not new trait work. https://claude.ai/code/session_01SbYsmmbPf9YQuYbHZN52Zh

claude added 15 commits April 19, 2026 14:20

AdaWorldAPI merged commit 204f330 into main Apr 19, 2026

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

docs: ledger efficiency + phases + HDC substrate corrections + cognitive refactor scope#214

docs: ledger efficiency + phases + HDC substrate corrections + cognitive refactor scope#214
AdaWorldAPI merged 15 commits into
mainfrom
claude/ledger-efficiency-epiphany

AdaWorldAPI commented Apr 19, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Conversation

AdaWorldAPI commented Apr 19, 2026

Summary

Test plan

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants