docs: ledger efficiency + phases + HDC substrate corrections + cognitive refactor scope #214

Merged
AdaWorldAPI merged 15 commits into main from claude/ledger-efficiency-epiphany
Apr 19, 2026

@AdaWorldAPI
Owner

Summary

  • EPIPHANIES: 10⁶× find-code via prompt↔PR ledger (~25M tokens vs ~25), 30-50% ambient arc-knowledge loss quantified, vector (10⁴) vs matrix (10⁸) category distinction, L3 working-set invariant (both superseded as prior-art reminders)
  • PHASES: Phase 6 (grammar/crystal contract + AriGraph episodic unbundling, #208-210) + Phase 7 (CCA2A bootstrap: append-only governance, agent ensemble, SessionStart hooks, #211-213) + Phase 8 (elegant-herding-rocket D2+) added. Open-prompt adjacency map: 45 open → 15 implicitly resolved, 15 live, 15 stale
  • IDEAS: FP_WORDS=256 → CORRECTION (HDC = 16384-D float BF16, not binary) → REFINEMENT (scope ban to 3 exception zones) → REFINEMENT-2 (FP16/BF16 not FP32, LanceDB-only viable home for HDC); lance-graph-cognitive refactor scope (keep/merge/inspect/excise)
  • TECH_DEBT: Vsa10k→Vsa16k rename, 10k×10k glitch matrix (CORRECTION: single allocation not population), ladybug-rs retired (ada-rs + lance-graph exclusively)
  • SKILL: procedure-bookkeeping.md — three-pass Haiku→Opus→main-thread recipe for code awareness in ~90s

Test plan

  • All commits are docs-only (zero crates/ changes, zero Cargo.toml)
  • Append-only governance respected (cat >> on all board files)
  • SKILL.md link to procedure-bookkeeping.md resolves
  • Phases doc new sections don't conflict with existing Phase 0-5

https://claude.ai/code/session_01SbYsmmbPf9YQuYbHZN52Zh

claude added 15 commits April 19, 2026 14:20
Append FINDING to EPIPHANIES.md. PR #213 + ndarray PR #110 demonstrated
the dumb-bookkeeper pattern: ~90 seconds, Haiku, enumerate+match+append.
Result is a grep-addressable index of every shipped artifact keyed by
the prompt-file brief that birthed it.

For every future "what did we ship about X" query the ledger replaces
a full-codebase grep with a single line: ~25 tokens vs ~25M tokens.
That is six orders of magnitude cheaper (25M / 25 = 10⁶).

https://claude.ai/code/session_01SbYsmmbPf9YQuYbHZN52Zh
Add Phase 6 (grammar PR #208-210) and Phase 7 (governance PR #211-213)
to integration_phases.md. Phase 8 queued (elegant-herding-rocket v1 D2+).

Also classify the 45 "none" rows from PROMPTS_VS_PRS.md into three
groups the Haiku couldn't distinguish:

- Live — open prompt aligned to a named live phase (work to pick up).
- Implicitly resolved / superseded — shipped under an overlapping PR
  title (Haiku's literal filename match missed the semantic overlap).
  Queues a Tier-2 Opus meta-pass to annotate the ledger with
  `superseded by #N` without re-reading code.
- Genuinely stale — no active adjacency; archive-candidate.
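
The three-way split above can be modelled as a small enum. This is a minimal sketch, not the project's code: `NoneRowStatus`, `annotation`, `annotate_row`, and the ` | ` column separator are hypothetical names and formats chosen for illustration; only the `superseded by #N` wording comes from the text.

```rust
/// Hypothetical classification of a ledger row whose PR column is "none".
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum NoneRowStatus {
    /// Open prompt aligned to a named live phase (work to pick up).
    Live { phase: u8 },
    /// Shipped under an overlapping PR title.
    Superseded { pr: u32 },
    /// No active adjacency; archive candidate.
    Stale,
}

/// Render the text that Pass 2 would append to a ledger row.
pub fn annotation(status: &NoneRowStatus) -> String {
    match status {
        NoneRowStatus::Live { phase } => format!("live (phase {})", phase),
        NoneRowStatus::Superseded { pr } => format!("superseded by #{}", pr),
        NoneRowStatus::Stale => "stale (archive candidate)".to_string(),
    }
}

/// Produce an annotated copy of a row. The original text is kept as a
/// prefix, which matches an append-only style of bookkeeping.
pub fn annotate_row(row: &str, status: &NoneRowStatus) -> String {
    format!("{} | {}", row.trim_end(), annotation(status))
}
```

Keeping the annotation as a suffix means Pass 2 never rewrites what Pass 1 recorded; it only adds a column.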

The ledger still saves ~10⁶× over code grep (~25M tokens vs ~25) even
accounting for Pass 2's semantic-refinement overhead: Pass 2 reads
~10 KB of ledger text, not megabytes of Rust.

https://claude.ai/code/session_01SbYsmmbPf9YQuYbHZN52Zh
…ambient)

Third finding appended to EPIPHANIES.md, orthogonal to cold-start tax
(20-30 turns) and find-code discount (10⁶×). This is the AMBIENT channel:
30-50% of every session's token budget burns on rediscovering what code
paths exist / what was tried / why code is shaped the way it is.

The prompt↔PR ledger collapses all three channels to two text-file reads:
- Cold-start: 20-30 turns → 3-5 turns   (~6×)
- Find-code:  ~25M tokens → ~25 tokens  (10⁶×)
- Ambient:    30-50% → 0%               (2×-eternal)

https://claude.ai/code/session_01SbYsmmbPf9YQuYbHZN52Zh
Per user (2026-04-19): Container<[u64; 256]> is already 16,384 bits in
LanceDB storage, so padding 157-word (or even 160-word) fingerprints
into 256 words wastes about 39% (resp. 37.5%) of every row. Unifying
FP_WORDS with the Container primitive means:

- zero padding, zero remainder loops at any SIMD level
- cache-line perfect (2 KB / 64 B = 32 cache lines, clean every level)
- +62% VSA capacity for free (Plate's bound ~1,500 → ~2,400 items)
- No rebake of stored fingerprints — Container was 256 already

No LanceDB patch needed — FixedSizeList<UInt8, 2048> is native.

Supersedes the tech-debt entry targeting FP_WORDS = 160.
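
The padding and cache-line arithmetic above can be checked mechanically. A minimal sketch, assuming 8-byte `u64` words and 64-byte cache lines; the constant and function names are hypothetical and are not taken from the workspace:

```rust
/// Words in the Container primitive: 256 × 64 bits = 16,384 bits = 2 KB.
pub const CONTAINER_WORDS: usize = 256;
/// Assumed cache-line size in bytes.
pub const CACHE_LINE_BYTES: usize = 64;

/// Percentage of a Container row that is padding when only `fp_words`
/// of its words carry fingerprint data.
pub fn padding_percent(fp_words: usize) -> f64 {
    100.0 * (CONTAINER_WORDS - fp_words) as f64 / CONTAINER_WORDS as f64
}

/// Number of whole cache lines covered by `words` u64 words.
pub fn cache_lines(words: usize) -> usize {
    words * std::mem::size_of::<u64>() / CACHE_LINE_BYTES
}

/// Words left over after processing in SIMD chunks of `lane_words` u64s
/// (for example 4 for a 256-bit register, 8 for a 512-bit register).
pub fn simd_remainder(words: usize, lane_words: usize) -> usize {
    words % lane_words
}
```

With these definitions, `padding_percent(160)` is 37.5, `cache_lines(256)` is 32, and `simd_remainder(256, 8)` is 0, while a 157-word fingerprint leaves a 5-word remainder at 8 lanes.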

https://claude.ai/code/session_01SbYsmmbPf9YQuYbHZN52Zh
Concrete three-pass recipe added to the cca2a skill:

  Pass 1 — Haiku bookkeeper (~90 s, mechanical): enumerate prompt files,
           match against git log, append one ledger line per pair.
  Pass 2 — Opus meta-synthesizer (read-only inputs): annotate the "none"
           rows from Pass 1 with superseded/open/stale classification.
  Pass 3 — Main thread consumer (sub-second per query): grep the ledger
           for every "what's open / shipped / about X" question.
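
Pass 3 amounts to a substring search over the ledger text. The sketch below assumes a pipe-separated, one-line-per-prompt ledger; that row format and the names `grep_ledger` and `open_rows` are hypothetical, since the ledger layout is not spelled out here.

```rust
/// Return every ledger line that mentions `topic`, case-insensitively.
pub fn grep_ledger<'a>(ledger: &'a str, topic: &str) -> Vec<&'a str> {
    let needle = topic.to_lowercase();
    ledger
        .lines()
        .filter(|line| line.to_lowercase().contains(&needle))
        .collect()
}

/// Return the rows that Pass 1 could not match to a PR, assuming such
/// rows carry the literal column value "none".
pub fn open_rows(ledger: &str) -> Vec<&str> {
    ledger
        .lines()
        .filter(|line| line.split('|').any(|col| col.trim() == "none"))
        .collect()
}
```

In practice the ledger string would come from something like `std::fs::read_to_string` on the board file; the functions take `&str` so they stay trivially testable.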

Closes three token-waste channels simultaneously:
  - cold-start (20-30 turns → 3-5 turns)
  - find-code (~25M tokens → ~25 tokens, 10⁶×)
  - ambient arc knowledge (30-50% → 0%)

First deployments:
  - PR #213 (lance-graph, 41 prompts mapped, 90 s)
  - PR #110 (ndarray, 25 prompts mapped, 90 s)

Linked from SKILL.md under "What to read when".

https://claude.ai/code/session_01SbYsmmbPf9YQuYbHZN52Zh
CORRECTION to prior IDEA "FP_WORDS = 256" and TECH_DEBT entry for the
Vsa10k* → Vsa16k* rename sweep.

Two distinct substrates that must NOT be collapsed:

1. Hamming binary fingerprint — Container<[u64; 256]> = 16,384 bits
   = 2 KB. Popcount metric. NOT VSA.
2. VSA (= HDC, Hyperdimensional Computing) substrate: 16,384
   DIMENSIONS × float. bind / bundle / permute. Canonical per
   Plate / Kanerva / arxiv 2111.06077. Never binary, never 10k.
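
The two substrates differ in both representation and metric. A minimal sketch, assuming a plain `[u64; 256]` array as a stand-in for the `Container` primitive and `f32` slices as a stand-in for the float substrate. The function names are hypothetical, and element-wise multiplication is only one common VSA binding choice, not confirmed as this workspace's operator.

```rust
/// Stand-in for the 16,384-bit Hamming fingerprint (256 × u64 = 2 KB).
pub type Container = [u64; 256];

/// Hamming distance between two binary fingerprints: XOR, then popcount.
pub fn hamming(a: &Container, b: &Container) -> u32 {
    a.iter()
        .zip(b.iter())
        .map(|(x, y)| (x ^ y).count_ones())
        .sum()
}

/// Bundle (superpose) two float hypervectors by element-wise addition.
pub fn bundle(a: &[f32], b: &[f32]) -> Vec<f32> {
    assert_eq!(a.len(), b.len(), "hypervectors must have equal dimension");
    a.iter().zip(b.iter()).map(|(x, y)| x + y).collect()
}

/// Bind two float hypervectors by element-wise multiplication
/// (an illustrative choice; other VSA families bind differently).
pub fn bind(a: &[f32], b: &[f32]) -> Vec<f32> {
    assert_eq!(a.len(), b.len(), "hypervectors must have equal dimension");
    a.iter().zip(b.iter()).map(|(x, y)| x * y).collect()
}
```

The point of the sketch is the type split: `hamming` only makes sense on packed bits, while `bundle` and `bind` only make sense on per-dimension numeric values.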

Size per VSA fingerprint:
- Vsa16kF32 (lossless): 64 KB
- Vsa16kBF16:            32 KB
- Vsa16k u8 × 5-lane:    80 KB
- Vsa16k BF16 × 5-lane:  160 KB
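
The size table follows directly from dimensions × bytes per element × lanes. A minimal sketch with hypothetical helper names, assuming 1 KB = 1024 bytes:

```rust
/// Dimensionality of the float VSA substrate named in the text.
pub const VSA_DIMS: usize = 16_384;

/// Bytes occupied by one fingerprint row.
pub fn row_bytes(dims: usize, bytes_per_elem: usize, lanes: usize) -> usize {
    dims * bytes_per_elem * lanes
}

/// Convert a byte count to whole KB (1 KB = 1024 bytes).
pub fn kib(bytes: usize) -> usize {
    bytes / 1024
}
```

For example, `kib(row_bytes(VSA_DIMS, 4, 1))` gives 64 for the f32 row, and `kib(row_bytes(VSA_DIMS, 2, 5))` gives 160 for the five-lane BF16 row.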

Ban "10,000-D binary VSA" framing workspace-wide. When writing about
binary fingerprints say "16,384-bit Hamming fingerprint" / "Container"
— never "VSA". When writing about the HDC substrate say
"16,384-D float VSA" — never "binary", never "10k".

Follow-up PR will rename CrystalFingerprint::Vsa10kF32 → Vsa16kF32,
re-address role-key slices from [0..10000) → [0..16384), and sweep
~28 files for legacy 10k mentions.

https://claude.ai/code/session_01SbYsmmbPf9YQuYbHZN52Zh
Per user clarification (2026-04-19):

REFINEMENT to prior IDEA CORRECTION-OF — the "no 10000-D VSA" ban is
NOT workspace-wide. Three scopes legitimately preserve 10k until the
coordinated rename PR:

1. Grammar prototype (role_keys + ContextChain, shipped at 10k in #210)
2. Quantum prototype (Vsa10kF32 holographic residual)
3. Ladybug-rs / bighorn imports (PRs #200-#203 cognitive stack)

Elsewhere: strip 10k mentions. Files in-scope vs out-of-scope
enumerated in the IDEAS entry.

TECH_DEBT for the ladybug memory pathology:
- Observed 700-1,100 MB runtime after #200-#203 imports at 10k
- 16k rename WORSENS per-row cost 40 KB → 64 KB at f32
- Fix requires LanceDB mmap zero-copy + working-set cache policy, not
  wider substrate alone
- Gate the 16k rename on peak-RAM measurement against Animal Farm D10
- Sparse-encoding candidate (Structured5x5 cells only) for common case

https://claude.ai/code/session_01SbYsmmbPf9YQuYbHZN52Zh
REFINEMENT-2 correcting prior size math. Key corrections:

- HDC superposition runs at FP16 / BF16, not f32. f32 naming (Vsa10kF32)
  reflects legacy compute precision, not storage; half-precision is
  sufficient for bundle accumulation.
- Per-row size table (FP16/BF16, 2 bytes per dimension, unless noted):
    1024-D Jina v3:        2 KB
    1536-D OpenAI small:   3 KB
    3072-D Upstash max:    6 KB
    10000-D HDC FP16:     20 KB  (current target)
    16384-D HDC FP16:     32 KB  (rename target)
    16384-D × 5-lane u8:  80 KB
    16384-D × 5-lane BF16: 160 KB

- The ladybug 700-1100 MB blowup was at f32 40 KB/row. Had it been
  FP16 (20 KB/row) memory would have been 350-550 MB. 16k × FP16
  (32 KB/row) at the same population = 560-880 MB — cheaper than the
  current f32 state.

- The 16k rename is memory-positive ONLY if paired with f32 → FP16
  migration. Without it, 16k × f32 (64 KB/row) inflates the problem.

- Architectural constraint: commercial vector DBs (Upstash, Pinecone,
  Qdrant) cap at ≤ 3072 dims. HDC at 16384-D can only live in LanceDB
  FixedSizeList<BFloat16, 16384>. That's why lance-graph is The Spine.
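
The memory projections in the bullets above are a linear rescale of the observed footprint by the per-row size ratio, holding the row population fixed. A minimal sketch; `rescale_mb` is a hypothetical helper, and it assumes the whole observed footprint scales with row size, which is a simplification:

```rust
/// Project a memory footprint to a new per-row size, keeping the number
/// of rows constant: new = observed × (new_row / old_row).
pub fn rescale_mb(observed_mb: f64, old_row_kb: f64, new_row_kb: f64) -> f64 {
    observed_mb * new_row_kb / old_row_kb
}
```

Applied to the 700-1,100 MB range at 40 KB/row, this gives 350-550 MB at 20 KB/row, 560-880 MB at 32 KB/row, and 1,120-1,760 MB at 64 KB/row.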

Rename scope: Vsa10kF32 → Vsa16kBF16 (not Vsa16kF32); role-key slices
[0..10000) → [0..16384); storage contract = FixedSizeList<BFloat16,
16384>; f32 compute preserved internally only for numerically-sensitive
hot paths.

https://claude.ai/code/session_01SbYsmmbPf9YQuYbHZN52Zh
CORRECTION-OF the 2026-04-19 "Ladybug 700-1100 MB memory blowup" entry.
Per user: there is no 10,000 × 10,000 matrix we actually want. The
blowup was a glitch — a dense 10k × 10k structure imported from
outdated ladybug-rs / bighorn (PRs #200-#203) that ended up in the
binary by accident.

Math:
- 10,000 × 10,000 × f32 = 400 MB (single allocation)
- Plus cognitive-stack state → 700-1,100 MB total observed

Fix: identify and DELETE the glitch allocation. Not a migration.
Candidates: token-token distance matrix, co-occurrence matrix, dense
attention matrix, K=10000 CLAM centroid table. High-probability
locations: cognitive crate, CognitiveShader, BindSpace, CollapseGate,
adaptive codecs imported from ladybug-rs without trimming.

This invalidates:
- "16k rename makes memory worse" — the per-row math was sound but
  irrelevant to this specific blowup.
- Mmap zero-copy requirement — still good hygiene, not the fix here.
- Sparse encoding dependency — still architecturally useful, unrelated
  to the glitch.

16k rename + f32 → BF16 migration proceed independently of this P0
deletion.

https://claude.ai/code/session_01SbYsmmbPf9YQuYbHZN52Zh
…late

Lock the categorical distinction:

- 16K-D wire VECTOR (intentional):   1 × 16,384 ≈ 10⁴ cells, 32 KB BF16.
  One fingerprint, lossless round-trip, LanceDB native.
- 10K × 10K glitch MATRIX (unintentional): 10,000 × 10,000 = 10⁸ cells,
  200 MB BF16 / 400 MB f32. Zero purpose, imported debris from outdated
  ladybug-rs / bighorn.

Four orders of magnitude apart. They share a numeric coincidence only.

Future docs describing 10k-D HDC must say VECTOR explicitly — plain
"10,000-D HDC" is ambiguous and preserves the category error.

The rename (Vsa10kF32 → Vsa16kBF16) is about the vector; the matrix
deletion is a separate P0 cleanup.

https://claude.ai/code/session_01SbYsmmbPf9YQuYbHZN52Zh
Typical server L3 = 32-96 MB. Hot-path structures exceeding this take
~8× DRAM latency penalty per access, regardless of storage capacity.

Codec stack implication: dense square matrices are capped at a side
length of sqrt(L3 / cell_size). At a 32 MB budget with BF16 cells that's
~4000 × 4000. 10k × 10k at f32 (400 MB) is ~12× over budget and
architecturally disqualified.

This is the deep reason the full codec chain exists (planes → ZeckBF17
→ Base17 → Palette → CAM-PQ → Scent): each layer picks a density that
keeps the hot table L3-resident.

1-D vectors are cheap: a 16K-D BF16 row is 32 KB, so roughly 1,000 to
3,000 rows fit L3-resident at 32-96 MB. The L3 bound binds on 2-D dense
structures, not 1-D. That's why the rename (Vsa10kF32 → Vsa16kBF16) is
safe; the 10k × 10k matrix is not.
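
The L3 bound can be written down as three small functions. A minimal sketch with hypothetical names, assuming binary megabytes (1 MiB = 1,048,576 bytes), which is why the square-matrix cap comes out at 4096 where the text rounds to ~4000:

```rust
/// One binary megabyte in bytes.
pub const MIB: usize = 1024 * 1024;

/// Largest side length of a dense square matrix that fits in `l3_bytes`.
pub fn max_square_side(l3_bytes: usize, cell_bytes: usize) -> usize {
    ((l3_bytes / cell_bytes) as f64).sqrt().floor() as usize
}

/// Bytes needed for a dense `side × side` matrix.
pub fn dense_matrix_bytes(side: usize, cell_bytes: usize) -> usize {
    side * side * cell_bytes
}

/// How many 1-D rows of `row_bytes` each fit in `l3_bytes`.
pub fn rows_resident(l3_bytes: usize, row_bytes: usize) -> usize {
    l3_bytes / row_bytes
}
```

Under these assumptions a 32 MiB L3 holds a 4096 × 4096 BF16 matrix or 1,024 rows of 16,384-D BF16, while a 10,000 × 10,000 f32 matrix needs 400,000,000 bytes.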

https://claude.ai/code/session_01SbYsmmbPf9YQuYbHZN52Zh
The "vector vs matrix" + "L3 working-set cap" entries from earlier this
session aren't new findings — both are long-established invariants of
the workspace (codec chain design, L3-budget sizing). Appending a
supersede entry that downgrades both to SUPERSEDED so the FINDING log
stays useful.

The 10k × 10k matrix is legacy-hygiene debt from the stone-age
ladybug-rs / bighorn import. Nobody re-validated it against workspace
invariants because the imports were expected to be rewritten or
deleted; touching the code was migration desperation, not design.
Delete-on-touch, not a principle waiting to be learned.

https://claude.ai/code/session_01SbYsmmbPf9YQuYbHZN52Zh
Per user (2026-04-19): ladybug-rs is archived, not maintained. Migration
target is ada-rs + lance-graph going forward. Ladybug becomes read-only
reference quarry for patterns (Fingerprint, BindSpace, CognitiveShader,
adaptive codecs, CollapseGate).

Consequences for prior ledger entries:
- Glitch-matrix deletion downgrades P0 → P2 (goes away with archival
  unless it's currently linked into built binaries).
- Vsa10k → Vsa16k rename scope tightens to ada-rs + lance-graph; no
  ladybug touches.
- Refactor-resistance audit for ladybug imports is obsolete — harvest
  patterns clean, don't port code.
- CLAUDE.md "ladybug-rs = The Brain" line is stale; ada-rs takes that
  slot. Architecture diagram update in follow-up PR.

Five-repo stack becomes:
  ndarray     = The Foundation
  lance-graph = The Spine
  ada-rs      = The Brain  (was ladybug-rs)
  crewai-rust = The Agents
  n8n-rs      = The Orchestrator

https://claude.ai/code/session_01SbYsmmbPf9YQuYbHZN52Zh
…xcise

Not a wholesale deletion target — it's the complementary cognitive
layer above SPO store + contract primitives. Targeted refactor:

Keep:    grammar/Triangle, spo/Crystal-layer, spectroscopy (unique)
Merge:   search/temporal → planner::strategy; cypher_bridge audit
Inspect: fabric/, world/, container_bs/ → DTO-move or excise
Excise:  learning/ stub, wip-gated non-compiling modules, core_full/
         catch-all decomposition

~1 week refactor, zero functional change, tightens contract adoption
per CLAUDE.md §Current Status In-Progress rule.

Plan D3 depends on GrammarTriangle staying in lance-graph-cognitive,
so no rename or cross-crate move for that module.

https://claude.ai/code/session_01SbYsmmbPf9YQuYbHZN52Zh
Strip ada-rs mentions from the lance-graph-cognitive refactor entry.
The cognitive DTO contract surface (state classification pillars +
shader-driver endpoints) already shipped in PR #206 under Pumpkin NPC
framing. The refactor is dedup/merge/excise against that existing
contract, not new trait work.

https://claude.ai/code/session_01SbYsmmbPf9YQuYbHZN52Zh
@AdaWorldAPI AdaWorldAPI merged commit 204f330 into main Apr 19, 2026