Skip to content

feat: CLAM/XOR/holographic adaptive codecs + holograph crate import#200

Merged
AdaWorldAPI merged 6 commits into
mainfrom
claude/teleport-session-setup-wMZfb
Apr 18, 2026
Merged

feat: CLAM/XOR/holographic adaptive codecs + holograph crate import#200
AdaWorldAPI merged 6 commits into
mainfrom
claude/teleport-session-setup-wMZfb

Conversation

@AdaWorldAPI
Copy link
Copy Markdown
Owner

Summary

  • CLAM-adaptive codec — CHAODA LFD-driven per-row precision allocation. Top 10% LFD → passthrough, next 20% → i8, bottom 70% → i4+i2. KV projections promoted to i8 for GQA sensitivity. k_proj: 25% → 97% argmax.
  • XOR-adaptive codec — sign-flip-driven per-dimension precision. XOR between row and archetype fingerprint identifies flipped dimensions → i8 precision there, i4 on matching. speaker_encoder.fc: 25% → 81% argmax.
  • Holographic residual memory — VSA superposition stores ALL cluster residuals in one fingerprint-sized memory per archetype. 3.1:1 compression. Sign-only retrieval works (cos 0.6-0.75) but needs magnitude via slot encoding.
  • Holograph crate imported — AdaWorldAPI/RedisGraph/holograph as local crate (38K LOC). BitpackedVector, slot encoding (phase+magnitude in recoverable slots), resonance cleanup, HDR cascade, 3D XYZ superposition.
  • Slot binding test confirms exact phase/magnitude recovery via XOR.
Codec speaker_encoder.fc k_proj q_proj gate_proj
Uniform k=64 25% 25% 100% 69%
CLAM-adaptive 28% 97% 100% 88%
XOR-adaptive 81% 97% 91% 88%
Holographic 6% 9% 9% 0%

Next: wire slot-encoded magnitude into holographic memory (phase+magnitude slots → full correction vector via XOR retrieval).

Test plan

  • cargo test -- adaptive_codec — 3 tests pass
  • cargo test -- xor_adaptive — 2 tests pass
  • cargo test -- holographic_residual — 4 tests pass (incl. holograph slot binding)
  • Holograph crate compiles as local dependency
  • E2E argmax comparison on Qwen3-TTS real weights

https://claude.ai/code/session_01SbYsmmbPf9YQuYbHZN52Zh

claude added 6 commits April 18, 2026 10:45
New module: bgz_tensor::adaptive_codec

Uses ndarray's ClamTree LFD (Local Fractal Dimension) to classify
weight rows by compressibility:
  - Top 10% LFD → passthrough BF16 (hardest to compress)
  - Next 20% → i8 Hadamard (moderate)
  - Bottom 70% → i4+i2 cascade (well-clustered)
  - KV projections: all non-passthrough promoted to i8 (GQA sensitivity)

Measured on Qwen3-TTS:
  k_proj (was 25% at uniform k=64): now 97% argmax at adaptive i8
  q_proj: 100% argmax
  gate_proj: 88% argmax (up from 69%)
  speaker_encoder.fc: 28% — genuinely pathological, needs GPTQ

Remaining blocker: GPTQ Hessian-guided rounding for tensors where
CLAM finds no cluster structure (uniform high LFD = everything
equally anomalous). Marked TODO in code.

3 tests pass. Burn integration research completed (report in agent).

https://claude.ai/code/session_01SbYsmmbPf9YQuYbHZN52Zh
XOR between row fingerprint and archetype fingerprint identifies which
dimensions flipped sign. Sign-flipped dims get i8, matching dims get i4.

speaker_encoder.fc: 25% → 81% argmax (the hardest tensor)
MLP down_proj: 69% → 81%

The XOR delta IS the anomaly detector — no Hessian, no calibration.
~39% of dimensions flip sign → get 8-bit precision
~61% matching → get 4-bit precision
Combined: ~5.5 effective bits/dim, targeted where error is largest.

https://claude.ai/code/session_01SbYsmmbPf9YQuYbHZN52Zh
XOR-bind + majority-vote bundle stores ALL cluster residuals in one
fingerprint-sized holographic memory per archetype. 3.1:1 compression.

Results: holographic retrieval works (cos 0.6-0.75) but argmax-poor
(0-9%) — sign-bit-only residuals lose magnitude. Need gain-shape
decomposition before holographic binding (TurboQuant on residuals).

Comparison at k=64:
  speaker_encoder.fc: CLAM 28% → XOR 81% → Holo 6% (sign-only)
  k_proj:             CLAM 97% → XOR 97% → Holo 9%

Next: accumulate indexed phase + magnitude of residual distortion
into the holographic memory via TurboQuant gain-shape split.

https://claude.ai/code/session_01SbYsmmbPf9YQuYbHZN52Zh
Adds holograph (AdaWorldAPI/RedisGraph) as dependency for VSA
slot binding — phase AND magnitude stored in separate recoverable
slots within the same holographic memory:

  Memory = Base ⊕ (SlotPhase ⊕ sign) ⊕ (SlotMag ⊕ magnitude)
  Retrieve: XOR out the slot key → get the specific component

Test confirms exact slot recovery on holograph BitpackedVector.
This is the foundation for using VSA bundle/bind/XOR as holographic
weight memory instead of traditional codebooks.

Three codec approaches compared:
  CLAM-adaptive: 28-100% argmax (LFD-driven precision)
  XOR-adaptive:  81-97% argmax (sign-flip per-dim precision)
  Holographic:   0-9% argmax (sign-only, needs magnitude — fixed
                 by slot encoding in next iteration)

https://claude.ai/code/session_01SbYsmmbPf9YQuYbHZN52Zh
Imported AdaWorldAPI/RedisGraph/holograph as crates/holograph/ —
the VSA holographic memory library with:
  - BitpackedVector (10K/16K/32K bit widths)
  - XOR bind/unbind/bundle (slot encoding for recoverable attributes)
  - Resonance cleanup memory (noisy → clean retrieval)
  - HDR cascade (stacked popcount search)
  - DN-Tree (256-way hierarchical addressing)
  - 3D XYZ superposition (content/context/relation)

bgz-tensor now uses local path dep instead of git dep.
All 4 holographic_residual tests pass.

https://claude.ai/code/session_01SbYsmmbPf9YQuYbHZN52Zh
@AdaWorldAPI AdaWorldAPI merged commit faa1fa7 into main Apr 18, 2026
Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 1d7f64049a

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

let ms = bf16_to_f32(row.matched_scale_bf16);
let mut mi = 0;
for d in 0..self.n_cols {
if !row.flipped_indices.contains(&(d as u16)) {
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Badge Replace linear flipped-index lookups during reconstruction

In XorAdaptiveTensor::reconstruct_row, each dimension does row.flipped_indices.contains(...), which makes reconstruction O(n_cols * n_flipped) per row. For the target 16K/32K-width vectors, this becomes a major runtime bottleneck (e.g., large flip sets turn one row decode into hundreds of millions of comparisons), so reconstruct_all can become impractically slow in production-sized runs.

Useful? React with 👍 / 👎.

Comment on lines +65 to +67
let k = k.min(n).min(256);

let centroids = kmeans(data, k, n_cols, 10);
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Guard encode against zero centroid count

This clamps k with k.min(n).min(256) but never enforces k >= 1, so callers can pass k=0 and reach centroid-dependent code with no valid centroid, which later panics when indexing centroid-derived arrays for non-empty input. Adding an explicit validation/early return for k == 0 avoids a runtime crash on invalid-but-possible API input.

Useful? React with 👍 / 👎.

Comment on lines +234 to +237
// Step 4: GPTQ compensation (if calibration data provided)
// For each column left-to-right, adjust remaining weights to minimize
// output error on calibration inputs.
// TODO: implement Hessian-guided rounding in a follow-up
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Implement or remove promised GPTQ compensation path

The encode path advertises GPTQ compensation and accepts calibration_inputs, but the compensation stage is a TODO and no calibration-driven adjustment is performed. This means callers can supply calibration data and get identical output to None, which silently breaks the API contract implied by the function signature and comments and can mislead experiment results.

Useful? React with 👍 / 👎.

AdaWorldAPI pushed a commit that referenced this pull request Apr 19, 2026
Per user clarification (2026-04-19):

REFINEMENT to prior IDEA CORRECTION-OF — the "no 10000-D VSA" ban is
NOT workspace-wide. Three scopes legitimately preserve 10k until the
coordinated rename PR:

1. Grammar prototype (role_keys + ContextChain, shipped at 10k in #210)
2. Quantum prototype (Vsa10kF32 holographic residual)
3. Ladybug-rs / bighorn imports (PRs #200-#203 cognitive stack)

Elsewhere: strip 10k mentions. Files in-scope vs out-of-scope
enumerated in the IDEAS entry.

TECH_DEBT for the ladybug memory pathology:
- Observed 700-1,100 MB runtime after #200-#203 imports at 10k
- 16k rename WORSENS per-row cost 40 KB → 64 KB at f32
- Fix requires LanceDB mmap zero-copy + working-set cache policy, not
  wider substrate alone
- Gate the 16k rename on peak-RAM measurement against Animal Farm D10
- Sparse-encoding candidate (Structured5x5 cells only) for common case

https://claude.ai/code/session_01SbYsmmbPf9YQuYbHZN52Zh
AdaWorldAPI pushed a commit that referenced this pull request Apr 19, 2026
CORRECTION-OF the 2026-04-19 "Ladybug 700-1100 MB memory blowup" entry.
Per user: there is no 10,000 × 10,000 matrix we actually want. The
blowup was a glitch — a dense 10k × 10k structure imported from
outdated ladybug-rs / bighorn (PRs #200-#203) that ended up in the
binary by accident.

Math:
- 10,000 × 10,000 × f32 = 400 MB (single allocation)
- Plus cognitive-stack state → 700-1,100 MB total observed

Fix: identify and DELETE the glitch allocation. Not a migration.
Candidates: token-token distance matrix, co-occurrence matrix, dense
attention matrix, K=10000 CLAM centroid table. High-probability
locations: cognitive crate, CognitiveShader, BindSpace, CollapseGate,
adaptive codecs imported from ladybug-rs without trimming.

This invalidates:
- "16k rename makes memory worse" — the per-row math was sound but
  irrelevant to this specific blowup.
- Mmap zero-copy requirement — still good hygiene, not the fix here.
- Sparse encoding dependency — still architecturally useful, unrelated
  to the glitch.

16k rename + f32 → BF16 migration proceed independently of this P0
deletion.

https://claude.ai/code/session_01SbYsmmbPf9YQuYbHZN52Zh
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants