fix(bgz-tensor): reject HhtlF32Tensor palette sizes >= 256 (codex #184 P1) #185

Merged: AdaWorldAPI merged 1 commit into main from claude/hhtl-f32-palette-bounds on Apr 17, 2026

@AdaWorldAPI (Owner)

Summary

Follow-up to merged #184. Addresses codex P1 comment on that PR: "Reject palette sizes that exceed 255 centroids."

Single file: crates/bgz-tensor/src/hhtl_f32.rs (+59 LOC).

The bug codex caught

HhtlF32Entry.twig is u8 so valid centroid IDs are 0..=255. Before this fix:

  • encode(role, rows, k) accepted any k.
  • clam_furthest_point_f32 capped k at rows.len() but not at 256.
  • assign_nearest_f32 returned ci as u8, silently wrapping (e.g. 300 → 44).

Result: with k = 1024, rows belonging to centroid 300 would be silently stored under twig 44, corrupting every downstream ρ / argmax measurement while the run still appeared to work.
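The wrap is ordinary Rust `as` truncation: casting a `usize` to `u8` keeps only the low 8 bits. A minimal sketch, independent of the crate, contrasting the silent cast with a checked conversion. The function names `wrapping_twig` and `checked_twig` are illustrative and are not part of bgz-tensor:

```rust
// Needed on editions before 2021; redundant (but harmless) on 2021+.
use std::convert::TryFrom;

/// Illustrative only: mimics storing a centroid index in a u8 twig via `as`.
/// `as` truncates to the low 8 bits, so indices >= 256 alias smaller ones.
fn wrapping_twig(ci: usize) -> u8 {
    ci as u8
}

/// Illustrative alternative: a checked conversion that reports overflow
/// instead of aliasing another centroid.
fn checked_twig(ci: usize) -> Option<u8> {
    u8::try_from(ci).ok()
}
```

Here `wrapping_twig(300)` yields 44, matching the wrap described above, while `checked_twig(300)` returns `None`.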

This was actively dangerous because #184's follow-up plan explicitly proposed k = 1024 or 2048 as the quality-improvement fallback (per the reviewer's E5 suggestion). If someone ran the Path A encoder with k = 1024 they'd get a clean exit and bogus numbers.

Fix

pub const MAX_PALETTE_K: usize = 256;

impl HhtlF32Tensor {
    pub fn encode(role: &str, rows_f32: &[Vec<f32>], k: usize) -> Self {
        assert!(k > 0, "HhtlF32Tensor::encode requires k > 0, got {}", k);
        assert!(k <= MAX_PALETTE_K,
            "HhtlF32Tensor::encode requires k <= {} (u8 twig limit), got {}",
            MAX_PALETTE_K, k);
        // ...
    }

    pub fn encode_with_leaf(...) -> Self {
        // same bounds checks
    }
}

Future-work note baked into the doc: larger palettes need a codec with a wider twig-index (u16 would lift the cap to 65536, but it's a wire format change). That lands in a separate PR if/when the quality probe demonstrates k ≥ 512 earns its keep over k=256 + deeper SlotL basis.
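Both caps follow from the index width: an n-bit unsigned index can name 2^n centroids. A small sketch of that arithmetic, where `palette_cap_for_bits`, `CAP_U8`, and `CAP_U16` are hypothetical names and not an API of the crate:

```rust
/// Hypothetical helper: number of distinct centroid IDs an unsigned index of
/// `bits` bits can address (IDs run 0..=2^bits - 1).
const fn palette_cap_for_bits(bits: u32) -> usize {
    1usize << bits
}

/// Cap for the current u8 twig; equal in value to the PR's MAX_PALETTE_K.
const CAP_U8: usize = palette_cap_for_bits(u8::BITS);

/// Cap a u16 twig would allow (a wire-format change, per the note above).
const CAP_U16: usize = palette_cap_for_bits(u16::BITS);
```

`CAP_U8` evaluates to 256 and `CAP_U16` to 65536.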

Tests

encode_rejects_zero_k ............................... ok  (#[should_panic = "k > 0"])
encode_rejects_k_above_256 .......................... ok  (#[should_panic = "u8 twig limit"])
encode_with_leaf_rejects_k_above_256 ................ ok  (same)
encode_accepts_k_at_max_palette ..................... ok  (k=256 still succeeds)
5 existing tests ...................................... ok
───────────────────────────────────────────────────
9 passed; 0 failed
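For readers without the crate at hand, the shape of these tests can be sketched against a stand-in. `check_palette_k` below is a hypothetical function that reproduces only the two assertions from the fix; the real tests call `encode` and `encode_with_leaf` directly, and the test names here are shortened for illustration:

```rust
/// Stand-in for the PR's constant.
pub const MAX_PALETTE_K: usize = 256;

/// Hypothetical stand-in for the bounds checks at the top of `encode`;
/// the real method goes on to build the palette.
pub fn check_palette_k(k: usize) {
    assert!(k > 0, "HhtlF32Tensor::encode requires k > 0, got {}", k);
    assert!(
        k <= MAX_PALETTE_K,
        "HhtlF32Tensor::encode requires k <= {} (u8 twig limit), got {}",
        MAX_PALETTE_K,
        k
    );
}

#[cfg(test)]
mod tests {
    use super::*;

    // `expected` matches a substring of the panic message.
    #[test]
    #[should_panic(expected = "k > 0")]
    fn rejects_zero_k() {
        check_palette_k(0);
    }

    #[test]
    #[should_panic(expected = "u8 twig limit")]
    fn rejects_k_above_256() {
        check_palette_k(MAX_PALETTE_K + 1);
    }

    // The boundary value itself must still be accepted.
    #[test]
    fn accepts_k_at_max_palette() {
        check_palette_k(MAX_PALETTE_K);
    }
}
```

Matching on a message substring keeps the two rejection paths distinguishable, so a k = 0 panic cannot satisfy the upper-bound test by accident.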

Stack

  • #180/181/182/183/184 merged
  • #185 (this PR) — P1 codex fix for HhtlF32Tensor palette bounds

https://claude.ai/code/session_01NYGrxVopyszZYgLBxe4hgj

Per codex P1 comment on #184: HhtlF32Entry.twig is u8, so valid
centroid IDs are 0..=255. Before this fix, encode() accepted any k
and assign_nearest_f32 silently wrapped ci as u8: with any k > 300,
centroid 300 would be stored as twig 44 and reconstruct the wrong
row. This was actively dangerous because the next-session plan (PR
#184 thread) explicitly proposed k=1024 or 2048 centroids as the
quality fallback.

Fix:
  - New `pub const MAX_PALETTE_K: usize = 256` with clear docstring
  - Both `encode` and `encode_with_leaf` now assert:
      k > 0
      k <= MAX_PALETTE_K
    with explicit panic messages naming the u8 twig limit

Larger palettes need a codec with a wider twig-index (u16 would lift
the cap to 65536, but changes the wire format). That's a separate PR
if/when the quality probe shows k=512+ earns its keep.

Tests (4 new, all pass + 5 existing):
  encode_rejects_zero_k            (#[should_panic = "k > 0"])
  encode_rejects_k_above_256       (#[should_panic = "u8 twig limit"])
  encode_with_leaf_rejects_k_above_256  (same)
  encode_accepts_k_at_max_palette  (k=256 must still succeed)

Refs:
  - PR #184 codex P1 comment ("Reject palette sizes that exceed 255 centroids")
  - Follow-up to merged PRs #180/#181/#182/#183/#184

https://claude.ai/code/session_01NYGrxVopyszZYgLBxe4hgj

AdaWorldAPI pushed a commit that referenced this pull request on Apr 15, 2026:

Session-end artefact for future déjà-vu. Catalogues every compression
approach tried in PRs #176-#185 and the lesson each one produced. No
approach is thrown away — each failed experiment carries information
about where the real boundary is.

## Structure

### Core invariants (6)
  I1. Two regimes, opposite needs (argmax vs index)
  I2. Near-orthogonality of weight rows in high dim
  I3. Direction vs amplitude cannot be merged into one scalar
  I4. Wire-format type widths are hard caps — assert at encode time
  I5. 'u8 can span u16/u64 effective' requires the right decoder
  I6. The ticket-for-curve model (SpiralAddress + shared curve)

### Approaches tried (7)
  A1. HhtlDTensor — Base17 + Slot D + Slot V (correct for cascade, wrong for f32 GEMM)
  A2. Progressive residual RVQ with k-ladder (works argmax, fails index)
  A3. Hierarchical CLAM 256x256 (REFUTED — cos 0.0046 on vocab)
  A4. Passthrough BF16 n_rows > 8192 (SHIPS for correctness, net loss for ratio)
  A5. SlotL 8 x i8 on SVD basis (correct algorithm, misapplied to Base17 centroid)
  A6. HhtlF32Tensor f32 palette + SlotL (right direction, 10x better, still short)
  A7. cascade_attention_probe Base17 palette (3.71% argmax agreement — palette doesn't preserve inner products)
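
The agreement figure in A7 can be read as a simple metric. A sketch, assuming "argmax agreement" means the fraction of queries for which the approximate scores and the exact scores pick the same top index; the functions `argmax` and `argmax_agreement` are illustrative and are not taken from the probe's source:

```rust
/// Index of the largest value in `scores`; `None` if the slice is empty.
/// Ties resolve to the first maximum. Assumes no NaNs.
fn argmax(scores: &[f32]) -> Option<usize> {
    let mut best: Option<(usize, f32)> = None;
    for (i, &s) in scores.iter().enumerate() {
        match best {
            // Keep the current best unless `s` is strictly larger.
            Some((_, b)) if s <= b => {}
            _ => best = Some((i, s)),
        }
    }
    best.map(|(i, _)| i)
}

/// Fraction of queries whose argmax under `approx` matches `exact`.
/// Each inner Vec holds one query's scores; the two sets must align.
fn argmax_agreement(exact: &[Vec<f32>], approx: &[Vec<f32>]) -> f64 {
    assert_eq!(exact.len(), approx.len(), "score sets must align");
    if exact.is_empty() {
        return 0.0;
    }
    let hits = exact
        .iter()
        .zip(approx)
        .filter(|&(e, a)| argmax(e) == argmax(a))
        .count();
    hits as f64 / exact.len() as f64
}
```

Under this reading, a value of 0.0371 would mean the palette-based scores select the same winner as the exact scores for fewer than 4 in 100 queries.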

### Abstractions that ARE the right primitive (3)
  R1. highheelbgz::rehydrate::SpiralEncoding (exists, untested on real Qwen3)
  R2. Per-role stride in NeuronPrint (q/k=3, v=5, gate=8, up=2, down=4)
  R3. HHTL cascade inference (hhtl_cache RouteAction)

### Open probes (4)
  P1. SpiralEncoding on real Qwen3 weights — claim rho >= 0.95 unproven
  P2. Shared anchors + i8 position per row — depends on P1
  P3. Palette preserves inner-product neighbourhoods — A7 refuted for Base17
  P4. Log-radial CLAM with magnitude split — hypothesised > linear CLAM

### Déjà-vu table

Lists 7 'if you're tempted to...' instincts with the PR that already
refuted them. Exists so future sessions hit the lesson before writing
the code.

### Structural checklist (5 questions)

Before shipping any new codec:
  1. What regime does this tensor belong to? (I1)
  2. Does the codec encode direction AND amplitude separately? (I3)
  3. Is the palette substrate inner-product-preserving? (I2, A7)
  4. Does the decoder evaluate the curve, or tile anchors? (I5)
  5. Are wire-format widths asserted at encode time? (I4)

## Why this doc matters

Every failed approach in this session taught something the next session
would otherwise re-learn the hard way. HCLAM (#177->#178) already has
its lesson buried in a passthrough commit. The Base17 reconstruction
failure (#183) is buried in a PR comment. The #184 Path A/B duality
(they aren't independent) is only visible if you read the probe results.

This doc surfaces all of it as a single index, structured for mutation:
each approach has 'mutation hooks' naming how it could evolve into
something that works, rather than being discarded.

## Next step blocked by token budget

The SpiralEncoding-on-real-Qwen3 probe (P1) is the obvious next
experiment and would have landed in this PR. Deferred to a fresh
session with budget. The doc leaves the probe fully specified so
re-entering cold loses no context.

https://claude.ai/code/session_01NYGrxVopyszZYgLBxe4hgj

AdaWorldAPI merged commit f0aa58b into main on Apr 17, 2026