feat(examples): universal_hhtld_encode — model-generic encoder with SlotL dispatch #183

Merged
AdaWorldAPI merged 3 commits into main from claude/universal-hhtld-encode
Apr 15, 2026

@AdaWorldAPI
Owner

Summary

Step 4 of the universal_hhtld_encode plan (#179 mindset, #180 foundation, #181 HhtlDTensor, #182 SharedPaletteGroup).

Single example: crates/thinking-engine/examples/universal_hhtld_encode.rs (365 LOC). Model-generic — consumes any BPE-vocab safetensors model.

What it does

Load safetensors → bucket by (component, role, shape) → for each bucket:
    build_group_with_leaf(key, names, rows, k) which internally dispatches:
        argmax regime (qko/v/gate/up/down/projection) → 4 B/row Slot D only
        index  regime (embed/lm_head)                 → 12 B/row Slot D + Slot L
        passthrough   (norms/bias/< is_encodable)     → BF16 unchanged

Gates reported

| # | Gate | Target | Measurement |
|---|------|--------|-------------|
| 1 | Per-row ρ, argmax regime | median ≥ 0.95, p5 ≥ 0.90 | sampled first 64 rows per tensor |
| 1 | Per-row ρ, index regime | median ≥ 0.98, p5 ≥ 0.95 | same |
| 3 | Storage ratio vs BF16 | ≥ 2 : 1 | orig BF16 bytes / total output bytes |

Gates 2 (argmax-parity on held-out prompt) and 4 (WAV envelope vs raw) need integration with tts_full_inference.rs and land in a follow-up PR.
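
Gate 1 reduces to two order statistics over the sampled per-row ρ values. A minimal sketch of that check, assuming ρ is the Pearson correlation between an original row and its reconstruction (the PR does not say which correlation is used) and assuming a nearest-index percentile. `pearson`, `percentile`, and `gate1_passes` are illustrative names, not functions from the example.

```rust
/// Pearson correlation between two equal-length rows.
/// Returns 0.0 for empty or zero-variance input (a choice made for this sketch).
pub fn pearson(a: &[f32], b: &[f32]) -> f64 {
    assert_eq!(a.len(), b.len(), "rows must have the same length");
    if a.is_empty() {
        return 0.0;
    }
    let n = a.len() as f64;
    let mean_a = a.iter().map(|&x| x as f64).sum::<f64>() / n;
    let mean_b = b.iter().map(|&x| x as f64).sum::<f64>() / n;
    let (mut cov, mut var_a, mut var_b) = (0.0f64, 0.0f64, 0.0f64);
    for (&x, &y) in a.iter().zip(b) {
        let dx = x as f64 - mean_a;
        let dy = y as f64 - mean_b;
        cov += dx * dy;
        var_a += dx * dx;
        var_b += dy * dy;
    }
    if var_a == 0.0 || var_b == 0.0 {
        0.0
    } else {
        cov / (var_a.sqrt() * var_b.sqrt())
    }
}

/// Percentile by nearest index: p in 0..=100, no interpolation.
pub fn percentile(values: &[f64], p: f64) -> f64 {
    assert!(!values.is_empty(), "need at least one value");
    let mut sorted = values.to_vec();
    sorted.sort_by(|x, y| x.total_cmp(y));
    let rank = ((p / 100.0) * (sorted.len() - 1) as f64).round() as usize;
    sorted[rank]
}

/// Gate 1: both the median and the 5th percentile must clear their targets.
pub fn gate1_passes(rhos: &[f64], median_min: f64, p5_min: f64) -> bool {
    percentile(rhos, 50.0) >= median_min && percentile(rhos, 5.0) >= p5_min
}
```

With the targets from the table, the argmax regime would call `gate1_passes(&rhos, 0.95, 0.90)` and the index regime `gate1_passes(&rhos, 0.98, 0.95)`.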

Usage

cargo run --release --example universal_hhtld_encode \
    --manifest-path crates/thinking-engine/Cargo.toml \
    -- /path/to/model.safetensors

Design notes

  • reconstruct_row_from_group rebuilds a transient HhtlDTensor from the SharedPaletteGroup's (cache, entries, slot_l, svd_basis) so it can call HhtlDTensor::reconstruct_row. Cleaner: add this as a method on SharedPaletteGroup itself — deferred to keep this PR focused on the validation example.
  • Sample cap of 64 rows per tensor bounds wall time on 151K-row vocab tensors. Gate 2 (argmax parity) requires the full row set and is handled in the follow-up integration PR.
  • bucket_tensors sorts results for stable reporting output across runs.
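
The bucketing step can be sketched as below. This is not the example's code: the `BucketKey` struct, the input shape, and the use of a `BTreeMap` for ordering are assumptions chosen to show one way of getting stable report order across runs.

```rust
use std::collections::BTreeMap;

/// Hypothetical bucket key: (component, role, shape).
/// Deriving `Ord` gives a fixed lexicographic order over the three fields.
#[derive(Debug, Clone, PartialEq, Eq, PartialOrd, Ord)]
pub struct BucketKey {
    pub component: String,
    pub role: String,
    pub shape: (usize, usize),
}

/// Groups tensor names by key.
/// A BTreeMap iterates in key order, and names are sorted inside each bucket,
/// so the output does not depend on the order tensors were read in.
pub fn bucket_tensors(tensors: &[(String, BucketKey)]) -> BTreeMap<BucketKey, Vec<String>> {
    let mut buckets: BTreeMap<BucketKey, Vec<String>> = BTreeMap::new();
    for (name, key) in tensors {
        buckets.entry(key.clone()).or_default().push(name.clone());
    }
    for names in buckets.values_mut() {
        names.sort();
    }
    buckets
}
```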

Stack

| PR | Status | What |
|----|--------|------|
| #180 | merged | SlotL foundation module |
| #181 | merged | HhtlDTensor × SlotL (encode_with_leaf / reconstruct_row) |
| #182 | open (required) | SharedPaletteGroup × SlotL (build_group_with_leaf) |
| #183 | this PR (depends on #182) | universal encoder example + gates 1 + 3 |

Test plan

  • cargo build --release --example universal_hhtld_encode — clean first try
  • cargo run on Qwen3-TTS-0.6B — currently running in background, full result will be posted as a comment when complete
  • Integrate gates 2 + 4 via tts_full_inference in follow-up PR

Follow-ups if this PR passes

  1. Gates 2 + 4 PR: fork of tts_full_inference that loads weights via the encoded .hhtld, then runs a raw vs encoded codec-token comparison and a WAV envelope check
  2. Disk serialisation: .hhtld container with magic byte header (only needed once gates pass end-to-end)
  3. Model-family probes: run on Jina v5, Qwen3-SLM, Qwen3-VL (if open) to confirm the model-generic claim

Follow-ups if GATES FAIL

Per the other session's ranked fallbacks:

| Fail mode | First response |
|-----------|----------------|
| index ρ median < 0.98 | Repack SlotL as `sign + i4 |
| global SVD tops out < 0.99 | Per-leaf local SVD basis (256 × 16 × n_cols overhead) per E5 |
| argmax p5 < 0.90 | Expand palette (k=256→512 or 1024) or add Slot V use beyond magnitude |
| ratio < 2:1 | Audit passthrough count; tighten is_encodable threshold |

https://claude.ai/code/session_01NYGrxVopyszZYgLBxe4hgj

claude added 2 commits April 15, 2026 06:31
Step 3 of the universal_hhtld_encode plan (#179 mindset doc, #180 SlotL
foundation, #181 HhtlDTensor integration). The SVD basis for Slot L now
lives at GROUP granularity — one basis amortised across all same-role
same-shape tensors in the group.

## Changes (additive, backwards-compat)

SharedPaletteGroup gains:
  - pub tensor_slot_l: Vec<(String, Vec<SlotL>, f32)>   // per-tensor leaves
  - pub svd_basis: Option<SvdBasis>                      // shared basis

Existing `build_group_with_fisher_z` constructor defaults both to empty /
None. No caller needs to change.

## New dispatch primitive

  pub fn should_use_leaf(role: &str) -> bool
    true  -> "embed" | "lm_head"   (index-regime, per-row identity needed)
    false -> everything else       (argmax-regime, 4 B/row is enough)

Maps directly to the two-regime split named in
docs/COMPRESSION_MINDSET_SHIFTS.md § "The insight that reframes the rest".
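
A minimal sketch of the dispatch rule as described above. `should_use_leaf` follows the stated signature and role list; the `Regime` enum and `regime_for` helper are illustrative additions, and passthrough tensors (norms, biases) are assumed to be filtered out before this point.

```rust
/// Encoding regime chosen for a tensor role.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Regime {
    /// Only the winning index matters downstream (attention / MLP projections).
    Argmax,
    /// Each row is addressed by identity (embedding table, output head).
    Index,
}

/// Mirrors the rule in the commit message: only "embed" and "lm_head"
/// receive a Slot L leaf; every other role stays at 4 B/row.
pub fn should_use_leaf(role: &str) -> bool {
    matches!(role, "embed" | "lm_head")
}

/// Hypothetical helper mapping a role to its regime.
pub fn regime_for(role: &str) -> Regime {
    if should_use_leaf(role) {
        Regime::Index
    } else {
        Regime::Argmax
    }
}
```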

## New entry point

  pub fn build_group_with_leaf(key, names, rows_f32, k) -> SharedPaletteGroup

Dispatches on key.role:
  - argmax-regime  -> delegates to build_group_with_fisher_z (unchanged)
  - index-regime   -> builds ONE SvdBasis from first tensor's rows
                      (capped at 4096 sample rows for speed on 151K-vocab),
                      then encodes each tensor via encode_with_leaf so the
                      basis is shared across the whole group

Wire cost per row for index-regime groups: 4 B (Slot D + Slot V) + 8 B
(Slot L) = 12 B/row. Basis cost is amortised: one SvdBasis per group,
regardless of tensor count.
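
The amortisation claim can be checked with simple byte arithmetic. The sketch below is an illustration under stated assumptions: `index_group_bytes` is a hypothetical helper, the basis size is passed in rather than derived, and palette and Fisher-Z bytes are left out.

```rust
/// Per-row wire cost in bytes, as stated above.
pub const SLOT_D_V_BYTES: usize = 4; // Slot D + Slot V
pub const SLOT_L_BYTES: usize = 8; // Slot L leaf

/// Hypothetical accounting helper for one index-regime group.
/// `rows_per_tensor` holds the row count of each tensor in the group;
/// `basis_bytes` is the size of the single shared SVD basis, counted once.
pub fn index_group_bytes(rows_per_tensor: &[usize], basis_bytes: usize) -> usize {
    let total_rows: usize = rows_per_tensor.iter().sum();
    total_rows * (SLOT_D_V_BYTES + SLOT_L_BYTES) + basis_bytes
}
```

Doubling the number of tensors in a group doubles the per-row term and leaves the basis term unchanged.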

## Convenience methods on SharedPaletteGroup

  slot_l_byte_size()        -> bytes across all per-tensor Slot L entries
  svd_basis_byte_size()     -> bytes for the shared SVD basis (0 if None)
  slot_l_for(tensor_name)   -> Option<(&[SlotL], f32)> for lookup in
                                reconstruction paths
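
A sketch of how the lookup and size methods could work over the `tensor_slot_l` field described earlier. The `SlotL` and `Group` types here are simplified stand-ins (8 × i8 per leaf, no other fields), and excluding the per-tensor f32 scale from the byte count is an assumption.

```rust
/// Stand-in for the real SlotL: eight i8 coefficients per row.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct SlotL(pub [i8; 8]);

/// Stand-in for SharedPaletteGroup, reduced to the field used here:
/// (tensor name, per-row leaves, scale) for each tensor in the group.
pub struct Group {
    pub tensor_slot_l: Vec<(String, Vec<SlotL>, f32)>,
}

impl Group {
    /// Bytes across all per-tensor Slot L entries (scales not counted).
    pub fn slot_l_byte_size(&self) -> usize {
        self.tensor_slot_l
            .iter()
            .map(|(_, leaves, _)| leaves.len() * std::mem::size_of::<SlotL>())
            .sum()
    }

    /// Leaves and scale for one tensor, or None if the name is unknown.
    pub fn slot_l_for(&self, tensor_name: &str) -> Option<(&[SlotL], f32)> {
        self.tensor_slot_l
            .iter()
            .find(|(name, _, _)| name == tensor_name)
            .map(|(_, leaves, scale)| (leaves.as_slice(), *scale))
    }
}
```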

## Tests (all new pass)

  should_use_leaf_classification ................................. ok
  build_group_with_leaf_falls_back_for_argmax_regime ............. ok
    role="qko" -> no SVD basis, no Slot L (4 B/row preserved)
  build_group_with_leaf_populates_slot_l_for_index_regime ........ ok
    role="embed", 2 tensors × 64 rows × 128 cols -> Slot L populated
    at 8 B/row, basis shared (single SvdBasis)
  svd_basis_shared_across_group_not_per_tensor ................... ok
    Confirms amortisation: basis_size is constant; entries_size scales
    linearly with tensor count.

Full bgz-tensor suite: 150 passing, 4 new = 154. Pre-existing failures
on main (gamma_calibration, hhtl_d_entry_roundtrip, matryoshka,
hhtl_cache) are unchanged — not introduced by this work.

## Relation to prior PRs in session

  #180 (merged) - SlotL module (8 × i8 on shared SVD basis)
  #181 (merged) - HhtlDTensor × SlotL per-tensor integration
  #182 (this)   - SharedPaletteGroup × SlotL group-level amortisation

## Follow-ups

- `universal_hhtld_encode.rs` example: iterate over tensors, bucket by
  (classify_component, classify_role, effective_shape), feed each bucket
  to build_group_with_leaf (which internally dispatches on role)
- .hhtld container format: single-file pack with magic byte header,
  palette + basis + per-tensor entries + Slot L
- Inference wiring: swap tts_full_inference's RVQ codebook sum for
  HhtlDTensor::reconstruct_row on the index-regime tensors

https://claude.ai/code/session_01NYGrxVopyszZYgLBxe4hgj
… with SlotL dispatch

Implements the `universal_hhtld_encode` proposal from
docs/COMPRESSION_MINDSET_SHIFTS.md (#179), built on the #180, #181, #182
foundation stack.

## What it does

Consumes any BPE-vocab safetensors model, buckets tensors by
(component, role, shape), routes each bucket through
bgz_tensor::shared_palette::build_group_with_leaf which auto-dispatches
on role:

  argmax regime  (qko/v/gate/up/down/projection)  → 4 B/row Slot D only
  index  regime  (embed/lm_head)                   → 12 B/row Slot D + Slot L
  passthrough    (norms, biases, < is_encodable)   → BF16 unchanged

## Validation gates

This ships gates 1 + 3 of the 4-gate plan:

  GATE 1: per-row ρ histogram, split by regime
          - argmax: target median ≥ 0.95, p5 ≥ 0.90
          - index:  target median ≥ 0.98, p5 ≥ 0.95
  GATE 3: storage ratio vs BF16 original
          - target ≥ 2:1

Sample-based (first 64 rows per tensor) to keep wall time bounded on
151K-row vocab tensors.

Gates 2 (argmax-parity on held-out prompt) and 4 (WAV envelope match
vs raw) require integration with tts_full_inference.rs and land in a
follow-up PR.

## Usage

  cargo run --release --example universal_hhtld_encode \
      --manifest-path crates/thinking-engine/Cargo.toml \
      -- /path/to/model.safetensors

## Design notes

- reconstruct_row_from_group rebuilds a transient HhtlDTensor from the
  SharedPaletteGroup's (cache, entries, slot_l, svd_basis) so it can
  call HhtlDTensor::reconstruct_row. Cleaner would be a method on
  SharedPaletteGroup; deferred to keep the PR focused on the example.
- Sample cap at 64 rows per tensor: full validation pass on Qwen3-TTS-0.6B
  is O(bucket_count × tensor_count × 64 × n_cols) which bounds wall time.
  For gate 2 (argmax parity) the full row set matters — handled in the
  follow-up integration PR.

## Session PR stack

  #180 (merged) - SlotL foundation
  #181 (merged) - HhtlDTensor × SlotL per-tensor integration
  #182 (this PR's dep) - SharedPaletteGroup × SlotL group-level integration
  #183 (this PR) - universal_hhtld_encode example + gates 1 + 3

https://claude.ai/code/session_01NYGrxVopyszZYgLBxe4hgj
Owner Author

Full run results: ratio passed, reconstruction gates failed

Qwen3-TTS-0.6B end-to-end. Exit 0. ~5 min total wall.

ARGMAX regime: mean ρ = 0.040,  p50 = 0.033,  p5 = 0.000    FAIL
INDEX  regime: mean ρ = 0.157,  p50 = 0.090,  p5 = 0.019    FAIL
RATIO:         243.4 : 1   (1829.3 MB → 7.52 MB)             PASS

Better ratio than projected 126:1, but reconstruction is nearly zero correlation on both regimes.

Root cause

The reconstruction path uses Base17::to_f32(n_cols) to expand palette centroids. That function places 17 i16 values at GOLDEN_POS indices per 17-dim octave — other dimensions = 0. For a 1024-dim row, only ~60 of 1024 dims have non-zero centroid content before SlotL runs. SlotL then tries to correct a residual where the centroid offset is 94% zeros.
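
To make the sparsity figure concrete, here is a toy stand-in for that expansion. It is not `Base17::to_f32`: the function name, the position list, and the stride used in the usage note are all assumptions, chosen only to reproduce "about 60 non-zero entries in a 1024-wide row".

```rust
/// Hypothetical stand-in for a sparse centroid expansion:
/// writes each value at its position and leaves every other dimension at 0.0.
pub fn expand_sparse(values: &[f32], positions: &[usize], n_cols: usize) -> Vec<f32> {
    assert_eq!(values.len(), positions.len(), "one position per value");
    let mut row = vec![0.0f32; n_cols];
    for (&v, &p) in values.iter().zip(positions) {
        assert!(p < n_cols, "position out of range");
        row[p] = v;
    }
    row
}

/// Fraction of entries that are exactly zero.
pub fn zero_fraction(row: &[f32]) -> f64 {
    if row.is_empty() {
        return 0.0;
    }
    row.iter().filter(|&&x| x == 0.0).count() as f64 / row.len() as f64
}
```

Placing 60 non-zero values at a stride of 17 in a 1024-wide row leaves 964 zeros, a zero fraction of about 0.94, which matches the "94% zeros" figure. A residual code with 8 coefficients then has to account for nearly the whole row.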

projection.rs and BGZ_HHTL_D.md both explicitly state this: "encoding is designed for the HHTL cascade, not for decompression back to f32."

I built SlotL-style reconstruction on a primitive that was never meant to decode. Same class of error as HCLAM 256×256 (PR #177 → refuted in #178): accepting a documented-but-unverified quality claim and building on top of it.

What this DOES confirm

  • HhtlDTensor is a lookup address, not a reconstruction centroid. Using it for f32 GEMM was the wrong application. Using it for HHTL cascade table-walks (the thing it was designed for) still works at the published 343:1.
  • encode_with_leaf algorithm is correct: the SlotL module test in #180 passed with ρ ≥ 0.98 on synthetic data with a zero-centroid baseline, isolating the SlotL arithmetic. The Base17 centroid expansion is the broken step, not SlotL itself.
  • The ratio is real — 243:1 stands regardless of reconstruction quality. The storage story works; the f32-reconstruction story does not.

Path A — f32 centroid palette (reconstruction-grade)

Parallel codec, new module crates/bgz-tensor/src/hhtl_f32.rs. Replace Base17 palette entries with CLAM centroids stored as f32 or BF16 vectors (256 × n_cols × 2–4 B = 1–2 MB per palette × 26 groups = 26–52 MB). Keep SlotL on top.

  • Projected ratio: 10–30:1
  • Projected ρ: median ≥ 0.95 argmax, ≥ 0.98 index
  • Scope: ~400 LOC new module, reuses SlotL + SvdBasis. ~1 session.
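
The palette budget above is plain arithmetic and can be sketched as follows. The helper names are illustrative, and n_cols = 2048 in the usage note is an assumption chosen because it reproduces the 1–2 MB per-palette figure; the PR does not state the width.

```rust
/// Bytes for one dense palette: k centroids × n_cols × element width.
pub fn palette_bytes(k: usize, n_cols: usize, bytes_per_elem: usize) -> usize {
    k * n_cols * bytes_per_elem
}

/// Bytes for one palette per group, assuming every group has the same shape.
pub fn total_palette_bytes(groups: usize, k: usize, n_cols: usize, bytes_per_elem: usize) -> usize {
    groups * palette_bytes(k, n_cols, bytes_per_elem)
}
```

With k = 256 and n_cols = 2048, a BF16 palette is 1 MiB and an f32 palette is 2 MiB; across 26 groups that is 26 MiB and 52 MiB.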

Path B — inference in codec space (what HhtlDTensor is for)

Swap tts_full_inference's forward pass from hidden @ reconstructed_W.T → argmax to HHTL cascade lookups via bgz_tensor::hhtl_cache (Skip/Attend/Compose/Escalate) + FisherZTable.

  • Ratio stays at 243:1 / published 343:1
  • Inference is a different computation, not f32 GEMM
  • Scope: multi-session. E5 of the mindset doc.
  • This is what HhtlDTensor was designed for; the current session kept trying to make it do something else.

Recommendation

Path A for the next PR. It's the one-session move that would land a shippable reconstruction-grade codec. Path B is the correct long-term architecture but a bigger rewrite.

Do NOT merge #183. It proves the failure mode but the example as it stands is misleading (243:1 ratio claim without the quality context). Either:

  1. Add a prominent REFUTED banner + keep as historical artifact
  2. Close #183, rework as universal_hhtld_f32_encode.rs in a new PR using the Path A codec

Leaning toward (2) — keep history clean. #180, #181, #182 remain useful primitives that the new codec will consume.

Stack after this finding:

  • #178 open — passthrough fix + Lance roadmap + WAV test
  • #180 merged — SlotL foundation (still correct)
  • #181 merged — HhtlDTensor × SlotL (correct as an algorithmic building block, wrong as a reconstruction path)
  • #182 open — SharedPaletteGroup × SlotL (same caveat)
  • #183 this PR — proposed close in favor of Path A follow-up

https://claude.ai/code/session_01NYGrxVopyszZYgLBxe4hgj


Generated by Claude Code


@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: d1d6d293a4

Comment on lines +311 to +312
let total_output = total_entries_bytes + total_slot_l_bytes + total_svd_basis_bytes
+ total_palette_bytes + passthrough_bytes;

P1: Include Fisher-Z bytes in total output accounting

Gate 3 currently computes total_output from entries, Slot L, SVD basis, palettes, and passthrough tensors, but omits the Fisher-Z tables that are built for every group (group.fisher_z). At k=256, each omitted table is about 64 KB, so across many groups this can undercount output by multiple MB and incorrectly report a PASS on the >= 2:1 storage gate. Please add Fisher-Z byte size to the per-group accumulation and final ratio/reporting.

Comment on lines +261 to +262
if tensor_rows_f32.is_empty() {
return build_group_with_fisher_z(key, tensor_names, tensor_rows_f32, k);

P2: Avoid panic when leaf builder receives no tensors

This empty-input guard still forwards to build_group_with_fisher_z, which immediately indexes tensor_rows_f32[0]. If build_group_with_leaf is called with an empty slice (for example after upstream filtering), this path will panic instead of handling the empty case. The guard should return a safe empty result (or explicit error) rather than dispatching to a function that assumes at least one tensor.
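
One possible shape for the safer guard, shown as a sketch rather than the repository's fix. `GroupError`, `GroupSummary`, and `build_group_checked` are hypothetical names, and the real builder returns a `SharedPaletteGroup`, not the small summary used here.

```rust
/// Hypothetical error type for this sketch.
#[derive(Debug, PartialEq, Eq)]
pub enum GroupError {
    EmptyInput,
    LengthMismatch,
}

/// Stand-in for the real group type; records only what the sketch needs.
#[derive(Debug, PartialEq, Eq)]
pub struct GroupSummary {
    pub n_tensors: usize,
    pub n_cols: usize,
}

/// Validates the input before anything is indexed.
pub fn build_group_checked(
    tensor_names: &[String],
    tensor_rows_f32: &[Vec<Vec<f32>>],
) -> Result<GroupSummary, GroupError> {
    if tensor_names.len() != tensor_rows_f32.len() {
        return Err(GroupError::LengthMismatch);
    }
    // `first()` returns None on an empty slice, so `[0]` is never reached blindly.
    let first = tensor_rows_f32.first().ok_or(GroupError::EmptyInput)?;
    let n_cols = first.first().map_or(0, |row| row.len());
    Ok(GroupSummary {
        n_tensors: tensor_rows_f32.len(),
        n_cols,
    })
}
```

Returning an explicit error keeps the empty case visible to the caller, who can then skip the bucket or report it.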

…#183)

Per codex P1 comment: the per-group Fisher-Z table (k*k i8 + 8 bytes
family gamma = ~64 KB at k=256) was omitted from the total output
byte count. Across 26 groups on Qwen3-TTS-0.6B that's ~1.7 MB
understatement — enough to paint a falsely generous ratio on marginal
models and to undercount on all models.

Changes:
  - total_fisher_z_bytes accumulator
  - group.fisher_z.as_ref().map(|f| f.byte_size()).unwrap_or(0) per group
  - included in total_output
  - new line in Gate 3 summary: "Fisher-Z tables: {mb} MB"

Impact on the current run: total_output goes from 7.52 MB to
~9.2 MB (adds 1.66 MB), so the reported 243:1 ratio is closer to
~199:1. Ratio still passes the ≥ 2:1 gate by two orders of
magnitude — the broken gate here is reconstruction ρ (documented
in the PR #183 comment thread), not storage.
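
The corrected accounting can be reproduced with a few lines. This is a sketch: `GroupBytes` and its field names are illustrative, while the k*k + 8 table size and the MB figures in the usage note come from this commit message and the run results.

```rust
/// Bytes for one Fisher-Z table: k*k i8 entries plus 8 bytes of family gamma.
pub fn fisher_z_bytes(k: usize) -> usize {
    k * k + 8
}

/// Hypothetical per-group byte breakdown.
#[derive(Debug, Default, Clone, Copy)]
pub struct GroupBytes {
    pub entries: usize,
    pub slot_l: usize,
    pub svd_basis: usize,
    pub palette: usize,
    pub fisher_z: usize, // the term that was missing from the total
}

impl GroupBytes {
    /// Sum of every component, including the Fisher-Z table.
    pub fn total(&self) -> usize {
        self.entries + self.slot_l + self.svd_basis + self.palette + self.fisher_z
    }
}

/// Storage ratio as original size over encoded size (same unit for both).
pub fn storage_ratio(orig: f64, output: f64) -> f64 {
    orig / output
}
```

At k = 256 one table is 65,544 bytes, so 26 groups add roughly 1.7 MB. `storage_ratio(1829.3, 7.52)` gives about 243, and `storage_ratio(1829.3, 7.52 + 1.66)` gives about 199.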

This fix is useful even if the recommendation to close #183 stands:
the accounting should be correct in any future Path A / Path B
successor codec example.

Refs: PR #183 codex comment (P1 badge, "Include Fisher-Z bytes in
total output accounting")

https://claude.ai/code/session_01NYGrxVopyszZYgLBxe4hgj
@AdaWorldAPI AdaWorldAPI merged commit 4e0a8a6 into main Apr 15, 2026
AdaWorldAPI pushed a commit that referenced this pull request Apr 15, 2026
Post-#183 finding: Base17 palette substrate can't reconstruct rows for
f32 GEMM (per-row ρ ≈ 0.04 on real Qwen3). This lands both paths the
other session ranked as viable forward directions.

## Path A — HhtlF32Tensor (reconstruction-grade)

New module crates/bgz-tensor/src/hhtl_f32.rs (5 tests passing):

  pub struct HhtlF32Entry { pub twig: u8 }             // 1 byte/row
  pub struct HhtlF32Tensor {
      palette_f32: Vec<Vec<f32>>,    // CLAM centroids in f32
      entries:     Vec<HhtlF32Entry>,
      slot_l:      Option<Vec<SlotL>>,
      slot_l_scale: Option<f32>,
      svd_basis:   Option<SvdBasis>,
      ...
  }

  impl HhtlF32Tensor {
      fn encode(role, rows, k) -> Self;           // 1 B/row, argmax regime
      fn encode_with_leaf(role, rows, k, basis);  // 9 B/row, index regime
      fn reconstruct_row(idx, n_cols) -> Vec<f32>;
      fn reconstruct_rows(n_cols) -> Vec<Vec<f32>>;
  }

Pipeline:
  row     →   CLAM furthest-point  →  twig idx (1 byte)
  residual →  SvdBasis::project    →  SlotL (8 × i8)
  decode:   palette_f32[twig] + SvdBasis::reconstruct(slot_l * scale)
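
The decode line above can be sketched end to end with simplified types. Everything here is a stand-in: `Basis` is a minimal substitute for `SvdBasis` (assumed to hold orthonormal row vectors), `quantise_i8` derives its scale from a single row although the real codec keeps one scale per tensor, and the function signatures are not those of `hhtl_f32.rs`.

```rust
/// Minimal stand-in for the shared SVD basis:
/// `components[j]` is the j-th basis vector, of length n_cols.
pub struct Basis {
    pub components: Vec<Vec<f32>>,
}

impl Basis {
    /// Projects a residual onto each basis vector (one dot product each).
    pub fn project(&self, residual: &[f32]) -> Vec<f32> {
        self.components
            .iter()
            .map(|c| c.iter().zip(residual).map(|(a, b)| a * b).sum::<f32>())
            .collect()
    }

    /// Rebuilds a residual as the weighted sum of basis vectors.
    pub fn reconstruct(&self, coeffs: &[f32], n_cols: usize) -> Vec<f32> {
        let mut out = vec![0.0f32; n_cols];
        for (c, &w) in self.components.iter().zip(coeffs) {
            for (o, &b) in out.iter_mut().zip(c) {
                *o += w * b;
            }
        }
        out
    }
}

/// Quantises coefficients to i8 with one shared scale; returns (codes, scale).
pub fn quantise_i8(coeffs: &[f32]) -> (Vec<i8>, f32) {
    let max_abs = coeffs.iter().fold(0.0f32, |m, &c| m.max(c.abs()));
    let scale = if max_abs > 0.0 { max_abs / 127.0 } else { 1.0 };
    let codes = coeffs
        .iter()
        .map(|&c| (c / scale).round().clamp(-127.0, 127.0) as i8)
        .collect();
    (codes, scale)
}

/// decode: palette[twig] + basis.reconstruct(slot_l * scale)
pub fn reconstruct_row(
    palette: &[Vec<f32>],
    twig: u8,
    slot_l: &[i8],
    scale: f32,
    basis: &Basis,
) -> Vec<f32> {
    let centroid = &palette[twig as usize];
    let coeffs: Vec<f32> = slot_l.iter().map(|&q| q as f32 * scale).collect();
    let residual = basis.reconstruct(&coeffs, centroid.len());
    centroid.iter().zip(&residual).map(|(c, r)| c + r).collect()
}
```

Encoding a row is the mirror image: subtract the chosen centroid, call `project` on the residual, then `quantise_i8` the coefficients.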

Per-tensor footprint for [n_rows, n_cols]:
  palette BF16: 256 × n_cols × 2
  SVD basis:    8 × n_cols × 2
  entries:      n_rows × 1
  slot_l:       n_rows × 8 (if index regime)

Tests (5 new, all passing):
  encode_without_leaf_picks_real_rows_as_centroids
  reconstruct_without_leaf_returns_nearest_centroid
  encode_with_leaf_beats_without_leaf_on_real_rows  ← ρ ≥ 0.95 on low-rank
  entry_byte_size_is_one
  storage_accounting_is_additive

Example: universal_hhtl_f32_encode.rs — same gates as #183 universal
encoder, but uses HhtlF32Tensor. Running on Qwen3-TTS-0.6B in background.

## Path B — cascade_attention_probe (codec-space inference)

New example: cascade_attention_probe.rs. Measures argmax agreement
between:
  Raw:    argmax_i  q · K[i]^T                          (f32 dot)
  Codec:  argmax_i  FisherZTable[pal_idx(q), pal_idx(K[i])]

on 512 perturbed queries against a real attention K matrix (talker
layer 0 self_attn.k_proj, shape [1024, 2048]).

Pass criteria (subjective):
  ≥ 90% top-1 agreement → Path B viable for pipeline-wide swap
  ≥ 70% partial         → Path B needs Q-side escalation layer
  <  70% fail           → not competitive with f32 GEMM

Both runs launched; results will be posted as PR comments when they
complete.
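
The comparison the probe makes can be sketched as below, under stated assumptions: the table is modelled as a k × k `Vec<Vec<i8>>`, palette indices are supplied by the caller, ties resolve to the first maximum, and all names are illustrative rather than taken from cascade_attention_probe.rs.

```rust
/// Index of the largest score; ties resolve to the first maximum.
pub fn argmax(scores: &[f32]) -> usize {
    assert!(!scores.is_empty(), "need at least one score");
    let mut best = 0;
    for (i, &s) in scores.iter().enumerate() {
        if s > scores[best] {
            best = i;
        }
    }
    best
}

/// Raw path: argmax over the f32 dot products q · K[i].
pub fn raw_top1(q: &[f32], keys: &[Vec<f32>]) -> usize {
    let scores: Vec<f32> = keys
        .iter()
        .map(|k| k.iter().zip(q).map(|(a, b)| a * b).sum::<f32>())
        .collect();
    argmax(&scores)
}

/// Codec path: argmax over table[pal_idx(q)][pal_idx(K[i])].
pub fn codec_top1(q_pal: usize, key_pals: &[usize], table: &[Vec<i8>]) -> usize {
    let scores: Vec<f32> = key_pals.iter().map(|&kp| table[q_pal][kp] as f32).collect();
    argmax(&scores)
}

/// Fraction of queries where both paths picked the same key.
pub fn agreement(raw: &[usize], codec: &[usize]) -> f64 {
    assert_eq!(raw.len(), codec.len());
    if raw.is_empty() {
        return 0.0;
    }
    let same = raw.iter().zip(codec).filter(|(a, b)| a == b).count();
    same as f64 / raw.len() as f64
}

/// The three outcomes named in the pass criteria.
#[derive(Debug, PartialEq, Eq)]
pub enum Verdict {
    Viable,
    NeedsEscalation,
    NotCompetitive,
}

/// Maps a top-1 agreement fraction (0.0..=1.0) to a verdict.
pub fn classify(agreement: f64) -> Verdict {
    if agreement >= 0.90 {
        Verdict::Viable
    } else if agreement >= 0.70 {
        Verdict::NeedsEscalation
    } else {
        Verdict::NotCompetitive
    }
}
```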

## Session PR stack

  #180 merged   SlotL foundation
  #181 merged   HhtlDTensor × SlotL
  #182 merged   SharedPaletteGroup × SlotL
  #183 merged   universal_hhtld_encode (Base17 — reconstruction failure documented)
  #184 this PR  HhtlF32Tensor codec + Path A/B examples

https://claude.ai/code/session_01NYGrxVopyszZYgLBxe4hgj
AdaWorldAPI pushed a commit that referenced this pull request Apr 15, 2026
Per codex P1 comment on #184: HhtlF32Entry.twig is u8, so valid
centroid IDs are 0..=255. Before this fix, encode() accepted any k
and assign_nearest_f32 silently wrapped ci as u8 — passing k=300
(say) would assign centroid-300 as twig-44 and reconstruct the wrong
row. This was actively dangerous because the next-session plan (PR
#184 thread) explicitly proposed k=1024 or 2048 centroids as the
quality fallback.

Fix:
  - New `pub const MAX_PALETTE_K: usize = 256` with clear docstring
  - Both `encode` and `encode_with_leaf` now assert:
      k > 0
      k <= MAX_PALETTE_K
    with explicit panic messages naming the u8 twig limit

Larger palettes need a codec with a wider twig-index (u16 would lift
the cap to 65536, but changes the wire format). That's a separate PR
if/when the quality probe shows k=512+ earns its keep.
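
The wrap and the guard can be shown in isolation. In this sketch `MAX_PALETTE_K` matches the constant named above, while `assert_palette_k` and `twig_from_index` are hypothetical helpers and the panic messages are not the repository's exact wording.

```rust
/// Largest palette a u8 twig index can address (IDs 0..=255).
pub const MAX_PALETTE_K: usize = 256;

/// Panics unless 0 < k <= MAX_PALETTE_K, mirroring the asserts described above.
pub fn assert_palette_k(k: usize) {
    assert!(k > 0, "k > 0 required");
    assert!(
        k <= MAX_PALETTE_K,
        "k = {} exceeds the u8 twig limit of {}",
        k,
        MAX_PALETTE_K
    );
}

/// Converts a centroid index to a 1-byte twig, refusing values that
/// a plain `as u8` cast would silently wrap.
pub fn twig_from_index(ci: usize) -> Option<u8> {
    if ci <= u8::MAX as usize {
        Some(ci as u8)
    } else {
        None
    }
}
```

For reference, `300usize as u8` evaluates to 44, which is the wrap described in this commit; `twig_from_index(300)` returns `None` instead.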

Tests (4 new, all pass + 5 existing):
  encode_rejects_zero_k            (#[should_panic = "k > 0"])
  encode_rejects_k_above_256       (#[should_panic = "u8 twig limit"])
  encode_with_leaf_rejects_k_above_256  (same)
  encode_accepts_k_at_max_palette  (k=256 must still succeed)

Refs:
  - PR #184 codex P1 comment ("Reject palette sizes that exceed 255 centroids")
  - Follow-up to merged PRs #180/#181/#182/#183/#184

https://claude.ai/code/session_01NYGrxVopyszZYgLBxe4hgj
AdaWorldAPI pushed a commit that referenced this pull request Apr 15, 2026
Session-end artefact for future déjà-vu. Catalogues every compression
approach tried in PRs #176-#185 and the lesson each one produced. No
approach is thrown away — each failed experiment carries information
about where the real boundary is.

## Structure

### Core invariants (6)
  I1. Two regimes, opposite needs (argmax vs index)
  I2. Near-orthogonality of weight rows in high dim
  I3. Direction vs amplitude cannot be merged into one scalar
  I4. Wire-format type widths are hard caps — assert at encode time
  I5. 'u8 can span u16/u64 effective' requires the right decoder
  I6. The ticket-for-curve model (SpiralAddress + shared curve)

### Approaches tried (7)
  A1. HhtlDTensor — Base17 + Slot D + Slot V (correct for cascade, wrong for f32 GEMM)
  A2. Progressive residual RVQ with k-ladder (works argmax, fails index)
  A3. Hierarchical CLAM 256x256 (REFUTED — cos 0.0046 on vocab)
  A4. Passthrough BF16 n_rows > 8192 (SHIPS for correctness, net loss for ratio)
  A5. SlotL 8 x i8 on SVD basis (correct algorithm, misapplied to Base17 centroid)
  A6. HhtlF32Tensor f32 palette + SlotL (right direction, 10x better, still short)
  A7. cascade_attention_probe Base17 palette (3.71% argmax agreement — palette doesn't preserve inner products)

### Abstractions that ARE the right primitive (3)
  R1. highheelbgz::rehydrate::SpiralEncoding (exists, untested on real Qwen3)
  R2. Per-role stride in NeuronPrint (q/k=3, v=5, gate=8, up=2, down=4)
  R3. HHTL cascade inference (hhtl_cache RouteAction)

### Open probes (4)
  P1. SpiralEncoding on real Qwen3 weights — claim rho >= 0.95 unproven
  P2. Shared anchors + i8 position per row — depends on P1
  P3. Palette preserves inner-product neighbourhoods — A7 refuted for Base17
  P4. Log-radial CLAM with magnitude split — hypothesised > linear CLAM

### Déjà-vu table

Lists 7 'if you're tempted to...' instincts with the PR that already
refuted them. Exists so future sessions hit the lesson before writing
the code.

### Structural checklist (5 questions)

Before shipping any new codec:
  1. What regime does this tensor belong to? (I1)
  2. Does the codec encode direction AND amplitude separately? (I3)
  3. Is the palette substrate inner-product-preserving? (I2, A7)
  4. Does the decoder evaluate the curve, or tile anchors? (I5)
  5. Are wire-format widths asserted at encode time? (I4)

## Why this doc matters

Every failed approach in this session taught something the next session
would otherwise re-learn the hard way. HCLAM (#177->#178) already has
its lesson buried in a passthrough commit. The Base17 reconstruction
failure (#183) is buried in a PR comment. The #184 Path A/B duality
(they aren't independent) is only visible if you read the probe results.

This doc surfaces all of it as a single index, structured for mutation:
each approach has 'mutation hooks' naming how it could evolve into
something that works, rather than being discarded.

## Next step blocked by token budget

The SpiralEncoding-on-real-Qwen3 probe (P1) is the obvious next
experiment and would have landed in this PR. Deferred to a fresh
session with budget. The doc leaves the probe fully specified so
re-entering cold loses no context.

https://claude.ai/code/session_01NYGrxVopyszZYgLBxe4hgj
AdaWorldAPI pushed a commit that referenced this pull request Apr 17, 2026
Codifies 7 anti-patterns (AP1-AP7) learned from PRs #176-#188 into
an agent card that fires flags when the session repeats them:

  AP1: "225/225 feels like success" without gate 2 (#178)
  AP2: Projecting quality from docs instead of measuring (#177)
  AP3: Building new codec before benching existing ones (#184)
  AP4: Centroid-residual framing on near-orthogonal data (#177/#183)
  AP5: Python in the inference hot path
  AP6: Chained score multiplication without chain-collapse check (P5)
  AP7: Modifying ndarray without explicit permission (#176)

Invoked by adk-coordinator when pattern repetition is suspected, or
by human directly. Output: list of fired flags, max 7 lines.

Also audited all 29 agent cards across both repos:
  - All pin model: opus or model: sonnet (no hardcoded versions)
  - opus → Opus 4.7 automatically, sonnet → Sonnet 4.6
  - 3 ndarray agents on sonnet (l3-strategist, migration-tracker,
    product-engineer) — intentional for speed-over-depth roles
  - adk-coordinator missing Bash tool (by design — delegates)
  - sentinel-qa missing Edit/Write (by design — audit-only)

No agent changes needed for Opus 4.7 compatibility — model: opus
resolves correctly.

https://claude.ai/code/session_01NYGrxVopyszZYgLBxe4hgj