fix: technical debt + knowledge update + BF16 distance table plan #123

Merged
AdaWorldAPI merged 4 commits into main from claude/risc-thought-engine-TCZw7 on Apr 6, 2026
Conversation

@AdaWorldAPI
Owner

Summary

Technical debt fixes + knowledge update + BF16 plan for next session.

Fixes

  • builder.rs: RoleTemperatures → Temperature(f32). Dead params removed.
  • signed_engine.rs: WARNING on from_unsigned() — CDF rank relabeling.
  • cronbach.rs: quorum_scores() → variance_agreement_scores() (not Cronbach α).
  • l4_bridge.rs: LIMITATION documented — table rows ≠ centroids.

Knowledge Update

  • .claude/knowledge/signed-session-findings.md — for 13 agents
  • Quality checks: 6 questions every agent must answer before encoding changes

Technical Debt Review

  • .claude/TECHNICAL_DEBT_SIGNED_SESSION.md — 56% useful, 44% bypasses StackedN/BF16
  • The critical misunderstanding: built i8/u8 when architecture says BF16

BF16 Plan (5 phases)

  • .claude/PLAN_BF16_DISTANCE_TABLES.md
  • Phase 1: BF16ThinkingEngine
  • Phase 2: Gate-modulated table (only Up)
  • Phase 3: Wire existing modules
  • Phase 4: Bake BF16 lenses (needs HF streaming)
  • Phase 5: ONNX ICC calibration (needs candle)

After this merge: redo encoding

  1. Jina v5 StackedN stacking → BF16 distance table
  2. Reranker StackedN stacking → BF16 with real negatives
  3. Cronbach α over new BF16 tables
  4. Metadata per table (encoding_metadata.json)
  5. ONNX ICC on top of CORRECT encoding

244 tests pass.

https://claude.ai/code/session_019RzHP8tpJu55ESTxhfUy1A

claude added 4 commits April 6, 2026 05:18
Brutal audit identified these issues:

builder.rs:
  RoleTemperatures { t_gate, t_attn, t_ffn, t_down } → Temperature(f32)
  Only t_gate was ever used. t_attn, t_ffn, t_down were dead parameters.
  Per-role temperature deferred until per-role forward pass exists.
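
The shape of this change can be sketched as a newtype around a single `f32`. This is only an illustration: the removed fields are taken from the commit message, while the `new` constructor, its validation rule, and the `scale` helper are hypothetical and may not match what `builder.rs` actually does.

```rust
// Before (from the commit message): four fields, only `t_gate` was ever read.
// struct RoleTemperatures { t_gate: f32, t_attn: f32, t_ffn: f32, t_down: f32 }

/// After: one temperature, wrapped so it cannot be confused with other f32 values.
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct Temperature(pub f32);

impl Temperature {
    /// Hypothetical constructor: accept only finite, strictly positive values.
    pub fn new(t: f32) -> Option<Self> {
        if t.is_finite() && t > 0.0 {
            Some(Temperature(t))
        } else {
            None
        }
    }

    /// Hypothetical helper: divide a raw score by the temperature.
    pub fn scale(self, score: f32) -> f32 {
        score / self.0
    }
}
```

A newtype keeps the call sites explicit, and a per-role variant can be reintroduced later without touching code that only needs one value.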

signed_engine.rs:
  from_unsigned() WARNING: relabels CDF ranks, NOT real cosine signs.
  u8 CDF tables have Mean=127.5, Std=73.6, so subtracting 128 produces
  symmetric i8 values that LOOK signed but carry zero gate information.
  Use from_f32_cosines() or build_signed_table() for real signed tables.
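
The failure mode described above can be reproduced in a few lines. The sketch below does not use the real `from_unsigned()`; `cdf_rank_u8`, `relabel_to_i8`, and `count_negative` are hypothetical helpers that assume a CDF table stores each value's rank scaled into 0..=255. With that assumption, relabeling a set of strictly positive cosines still yields negative entries for about half of them.

```rust
/// Hypothetical helper: replace each value by its rank, scaled into 0..=255.
fn cdf_rank_u8(values: &[f32]) -> Vec<u8> {
    let n = values.len();
    let mut order: Vec<usize> = (0..n).collect();
    order.sort_by(|&a, &b| values[a].total_cmp(&values[b]));
    let mut ranks = vec![0u8; n];
    for (pos, &idx) in order.iter().enumerate() {
        ranks[idx] = if n > 1 { (pos * 255 / (n - 1)) as u8 } else { 0 };
    }
    ranks
}

/// The relabeling the warning is about: shift a u8 rank into the i8 range.
fn relabel_to_i8(rank: u8) -> i8 {
    (rank as i16 - 128) as i8
}

/// Count entries that come out negative after relabeling.
fn count_negative(values: &[i8]) -> usize {
    values.iter().filter(|&&v| v < 0).count()
}
```

For 256 distinct positive cosines, `count_negative` reports 128 negatives after relabeling. The sign here reflects only whether a value sits below the median rank, not the sign of the underlying cosine.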

cronbach.rs:
  quorum_scores() → variance_agreement_scores()
  The per-pair formula (1 - sqrt(var/max_var)) is NOT Cronbach α.
  It's an ad-hoc variance-based agreement score.
  cronbach_alpha() (the global formula) remains correctly named.
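
To make the distinction concrete, here is a sketch of both quantities side by side. The first function is the standard Cronbach α, α = k/(k−1) · (1 − Σσ²ᵢ / σ²_total), treating each lens as an item and each pair as a subject. The second is the per-pair formula quoted above. The function signatures are illustrative and are not copied from `cronbach.rs`; in particular, how `max_var` is chosen is an assumption left to the caller.

```rust
/// Sample variance (n - 1 denominator). Needs at least two values.
fn variance(xs: &[f64]) -> f64 {
    let n = xs.len() as f64;
    let mean = xs.iter().sum::<f64>() / n;
    xs.iter().map(|x| (x - mean).powi(2)).sum::<f64>() / (n - 1.0)
}

/// Standard Cronbach alpha. `items[i][s]` is the score of item i on subject s.
fn cronbach_alpha(items: &[Vec<f64>]) -> f64 {
    let k = items.len() as f64;
    let n_subjects = items[0].len();
    let item_var_sum: f64 = items.iter().map(|it| variance(it)).sum();
    // Total score per subject: sum over all items.
    let totals: Vec<f64> = (0..n_subjects)
        .map(|s| items.iter().map(|it| it[s]).sum())
        .collect();
    (k / (k - 1.0)) * (1.0 - item_var_sum / variance(&totals))
}

/// Per-pair agreement from the commit message: 1 - sqrt(var / max_var).
/// `max_var` is supplied by the caller (assumption: a fixed normalizer).
fn variance_agreement_score(lens_values: &[f64], max_var: f64) -> f64 {
    1.0 - (variance(lens_values) / max_var).sqrt()
}
```

The first is one global statistic over all pairs; the second is a local score for a single pair, which is why the rename separates them.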

l4_bridge.rs:
  LIMITATION documented: distance table rows ≠ centroid vectors.
  A row is a list of distances, not a representation.
  Real centroids needed for accurate L4 learning (only Qwopus has them).

244 tests pass.

https://claude.ai/code/session_019RzHP8tpJu55ESTxhfUy1A
…dN/BF16

BRUTALLY HONEST REVIEW (.claude/TECHNICAL_DEBT_SIGNED_SESSION.md):

56% useful (pooling, builder, cronbach, auto_detect, tensor_bridge, tokenizer_registry)
44% bypasses existing architecture (signed_engine, dual_engine, l4_bridge, signed_domino)

THE CRITICAL MISUNDERSTANDING:
  I built i8/u8 alternatives when the architecture says BF16.
  StackedN::from_bf16() → ClamCodebook::build_cosine() → BF16 distance table
  is the existing pipeline. I duplicated it with raw f32 → CDF → i8.
  Result: cos=1.000 for all pairs. Confirmed broken.

FIXES IN THIS COMMIT:
  builder.rs: RoleTemperatures { t_gate, t_attn, t_ffn, t_down } → Temperature(f32)
    Dead parameters removed. Only t_gate was ever used.
  signed_engine.rs: from_unsigned() WARNING — CDF rank relabeling, not real signs
  cronbach.rs: quorum_scores() → variance_agreement_scores() (not Cronbach α)
  l4_bridge.rs: LIMITATION documented — table rows ≠ centroids

CORRECT PATH FORWARD:
  BF16 distance table from StackedN cosine (128 KB for 256×256)
  Same precision as source weights. No 8-bit bottleneck.
  Sign IS the gate decision. BF16 exponent IS the natural gamma.
  highheelbgz addresses the table. bgz-tensor encodes the centroids.
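
The size and sign claims above follow from the BF16 layout (1 sign bit, 8 exponent bits, 7 mantissa bits, stored in a `u16`). The sketch below is standalone arithmetic and does not call `StackedN` or `ClamCodebook`; all four function names are hypothetical. The conversion truncates the low 16 bits of the `f32`, which is the simplest option; real converters often round to nearest even instead.

```rust
/// f32 -> BF16 bit pattern by dropping the low 16 mantissa bits (truncation).
fn f32_to_bf16_bits(x: f32) -> u16 {
    (x.to_bits() >> 16) as u16
}

/// BF16 bit pattern -> f32 by zero-filling the low 16 bits.
fn bf16_bits_to_f32(bits: u16) -> f32 {
    f32::from_bits((bits as u32) << 16)
}

/// Bit 15 is the sign bit, so the sign survives the conversion unchanged.
fn bf16_is_negative(bits: u16) -> bool {
    bits & 0x8000 != 0
}

/// Bytes needed for an n x n table of BF16 entries.
fn bf16_table_bytes(n: usize) -> usize {
    n * n * std::mem::size_of::<u16>()
}
```

For a 256×256 table this gives 256 · 256 · 2 = 131,072 bytes, which is the 128 KB quoted above.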

244 tests pass.

https://claude.ai/code/session_019RzHP8tpJu55ESTxhfUy1A
… quality checks

.claude/knowledge/signed-session-findings.md:
  READ BY: 13 agents across lance-graph + ndarray

Key findings every agent needs:
  1. Distance tables should be BF16 (u16), not i8/u8
     - Same precision as StackedN source, sign preserved, exponent = natural gamma
  2. Only Up gets silu(gate) modulation (K/V are raw — different subspace)
  3. Cronbach α: 71.5% disagreement across 3 lenses (superposition not redundant)
  4. Temperature maps to CollapseGate (FLOW/HOLD/BLOCK)
  5. L4 stays i8 (accumulators, different use case)
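
Finding 2 can be illustrated with the usual definition silu(x) = x · sigmoid(x). The sketch below assumes the Up and gate centroids have the same length and are combined elementwise, as in a SwiGLU-style feed-forward block; that pairing and the helper names are assumptions, not the repository's API.

```rust
/// SiLU activation: x * sigmoid(x) = x / (1 + e^(-x)).
fn silu(x: f32) -> f32 {
    x / (1.0 + (-x).exp())
}

/// Hypothetical helper: modulate an Up centroid by silu(gate), elementwise.
/// K/V centroids would skip this step and be used raw.
fn modulate_up(up: &[f32], gate: &[f32]) -> Vec<f32> {
    assert_eq!(up.len(), gate.len(), "up and gate must have equal length");
    up.iter().zip(gate).map(|(u, g)| u * silu(*g)).collect()
}

/// Cosine similarity between two vectors, computed after modulation.
fn cosine(a: &[f32], b: &[f32]) -> f32 {
    let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let na: f32 = a.iter().map(|x| x * x).sum::<f32>().sqrt();
    let nb: f32 = b.iter().map(|x| x * x).sum::<f32>().sqrt();
    dot / (na * nb)
}
```

Because silu is negative for negative inputs, a negative gate value flips the sign of the corresponding Up component before any distance is computed, which is one way the sign of a table entry can carry gate information.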

Quality checks for any encoding proposal:
  - Uses StackedN/ClamCodebook? (not bypass)
  - BF16 precision? (not 8-bit bottleneck)
  - Sign preserved? (bit 15 = gate decision)
  - Gate modulation before CLAM? (on centroids, not table values)
  - Tested against Spearman ρ? (not synthetic data)
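
For the last check, Spearman ρ compares the rank order of two score lists, for example table-derived similarities against reference similarities. A minimal sketch using the no-ties formula ρ = 1 − 6Σd² / (n(n² − 1)) is shown below. It assumes no tied values and at least two entries; data with ties would need average ranks instead. Both function names are illustrative.

```rust
/// Rank of each element (0-based) in ascending order. Assumes no ties.
fn ranks(xs: &[f64]) -> Vec<usize> {
    let mut order: Vec<usize> = (0..xs.len()).collect();
    order.sort_by(|&a, &b| xs[a].total_cmp(&xs[b]));
    let mut r = vec![0usize; xs.len()];
    for (pos, &idx) in order.iter().enumerate() {
        r[idx] = pos;
    }
    r
}

/// Spearman rank correlation via the no-ties shortcut formula.
fn spearman_rho(x: &[f64], y: &[f64]) -> f64 {
    assert_eq!(x.len(), y.len(), "inputs must have equal length");
    let n = x.len() as f64;
    let (rx, ry) = (ranks(x), ranks(y));
    let d2: f64 = rx
        .iter()
        .zip(&ry)
        .map(|(&a, &b)| {
            let d = a as f64 - b as f64;
            d * d
        })
        .sum();
    1.0 - 6.0 * d2 / (n * (n * n - 1.0))
}
```

A value near 1 means the encoding preserves the reference ordering; a value near 0 means the ordering is lost even if individual values look plausible.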

https://claude.ai/code/session_019RzHP8tpJu55ESTxhfUy1A
.claude/knowledge/signed-session-findings.md — for 13 agents
.claude/TECHNICAL_DEBT_SIGNED_SESSION.md — 56% useful, 44% bypass
.claude/PLAN_BF16_DISTANCE_TABLES.md — Phase 1-5 with quality checks

https://claude.ai/code/session_019RzHP8tpJu55ESTxhfUy1A