
docs: calibration matrix - ONNX vs GGUF vs highheelbgz × ICC profiles #113

Merged
AdaWorldAPI merged 5 commits into main from claude/setup-embedding-pipeline-Fa65C
Apr 5, 2026

Conversation

@AdaWorldAPI
Owner

No description provided.

claude added 5 commits April 5, 2026 17:14
reranker_lens.rs: complete lens module for Jina Reranker v3 BF16
  - 256×256 HDR table baked via include_bytes! (64 KB, L2 resident)
  - 151,936 codebook index (Qwen2 tokenizer, 296 KB)
  - cos[-0.886, +0.826] — widest signed range, most brain-like E/I ratio
  - reranker_lookup/distance/engine/think — full pipeline
  - reranker_relevance(): cross-encoder style query→document scoring
  - cross_model_eval(): Jina embedding × reranker = agreement score

Cross-model evaluation:
  Jina v3 similarity (symmetric) × Reranker relevance (asymmetric)
  Agreement = geometric mean (weakest-link property)
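
The weakest-link property of the geometric mean can be sketched as follows. This is a minimal illustration, not the code in `cross_model_eval()`: the function names `to_unit` and `agreement` are hypothetical, and it assumes both scores are first mapped into [0, 1], since a geometric mean is only defined for non-negative inputs.

```rust
/// Map a cosine in [-1, 1] to a unit score in [0, 1].
/// (Assumed normalization; the source does not state how scores are scaled.)
fn to_unit(cos: f32) -> f32 {
    (cos.clamp(-1.0, 1.0) + 1.0) * 0.5
}

/// Geometric mean of two unit scores.
/// If either score is near zero, the result is near zero (weakest link).
fn agreement(similarity: f32, relevance: f32) -> f32 {
    (similarity.clamp(0.0, 1.0) * relevance.clamp(0.0, 1.0)).sqrt()
}
```

For example, `agreement(0.25, 1.0)` is 0.5, below the arithmetic mean of 0.625, so one weak model score pulls the combined score down more strongly.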

9 tests: table shape, codebook size, diagonal=255, symmetry,
  variance, engine creation, self-relevance, cross-model.

CRITICAL for i8 signed experiment:
  The reranker's symmetric cosine range makes it the BEST model
  for testing excitation/inhibition dynamics. Start the dual-path
  (u8 vs i8) comparison HERE, not on Jina v3 (which is positive-skewed).
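
A rough sketch of the two paths being compared, under stated assumptions: the signed path is taken to be a linear scale into i8, and the unsigned path is taken to be a rank (empirical CDF) code over a sorted reference sample. Both function names and both encodings are illustrative guesses, not the baked-table code in this PR.

```rust
/// Signed path (assumed): scale a cosine linearly into i8, keeping its sign.
/// `max_abs` is the largest magnitude expected, e.g. 0.886 for the reranker.
fn encode_i8(cos: f32, max_abs: f32) -> i8 {
    let scaled = (cos / max_abs).clamp(-1.0, 1.0) * 127.0;
    scaled.round() as i8
}

/// Unsigned CDF path (assumed): encode a cosine by its rank within a sorted
/// reference sample, so codes are spread evenly over the observed values.
/// The sign of the cosine is not recoverable from the code alone.
fn encode_u8_cdf(cos: f32, sorted_ref: &[f32]) -> u8 {
    // Number of reference values that are <= cos.
    let rank = sorted_ref.partition_point(|&v| v <= cos);
    ((rank * 255) / sorted_ref.len().max(1)) as u8
}
```

With a range as symmetric as cos[-0.886, +0.826], the i8 codes use both halves of the signed range, which is what makes the reranker a natural first test case for the comparison.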

https://claude.ai/code/session_01ChLvBfpJS8dQhHxRD4pYNp

LensProfile (camera+lens ICC for distance tables):
  - EncodingPath: RawF32, HdrCdfU8, SignedI8, GammaPhiU8, GammaPhiI8
  - Transfer curve: 256 sample points (ground truth cos → encoded value)
  - Inverse curve: encoded → estimated cos
  - Per-centroid bias, noise floor, effective bits, signed ratio
  - build() compares encoded table vs ground truth f32 cosines
  - rten computes ground truth, profile captures the residual
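
The inverse curve and residual can be pictured with the sketch below. Only the `EncodingPath` variant names come from this commit; `ProfileSketch`, `build_profile`, and the averaging strategy are assumptions made for illustration, and the real `build()` also tracks per-centroid bias, effective bits, and signed ratio.

```rust
#[allow(dead_code)]
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum EncodingPath { RawF32, HdrCdfU8, SignedI8, GammaPhiU8, GammaPhiI8 }

/// Hypothetical, reduced stand-in for LensProfile.
#[allow(dead_code)]
struct ProfileSketch {
    path: EncodingPath,
    /// Inverse curve: encoded value -> estimated cosine.
    inverse: [f32; 256],
    /// Mean |truth - estimate| over all samples (a crude noise floor).
    mean_abs_residual: f32,
}

/// Build the inverse curve by averaging the ground-truth cosine seen for
/// each code; codes never observed reuse the previous observed estimate.
fn build_profile(path: EncodingPath, truth: &[f32], encoded: &[u8]) -> ProfileSketch {
    assert_eq!(truth.len(), encoded.len());
    let mut sum = [0.0f32; 256];
    let mut count = [0u32; 256];
    for (&t, &c) in truth.iter().zip(encoded) {
        sum[c as usize] += t;
        count[c as usize] += 1;
    }
    let mut inverse = [0.0f32; 256];
    let mut last = 0.0f32;
    for i in 0..256 {
        if count[i] > 0 {
            last = sum[i] / count[i] as f32;
        }
        inverse[i] = last;
    }
    let mut resid = 0.0f32;
    for (&t, &c) in truth.iter().zip(encoded) {
        resid += (t - inverse[c as usize]).abs();
    }
    let mean_abs_residual = if truth.is_empty() { 0.0 } else { resid / truth.len() as f32 };
    ProfileSketch { path, inverse, mean_abs_residual }
}
```

Here `truth` would come from rten on the ONNX model and `encoded` from the baked table, so the residual measures what the encoding lost.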

LensConfig 6-lane registry:
  jina-v3:       250K vocab, XlmRoberta, cos[-0.067,0.234], γ=0.37, truth anchor
  bge-m3:        250K vocab, XlmRoberta, cos[-0.07,0.23],   γ=0.40
  reranker-v3:   151K vocab, Qwen2,      cos[-0.886,0.826], γ=1.50, best for i8
  reader-lm:     151K vocab, Qwen2,      cos[-0.095,0.336], γ=0.12
  qwopus-27b:    248K vocab, Qwen2,      cos[-0.23,0.18],   γ=1.50, 4096 centroids
  maverick-128e: 202K vocab, Llama,      TBD,               TBD
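
As a data-structure sketch, the registry above could be held as plain records. The values are copied from the table; the struct layout and field names are hypothetical, vocabulary sizes are kept in the rounded thousands given above, and the two `TBD` cells become `None`.

```rust
#[allow(dead_code)]
#[derive(Debug, Clone, PartialEq)]
struct LensConfig {
    name: &'static str,
    vocab_k: u32,                  // vocabulary size in thousands, as listed
    tokenizer: &'static str,
    cos_range: Option<(f32, f32)>, // None where the table says TBD
    gamma: Option<f32>,            // None where the table says TBD
    note: &'static str,
}

fn registry() -> Vec<LensConfig> {
    let lane = |name, vocab_k, tokenizer, cos_range, gamma, note| LensConfig {
        name, vocab_k, tokenizer, cos_range, gamma, note,
    };
    vec![
        lane("jina-v3", 250, "XlmRoberta", Some((-0.067, 0.234)), Some(0.37), "truth anchor"),
        lane("bge-m3", 250, "XlmRoberta", Some((-0.07, 0.23)), Some(0.40), ""),
        lane("reranker-v3", 151, "Qwen2", Some((-0.886, 0.826)), Some(1.50), "best for i8"),
        lane("reader-lm", 151, "Qwen2", Some((-0.095, 0.336)), Some(0.12), ""),
        lane("qwopus-27b", 248, "Qwen2", Some((-0.23, 0.18)), Some(1.50), "4096 centroids"),
        lane("maverick-128e", 202, "Llama", None, None, ""),
    ]
}

/// Look up one lane by name.
fn lookup(name: &str) -> Option<LensConfig> {
    registry().into_iter().find(|c| c.name == name)
}
```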

60 contract tests passing.

Session complete: reranker wired, ICC DTO ready, i8 architecture documented,
all handover docs committed. Next session can start i8 dual-path on reranker.

https://claude.ai/code/session_01ChLvBfpJS8dQhHxRD4pYNp

calibrate_lenses.rs: measures baked lens quality against ground truth
  - 16 sentence pairs (diverse: similar/dissimilar/related/unrelated)
  - Jina v3 + Reranker v3 baked distances computed
  - Spearman rank correlation against API ground truth
  - ICC profile (linear fit) when ρ < 0.998
  - Thresholds: >0.998 = truth anchor, >0.95 = usable+ICC, <0.95 = broken
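
For reference, Spearman ρ is the Pearson correlation of the ranks, with tied values sharing their average rank. The sketch below is a standalone illustration rather than the code in `calibrate_lenses.rs`; the names `ranks`, `spearman`, `LensQuality`, and `classify` are hypothetical, and the behaviour at exactly 0.95 and 0.998 is a guess because the thresholds above leave it open.

```rust
/// 1-based ranks; tied values share the average of their positions.
fn ranks(values: &[f64]) -> Vec<f64> {
    let mut idx: Vec<usize> = (0..values.len()).collect();
    idx.sort_by(|&a, &b| values[a].total_cmp(&values[b]));
    let mut out = vec![0.0; values.len()];
    let mut i = 0;
    while i < idx.len() {
        let mut j = i;
        while j + 1 < idx.len() && values[idx[j + 1]] == values[idx[i]] {
            j += 1;
        }
        let avg = (i + j) as f64 / 2.0 + 1.0;
        for k in i..=j {
            out[idx[k]] = avg;
        }
        i = j + 1;
    }
    out
}

/// Spearman rho. Returns NaN for empty or constant input.
fn spearman(a: &[f64], b: &[f64]) -> f64 {
    assert_eq!(a.len(), b.len());
    let (ra, rb) = (ranks(a), ranks(b));
    let n = ra.len() as f64;
    let ma = ra.iter().sum::<f64>() / n;
    let mb = rb.iter().sum::<f64>() / n;
    let (mut cov, mut va, mut vb) = (0.0f64, 0.0f64, 0.0f64);
    for (x, y) in ra.iter().zip(&rb) {
        cov += (x - ma) * (y - mb);
        va += (x - ma) * (x - ma);
        vb += (y - mb) * (y - mb);
    }
    cov / (va * vb).sqrt()
}

#[derive(Debug, PartialEq)]
enum LensQuality { TruthAnchor, UsableWithIcc, Broken }

/// Thresholds from the commit message; boundary handling is assumed.
fn classify(rho: f64) -> LensQuality {
    if rho > 0.998 {
        LensQuality::TruthAnchor
    } else if rho > 0.95 {
        LensQuality::UsableWithIcc
    } else {
        LensQuality::Broken
    }
}
```

Because only ranks matter, any monotone encoding of the distances scores ρ = 1, which is why ρ isolates ordering errors from scale errors that an ICC profile can correct.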

Current result with hash tokenization: ρ = 0.13 (BROKEN — expected)
Fix: use real tokenizers per model (Jina=XLM-RoBERTa, Reranker=Qwen2)
Then: set JINA_API_KEY or use rten+ONNX for ground truth

Cross-model calibration: Jina ←ICC→ Reranker ←ICC→ all others
Star topology: N-1 profiles align N models. 512 bytes per pair.
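
The star topology means any two models are aligned by passing through the anchor, so each non-anchor model needs only one profile. The sketch below uses a two-parameter linear fit as a stand-in for the ICC profile; the 512-byte figure suggests the stored profiles are lookup curves rather than a slope and intercept, so treat `LinearIcc` and `convert` as hypothetical names for illustration only.

```rust
/// Linear stand-in for an ICC profile: anchor ≈ slope * model + intercept.
#[derive(Debug, Clone, Copy)]
struct LinearIcc { slope: f64, intercept: f64 }

impl LinearIcc {
    /// Least-squares fit of y (anchor scores) against x (model scores).
    fn fit(x: &[f64], y: &[f64]) -> Option<LinearIcc> {
        if x.len() != y.len() || x.len() < 2 {
            return None;
        }
        let n = x.len() as f64;
        let mx = x.iter().sum::<f64>() / n;
        let my = y.iter().sum::<f64>() / n;
        let (mut sxx, mut sxy) = (0.0f64, 0.0f64);
        for (a, b) in x.iter().zip(y) {
            sxx += (a - mx) * (a - mx);
            sxy += (a - mx) * (b - my);
        }
        if sxx == 0.0 {
            return None; // constant x: slope is undefined
        }
        let slope = sxy / sxx;
        Some(LinearIcc { slope, intercept: my - slope * mx })
    }

    fn to_anchor(&self, v: f64) -> f64 {
        self.slope * v + self.intercept
    }

    /// Inverse map; not meaningful when slope is zero.
    fn from_anchor(&self, v: f64) -> f64 {
        (v - self.intercept) / self.slope
    }
}

/// Convert a score from one model's scale to another's via the anchor.
fn convert(from: &LinearIcc, to: &LinearIcc, v: f64) -> f64 {
    to.from_anchor(from.to_anchor(v))
}
```

With N models this needs N-1 fitted profiles instead of one per pair, at the cost of compounding two residuals on every non-anchor conversion.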

https://claude.ai/code/session_01ChLvBfpJS8dQhHxRD4pYNp

jinaai/jina-embeddings-v5-text-small-text-matching:
  F16 GGUF: 1,198 MB (streamable, v3 format, 310 tensors)
  ONNX: 2,384 MB (rten ground truth, full f32)
  tokenizer.json: 11.4 MB (real BPE)

Calibration: rten(ONNX) = ground truth, stream(GGUF) = baked lens.
Same model, two paths, ICC profile bridges them. No API key.

https://claude.ai/code/session_01ChLvBfpJS8dQhHxRD4pYNp

5 encoding paths × ONNX ground truth × 6 models × 6 roles:
  ONNX (rten) = f32 ground truth
  GGUF raw u8 CDF, GGUF γ+φ, GGUF i8 signed, GGUF highheelbgz spiral

ICC profile per path. Spearman ρ identifies best encoding per model×role.
All tools ready: rten, streaming, gamma_phi, LensProfile, calibrate harness.
Estimated: ~2.5 hours for complete matrix.

https://claude.ai/code/session_01ChLvBfpJS8dQhHxRD4pYNp
@AdaWorldAPI AdaWorldAPI merged commit 4e0298e into main Apr 5, 2026
AdaWorldAPI pushed a commit that referenced this pull request May 13, 2026
…** permissions

PR_ARC PREPEND for #366 (sprint-7 7-worker implementation wave +
AuditSink trait unification). LATEST_STATE header updated +
prepended #366 row. ISSUES.md new entry for the ndarray:master
hpc-extras gap surfaced by MedCare-rs#118 (P2, upstream-blocked).

Adjacent landings recorded inline: MedCare-rs sprint-1 10-PR sweep
(#113-#122) including E1-1 OQ-3 direct migration consuming our
0d725d4 decision; MedCare-rs sprint-2 5 PRs queued (item 5
consumes this PR's new UnifiedBridge::with_jsonl_audit constructor).

settings.json: consolidated per-sprint-log-N entries into single
.claude/board/** glob for Write/Edit/tee. Drops 18 specific entries
in favor of 3 globs. Future sprint-log-N dirs won't need a
permissions patch before spawning workers.

2 participants