docs: calibration matrix - ONNX vs GGUF vs highheelbgz × ICC profiles #113

5 encoding paths × ONNX ground truth × 6 models × 6 roles:
- ONNX (rten) = f32 ground truth
- GGUF raw u8 CDF
- GGUF γ+φ
- GGUF i8 signed
- GGUF highheelbgz spiral

ICC profile per path. Spearman ρ identifies best encoding per model×role.

All tools ready: rten, streaming, gamma_phi, LensProfile, calibrate harness.
Estimated: ~2.5 hours for complete matrix.

https://claude.ai/code/session_01ChLvBfpJS8dQhHxRD4pYNp
Merged
reranker_lens.rs: complete lens module for Jina Reranker v3 BF16

- 256×256 HDR table baked via include_bytes! (64 KB, L2 resident)
- 151,936 codebook index (Qwen2 tokenizer, 296 KB)
- cos[-0.886, +0.826]: widest signed range, most brain-like E/I ratio
- reranker_lookup/distance/engine/think: full pipeline
- reranker_relevance(): cross-encoder style query→document scoring
- cross_model_eval(): Jina embedding × reranker = agreement score

Cross-model evaluation:
  Jina v3 similarity (symmetric) × Reranker relevance (asymmetric)
  Agreement = geometric mean (weakest-link property)

9 tests: table shape, codebook size, diagonal=255, symmetry, variance, engine creation, self-relevance, cross-model.

CRITICAL for i8 signed experiment:
  The reranker's symmetric cosine range makes it the BEST model for testing excitation/inhibition dynamics. Start the dual-path (u8 vs i8) comparison HERE, not on Jina v3 (which is positive-skewed).

https://claude.ai/code/session_01ChLvBfpJS8dQhHxRD4pYNp
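The table lookup and the geometric-mean agreement described in this commit can be sketched roughly as follows. This is an illustration only, not the code in reranker_lens.rs: the names `table_lookup`, `to_unit` and `agreement` are hypothetical, the row-major layout is assumed, and the linear u8 → [0, 1] mapping is an assumption about how table entries become scores.

```rust
/// Side length of the baked distance table (256 centroids per axis).
const N: usize = 256;

/// Look up the baked u8 value for a pair of centroid ids.
/// Assumes `table` is row-major with N * N entries (64 KB in total).
fn table_lookup(table: &[u8], a: u8, b: u8) -> u8 {
    assert_eq!(table.len(), N * N, "table must be 256x256");
    table[a as usize * N + b as usize]
}

/// Map a u8 table entry to a score in [0, 1].
/// A plain linear map is an assumption made for this sketch.
fn to_unit(v: u8) -> f32 {
    v as f32 / 255.0
}

/// Geometric mean of two scores in [0, 1].
/// If either score is near zero the result is near zero,
/// which is the weakest-link behaviour mentioned in the commit message.
fn agreement(similarity: f32, relevance: f32) -> f32 {
    let s = similarity.clamp(0.0, 1.0);
    let r = relevance.clamp(0.0, 1.0);
    (s * r).sqrt()
}
```

With a similarity of 0.9 and a relevance of 0.1 the geometric mean is 0.3, well below the arithmetic mean of 0.5, so a single low score pulls the agreement down.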
LensProfile (camera+lens ICC for distance tables):
- EncodingPath: RawF32, HdrCdfU8, SignedI8, GammaPhiU8, GammaPhiI8
- Transfer curve: 256 sample points (ground truth cos → encoded value)
- Inverse curve: encoded → estimated cos
- Per-centroid bias, noise floor, effective bits, signed ratio
- build() compares encoded table vs ground truth f32 cosines
- rten computes ground truth, profile captures the residual

LensConfig 6-lane registry:
  jina-v3:       250K vocab, XlmRoberta, cos[-0.067,0.234], γ=0.37, truth anchor
  bge-m3:        250K vocab, XlmRoberta, cos[-0.07,0.23],   γ=0.40
  reranker-v3:   151K vocab, Qwen2,      cos[-0.886,0.826], γ=1.50, best for i8
  reader-lm:     151K vocab, Qwen2,      cos[-0.095,0.336], γ=0.12
  qwopus-27b:    248K vocab, Qwen2,      cos[-0.23,0.18],   γ=1.50, 4096 centroids
  maverick-128e: 202K vocab, Llama,      TBD,               TBD

60 contract tests passing.

Session complete: reranker wired, ICC DTO ready, i8 architecture documented, all handover docs committed. Next session can start i8 dual-path on reranker.

https://claude.ai/code/session_01ChLvBfpJS8dQhHxRD4pYNp
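To make the transfer-curve and inverse-curve idea concrete, here is a minimal sketch of two simple quantizers and a 256-point transfer curve. None of this is taken from LensProfile itself: the function names are hypothetical, the i8 path is assumed to use a fixed scale of 127, and the u8 path is assumed to stretch a model's observed cosine range linearly (the real HdrCdfU8 and γ+φ paths are CDF- and gamma-based and will differ).

```rust
/// Encode a cosine in [-1, 1] as a signed i8, preserving the sign.
/// Fixed scale of 127 is an assumption for this sketch.
fn encode_i8(cos: f32) -> i8 {
    (cos.clamp(-1.0, 1.0) * 127.0).round() as i8
}

/// Inverse of `encode_i8`: code → estimated cosine.
fn decode_i8(code: i8) -> f32 {
    code as f32 / 127.0
}

/// Encode a cosine as u8 by stretching the range [lo, hi] over 0..=255.
fn encode_u8(cos: f32, lo: f32, hi: f32) -> u8 {
    let t = ((cos - lo) / (hi - lo)).clamp(0.0, 1.0);
    (t * 255.0).round() as u8
}

/// Inverse of `encode_u8`: code → estimated cosine.
fn decode_u8(code: u8, lo: f32, hi: f32) -> f32 {
    lo + (code as f32 / 255.0) * (hi - lo)
}

/// Sample the u8 transfer curve at 256 evenly spaced
/// ground-truth cosines between lo and hi.
fn transfer_curve_u8(lo: f32, hi: f32) -> [u8; 256] {
    let mut curve = [0u8; 256];
    for (i, slot) in curve.iter_mut().enumerate() {
        let cos = lo + (i as f32 / 255.0) * (hi - lo);
        *slot = encode_u8(cos, lo, hi);
    }
    curve
}

/// Largest round-trip error of the u8 path over a set of ground-truth cosines.
fn max_roundtrip_error_u8(samples: &[f32], lo: f32, hi: f32) -> f32 {
    samples
        .iter()
        .map(|&c| (decode_u8(encode_u8(c, lo, hi), lo, hi) - c).abs())
        .fold(0.0, f32::max)
}
```

Under these assumptions the u8 quantization step is (hi − lo)/255 and the i8 step is 1/127. For the reranker-v3 range cos[-0.886, 0.826] the two are of similar size, while for a narrow range such as jina-v3's cos[-0.067, 0.234] a fixed-scale i8 code would leave most of its levels unused.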
calibrate_lenses.rs: measures baked lens quality against ground truth

- 16 sentence pairs (diverse: similar/dissimilar/related/unrelated)
- Jina v3 + Reranker v3 baked distances computed
- Spearman rank correlation against API ground truth
- ICC profile (linear fit) when ρ < 0.998
- Thresholds: >0.998 = truth anchor, >0.95 = usable+ICC, <0.95 = broken

Current result with hash tokenization: ρ = 0.13 (BROKEN, as expected)
Fix: use real tokenizers per model (Jina=XLM-RoBERTa, Reranker=Qwen2)
Then: set JINA_API_KEY or use rten+ONNX for ground truth

Cross-model calibration: Jina ←ICC→ Reranker ←ICC→ all others
Star topology: N-1 profiles align N models. 512 bytes per pair.

https://claude.ai/code/session_01ChLvBfpJS8dQhHxRD4pYNp
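The calibration check listed above (Spearman ρ against ground truth, a linear-fit profile, and the three quality bands) could look roughly like this. It is a sketch under assumptions, not the code in calibrate_lenses.rs: all names are hypothetical, ties get average ranks, and a ρ of exactly 0.95 is treated as broken because the commit message leaves that boundary unspecified.

```rust
/// Assign 1-based ranks; tied values share the average of their positions.
fn ranks(values: &[f64]) -> Vec<f64> {
    let mut order: Vec<usize> = (0..values.len()).collect();
    order.sort_by(|&a, &b| values[a].total_cmp(&values[b]));
    let mut out = vec![0.0; values.len()];
    let mut i = 0;
    while i < order.len() {
        // Extend j over the run of values equal to the one at position i.
        let mut j = i;
        while j + 1 < order.len() && values[order[j + 1]] == values[order[i]] {
            j += 1;
        }
        let avg = (i + j) as f64 / 2.0 + 1.0;
        for &idx in &order[i..=j] {
            out[idx] = avg;
        }
        i = j + 1;
    }
    out
}

/// Returns (mean_x, mean_y, sxy, sxx, syy) for two equal-length slices.
fn moments(x: &[f64], y: &[f64]) -> (f64, f64, f64, f64, f64) {
    assert_eq!(x.len(), y.len());
    let n = x.len() as f64;
    let mx = x.iter().sum::<f64>() / n;
    let my = y.iter().sum::<f64>() / n;
    let (mut sxy, mut sxx, mut syy) = (0.0f64, 0.0f64, 0.0f64);
    for (&a, &b) in x.iter().zip(y) {
        sxy += (a - mx) * (b - my);
        sxx += (a - mx) * (a - mx);
        syy += (b - my) * (b - my);
    }
    (mx, my, sxy, sxx, syy)
}

/// Spearman rho: Pearson correlation of the ranks.
/// Yields NaN if either input is constant.
fn spearman(baked: &[f64], truth: &[f64]) -> f64 {
    let (_, _, sxy, sxx, syy) = moments(&ranks(baked), &ranks(truth));
    sxy / (sxx * syy).sqrt()
}

/// Least-squares fit truth ≈ slope * baked + intercept.
fn linear_fit(baked: &[f64], truth: &[f64]) -> (f64, f64) {
    let (mx, my, sxy, sxx, _) = moments(baked, truth);
    let slope = sxy / sxx;
    (slope, my - slope * mx)
}

#[derive(Debug, PartialEq)]
enum LensQuality {
    TruthAnchor,
    UsableWithIcc,
    Broken,
}

/// Quality bands from the commit message; rho == 0.95 exactly
/// falls into Broken here, which is an assumption.
fn classify(rho: f64) -> LensQuality {
    if rho > 0.998 {
        LensQuality::TruthAnchor
    } else if rho > 0.95 {
        LensQuality::UsableWithIcc
    } else {
        LensQuality::Broken
    }
}
```

Because Spearman ρ depends only on rank order, any monotone distortion of the baked distances leaves it at 1.0, and the linear fit is then what would correct the scale. The reported ρ = 0.13 with hash tokenization lands in the broken band.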
jinaai/jina-embeddings-v5-text-small-text-matching:
  F16 GGUF:       1,198 MB (streamable, v3 format, 310 tensors)
  ONNX:           2,384 MB (rten ground truth, full f32)
  tokenizer.json: 11.4 MB (real BPE)

Calibration: rten(ONNX) = ground truth, stream(GGUF) = baked lens. Same model, two paths, ICC profile bridges them. No API key.

https://claude.ai/code/session_01ChLvBfpJS8dQhHxRD4pYNp
AdaWorldAPI pushed a commit that referenced this pull request on May 13, 2026
…** permissions PR_ARC PREPEND for #366 (sprint-7 7-worker implementation wave + AuditSink trait unification). LATEST_STATE header updated + prepended #366 row. ISSUES.md new entry for the ndarray:master hpc-extras gap surfaced by MedCare-rs#118 (P2, upstream-blocked). Adjacent landings recorded inline: MedCare-rs sprint-1 10-PR sweep (#113-#122) including E1-1 OQ-3 direct migration consuming our 0d725d4 decision; MedCare-rs sprint-2 5 PRs queued (item 5 consumes this PR's new UnifiedBridge::with_jsonl_audit constructor). settings.json: consolidated per-sprint-log-N entries into single .claude/board/** glob for Write/Edit/tee. Drops 18 specific entries in favor of 3 globs. Future sprint-log-N dirs won't need a permissions patch before spawning workers.