From c8ae8ab3fb4cde29762bc958a4e477f2adbbf670 Mon Sep 17 00:00:00 2001
From: Claude
Date: Mon, 6 Apr 2026 09:58:26 +0000
Subject: [PATCH] =?UTF-8?q?docs:=20=CF=86-spiral=20reconstruction=20?=
 =?UTF-8?q?=E2=80=94=20stride/offset/gamma/Zeckendorf=20theory?=
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

.claude/knowledge/phi-spiral-reconstruction.md:
For all agents working on encoding, distance tables, or compression.

Key concepts:
1. Interpolation = spiral evaluation, NOT averaging
   φ-spiral r(θ) = a×e^(bθ) gives exact reconstruction
2. Stride selects OCTAVES (not bins). All 17 bins always sampled.
   Offset should be φ-fractional: floor(n_octaves × 0.618...)
3. Two corrections: γ (distribution shape) vs stride (sampling sparsity)
   γ = per-role metadata (28 bytes). Stride = spiral fit.
4. Metadata chain: every table carries its full provenance
5. Zeckendorf connection: Fibonacci positions = φ-spiral coordinates
   Reconstruction between Fibonacci positions = optimal coverage

Research direction: extended Zeckendorf proof would give LOGICAL ground truth
  ρ ≥ 1-ε where ε is computable from spiral curvature
  Replaces empirical ICC with mathematical guarantee
  ONNX/Spearman become validation of the proof, not the proof itself

Updated quality checks (10 items, stricter than before).
https://claude.ai/code/session_019RzHP8tpJu55ESTxhfUy1A
---
 .../knowledge/phi-spiral-reconstruction.md    | 249 ++++++++++++++++++
 1 file changed, 249 insertions(+)
 create mode 100644 .claude/knowledge/phi-spiral-reconstruction.md

diff --git a/.claude/knowledge/phi-spiral-reconstruction.md b/.claude/knowledge/phi-spiral-reconstruction.md
new file mode 100644
index 00000000..fa9d56ce
--- /dev/null
+++ b/.claude/knowledge/phi-spiral-reconstruction.md
@@ -0,0 +1,249 @@
+# KNOWLEDGE UPDATE: φ-Spiral Reconstruction — The Core Superpower
+
+## READ BY: family-codec-smith, palette-engineer, savant-research,
+## truth-architect, cascade-architect, integration-lead
+
+---
+
+## 1. INTERPOLATION = SPIRAL EVALUATION, NOT AVERAGING
+
+```
+WRONG (halftone, deleted):
+  bin[3] missing → bin[3] = (bin[2] + bin[4]) / 2
+  Linear average. Destroys spiral structure.
+  Like replacing a curve with a straight line on a map.
+
+RIGHT (φ-spiral reconstruction):
+  bin[3] missing → position 3 on the φ-spiral is KNOWN
+  θ = 3 × golden_angle = 3 × 2π/φ² ≈ 3 × 137.507°
+  r = f(3) from the spiral equation r(θ) = a × e^(bθ)
+  bin[3] = spiral(θ, r) → EXACT reconstruction
+  Because the spiral IS the constraint. Not a guess.
+```
+
+The golden ratio is the WORST case for rational approximation
+(Hurwitz's theorem). This means:
+- No two φ-spiral positions alias to the same quantization bucket
+- The spiral fills space MAXIMALLY uniformly (Weyl equidistribution)
+- Fewer points suffice to identify the spiral than any other curve
+
+## 2. STRIDE AND OFFSET SELECTION
+
+### Octave stride
+
+```
+BF16 weight vector: 5120 dimensions
+Base17: 17 bins
+Octaves: ceil(5120 / 17) = 302 octaves
+
+Stride=1: sample ALL 302 octaves → 302 × 17 = 5134 BF16→f64 conversions
+Stride=4: every 4th octave → 76 × 17 = 1292 conversions (4× faster)
+Stride=16: every 16th octave → 19 × 17 = 323 conversions (16× faster)
+Stride=20: every 20th octave → 16 × 17 = 272 conversions (19× faster)
+
+The stride selects WHICH octaves to sample.
+NOT which bins. ALL 17 bins are always sampled.
+```
+
+### Why stride works (the spiral argument)
+
+```
+Each octave is one 17-position spiral turn.
+Consecutive octaves are CORRELATED (smooth weight variations).
+Stride=16 means: sample every 16th turn of the spiral.
+
+IF the spiral is smooth (weights don't jump wildly between octaves):
+  stride=16 captures the spiral shape with 19 points
+  the 283 skipped octaves lie ON the same spiral
+  reconstruction from 19 points = evaluate spiral at 302 positions
+
+IF the spiral has high-frequency variation:
+  stride=16 misses sharp features (aliasing)
+  stride=1 needed for these tensors
+  → detect via: compare stride=1 vs stride=16 Pearson ρ
+  → if ρ > 0.99: stride=16 is safe (spiral is smooth)
+  → if ρ < 0.95: use stride=1 (high-frequency content)
+```
+
+### Offset selection
+
+```
+Default: offset=0 (start at first octave)
+Better: offset = golden ratio fractional part
+
+offset = floor(n_octaves × (φ - 1)) = floor(302 × 0.618...) = 186
+
+Starting at octave 186 instead of 0:
+  stride=16 samples octaves: 186, 202, 218, 234, 250, 266, 282, 298, 12, 28, ...
+  (wrapping around at 302)
+
+  This MAXIMIZES the coverage because φ-offset + stride
+  guarantees the sample positions don't cluster.
+
+  vs offset=0: samples 0, 16, 32, ... (regular, but misses middle if stride too big)
+  vs offset=186: φ-scattered across the full range
+
+The offset IS the golden-angle sampling that highheelbgz already does.
+It's the same φ-distribution applied to octave selection.
+```
+
+## 3. CORRECTION: BENDING vs COMPRESSION
+
+Two sources of error, two different corrections:
+
+### Bending (γ correction) — distribution SHAPE is wrong
+
+```
+Problem: raw cosine distribution is NOT uniform.
+  Gate weights: 68.9% near zero, thin tails.
+  Attention: broad, nearly symmetric.
+  Down: narrow, one-sided.
+
+  Uniform quantization wastes bits on empty regions.
+  Gate values near zero get the SAME resolution as gate values at 0.3
+  But the SiLU decision boundary IS at zero → needs MORE resolution there.
+
+Fix: gamma_phi_encode(value, role_gamma, phi_scale)
+  Stage 1: γ-normalize by role (compress highlights, expand shadows)
+    gamma_encode(v, γ) = sign(v) × ln(1 + |v|/γ) × γ
+    Gate γ=1.50 → MOST expansion near zero
+    Q γ=0.37 → less expansion (already broad)
+
+  Stage 2: φ-distribute (golden ratio spacing)
+    phi_encode(v, φ_scale) = sign(v) × log_φ(1 + |v|/φ_scale) × φ_scale
+    Ensures quantization boundaries sit at irrational positions
+    No BF16 bucket aliasing
+
+Stored as metadata: GammaProfile { role_gamma: [f32; 6], phi_scale: f32 }
+  28 bytes per model. Exact decode: phi_decode(gamma_decode(stored, γ), φ_scale)
+```
+
+### Compression (stride/offset) — samples are SPARSE
+
+```
+Problem: stride=16 samples 19 of 302 octaves.
+  The 283 skipped octaves contribute to the true centroid average
+  but are not measured.
+
+Fix: spiral-aware interpolation.
+  The 19 sampled points define a φ-spiral in 17D.
+  The 283 missing points lie ON this spiral (smooth assumption).
+  Reconstruction: evaluate spiral at missing positions.
+
+  NOT: linear interpolation (wrong, ignores curvature)
+  NOT: halftone dropping (wrong, misses entire dimensions)
+  IS: φ-spiral fit from sampled points → evaluate at all positions
+
+Implementation:
+  For each of the 17 bins:
+    sampled_values[19]: the values we measured at stride=16
+    sampled_positions[19]: which octaves we sampled (0, 16, 32, ...)
+
+    Fit: r(θ) = a × e^(b×θ) through the 19 points
+    Evaluate: r(θ) at all 302 positions
+    Average: mean of all 302 reconstructed values = the true bin value
+
+  This is MORE accurate than averaging only the 19 sampled values
+  because the spiral fit exploits the smoothness constraint.
+```
+
+### When to use which correction
+
+```
+Per centroid pair in the distance table:
+
+1. ALWAYS: γ correction (role-specific distribution shaping)
+   → stored as GammaProfile metadata (28 bytes)
+   → applied during encoding AND decoding
+
+2. IF stride > 1: spiral reconstruction
+   → stored as stride + offset metadata (2 bytes)
+   → applied during centroid averaging (StackedN build)
+   → NOT applied to the distance table values (those come from cosine)
+
+3. IF Spearman ρ < 0.998 after γ: ICC profile correction
+   → stored as transfer curve (per pair or per region)
+   → applied during distance table lookup
+   → absorbs whatever γ + spiral didn't fix
+
+4. IF ICC insufficient: CoSENT candle training
+   → directly optimizes rank order
+   → last resort before LoRA (the nuclear option)
+```
+
+## 4. THE METADATA CHAIN
+
+Every baked table carries:
+
+```json
+{
+  "source_gguf": "jinaai/jina-reranker-v3-GGUF",
+  "source_dtype": "BF16",
+  "n_centroids": 256,
+  "centroid_spd": 32,
+  "octave_stride": 16,
+  "octave_offset": 186,
+  "role": "ffn_up",
+  "gate_modulated": true,
+  "gamma_profile": {
+    "role_gamma": 0.12,
+    "phi_scale": 0.08,
+    "n_calibration": 5120
+  },
+  "encoding": "BF16",
+  "cosine_range": [-0.886, 0.826],
+  "sign_distribution": { "positive": 32512, "negative": 32768, "zero": 256 },
+  "spearman_rho_vs_f32": null,
+  "icc_profile": null,
+  "variance_agreement_score": null,
+  "cronbach_alpha_context": null
+}
+```
+
+Each field enables exact reconstruction:
+- stride + offset → which octaves were sampled
+- gamma_profile → undo γ+φ for exact decode
+- cosine_range → per-role scale factor
+- icc_profile → correction curve (when available)
+- variance_agreement → quorum confidence per pair
+
+## 5. QUALITY CHECKS (updated)
+
+Before encoding any distance table:
+
+```
+[ ] Uses StackedN/ClamCodebook? (not raw f32 bypass)
+[ ] BF16 precision? (not 8-bit bottleneck)
+[ ] All 17 bins sampled? (not halftone 9/17)
+[ ] Stride documented? (octave_stride in metadata)
+[ ] Offset is φ-fractional? (not 0, not arbitrary)
+[ ] γ correction applied? (per-role from GammaProfile)
+[ ] φ distribution applied? (irrational bucket boundaries)
+[ ] Gate modulation on Up only? (silu(gate)×up before cosine)
+[ ] Metadata JSON saved alongside table?
+[ ] Reconstruction path documented? (decode = φ_decode(γ_decode(stored)))
+```
+
+## 6. THE ZECKENDORF CONNECTION
+
+```
+Zeckendorf's theorem: every positive integer has a unique
+representation as sum of non-consecutive Fibonacci numbers.
+
+ZeckF64 in bgz-tensor: encodes positions as Fibonacci sums.
+  Position 42 = F(9) + F(6) = 34 + 8 = 42
+  (F(9) and F(6) are non-consecutive, so this is the Zeckendorf form)
+
+Fibonacci numbers ARE the φ-spiral positions:
+  F(n) / F(n-1) → φ as n → ∞
+  Each Fibonacci number is the next position on the spiral
+
+Zeckendorf decomposition = expressing a point as
+  a sum of spiral positions = its coordinates ON the spiral.
+
+Spiral reconstruction from Zeckendorf:
+  Given: bin values at Zeckendorf positions
+  The Fibonacci structure tells you WHERE on the spiral each value sits
+  Reconstruction = evaluate the spiral BETWEEN Fibonacci positions
+  This is optimal because Fibonacci spacing = φ-optimal coverage
+```
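
The Zeckendorf decomposition that section 6 relies on can be checked with a minimal sketch of the standard greedy algorithm. The function name `zeckendorf` is hypothetical and chosen for illustration; this is not the ZeckF64 encoder from bgz-tensor, whose actual layout is not shown in this patch. It only demonstrates that taking the largest Fibonacci number that fits, repeatedly, yields a sum of non-consecutive Fibonacci numbers.

```python
def zeckendorf(n: int) -> list[int]:
    """Return the Fibonacci numbers whose sum is n, largest first.

    Hypothetical helper for illustration; uses the greedy algorithm,
    which always produces non-consecutive Fibonacci terms.
    """
    if n < 1:
        raise ValueError("n must be a positive integer")

    # Fibonacci numbers 1, 2, 3, 5, 8, ... not exceeding n.
    # Starting from (1, 2) means the value 1 appears only once.
    fibs = [1, 2]
    while fibs[-1] + fibs[-2] <= n:
        fibs.append(fibs[-1] + fibs[-2])

    parts = []
    remainder = n
    for f in reversed(fibs):
        # Take the largest Fibonacci number that still fits.
        if f <= remainder:
            parts.append(f)
            remainder -= f
    return parts


if __name__ == "__main__":
    print(zeckendorf(42))  # [34, 8]
```

Greedy selection is used because it never picks two adjacent Fibonacci numbers: once F(k) is taken, the remainder is smaller than F(k-1), so the next term is at most F(k-2).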