feat: Phase 3 BF16 wiring + φ-spiral reconstruction theory #128
.claude/knowledge/phi-spiral-reconstruction.md:
For all agents working on encoding, distance tables, or compression.
Key concepts:
1. Interpolation = spiral evaluation, NOT averaging
φ-spiral r(θ) = a×e^(bθ) gives exact reconstruction
2. Stride selects OCTAVES (not bins). All 17 bins always sampled.
Offset should be φ-fractional: floor(n_octaves × 0.618...)
3. Two corrections: γ (distribution shape) vs stride (sampling sparsity)
γ = per-role metadata (28 bytes). Stride = spiral fit.
4. Metadata chain: every table carries its full provenance
5. Zeckendorf connection: Fibonacci positions = φ-spiral coordinates
Reconstruction between Fibonacci positions = optimal coverage
Research direction: extended Zeckendorf proof would give LOGICAL ground truth
ρ ≥ 1-ε where ε is computable from spiral curvature
Replaces empirical ICC with mathematical guarantee
ONNX/Spearman become validation of the proof, not the proof itself
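Concepts 1 and 2 above can be sketched in a few lines. This is a minimal illustration, not the repository's implementation: the function names `fit_spiral`, `eval_spiral`, and `phi_offset` are hypothetical, and the reconstruction is exact only under the assumption that the sampled values really lie on a curve of the form r(θ) = a×e^(bθ).

```rust
/// Fit r(θ) = a·e^(bθ) through two samples and return (a, b).
/// Assumes r0 > 0, r1 > 0 and theta0 != theta1 (hypothetical helper).
pub fn fit_spiral(theta0: f64, r0: f64, theta1: f64, r1: f64) -> (f64, f64) {
    let b = (r1 / r0).ln() / (theta1 - theta0);
    let a = r0 * (-b * theta0).exp();
    (a, b)
}

/// Evaluate the fitted spiral at an intermediate angle.
pub fn eval_spiral(a: f64, b: f64, theta: f64) -> f64 {
    a * (b * theta).exp()
}

/// φ-fractional offset: floor(n_octaves × 0.618...).
pub fn phi_offset(n_octaves: usize) -> usize {
    const INV_PHI: f64 = 0.618_033_988_749_894_8; // 1/φ
    (n_octaves as f64 * INV_PHI).floor() as usize
}
```

For samples (θ=0, r=1) and (θ=2, r=4), evaluating the spiral at θ=1 gives the geometric midpoint 2, whereas averaging the two samples would give 2.5. That gap is the difference between spiral evaluation and averaging that concept 1 refers to.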
Updated quality checks (10 items, stricter than before).
https://claude.ai/code/session_019RzHP8tpJu55ESTxhfUy1A
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: c8ae8ab3fb
No BF16 bucket aliasing
Stored as metadata: GammaProfile { role_gamma: [f32; 6], phi_scale: f32 }
28 bytes per model. Exact decode: phi_decode(gamma_decode(stored, γ), φ_scale)
Apply γ+φ decode in inverse transform order
This decode formula is reversed: gamma_phi_encode applies gamma first and phi second, so rehydration must decode phi first and then gamma. Documenting phi_decode(gamma_decode(...)) will produce incorrect round-trips if someone implements from this guide, and it conflicts with the actual inverse implementation in crates/bgz-tensor/src/gamma_phi.rs::gamma_phi_decode.
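The ordering issue can be reproduced with stand-in transforms. The sketch below does not use the real `gamma_phi.rs` functions: `toy_gamma_*` (a sign-preserving power law) and `toy_phi_*` (a plain scale) are assumptions chosen only because they do not commute, which is enough to show why the decode must undo the phi step before the gamma step.

```rust
// Stand-in transforms (assumptions, not the crate's actual math).
pub fn toy_gamma_encode(x: f32, gamma: f32) -> f32 {
    x.signum() * x.abs().powf(gamma)
}
pub fn toy_gamma_decode(y: f32, gamma: f32) -> f32 {
    y.signum() * y.abs().powf(1.0 / gamma)
}
pub fn toy_phi_encode(x: f32, phi_scale: f32) -> f32 {
    x * phi_scale
}
pub fn toy_phi_decode(y: f32, phi_scale: f32) -> f32 {
    y / phi_scale
}

/// Encode: gamma first, phi second.
pub fn toy_encode(x: f32, gamma: f32, phi_scale: f32) -> f32 {
    toy_phi_encode(toy_gamma_encode(x, gamma), phi_scale)
}

/// Correct inverse: undo phi first, then gamma.
pub fn toy_decode(y: f32, gamma: f32, phi_scale: f32) -> f32 {
    toy_gamma_decode(toy_phi_decode(y, phi_scale), gamma)
}

/// The order written in the guide: gamma undone first, then phi.
pub fn toy_decode_reversed(y: f32, gamma: f32, phi_scale: f32) -> f32 {
    toy_phi_decode(toy_gamma_decode(y, gamma), phi_scale)
}
```

With x = 0.25, γ = 2.0 and phi_scale = 1.618, `toy_decode` recovers 0.25 up to floating-point error, while `toy_decode_reversed` returns roughly 0.197. The two orders agree only when the transforms commute, for example when γ = 1 or phi_scale = 1.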
…eam #150

## Stale artifact removal (182 files, 3 MB)

`AdaWorldAPI-lance-graph-d9df43b/` was a committed snapshot of an older upstream version (48 .rs files vs our 98). Full audit confirmed:

- ZERO files exist only in the artifact (every file has a counterpart)
- Every differing file: ours >= artifact in LOC (ours is strictly ahead)
- All upstream features (#125 parameter_substitution, #140 lance_vector_search) are already in our src tree

The directory created GitHub path confusion: duplicate navigation paths for datafusion_planner, spo, blasgraph, neighborhood, arigraph. Removing it eliminates that confusion with zero content loss.

## Cherry-pick: spark_dialect.rs from upstream PR #150

The ONE file upstream has that we didn't:

- `crates/lance-graph/src/spark_dialect.rs` (107 LOC) Spark SQL dialect for DataFusion unparser: backtick quoting, STRING type casting, EXTRACT for dates, BIGINT/INT types, LENGTH(), derived table aliases.
- `crates/lance-graph/tests/test_to_spark_sql.rs` (293 LOC) Full test suite for Spark SQL output.
- `pub mod spark_dialect;` added to lib.rs

Adapted from upstream's DF 50.3 to our DF 51: same API surface, no changes needed.

## Upstream audit result (for the record)

Upstream (lance-format/lance-graph) is at v0.5.4. Our fork is at v0.5.3 with newer deps (arrow 57 vs 56.2, datafusion 51 vs 50.3).

Other than spark_dialect, every upstream feature and fix is already present in our source tree: parameter_substitution (#125), lance_vector_search (#140), complex RETURN clauses (#142), duplicate columns fix (#128) are all in `crates/lance-graph/src/`.

Their deleted `simple_executor` was a prototype cold-path executor we never had. Our `ExecutionStrategy::DataFusion` path (6K LOC planner) + `ExecutionStrategy::BlasGraph` (semiring algebra) subsume it. The user has flagged adding a deliberate `ExecutionStrategy::Simple` cold path as a 4th strategy for trivial queries; that's a separate PR per the documented matrix of execution strategies.
https://claude.ai/code/session_01NYGrxVopyszZYgLBxe4hgj
Summary
Phase 3 wiring + φ-spiral reconstruction theory + ndarray dead code cleanup.
Phase 3: Wire existing modules to BF16
- dual_engine.rs: rewritten for BuiltEngine (u8/i8/BF16 any pair)
- DualEngine::u8_vs_bf16(table): the real comparison
- composite_engine.rs: rewritten for BuiltEngine
- add_bf16_lens(), add_u8_lens(), add_engine(): mixed types in one composition
φ-Spiral Reconstruction Theory
.claude/knowledge/phi-spiral-reconstruction.md:
- Offset is φ-fractional: floor(n_octaves × 0.618...)
Research direction: extended Zeckendorf proof would give LOGICAL ground truth
ρ ≥ 1-ε where ε is computable from spiral curvature.
Replaces empirical ICC with mathematical guarantee.
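For readers unfamiliar with the Zeckendorf representation the research direction builds on, here is a minimal sketch of the standard greedy decomposition of an integer into non-consecutive Fibonacci numbers. The function name `zeckendorf` is hypothetical and not taken from the repository, and the code only illustrates the representation itself. It says nothing about the proposed ρ ≥ 1-ε bound.

```rust
/// Greedy Zeckendorf decomposition: returns the non-consecutive Fibonacci
/// numbers (from the sequence 1, 2, 3, 5, 8, ...) that sum to `n`,
/// largest first. Returns an empty vector for n = 0.
pub fn zeckendorf(n: u32) -> Vec<u32> {
    let n = n as u64; // widen so the Fibonacci recurrence cannot overflow
    let mut fibs: Vec<u64> = Vec::new();
    let (mut a, mut b) = (1u64, 2u64);
    while a <= n {
        fibs.push(a);
        let next = a + b;
        a = b;
        b = next;
    }

    // Take the largest Fibonacci number that still fits, repeatedly.
    let mut rest = n;
    let mut terms = Vec::new();
    for &f in fibs.iter().rev() {
        if f <= rest {
            terms.push(f as u32);
            rest -= f;
        }
    }
    terms
}
```

For example, `zeckendorf(17)` yields 13 + 3 + 1 and `zeckendorf(100)` yields 89 + 8 + 3. The greedy choice never selects two adjacent Fibonacci numbers, which is the property Zeckendorf's theorem guarantees.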
Quality Checks (10 items)
Every encoding proposal must pass:
All 17 bins sampled? Stride documented? Offset φ-fractional?
γ applied? φ distributed? Gate on Up only? Metadata JSON saved?
257 tests pass (lance-graph).
https://claude.ai/code/session_019RzHP8tpJu55ESTxhfUy1A