tech-debt: Frankenstein blast radius breadcrumbs (Vsa10k / L3 / 157) #245
- vsa_udfs.rs: 5 DataFusion UDFs over FixedSizeBinary(2048) fingerprint columns (vsa_unbind, vsa_bundle, vsa_hamming_dist, vsa_braid_at, vsa_top_k) + 9 tests. Operating at L4/L5 fingerprint precision; L3 Vsa10k unbind deferred (§ 18).
- filter_expr.rs: CommitFilter → DataFusion Expr translator (DM-3). Predicates: gate_f ≤ max_free_energy, thinking = style_ordinal, gate_commit = bool. actor_id deferred (UNKNOWN-4). 5 tests.
- lib.rs: pub mod vsa_udfs + filter_expr under [query] feature gate
- STATUS_BOARD: DM-2 Phase A → In progress

https://claude.ai/code/session_01CgQyZ7rMWkCEohrPzEiwkD
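For orientation, the per-row work behind a Hamming-distance UDF over 2048-byte fingerprints can be sketched as below. This is a minimal illustration, not the code in vsa_udfs.rs: the names `FP_BYTES` and `hamming_dist` are assumptions, and the real UDF operates on Arrow arrays rather than plain slices.

```rust
/// Width in bytes of one fingerprint, matching the FixedSizeBinary(2048) column.
/// (Name is hypothetical.)
const FP_BYTES: usize = 2048;

/// Hamming distance between two equal-length byte slices: the number of bit
/// positions at which they differ. Returns None on a length mismatch.
fn hamming_dist(a: &[u8], b: &[u8]) -> Option<u32> {
    if a.len() != b.len() {
        return None;
    }
    // XOR leaves a 1 wherever the inputs differ; count those bits per byte.
    Some(a.iter().zip(b).map(|(x, y)| (x ^ y).count_ones()).sum())
}

fn main() {
    let zeros = vec![0u8; FP_BYTES];
    let mut other = vec![0u8; FP_BYTES];
    other[0] = 0b0000_0111; // three bits set
    other[FP_BYTES - 1] = 0xFF; // eight bits set
    println!("{:?}", hamming_dist(&zeros, &other));
}
```

Returning `Option` keeps a length mismatch distinct from a distance of zero.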
…e row

The Internal/VSA dataset row in §17 carried two errors:

1. "L3 cold tier" misread the user's L3-CPU-cache-budget constraint as a cognitive-stack layer.
2. "Vsa10k BF16 (20 KB, lossless)" is not a real type. Vsa10k is [u64; 157] = ~1.2 KB bit-packed binary (grammar/role_keys.rs:51), not a 20 KB bf16 vector. The bf16 variant doesn't exist.

Restore the row to just describe Fingerprint<256> cycle fingerprints plus NARS truth vectors and braid offsets, which is what the schema actually carries. §18 and unified-integration-v1.md's precision note carry the same cluster of errors and need separate cleanup review.
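The size mismatch between the two descriptions can be checked with plain arithmetic. The sketch below assumes only the [u64; 157] layout quoted above; `Vsa10kPacked` and `FabricatedBf16` are illustrative aliases, and the second one models the nonexistent bf16 vector as 10,000 two-byte elements purely for comparison.

```rust
use std::mem::size_of;

/// Bit-packed layout cited in the commit message (alias name is hypothetical).
type Vsa10kPacked = [u64; 157];

/// Stand-in for the fabricated "Vsa10k BF16" row: 10,000 two-byte elements.
/// No such type exists in the workspace; this alias is only for the arithmetic.
type FabricatedBf16 = [u16; 10_000];

fn main() {
    // 157 words * 8 bytes = 1,256 bytes, i.e. roughly 1.2 KB.
    println!("packed bytes: {}", size_of::<Vsa10kPacked>());
    // 157 words * 64 bits = 10,048 bits, so the packed form never holds exactly 10,000.
    println!("packed bits:  {}", 157 * 64);
    // 10,000 elements * 2 bytes = 20,000 bytes, the "20 KB" figure in the bad row.
    println!("bf16 bytes:   {}", size_of::<FabricatedBf16>());
}
```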
Append-only entry in TECH_DEBT.md capturing three-cluster root error on this branch so the next session has breadcrumbs rather than having to rediscover:

- L3 misread (CPU cache budget → cognitive layer hallucination)
- Vsa10k BF16 fabrication (type does not exist; real forms are CrystalFingerprint::Vsa10kI8 / Vsa10kF32)
- Inherited Vsa10k = [u64; 157] confusion (bit-packed never uses 10,000; workspace already filed rename sweep in IDEAS.md 2026-04-19 which this session ignored)

Entry enumerates:

- Poisoned plan sections with commit anchors (468357d / 7a60c42 / 585f8b0 / 489911b / a05979e / 2a4a245)
- vsa_udfs.rs three wrong operations (unbind_op / bundle_op / braid_at_op)
- Salvageable architectural ideas (two callcenter modes, VSA lazy-buffer, kitchen analogy, BBB iron rule, Chronos→ONNX, Archetype ECS bridge)
- Prior workspace corrections this session violated
- P0-ordered correction plan for next session

https://claude.ai/code/session_01CgQyZ7rMWkCEohrPzEiwkD
Append-only addition to the 2026-04-24 Frankenstein blast radius entry. Captures what's in this session's conversation context but NOT in the plan docs:

- Archetype name collision: external VangelisTech ECS crate vs internal thinking-engine::persona archetype. DU-2 only covers the former; user flagged the internal sense mid-session; not yet disambiguated in plans.
- Chronos: replacement rationale captured in § 17 + DU-1; no additional session-attributable content held with confidence.
- Archetype × Chronos interplay: flagged as open design question (candidate tick-driven classifier composition noted as conjecture, not written into any plan).

Honesty discipline: recording only what attributes to this session's conversation; not reconstructing brainstorm content I don't hold.

https://claude.ai/code/session_01CgQyZ7rMWkCEohrPzEiwkD
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 8da758749e
    let len = match &args[0] {
        ColumnarValue::Array(a) => a.len(),
        _ => 1,
    };
Derive batch length from both UDF arguments
BundleUdf::invoke_with_args computes len from args[0] only, so calls with a scalar first argument and array second argument (for example vsa_bundle(lit_fp, fingerprint_col)) only process one row and return a 1-row array. The same pattern is repeated in HammingDistUdf and TopKUdf, which means valid scalar/array queries can return truncated or invalidly-sized results instead of broadcasting across the batch.
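One way to address this is to derive the row count from every argument and then broadcast scalars to that count. The sketch below uses a hypothetical `Col` enum as a plain-Rust stand-in for DataFusion's `ColumnarValue`, so `batch_len` and `expand` are illustrative names, not functions from the PR or from DataFusion.

```rust
/// Hypothetical stand-in for a columnar argument: a full column or one scalar.
enum Col {
    Array(Vec<u8>),
    Scalar(u8),
}

/// Batch length across all arguments: the shared length of the array
/// arguments, or 1 when every argument is a scalar. Returns None if two
/// array arguments disagree on length.
fn batch_len(args: &[Col]) -> Option<usize> {
    let mut len: Option<usize> = None;
    for arg in args {
        if let Col::Array(a) = arg {
            match len {
                Some(n) if n != a.len() => return None,
                _ => len = Some(a.len()),
            }
        }
    }
    Some(len.unwrap_or(1))
}

/// Broadcast one argument to `n` rows; arrays are assumed to already have `n` rows.
fn expand(arg: &Col, n: usize) -> Vec<u8> {
    match arg {
        Col::Array(a) => a.clone(),
        Col::Scalar(s) => vec![*s; n],
    }
}

fn main() {
    // Scalar first, array second: the case that the args[0]-only logic truncates.
    let args = [Col::Scalar(7), Col::Array(vec![1, 2, 3])];
    let n = batch_len(&args).expect("array arguments must agree on length");
    let lhs = expand(&args[0], n);
    let rhs = expand(&args[1], n);
    println!("rows = {n}, lhs = {lhs:?}, rhs = {rhs:?}");
}
```

The same length helper could be shared by HammingDistUdf and TopKUdf, since the review notes they repeat the pattern.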
    if roles.is_null(i) || fps.is_null(i) {
        f32::NAN
    } else {
Return NULL for null inputs in vsa_unbind
When either input is null, vsa_unbind emits NaN rather than SQL null. That changes null-propagation semantics and can corrupt downstream analytics (e.g., AVG/SUM or ordering/filtering on the UDF output) because missing values become numeric payloads instead of nullable entries.
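The difference is easy to see with `Option<f32>` standing in for a nullable Float32 column. In this sketch `score` is a placeholder for the real unbind computation, and `unbind_rows` and `avg` are hypothetical helpers; `avg` skips NULLs the way SQL AVG does.

```rust
/// Placeholder per-row kernel (an assumption; the real unbind score differs).
fn score(role: u8, fp: u8) -> f32 {
    (role ^ fp).count_ones() as f32
}

/// Null-propagating row loop: a missing input yields None, never NaN.
fn unbind_rows(roles: &[Option<u8>], fps: &[Option<u8>]) -> Vec<Option<f32>> {
    roles
        .iter()
        .zip(fps)
        .map(|(r, f)| match (r, f) {
            (Some(r), Some(f)) => Some(score(*r, *f)),
            _ => None,
        })
        .collect()
}

/// SQL-style average: NULL rows are skipped, NaN rows are not.
fn avg(xs: &[Option<f32>]) -> Option<f32> {
    let vals: Vec<f32> = xs.iter().flatten().copied().collect();
    if vals.is_empty() {
        None
    } else {
        Some(vals.iter().sum::<f32>() / vals.len() as f32)
    }
}

fn main() {
    let with_null = [Some(1.0), None, Some(3.0)];
    let with_nan = [Some(1.0), Some(f32::NAN), Some(3.0)];
    // The NULL row is ignored; the NaN row poisons the whole aggregate.
    println!("{:?} vs {:?}", avg(&with_null), avg(&with_nan));
}
```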
    let values = Arc::new(UInt16Array::from(all_values)) as ArrayRef;
    let item_field = Arc::new(Field::new("item", DataType::UInt16, true));
    let offsets = OffsetBuffer::new(Buffer::from_vec(offsets).into());
    let list = ListArray::new(item_field, offsets, values, None);
Preserve NULL rows in vsa_top_k list output
vsa_top_k builds offsets for null inputs but constructs the ListArray with None validity, so rows with null fingerprint/k become empty lists instead of nulls. This loses information and breaks expected null semantics for consumers that need to distinguish “missing input” from “computed empty result.”
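The distinction comes down to tracking a validity flag per row alongside the offsets. The sketch below models a list column with plain vectors; `ListColumn`, `build`, and `get` are hypothetical names, not Arrow types. In the quoted snippet the equivalent change would be to pass a null buffer instead of `None` as the last argument to `ListArray::new`.

```rust
/// Plain-Vec stand-in for a list column (hypothetical, not an Arrow type).
struct ListColumn {
    values: Vec<u16>,    // all rows' items, concatenated
    offsets: Vec<i32>,   // row i spans values[offsets[i]..offsets[i + 1]]
    validity: Vec<bool>, // false marks a NULL row
}

/// Build the column; a NULL row adds no values but still records an offset
/// and a `false` validity bit, so it stays distinguishable from an empty list.
fn build(rows: &[Option<Vec<u16>>]) -> ListColumn {
    let mut values = Vec::new();
    let mut offsets = vec![0i32];
    let mut validity = Vec::with_capacity(rows.len());
    for row in rows {
        if let Some(items) = row {
            values.extend_from_slice(items);
        }
        offsets.push(values.len() as i32);
        validity.push(row.is_some());
    }
    ListColumn { values, offsets, validity }
}

impl ListColumn {
    /// Read row `i`: None for NULL, Some(slice) otherwise (possibly empty).
    fn get(&self, i: usize) -> Option<&[u16]> {
        if !self.validity[i] {
            return None;
        }
        let (start, end) = (self.offsets[i] as usize, self.offsets[i + 1] as usize);
        Some(&self.values[start..end])
    }
}

fn main() {
    // Row 1 has missing input; row 2 is a computed empty result.
    let col = build(&[Some(vec![5, 9]), None, Some(vec![])]);
    println!("{:?} {:?} {:?}", col.get(0), col.get(1), col.get(2));
}
```

Without the validity vector, rows 1 and 2 would both read back as empty slices, which is the information loss the review describes.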
…ates

Two additional breadcrumb subsections appended to the 2026-04-24 Frankenstein blast radius entry:

- ONNX > Chronos: documents what § 17's 6-criterion table captures (Output / Task type / Training / Precision / Infra / Fit) and flags the gap: the table enumerates where Chronos loses but not where Chronos would legitimately win. My first-principles candidate-XYZ list (temporal forecasting: F-value N cycles ahead, style-drift onset, gate-commit-rate rolling window) is explicitly marked NOT session-attributable so the next session either fills from their own recall or rejects Chronos across all cases.
- Archetype / persona / thinking-style modeling: lists three epiphany candidates that sit in § 16 / § 17 plan text but were never prepended to EPIPHANIES.md (board-hygiene violation from commit 468357d): four-way multiply as architecture search, persona as atom-space coordinate, MM-CoT stage split as faculty asymmetry. Framings provided for the next session to prepend as dated entries.

https://claude.ai/code/session_01CgQyZ7rMWkCEohrPzEiwkD
Purpose
Session on claude/read-claude-md-jh51O ran out of context before cleanup. This PR appends a single entry to TECH_DEBT.md so the next session picks up the blast radius as breadcrumbs rather than having to rediscover.

No other files changed in this PR. Scope is deliberately tight: cleanup itself (plan-doc deletions, vsa_udfs.rs canonical delegation, board hygiene) is the next session's P0 work, described inside the new entry.
What the entry captures
Three-cluster root error:
- L3 misread: CPU cache budget misread as a cognitive layer.
- Vsa10k BF16 fabrication: type does not exist; real forms are CrystalFingerprint::Vsa10kI8 (10 KB) / Vsa10kF32 (40 KB legacy).
- Vsa10k = [u64; 157] confusion: bit-packed never uses 10,000; workspace already filed rename sweep in IDEAS.md 2026-04-19 entries; this session ignored it.

Breadcrumbs included:
- Poisoned plan sections with commit anchors (468357d / 7a60c42 / 585f8b0 / 489911b / a05979e / 2a4a245).
- vsa_udfs.rs three wrong operations (unbind_op / bundle_op / braid_at_op) with canonical-delegation target (PR D7 grammar thinking styles + categorical-algebraic inference architecture #242 / D5 Trajectory + MarkovBundler + board hygiene for categorical-algebraic inference #243 / feat(crystal): sandwich layout + bipolar cells + harvest docs #209).
- Prior workspace corrections this session violated (IDEAS.md Vsa10k→Vsa16k rename + governance ban on "10,000-D binary VSA" framing).

Why merge this separately (not bundled with cleanup)
- … vsa_udfs.rs, append EPIPHANIES, and verify the rename sweep scope.
- … TECH_DEBT.md.
- … FP_WORDS = 157 entry it sits next to.

Test plan
- … .claude/board/TECH_DEBT.md + 73 lines (append).
- … git log on this branch.

https://claude.ai/code/session_01CgQyZ7rMWkCEohrPzEiwkD
Generated by Claude Code