expr: direct-unwrap LazyBinaryFunc for sqlfunc-generated structs#36705
Draft
antiguru wants to merge 4 commits into
Draft
expr: direct-unwrap LazyBinaryFunc for sqlfunc-generated structs#36705antiguru wants to merge 4 commits into
antiguru wants to merge 4 commits into
Conversation
Extend `#[sqlfunc(...)]` to accept stateful structs and arena access for unary and binary functions. Bumps `EagerUnaryFunc::call` to take `&'a self` and `&'a RowArena`, mirroring the variadic shape. Adds `skip_display = true` modifier for structs with non-trivial `Display`. Migrates 19 hand-written `LazyUnaryFunc` impls in `scalar/func/impls/*` to use the macro form. Remaining hand-written impls are the 6 short- circuit variadics (And, Or, Coalesce, ErrorIfNull, Greatest, Least) plus CaseLiteral, which require true lazy semantics. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Add `RowArena::make_datum_array` mirroring `make_datum_list`. Switch `cast_string_to_array` and `cast_array_to_array` to return `Array<'a>` instead of `Datum<'a>`, removing `try_make_datum + try_push_array` boilerplate. Drop the unused `_temp_storage` parameter from `record_get`. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
`T: Borrow<Datum<'a>>` only constrains the iter Item type; it does not verify that the borrowed `Datum`'s variant matches what T's reader expects. Callers passing T = i32 with `Datum::String` inputs would satisfy the bound and produce typed handles whose `typed_iter()` panics. Add doc warning to both helpers. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…structs Sqlfunc-generated binary structs now emit an explicit `impl LazyBinaryFunc` with a body that calls `try_from_result` per argument instead of going through the generic `<(T0, T1) as InputDatumType>::try_from_iter`. The tuple `try_from_iter` was the largest single contributor to mz-expr's LLVM IR (93k lines across 218 monomorphizations) and produced a costly iterator state machine in each binary variant's dispatch path. The blanket `impl<T: EagerBinaryFunc> LazyBinaryFunc for T` is removed; sqlfunc-generated structs supply their own impl via the proc macro, and the two remaining hand-written `EagerBinaryFunc` impls (`ListLengthMax`, `RegexpReplace`) use the new `lazy_via_eager_binary!` declarative macro, which still pays the tuple `try_from_iter` cost but only for those two shapes. Unary and variadic paths are unchanged: the blanket `impl<T: EagerUnaryFunc> LazyUnaryFunc for T` stays, and `LazyVariadicFunc` retains its existing shape so the seven short-circuit variadics keep their hand-written eval. Measurements (`cargo llvm-lines -p mz-expr`): * total LLVM lines: 1,289,072 -> 1,232,567 (-56,505 = -4.4%) * `<(T0,T1) as InputDatumType>::try_from_iter`: 93,259 (218 copies) -> 0 * per-variant binary dispatch (`<T as LazyBinaryFunc>::eval`): ~608 LLVM lines effective -> ~472 lines (-22%) Measurements (`cargo asm -p mz-expr --lib`): * per-variant binary dispatch asm: 949 -> 726 insns average (-23.5%) * total binary dispatch asm: 201,214 -> 153,888 insns No behavior change: existing `cargo test -p mz-expr` (49 tests), `cargo test -p mz-expr-derive-impl` (14 snapshot tests), and the `test_runner` datadriven suite all pass. The macro snapshot files are regenerated via `cargo insta accept` to reflect the new emitted shape. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Stacked on #36697.
Sqlfunc-generated binary structs now emit an explicit
impl LazyBinaryFuncwhose body callstry_from_resultper argument instead of going through the generic<(T0, T1) as InputDatumType>::try_from_iter. The tupletry_from_iterwas the single largest contributor to mz-expr's LLVM IR (93k lines / 218 copies) and produced a costly iterator state machine in every binary variant's dispatch path.The blanket
impl<T: EagerBinaryFunc> LazyBinaryFunc for Tis removed; the two remaining hand-writtenEagerBinaryFuncimpls (ListLengthMax,RegexpReplace) use a newlazy_via_eager_binary!declarative macro that pays the tuple cost only for those two shapes. Unary and variadic paths are unchanged.Measurements:
cargo llvm-lines -p mz-exprtotal<(T0,T1)>::try_from_iterNo behavior change:
cargo test -p mz-expr(49),cargo test -p mz-expr-derive-impl(14 snapshot tests),test_runnerdatadriven suite, andcargo check --workspaceall pass. Snapshots regenerated viacargo insta accept.Prompt at
doc/developer/prompts/sqlfunc-dyn-dispatch.md.