Upgrade to DataFusion 44#368

Merged
Dandandan merged 42 commits into branch-44 from bg_44
Nov 17, 2025
@avantgardnerio
Which issue does this PR close?

Closes #.

Rationale for this change

What changes are included in this PR?

Are there any user-facing changes?

@avantgardnerio avantgardnerio changed the title Get working build Upgrade to DataFusion 44 Oct 14, 2025
joroKr21 and others added 22 commits October 14, 2025 17:35
* Add pool_size method to MemoryPool

* Fix

* Fmt

Co-authored-by: Daniël Heres <danielheres@gmail.com>
* ignore writer shutdown error

* cargo check
* Try and fix swap_hash_join

* Only swap projections when join does not have projections

* just backport upstream fix

* remove println
* Support Duration in min/max agg functions

* Attempt to fix build

* Attempt to fix build - Fix chrono version

* Revert "Attempt to fix build - Fix chrono version"

This reverts commit fd76fe6.

* Revert "Attempt to fix build"

This reverts commit 9114b86.

---------

Co-authored-by: svranesevic <svranesevic@users.noreply.github.com>
* Drop rust-toolchain

* Fix panics in array_union

* Fix the chrono
…4496) v46

* fix: rewrite fetch, skip of the Limit node in correct order

* style: fix clippy
* Support aliases in ConstEvaluator (apache#14734)

Not sure why they are not supported. It seems that if we're not careful,
some transformations can introduce aliases nested inside other expressions.

* Format Cargo.toml
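
The alias note in the ConstEvaluator commit above can be illustrated with a small sketch. This is not DataFusion's `Expr` or its `ConstEvaluator`; the `Expr` enum and the `const_value`, `fold`, and `fold_inner` functions are hypothetical names on a toy tree. It only shows the idea that an alias nested inside another expression is just a name and should not block constant folding:

```rust
// Toy expression tree. NOT DataFusion's `Expr`; every name here is hypothetical.
#[derive(Debug, Clone, PartialEq)]
enum Expr {
    Literal(i64),
    Column(String),
    Add(Box<Expr>, Box<Expr>),
    Alias(Box<Expr>, String),
}

/// Returns the value of `expr` if it can be computed without any input rows.
/// An alias is as constant as the expression it wraps.
fn const_value(expr: &Expr) -> Option<i64> {
    match expr {
        Expr::Literal(v) => Some(*v),
        Expr::Column(_) => None,
        // Overflow yields None, so the subtree is simply left unfolded.
        Expr::Add(l, r) => const_value(l)?.checked_add(const_value(r)?),
        Expr::Alias(inner, _) => const_value(inner),
    }
}

/// Folds constant subtrees into literals, keeping an outermost alias so the
/// output column name is preserved.
fn fold(expr: Expr) -> Expr {
    match expr {
        Expr::Alias(inner, name) => Expr::Alias(Box::new(fold_inner(*inner)), name),
        other => fold_inner(other),
    }
}

/// Folds below the top level, where a nested alias carries no visible name
/// once its parent has been replaced by a literal.
fn fold_inner(expr: Expr) -> Expr {
    if let Some(v) = const_value(&expr) {
        return Expr::Literal(v);
    }
    match expr {
        Expr::Add(l, r) => Expr::Add(Box::new(fold_inner(*l)), Box::new(fold_inner(*r))),
        Expr::Alias(inner, name) => Expr::Alias(Box::new(fold_inner(*inner)), name),
        other => other,
    }
}
```

In this sketch, `a + ((1 AS x) + 2)` folds to `a + 3`, while a top-level `(1 + 2) AS s` keeps its alias and becomes `3 AS s`.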
…che#14888) v46

Whenever we use `recompute_schema` or `with_exprs_and_inputs`,
this ensures that we obtain the same schema.
Co-authored-by: svranesevic <svranesevic@users.noreply.github.com>
* fix case_column_or_null with nullable when conditions

* improve sqllogictests for case_column_or_null

---------

Co-authored-by: zhangli20 <zhangli20@kuaishou.com>
* fix: FULL OUTER JOIN and LIMIT produces wrong results

* Fix minor slt testing

* fix test

(cherry picked from commit ecc5694)
* fix: Limits are not applied correctly

* Add easy fix

* Add fix

* Add slt testing

* Address comments
thinkharderdev and others added 12 commits October 14, 2025 18:27
* test to demonstrate segfault in ByteGroupValueBuilder

* check for offset overflow

* clippy

(cherry picked from commit 5bdaeaf)
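
For readers unfamiliar with the failure mode behind "check for offset overflow", here is a minimal sketch. It assumes a builder that stores variable-length values in one byte buffer indexed by `i32` offsets. `ByteBuilder` and `next_offset` are hypothetical names, not the actual `ByteGroupValueBuilder` code; the point is only that the new end offset is validated before anything is written:

```rust
/// Computes the end offset after appending `value_len` bytes to a buffer that
/// currently holds `current_len` bytes, failing instead of wrapping around.
fn next_offset(current_len: usize, value_len: usize) -> Result<i32, String> {
    let new_len = current_len
        .checked_add(value_len)
        .ok_or_else(|| "buffer length overflows usize".to_string())?;
    i32::try_from(new_len).map_err(|_| format!("offset {} does not fit in i32", new_len))
}

/// Toy builder (hypothetical): value `i` occupies
/// `buffer[offsets[i]..offsets[i + 1]]`.
#[derive(Debug)]
struct ByteBuilder {
    buffer: Vec<u8>,
    offsets: Vec<i32>,
}

impl ByteBuilder {
    fn new() -> Self {
        // The offsets vector always starts with 0.
        Self { buffer: Vec::new(), offsets: vec![0] }
    }

    /// Appends one value, or returns an error and leaves the builder
    /// unchanged if the resulting offset would not fit in an i32.
    fn append(&mut self, value: &[u8]) -> Result<(), String> {
        let offset = next_offset(self.buffer.len(), value.len())?;
        self.buffer.extend_from_slice(value);
        self.offsets.push(offset);
        Ok(())
    }
}
```

Doing the check before `extend_from_slice` means a rejected append leaves the buffer and the offsets consistent with each other.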
* add fetch info to CoalescePartitionsExec

* use Statistics with_fetch API on CoalescePartitionsExec

* check limit_reached only if fetch is assigned

Co-authored-by: mertak-synnada <mertak67+synaada@gmail.com>
… v48

* add fetch to CoalescePartitionsExecNode

* gen proto code

* Add test

* fix

* fix build

* Fix test build

* remove comments

Co-authored-by: 张林伟 <lewiszlw520@gmail.com>
* Add JoinContext with JoinLeftData to TaskContext in HashJoinExec

* Expose random state as const

* re-export ahash::RandomState

* JoinContext default impl

* Add debug log when setting join left data
…in GroupedHashAggregateStream (apache#13995) (#302) v45

* Refactor spill handling in GroupedHashAggregateStream to use partial aggregate schema

* Implement aggregate functions with spill handling in tests

* Add tests for aggregate functions with and without spill handling

* Move test related imports into mod test

* Rename spill pool test functions for clarity and consistency

* Refactor aggregate function imports to use fully qualified paths

* Remove outdated comments regarding input batch schema for spilling in GroupedHashAggregateStream

* Update aggregate test to use AVG instead of MAX

* assert spill count

* Refactor partial aggregate schema creation to use create_schema function

* Refactor partial aggregation schema creation and remove redundant function

* Remove unused import of Schema from arrow::datatypes in row_hash.rs

* move spill pool testing for aggregate functions to physical-plan/src/aggregates

* Use Arc::clone for schema references in aggregate functions

(cherry picked from commit 81b50c4)

Co-authored-by: kosiew <kosiew@gmail.com>
@avantgardnerio avantgardnerio force-pushed the bg_44 branch 2 times, most recently from 9f59429 to 2cb84e6 Compare November 7, 2025 22:07
apache#15055

* handle columns in with_new_exprs with Join

* test doesn't return result

* take join from result

* clippy

* make test fallible

* accept any pair of expression for new_on in with_new_exprs for Join

* use with_capacity

Co-authored-by: delamarch3 <68732277+delamarch3@users.noreply.github.com>
@Dandandan Dandandan self-requested a review November 17, 2025 10:54
@Dandandan Dandandan marked this pull request as ready for review November 17, 2025 10:54
@Dandandan Dandandan merged commit 9726699 into branch-44 Nov 17, 2025
19 of 21 checks passed
@Dandandan Dandandan deleted the bg_44 branch November 17, 2025 11:09
@avantgardnerio avantgardnerio restored the bg_44 branch November 17, 2025 15:16