[SPARK-47625][SQL] Addition of Indeterminate Collation Support #46004

Closed
mihailomilosevic2001 wants to merge 19 commits into apache:master from mihailomilosevic2001:SPARK-47625

@mihailomilosevic2001

Contributor

What changes were proposed in this pull request?

INDETERMINATE_COLLATION should only be thrown on comparison operations and when storing data in memory, and we should be able to combine different implicit collations for certain operations such as concat, and possibly others in the future.
This is why we have to add another predefined collation id, named INDETERMINATE_COLLATION_ID, which means that the result is a combination of conflicting non-default implicit collations. Right now it has an id of -1, so it fails if it ever reaches the CollatorFactory.
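
To illustrate the intended behaviour, here is a minimal standalone sketch, not the code in this PR. Only the name INDETERMINATE_COLLATION_ID and its value of -1 come from the description above. The class name, the helper methods, the default collation id of 0, and the exception type are hypothetical and chosen only for illustration. It shows that combining conflicting non-default implicit collations (as concat would) yields the indeterminate id, while a comparison-style check rejects it.

```java
import java.util.List;

// Hypothetical sketch: everything except INDETERMINATE_COLLATION_ID = -1
// is an assumption made for illustration, not Spark's actual API.
final class IndeterminateCollationSketch {
    // Assumed id of the default collation.
    static final int DEFAULT_COLLATION_ID = 0;
    // Predefined id described in this PR; never a valid collator lookup key.
    static final int INDETERMINATE_COLLATION_ID = -1;

    private IndeterminateCollationSketch() {}

    // Combines the implicit collations of the operands of an operation
    // such as concat. The default collation yields to any non-default one;
    // two different non-default collations give the indeterminate id.
    static int combineImplicit(List<Integer> collationIds) {
        int result = DEFAULT_COLLATION_ID;
        for (int id : collationIds) {
            if (id == INDETERMINATE_COLLATION_ID) {
                return INDETERMINATE_COLLATION_ID;
            }
            if (id == DEFAULT_COLLATION_ID || id == result) {
                continue;
            }
            if (result != DEFAULT_COLLATION_ID) {
                // Conflicting non-default implicit collations.
                return INDETERMINATE_COLLATION_ID;
            }
            result = id;
        }
        return result;
    }

    // Guard for operations that need a concrete collation, such as
    // comparisons: an indeterminate collation is an error here.
    static int requireDeterminate(int collationId) {
        if (collationId == INDETERMINATE_COLLATION_ID) {
            throw new IllegalStateException(
                "INDETERMINATE_COLLATION: operands have conflicting implicit collations");
        }
        return collationId;
    }
}
```

With these assumptions, `combineImplicit(List.of(1, 2))` returns -1 without failing, so a concat of two differently collated columns is allowed, whereas passing that result to `requireDeterminate` throws, which mirrors the error being raised only when a comparison needs the collation.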

Why are the changes needed?

PostgreSQL supports concatenation between columns of different collations, and Spark should follow this behaviour.

Does this PR introduce any user-facing change?

Yes. It adds a new indeterminate collation error.

How was this patch tested?

Tests in CollationSuite.

Was this patch authored or co-authored using generative AI tooling?

No

# Conflicts:
#	sql/core/src/test/scala/org/apache/spark/sql/CollationStringExpressionsSuite.scala
# Conflicts:
#	sql/api/src/main/scala/org/apache/spark/sql/types/StringType.scala
#	sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/CollationTypeCasts.scala
#	sql/core/src/test/scala/org/apache/spark/sql/CollationStringExpressionsSuite.scala
#	sql/core/src/test/scala/org/apache/spark/sql/CollationSuite.scala
# Conflicts:
#	sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/CollationTypeCasts.scala
# Conflicts:
#	sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/rules.scala
#	sql/core/src/main/scala/org/apache/spark/sql/internal/BaseSessionStateBuilder.scala
#	sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveSessionStateBuilder.scala
@github-actions


We're closing this PR because it hasn't been updated in a while. This isn't a judgement on the merit of the PR in any way. It's just a way of keeping the PR queue manageable.
If you'd like to revive this PR, please reopen it and ask a committer to remove the Stale tag!

@github-actions github-actions Bot added the Stale label Aug 23, 2024
@github-actions github-actions Bot closed this Aug 24, 2024