Skip to content

[SPARK-46539][SQL] SELECT * EXCEPT(all fields from a struct) results in an assertion failure#44527

Closed
stefankandic wants to merge 4 commits into
apache:masterfrom
stefankandic:select-except-err
Closed

[SPARK-46539][SQL] SELECT * EXCEPT(all fields from a struct) results in an assertion failure#44527
stefankandic wants to merge 4 commits into
apache:masterfrom
stefankandic:select-except-err

Conversation

@stefankandic

@stefankandic stefankandic commented Dec 28, 2023

Copy link
Copy Markdown
Contributor

What changes were proposed in this pull request?

Fixing the assertion error which occurs when we do SELECT .. EXCEPT(every field from a struct) by adding a check for an empty struct

Why are the changes needed?

Because this is a valid query that should just return an empty struct rather than fail during serialization.

Does this PR introduce any user-facing change?

Yes, users should no longer see this error and instead get an empty struct '{ }'

How was this patch tested?

By adding new UT to existing selectExcept tests

Was this patch authored or co-authored using generative AI tooling?

No

@github-actions github-actions Bot added the SQL label Dec 28, 2023
@stefankandic stefankandic changed the title [WIP] Select except assert error [WIP][SPARK-46539][SQL] SELECT * EXCEPT(all fields from a struct) results in an assertion failure Dec 28, 2023
@stefankandic stefankandic changed the title [WIP][SPARK-46539][SQL] SELECT * EXCEPT(all fields from a struct) results in an assertion failure [SPARK-46539][SQL] SELECT * EXCEPT(all fields from a struct) results in an assertion failure Dec 28, 2023
@stefankandic stefankandic marked this pull request as ready for review December 28, 2023 17:10
@MaxGekk

MaxGekk commented Dec 29, 2023

Copy link
Copy Markdown
Member

@milastdbx Are you ok with the changes?

@cloud-fan

Copy link
Copy Markdown
Contributor

Does it mean we will hit this bug if an empty struct is returned? EXCEPT may not be the only way to create empty struct.

@stefankandic

Copy link
Copy Markdown
Contributor Author

Does it mean we will hit this bug if an empty struct is returned? EXCEPT may not be the only way to create empty struct.

yes, i think we will @cloud-fan

"each serializer expression should contain at least one `BoundReference`")
assert(boundRefs.nonEmpty || isEmptyStruct(ser),
"each serializer expression should contain at least one `BoundReference` " +
"or be an empty struct")

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The error msg is not clear.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

updated! let me know if you think it's clearer now

@MaxGekk

MaxGekk commented Dec 30, 2023

Copy link
Copy Markdown
Member

@stefankandic Just to double check, up to which version should the changes be ported? The ticket SPARK-46539 points out 3.0.0, is it correct?

@MaxGekk

MaxGekk commented Jan 3, 2024

Copy link
Copy Markdown
Member

+1, LGTM. Merging to master.
Thank you, @stefankandic and @cloud-fan @beliefer @milastdbx for review.

@MaxGekk MaxGekk closed this in 9c46d9d Jan 3, 2024
@MaxGekk

MaxGekk commented Jan 3, 2024

Copy link
Copy Markdown
Member

@stefankandic The PR conflicts with branch-3.5. Please, open a separate PR for the branch and branch-3.4 (if it is needed).

@dongjoon-hyun dongjoon-hyun left a comment

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi, @stefankandic and @MaxGekk and all. Unfortunately, this seems to break master branch. Could you make a follow-up first before backporting?

[info] - selectExcept.sql_analyzer_test *** FAILED *** (136 milliseconds)
[info]   selectExcept.sql_analyzer_test
[info]   Expected "... (`tbl_view`, [id#x,[name#x,]data#x])
[info]         +- Pr...", but got "... (`tbl_view`, [id#x,[ name#x, ]data#x])
[info]         +- Pr..." Result did not match for query #9
[info]   SELECT * EXCEPT (data.f1, data.s2) FROM tbl_view (SQLQueryTestSuite.scala:902)

@dongjoon-hyun

Copy link
Copy Markdown
Member

As of now, this failure blocks all PRs like the following.

@cloud-fan

Copy link
Copy Markdown
Contributor

I'm fixing it at #44585

yaooqinn pushed a commit that referenced this pull request Jan 4, 2024
### What changes were proposed in this pull request?

a followup of #44527 to fix golden files.

### Why are the changes needed?

fix tests

### Does this PR introduce _any_ user-facing change?

no

### How was this patch tested?

N/A

### Was this patch authored or co-authored using generative AI tooling?

no

Closes #44585 from cloud-fan/golden.

Authored-by: Wenchen Fan <wenchen@databricks.com>
Signed-off-by: Kent Yao <yao@apache.org>
@dongjoon-hyun

Copy link
Copy Markdown
Member

Thank you, @cloud-fan !

@stefankandic

Copy link
Copy Markdown
Contributor Author

@stefankandic The PR conflicts with branch-3.5. Please, open a separate PR for the branch and branch-3.4 (if it is needed).

since this PR depends on a new feature introduced in 43843 there is no need to backport this

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants