[SPARK-48545][SQL] Create to_avro and from_avro SQL functions to match DataFrame equivalents #46977
Closed
dtenedor wants to merge 13 commits into
Contributor
Author
Thanks @allisonwang-db for your review; I have followed through on your comments.
allisonwang-db
approved these changes
Jun 15, 2024
allisonwang-db
left a comment
Contributor
Looks good! cc @cloud-fan
Contributor
Author
cc @cloud-fan, the CI is passing now :)
Member
Thanks, merging to master.
HyukjinKwon
pushed a commit
that referenced
this pull request
Jun 23, 2024
…nd from_avro functions but Avro is not loaded by default

### What changes were proposed in this pull request?

This PR updates the new `to_avro` and `from_avro` SQL functions added in #46977 to return reasonable errors when Avro is not loaded by default.

### Why are the changes needed?

According to the [Apache Spark Avro Data Source Guide](https://spark.apache.org/docs/latest/sql-data-sources-avro.html), Avro is not loaded into Spark by default. With this change, users who call the `to_avro` or `from_avro` SQL functions in this case get reasonable error messages with instructions telling them what to do, rather than obscure Java `ClassNotFoundException`s.

### Does this PR introduce _any_ user-facing change?

Yes, see above.

### How was this patch tested?

This PR adds golden file based test coverage.

### Was this patch authored or co-authored using generative AI tooling?

No GitHub copilot this time.

Closes #47063 from dtenedor/to-from-avro-error-not-loaded.

Authored-by: Daniel Tenedorio <daniel.tenedorio@databricks.com>
Signed-off-by: Hyukjin Kwon <gurwls223@apache.org>
HyukjinKwon
pushed a commit
that referenced
this pull request
Aug 22, 2024
…functions

### What changes were proposed in this pull request?

This PR proposes to support `from_protobuf` and `to_protobuf` as SQL functions, similar to #46977.

### Why are the changes needed?

To improve feature parity with the DataFrame API.

### Does this PR introduce _any_ user-facing change?

Yes. This makes `from_protobuf` and `to_protobuf` available as SQL functions.

### How was this patch tested?

Added UTs.

### Was this patch authored or co-authored using generative AI tooling?

No.

Closes #47716 from itholic/from_to_protobuf.

Authored-by: Haejoon Lee <haejoon.lee@databricks.com>
Signed-off-by: Hyukjin Kwon <gurwls223@apache.org>
What changes were proposed in this pull request?
This PR creates two new SQL functions "to_avro" and "from_avro" to match existing DataFrame equivalents.
For example:
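A minimal sketch of a round trip through both functions, driven from PySpark via `spark.sql`. It assumes the Avro module is loaded into the session and that `from_avro` takes the binary value, a JSON schema string, and an options map, while `to_avro` takes the value and an optional JSON schema string. The schema, the literal struct, and the helper names `build_round_trip_query` and `run_round_trip` are illustrative and not taken from this PR.

```python
import json

# Illustrative Avro schema (an assumption for this sketch, not from the PR).
AVRO_SCHEMA = json.dumps({
    "type": "record",
    "name": "person",
    "fields": [
        {"name": "id", "type": "int"},
        {"name": "name", "type": "string"},
    ],
})


def build_round_trip_query(schema_json: str) -> str:
    """Build SQL that encodes a struct with to_avro and decodes it with from_avro."""
    # JSON uses double quotes, so the schema can sit inside a single-quoted
    # SQL string literal as long as it has no single quotes or backslashes.
    if "'" in schema_json or "\\" in schema_json:
        raise ValueError("schema must not contain single quotes or backslashes")
    return (
        f"SELECT from_avro(encoded, '{schema_json}', map()) AS decoded "
        f"FROM (SELECT to_avro(named_struct('id', 1, 'name', 'a'), "
        f"'{schema_json}') AS encoded) AS t"
    )


def run_round_trip(spark):
    """Run the query on an existing SparkSession that has the Avro module loaded."""
    return spark.sql(build_round_trip_query(AVRO_SCHEMA)).collect()
```

Building the query string separately from running it keeps the SQL easy to inspect or paste into spark-shell; `run_round_trip` needs a live session and is only a convenience wrapper.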
Why are the changes needed?
This brings parity between SQL and DataFrame APIs in Apache Spark.
Does this PR introduce any user-facing change?
Yes, see above.
How was this patch tested?
This PR adds extra unit tests, and I also checked that the functions work with spark-shell.
Was this patch authored or co-authored using generative AI tooling?
No GitHub copilot usage this time