[breaking] Make pyspark-client as default and pyspark package optional #60031
Good idea, but we need to wait until apache-beam fixes its grpcio limit. Maybe check whether there is an existing issue, or open a new one with them?
Yes, we need apache-beam to support grpcio >= 1.67. You already created the issue: apache/beam#34081
Heh... almost a year ago. Maybe it's time to follow up?
I raised my comments in the discussion at grpc/grpc#37710 (comment)
This pull request has been automatically marked as stale because it has not had recent activity. It will be closed in 5 days if no further activity occurs. Thank you for your contributions.
Not stale -> #61926
eladkal
left a comment
This needs an entry at the top of the provider changelog explaining to users the reasoning and what they should do if they want to keep the current behavior.
Yeah. Just add an entry to the changelog :)
I also added "[breaking]" to the title of the PR. I think that might be one of the ways we mark breaking changes, @eladkal? We discussed it before. I think adding "[breaking]" to the title is a very nice way to do it: the RM can then remove it when preparing release notes. But we can also detect it automatically. I am also going to employ an LLM (optionally, following the experience with auto-triage) to prepare the release notes, so this might naturally work when we use an LLM to do it.
@raphaelauv can you please add the entry to the top of the changelog? I will merge the PR after that.
Hey @eladkal, I rebased and added the comment on the "why". Thanks
apache#60031) * feat: add pyspark-client as default and make pyspark package optional --------- Co-authored-by: raphaelauv <raphaelauv@users.noreply.github.com>
Currently the apache.spark provider requires the pyspark package (more than 400 MB), whereas I would like to use only the spark-client (spark-connect) to trigger a Spark job (that is 1.5 MB).
This is a breaking change, but it will make things lighter by default. WDYT? Thanks
(BTW, spark-client is only available since Spark 4.0 and needs "grpcio >= 1.67.0", which conflicts with apache-beam.)
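For anyone wanting to see which of these packages an environment currently has, here is a minimal sketch using only the standard library. The distribution names `pyspark`, `pyspark-client` and `grpcio` come from this PR; the helper names `installed_version` and `spark_install_report` are hypothetical and not part of the provider.

```python
from __future__ import annotations

from importlib.metadata import PackageNotFoundError, version


def installed_version(dist_name: str) -> str | None:
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return version(dist_name)
    except PackageNotFoundError:
        return None


def spark_install_report() -> dict[str, str | None]:
    """Report versions for the distributions discussed in this PR.

    Only package metadata is read; nothing from Spark is imported.
    """
    names = ("pyspark", "pyspark-client", "grpcio")
    return {name: installed_version(name) for name in names}


if __name__ == "__main__":
    for name, found in spark_install_report().items():
        print(f"{name}: {found or 'not installed'}")
```

This only reports what is installed; it does not check the grpcio version constraint mentioned above.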