[SPARK-1466] Raise exception if pyspark Gateway process doesn't start. #383

Closed
kayousterhout wants to merge 1 commit into apache:master from kayousterhout:pyspark

@kayousterhout

Contributor

If the gateway process fails to start correctly (e.g., because JAVA_HOME isn't set correctly or there's no Spark jar), pyspark currently fails with a very difficult-to-understand error: we try to parse stdout to get the port where Spark started, and there's nothing there. This commit catches the error and raises an exception that includes the stderr output, which makes debugging much easier.
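For readers who want to see the shape of the fix, here is a minimal sketch of the idea described above. It is not the code in `python/pyspark/java_gateway.py`: the function name `launch_and_read_port` and the `command` parameter are hypothetical, any child process stands in for the real gateway, and reporting the exit code is my own addition. The message wording borrows the "Launching GatewayServer failed" suggestion from the review comment further down this thread.

```python
import subprocess


def launch_and_read_port(command):
    """Start `command` and parse the port from the first line of its stdout.

    Hypothetical helper for illustration only. Returns (process, port) on
    success. If the first line is not an integer, raises an Exception whose
    message includes the child's stderr.
    """
    proc = subprocess.Popen(
        command,
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        universal_newlines=True,  # read str instead of bytes
    )
    # readline() returns "" if the child exits without printing anything.
    first_line = proc.stdout.readline().strip()
    try:
        port = int(first_line)
    except ValueError:
        port = None

    if port is None:
        # communicate() waits for the child to exit and drains both pipes,
        # so the stderr text can be attached to the exception.
        _, stderr_output = proc.communicate()
        raise Exception(
            "Launching GatewayServer failed with exit code %s.\n"
            "stderr output:\n%s" % (proc.returncode, stderr_output)
        )
    return proc, port
```

With a child that exits after writing only to stderr, the caller sees that stderr text in the exception message instead of a bare integer-parsing error.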

Thanks to @shivaram and @stogers for helping to fix this issue!

@kayousterhout

Contributor Author

This should be backported to 0.9 and 1.0.

@kayousterhout

Contributor Author

BTW this is a much bigger issue with IPython notebook: if you're running in the console, you get the wrong error (from parsing the int) but also the correct error. If you're running in IPython notebook, you only get the wrong error, which makes this very annoying to debug.
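To make the "wrong error" concrete: when the child process prints nothing, the line read from stdout is an empty string, and parsing it as an integer fails with a message that says nothing about the real cause. The sketch below only illustrates that behaviour in plain Python; `describe_port_parse_failure` is a hypothetical name, not something from pyspark.

```python
def describe_port_parse_failure(stdout_line):
    """Return the error text produced by parsing `stdout_line` as a port.

    Hypothetical helper for illustration. Returns None if parsing succeeds.
    """
    try:
        int(stdout_line)
    except ValueError as err:
        # The message mentions only the bad literal, not the failed launch.
        return str(err)
    return None
```

Calling it with an empty string yields a generic "invalid literal for int()" message, which is all a notebook user would see if stderr is not surfaced.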

@AmplabJenkins

Merged build triggered.

@AmplabJenkins

Merged build started.

@AmplabJenkins

Merged build finished.

@AmplabJenkins

Refer to this link for build results: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/14017/

@pwendell

Contributor

Jenkins, retest this please.

@AmplabJenkins

Merged build triggered.

@AmplabJenkins

Merged build started.

@AmplabJenkins

Merged build finished.

@AmplabJenkins

Refer to this link for build results: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/14069/

Comment thread python/pyspark/java_gateway.py Outdated

Contributor

Maybe this should say "Launching GatewayServer failed"? It will be more informative, otherwise people will think something is wrong with SparkContext itself.

@mateiz

mateiz commented Apr 18, 2014

Contributor

@kayousterhout not sure if you saw my comment. This looks good, but the exception message is somewhat confusing. It would be good to update that.

@pwendell

Contributor

Jenkins, retest this please.

@AmplabJenkins

Merged build triggered.

@AmplabJenkins

Merged build started.

@kayousterhout

Contributor Author

I did, but I haven't had time to figure out why the tests are failing (the tests don't run properly on my laptop). Hoping this was a Jenkins issue and the re-launched tests pass.

@AmplabJenkins

Merged build finished.

@AmplabJenkins

Refer to this link for build results: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/14230/

Also include stderr output to help user debug startup issue.
@AmplabJenkins

Merged build triggered.

@AmplabJenkins

Merged build started.

@AmplabJenkins

Merged build finished. All automated tests passed.

@AmplabJenkins

All automated tests passed.
Refer to this link for build results: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/15585/

@mateiz

mateiz commented Jun 10, 2014

Contributor

It looks like the tests magically passed now! Is this good to go?

@kayousterhout

Contributor Author

Yup! I just rebased so it should merge cleanly on master.

asfgit closed this in 3870248 on Jun 18, 2014
pdeyhim pushed a commit to pdeyhim/spark-1 that referenced this pull request Jun 25, 2014
Author: Kay Ousterhout <kayousterhout@gmail.com>

Closes apache#383 from kayousterhout/pyspark and squashes the following commits:

36dd54b [Kay Ousterhout] [SPARK-1466] Raise exception if Gateway process doesn't start.
xiliu82 pushed a commit to xiliu82/spark that referenced this pull request Sep 4, 2014
tangzhankun pushed a commit to tangzhankun/spark that referenced this pull request Jul 25, 2017
…he#383)

This makes executors consistent with the driver. Note that
SPARK_EXTRA_CLASSPATH isn't set anywhere by Spark itself, but it's
primarily meant to be set by images that inherit from the base
driver/executor images.
erikerlandson pushed a commit to erikerlandson/spark that referenced this pull request Jul 28, 2017
mccheah added a commit to mccheah/spark that referenced this pull request Nov 28, 2018
Apply patches for SPARK-24531 to fix tests
bzhaoopenstack pushed a commit to bzhaoopenstack/spark that referenced this pull request Sep 11, 2019
Diable S3 test cases in fusioncloud job