Skip to content

SPARK-3178 setting SPARK_WORKER_MEMORY to a value without a label (m or g) sets the worker memory limit to zero#2227

Closed
bbejeck wants to merge 2 commits into
apache:masterfrom
bbejeck:master
Closed

SPARK-3178 setting SPARK_WORKER_MEMORY to a value without a label (m or g) sets the worker memory limit to zero#2227
bbejeck wants to merge 2 commits into
apache:masterfrom
bbejeck:master

Conversation

@bbejeck

@bbejeck bbejeck commented Sep 1, 2014

Copy link
Copy Markdown
Member

Now the worker will fail fast if the memory is set to zero by leaving off the the label (m or g) either in the SPARK_WORKER_MEMORY environment variable or from the command line.

…he SPARK_WORKER_MEMORY environment variable or command line without a g or m label. Added unit tests. If memory is 0 an IllegalStateException is thrown.
Merging updates from Spark upstream on 8/31/14
@AmplabJenkins

Copy link
Copy Markdown

Can one of the admins verify this patch?

@bbejeck bbejeck changed the title [CORE] SPARK-3178 setting SPARK_WORKER_MEMORY to a value without a label (m or g) sets the worker memory limit to zero SPARK-3178 setting SPARK_WORKER_MEMORY to a value without a label (m or g) sets the worker memory limit to zero Sep 5, 2014
@bbejeck

bbejeck commented Sep 5, 2014

Copy link
Copy Markdown
Member Author

Did any of the admins have chance to check this out? Let me know if you want me to modify anything. Thanks!

@JoshRosen

Copy link
Copy Markdown
Contributor

Jenkins, this is ok to test.

@vanzin

vanzin commented Sep 5, 2014

Copy link
Copy Markdown
Contributor

Feels to me like it would be better to fix this in Utils.memoryStringToMb. That way all code using it benefits.

As for the behavior of that method, maybe it should throw an exception if there is no suffix and the value is < 1MB?

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In #2002, I added a mechanism that allows environment variables to be mocked in tests. Take a look at that PR, SparkConf.getEnv in particular. By using a custom SparkConf subclass, you can mock environment variables on a per-test basis: https://github.com/apache/spark/pull/2002/files#diff-e9fb6be5f96766cce96c4d60aea2fc59R45

If we find ourselves doing this in multiple places (my PR, here, ...) it might be nice to add some test helper classes for doing this more generically. That refactoring can happen in a separate PR, though, so for now it's probably fine to just copy my code snippet here.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, to be more specific: you'll have to change the code that reads the environment variable to use SparkConf.getEnv instead of System.getEnv; I only changed this for the environment variables used in my specific test because I didn't want to make a big cross-cutting change across the codebase (plus it would probably get broken by subsequent PRs; we should add a style checker rule that complains about System.getEnv uses if we plan on doing this change globally).

@bbejeck

bbejeck commented Sep 5, 2014

Copy link
Copy Markdown
Member Author

Josh,

Thanks for the heads up on testing with environment variables. I will look at the PR and make the required changes to the test.

@bbejeck

bbejeck commented Sep 5, 2014

Copy link
Copy Markdown
Member Author

Feels to me like it would be better to fix this in Utils.memoryStringToMb. That way all code using it benefits.

I thought the same thing, but I was not sure about making a change that would be cross-cutting, so I confined my change to the WorkerArguments class

@SparkQA

SparkQA commented Sep 5, 2014

Copy link
Copy Markdown

Can one of the admins verify this patch?

@JoshRosen

Copy link
Copy Markdown
Contributor

Jenkins, this is ok to test.

@bbejeck

bbejeck commented Sep 6, 2014

Copy link
Copy Markdown
Member Author

I've made the changes for mocking out the environment variable in the test. I'd like to move the code from my master branch to a topic branch. If I rebase and move the code to a topic branch will the PR be automatically updated or will I have to close the PR, do the rebase and re-submit?

@JoshRosen

Copy link
Copy Markdown
Contributor

If you want to change the branch used for the pr you will need to close and open a new one

Sent from my phone

On Sep 5, 2014, at 7:24 PM, Bill Bejeck notifications@github.com wrote:

I've made the changes for mocking out the environment variable in the test. I'd like to move the code from forked master branch to a topic branch. If I rebase and move the code to a topic branch will the PR be automatically updated or will I have to close the PR, do the rebase and re-submit?


Reply to this email directly or view it on GitHub.

@bbejeck bbejeck closed this Sep 7, 2014
@bbejeck

bbejeck commented Sep 7, 2014

Copy link
Copy Markdown
Member Author

Closing PR to move to change branch, will open new PR

@JoshRosen

Copy link
Copy Markdown
Contributor

For those watching this PR, the new one is at #2309.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants