[SPARK-53209][YARN] Add ActiveProcessorCount JVM option to YARN executor and AM#51948
[SPARK-53209][YARN] Add ActiveProcessorCount JVM option to YARN executor and AM#51948jzhuge wants to merge 1 commit into
Conversation
|
what about the driver JVM options in YARN client mode? |
|
3 tests failed for the same error |
WweiL
left a comment
There was a problem hiding this comment.
Thank you for picking this up!
Client mode is out of scope for this JIRA. Several things to consider for client mode:
|
Sounds reasonable. nit: maybe the title should be "... to YARN executor and AM" instead of "... to YARN executor and driver"? |
How about "Add ActiveProcessorCount JVM option to Spark driver and executor in YARN mode" |
|
@jzhuge, my point is, "AM" is more consistent with your change than "driver" - in client mode, AM container |
|
Seems fine. cc @mridulm FYI |
|
@mridulm @HyukjinKwon Just wanted to check in on the review for this, thanks! |
Will add a flag after my vacation for 2 weeks. |
|
hi @jzhuge, do you have time to address the comment to move this PR forward? |
Ah, it fell through the crack :-( |
|
@pan3793 @mridulm @HyukjinKwon Added a feature flag, default to false. Let me know whether I need to rebase or squash WIP commits. |
|
Question: do we need separate flags for driver and executor? |
|
Many tests failed, let me rebase. |
|
Hmm, 2 unrelated test failures in these modules: kafkasparkr |
|
Retest please |
63afcaf to
62839fb
Compare
|
Thanks @pan3793 for the review! Looking ... |
Created SPARK-56157 for standalone and SPARK-56158 for local. |
8ffc12e to
f41dd72
Compare
|
@pan3793 Thanks for the feedback! The changes are cleaner. Please take another look. |
|
Sql test failure seem unrelated |
|
Unrelated test failures in |
since the UT runs on your forked repo, you have permission to rerun the single failed job. |
|
I'm going to merge this if no further comments in 24 hours |
…tor and AM Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
|
This seems unrelated: |
|
merged to master, thank you, @jzhuge and all reviewers |
|
Thanks @pan3793 for reviewing and merging the pr, @mridulm @HyukjinKwon @holdenk @WweiL for the review! |
What changes were proposed in this pull request?
When starting Spark driver and executors on YARN, the JVM process can discover all CPU cores on the node and set thread-pool or GC thread counts based on that value. We should limit what the JVM sees for the number of cores set by the user via
-XX:ActiveProcessorCount, which was introduced in Java 8u191.Adds three boolean config flags (default false):
spark.yarn.am.limitActiveProcessorCount.enabled: sets-XX:ActiveProcessorCount=<spark.yarn.am.cores>in the YARN AM JVM (client mode).spark.driver.limitActiveProcessorCount.enabled: sets-XX:ActiveProcessorCount=<spark.driver.cores>in the YARN AM JVM (cluster mode).spark.executor.limitActiveProcessorCount.enabled: sets-XX:ActiveProcessorCount=<spark.executor.cores>in executor JVMs on YARN.Why are the changes needed?
Without this change, the JVM discovers all CPU cores on the YARN node rather than the cores allocated to the container. Users have assigned driver and executors a number of cores and we should honor that. A simple test would be:
Runtime.getRuntime().availableProcessors()Does this PR introduce any user-facing change?
Yes — three new public configuration keys.
How was this patch tested?
New unit tests in
ClientSuiteandExecutorRunnableSuite.Co-authored-by: Shanyu Zhao shzhao@microsoft.com