[SPARK-47323][K8S] Support custom executor log urls #45464

Closed
EnricoMi wants to merge 4 commits into apache:master from G-Research-Forks:k8s-custom-executor-log-url

Conversation

EnricoMi commented Mar 11, 2024

Contributor

What changes were proposed in this pull request?

Make the Kubernetes resource manager support the existing config spark.ui.custom.executor.log.url.

This allows, for example:

spark.ui.custom.executor.log.url="https://my.custom.url/logs?app={{APP_ID}}&executor={{EXECUTOR_ID}}"

Supports these variables:

  • APP_ID: The unique application id
  • EXECUTOR_ID: The executor id (a positive integer)
  • HOSTNAME: The name of the host where the executor runs
  • KUBERNETES_NAMESPACE: The namespace where the executor pods run
  • KUBERNETES_POD_NAME: The name of the pod that contains the executor
  • FILE_NAME: The name of the log, which is always "log"
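To illustrate how such a pattern expands, here is a minimal sketch in Python. It is not Spark's implementation, only an illustration of the {{NAME}} placeholder substitution described above. The function name render_log_url and all attribute values are hypothetical, and the sketch does not URL-encode the substituted values.

```python
import re

# Hypothetical attribute values, made up for illustration only.
attributes = {
    "APP_ID": "spark-app-123",
    "EXECUTOR_ID": "1",
    "KUBERNETES_NAMESPACE": "default",
    "KUBERNETES_POD_NAME": "my-app-exec-1",
    "FILE_NAME": "log",
}


def render_log_url(pattern, attrs):
    """Replace each {{NAME}} placeholder with attrs[NAME].

    Placeholders without a matching attribute are left untouched.
    """
    def substitute(match):
        return attrs.get(match.group(1), match.group(0))

    return re.sub(r"\{\{([A-Z_]+)\}\}", substitute, pattern)


pattern = "https://my.custom.url/logs?app={{APP_ID}}&executor={{EXECUTOR_ID}}"
print(render_log_url(pattern, attributes))
# prints: https://my.custom.url/logs?app=spark-app-123&executor=1
```

A pattern that queries an external log service by namespace and pod name would use {{KUBERNETES_NAMESPACE}} and {{KUBERNETES_POD_NAME}} in the same way.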

Why are the changes needed?

Running Spark on Kubernetes requires persisting the executor logs elsewhere, and having the Spark UI link to those logs is very useful. Such links are currently only supported on YARN.

Does this PR introduce any user-facing change?

The Spark UI provides links to executor logs when running on Kubernetes.

How was this patch tested?

Unit test, plus manual testing on a minikube K8s cluster.

Was this patch authored or co-authored using generative AI tooling?

No

mridulm commented Mar 11, 2024

Contributor

+CC @thejdeep

Member

The external log service for K8s is likely to use the namespace and pod name to query the logs. Could you please expose NAMESPACE too?

Contributor Author

I have added the namespace.

pan3793 commented Mar 11, 2024

Member

@EnricoMi this looks much simpler than my previous attempt #38357

Comment thread docs/configuration.md Outdated

Member

standalone?

Contributor Author

I think the sentence "Please check the documentation for your cluster manager to see which patterns are supported, if any." is sufficient; there is no need to list which managers support this conf and which don't. That list easily gets outdated.

EnricoMi commented Mar 11, 2024

Contributor Author

> @EnricoMi this looks much simpler than my previous attempt #38357

@pan3793 Thanks for the pointer! Here is also a PR for driver log support (#45728) which borrows some code from your attempt (#38357).

@EnricoMi

Contributor Author

CC @dongjoon-hyun

EnricoMi force-pushed the k8s-custom-executor-log-url branch from 80070ef to 2f896c8 on April 20, 2024 16:23
@EnricoMi

Contributor Author

@dongjoon-hyun What do you think?

@github-actions

We're closing this PR because it hasn't been updated in a while. This isn't a judgement on the merit of the PR in any way. It's just a way of keeping the PR queue manageable.
If you'd like to revive this PR, please reopen it and ask a committer to remove the Stale tag!

github-actions bot added the Stale label on Jul 30, 2024
EnricoMi force-pushed the k8s-custom-executor-log-url branch from 2f896c8 to 315a0bb on July 30, 2024 08:06
github-actions bot closed this on Jul 31, 2024

dongjoon-hyun left a comment

Member

Sorry for being late, @EnricoMi.

Apache Spark already supports this feature, so the configuration documentation has been fixed in master/3.5/3.4. Could you try following the updated documentation?

spark.executorEnv.SPARK_EXECUTOR_ATTRIBUTE_APP_ID='$(SPARK_APPLICATION_ID)'
spark.executorEnv.SPARK_EXECUTOR_ATTRIBUTE_EXECUTOR_ID='$(SPARK_EXECUTOR_ID)'
spark.ui.custom.executor.log.url='https://log-server/log?appId={{APP_ID}}&execId={{
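As a rough sketch of how settings like these could be passed on a command line, the Python snippet below builds a spark-submit argument list. The two executorEnv entries are copied from the comment above. The URL pattern in the comment is cut off, so the sketch substitutes the pattern from the PR description; that substitution, the helper name conf_args, and the bare spark-submit command with no application are assumptions for illustration.

```python
import shlex

confs = {
    # Copied from the comment above.
    "spark.executorEnv.SPARK_EXECUTOR_ATTRIBUTE_APP_ID": "$(SPARK_APPLICATION_ID)",
    "spark.executorEnv.SPARK_EXECUTOR_ATTRIBUTE_EXECUTOR_ID": "$(SPARK_EXECUTOR_ID)",
    # Assumption: pattern taken from the PR description, because the one
    # in the comment above is truncated.
    "spark.ui.custom.executor.log.url":
        "https://my.custom.url/logs?app={{APP_ID}}&executor={{EXECUTOR_ID}}",
}


def conf_args(confs):
    """Turn a dict of Spark settings into a flat list of --conf arguments."""
    args = []
    for key, value in confs.items():
        args += ["--conf", f"{key}={value}"]
    return args


# Hypothetical command line; a real one would also name the application to run.
command = ["spark-submit", *conf_args(confs)]

# shlex.join quotes each argument for a POSIX shell, so "$(...)" is passed
# through literally instead of being run as a shell command substitution.
print(shlex.join(command))
```

The quoting matters for the same reason the comment uses single quotes: in a POSIX shell an unquoted $(SPARK_EXECUTOR_ID) would be treated as command substitution before spark-submit ever sees it.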

@EnricoMi

Contributor Author

Looks like this works in master. Which versions before 4.0.0 support this?

dongjoon-hyun commented Aug 14, 2024

Member

> Looks like this works in master. Which versions before 4.0.0 support this?

All Apache Spark versions since K8s support went GA have supported it, so SPARK-49176 is a documentation fix.
