[SPARK-49103][CORE] Support spark.master.rest.filters #47595

Closed
dongjoon-hyun wants to merge 1 commit into apache:master from dongjoon-hyun:SPARK-49103

@dongjoon-hyun dongjoon-hyun commented Aug 3, 2024

Member

What changes were proposed in this pull request?

This PR aims to support spark.master.rest.filters configuration like the existing spark.ui.filters configuration.

Apache Spark recently started to support JWSFilter. We can take advantage of JWSFilter to protect the Spark Master REST API.

Why are the changes needed?

As with the Spark UI, we should provide the same capability for the Apache Spark Master REST API.

For example, we can protect the Spark Master REST API with JWSFilter as follows.

MASTER REST API WITH JWSFilter

$ build/sbt package
$ cp jjwt-impl-0.12.6.jar assembly/target/scala-2.13/jars
$ cp jjwt-jackson-0.12.6.jar assembly/target/scala-2.13/jars
$ SPARK_NO_DAEMONIZE=1 \
SPARK_MASTER_OPTS="-Dspark.master.rest.enabled=true -Dspark.master.rest.filters=org.apache.spark.ui.JWSFilter -Dspark.org.apache.spark.ui.JWSFilter.param.key=VmlzaXQgaHR0cHM6Ly9zcGFyay5hcGFjaGUub3JnIHRvIGRvd25sb2FkIEFwYWNoZSBTcGFyay4=" \
sbin/start-master.sh

AUTHORIZATION FAILURE

$ curl -v -XPOST http://localhost:6066/v1/submissions/clear
* Host localhost:6066 was resolved.
* IPv6: ::1
* IPv4: 127.0.0.1
*   Trying [::1]:6066...
* connect to ::1 port 6066 from ::1 port 51705 failed: Connection refused
*   Trying 127.0.0.1:6066...
* Connected to localhost (127.0.0.1) port 6066
> POST /v1/submissions/clear HTTP/1.1
> Host: localhost:6066
> User-Agent: curl/8.7.1
> Accept: */*
>
* Request completely sent off
< HTTP/1.1 403 Forbidden
< Date: Sat, 03 Aug 2024 22:18:03 GMT
< Cache-Control: must-revalidate,no-cache,no-store
< Content-Type: text/html;charset=iso-8859-1
< Content-Length: 590
< Server: Jetty(11.0.21)
<
<html>
<head>
<meta http-equiv="Content-Type" content="text/html;charset=ISO-8859-1"/>
<title>Error 403 Authorization header is missing.</title>
</head>
<body><h2>HTTP ERROR 403 Authorization header is missing.</h2>
<table>
<tr><th>URI:</th><td>/v1/submissions/clear</td></tr>
<tr><th>STATUS:</th><td>403</td></tr>
<tr><th>MESSAGE:</th><td>Authorization header is missing.</td></tr>
<tr><th>SERVLET:</th><td>org.apache.spark.deploy.rest.StandaloneClearRequestServlet-7f171159</td></tr>
</table>
<hr/><a href="https://eclipse.org/jetty">Powered by Jetty:// 11.0.21</a><hr/>

</body>
</html>
* Connection #0 to host localhost left intact

SUCCESS

$ curl -v -H "Authorization: Bearer eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.e30.4EKWlOkobpaAPR0J4BE0cPQ-ZD1tRQKLZp1vtE7upPw" -XPOST http://localhost:6066/v1/submissions/clear
* Host localhost:6066 was resolved.
* IPv6: ::1
* IPv4: 127.0.0.1
*   Trying [::1]:6066...
* connect to ::1 port 6066 from ::1 port 51697 failed: Connection refused
*   Trying 127.0.0.1:6066...
* Connected to localhost (127.0.0.1) port 6066
> POST /v1/submissions/clear HTTP/1.1
> Host: localhost:6066
> User-Agent: curl/8.7.1
> Accept: */*
> Authorization: Bearer eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.e30.4EKWlOkobpaAPR0J4BE0cPQ-ZD1tRQKLZp1vtE7upPw
>
* Request completely sent off
< HTTP/1.1 200 OK
< Date: Sat, 03 Aug 2024 22:16:51 GMT
< Content-Type: application/json;charset=utf-8
< Content-Length: 113
< Server: Jetty(11.0.21)
<
{
  "action" : "ClearResponse",
  "message" : "",
  "serverSparkVersion" : "4.0.0-SNAPSHOT",
  "success" : true
* Connection #0 to host localhost left intact
}%
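
As a rough illustration of where a bearer token like the one above could come from, the following Python sketch builds a compact HS256 token from the Base64 key passed to the Master. It is only a sketch under assumptions not confirmed in this PR: that the filter decodes its key parameter as Base64 and verifies an HMAC-SHA256 signature, and that an empty claims object is acceptable. The helper names `b64url`, `make_hs256_token`, and `verify_hs256_token` are hypothetical.

```python
import base64
import hashlib
import hmac
import json


def b64url(data: bytes) -> str:
    """Base64url-encode without padding, as used by compact JWS tokens."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def make_hs256_token(secret_b64: str, claims: dict) -> str:
    """Build a compact token signed with HMAC-SHA256.

    Assumption: the secret is supplied Base64-encoded, as in the
    SPARK_MASTER_OPTS example above.
    """
    key = base64.urlsafe_b64decode(secret_b64)
    header = {"alg": "HS256", "typ": "JWT"}
    # Header and payload are serialized as compact JSON, then base64url-encoded.
    signing_input = ".".join(
        b64url(json.dumps(part, separators=(",", ":")).encode("utf-8"))
        for part in (header, claims)
    )
    signature = hmac.new(key, signing_input.encode("ascii"), hashlib.sha256).digest()
    return f"{signing_input}.{b64url(signature)}"


def verify_hs256_token(secret_b64: str, token: str) -> bool:
    """Recompute the signature and compare it in constant time."""
    key = base64.urlsafe_b64decode(secret_b64)
    signing_input, _, signature = token.rpartition(".")
    expected = hmac.new(key, signing_input.encode("ascii"), hashlib.sha256).digest()
    return hmac.compare_digest(b64url(expected), signature)


# The key value from the SPARK_MASTER_OPTS example above.
SECRET = "VmlzaXQgaHR0cHM6Ly9zcGFyay5hcGFjaGUub3JnIHRvIGRvd25sb2FkIEFwYWNoZSBTcGFyay4="

token = make_hs256_token(SECRET, {})
print(f"Authorization: Bearer {token}")
```

The printed value could then be passed to `curl -H` in the same way as the request shown above.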

Does this PR introduce any user-facing change?

No, this is a new feature which is not loaded by default.

How was this patch tested?

Pass the CIs with the newly added test case.

Was this patch authored or co-authored using generative AI tooling?

No.

@github-actions github-actions Bot added the CORE label Aug 3, 2024
@dongjoon-hyun

Member Author

cc @mridulm , @viirya , @yaooqinn

.version("4.0.0")
.stringConf
.toSequence
.createWithDefault(Nil)

Member


Do we have any user-facing documentation for this config?

@dongjoon-hyun

Member Author

Thank you, @viirya .

Regarding the following, I'm currently preparing an independent documentation PR that covers the recent changes. I will include this part too.

Do we have any user-facing documentation for this config?

@dongjoon-hyun dongjoon-hyun deleted the SPARK-49103 branch August 4, 2024 02:40
HyukjinKwon pushed a commit that referenced this pull request Aug 4, 2024
…REST API and rename parameter to `secretKey`

### What changes were proposed in this pull request?

This PR aims to do the following.
- Document `JWSFilter` and its usage in `Spark UI` and `REST API`
    - `Spark UI` section of `Configuration` page
    - `Spark Security` page
    - `Spark Standalone` page
- Rename the parameter `key` to `secretKey` to redact it in Spark Driver UI and Spark Master UI.

### Why are the changes needed?

To reflect the recently added security features:
- #47575
- #47595

### Does this PR introduce _any_ user-facing change?

No, because this is a new feature of Apache Spark 4.0.0.

### How was this patch tested?

Pass the CIs and manual review.

- `spark-standalone.html`
![Screenshot 2024-08-03 at 22 40 53](https://github.com/user-attachments/assets/f1b95a01-c14b-4f14-96b6-3181afaf6f9f)

- `security.html`
![Screenshot 2024-08-03 at 22 39 00](https://github.com/user-attachments/assets/8413f6a3-47df-4d71-87ee-25ab32171c6c)
![Screenshot 2024-08-03 at 22 39 51](https://github.com/user-attachments/assets/01546724-d5b5-40d5-a980-236f9d13ae81)

- `configuration.html`
![Screenshot 2024-08-03 at 22 38 07](https://github.com/user-attachments/assets/c0845a7f-6ae1-4194-b98a-68d7442c9785)

### Was this patch authored or co-authored using generative AI tooling?

No.

Closes #47596 from dongjoon-hyun/SPARK-49104.

Authored-by: Dongjoon Hyun <dhyun@apple.com>
Signed-off-by: Hyukjin Kwon <gurwls223@apache.org>
dongjoon-hyun added a commit that referenced this pull request Feb 12, 2025
### What changes were proposed in this pull request?

This PR aims to enable `spark.master.rest.enabled` by default for Apache Spark 4.1.0.

### Why are the changes needed?

Apache Spark is ready to enable this feature by default.
- Since Apache Spark 1.3.0, `spark.master.rest.enabled` has been used stably.
- Since Apache Spark 4.0.0, `spark.master.rest.filters` provides a way to serve it securely.
  - #47595

### Does this PR introduce _any_ user-facing change?

Yes, the migration guide is updated.

### How was this patch tested?

Pass the CIs.

### Was this patch authored or co-authored using generative AI tooling?

No.

Closes #49894 from dongjoon-hyun/SPARK-51165.

Authored-by: Dongjoon Hyun <dongjoon@apache.org>
Signed-off-by: Dongjoon Hyun <dongjoon@apache.org>
dongjoon-hyun added a commit that referenced this pull request Jun 15, 2026
…cret` with Master REST server

### What changes were proposed in this pull request?

This PR removes the standalone `Master` check-code that rejects `spark.authenticate.secret` when the Master REST server (`spark.master.rest.enabled`) is enabled.

https://github.com/apache/spark/blob/088071d869dee0cb433c5e72ba2e7851e332b391/core/src/main/scala/org/apache/spark/deploy/master/Master.scala#L138-L144

For the record, the check was introduced in Apache Spark 2.4.0 and is now outdated.
- #22071

### Why are the changes needed?

`spark.authenticate.secret` (the RPC authentication secret) and `spark.master.rest.enabled` (the standalone submission REST server) are independent concerns, but the removed check-code coupled them by failing Master startup whenever both were set.

Because `spark.master.rest.enabled` has defaulted to `true` since Apache Spark 4.1.0, this check-code forced any cluster using RPC authentication to disable the REST server. This is wrong. We don't need to block this combination because the REST server is protected independently by the following changes.
- #47595 (Apache Spark 4.0.0)
- #47596
- #49894 (Apache Spark 4.1.0)

### Does this PR introduce _any_ user-facing change?

Previously, starting a standalone Master with `spark.authenticate.secret` set and `spark.master.rest.enabled=true` (the default) failed with an `IllegalArgumentException`. After this PR, the Master starts normally with both configured securely.

This is a bug fix that enables a previously blocked code path, so technically there is no loss from the user's perspective.
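
The behavior change described here can be modeled with a small Python sketch. This is not the Scala code in `Master.scala`; the function names `check_before` and `check_after` are hypothetical, `ValueError` stands in for `IllegalArgumentException`, and the default of `true` is assumed from the Spark 4.1.0 default mentioned above.

```python
REST_ENABLED = "spark.master.rest.enabled"
AUTH_SECRET = "spark.authenticate.secret"


def rest_enabled(conf: dict) -> bool:
    # Assumed default of "true", per the Spark 4.1.0 default described above.
    return conf.get(REST_ENABLED, "true").lower() == "true"


def check_before(conf: dict) -> None:
    """Hypothetical model of the removed check: reject the combination."""
    if rest_enabled(conf) and AUTH_SECRET in conf:
        raise ValueError(
            f"{AUTH_SECRET} cannot be set while {REST_ENABLED} is true"
        )


def check_after(conf: dict) -> None:
    """Hypothetical model after this change: the settings are independent."""
    return None


conf = {AUTH_SECRET: "rpc-secret"}  # REST server left at its default

try:
    check_before(conf)
    started_before = True
except ValueError:
    started_before = False

check_after(conf)
started_after = True

print(started_before, started_after)
```

In this model, a configuration that only sets the RPC secret fails the old check because the REST server is enabled by default, while the new behavior lets startup proceed.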

### How was this patch tested?

Added a unit test in `MasterSuite` that verifies a `Master` can be created with both `spark.master.rest.enabled=true` and `spark.authenticate.secret` set.

### Was this patch authored or co-authored using generative AI tooling?

Generated-by: Claude Code (Claude Opus 4.8)

Closes #56511 from dongjoon-hyun/SPARK-57451.

Authored-by: Dongjoon Hyun <dongjoon@apache.org>
Signed-off-by: Dongjoon Hyun <dongjoon@apache.org>
dongjoon-hyun added a commit that referenced this pull request Jun 15, 2026
…cret` with Master REST server

(cherry picked from commit ff36aac)
Signed-off-by: Dongjoon Hyun <dongjoon@apache.org>
dongjoon-hyun added a commit that referenced this pull request Jun 17, 2026
…cret` with Master REST server

(cherry picked from commit ff36aac)
Signed-off-by: Dongjoon Hyun <dongjoon@apache.org>
(cherry picked from commit 0a95243)
Signed-off-by: Dongjoon Hyun <dongjoon@apache.org>
dongjoon-hyun added a commit that referenced this pull request Jun 17, 2026
…cret` with Master REST server

(cherry picked from commit ff36aac)
Signed-off-by: Dongjoon Hyun <dongjoon@apache.org>
(cherry picked from commit 0a95243)
Signed-off-by: Dongjoon Hyun <dongjoon@apache.org>
(cherry picked from commit a6480c1)
Signed-off-by: Dongjoon Hyun <dongjoon@apache.org>
dongjoon-hyun added a commit that referenced this pull request Jun 17, 2026
…cret` with Master REST server

(cherry picked from commit ff36aac)
Signed-off-by: Dongjoon Hyun <dongjoon@apache.org>
(cherry picked from commit 0a95243)
Signed-off-by: Dongjoon Hyun <dongjoon@apache.org>
(cherry picked from commit a6480c1)
Signed-off-by: Dongjoon Hyun <dongjoon@apache.org>
(cherry picked from commit 5b84d39)
Signed-off-by: Dongjoon Hyun <dongjoon@apache.org>
iemejia pushed a commit to iemejia/spark that referenced this pull request Jun 17, 2026
…cret` with Master REST server
