
[SPARK-26003][SQL][2.4] Improve SQLAppStatusListener.aggregateMetrics performance #25860

Closed
gatorsmile wants to merge 1 commit into apache:branch-2.4 from gatorsmile:cherrypickSPARK-26003

Conversation

@gatorsmile

Member

This PR cherry-picks #23002 to Spark 2.4.


What changes were proposed in this pull request?

In SQLAppStatusListener.aggregateMetrics, metricIds is used only to filter the relevant metrics, yet it is built as a sorted Seq. Checking membership in a Seq is a linear scan, so this can be quite inefficient when many metrics are involved. This PR proposes using a Set instead.
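To illustrate the idea, here is a minimal sketch of the two filtering strategies. It is not the actual Spark source: the object `MetricFilterSketch`, the `Update` alias, and both method names are hypothetical, and accumulator updates are assumed to be simple `(id, value)` pairs.

```scala
// Hypothetical sketch; names and types are assumptions, not Spark's code.
object MetricFilterSketch {
  // Stand-in for an accumulator update: (metric id, value).
  type Update = (Long, Long)

  // Before: membership is checked against a sorted Seq, so each
  // `contains` call may scan the whole sequence.
  def filterWithSeq(metricIds: Seq[Long], updates: Seq[Update]): Seq[Update] = {
    val ids = metricIds.sorted
    updates.filter { case (id, _) => ids.contains(id) }
  }

  // After: membership is checked against a Set, which avoids the
  // per-element linear scan and makes the sort unnecessary.
  def filterWithSet(metricIds: Seq[Long], updates: Seq[Update]): Seq[Update] = {
    val ids = metricIds.toSet
    updates.filter { case (id, _) => ids.contains(id) }
  }
}
```

Both methods return the same updates in the same order; only the cost of each lookup differs, which matters when the id collection and the update stream are both large.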

How was this patch tested?

NA

Closes #23002 from mgaido91/SPARK-26003.

Authored-by: Marco Gaido <marcogaido91@gmail.com>
Signed-off-by: Wenchen Fan <wenchen@databricks.com>

@gatorsmile

Member Author

cc @mgaido91 @zsxwing @cloud-fan

@zsxwing

zsxwing commented Sep 19, 2019

Member

LGTM

@dongjoon-hyun dongjoon-hyun changed the title [SPARK-26003] [Backport-2.4] Improve SQLAppStatusListener.aggregateMetrics performance [SPARK-26003][SQL][2.4] Improve SQLAppStatusListener.aggregateMetrics performance Sep 19, 2019

@dongjoon-hyun dongjoon-hyun left a comment

Member

+1, LGTM.

@SparkQA

SparkQA commented Sep 20, 2019


Test build #111027 has finished for PR 25860 at commit 39c1bcd.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@dongjoon-hyun

Member

Thank you all! Merged to branch-2.4.

dongjoon-hyun pushed a commit that referenced this pull request Sep 20, 2019
… performance

This PR is to cherry-pick #23002 to Spark 2.4

Closes #25860 from gatorsmile/cherrypickSPARK-26003.

Authored-by: Marco Gaido <marcogaido91@gmail.com>
Signed-off-by: Dongjoon Hyun <dhyun@apple.com>

@mgaido91

Contributor

a late LGTM, thanks @gatorsmile @dongjoon-hyun
