Skip to content

[SPARK-44435][SPARK-44484][3.5][SS][CONNECT] Tests for foreachBatch and Listener#42664

Closed
WweiL wants to merge 4 commits into
apache:branch-3.5from
WweiL:SPARK-44435-tests-foreachBatch-listener-3.5
Closed

[SPARK-44435][SPARK-44484][3.5][SS][CONNECT] Tests for foreachBatch and Listener#42664
WweiL wants to merge 4 commits into
apache:branch-3.5from
WweiL:SPARK-44435-tests-foreachBatch-listener-3.5

Conversation

@WweiL

@WweiL WweiL commented Aug 24, 2023

Copy link
Copy Markdown
Contributor

What changes were proposed in this pull request?

Add several new test cases for streaming foreachBatch and streaming query listener events to test various scenarios.
Also merge this PR to 3.5 to fix the test error: #42077

Why are the changes needed?

More tests is better

Does this PR introduce any user-facing change?

No

How was this patch tested?

Test only change

WweiL added 2 commits August 24, 2023 13:18
### What changes were proposed in this pull request?

Add several new test cases for streaming foreachBatch and streaming query listener events to test various scenarios.

### Why are the changes needed?

More tests is better

### Does this PR introduce _any_ user-facing change?

No

### How was this patch tested?

Test only change

Closes apache#42521 from WweiL/SPARK-44435-tests-foreachBatch-listener.

Authored-by: Wei Liu <wei.liu@databricks.com>
Signed-off-by: Hyukjin Kwon <gurwls223@apache.org>
(cherry picked from commit 2d44848)
Signed-off-by: Hyukjin Kwon <gurwls223@apache.org>
…thod

### What changes were proposed in this pull request?

Add the missing field batchDuration to StreamingQueryProgress json method. Also modify tests accordingly

### Why are the changes needed?

Add a missing field

### Does this PR introduce _any_ user-facing change?

Probably yes - in their call to `query.lastProgress` or `query.recentProgress` and inside listener this new field will show up

### How was this patch tested?

Existing unit tests

Closes apache#42077 from WweiL/SPARK-44484-missing-json-field-progress.

Authored-by: Wei Liu <wei.liu@databricks.com>
Signed-off-by: Jungtaek Lim <kabhwan.opensource@gmail.com>
("name" -> JString(name)) ~
("timestamp" -> JString(timestamp)) ~
("batchId" -> JInt(batchId)) ~
("batchDuration" -> JInt(batchDuration)) ~

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for making a new PR.

However, given the previous discussion on SPARK-44484. We cannot backport a new feature like this under a different Tests for foreachBatch and Listener title.

@dongjoon-hyun dongjoon-hyun left a comment

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If SPARK-44484 is inevitable for SPARK-44435, I'd prefer to stop backporting because we are in RC2 stage already, @WweiL .

Screenshot 2023-08-24 at 3 28 54 PM

@WweiL

WweiL commented Aug 24, 2023

Copy link
Copy Markdown
Contributor Author

@dongjoon-hyun I see. I'll close this. Thanks for the context.

@WweiL WweiL closed this Aug 24, 2023
@dongjoon-hyun

Copy link
Copy Markdown
Member

Thank you for closing this, @WweiL .

@WweiL WweiL reopened this Aug 25, 2023
batchId=j["batchId"],
batchDuration=j["batchDuration"],
# before spark 4.0, batchDuration is not in the json method of jvm side StreamingQueryProgress
batchDuration=j["batchDuration"] if "batchDuration" in j else None,

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@dongjoon-hyun Hi Dongjoon, sorry for the back and forth. On second thought I actually find out that the newly added tests and the test failure in #42521 (comment) actually finds out a bug. Here before I assume batchDuration is always in the passed in json, but before 4.0 it is not there.

Given that we don't add the change in #42077, this check is needed. I reverted that commit, and add this check.

@WweiL WweiL requested a review from dongjoon-hyun August 25, 2023 17:18

@dongjoon-hyun dongjoon-hyun left a comment

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's totally fine. :)

If this is a bug fix, please make a new JIRA to proceed it instead of SPARK-44435 or SPARK-44484. The JIRA title should be clear that is a bug fix. We can merge that first.

@WweiL

WweiL commented Sep 5, 2023

Copy link
Copy Markdown
Contributor Author

fixed in 7be69bf

@WweiL WweiL closed this Sep 5, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants