Skip to content

[BEAM-12114] Eliminate beam_fn_api from KafkaIO expansion#14419

Merged
boyuanzz merged 1 commit intoapache:masterfrom
boyuanzz:kafka_override
Apr 14, 2021
Merged

[BEAM-12114] Eliminate beam_fn_api from KafkaIO expansion#14419
boyuanzz merged 1 commit intoapache:masterfrom
boyuanzz:kafka_override

Conversation

@boyuanzz
Copy link
Copy Markdown
Contributor

@boyuanzz boyuanzz commented Apr 2, 2021

Please add a meaningful description for your change here


Thank you for your contribution! Follow this checklist to help us incorporate your contribution quickly and easily:

  • Choose reviewer(s) and mention them in a comment (R: @username).
  • Format the pull request title like [BEAM-XXX] Fixes bug in ApproximateQuantiles, where you replace BEAM-XXX with the appropriate JIRA issue, if applicable. This will automatically link the pull request to the issue.
  • Update CHANGES.md with noteworthy changes.
  • If this contribution is large, please file an Apache Individual Contributor License Agreement.

See the Contributor Guide for more tips on how to make review process smoother.

Post-Commit Tests Status (on master branch)

Lang SDK ULR Dataflow Flink Samza Spark Twister2
Go --- --- Build Status --- Build Status ---
Java Build Status Build Status Build Status
Build Status
Build Status
Build Status
Build Status Build Status
Build Status
Build Status
Build Status
Build Status
Build Status
Build Status
Build Status Build Status
Build Status
Build Status
Build Status
Python Build Status
Build Status
Build Status
--- Build Status
Build Status
Build Status
Build Status
Build Status
--- Build Status ---
XLang Build Status --- Build Status Build Status --- Build Status ---

Pre-Commit Tests Status (on master branch)

--- Java Python Go Website Whitespace Typescript
Non-portable Build Status
Build Status
Build Status
Build Status
Build Status
Build Status Build Status Build Status Build Status
Portable --- Build Status --- --- --- ---

See .test-infra/jenkins/README for trigger phrase, status and link of all Jenkins jobs.

GitHub Actions Tests Status (on master branch)

Build python source distribution and wheels
Python tests
Java tests

See CI.md for more information about GitHub Actions CI.

@boyuanzz boyuanzz force-pushed the kafka_override branch 2 times, most recently from 120fbf6 to 47cbf03 Compare April 6, 2021 23:33
@boyuanzz boyuanzz changed the title [WIP] Eliminate beam_fn_api from KafkaIO expansion [BEAM-12114] Eliminate beam_fn_api from KafkaIO expansion Apr 6, 2021
@boyuanzz boyuanzz requested a review from kennknowles April 6, 2021 23:40
Copy link
Copy Markdown
Member

@kennknowles kennknowles left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a useful step for moving this experiment out of the core SDK and into runners. Nice!

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Mark @Internal just to be clear that it is not for pipeline authors. Would be good to document why this exists and when to use it.

Perhaps it could be in runners-core-construction, but I actually want to merge that back into the core SDK so no need.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

runners-core-construction cannot depend on kafka-io because expansion-service introduces circular dependency, just like pubsub

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch!

@kennknowles
Copy link
Copy Markdown
Member

run java precommit

@boyuanzz
Copy link
Copy Markdown
Contributor Author

boyuanzz commented Apr 8, 2021

Run Java KafkaIO Performance Test

@boyuanzz boyuanzz force-pushed the kafka_override branch 2 times, most recently from 06b409f to 280ccdc Compare April 8, 2021 23:41
@boyuanzz
Copy link
Copy Markdown
Contributor Author

boyuanzz commented Apr 8, 2021

Run Java KafkaIO Performance Test

@boyuanzz
Copy link
Copy Markdown
Contributor Author

boyuanzz commented Apr 9, 2021

Run Java PreCommit

@boyuanzz
Copy link
Copy Markdown
Contributor Author

Run Java PreCommit

1 similar comment
@boyuanzz
Copy link
Copy Markdown
Contributor Author

Run Java PreCommit

@boyuanzz
Copy link
Copy Markdown
Contributor Author

boyuanzz commented Apr 12, 2021

I got build errors from :sdks:java:testing:tpcds module as:

14:50:07 * What went wrong:
14:50:07 Execution failed for task ':sdks:java:testing:tpcds:compileJava'.
14:50:07 > Could not resolve all files for configuration ':sdks:java:testing:tpcds:compileClasspath'.
14:50:07    > Could not find io.confluent:kafka-avro-serializer:5.3.2.
14:50:07      Searched in the following locations:
14:50:07        - https://repo.maven.apache.org/maven2/io/confluent/kafka-avro-serializer/5.3.2/kafka-avro-serializer-5.3.2.pom
14:50:07      If the artifact you are trying to retrieve can be found in the repository but without metadata in 'Maven POM' format, you need to adjust the 'metadataSources { ... }' of the repository declaration.
14:50:07      Required by:
14:50:07          project :sdks:java:testing:tpcds > project :runners:google-cloud-dataflow-java > project :sdks:java:io:kafka
14:50:07    > Could not find io.confluent:kafka-schema-registry-client:5.3.2.
14:50:07      Searched in the following locations:
14:50:07        - https://repo.maven.apache.org/maven2/io/confluent/kafka-schema-registry-client/5.3.2/kafka-schema-registry-client-5.3.2.pom
14:50:07      If the artifact you are trying to retrieve can be found in the repository but without metadata in 'Maven POM' format, you need to adjust the 'metadataSources { ... }' of the repository declaration.
14:50:07      Required by:
14:50:07          project :sdks:java:testing:tpcds > project :runners:google-cloud-dataflow-java > project :sdks:java:io:kafka

It seems like if I change :sdks:java:testing:tpcds to apply applyJavaNature, this error will be gone but more complaint from the sdks:java:testing:tpcds code.

@boyuanzz
Copy link
Copy Markdown
Contributor Author

Commented on the PR #12436 (comment) to see whether it's intended to not apply applyJavaNature

@kennknowles
Copy link
Copy Markdown
Member

When I look at https://github.com/apache/beam/blob/master/sdks/java/testing/tpcds/build.gradle I do not really understand the root cause. I think using applyJavaNature would be an improvement, so it would gain dependency analysis (and others).

@boyuanzz
Copy link
Copy Markdown
Contributor Author

Fix the build makes this PR extremely huge: d05cd78. Especially applying spotless introduced a lot of formatting changes.

@kennknowles
Copy link
Copy Markdown
Member

I am OK doing more than one commit in a PR, if each commit makes sense. Doing the spotless formatting in its own commit makes sense. Why is it so huge? Is it not applied already?

@kennknowles
Copy link
Copy Markdown
Member

Oh, I see. Yes, I think it is probably worth separating the large-scale change of adding applyJavaNature to a separate PR, just to keep things simpler for people reading the history.

@boyuanzz
Copy link
Copy Markdown
Contributor Author

boyuanzz commented Apr 12, 2021

Is it not applied already?

No. Spotlesscheck was not applied because this module didn't apply applyJavaNature. The same as the checker framework and errorprone

@kennknowles
Copy link
Copy Markdown
Member

Yea, I see the problem. If we change to a more standard style of Gradle configuration we may avoid this pain in the future. I am OK with multiple commits but maybe a separate PR would be most clear.

@boyuanzz
Copy link
Copy Markdown
Contributor Author

Here we go: #14516

@boyuanzz
Copy link
Copy Markdown
Contributor Author

Run Java PreCommit

@boyuanzz
Copy link
Copy Markdown
Contributor Author

Run Java PreCommit

@boyuanzz
Copy link
Copy Markdown
Contributor Author

I'm going to merge this PR and I'll monitor the post commit for next 2 days.

@boyuanzz boyuanzz merged commit 3b85447 into apache:master Apr 14, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants