[Python] Drop envoy-data-plane/betterproto dependency from the Python SDK#39213
shahar1 wants to merge 1 commit into
EnvoyRateLimiter only needed a handful of protobuf message classes from envoy-data-plane, which pulls in the outdated betterproto==2.0.0b6 beta (a protobuf reimplementation) plus grpclib and transitive deps. That pin is the last blocker stopping Apache Airflow from un-suspending its Beam provider (apache/airflow#66952), and it forced a Python-version split in setup.py.

The dependency was already fought rather than used: the RateLimitServiceStub bridge exists solely because betterproto emits async grpclib stubs that don't work with Beam's synchronous grpcio, and the wire is plain protobuf over grpcio regardless.

Replace it with a minimal, self-contained rate_limit.proto compiled to a checked-in rate_limit_pb2.py (following the existing proto2_coder_test_messages_pb2.py precedent). Field numbers match Envoy's rls.proto/ratelimit.proto, so it stays wire-compatible with a real RLS server.

This removes the conflict permanently for every downstream, deletes the py<3.11 split, and drops betterproto + grpclib from every container image. New wire-format tests pin the field numbers/enum values so a renumbering can't silently break live rate limiting (the mock-based tests would not).

Note: the container base_image_requirements.txt files had the two direct packages removed to stay consistent with setup.py; a full `generatePythonRequirementsAll` regeneration should follow in CI to also prune now-orphaned transitives (grpclib, h2, multidict, ...) and refresh pins.

Part of apache#37854
Unblocks apache/airflow#66952

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Summary of Changes

Hello, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request removes the 'envoy-data-plane' dependency from the Apache Beam Python SDK. By vendoring a minimal subset of the Envoy Rate Limit Service protocol, the SDK avoids the maintenance burden and version conflicts associated with the 'betterproto' library. This change improves compatibility for downstream projects like Apache Airflow and aligns with existing patterns for handling protobuf definitions within the repository.
Code Review
This pull request removes the envoy-data-plane and transitive betterproto dependencies from the Python SDK. It introduces a small, vendored protobuf definition (rate_limit.proto and its generated Python module) to maintain wire-compatibility with the Envoy Rate Limit Service, thereby resolving dependency conflicts for downstream projects. I have no feedback to provide as there are no review comments.
Assigning reviewers: R: @damccorm for label python.
Failed CI job seems unrelated.
What & why

`EnvoyRateLimiter` (`sdks/python/apache_beam/io/components/rate_limiter.py`) depends on `envoy-data-plane` purely to obtain a handful of protobuf message classes. That package pulls in `betterproto==2.0.0b6`, an outdated pre-release of a protobuf reimplementation, plus `grpclib` and a transitive subtree (`h2`, `hpack`, `hyperframe`, `multidict`).

This PR removes `envoy-data-plane` (and therefore `betterproto`) entirely, replacing it with a minimal, self-contained vendored proto.

Why this is worth doing, and why vendoring beats bumping the dependency:

- `betterproto` emits async `grpclib` stubs that are incompatible with Beam's synchronous `grpcio`. The code already hand-writes the `RateLimitServiceStub` bridge to work around that, so the package only ever contributed 6 message dataclasses, and the wire is plain protobuf over `grpcio` regardless.
- The `betterproto==2.0.0b6` pin is the last blocker stopping Apache Airflow from un-suspending its Beam provider (Revert "Suspend Apache Beam Provider due to grpcio limitation (#61926)" airflow#66952). Because Airflow's constraint solver reads Beam's `install_requires`, removing the dependency here fixes it for all Python versions, with no `betterproto` pin needed on Airflow's side.
- Bumping to `envoy-data-plane>=2.1.0` would only move the problem. That route swaps in `betterproto2` (still pre-1.0, `0.9.x`), requires porting to the betterproto2 API, and drops Python 3.10 support (2.x requires `>=3.11`). Vendoring ends the version-chase permanently and lets us delete the existing `python_version` split in `setup.py`.
- `apache_beam/coders/proto2_coder_test_messages_pb2.py` is already a checked-in, hand-regenerated `_pb2.py` outside the `gen_protos.py` model pipeline. This change follows it exactly (same lint-exclusion spots, same header style). The vendored proto is a frozen external contract: Envoy RLS v3 field numbers are GA and cannot change without breaking every client, so it is a write-once artifact, not an ongoing maintenance burden.

How it works
- `sdks/python/apache_beam/io/components/rate_limit.proto`: a ~30-line self-contained subset of the Envoy Rate Limit Service. Field numbers match Envoy's `rls.proto`/`ratelimit.proto`, so the messages are wire-compatible with a real RLS server (protobuf carries only field numbers and types on the wire, not message or package names).
- `rate_limit_pb2.py`, generated via `grpc_tools.protoc`. The generated `runtime_version` guard is removed so the module stays compatible across Beam's full `protobuf>=3.20.3,<7` runtime range (matching the existing test pb2). The regeneration command is documented in both files' headers.
- `rate_limiter.py` / `rate_limiter_test.py`: swapped imports to `rate_limit_pb2`; the one behavioral adjustment is `google.protobuf.Duration` handling (`dur.ToTimedelta().total_seconds()`, since `betterproto` used to auto-map `Duration` to `timedelta`).

Testing
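As a rough illustration of what a wire-format test pins, the sketch below hand-encodes two fields using only the Python standard library. It is not the PR's test code: the helper names are hypothetical, and the field numbers (a string in field 1, an unsigned integer in field 3) are assumptions chosen for the example that should be checked against `rate_limit.proto`. It shows only that the encoded bytes depend on field numbers and wire types, so a renumbering changes the payload even though no message or package name appears in it.

```python
# Protobuf wire types used in this sketch (from the protobuf encoding rules).
WIRE_VARINT = 0  # integers, bools, enums
WIRE_LEN = 2     # strings, bytes, embedded messages


def encode_varint(value: int) -> bytes:
    """Encode a non-negative integer as a base-128 varint."""
    if value < 0:
        raise ValueError("this sketch only handles non-negative values")
    out = bytearray()
    while True:
        low_bits = value & 0x7F
        value >>= 7
        if value:
            out.append(low_bits | 0x80)  # more bytes follow
        else:
            out.append(low_bits)
            return bytes(out)


def encode_tag(field_number: int, wire_type: int) -> bytes:
    """A field's tag is (field_number << 3) | wire_type, as a varint."""
    return encode_varint((field_number << 3) | wire_type)


def encode_string_field(field_number: int, text: str) -> bytes:
    data = text.encode("utf-8")
    return encode_tag(field_number, WIRE_LEN) + encode_varint(len(data)) + data


def encode_uint_field(field_number: int, value: int) -> bytes:
    return encode_tag(field_number, WIRE_VARINT) + encode_varint(value)


# Hypothetical payload: string "beam" in field 1, integer 300 in field 3.
payload = encode_string_field(1, "beam") + encode_uint_field(3, 300)
assert payload == b"\x0a\x04beam\x18\xac\x02"

# Moving the string to field 2 changes the first byte, which is exactly
# the kind of drift a pinned-bytes test would catch.
assert encode_string_field(2, "beam") == b"\x12\x04beam"
```

A test in this style compares serialized bytes against a fixed literal, which a mock-based test cannot do because it never touches the wire encoding.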
- `rate_limiter` imports and constructs with `envoy_data_plane` and `betterproto` blocked from `sys.modules`, confirming the dependency is genuinely gone.
- `yapf`, `isort` (repo CI flags), and `ruff` all clean.

Follow-up
The 12 container `*_requirements.txt` files have the two direct packages (`envoy-data-plane`, `betterproto`) removed so they stay consistent with `setup.py`. A full `./gradlew :sdks:python:container:generatePythonRequirementsAll` regeneration should run in CI/by a committer to also prune the now-orphaned transitives (`grpclib`, `h2`, `multidict`, …) and refresh pins; that step requires interpreters for all supported Python versions plus a full network resolution.

`envoy-data-plane` tech debt called out there; does not close the issue, whose TFT/protobuf-5/6 conflict remains open). Unblocks Revert "Suspend Apache Beam Provider due to grpcio limitation (#61926)" airflow#66952.

`CHANGES.md` with noteworthy changes.

Was AI tooling used to author this PR?
Yes — authored with Claude Code (Opus 4.8, 1M context). All changes were reviewed by the PR author. Generated following the project's gen-AI contribution guidelines.
🤖 Generated with Claude Code
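
For reference, the import-isolation check listed under Testing could look roughly like the sketch below. It is an assumption about the approach, not the PR's actual test: the helper names are hypothetical, and a small standard-library module stands in for `apache_beam.io.components.rate_limiter` so the snippet runs without Beam installed. It relies on documented import behavior: a `None` entry in `sys.modules` makes importing that name raise `ModuleNotFoundError`.

```python
import importlib
import sys
from unittest import mock

# Packages that must not be needed any more.
BLOCKED = ("envoy_data_plane", "betterproto")


def import_with_blocked(module_name, blocked=BLOCKED):
    """Import module_name while the blocked packages are unimportable."""
    overrides = {name: None for name in blocked}
    # patch.dict restores sys.modules to its previous contents on exit.
    with mock.patch.dict(sys.modules, overrides):
        # Drop any cached copy so the module body really executes again.
        sys.modules.pop(module_name, None)
        return importlib.import_module(module_name)


def blocked_import_fails(name):
    """Return True if importing name fails while it is blocked."""
    with mock.patch.dict(sys.modules, {name: None}):
        try:
            importlib.import_module(name)
        except ImportError:
            return True
        return False


# Stand-in target; a real check would pass the rate limiter module path.
module = import_with_blocked("colorsys")
assert hasattr(module, "rgb_to_hsv")
assert blocked_import_fails("betterproto")
```

If the target module still imported one of the blocked packages at module level, `import_with_blocked` would raise instead of returning, which is the failure signal for this kind of check.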