Feat: Add experimental support for low-code source execution via manifest YAML #175
Some tests are failing because we had to remove
Natik Gadzhi (@natikgadzhi), Ella Rohm-Ensing (@erohmensing), Bindi Pankhudi (@bindipankhudi), Augustin (@alafanechere), Ben Church (@bnchrch) - This is ready for your review. All tests are passing except the Python 3.11 tests, which will be resolved soon via Natik Gadzhi (@natikgadzhi)'s CDK update here (just merged, pending release to PyPI): airbytehq/airbyte#38846
/fix-pr
Walkthrough
The recent updates to the Airbyte module introduce new entities and functionalities, enhance existing modules, and add support for declarative YAML source testing. Key changes include adding the
Changes
Sequence Diagram(s) (Beta)
sequenceDiagram
participant User
participant Airbyte
participant DeclarativeExecutor
participant Source
User->>Airbyte: Run declarative manifest source
Airbyte->>DeclarativeExecutor: Initialize with manifest
DeclarativeExecutor->>Source: Execute source with manifest
Source-->>DeclarativeExecutor: Return data
DeclarativeExecutor-->>Airbyte: Processed data
Airbyte-->>User: Display data
Actionable comments posted: 0
Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Files ignored due to path filters (1)
poetry.lock is excluded by !**/*.lock
Files selected for processing (14)
- airbyte/__init__.py (1 hunks)
- airbyte/_processors/sql/__init__.py (1 hunks)
- airbyte/caches/__init__.py (1 hunks)
- airbyte/sources/declarative.py (1 hunks)
- airbyte/sources/registry.py (5 hunks)
- airbyte/sources/util.py (4 hunks)
- examples/run_declarative_manifest_source.py (1 hunks)
- examples/run_downloadable_yaml_source.py (1 hunks)
- pyproject.toml (2 hunks)
- tests/conftest.py (2 hunks)
- tests/integration_tests/test_lowcode_connectors.py (1 hunks)
- tests/integration_tests/test_source_test_fixture.py (3 hunks)
- tests/unit_tests/test_anonymous_usage_stats.py (2 hunks)
- tests/unit_tests/test_lowcode_connectors.py (1 hunks)
Files not reviewed due to errors (4)
- airbyte/sources/registry.py (no review received)
- tests/conftest.py (no review received)
- pyproject.toml (no review received)
- airbyte/sources/util.py (no review received)
Files skipped from review due to trivial changes (2)
- airbyte/caches/__init__.py
- examples/run_declarative_manifest_source.py
Additional comments not posted (16)
airbyte/_processors/sql/__init__.py (2)
6-6: The import of snowflakecortex aligns with the PR's enhancements to SQL processing capabilities.
Line range hint 6-12: The updated export list correctly exposes the new SnowflakeCortexSqlProcessor and SnowflakeCortexTypeConverter, ensuring they are accessible as intended.
tests/unit_tests/test_lowcode_connectors.py (2)
13-31: The parameterized test setup for low-code connectors is well-implemented, ensuring comprehensive testing across different configurations.
20-31: The test execution logic is correctly implemented, using the get_source function with the source_manifest parameter to handle YAML sources as intended in the PR.
airbyte/__init__.py (1)
19-19: The addition of records to the module's exports is appropriate and aligns with the PR's enhancements to the module's capabilities.
examples/run_downloadable_yaml_source.py (2)
15-53: The example script effectively demonstrates the retrieval and usage of YAML connectors, aligning with the PR's objective to support declarative sources.
21-37: The error handling in the script is robust, effectively capturing and reporting failures during the installation of YAML connectors, which enhances the script's reliability.
tests/integration_tests/test_lowcode_connectors.py (2)
20-38: The test setup for connector initialization is comprehensive and well-implemented, ensuring each connector's ability to initialize is thoroughly tested.
41-78: The test setup for handling expected failures is well-structured and effectively uses parameterization to test different failure scenarios, enhancing the robustness of the testing process.
airbyte/sources/declarative.py (2)
22-69: The DeclarativeExecutor class is well-designed, effectively handling different types of manifest inputs and providing clear error messages, which enhances its usability and robustness.
72-104: The DeclarativeSource class is appropriately implemented, providing detailed documentation and examples for usage, which aids in understanding and utilizing the class effectively.
tests/unit_tests/test_anonymous_usage_stats.py (3)
15-18: LGTM! Proper use of fixture scope and cleanup.
21-21: LGTM! Telemetry tracking functionality is correctly tested.
75-79: LGTM! Correct handling of the DO_NOT_TRACK environment variable.
tests/integration_tests/test_source_test_fixture.py (2)
221-222: The addition of language and install_types parameters enhances the function's flexibility and aligns with the new features introduced in the PR.
30-33: Ensure the autouse_source_test_registry fixture is correctly utilized.
Actionable comments posted: 0
Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Files selected for processing (1)
- airbyte/sources/util.py (4 hunks)
Files skipped from review as they are similar to previous changes (1)
- airbyte/sources/util.py
This adds the ability to run (in theory) 130 declarative YAML sources in PyAirbyte, without any need for additional virtual environment isolation. The manifest.yml file content can be provided by the user or auto-downloaded from the master branch of airbytehq/airbyte.
Thanks to Ben Church (@bnchrch) and Lake Mossman (@lmossman) for helping figure out the logic.
The get_source() implementation in airbyte.experimental includes a new source_manifest input argument.
The argument can be any of these types:
- Path - A path to a local YAML file.
- dict - An already-parsed YAML manifest.
- str - A URL path to a YAML manifest.
- True - Indicates that PyAirbyte should find the YAML manifest at the default location, e.g.:
The YAML-runnable connectors can be found using ab.get_available_connectors(install_type="yaml") or ab.get_available_connectors(InstallType.YAML)
This PR also adds hard-coded exclusions for connectors in three categories:
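To illustrate how the four accepted source_manifest forms differ from one another, here is a minimal sketch in plain Python. classify_manifest_input is a hypothetical helper written for this note, not part of PyAirbyte, and the labels it returns are made up. It only labels the input: it does not download or parse anything, and it deliberately does not guess the default manifest location.

```python
from pathlib import Path
from typing import Any, Tuple


def classify_manifest_input(source_manifest: Any, connector_name: str) -> Tuple[str, Any]:
    """Label a source_manifest-style value (hypothetical helper, not PyAirbyte code).

    Returns a (kind, payload) pair; the kind strings are illustrative only.
    """
    # Only the literal True means "look the manifest up at the default location".
    if source_manifest is True:
        return ("default_location", connector_name)
    # An already-parsed manifest is passed through unchanged.
    if isinstance(source_manifest, dict):
        return ("parsed_manifest", source_manifest)
    # A Path refers to a local YAML file.
    if isinstance(source_manifest, Path):
        return ("local_file", source_manifest)
    # A plain string is treated as a URL to a YAML manifest.
    if isinstance(source_manifest, str):
        return ("url", source_manifest)
    raise TypeError(
        f"Unsupported source_manifest value for {connector_name!r}: {source_manifest!r}"
    )


# Example inputs; the connector name, file name, and URL are placeholders.
examples = [
    True,
    {"version": "placeholder"},
    Path("manifest.yml"),
    "https://example.com/manifest.yml",
]
kinds = [classify_manifest_input(value, "source-example")[0] for value in examples]
```

Checking for True by identity before the other branches keeps False and other unexpected values from being silently accepted; they fall through to the TypeError.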
Usage example
See the 2 new scripts in the examples directory for more examples, but the simplest usage is just:
In the above example, the source manifest.yml is automatically located from the master branch of airbytehq/airbyte, and the only change from the user perspective is to add the arg source_manifest=True.
Note
Included Connectors
This is the result of calling get_available_connectors("yaml"):
Hard-coded exclusions have been removed from this list, for instance low-code connectors that require one or more Python code files.
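The filtering described here (select connectors by install type, then drop hard-coded exclusions) can be sketched with standard-library Python. Everything below is an assumption made for illustration: the InstallType and ConnectorMetadata classes are simplified stand-ins for the ones named in this PR, list_connectors is a hypothetical function, and the connector names are placeholders rather than real registry entries.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import FrozenSet, Iterable, List, Optional, Set, Union


class InstallType(str, Enum):
    """Simplified stand-in; only YAML is named in the PR text."""

    YAML = "yaml"
    PYTHON = "python"


@dataclass
class ConnectorMetadata:
    """Simplified stand-in carrying the language and install types."""

    name: str
    language: Optional[str]
    install_types: Set[InstallType] = field(default_factory=set)


def list_connectors(
    registry: Iterable[ConnectorMetadata],
    install_type: Union[str, InstallType],
    excluded: FrozenSet[str] = frozenset(),
) -> List[str]:
    """Return sorted names supporting install_type, minus excluded names."""
    # Accept either the string form ("yaml") or the enum member.
    wanted = InstallType(install_type)
    return sorted(
        connector.name
        for connector in registry
        if wanted in connector.install_types and connector.name not in excluded
    )


# Placeholder registry entries, not real connectors.
registry = [
    ConnectorMetadata("source-alpha", "yaml", {InstallType.YAML}),
    ConnectorMetadata("source-beta", "python", {InstallType.PYTHON}),
    ConnectorMetadata("source-gamma", "yaml", {InstallType.YAML}),
]

yaml_connectors = list_connectors(registry, "yaml")
after_exclusions = list_connectors(
    registry, InstallType.YAML, excluded=frozenset({"source-gamma"})
)
```

Keeping the exclusions as a separate argument mirrors the description above, where the exclusion list is maintained apart from the registry metadata.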
Summary by CodeRabbit
New Features
Enhancements
- ConnectorMetadata to include language and installation types.
- get_available_connectors to handle different installation types.
Dependencies
- airbyte-cdk to ^1.2.1.
- airbyte-source-faker to ^6.1.2.
Tests