feat(default-error-handler): Add max_time parameter to DefaultErrorHandler#648
Conversation
👋 Greetings, Airbyte Team Member!Here are some helpful tips and reminders for your convenience. Testing This CDK VersionYou can test this version of the CDK using the following: # Run the CLI from this branch:
uvx 'git+https://github.com/airbytehq/airbyte-python-cdk.git@tolik0/default-error-handler/add-max-time-parameter#egg=airbyte-python-cdk[dev]' --help
# Update a connector to use the CDK from this branch ref:
cd airbyte-integrations/connectors/source-example
poe use-cdk-branch tolik0/default-error-handler/add-max-time-parameterHelpful ResourcesPR Slash CommandsAirbyte Maintainers can execute the following slash commands on your PR:
|
📝 WalkthroughWalkthroughA new Changes
✨ Finishing Touches
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. 🪧 TipsChatThere are 3 ways to chat with CodeRabbit:
SupportNeed help? Create a ticket on our support page for assistance with any issues or questions. Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments. CodeRabbit Commands (Invoked using PR comments)
Other keywords and placeholders
CodeRabbit Configuration File (
|
There was a problem hiding this comment.
Actionable comments posted: 0
🔭 Outside diff range comments (1)
airbyte_cdk/sources/declarative/parsers/model_to_component_factory.py (1)
2268-2275:max_timeis Optional here butDefaultErrorHandlerexpectsint– mypy is red-flagging
model.max_timeis declared asOptional[int]in the generated model, but theDefaultErrorHandlerconstructor still types this argument as plainint. PassingNonetherefore violates the signature and breaks the linters (see pipeline error).Would it make sense to coerce the value before the call (e.g. fall back to the same default the schema uses) or, alternatively, amend the
DefaultErrorHandlersignature to acceptOptional[int]? For the quick fix on this call-site, something like the diff below keeps static typing happy while preserving behaviour – wdyt?- max_time=model.max_time, + max_time=model.max_time if model.max_time is not None else 600,(Replace
600withDefaultErrorHandler.DEFAULT_MAX_TIMEif such a constant exists.)
🧹 Nitpick comments (3)
airbyte_cdk/sources/declarative/requesters/error_handlers/default_error_handler.py (2)
89-95: Makemax_timeoptional for symmetry withmax_retries, wdyt?
max_retriesis declared asOptional[int], suggesting callers may omit it.
max_timeis declared as a plainint, yet it also has a sensible default.
Declaring it asOptional[int] = 60 * 10(and reflecting that in the docstring) would keep the public surface consistent and avoid mypy/IDE hints that the argument is mandatory.
101-104: Is the private field_max_timestill needed or could we rely onmax_timedirectly?
max_timeand_max_timecurrently hold identical default values, but the latter is never read inside this class.
Duplicating state can drift over time and complicate maintenance. Would dropping_max_time(or turning it into a cached/derived property) simplify things? wdyt?airbyte_cdk/sources/declarative/declarative_component_schema.yaml (1)
1739-1743: Consider allowing interpolated values formax_time, similar to other numeric settings, wdyt?Most numeric knobs in the schema (e.g.,
backoff_time_in_secondsat line 469 andfactorat line 1834) accept either anintegeror astringwith interpolation context so connector authors can surface the value fromconfig.
Would doing the same here improve consistency and flexibility?max_time: title: Max cumulative retry wait time (seconds) description: The maximum total time (in seconds) spent waiting between retries, that is the sum of all backoff intervals, before giving up and failing. - type: integer + anyOf: + - type: integer # literal seconds + - type: string # "{{ config['max_retry_time'] }}" + interpolation_context: + - config default: 600This keeps the default numeric but unlocks templating without a breaking change.
Happy to hear your thoughts!
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (4)
airbyte_cdk/sources/declarative/declarative_component_schema.yaml(1 hunks)airbyte_cdk/sources/declarative/models/declarative_component_schema.py(1 hunks)airbyte_cdk/sources/declarative/parsers/model_to_component_factory.py(1 hunks)airbyte_cdk/sources/declarative/requesters/error_handlers/default_error_handler.py(1 hunks)
🧰 Additional context used
🧠 Learnings (3)
airbyte_cdk/sources/declarative/models/declarative_component_schema.py (2)
undefined
<retrieved_learning>
Learnt from: pnilan
PR: airbytehq/airbyte-python-cdk#0
File: :0-0
Timestamp: 2024-12-11T16:34:46.319Z
Learning: In the airbytehq/airbyte-python-cdk repository, the declarative_component_schema.py file is auto-generated from declarative_component_schema.yaml and should be ignored in the recommended reviewing order.
</retrieved_learning>
<retrieved_learning>
Learnt from: ChristoGrab
PR: #58
File: airbyte_cdk/sources/declarative/yaml_declarative_source.py:0-0
Timestamp: 2024-11-18T23:40:06.391Z
Learning: When modifying the YamlDeclarativeSource class in airbyte_cdk/sources/declarative/yaml_declarative_source.py, avoid introducing breaking changes like altering method signatures within the scope of unrelated PRs. Such changes should be addressed separately to minimize impact on existing implementations.
</retrieved_learning>
airbyte_cdk/sources/declarative/declarative_component_schema.yaml (1)
undefined
<retrieved_learning>
Learnt from: ChristoGrab
PR: #58
File: airbyte_cdk/sources/declarative/yaml_declarative_source.py:0-0
Timestamp: 2024-11-18T23:40:06.391Z
Learning: When modifying the YamlDeclarativeSource class in airbyte_cdk/sources/declarative/yaml_declarative_source.py, avoid introducing breaking changes like altering method signatures within the scope of unrelated PRs. Such changes should be addressed separately to minimize impact on existing implementations.
</retrieved_learning>
airbyte_cdk/sources/declarative/parsers/model_to_component_factory.py (2)
undefined
<retrieved_learning>
Learnt from: aaronsteers
PR: #174
File: airbyte_cdk/sources/declarative/parsers/model_to_component_factory.py:1093-1102
Timestamp: 2025-01-14T00:20:32.310Z
Learning: In the airbyte_cdk/sources/declarative/parsers/model_to_component_factory.py file, the strict module name checks in _get_class_from_fully_qualified_class_name (requiring module_name to be "components" and module_name_full to be "source_declarative_manifest.components") are intentionally designed to provide early, clear feedback when class declarations won't be found later in execution. These restrictions may be loosened in the future if the requirements for class definition locations change.
</retrieved_learning>
<retrieved_learning>
Learnt from: ChristoGrab
PR: #58
File: airbyte_cdk/sources/declarative/yaml_declarative_source.py:0-0
Timestamp: 2024-11-18T23:40:06.391Z
Learning: When modifying the YamlDeclarativeSource class in airbyte_cdk/sources/declarative/yaml_declarative_source.py, avoid introducing breaking changes like altering method signatures within the scope of unrelated PRs. Such changes should be addressed separately to minimize impact on existing implementations.
</retrieved_learning>
🧬 Code Graph Analysis (2)
airbyte_cdk/sources/declarative/models/declarative_component_schema.py (3)
airbyte_cdk/sources/streams/http/http.py (1)
max_time(149-153)airbyte_cdk/sources/declarative/requesters/error_handlers/composite_error_handler.py (1)
max_time(56-57)airbyte_cdk/sources/streams/http/error_handlers/http_status_error_handler.py (1)
max_time(45-46)
airbyte_cdk/sources/declarative/parsers/model_to_component_factory.py (4)
airbyte_cdk/sources/streams/http/http.py (1)
max_time(149-153)airbyte_cdk/sources/declarative/requesters/error_handlers/composite_error_handler.py (1)
max_time(56-57)airbyte_cdk/sources/streams/http/error_handlers/error_handler.py (1)
max_time(26-30)airbyte_cdk/sources/streams/http/error_handlers/http_status_error_handler.py (1)
max_time(45-46)
🪛 GitHub Actions: Linters
airbyte_cdk/sources/declarative/parsers/model_to_component_factory.py
[error] 2270-2270: mypy: Argument "max_time" to "DefaultErrorHandler" has incompatible type "int | None"; expected "int" [arg-type]
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (10)
- GitHub Check: Check: source-shopify
- GitHub Check: Check: source-intercom
- GitHub Check: Check: destination-motherduck
- GitHub Check: Check: source-pokeapi
- GitHub Check: Check: source-hardcoded-records
- GitHub Check: Pytest (Fast)
- GitHub Check: SDM Docker Image Build
- GitHub Check: Pytest (All, Python 3.10, Ubuntu)
- GitHub Check: Pytest (All, Python 3.11, Ubuntu)
- GitHub Check: Analyze (python)
🔇 Additional comments (2)
airbyte_cdk/sources/declarative/requesters/error_handlers/default_error_handler.py (1)
135-149:max_timeis defined but never enforced – missing guard inbackoff_time?The new parameter promises “maximum cumulative wait time across retries”, yet
backoff_time()only delegates to the strategies and returns their value; cumulative elapsed time is never tracked.Could you confirm whether the parent
ErrorHandleror the caller layer caps the total wait?
If not, perhaps we need bookkeeping similar to the existing_last_request_to_attempt_count, e.g.:@@ for backoff_strategy in self.backoff_strategies: backoff = backoff_strategy.backoff_time( response_or_exception=response_or_exception, attempt_count=attempt_count ) if backoff: - return backoff + if self._max_time is None: + return backoff + self._elapsed_time += backoff + if self._elapsed_time > self._max_time: + return None # give up + return backoff(or any equivalent approach).
Let me know if this lives elsewhere; otherwise adding the guard would keep the runtime behaviour aligned with the new config knob. wdyt?airbyte_cdk/sources/declarative/models/declarative_component_schema.py (1)
1939-1943: LGTM! The newmax_timefield looks well-implemented.The field definition follows the same pattern as the existing
max_retriesfield above it, with a reasonable default of 600 seconds that matches the timeout I see in the HTTP stream implementation. The description clearly explains that this represents cumulative wait time across all retry attempts, which complements the existing retry count limit nicely. Since this is auto-generated from the YAML schema, the implementation looks mechanically correct - wdyt?
Resolves: https://github.com/airbytehq/oncall/issues/7994
Summary by CodeRabbit
New Features
Documentation