Document supported databricks_retry_args usage for deferrable Databricks operators
#68017
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -1 +1 @@ | ||
| 2d6f34bb40832f84cb6c121237b1c5b0a05181dccface9fd171558f4df1747dc | ||
| 2ccde55d75b93c7fc2c5723fc7f74bf8995244606190c98acf005ea1f39f04ca | ||
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -33,6 +33,7 @@ Features | |
| ~~~~~~~~ | ||
|
|
||
| * ``Fail fast for non-serializable retry_args in deferrable operators and triggers (#64960)`` | ||
| * ``Document supported retry_args shapes for deferrable Databricks operators`` | ||
|
Contributor
Please drop this entry. Per the NOTE TO CONTRIBUTORS at the top of this file, the changelog is maintained semi-automatically by the release manager, and contributor edits are only for breaking-change guidance. A hand-added line (without the
Drafted-by: Claude Code (Fable 5); reviewed by @moomindani before posting
||
| * ``Forward Airflow Dag params to Databricks job parameters in CreateJobs/SubmitRun/RunNow (#66613)`` | ||
| * ``Add session-level query tags to Databricks SQL operators (#66895)`` | ||
|
|
||
|
|
||
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -80,3 +80,41 @@ DatabricksRunNowDeferrableOperator | |
| Deferrable version of the :class:`~airflow.providers.databricks.operators.DatabricksRunNowOperator` operator. | ||
|
|
||
| It allows to utilize Airflow workers more effectively using `new functionality introduced in Airflow 2.2.0 <https://airflow.apache.org/docs/apache-airflow/2.2.0/concepts/deferring.html#triggering-deferral>`_ | ||
|
|
||
| .. _howto/operator:DatabricksRunNowDeferrableOperator:retry-args: | ||
|
|
||
| Retry args in deferrable mode | ||
| ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ | ||
|
|
||
| When ``deferrable=True``, the ``databricks_retry_args`` dictionary is serialized across the | ||
| trigger boundary and must contain only Airflow-serializable values (plain Python primitives | ||
| such as ``int``, ``float``, ``str``, ``bool``, ``None``, ``dict``, and ``list``). | ||
|
Contributor
Two precision issues here:
Drafted-by: Claude Code (Fable 5); reviewed by @moomindani before posting
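
To make the constraint in the hunk above concrete, here is a minimal sketch of a recursive check for the value types the documentation lists. The helper name `is_plain_primitive` is hypothetical and is not the provider's actual validation; it only illustrates what "plain Python primitives" would admit and reject, under the assumption that dictionary keys must be strings.

```python
from typing import Any

# Scalar types the documentation lists as serialization-safe.
_SCALARS = (int, float, str, bool, type(None))


def is_plain_primitive(value: Any) -> bool:
    """Return True if value is built only from scalars, dicts, and lists.

    Hypothetical helper for illustration; not part of the Databricks provider.
    """
    if isinstance(value, _SCALARS):
        return True
    if isinstance(value, list):
        return all(is_plain_primitive(item) for item in value)
    if isinstance(value, dict):
        # Requiring string keys is an assumption of this sketch.
        return all(
            isinstance(key, str) and is_plain_primitive(item)
            for key, item in value.items()
        )
    # Callables, strategy objects, and other instances are rejected here.
    return False


ok_args = {"reraise": True}
bad_args = {"retry": lambda retry_state: True}  # stand-in for a custom callable
```

With these definitions, `is_plain_primitive(ok_args)` is true and `is_plain_primitive(bad_args)` is false, which mirrors the supported and unsupported shapes shown in the diff.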
||
|
|
||
| **Supported** (serialization-safe and runtime-valid): | ||
|
|
||
| .. code-block:: python | ||
|
|
||
| # Only plain-primitive Retrying kwarg: reraise | ||
| databricks_retry_args = {"reraise": True} | ||
|
Contributor
This example is serialization-safe but not runtime-valid as documented. When
Drafted-by: Claude Code (Fable 5); reviewed by @moomindani before posting
||
|
|
||
| For controlling attempt count and delay, prefer the dedicated operator | ||
| parameters ``retry_limit`` and ``retry_delay`` rather than | ||
| ``databricks_retry_args``. Custom tenacity strategy objects (``stop``, | ||
| ``wait``, ``retry``, ``before``, ``after``, etc.) require tenacity | ||
| callable objects, which are not serialization-safe in deferrable mode. | ||
|
|
||
| **Not supported** in deferrable mode (will raise ``ValueError`` at task submission): | ||
|
Contributor
"At task submission" overstates the timing. The validation added by #64960 lives only in the trigger constructors (
Drafted-by: Claude Code (Fable 5); reviewed by @moomindani before posting
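
The timing point in the comment above can be illustrated with a small sketch. `HypotheticalTrigger` is a stand-in, not the provider's trigger class, and `json.dumps` stands in for Airflow's serializer (both are assumptions). The sketch shows only that a check placed in a constructor fires when the trigger object is built, and no earlier.

```python
import json
from typing import Any, Optional


class HypotheticalTrigger:
    """Stand-in for a trigger that validates retry_args in its constructor."""

    def __init__(self, run_id: int, retry_args: Optional[dict] = None) -> None:
        if retry_args is not None:
            try:
                # json.dumps is only a stand-in for Airflow's serializer.
                json.dumps(retry_args)
            except TypeError as exc:
                raise ValueError(f"retry_args is not serializable: {exc}") from exc
        self.run_id = run_id
        self.retry_args = retry_args


def build_trigger(retry_args: dict) -> str:
    """Return 'ok', or the name of the error raised while building the trigger."""
    try:
        HypotheticalTrigger(run_id=1, retry_args=retry_args)
    except ValueError as exc:
        return type(exc).__name__
    return "ok"


# Defining the arguments raises nothing; only construction does.
callable_args: dict[str, Any] = {"retry": lambda retry_state: True}
```

In this sketch, merely defining `callable_args` raises nothing; the `ValueError` appears only once `build_trigger(callable_args)` constructs the trigger.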
||
|
|
||
| .. code-block:: python | ||
|
|
||
| from tenacity import stop_after_attempt, wait_incrementing | ||
|
|
||
| # Tenacity strategy objects — NOT serializable | ||
| databricks_retry_args = {"stop": stop_after_attempt(3)} | ||
| databricks_retry_args = {"wait": wait_incrementing(start=30, increment=30)} | ||
|
|
||
| # Arbitrary callables — NOT serializable | ||
| databricks_retry_args = {"retry": my_custom_retry_callable} | ||
|
|
||
| If you need a custom callable retry strategy, use the non-deferrable | ||
| :class:`~airflow.providers.databricks.operators.DatabricksRunNowOperator` (``deferrable=False``). | ||
|
Contributor
The cross-reference path is missing the module segment: the class lives at
Drafted-by: Claude Code (Fable 5); reviewed by @moomindani before posting
||
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -166,3 +166,41 @@ DatabricksSubmitRunDeferrableOperator | |
| Deferrable version of the :class:`~airflow.providers.databricks.operators.DatabricksSubmitRunOperator` operator. | ||
|
|
||
| It allows to utilize Airflow workers more effectively using `new functionality introduced in Airflow 2.2.0 <https://airflow.apache.org/docs/apache-airflow/2.2.0/concepts/deferring.html#triggering-deferral>`_ | ||
|
|
||
| .. _howto/operator:DatabricksSubmitRunDeferrableOperator:retry-args: | ||
|
|
||
| Retry args in deferrable mode | ||
| ----------------------------- | ||
|
|
||
| When ``deferrable=True``, the ``databricks_retry_args`` dictionary is serialized across the | ||
|
Contributor
Same comments as the equivalent section in
Drafted-by: Claude Code (Fable 5); reviewed by @moomindani before posting
||
| trigger boundary and must contain only Airflow-serializable values (plain Python primitives | ||
| such as ``int``, ``float``, ``str``, ``bool``, ``None``, ``dict``, and ``list``). | ||
|
|
||
| **Supported** (serialization-safe and runtime-valid): | ||
|
|
||
| .. code-block:: python | ||
|
|
||
| # Only plain-primitive Retrying kwarg: reraise | ||
| databricks_retry_args = {"reraise": True} | ||
|
|
||
| For controlling attempt count and delay, prefer the dedicated operator | ||
| parameters ``retry_limit`` and ``retry_delay`` rather than | ||
| ``databricks_retry_args``. Custom tenacity strategy objects (``stop``, | ||
| ``wait``, ``retry``, ``before``, ``after``, etc.) require tenacity | ||
| callable objects, which are not serialization-safe in deferrable mode. | ||
|
|
||
| **Not supported** in deferrable mode (will raise ``ValueError`` at task submission): | ||
|
|
||
| .. code-block:: python | ||
|
|
||
| from tenacity import stop_after_attempt, wait_incrementing | ||
|
|
||
| # Tenacity strategy objects — NOT serializable | ||
| databricks_retry_args = {"stop": stop_after_attempt(3)} | ||
| databricks_retry_args = {"wait": wait_incrementing(start=30, increment=30)} | ||
|
|
||
| # Arbitrary callables — NOT serializable | ||
| databricks_retry_args = {"retry": my_custom_retry_callable} | ||
|
|
||
| If you need a custom callable retry strategy, use the non-deferrable | ||
| :class:`~airflow.providers.databricks.operators.DatabricksSubmitRunOperator` (``deferrable=False``). | ||
This file was intentionally deleted and gitignored on `main` by #68801 (it is auto-regenerated by breeze when needed), so this PR re-introduces a file that no longer exists upstream, and the hash edit itself is byte-for-byte identical to the already-merged #68011. It looks like the branch was cut in the window between #67080 (which mistakenly re-added the file) and #68011. Could you rebase onto current `main` and drop this file from the diff? This is the same stale-base issue flagged during the #64960 review round.

Drafted-by: Claude Code (Fable 5); reviewed by @moomindani before posting