
Implement accessors to read dataset events defined as inlet #39367

Merged: uranusjr merged 1 commit into apache:main from astronomer:dataset-inlet-event-access on May 3, 2024

uranusjr (Member) commented May 2, 2024

This is the counterpart to dataset_events, implemented in #38481. The inlet_events context key lets a task access past events associated with a dataset that’s defined in the task’s inlets, like this:

@task(inlets=my_ds)
def my_task(inlet_events):
    last_event_timestamp = inlet_events[my_ds][-1].timestamp

Note that inlets is not logically related to using the dataset to schedule a DAG. An inlet dataset may or may not also be present in the DAG’s schedule. Consequently, events accessed from inlet_events are not filtered in any way; all past events are simply returned through a list-like interface.

This PR implements the basic structure, including a lazy list-like structure that queries the database on demand. I plan to add more changes in future PRs after this is merged (all targeting 2.10):

  • Rename dataset_events to outlet_events for consistency.
  • Allow slicing syntax e.g. inlet_events[ds][:-3].
  • Some refactoring to consolidate the other list-like and lazy database access interfaces we provide elsewhere, most significantly LazyXComAccess.
  • Add documentation on this.
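
For readers unfamiliar with the pattern, here is a minimal sketch of a lazy list-like accessor that only hits the database when an element is requested. It is not the code in this PR: the class name LazyEventList, the dataset_event table layout, and the use of sqlite3 are all assumptions made so the example is self-contained. Like the structure described above, it supports integer indexes (including negative ones) but not slicing.

```python
import sqlite3
from collections.abc import Sequence


class LazyEventList(Sequence):
    """Hypothetical list-like view; each access runs its own query."""

    def __init__(self, conn, dataset_uri):
        self._conn = conn
        self._uri = dataset_uri

    def __len__(self):
        # Count rows on demand instead of loading them.
        row = self._conn.execute(
            "SELECT COUNT(*) FROM dataset_event WHERE uri = ?", (self._uri,)
        ).fetchone()
        return row[0]

    def __getitem__(self, index):
        if not isinstance(index, int):
            raise TypeError("this sketch supports integer indexes only")
        # A negative index reverses the ordering and counts from the end.
        if index >= 0:
            order, offset = "ASC", index
        else:
            order, offset = "DESC", -index - 1
        row = self._conn.execute(
            "SELECT timestamp FROM dataset_event WHERE uri = ? "
            f"ORDER BY timestamp {order}, id {order} LIMIT 1 OFFSET ?",
            (self._uri, offset),
        ).fetchone()
        if row is None:
            raise IndexError(index)
        return row[0]


# Assumed in-memory table standing in for the real event storage.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE dataset_event (id INTEGER PRIMARY KEY, uri TEXT, timestamp TEXT)"
)
conn.executemany(
    "INSERT INTO dataset_event (uri, timestamp) VALUES (?, ?)",
    [
        ("s3://bucket/my_ds", "2024-05-01T00:00:00"),
        ("s3://bucket/other", "2024-05-02T00:00:00"),
        ("s3://bucket/my_ds", "2024-05-02T12:00:00"),
        ("s3://bucket/my_ds", "2024-05-03T00:00:00"),
    ],
)

events = LazyEventList(conn, "s3://bucket/my_ds")
last_event_timestamp = events[-1]  # one query, fetching one row
```

Subclassing Sequence supplies iteration and membership tests on top of the two methods defined here, at the cost of one query per element; a real implementation would likely batch those reads.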

uranusjr requested review from XD-DENG, ashb, kaxil and potiuk as code owners May 2, 2024 10:23
uranusjr force-pushed the dataset-inlet-event-access branch from 9fd41f0 to 12be387 May 2, 2024 10:39
uranusjr force-pushed the dataset-inlet-event-access branch from 12be387 to 0a33bbb May 3, 2024 03:07
eladkal added this to the Airflow 2.10.0 milestone May 3, 2024
uranusjr merged commit 2294001 into apache:main May 3, 2024
uranusjr deleted the dataset-inlet-event-access branch May 3, 2024 13:42
utkarsharma2 added the type:new-feature (Changelog: New Features) label Jun 3, 2024