Number of queries to Airflow database in "DAG File Processing Stats" #40282

@MaksYermak

Description

This feature adds a new column to the "DAG File Processing Stats" table in the DAG processor logs. The column reports the number of queries made to the Airflow database while parsing each DAG file.
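To illustrate the idea, a per-file query count could be gathered by hooking statement execution while a file is parsed. The sketch below is not Airflow's implementation: it uses the standard-library `sqlite3` trace callback on an in-memory database as a stand-in for the Airflow metadata database, and `count_queries`, the `variable` table, the file names, and the two parse functions are hypothetical names chosen for the example.

```python
import sqlite3
from contextlib import contextmanager


@contextmanager
def count_queries(conn):
    """Count SQL statements executed on `conn` inside the with-block."""
    counter = {"queries": 0}

    def on_statement(sql):
        # Called by sqlite3 for each statement the backend executes.
        counter["queries"] += 1

    conn.set_trace_callback(on_statement)
    try:
        yield counter
    finally:
        conn.set_trace_callback(None)  # stop counting


# In-memory database standing in for the Airflow metadata database (assumption).
# isolation_level=None keeps sqlite3 from issuing implicit BEGIN statements.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute("CREATE TABLE variable (key TEXT PRIMARY KEY, val TEXT)")
conn.execute("INSERT INTO variable VALUES ('env', 'prod')")


def parse_file_with_top_level_queries():
    # Simulates a DAG file that reads a variable three times at import time.
    for _ in range(3):
        conn.execute("SELECT val FROM variable WHERE key = 'env'").fetchone()


def parse_file_without_queries():
    # Simulates a DAG file with no database access at import time.
    pass


stats = {}
for file_name, parse in [
    ("dag_with_queries.py", parse_file_with_top_level_queries),
    ("dag_clean.py", parse_file_without_queries),
]:
    with count_queries(conn) as counter:
        parse()
    stats[file_name] = counter["queries"]

# Print a minimal two-column table: file name and query count.
for file_name, num_queries in stats.items():
    print(f"{file_name:<24}{num_queries}")
```

In a real DAG processor the hook would more likely sit on the database engine that Airflow already uses, but the shape is the same: start counting before a file is parsed, stop afterwards, and report the count next to the file's other stats.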

Use case/motivation

This column would help when debugging high load on the Airflow database. A typical scenario is a DAG file that issues many database queries in top-level code; those queries run every time the file is parsed. One common example is excessive use of "Variable.get" as a top-level statement in DAG files.
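The difference between a top-level lookup and a deferred one can be shown with a small simulation. This is a sketch under stated assumptions: `FakeVariable` is a hypothetical stand-in for Airflow's `Variable` that counts lookups instead of querying a database, and `parse` imitates DAG file parsing by executing the file's source with `exec`.

```python
class FakeVariable:
    """Hypothetical stand-in for Airflow's Variable; counts lookups, no real DB."""

    query_count = 0
    _store = {"env": "prod"}

    @classmethod
    def get(cls, key):
        cls.query_count += 1  # each lookup would be one database query
        return cls._store[key]


# DAG file with a lookup in top-level code: it runs on every parse.
TOP_LEVEL_DAG_FILE = '''
env = Variable.get("env")

def task():
    return env
'''

# DAG file with the lookup inside a function: it runs only when task() is called.
DEFERRED_DAG_FILE = '''
def task():
    return Variable.get("env")
'''


def parse(source):
    """Execute DAG file source as an import would; return the lookups it made."""
    before = FakeVariable.query_count
    namespace = {"Variable": FakeVariable}
    exec(source, namespace)
    return FakeVariable.query_count - before


# Parse each file three times, as a DAG processor loop would do repeatedly.
top_level_counts = [parse(TOP_LEVEL_DAG_FILE) for _ in range(3)]
deferred_counts = [parse(DEFERRED_DAG_FILE) for _ in range(3)]

print(top_level_counts)  # [1, 1, 1]
print(deferred_counts)   # [0, 0, 0]
```

The top-level version costs one lookup per parse no matter whether any task runs, which is the kind of load the proposed column is meant to make visible.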

Having the number of queries to the Airflow database per DAG file would help considerably when debugging high database load or slow DAG file parsing.

Related issues

Thread with discussion in the Airflow community: https://lists.apache.org/thread/9j6q2lq521rt5zx46l2dvow2c85sgqwb

Are you willing to submit a PR?

  • Yes I am willing to submit a PR!

Code of Conduct
