Description
This feature adds a new column to the "DAG File Processing Stats" table in the DAG processor logs. The column reports the number of queries made to the Airflow database for each DAG file.
Use case/motivation
The new column would be convenient when debugging issues related to high load on the Airflow database. A typical scenario is a DAG file that runs many database queries in its top-level code, so those queries are executed every time the file is parsed. One common example is excessive use of "Variable.get" as a top-level statement in DAG files.
Knowing the number of queries to the Airflow database per DAG file would help a lot when debugging high database load or slow parsing of DAG files.
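To make the idea concrete, here is a minimal sketch of counting queries issued while a file's top-level code runs. It uses the standard-library sqlite3 module as a stand-in for the Airflow metadata database. The names CountingConnection, get_variable, parse_dag_file and the "variable" table are hypothetical and only illustrate the metric; they are not Airflow APIs and not how the DAG processor would necessarily implement it.

```python
import sqlite3


class CountingConnection(sqlite3.Connection):
    """Hypothetical connection that counts statements sent through execute()."""

    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self.query_count = 0

    def execute(self, sql, parameters=()):
        self.query_count += 1
        return super().execute(sql, parameters)


def get_variable(conn, key):
    """Hypothetical stand-in for a top-level Variable.get call: one query per call."""
    row = conn.execute("SELECT val FROM variable WHERE key = ?", (key,)).fetchone()
    return row[0] if row else None


def parse_dag_file(conn, source):
    """Run a file's top-level code and return how many queries it issued."""
    before = conn.query_count
    namespace = {"get_variable": lambda key: get_variable(conn, key)}
    exec(compile(source, "<dag_file>", "exec"), namespace)
    return conn.query_count - before


# Set up an in-memory database standing in for the Airflow database (assumption).
conn = sqlite3.connect(":memory:", factory=CountingConnection)
conn.execute("CREATE TABLE variable (key TEXT PRIMARY KEY, val TEXT)")
conn.execute("INSERT INTO variable VALUES ('env', 'prod'), ('team', 'data')")

# Top-level lookups like these run on every parse of the file.
DAG_SOURCE = """
env = get_variable("env")
team = get_variable("team")
missing = get_variable("missing")
"""

queries = parse_dag_file(conn, DAG_SOURCE)
print(f"queries issued during parsing: {queries}")
```

In this sketch each top-level lookup costs one query per parse, so a per-file count like this would point directly at the files worth refactoring.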
Related issues
Thread with discussion in the Airflow community: https://lists.apache.org/thread/9j6q2lq521rt5zx46l2dvow2c85sgqwb
Are you willing to submit a PR?
Code of Conduct