Fix access_control syncing; faster cli sync-perm #15293
This consolidates the cli sync-perm command and the syncing that happens by default during webserver startup. Prior to this change, `access_control` wasn't supported in the default sync, and the cli sync-perm command was slow when there were many DAGs.
    .all()
)
dagbag = DagBag(read_dags_from_db=True)
dagbag.collect_dags_from_db()
Oof, this will load all the serialized DAGs here, though, and then the webserver starts again with an incremental DagBag in airflow/airflow/www/extensions/init_dagbag.py (lines 24 to 32 in 9dd14aa).
The serialized DAG will be loaded when required by airflow/airflow/models/dagbag.py (lines 184 to 200 in 9dd14aa).
Let's say someone changes a DAG file to modify `access_control` and the parsing process writes the updated serialized_dag. The next lookup will hit the code block referenced above, and if 10 seconds have passed (`AIRFLOW__CORE__MIN_SERIALIZED_DAG_FETCH_INTERVAL`, https://airflow.apache.org/docs/apache-airflow/stable/configurations-ref.html#min-serialized-dag-fetch-interval), it will re-fetch and update the DAG. I think we should refresh the permissions for that DAG in that nested if statement.
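To make the suggestion concrete, here is a minimal, self-contained sketch of the interval-gated re-fetch with a permission refresh placed in the nested `if`. It is not the Airflow code: `FakeDag`, `SerializedRow`, `GatedDagCache`, and the `sync_perm` callback are hypothetical stand-ins, the "database" is a plain dict, and the 10-second default only mirrors the interval mentioned above.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Callable, Dict, Optional, Set


@dataclass
class FakeDag:
    """Hypothetical stand-in for a deserialized DAG (not an Airflow class)."""
    dag_id: str
    access_control: Optional[Dict[str, Set[str]]] = None


@dataclass
class SerializedRow:
    """Hypothetical stand-in for one row of the serialized DAG table."""
    dag: FakeDag
    last_updated: datetime


class GatedDagCache:
    """Toy cache: re-fetch only if the interval elapsed AND the row changed."""

    def __init__(
        self,
        db: Dict[str, SerializedRow],
        sync_perm: Callable[[FakeDag], None],
        min_fetch_interval: timedelta = timedelta(seconds=10),
    ) -> None:
        self.db = db
        self.sync_perm = sync_perm
        self.min_fetch_interval = min_fetch_interval
        self.dags: Dict[str, FakeDag] = {}
        self.last_fetched: Dict[str, datetime] = {}

    def _load(self, dag_id: str, now: datetime) -> None:
        # Copy the current row into the cache and remember when we did so.
        self.dags[dag_id] = self.db[dag_id].dag
        self.last_fetched[dag_id] = now

    def get_dag(self, dag_id: str, now: datetime) -> FakeDag:
        if dag_id not in self.dags:
            # First access: load lazily, no permission refresh in this sketch.
            self._load(dag_id, now)
            return self.dags[dag_id]
        if now > self.last_fetched[dag_id] + self.min_fetch_interval:
            row = self.db[dag_id]
            if row.last_updated > self.last_fetched[dag_id]:
                self._load(dag_id, now)
                # Proposed addition: refresh permissions for the DAG that
                # was just re-fetched.
                self.sync_perm(self.dags[dag_id])
        return self.dags[dag_id]
```

In this model, a change written 5 seconds after the first load stays invisible until a lookup happens more than 10 seconds after that load; that lookup both updates the cached DAG and triggers exactly one permission sync.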
We could add `if dag.access_control: sync_dag_perm....` in airflow/airflow/models/dagbag.py (lines 231 to 245 in 9dd14aa).
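As a rough illustration of that guard, the sketch below syncs DAG-level permissions at load time only when `access_control` is set. Everything here is an assumption made for illustration: `ToyPermissionStore`, `perms_for_dag`, and `add_dag_from_db` are hypothetical helpers rather than Airflow's implementation, and the `DAG:<dag_id>` resource name is an assumed naming scheme.

```python
from typing import Dict, Optional, Set, Tuple

AccessControl = Dict[str, Set[str]]   # role name -> set of action names
Perm = Tuple[str, str, str]           # (role, action, resource)


def perms_for_dag(dag_id: str, access_control: AccessControl) -> Set[Perm]:
    """Flatten {role: {action, ...}} into (role, action, resource) triples."""
    resource = f"DAG:{dag_id}"  # assumed resource naming, for illustration
    return {
        (role, action, resource)
        for role, actions in access_control.items()
        for action in actions
    }


class ToyPermissionStore:
    """Hypothetical in-memory stand-in for the permission tables."""

    def __init__(self) -> None:
        self.by_dag: Dict[str, Set[Perm]] = {}
        self.sync_calls = 0

    def sync_perm_for_dag(self, dag_id: str, access_control: AccessControl) -> None:
        # Replace whatever was stored for this DAG with the new triples.
        self.sync_calls += 1
        self.by_dag[dag_id] = perms_for_dag(dag_id, access_control)


def add_dag_from_db(
    dags: Dict[str, Optional[AccessControl]],
    store: ToyPermissionStore,
    dag_id: str,
    access_control: Optional[AccessControl],
) -> None:
    """Register a DAG; sync permissions only if it declares access_control."""
    dags[dag_id] = access_control
    if access_control:
        store.sync_perm_for_dag(dag_id, access_control)
```

With this shape, DAGs that declare no `access_control` cost nothing extra at load time, which is the point of putting the check in front of the sync call.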
If we do that, we could just remove the syncing of DAG-level permissions from the sync-perm command.
I'm going to merge this with PR #15311.
With ~5k simple DAGs, this makes sync-perm faster (~24s -> ~10s). It does make the webserver startup slower (due to loading DagBag from the db vs. just querying `DagModel`), but it also syncs `access_control` where it wasn't before.