KubernetesExecutor: self.completed adoption set is never drained by ihorlukianov · Pull Request #68674 · apache/airflow

ihorlukianov · 2026-06-17T13:27:01Z

Solves #68683

Airflow version observed: 3.2.1

KubernetesExecutor.sync() re-runs _change_state() over the entire self.completed set, and nothing ever removes entries from that set.
The iteration over self.completed is also nested inside the result-queue while True loop, so it repeats for every result drained from the queue.

With delete_worker_pods=False, every adopted completed pod is re-PATCHed indefinitely, and the set grows monotonically over the scheduler's lifetime because nothing is ever removed.
With delete_worker_pods=True, the same happens with pod deletion: each adopted pod is deleted again and again.

The same pod name is deleted many times within seconds:

2026-06-11T18:49:36.108968Z  Deleting pod trigger-test-dags-trigger-zwyr4icn ...
2026-06-11T18:49:36.228925Z  Deleting pod trigger-test-dags-trigger-zwyr4icn ...
2026-06-11T18:49:36.397320Z  Deleting pod trigger-test-dags-trigger-zwyr4icn ...
... (166 total for this one pod)

This starves the scheduler loop: with ~1,855 scheduled TIs waiting in my case, most of each loop is spent re-deleting finished pods instead of launching new ones.

Expected behaviour

Each finished worker pod should be patched or deleted once. Subsequent scheduler loops should not re-issue calls for pods already processed.

Reproduce

Deploy Airflow 3.2.x with KubernetesExecutor.
Trigger DAGs with many mapped tasks (e.g. 100+ mapped trigger tasks) so pods complete in quick succession.

Inspect scheduler logs:

kubectl logs <scheduler-pod> -c scheduler | grep "Deleting pod" \
  | sed 's/.*Deleting pod \([^ ]*\) in.*/\1/' | sort | uniq -c | sort -rn | head

Observe the same pod names with delete counts >> 1.
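
For logs that have already been saved to a file, the same per-pod count can be sketched in Python. This is only an illustration: the regular expression assumes the "Deleting pod <name> in ..." shape that the sed expression above matches, the sample lines are synthetic, and count_deletes is a hypothetical helper name.

```python
import re
from collections import Counter

# Assumed log shape: "... Deleting pod <name> in ..."; adjust the pattern to your logs.
DELETE_RE = re.compile(r"Deleting pod (\S+) in")


def count_deletes(lines):
    """Return a Counter mapping pod name -> number of delete log lines."""
    counts = Counter()
    for line in lines:
        match = DELETE_RE.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts


# Synthetic sample lines, not taken from a real scheduler log.
sample = [
    "2026-06-11T18:49:36.108968Z  Deleting pod pod-a in namespace default",
    "2026-06-11T18:49:36.228925Z  Deleting pod pod-a in namespace default",
    "2026-06-11T18:49:36.397320Z  Deleting pod pod-b in namespace default",
    "an unrelated log line",
]
delete_counts = count_deletes(sample)
for pod, n in delete_counts.most_common(10):
    print(n, pod)
```

Any pod whose count is greater than 1 was deleted more than once.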

RCA

Introduced/regressed around #55797 (Oct 2025), which added a self.completed set for orphaned completed pod adoption.
In KubernetesExecutor.sync() (providers/cncf/kubernetes/src/airflow/providers/cncf/kubernetes/executors/kubernetes_executor.py):

self.completed is processed inside the while True loop that drains result_queue. For every completion event, _change_state() is called again for every entry in self.completed, and each call triggers delete_pod():

while True:
    results = self.result_queue.get_nowait()
    ...
    self._change_state(results)

    for result in self.completed:   # <-- nested inside while True
        self._change_state(result)

self.completed is never cleared after processing, so entries accumulate and are re-processed on every subsequent completion event.

Expected delete volume ≈ num_result_queue_events × (1 + len(self.completed))
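
To make the multiplication concrete, here is a toy model of the nested loop. It is a sketch, not the executor code: buggy_sync and the string payloads are stand-ins, and change_state only records that it was called.

```python
from queue import Empty, Queue


def buggy_sync(result_queue, completed, change_state):
    """Toy model of the nested loop: replays `completed` for every queued result."""
    try:
        while True:
            result = result_queue.get_nowait()
            change_state(result)
            for adopted in completed:  # nested: runs once per drained result
                change_state(adopted)
    except Empty:
        pass  # queue fully drained


calls = []
q = Queue()
for i in range(5):
    q.put(f"result-{i}")
completed = {f"adopted-{i}" for i in range(3)}

buggy_sync(q, completed, calls.append)
print(len(calls))  # 5 * (1 + 3) = 20
```

The adopted set still holds all 3 entries afterwards, so the next sync() starts from the same backlog.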

Proposed fix

Move for result in self.completed outside the result-queue drain loop (once per sync()).
Clear or discard from self.completed after successful processing.
Deduplicate by pod_name when adopting completed pods.
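
A minimal sketch of these three changes together, under the same toy assumptions (fixed_sync is a hypothetical name, change_state only records calls, and this is not the merged implementation):

```python
from queue import Empty, Queue


def fixed_sync(result_queue, completed, change_state):
    """Drain the queue, then process each adopted entry once; return entries to retry."""
    try:
        while True:
            change_state(result_queue.get_nowait())
    except Empty:
        pass  # queue fully drained

    still_pending = {}
    for pod_name, result in completed.items():  # once per sync(), outside the drain loop
        try:
            change_state(result)
        except Exception:
            still_pending[pod_name] = result  # keep only entries that failed
    return still_pending


calls = []
q = Queue()
for i in range(5):
    q.put(f"result-{i}")

# Keying by pod name deduplicates repeated adoption of the same pod.
completed = {}
for pod_name in ["pod-a", "pod-b", "pod-a", "pod-c"]:
    completed[pod_name] = f"adopted-{pod_name}"

completed = fixed_sync(q, completed, calls.append)
print(len(calls), len(completed))  # 5 + 3 = 8 calls, 0 entries left
```

Returning only the failed entries means a transient API error leaves that pod queued for the next sync() instead of dropping it.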

Expected impact

Before: delete_calls ≈ num_result_events × (1 + len(completed)) per sync().
After: delete_calls ≈ num_result_events + len(completed) per sync(), with completed cleared after processing.

Was generative AI tooling used to co-author this PR?

Yes
Generated-by: Cursor Composer, following the guidelines

FrankYang0529

Left some minor comments. You can link this PR to issue #68683.

FrankYang0529 · 2026-06-18T13:03:04Z

@ihorlukianov Could you check the CI error? Thank you.

diff --git a/providers/cncf/kubernetes/tests/unit/cncf/kubernetes/executors/test_kubernetes_executor.py b/providers/cncf/kubernetes/tests/unit/cncf/kubernetes/executors/test_kubernetes_executor.py
index 8c7aad5..c82b97a 100644
--- a/providers/cncf/kubernetes/tests/unit/cncf/kubernetes/executors/test_kubernetes_executor.py
+++ b/providers/cncf/kubernetes/tests/unit/cncf/kubernetes/executors/test_kubernetes_executor.py
@@ -1561,12 +1561,8 @@ class TestKubernetesExecutor:
                     None,
                 )
             }
-            executor.result_queue.put(
-                KubernetesResults(queue_key, None, "queue-pod", "default", "2", None)
-            )
-            executor.result_queue.put(
-                KubernetesResults(queue_key, None, "queue-pod-2", "default", "3", None)
-            )
+            executor.result_queue.put(KubernetesResults(queue_key, None, "queue-pod", "default", "2", None))
+            executor.result_queue.put(KubernetesResults(queue_key, None, "queue-pod-2", "default", "3", None))

seanmuth · 2026-06-29T19:55:27Z

Independently validated this fix on a live Astro KubernetesExecutor deployment (Airflow 3.2.2, astronomer-kubernetes-executor 10.18.0 — which vendors the cncf executor; its sync() / self.completed path is byte-identical to cncf-kubernetes 10.18.1).

Method

self.completed is only populated by _adopt_completed_pods, which adopts status.phase=Succeeded pods that aren't yet marked done and belong to a dead scheduler (the selector excludes the current scheduler and alive siblings). A single steady scheduler never adopts its own completing pods, so the bug doesn't surface under normal operation — it needs orphaned Succeeded-not-done pods.
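
The adoption condition described above can be modelled as a simple predicate. This is a hedged sketch only: PodInfo, its fields, and should_adopt are hypothetical simplifications, whereas the real executor expresses the filter through Kubernetes field and label selectors.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PodInfo:
    """Hypothetical, simplified view of a worker pod (not the Kubernetes client model)."""

    name: str
    phase: str
    owner_scheduler: str
    marked_done: bool


def should_adopt(pod, alive_schedulers):
    """Adopt only Succeeded, not-yet-done pods whose owning scheduler is no longer alive."""
    return (
        pod.phase == "Succeeded"
        and not pod.marked_done
        and pod.owner_scheduler not in alive_schedulers
    )


alive = {"scheduler-2"}  # the current scheduler and any live siblings
pods = [
    PodInfo("own-pod", "Succeeded", "scheduler-2", False),  # owned by a live scheduler
    PodInfo("orphan-pod", "Succeeded", "scheduler-1", False),  # orphaned, eligible
    PodInfo("done-pod", "Succeeded", "scheduler-1", True),  # already marked done
    PodInfo("running-pod", "Running", "scheduler-1", False),  # not finished yet
]
adopted = [p.name for p in pods if should_adopt(p, alive)]
print(adopted)
```

Only the orphaned, Succeeded, not-yet-done pod passes the filter, which is why a single steady scheduler never feeds its own pods into self.completed.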

To trigger it deterministically:

AIRFLOW__KUBERNETES_EXECUTOR__DELETE_WORKER_PODS=False so completed pods linger as Succeeded.
Trigger a DAG with 100 mapped tasks (one worker pod each).
~45s in, restart the scheduler mid-burst. The restart gap leaves a pile of Succeeded-not-done pods that the new scheduler adopts into self.completed.

Measured the number of "Patched pod <key> ... to mark it as done" log occurrences per pod (map_index) in the scheduler logs.

Results (identical trigger, only the executor code differs)

| Run | Adoption fired | Pods | Total patch calls | Max per pod | Avg per pod | Run outcome |
|---|---|---|---|---|---|---|
| Unpatched (10.18.0) | yes (10 adopt attempts) | 100 | 300 | 21 | 3.0 | failed |
| Patched (this PR) | yes (24 adopt attempts) | 100 | 124 | 2 | 1.24 | success |

Without the fix, a single adopted pod was re-patched up to 21 times and the count was still climbing (self.completed grows unbounded → scheduler-loop starvation). With the fix, despite the patched run actually adopting more pods (24 vs 10 adoption attempts), re-processing is bounded to ~once per pod (max 2, avg 1.24) because self.completed is drained after each entry is handled. The patched run also recovered through the scheduler restart and completed successfully, whereas the unpatched run failed.

LGTM — the self.completed dict + drain (self.completed = still_pending) behaves as intended on a real deployment.

Drafted-by: Claude Code (Opus 4.8); reviewed by @seanmuth before posting

Fix Kubernetes Executor pods deletion storm

3df6127

ihorlukianov requested review from hussein-awala, jedcunningham and jscheffl as code owners June 17, 2026 13:27

boring-cyborg Bot added area:providers provider:cncf-kubernetes Kubernetes (k8s) provider related issues labels Jun 17, 2026

FrankYang0529 reviewed Jun 18, 2026

Comment thread ...iders/cncf/kubernetes/src/airflow/providers/cncf/kubernetes/executors/kubernetes_executor.py Outdated

Comment thread providers/cncf/kubernetes/tests/unit/cncf/kubernetes/executors/test_kubernetes_executor.py

ihorlukianov mentioned this pull request Jun 18, 2026

KubernetesExecutor: self.completed adoption set is never drained, completed pods re-PATCHed (patch_pod_executor_done) every sync() loop #68683

Open

2 tasks

Used dict for better performance; Added UT for delete_worker_pods=False

a61ec7a

ihorlukianov changed the title ~~KubernetesExecutor repeatedly deletes the same finished worker pods every scheduler loop~~ KubernetesExecutor: self.completed adoption set is never drained Jun 18, 2026

Fix formatting

7f94c04

kaxil approved these changes Jun 30, 2026

kaxil merged commit 25c0b3f into apache:main Jun 30, 2026
102 checks passed

seanmuth mentioned this pull request Jun 30, 2026

Requeue KubernetesExecutor tasks whose pod failed before execution started #69058

Open

1 task

styndall mentioned this pull request Jul 1, 2026

Fix repeated deletion attempts from adopted completed pods in kubernetes executor #68360

Closed

1 task

kaxil merged 3 commits into apache:main from ihorlukianov:main-pod-deletion-fix
