
Fix ScheduledTaskExecutor deadlock when TrySetResult runs continuations inline #2953

Merged

martincostello merged 3 commits into App-vNext:main from crnhrv:fix-executor-deadlock on Mar 3, 2026
Conversation

@crnhrv
Contributor

@crnhrv crnhrv commented Mar 3, 2026

Pull Request

The issue or feature being addressed

Fixes #2948: ScheduledTaskExecutor deadlock when TrySetResult runs continuations inline.

Details on the issue fix or feature implementation

ScheduledTaskExecutor.ScheduleTask was creating its TaskCompletionSource<object> without TaskCreationOptions.RunContinuationsAsynchronously. This meant that when StartProcessingAsync called TrySetResult, any continuation awaiting the returned task could run inline on the executor's single processing thread. If that continuation blocked (e.g. by holding a lock and making a synchronous Polly call, or by waiting for a second scheduled task), the executor thread would stall and deadlock.
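The failure mode can be sketched as follows. This is an illustrative consumer, not code from the PR; `ScheduleTask`, `DoWork`, and `DoMoreWork` stand in for the internal executor API and caller code:

```csharp
// Illustrative only: before the fix, TrySetResult on a default
// TaskCompletionSource could execute this continuation inline on the
// executor's single processing thread.
Task first = ScheduleTask(() => DoWork());

first.ContinueWith(_ =>
{
    // If this runs on the executor thread, waiting here for a second
    // scheduled task blocks the very thread that must complete it.
    ScheduleTask(() => DoMoreWork()).Wait(); // deadlock before the fix
});
```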

To fix this, we pass TaskCreationOptions.RunContinuationsAsynchronously to the TaskCompletionSource constructor, ensuring continuations are always dispatched to the thread pool rather than running inline on the executor thread.
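The change itself amounts to a single constructor argument; a simplified before/after sketch (not the exact Polly source):

```csharp
// Before: continuations awaiting tcs.Task may run inline on whichever
// thread calls TrySetResult (here, the executor's processing thread).
var before = new TaskCompletionSource<object>();

// After: continuations are queued to the thread pool instead of
// running inline, so the executor thread can keep processing tasks.
var after = new TaskCompletionSource<object>(
    TaskCreationOptions.RunContinuationsAsynchronously);
```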

A regression test is included that forces a continuation to run synchronously via TaskContinuationOptions.ExecuteSynchronously and has it block while waiting for a second scheduled task. Without the fix, the test deadlocks and is cancelled after 250ms.
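A hypothetical shape for such a test (the executor API and names here are simplified, not the actual test code):

```csharp
// Hypothetical regression-test shape; names are illustrative.
var gate = new ManualResetEventSlim();

// ExecuteSynchronously forces the continuation onto whichever thread
// completes the first task: the executor thread, before the fix.
var continuation = ScheduleTask(() => { }).ContinueWith(
    _ => gate.Wait(),
    TaskContinuationOptions.ExecuteSynchronously);

// The second scheduled task would release the gate, but without the fix
// the executor thread is stuck in gate.Wait() and never runs it.
ScheduleTask(() => gate.Set());

continuation.Wait(TimeSpan.FromMilliseconds(250)).ShouldBeTrue();
```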

Confirm the following

  • I started this PR by branching from the head of the default branch
  • I have targeted the PR to merge into the default branch
  • I have included unit tests for the issue/feature
  • I have successfully run a local build

@crnhrv crnhrv changed the title Fix executor deadlock Fix ScheduledTaskExecutor deadlock when TrySetResult runs continuations inline Mar 3, 2026
@crnhrv
Contributor Author

crnhrv commented Mar 3, 2026

@crnhrv please read the following Contributor License Agreement (CLA). If you agree with the CLA, please reply with the following information.

@dotnet-policy-service agree [company="{your company}"]

Options:

  • (default - no company specified) I have sole ownership of intellectual property rights to my Submissions and I am not making Submissions in the course of work for my employer.
@dotnet-policy-service agree
  • (when company given) I am making Submissions in the course of work for my employer (or my employer has intellectual property rights in my Submissions by contract or applicable law). I have permission from my employer to make Submissions and enter into this Agreement on behalf of my employer. By signing below, the defined term “You” includes me and my employer.
@dotnet-policy-service agree company="Microsoft"


@dotnet-policy-service agree company="Redgate Software Ltd"

@martincostello martincostello added this to the v8.6.6 milestone Mar 3, 2026
@martincostello martincostello enabled auto-merge (squash) March 3, 2026 18:23
@martincostello martincostello merged commit 016dd90 into App-vNext:main Mar 3, 2026
25 checks passed
@codecov

codecov bot commented Mar 3, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 96.15%. Comparing base (779aa83) to head (7e93baf).
⚠️ Report is 1 commit behind head on main.

Additional details and impacted files
@@           Coverage Diff           @@
##             main    #2953   +/-   ##
=======================================
  Coverage   96.15%   96.15%           
=======================================
  Files         309      309           
  Lines        7128     7128           
  Branches     1005     1005           
=======================================
  Hits         6854     6854           
  Misses        221      221           
  Partials       53       53           
Flag Coverage Δ
linux 96.15% <100.00%> (ø)
macos 96.15% <100.00%> (ø)
windows 96.14% <100.00%> (ø)

Flags with carried forward coverage won't be shown. Click here to find out more.


@crnhrv crnhrv deleted the fix-executor-deadlock branch March 3, 2026 19:02
@martincostello
Member

@crnhrv In the process of releasing this change, one of the new tests failed here:

continuationTask.Wait(timeout).ShouldBeTrue();

Does this indicate an issue with the fix, or is it just that the timeout is too small for CI?

@crnhrv
Contributor Author

crnhrv commented Mar 4, 2026

@martincostello Hmm, the tests were passing on my machine in under 10ms, so I thought 250ms would be a safe margin for CI, but that's the only explanation I can think of, since the behaviour should otherwise be deterministic as far as I can tell. I notice it failed on tests targeting .NET Framework v4.8.1; I haven't checked how long the tests take on other SDK versions, but maybe that one is slower. Increasing the timeout to whatever you're comfortable having a unit test take on failure would be safer.

@github-actions
Contributor

github-actions bot commented Mar 4, 2026

Thanks for your contribution @crnhrv - the changes from this pull request have been published as part of version 8.6.6 📦, which is now available from NuGet.org 🚀

@martincostello
Member

There was also a failure here for net10.0: logs

If you're happy that it's just a test flake, then I'll do a PR to increase the value.

martincostello added a commit that referenced this pull request Mar 4, 2026
@crnhrv
Contributor Author

crnhrv commented Mar 4, 2026

I've been running the tests to failure locally in ~10 concurrent sessions for the past 5 minutes or so and haven't hit a failure, so I can't imagine it's flaky for any reason other than the timeout being too low.



Development

Successfully merging this pull request may close these issues.

[Bug]: ScheduledTaskExecutor deadlock when TrySetResult runs continuations inline
