Job Monitor: cancel in-flight Helix jobs on external cancellation (CTRL+C / SIGTERM)#16877

Open
Copilot wants to merge 3 commits into main from copilot/handle-job-cancellation

Contributor

Copilot AI commented May 23, 2026

When AzDO times out a pipeline job, it sends CTRL+C (Windows) or SIGTERM (Linux) to the process. Program.cs previously called runner.RunAsync(), which only set up an internal timeout-based CancellationTokenSource (CTS). External signals were never intercepted, so the process was killed before the existing CancelInFlightHelixJobsAsync logic could run.

Changes

  • Program.cs: Register signal handlers that cancel a shared CancellationTokenSource:
    • Console.CancelKeyPress (CTRL+C, all platforms) — sets e.Cancel = true to suppress immediate termination
    • PosixSignalRegistration.Create(PosixSignal.SIGTERM, ...) (Unix only) — sets ctx.Cancel = true to suppress default termination
    • Link the signal CTS with a timeout CTS (MaximumWaitMinutes) into a single token passed to runner.RunAsync(linkedCts.Token)
  • JobMonitorRunner.cs: Remove the now-unused RunAsync() parameterless overload (timeout is owned by Program.cs)
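
For readers following along, the wiring described in the list above could be sketched roughly as follows. This is not the code from the PR: the class name `SignalCancellationSketch`, the helper names, and the `run` delegate standing in for `runner.RunAsync(linkedCts.Token)` are all assumptions made for illustration. Only `Console.CancelKeyPress`, `PosixSignalRegistration.Create`, and `CancellationTokenSource.CreateLinkedTokenSource` are real .NET APIs.

```csharp
#nullable enable
using System;
using System.Runtime.InteropServices;
using System.Threading;
using System.Threading.Tasks;

// Illustrative only: names below are hypothetical, not taken from the PR.
public static class SignalCancellationSketch
{
    // Wires CTRL+C (all platforms) and SIGTERM (non-Windows) to signalCts.
    // Returns the SIGTERM registration so the caller keeps it alive; null on Windows.
    public static IDisposable? RegisterSignalHandlers(CancellationTokenSource signalCts)
    {
        Console.CancelKeyPress += (_, e) =>
        {
            e.Cancel = true;   // suppress immediate termination
            signalCts.Cancel();
        };

        if (OperatingSystem.IsWindows())
        {
            return null;
        }

        return PosixSignalRegistration.Create(PosixSignal.SIGTERM, ctx =>
        {
            ctx.Cancel = true; // suppress default termination
            signalCts.Cancel();
        });
    }

    // Combines the signal-driven source and the timeout source into one source.
    public static CancellationTokenSource LinkWithTimeout(
        CancellationTokenSource signalCts,
        CancellationTokenSource timeoutCts)
        => CancellationTokenSource.CreateLinkedTokenSource(signalCts.Token, timeoutCts.Token);

    // 'run' stands in for runner.RunAsync(token); maximumWaitMinutes mirrors MaximumWaitMinutes.
    public static async Task<int> RunWithSignalsAsync(
        Func<CancellationToken, Task<int>> run,
        int maximumWaitMinutes)
    {
        // Deliberately not disposed: the CancelKeyPress handler may still fire during shutdown.
        var signalCts = new CancellationTokenSource();
        using var registration = RegisterSignalHandlers(signalCts);
        using var timeoutCts = new CancellationTokenSource(TimeSpan.FromMinutes(maximumWaitMinutes));
        using var linkedCts = LinkWithTimeout(signalCts, timeoutCts);
        return await run(linkedCts.Token);
    }
}
```

The point of linking is that the runner only has to observe one token: it is cancelled when either the signal source or the timeout source is cancelled, so a single cancellation path covers both cases.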

The existing catch (OperationCanceledException) handler in RunCoreAsync — which already calls CancelInFlightHelixJobsAsync with a 30-second bounded window — now fires correctly for both external signals and internal timeouts.
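
The shape of that handler is not shown in this conversation, so the following is only a guess at the general pattern, assuming the cleanup runs under its own fresh time-limited token (the run token is already cancelled by the time the catch block executes and cannot drive the cleanup). The class name, the `monitor` and `cancelInFlight` delegates, and the exit codes are hypothetical stand-ins for RunCoreAsync and CancelInFlightHelixJobsAsync.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

// Illustrative only: names, delegates, and exit codes are assumptions, not the PR's code.
public static class BoundedCleanupSketch
{
    public static async Task<int> RunAsync(
        Func<CancellationToken, Task> monitor,        // stand-in for the monitoring loop
        Func<CancellationToken, Task> cancelInFlight, // stand-in for CancelInFlightHelixJobsAsync
        TimeSpan cleanupWindow,                       // e.g. TimeSpan.FromSeconds(30)
        CancellationToken cancellationToken)
    {
        try
        {
            await monitor(cancellationToken);
            return 0;
        }
        catch (OperationCanceledException) when (cancellationToken.IsCancellationRequested)
        {
            // A separate source bounds the cleanup, since the run token is already cancelled.
            using var cleanupCts = new CancellationTokenSource(cleanupWindow);
            try
            {
                await cancelInFlight(cleanupCts.Token);
            }
            catch (OperationCanceledException)
            {
                // Cleanup window elapsed; give up and let the process exit.
            }
            return 1;
        }
    }
}
```

Because the handler keys off the token rather than the cause, the same path runs whether cancellation came from CTRL+C, SIGTERM, or the timeout.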

To double check:

Copilot AI requested review from Copilot and removed request for Copilot May 23, 2026 01:43
…bs on CTRL+C/SIGTERM

When AzDO times out a pipeline job, it sends CTRL+C (Windows) or SIGTERM (Linux) to the
process. The previous Program.cs called runner.RunAsync(), which only handled an internal
timeout and ignored external cancellation signals.

This change:
- Registers Console.CancelKeyPress handler for CTRL+C on all platforms
- Registers PosixSignalRegistration for SIGTERM on non-Windows platforms
- Creates a linked CancellationTokenSource combining signals + timeout
- Calls runner.RunAsync(linkedCts.Token) so the existing cancellation handler
  (which calls CancelInFlightHelixJobsAsync) is properly triggered
- Removes the now-unused RunAsync() parameterless overload from JobMonitorRunner

Agent-Logs-Url: https://github.com/dotnet/arcade/sessions/223cb56d-44f3-45d4-a0f0-13dc2b244f5a

Co-authored-by: mmitche <8725170+mmitche@users.noreply.github.com>
Copilot AI requested review from Copilot and removed request for Copilot May 23, 2026 01:59
Copilot AI changed the title from "[WIP] Fix job monitor to handle cancellation of in-progress jobs" to "Job Monitor: cancel in-flight Helix jobs on external cancellation (CTRL+C / SIGTERM)" May 23, 2026
Copilot AI requested a review from mmitche May 23, 2026 01:59
premun previously approved these changes May 25, 2026
@premun premun marked this pull request as ready for review May 25, 2026 08:40
Copilot AI review requested due to automatic review settings May 25, 2026 08:40
Contributor

Copilot AI left a comment

Pull request overview

This PR updates the Helix Job Monitor tool so that external pipeline cancellations (CTRL+C on Windows, SIGTERM on Unix) trigger a graceful shutdown path that can cancel in-flight Helix jobs before the process exits.

Changes:

  • Register CTRL+C and SIGTERM handlers in Program.cs, and link a signal-driven CancellationTokenSource with the existing MaximumWaitMinutes timeout into a single token passed to the runner.
  • Remove the parameterless JobMonitorRunner.RunAsync() overload now that timeout ownership moves to Program.cs.

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 1 comment.

| File | Description |
| --- | --- |
| src/Microsoft.DotNet.Helix/JobMonitor/Program.cs | Adds external signal handling and links it with the existing timeout to drive graceful cancellation. |
| src/Microsoft.DotNet.Helix/JobMonitor/JobMonitorRunner.cs | Removes the unused parameterless RunAsync() overload. |

Comment thread src/Microsoft.DotNet.Helix/JobMonitor/Program.cs Outdated
Copilot finished work on behalf of premun May 25, 2026 08:47
Copilot AI requested a review from premun May 25, 2026 08:47
Development

Successfully merging this pull request may close these issues.

Job Monitor should handle cancellation. In flight Helix jobs should be cancelled

4 participants