Skip to content

Fix scheduler status handling#5978

Merged
t-kikuc merged 2 commits into
masterfrom
fix-scheduler-status-handling
Jul 2, 2025
Merged

Fix scheduler status handling#5978
t-kikuc merged 2 commits into
masterfrom
fix-scheduler-status-handling

Conversation

@t-kikuc
Copy link
Copy Markdown
Member

@t-kikuc t-kikuc commented Jun 30, 2025

What this PR does:

[1] Use defer to ensure reporting the status even if the stage is cancelled
[2] Handle Skipped/Exited before executeStage when the stage was already completed by a previous scheduler.

Why we need it:

[1] Without this PR, the following problems will occur:

A. The stage is marked as 'running' forever. (it should be 'cancelled')

image

B. The stage is marked as 'failure' even if I cancelled. (it should be 'cancelled')

image

[2] To simplify status handling in executeStage.

Which issue(s) this PR fixes:

Part of #5259

Does this PR introduce a user-facing change?:

  • How are users affected by this change:
  • Is this breaking change:
  • How to migrate (if breaking change):

t-kikuc added 2 commits June 27, 2025 14:48
Signed-off-by: t-kikuc <tkikuchi07f@gmail.com>
Signed-off-by: t-kikuc <tkikuchi07f@gmail.com>
handler.Cancel()
finalStatus := s.executeStage(sig, s.deployment.Stages[0])
assert.Equal(t, model.StageStatus_STAGE_FAILURE, finalStatus)
assert.Equal(t, model.StageStatus_STAGE_CANCELLED, finalStatus)
Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

due to this change:

if err := s.reportStageStatus(ctx, ps.Id, model.StageStatus_STAGE_RUNNING, ps.Requires); err != nil {
			if sig.Signal() == StopSignalCancel {
				return model.StageStatus_STAGE_CANCELLED
			}
			return model.StageStatus_STAGE_FAILURE
		}

Comment on lines +305 to +313
if ps.Status == model.StageStatus_STAGE_SKIPPED {
statusReason = fmt.Sprintf("Stage %s has been already skipped", ps.Id)
continue
}
if ps.Status == model.StageStatus_STAGE_EXITED {
deploymentStatus = model.DeploymentStatus_DEPLOYMENT_SUCCESS
statusReason = fmt.Sprintf("Deployment was exited before stage %s", ps.Id)
break
}
Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These are [2].
[1] is not related to here.

@codecov
Copy link
Copy Markdown

codecov Bot commented Jun 30, 2025

Codecov Report

Attention: Patch coverage is 33.33333% with 10 lines in your changes missing coverage. Please review.

Project coverage is 28.16%. Comparing base (fdb9066) to head (a305f6b).
Report is 11 commits behind head on master.

Files with missing lines Patch % Lines
pkg/app/pipedv1/controller/scheduler.go 33.33% 9 Missing and 1 partial ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##           master    #5978      +/-   ##
==========================================
- Coverage   28.16%   28.16%   -0.01%     
==========================================
  Files         516      516              
  Lines       55895    55904       +9     
==========================================
+ Hits        15743    15745       +2     
- Misses      38900    38906       +6     
- Partials     1252     1253       +1     
Flag Coverage Δ
. 23.22% <33.33%> (-0.01%) ⬇️
.-pkg-app-pipedv1-plugin-kubernetes 66.42% <ø> (ø)
.-pkg-app-pipedv1-plugin-kubernetes_multicluster 67.51% <ø> (ø)
.-pkg-app-pipedv1-plugin-wait 35.51% <ø> (ø)
.-pkg-plugin-sdk 50.89% <ø> (ø)
.-tool-actions-gh-release 19.23% <ø> (ø)
.-tool-actions-plan-preview 25.30% <ø> (ø)
.-tool-codegen-protoc-gen-auth 0.00% <ø> (ø)

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

Comment on lines +542 to +544
if sig.Signal() == StopSignalCancel {
return model.StageStatus_STAGE_CANCELLED
}
Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This solves [1]-B.

Comment on lines +492 to +498
defer func() {
// Ensure reporting the status even if the stage is cancelled.
if err := s.reportStageStatus(context.Background(), ps.Id, finalStatus, ps.Requires); err != nil {
s.logger.Error("failed to report stage status", zap.Error(err))
}
}()

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The cause of [1]-A was that reportStageStatus was not called when the stage is cancelled and returns some status before reportStageStatus at the bottom of this method.

Copy link
Copy Markdown
Member

@Warashi Warashi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

Copy link
Copy Markdown
Member

@khanhtc1202 khanhtc1202 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, nice catch 👍

@t-kikuc t-kikuc merged commit eaa49c2 into master Jul 2, 2025
38 checks passed
@t-kikuc t-kikuc deleted the fix-scheduler-status-handling branch July 2, 2025 07:39
@github-actions github-actions Bot mentioned this pull request Jul 14, 2025
@github-actions github-actions Bot mentioned this pull request Aug 15, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants