Feature: Plugin for podman support #1024
WillButAgain wants to merge 1 commit into Project-MONAI:develop from
Walkthrough
This pull request introduces Podman plugin support for the Workflow Manager, enabling task execution via Podman containers. It includes a complete plugin implementation with container creation, lifecycle monitoring, artifact management, and comprehensive unit tests.
Sequence Diagram(s)
sequenceDiagram
participant Client as Workflow Client
participant Plugin as PodmanPlugin
participant Factory as IPodmanClientFactory
participant Creator as IPodmanContainerCreator
participant Docker as Docker API
participant Storage as Storage Service
Client->>Plugin: ExecuteTask(TaskDispatchEvent)
Plugin->>Storage: Download input artifacts
Storage-->>Plugin: Artifacts ready
Plugin->>Factory: CreateClient(podmanEndpoint)
Factory->>Docker: Create DockerClient
Docker-->>Factory: Client instance
Plugin->>Creator: CreateContainerAsync(spec)
Creator->>Docker: POST /libpod/containers/create
Docker-->>Creator: ContainerId
Plugin->>Docker: StartContainer(containerId)
Docker-->>Plugin: Container started
Plugin-->>Client: ExecutionStatus.Accepted
sequenceDiagram
participant Monitor as ContainerStatusMonitor
participant Docker as Docker API
participant Storage as Storage Service
participant Broker as Message Broker
participant FileSystem as FileSystem
Monitor->>Docker: Poll InspectContainer()
loop Until container exits or timeout
Docker-->>Monitor: Container state
Monitor->>Monitor: Check if completed
end
Monitor->>FileSystem: Read output artifacts
FileSystem-->>Monitor: File list
Monitor->>Storage: Upload each artifact
Storage-->>Monitor: Upload confirmed
Monitor->>Broker: Publish TaskCallbackEvent
Broker-->>Monitor: Published
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~60 minutes
🚥 Pre-merge checks | ✅ 1 | ❌ 2
❌ Failed checks (1 warning, 1 inconclusive)
✅ Passed checks (1 passed)
Signed-off-by: will tepe <will.tepe@cchmc.org>
7273860 to f7c40bc
Actionable comments posted: 11
🧹 Nitpick comments (7)
src/TaskManager/Plug-ins/Podman/IPodmanContainerCreator.cs (1)
68-68: Consider documenting or making the Podman API version configurable.
The API version v4.0.0 is hardcoded. Consider adding a comment noting the minimum supported Podman version, or making this configurable for environments with different Podman versions.
src/TaskManager/Plug-ins/Podman/Keys.cs (1)
21-24: Update documentation comments to reference Podman instead of Docker.
Several XML comments reference "Docker", but this is the Podman plugin:
- Line 22: "Docker server" → "Podman server"
- Line 37: "docker run" → "podman run"
- Line 82: "Docker workflow" → "Podman workflow"
📝 Proposed documentation fixes
 /// <summary>
- /// Key for the endpoint where the Docker server is running.
+ /// Key for the endpoint where the Podman server is running.
 /// </summary>
 public static readonly string BaseUrl = "server_url";

 /// <summary>
- /// Key for specifying the user to the container. Same as -u argument for docker run.
+ /// Key for specifying the user to the container. Same as -u argument for podman run.
 /// </summary>
 public static readonly string User = "user";

 /// <summary>
- /// Required arguments to run the Docker workflow.
+ /// Required arguments to run the Podman workflow.
 /// </summary>
Also applies to: 36-39, 82-83
src/TaskManager/Plug-ins/Podman/Logging/Log.cs (1)
41-48: Consider renaming Docker* methods to Podman* for consistency.
Methods like DockerInputMapped, DockerOutputMapped, DockerEnvironmentVariableAdded, and DockerIntermediateVolumeMapped use the "Docker" prefix in a Podman plugin. While functionally correct, this may cause confusion during maintenance.
Also applies to: 65-66
src/TaskManager/Plug-ins/Podman/ContainerVolumeMount.cs (1)
17-17: Unused import: Ardalis.GuardClauses is not used.
The code uses ArgumentNullException.ThrowIfNull and ThrowIfNullOrWhiteSpace instead of Ardalis guard clauses.
🧹 Proposed fix
-using Ardalis.GuardClauses;
-
 namespace Monai.Deploy.WorkflowManager.TaskManager.Podman
tests/UnitTests/TaskManager.Podman.Tests/PodmanPluginTest.cs (2)
195-240: This test doesn't observe the background failure it describes.
Start throws from a fire-and-forget task, so the exception is off the awaited code path and these assertions still pass even if Start is never invoked. Add a Verify or TaskCompletionSource so the monitor invocation is observed deterministically.
376-495: Add a non-zero exited/dead regression here.
The GetStatus suite covers paused, OOM-killed, and successful exit, but not Status=exited|dead with a non-zero exit code. That missing case is exactly what lets the current implementation report failed containers as Succeeded.
tests/UnitTests/TaskManager.Podman.Tests/ContainerStatusMonitorTest.cs (1)
99-125: Add a transient-failure regression for the completion path.
These tests only exercise the no-artifact and all-success cases. A case where PutObjectAsync or Publish fails after the container reaches exited would catch duplicate uploads/callbacks and callbacks sent with missing artifacts.
Also applies to: 127-153
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@src/TaskManager/Plug-ins/Podman/ContainerStatusMonitor.cs`:
- Around line 91-107: The polling loop currently calls UploadOutputArtifacts and
SendCallbackMessage inside the try so any exception re-enters the retry loop and
can cause duplicate side effects; change the logic in the method that uses
dockerClient.Containers.InspectContainerAsync and IsContainerCompleted so that
when IsContainerCompleted(response.State) is true you set a local flag (e.g.,
completed = true) and break out of the polling loop (do not call
UploadOutputArtifacts or SendCallbackMessage there), then after the loop
(outside the retry/inspect try/catch/finally) invoke
UploadOutputArtifacts(intermediateVolumeMount, outputVolumeMounts,
cancellationToken) and SendCallbackMessage(taskDispatchEvent, containerId) once,
each wrapped with their own appropriate try/catch/error logging so failures do
not cause the poll loop to re-run and produce duplicates.
- Around line 175-177: The catch block in ContainerStatusMonitor that calls
_logger.ErrorUploadingFile(file, ex) is swallowing upload failures so Start can
still publish callbacks with missing artifacts; after logging, rethrow or
propagate the exception (e.g., throw; or throw a new exception that wraps ex) so
the failure prevents Start from publishing the callback, or alternatively set
and check a failure flag that prevents PublishCallback from running — update the
catch that contains _logger.ErrorUploadingFile(file, ex) to either rethrow the
exception or mark the upload as failed and ensure Start checks that flag before
publishing.
- Around line 164-165: The object key assembly is unsafe: using
file.Replace(artifactsPath, ...) and _fileSystem.Path.Combine produces wrong
results (prefix collisions, rooted paths dropping the prefix, and OS-specific
backslashes). Change the logic that sets objectName (the code referencing file,
artifactsPath and destination.RelativeRootPath and calling
_fileSystem.Path.Combine) to first ensure you only strip a leading artifactsPath
prefix (e.g., check file.StartsWith(artifactsPath) and remove exactly that
prefix) then normalize the remaining segment by trimming any leading slashes,
and finally join destination.RelativeRootPath and the normalized segment using
explicit forward-slash concatenation (not Path.Combine) so the resulting object
key always uses '/' separators; apply the same fix to the Docker variant where
the same pattern appears.
In `@src/TaskManager/Plug-ins/Podman/Keys.cs`:
- Around line 62-64: Fix the typo in the XML comment above the
EnvironmentVariableKeyPrefix field in Keys.cs: change "envrionment" to
"environment" so the summary reads "Prefix for environment variables." and
ensure the XML comment remains well-formed for the public static readonly string
EnvironmentVariableKeyPrefix.
In `@src/TaskManager/Plug-ins/Podman/Logging/Log.cs`:
- Around line 41-48: Fix the typos in the LoggerMessage Message strings: update
DockerInputMapped and DockerOutputMapped messages to use "host={hostPath}"
instead of "host=={hostPath}", change DockerEnvironmentVariableAdded message to
"Environment variable added {key}={value}." instead of "variabled", and find any
other logger messages around the same area (the two messages referenced at the
end of the comment) that use "retreiving" and correct them to "retrieving";
ensure each LoggerMessage attribute's Message text is corrected for these
spelling and punctuation issues.
- Around line 116-117: The ErrorSettingDirectoryPermission logger method should
accept an Exception so stack traces are captured: change the partial method
signature from ErrorSettingDirectoryPermission(this ILogger logger, string path,
string user) to include an Exception parameter (e.g.,
ErrorSettingDirectoryPermission(this ILogger logger, Exception exception, string
path, string user)) and update any call sites to pass the caught exception; keep
the LoggerMessage attribute as-is (the exception is provided as a method
parameter and will be logged by the generated source) and mirror the parameter
ordering used by other methods like ErrorGeneratingContainerSpecification and
ErrorDeployingContainer for consistency.
- Around line 65-66: The LoggerMessage attribute on the
DockerIntermediateVolumeMapped partial method has a typo in EventId (10014) that
breaks the project's EventId sequence; update the attribute to use EventId =
1014 on the DockerIntermediateVolumeMapped method in Log.cs and ensure it
matches the surrounding 1000-1031 range (and confirm no other log entry already
uses 1014 to avoid duplicates).
In `@src/TaskManager/Plug-ins/Podman/PodmanPlugin.cs`:
- Around line 366-372: The loop in PodmanPlugin (iterating
Event.TaskPluginArguments and using Keys.EnvironmentVariableKeyPrefix) currently
logs raw env var values via _logger.DockerEnvironmentVariableAdded(envVarKey,
Event.TaskPluginArguments[key]) which can leak secrets; change the logging to
only record the env var name or a redacted placeholder (e.g. "***REDACTED***")
when calling _logger.DockerEnvironmentVariableAdded, while still assigning the
real value into envvars[envVarKey]; if needed, update or add an overload on
_logger.DockerEnvironmentVariableAdded to accept a single parameter (key) to
avoid passing the secret value.
- Around line 245-270: The current logic in PodmanPlugin.cs uses
ContainerStatusMonitor.IsContainerCompleted(response.State) to mark a task as
Succeeded regardless of exit code; update the branch in the method that builds
ExecutionStatus so it only returns Status = TaskExecutionStatus.Succeeded when
the container is terminal AND response.State.ExitCode == 0; otherwise treat
terminal non-zero exit codes (or response.State.Dead/response.State.OOMKilled)
as failures—use FailureReason.ExternalServiceError for OOMKilled/Dead and
FailureReason.Unknown (or a more specific reason if available) for non-zero exit
codes, and populate Errors with the exit code and state (reference
response.State, ContainerStatusMonitor.IsContainerCompleted, and the
ExecutionStatus construction sites).
- Around line 452-454: The code incorrectly casts the result of
storageService.GetObjectAsync(input.Bucket, obj.FilePath, cancellationToken) to
MemoryStream and calls WriteTo, which can NRE for non-MemoryStream types;
instead assign the returned Stream to a variable (e.g., stream = await
storageService.GetObjectAsync(... ) ?? throw new
InvalidOperationException(...)), remove the MemoryStream cast, and replace
stream.WriteTo(fileStream) with await stream.CopyToAsync(fileStream,
cancellationToken) so any Stream implementation is supported; keep using the
using declarations for stream and fileStream.
- Around line 193-197: The monitor is being fire-and-forget via Task.Run calling
IContainerStatusMonitor.Start(containerId, ...) with the caller's
cancellationToken, so the monitor can be cancelled mid-upload and its exceptions
go unobserved; change the call in PodmanPlugin.cs to start the monitor with a
dedicated token (e.g., new CancellationTokenSource or CancellationToken.None)
instead of the caller's cancellationToken so artifact uploads/callbacks can
finish, capture the returned Task (from Task.Run or Start if it returns a Task)
and observe exceptions by either awaiting it where appropriate or attaching a
ContinueWith/try/catch to log faults (so exceptions from monitor.Start are not
lost), and ensure any intended cooperative cancellation from the caller is
translated into a graceful stop request rather than directly canceling the
upload (use a linked CTS if you need to forward cancellation requests while
allowing finalization).
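The exit-code item for lines 245-270 above can be illustrated with a minimal, language-neutral Python sketch (not the plugin's C# code). The names `map_state`, `Result`, and `TERMINAL_STATES` are hypothetical, the status and reason strings only mirror the enum names mentioned in this review, and returning `None` for a container that has not finished is an assumption about how the caller would handle that case.

```python
from typing import NamedTuple, Optional

# Assumption: these are the container states treated as terminal.
TERMINAL_STATES = {"exited", "dead"}


class Result(NamedTuple):
    status: str                    # mirrors TaskExecutionStatus names
    failure_reason: Optional[str]  # mirrors FailureReason names
    errors: str


def map_state(status: str, exit_code: int,
              oom_killed: bool = False, dead: bool = False) -> Optional[Result]:
    """Map an inspected container state to a task result.

    Returns None while the container has not reached a terminal state.
    """
    detail = f"Exit code={exit_code}. Status={status}."
    # OOM-killed or dead containers are failures regardless of exit code.
    if oom_killed or dead or status == "dead":
        return Result("Failed", "ExternalServiceError", detail)
    if status not in TERMINAL_STATES:
        return None
    # Only a terminal container with exit code 0 counts as success.
    if exit_code == 0:
        return Result("Succeeded", None, "")
    return Result("Failed", "Unknown", detail)
```

The sketch follows the FailureReason.Unknown suggestion for non-zero exit codes; if the plugin settles on a different reason, only the last line changes.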
---
Nitpick comments:
In `@src/TaskManager/Plug-ins/Podman/ContainerVolumeMount.cs`:
- Line 17: Remove the unused Ardalis.GuardClauses using directive from
ContainerVolumeMount.cs: delete the line "using Ardalis.GuardClauses;" and
ensure no other code in the ContainerVolumeMount class or related methods (e.g.,
constructors or validation helpers) reference Ardalis guard methods; rely on the
existing ArgumentNullException.ThrowIfNull/ThrowIfNullOrWhiteSpace calls as
currently implemented.
In `@src/TaskManager/Plug-ins/Podman/IPodmanContainerCreator.cs`:
- Line 68: The hardcoded Podman API version in the POST URL inside the container
creation call should be made configurable or documented; update the code in the
method that calls httpClient.PostAsync (in IPodmanContainerCreator
implementation where "/v4.0.0/libpod/containers/create" is used) to pull the API
prefix from a configurable setting (e.g., constructor-injected podmanApiVersion
or options) and fall back to a sensible default, or add an inline comment
stating the minimum supported Podman version if you choose not to make it
configurable; ensure the identifier (podmanApiVersion / PodmanOptions) is used
to build the request path consistently across methods that call the Podman HTTP
API.
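As a rough illustration of the configurable-version idea, here is a small Python sketch (not the plugin's C# code). `containers_create_path` and `DEFAULT_PODMAN_API_VERSION` are hypothetical names, and normalizing a missing "v" prefix is an assumption rather than something the review asks for.

```python
from typing import Optional

# Default taken from the hardcoded path quoted in the review.
DEFAULT_PODMAN_API_VERSION = "v4.0.0"


def containers_create_path(api_version: Optional[str] = None) -> str:
    """Build the libpod container-create path from a configurable version."""
    # Fall back to the default when the setting is missing or blank.
    version = (api_version or "").strip().strip("/") or DEFAULT_PODMAN_API_VERSION
    # Assumption: accept "5.0.0" as well as "v5.0.0".
    if not version.startswith("v"):
        version = "v" + version
    return f"/{version}/libpod/containers/create"
```

Building every request path through one helper like this keeps the version consistent across calls, which is the point of the last sentence of the prompt.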
In `@src/TaskManager/Plug-ins/Podman/Keys.cs`:
- Around line 21-24: The XML documentation in the Podman plugin contains
Docker-specific wording; update the comments in the Keys class (e.g., the
BaseUrl field and other XML comments around lines referencing "docker run" and
"Docker workflow") to say "Podman" instead—replace "Docker server" with "Podman
server", "docker run" with "podman run", and "Docker workflow" with "Podman
workflow" so the comments accurately describe the Podman plugin and its keys.
In `@src/TaskManager/Plug-ins/Podman/Logging/Log.cs`:
- Around line 41-48: Rename the logging methods to use the Podman prefix instead
of Docker to match the plugin context: change DockerInputMapped to
PodmanInputMapped, DockerOutputMapped to PodmanOutputMapped,
DockerEnvironmentVariableAdded to PodmanEnvironmentVariableAdded, and
DockerIntermediateVolumeMapped to PodmanIntermediateVolumeMapped (keep the
LoggerMessage attributes and EventId/Message text unless message wording must
change). Update all call sites and any unit tests or references to these symbols
to the new method names so compilation succeeds.
In `@tests/UnitTests/TaskManager.Podman.Tests/ContainerStatusMonitorTest.cs`:
- Around line 99-125: Add a new unit test covering a transient-failure
regression for the completion path: simulate ContainerStatusMonitor.Start
receiving an exited container (use the same ContainerInspectResponse setup) but
make IStorageService.PutObjectAsync and/or
IMessageBrokerPublisherService.Publish throw on the first call and succeed on a
retry, then assert that artifacts are uploaded exactly once and the final
callback is published exactly once; reference the existing test
Start_WhenCalledWithoutAnyArtifacts_ExpectToSendCallbackEvent for structure and
the methods PutObjectAsync and Publish to locate the mocks and the Start method
invocation on ContainerStatusMonitor for where to introduce the
failure-and-retry behavior.
In `@tests/UnitTests/TaskManager.Podman.Tests/PodmanPluginTest.cs`:
- Around line 195-240: The test
ExerciseTask_WhenFailedToMonitorContainer_ExpectTaskToBeAccepted does not
observe the background exception because ContainerStatusMonitor.Start is invoked
fire-and-forget; update the test to deterministically observe the monitor call
by either (a) replacing the current
_containerStatusMonitor.Setup(...).ThrowsAsync(...) with a setup that signals a
TaskCompletionSource when Start is invoked and then throws inside a Task
returned from Start so the test can await the TCS and assert the invocation, or
(b) add a Verify on _containerStatusMonitor.Verify(m => m.Start(...),
Times.Once()) after ExecuteTask to ensure Start was called; reference the mocked
member _containerStatusMonitor and the Start method and coordinate with a
TaskCompletionSource in the test to ensure the background failure is observed
before asserting result.
- Around line 376-495: Add a new unit test in the PodmanPlugin GetStatus suite
that simulates a container with State.Status set to exited (or dead) and a
non-zero ExitCode to cover the regression; create a test (e.g.,
GetStatus_WhenContainerIsExitedWithNonZeroExitCode_ExpectFailureStatus) that
setups _podmanClient.Containers.InspectContainerAsync to return
ContainerInspectResponse with State.Status = Strings.DockerStatusExited (or
"dead") and ExitCode = 100, instantiate PodmanPlugin and call
GetStatus("identity", ...), then assert result.Status ==
TaskExecutionStatus.Failed, result.FailureReason ==
FailureReason.ExternalServiceError (or the same code path used for non-zero
exits in PodmanPlugin), result.Errors contains "Exit code=100", and verify
InspectContainerAsync was called for "identity".
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: d05a00ca-9536-4f3d-baa6-1c8453407771
📒 Files selected for processing (22)
src/Common/Miscellaneous/ValidationConstants.cs
src/Monai.Deploy.WorkflowManager.sln
src/TaskManager/Plug-ins/Podman/AssemblyInfo.cs
src/TaskManager/Plug-ins/Podman/ContainerMonitorException.cs
src/TaskManager/Plug-ins/Podman/ContainerStatusMonitor.cs
src/TaskManager/Plug-ins/Podman/ContainerVolumeMount.cs
src/TaskManager/Plug-ins/Podman/IPodmanClientFactory.cs
src/TaskManager/Plug-ins/Podman/IPodmanContainerCreator.cs
src/TaskManager/Plug-ins/Podman/Keys.cs
src/TaskManager/Plug-ins/Podman/Logging/Log.cs
src/TaskManager/Plug-ins/Podman/Monai.Deploy.WorkflowManager.TaskManager.Podman.csproj
src/TaskManager/Plug-ins/Podman/PodmanPlugin.cs
src/TaskManager/Plug-ins/Podman/SetPermissionException.cs
src/TaskManager/Plug-ins/Podman/Strings.cs
src/TaskManager/TaskManager/Extensions/TaskManagerExtensions.cs
src/TaskManager/TaskManager/Monai.Deploy.WorkflowManager.TaskManager.csproj
src/TaskManager/TaskManager/PluginStrings.cs
src/TaskManager/TaskManager/packages.lock.json
tests/UnitTests/TaskManager.Podman.Tests/ContainerStatusMonitorTest.cs
tests/UnitTests/TaskManager.Podman.Tests/Monai.Deploy.WorkflowManager.TaskManager.Podman.Tests.csproj
tests/UnitTests/TaskManager.Podman.Tests/PodmanPluginTest.cs
tests/UnitTests/TaskManager.Podman.Tests/Usings.cs
    var response = await dockerClient.Containers.InspectContainerAsync(containerId, cancellationToken).ConfigureAwait(false);

    if (IsContainerCompleted(response.State))
    {
        await UploadOutputArtifacts(intermediateVolumeMount, outputVolumeMounts, cancellationToken).ConfigureAwait(false);
        await SendCallbackMessage(taskDispatchEvent, containerId).ConfigureAwait(false);
        return;
    }
}
catch (Exception ex)
{
    _logger.ErrorMonitoringContainerStatus(containerId, ex);
}
finally
{
    await Task.Delay(pollingPeriod, cancellationToken).ConfigureAwait(false);
}
Separate polling from completion side effects.
Once the container is complete, any exception from UploadOutputArtifacts or SendCallbackMessage falls into the outer retry loop and the next poll re-enters the completion path. That can re-upload artifacts and publish duplicate callbacks until the timeout elapses.
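The separation described here can be sketched in a few lines of Python (not the plugin's C# code). The `monitor` function and its `inspect`, `upload`, and `publish` callables are hypothetical stand-ins for the inspect call, UploadOutputArtifacts, and SendCallbackMessage. The `max_polls` bound stands in for the timeout, and letting side-effect failures propagate to the caller is one possible policy, not the only one.

```python
import time
from typing import Callable


def monitor(inspect: Callable[[], str],
            upload: Callable[[], None],
            publish: Callable[[], None],
            poll_seconds: float = 0.0,
            max_polls: int = 10) -> bool:
    """Poll until the container is terminal, then run side effects once."""
    completed = False
    for _ in range(max_polls):
        try:
            # Only the inspect call is retried inside the loop.
            if inspect() in ("exited", "dead"):
                completed = True
                break
        except Exception as ex:
            print(f"inspect failed, will retry: {ex}")
        time.sleep(poll_seconds)

    if not completed:
        return False

    # Side effects run outside the retry loop, so a failure here can never
    # re-enter the completion path and produce duplicates.
    upload()
    publish()
    return True
```

With this shape, an upload error surfaces once to the caller and the callback is not published, which also matches the later comment about not swallowing failed uploads.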
var objectName = file.Replace(artifactsPath, string.Empty).TrimStart('/');
objectName = _fileSystem.Path.Combine(destination.RelativeRootPath, objectName);
🧩 Analysis chain
🏁 Script executed:
# First, let's locate and examine the file
find . -type f -name "ContainerStatusMonitor.cs"
Repository: Project-MONAI/monai-deploy-workflow-manager
Length of output: 206
🏁 Script executed:
# Read the relevant section with more context
cat -n "src/TaskManager/Plug-ins/Podman/ContainerStatusMonitor.cs" | sed -n '150,180p'
Repository: Project-MONAI/monai-deploy-workflow-manager
Length of output: 1768
🏁 Script executed:
# Search for how objectName is used after these lines
rg -A 10 -B 5 "objectName" "src/TaskManager/Plug-ins/Podman/ContainerStatusMonitor.cs" | head -60
Repository: Project-MONAI/monai-deploy-workflow-manager
Length of output: 1306
🏁 Script executed:
# Check what artifactsPath is and how it's defined
rg "artifactsPath" "src/TaskManager/Plug-ins/Podman/ContainerStatusMonitor.cs" -B 2 -A 2
Repository: Project-MONAI/monai-deploy-workflow-manager
Length of output: 1236
🏁 Script executed:
# Check what type of storage service is being used
rg -B 5 -A 5 "PutObjectAsync" "src/TaskManager/Plug-ins/Podman/ContainerStatusMonitor.cs"
Repository: Project-MONAI/monai-deploy-workflow-manager
Length of output: 725
🏁 Script executed:
# Check the IStorageService interface to understand object storage expectations
find . -type f -name "*.cs" | xargs rg -l "interface IStorageService"
Repository: Project-MONAI/monai-deploy-workflow-manager
Length of output: 71
🏁 Script executed:
# Look at the destination object to understand RelativeRootPath
rg "RelativeRootPath" --type cs -B 2 -A 2 | head -40
Repository: Project-MONAI/monai-deploy-workflow-manager
Length of output: 3643
🏁 Script executed:
# Check if there's any documentation or tests about objectName formatting
rg "objectName" --type cs -B 2 -A 2 | grep -E "(objectName|storage|S3|blob)" | head -30
Repository: Project-MONAI/monai-deploy-workflow-manager
Length of output: 6206
🏁 Script executed:
# Verify the prefix-safe concern by checking edge cases
python3 << 'EOF'
# Test the current problematic approach
artifactsPath1 = "/home/user/artifacts"
file1 = "/home/user/artifacts2/file.txt"
result1 = file1.replace(artifactsPath1, "").lstrip('/')
print(f"Case 1 - Path that starts with same prefix:")
print(f" artifactsPath: {artifactsPath1}")
print(f" file: {file1}")
print(f" Result: {result1}")
print(f" Expected: 2/file.txt (WRONG!)")
print()
# Correct approach with GetRelativePath-like logic
import os
file2 = "/home/user/artifacts/subdir/file.txt"
artifactsPath2 = "/home/user/artifacts"
result2 = os.path.relpath(file2, artifactsPath2)
print(f"Case 2 - Proper relative path:")
print(f" artifactsPath: {artifactsPath2}")
print(f" file: {file2}")
print(f" Result: {result2}")
EOF
Repository: Project-MONAI/monai-deploy-workflow-manager
Length of output: 401
🏁 Script executed:
# Let's verify Path.Combine behavior on Windows with leading slashes
python3 << 'EOF'
# Simulate Path.Combine behavior on Windows
# On Windows, Path.Combine drops the second path if it starts with /
import ntpath
import posixpath
# Simulate Windows Path.Combine behavior
relativeRootPath = "minio/workflow"
objectName = "/artifacts/file.txt" # Leading slash from TrimStart('/')
# Windows behavior - rooted paths reset the path
windows_result = ntpath.join(relativeRootPath, objectName)
print(f"Windows Path.Combine simulation:")
print(f" RelativeRootPath: {relativeRootPath}")
print(f" objectName: {objectName}")
print(f" Result: {windows_result}")
print(f" Expected: minio/workflow/artifacts/file.txt")
print()
# Also test with backslashes in objectName
objectName2 = "artifacts\\file.txt"
windows_result2 = ntpath.join(relativeRootPath, objectName2)
print(f"With backslashes:")
print(f" Result: {windows_result2}")
EOF
Repository: Project-MONAI/monai-deploy-workflow-manager
Length of output: 327
Use forward slash string concatenation for object storage keys instead of Path.Combine.
The current implementation has three critical issues:
- Prefix-safety: file.Replace(artifactsPath, ...) is unsafe because it also matches paths that merely share the prefix. For example, with artifactsPath /home/user/artifacts, the file /home/user/artifacts2/file.txt is reduced to 2/file.txt.
- Leading slash behavior on Windows: Path.Combine treats rooted paths (those starting with /) as absolute, dropping the RelativeRootPath entirely. For example, Path.Combine("minio/workflow", "/artifacts/file.txt") returns /artifacts/file.txt instead of the intended minio/workflow/artifacts/file.txt.
- OS-specific separators in object keys: On Windows, Path.Combine produces backslashes, but object storage systems (MinIO/S3) expect forward slashes only.
This issue exists in both the Podman and Docker variants.
🛠️ Suggested fix
- var objectName = file.Replace(artifactsPath, string.Empty).TrimStart('/');
- objectName = _fileSystem.Path.Combine(destination.RelativeRootPath, objectName);
+ var relativePath = _fileSystem.Path.GetRelativePath(artifactsPath, file)
+ .Replace('\\', '/')
+ .TrimStart('/');
+ var objectName = string.IsNullOrWhiteSpace(destination.RelativeRootPath)
+ ? relativePath
+ : $"{destination.RelativeRootPath.TrimEnd('/')}/{relativePath}";🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@src/TaskManager/Plug-ins/Podman/ContainerStatusMonitor.cs` around lines 164 -
165, The object key assembly is unsafe: using file.Replace(artifactsPath, ...)
and _fileSystem.Path.Combine produces wrong results (prefix collisions, rooted
paths dropping the prefix, and OS-specific backslashes). Change the logic that
sets objectName (the code referencing file, artifactsPath and
destination.RelativeRootPath and calling _fileSystem.Path.Combine) to first
ensure you only strip a leading artifactsPath prefix (e.g., check
file.StartsWith(artifactsPath) and remove exactly that prefix) then normalize
the remaining segment by trimming any leading slashes, and finally join
destination.RelativeRootPath and the normalized segment using explicit
forward-slash concatenation (not Path.Combine) so the resulting object key
always uses '/' separators; apply the same fix to the Docker variant where the
same pattern appears.
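In the same style as the verification scripts above, the key-assembly rule from this comment can be sketched in Python. This is only an illustration of the logic, not the plug-in's code: `build_object_key` and its parameters are hypothetical names, and `posixpath`/`ntpath` stand in for the host's path rules.

```python
import ntpath
import posixpath


def build_object_key(artifacts_path, file_path, relative_root, pathmod=posixpath):
    """Build a forward-slash object key for a file located under artifacts_path.

    Hypothetical helper for illustration; pathmod selects POSIX or Windows
    path semantics so both hosts can be simulated.
    """
    # relpath compares whole path segments, so '/artifacts' is never stripped
    # out of '/artifacts2/...' the way a substring replace would do.
    relative = pathmod.relpath(file_path, artifacts_path)
    if relative == pathmod.pardir or relative.startswith(pathmod.pardir + pathmod.sep):
        raise ValueError(f"{file_path!r} is not inside {artifacts_path!r}")

    # Object keys always use '/', whatever separator the host uses.
    relative = relative.replace("\\", "/").lstrip("/")

    # Join with an explicit '/', so a rooted segment can never discard the prefix.
    root = relative_root.rstrip("/")
    return f"{root}/{relative}" if root.strip() else relative


print(build_object_key("/artifacts", "/artifacts/sub/file.txt", "minio/workflow"))
print(build_object_key("C:\\out", "C:\\out\\a\\b.txt", "minio/workflow/", pathmod=ntpath))
```

A file outside the artifacts directory raises `ValueError` here instead of silently producing a mangled key; whether the plug-in should fail or skip in that case is a design choice left to the author.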
catch (Exception ex)
{
    _logger.ErrorUploadingFile(file, ex);
Don't swallow failed artifact uploads.
Logging and continuing here means Start can still publish the callback even though one or more outputs never reached storage. That advances the workflow with incomplete artifacts.
🛠️ Suggested fix
catch (Exception ex)
{
_logger.ErrorUploadingFile(file, ex);
+ throw new ContainerMonitorException($"Failed to upload artifact '{file}'.", ex);
}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@src/TaskManager/Plug-ins/Podman/ContainerStatusMonitor.cs` around lines 175 -
177, The catch block in ContainerStatusMonitor that calls
_logger.ErrorUploadingFile(file, ex) is swallowing upload failures so Start can
still publish callbacks with missing artifacts; after logging, rethrow or
propagate the exception (e.g., throw; or throw a new exception that wraps ex) so
the failure prevents Start from publishing the callback, or alternatively set
and check a failure flag that prevents PublishCallback from running — update the
catch that contains _logger.ErrorUploadingFile(file, ex) to either rethrow the
exception or mark the upload as failed and ensure Start checks that flag before
publishing.
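The "failure flag" alternative mentioned in the prompt can be sketched as follows. This is a Python illustration of the control flow only; `upload_all`, `run_monitor` and `ArtifactUploadError` are hypothetical names (the last one stands in for `ContainerMonitorException`), and `upload` and `publish_callback` are caller-supplied functions.

```python
class ArtifactUploadError(Exception):
    """Hypothetical stand-in for ContainerMonitorException."""


def upload_all(files, upload):
    """Try every file, remember the failures, and raise once at the end."""
    failed = []
    for path in files:
        try:
            upload(path)
        except Exception as ex:
            # Log and keep going so that one bad file does not hide the others.
            print(f"Error uploading {path}: {ex}")
            failed.append(path)
    if failed:
        raise ArtifactUploadError(f"Failed to upload: {', '.join(failed)}")


def run_monitor(files, upload, publish_callback):
    """Publish the callback only if every artifact reached storage."""
    upload_all(files, upload)  # raises before the callback when anything failed
    publish_callback()
```

Compared with rethrowing inside the `catch`, this variant still attempts the remaining files, which can make the log more useful, while keeping the guarantee that the callback is not published after a failed upload.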
/// Prefix for envrionment variables.
/// </summary>
public static readonly string EnvironmentVariableKeyPrefix = "env_";
Typo: "envrionment" should be "environment".
🔧 Proposed fix
/// <summary>
- /// Prefix for envrionment variables.
+ /// Prefix for environment variables.
/// </summary>
public static readonly string EnvironmentVariableKeyPrefix = "env_";
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@src/TaskManager/Plug-ins/Podman/Keys.cs` around lines 62 - 64, Fix the typo
in the XML comment above the EnvironmentVariableKeyPrefix field in Keys.cs:
change "envrionment" to "environment" so the summary reads "Prefix for
environment variables." and ensure the XML comment remains well-formed for the
public static readonly string EnvironmentVariableKeyPrefix.
[LoggerMessage(EventId = 1006, Level = LogLevel.Information, Message = "Input volume mapping host=={hostPath}, container={containerPath}.")]
public static partial void DockerInputMapped(this ILogger logger, string hostPath, string containerPath);

[LoggerMessage(EventId = 1007, Level = LogLevel.Information, Message = "Output volume mapping host=={hostPath}, container={containerPath}.")]
public static partial void DockerOutputMapped(this ILogger logger, string hostPath, string containerPath);

[LoggerMessage(EventId = 1008, Level = LogLevel.Information, Message = "Environment variabled added {key}={value}.")]
public static partial void DockerEnvironmentVariableAdded(this ILogger logger, string key, string value);
Fix typos in log messages.
Several log messages contain typos:
- Lines 41, 44, 65: `host==` should be `host=`
- Line 47: `variabled` should be `variable`
- Line 50: `retreiving` should be `retrieving`
🔧 Proposed fixes
- [LoggerMessage(EventId = 1006, Level = LogLevel.Information, Message = "Input volume mapping host=={hostPath}, container={containerPath}.")]
+ [LoggerMessage(EventId = 1006, Level = LogLevel.Information, Message = "Input volume mapping host={hostPath}, container={containerPath}.")]
- [LoggerMessage(EventId = 1007, Level = LogLevel.Information, Message = "Output volume mapping host=={hostPath}, container={containerPath}.")]
+ [LoggerMessage(EventId = 1007, Level = LogLevel.Information, Message = "Output volume mapping host={hostPath}, container={containerPath}.")]
- [LoggerMessage(EventId = 1008, Level = LogLevel.Information, Message = "Environment variabled added {key}={value}.")]
+ [LoggerMessage(EventId = 1008, Level = LogLevel.Information, Message = "Environment variable added {key}={value}.")]
- [LoggerMessage(EventId = 1009, Level = LogLevel.Error, Message = "Error retreiving status from container {identity}.")]
+ [LoggerMessage(EventId = 1009, Level = LogLevel.Error, Message = "Error retrieving status from container {identity}.")]
Also applies to: 50-51
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@src/TaskManager/Plug-ins/Podman/Logging/Log.cs` around lines 41 - 48, Fix the
typos in the LoggerMessage Message strings: update DockerInputMapped and
DockerOutputMapped messages to use "host={hostPath}" instead of
"host=={hostPath}", change DockerEnvironmentVariableAdded message to
"Environment variable added {key}={value}." instead of "variabled", and find any
other logger messages around the same area (the two messages referenced at the
end of the comment) that use "retreiving" and correct them to "retrieving";
ensure each LoggerMessage attribute's Message text is corrected for these
spelling and punctuation issues.
[LoggerMessage(EventId = 1031, Level = LogLevel.Error, Message = "Error setting directory {path} with permission {user}.")]
public static partial void ErrorSettingDirectoryPermission(this ILogger logger, string path, string user);
ErrorSettingDirectoryPermission lacks Exception parameter.
Unlike other error logging methods (e.g., ErrorGeneratingContainerSpecification, ErrorDeployingContainer), this method doesn't accept an Exception parameter, preventing stack trace capture.
🔧 Proposed fix
- [LoggerMessage(EventId = 1031, Level = LogLevel.Error, Message = "Error setting directory {path} with permission {user}.")]
- public static partial void ErrorSettingDirectoryPermission(this ILogger logger, string path, string user);
+ [LoggerMessage(EventId = 1031, Level = LogLevel.Error, Message = "Error setting directory {path} with permission {user}.")]
+ public static partial void ErrorSettingDirectoryPermission(this ILogger logger, string path, string user, Exception ex);
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@src/TaskManager/Plug-ins/Podman/Logging/Log.cs` around lines 116 - 117, The
ErrorSettingDirectoryPermission logger method should accept an Exception so
stack traces are captured: change the partial method signature from
ErrorSettingDirectoryPermission(this ILogger logger, string path, string user)
to include an Exception parameter (e.g., ErrorSettingDirectoryPermission(this
ILogger logger, Exception exception, string path, string user)) and update any
call sites to pass the caught exception; keep the LoggerMessage attribute as-is
(the exception is provided as a method parameter and will be logged by the
generated source) and mirror the parameter ordering used by other methods like
ErrorGeneratingContainerSpecification and ErrorDeployingContainer for
consistency.
var monitor = _scope.ServiceProvider.GetService<IContainerStatusMonitor>() ?? throw new ServiceNotFoundException(nameof(IContainerStatusMonitor));
_ = Task.Run(async () =>
{
    await monitor.Start(Event, _containerTimeout, containerId, intermediateVolumeMount, outputVolumeMounts, cancellationToken);
});
🧩 Analysis chain
🏁 Script executed:
wc -l src/TaskManager/Plug-ins/Podman/PodmanPlugin.cs
Repository: Project-MONAI/monai-deploy-workflow-manager
Length of output: 138
🏁 Script executed:
sed -n '185,205p' src/TaskManager/Plug-ins/Podman/PodmanPlugin.cs | cat -n
Repository: Project-MONAI/monai-deploy-workflow-manager
Length of output: 1171
🏁 Script executed:
sed -n '520,545p' src/TaskManager/Plug-ins/Podman/PodmanPlugin.cs | cat -n
Repository: Project-MONAI/monai-deploy-workflow-manager
Length of output: 838
🏁 Script executed:
# Get broader context around both sections to understand the scope and lifecycle
sed -n '150,210p' src/TaskManager/Plug-ins/Podman/PodmanPlugin.cs | cat -n
Repository: Project-MONAI/monai-deploy-workflow-manager
Length of output: 3296
🏁 Script executed:
# Check class structure and Dispose method
grep -n "class PodmanPlugin\|void Dispose\|IDisposable" src/TaskManager/Plug-ins/Podman/PodmanPlugin.cs | head -20
Repository: Project-MONAI/monai-deploy-workflow-manager
Length of output: 197
🏁 Script executed:
# Look for _scope declaration and usage
rg "_scope" src/TaskManager/Plug-ins/Podman/PodmanPlugin.cs -n
Repository: Project-MONAI/monai-deploy-workflow-manager
Length of output: 960
🏁 Script executed:
# Check the method signature and how ExecuteTask is called/awaited
sed -n '85,120p' src/TaskManager/Plug-ins/Podman/PodmanPlugin.cs | cat -n
Repository: Project-MONAI/monai-deploy-workflow-manager
Length of output: 1986
🏁 Script executed:
# Check where ExecuteTask is called from to understand cancellation token source
grep -n "ExecuteTask\|override.*async" src/TaskManager/Plug-ins/Podman/PodmanPlugin.cs | head -10
Repository: Project-MONAI/monai-deploy-workflow-manager
Length of output: 436
🏁 Script executed:
# Check if there's a GetStatus method that might indicate how monitor completion is tracked
sed -n '225,280p' src/TaskManager/Plug-ins/Podman/PodmanPlugin.cs | cat -n
Repository: Project-MONAI/monai-deploy-workflow-manager
Length of output: 3105
🏁 Script executed:
# Check the base class to understand plugin lifecycle
find . -name "TaskPluginBase.cs" -type f
Repository: Project-MONAI/monai-deploy-workflow-manager
Length of output: 126
🏁 Script executed:
# Check TaskPluginBase to understand the plugin lifecycle
sed -n '1,100p' src/TaskManager/API/TaskPluginBase.cs | cat -n
Repository: Project-MONAI/monai-deploy-workflow-manager
Length of output: 2469
🏁 Script executed:
# Check if there's more context about plugin disposal and cancellation
grep -n "Dispose\|CancellationToken\|class TaskPluginBase" src/TaskManager/API/TaskPluginBase.cs | head -20
Repository: Project-MONAI/monai-deploy-workflow-manager
Length of output: 835
🏁 Script executed:
# Check the IContainerStatusMonitor interface to understand what the monitor does
find . -name "*ContainerStatusMonitor*" -type f
Repository: Project-MONAI/monai-deploy-workflow-manager
Length of output: 352
🏁 Script executed:
# Check the IContainerStatusMonitor interface
grep -n "interface IContainerStatusMonitor\|async Task Start" src/TaskManager/Plug-ins/Podman/ContainerStatusMonitor.cs | head -10
Repository: Project-MONAI/monai-deploy-workflow-manager
Length of output: 171
🏁 Script executed:
# Check ContainerStatusMonitor.Start method and its dependencies
sed -n '1,80p' src/TaskManager/Plug-ins/Podman/ContainerStatusMonitor.cs | cat -n
Repository: Project-MONAI/monai-deploy-workflow-manager
Length of output: 4016
🏁 Script executed:
# Check if the monitor uses any services from _scope that would be disposed
rg "ServiceProvider\|_scope" src/TaskManager/Plug-ins/Podman/ContainerStatusMonitor.cs
Repository: Project-MONAI/monai-deploy-workflow-manager
Length of output: 69
🏁 Script executed:
# Check the full Start method to understand what the monitor does
sed -n '69,150p' src/TaskManager/Plug-ins/Podman/ContainerStatusMonitor.cs | cat -n
Repository: Project-MONAI/monai-deploy-workflow-manager
Length of output: 4909
🏁 Script executed:
# Check how the monitor is registered in DI and if it's scoped
rg "IContainerStatusMonitor\|AddScoped\|AddSingleton" src/TaskManager/Plug-ins/Podman/ -A 2 -B 2
Repository: Project-MONAI/monai-deploy-workflow-manager
Length of output: 69
🏁 Script executed:
# Check when/how PodmanPlugin is disposed
rg "PodmanPlugin\|\.Dispose()" src/TaskManager/ --type cs | grep -i "dispose\|podman" | head -20
Repository: Project-MONAI/monai-deploy-workflow-manager
Length of output: 69
🏁 Script executed:
# Check the SendCallbackMessage method to understand what the monitor does
sed -n '100,180p' src/TaskManager/Plug-ins/Podman/ContainerStatusMonitor.cs | cat -n
Repository: Project-MONAI/monai-deploy-workflow-manager
Length of output: 4646
🏁 Script executed:
# Check how PodmanPlugin instances are created and managed
grep -n "new PodmanPlugin\|ExecuteTask\|GetStatus" src/TaskManager/TaskDispatcher.cs | head -20
Repository: Project-MONAI/monai-deploy-workflow-manager
Length of output: 153
🏁 Script executed:
# Check if there's any awaiting of the background task or cleanup
rg "Task\.Run\|Task\.WhenAll\|await.*monitor" src/TaskManager/Plug-ins/Podman/PodmanPlugin.cs
Repository: Project-MONAI/monai-deploy-workflow-manager
Length of output: 69
🏁 Script executed:
# Find where the DI is configured for ContainerStatusMonitor
find . -name "*.cs" -type f -exec grep -l "IContainerStatusMonitor\|AddContainerStatusMonitor" {} \; | head -10
Repository: Project-MONAI/monai-deploy-workflow-manager
Length of output: 644
🏁 Script executed:
# Check the startup/configuration files
grep -r "IContainerStatusMonitor\|ContainerStatusMonitor" src/TaskManager/Plug-ins/Podman/ --include="*.cs" | grep -i "add\|register\|factory"
Repository: Project-MONAI/monai-deploy-workflow-manager
Length of output: 69
🏁 Script executed:
# Check if there's a Startup or configuration file
find . -name "Startup.cs" -o -name "*ServiceConfiguration*" -o -name "*ServiceExtension*" | head -10
Repository: Project-MONAI/monai-deploy-workflow-manager
Length of output: 204
🏁 Script executed:
# Check TaskManagerExtensions to see DI registration
cat src/TaskManager/TaskManager/Extensions/TaskManagerExtensions.cs
Repository: Project-MONAI/monai-deploy-workflow-manager
Length of output: 4770
🏁 Script executed:
# Check how the plugin is used - find where ExecuteTask is called
find . -name "*.cs" -type f -exec grep -l "\.ExecuteTask\|ITaskPlugin" {} \; | grep -v test | head -10
Repository: Project-MONAI/monai-deploy-workflow-manager
Length of output: 260
🏁 Script executed:
# Check TaskManager.cs to see how ExecuteTask is called
sed -n '1,100p' src/TaskManager/TaskManager/TaskManager.cs | cat -n
Repository: Project-MONAI/monai-deploy-workflow-manager
Length of output: 5781
🏁 Script executed:
# Search for where ExecuteTask is called and when the plugin is disposed
rg "ExecuteTask\|\.Dispose\(\)" src/TaskManager/TaskManager/TaskManager.cs -B 3 -A 3
Repository: Project-MONAI/monai-deploy-workflow-manager
Length of output: 69
🏁 Script executed:
# Search for where ExecuteTask is called
grep -n "ExecuteTask" src/TaskManager/TaskManager/TaskManager.cs
Repository: Project-MONAI/monai-deploy-workflow-manager
Length of output: 212
🏁 Script executed:
# Find all mentions of plugin disposal and ExecuteTask
grep -n "plugin\|ExecuteTask\|Dispose" src/TaskManager/TaskManager/TaskManager.cs | head -30
Repository: Project-MONAI/monai-deploy-workflow-manager
Length of output: 2162
🏁 Script executed:
# Get more context around line 492 to see the full flow
sed -n '470,525p' src/TaskManager/TaskManager/TaskManager.cs | cat -n
Repository: Project-MONAI/monai-deploy-workflow-manager
Length of output: 3198
🏁 Script executed:
# Check if there's any exception handling or finally blocks that dispose the plugin
sed -n '420,530p' src/TaskManager/TaskManager/TaskManager.cs | cat -n
Repository: Project-MONAI/monai-deploy-workflow-manager
Length of output: 6136
Fire-and-forget monitor task can be cancelled before completing artifact uploads or callbacks.
This code launches monitor.Start() in a background Task.Run without awaiting, passing the caller's cancellationToken. If the token is cancelled before the monitor completes, in-flight artifact uploads and callback delivery can be interrupted. Additionally, the surrounding catch block only catches scheduling failures; exceptions from monitor.Start() are never observed.
Also applies to: 530-537
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@src/TaskManager/Plug-ins/Podman/PodmanPlugin.cs` around lines 193 - 197, The
monitor is being fire-and-forget via Task.Run calling
IContainerStatusMonitor.Start(containerId, ...) with the caller's
cancellationToken, so the monitor can be cancelled mid-upload and its exceptions
go unobserved; change the call in PodmanPlugin.cs to start the monitor with a
dedicated token (e.g., new CancellationTokenSource or CancellationToken.None)
instead of the caller's cancellationToken so artifact uploads/callbacks can
finish, capture the returned Task (from Task.Run or Start if it returns a Task)
and observe exceptions by either awaiting it where appropriate or attaching a
ContinueWith/try/catch to log faults (so exceptions from monitor.Start are not
lost), and ensure any intended cooperative cancellation from the caller is
translated into a graceful stop request rather than directly canceling the
upload (use a linked CTS if you need to forward cancellation requests while
allowing finalization).
if (ContainerStatusMonitor.IsContainerCompleted(response.State))
{
    return new ExecutionStatus
    {
        Status = TaskExecutionStatus.Succeeded,
        FailureReason = FailureReason.None,
        Stats = stats
    };
}
else if (response.State.OOMKilled || response.State.Dead)
{
    return new ExecutionStatus
    {
        Status = TaskExecutionStatus.Failed,
        FailureReason = FailureReason.ExternalServiceError,
        Errors = $"Exit code={response.State.ExitCode}",
        Stats = stats
    };
}
else
{
    return new ExecutionStatus
    {
        Status = TaskExecutionStatus.Failed,
        FailureReason = FailureReason.Unknown,
        Errors = $"Exit code={response.State.ExitCode}. Status={response.State.Status}.",
Don't equate "completed" with "succeeded".
This branch only checks that the container reached a terminal state. Status=exited with a non-zero exit code, or Status=dead, currently lands in Succeeded as long as FinishedAt is populated.
🛠️ Suggested fix
- if (ContainerStatusMonitor.IsContainerCompleted(response.State))
+ if (string.Equals(response.State.Status, Strings.DockerStatusExited, StringComparison.OrdinalIgnoreCase) &&
+ response.State.ExitCode == 0)
{
return new ExecutionStatus
{
Status = TaskExecutionStatus.Succeeded,
FailureReason = FailureReason.None,
Stats = stats
};
}
- else if (response.State.OOMKilled || response.State.Dead)
+ else if (response.State.OOMKilled ||
+ response.State.Dead ||
+ string.Equals(response.State.Status, Strings.DockerStatusDead, StringComparison.OrdinalIgnoreCase))
{
return new ExecutionStatus
{
Status = TaskExecutionStatus.Failed,
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@src/TaskManager/Plug-ins/Podman/PodmanPlugin.cs` around lines 245 - 270, The
current logic in PodmanPlugin.cs uses
ContainerStatusMonitor.IsContainerCompleted(response.State) to mark a task as
Succeeded regardless of exit code; update the branch in the method that builds
ExecutionStatus so it only returns Status = TaskExecutionStatus.Succeeded when
the container is terminal AND response.State.ExitCode == 0; otherwise treat
terminal non-zero exit codes (or response.State.Dead/response.State.OOMKilled)
as failures—use FailureReason.ExternalServiceError for OOMKilled/Dead and
FailureReason.Unknown (or a more specific reason if available) for non-zero exit
codes, and populate Errors with the exit code and state (reference
response.State, ContainerStatusMonitor.IsContainerCompleted, and the
ExecutionStatus construction sites).
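The decision table implied by the suggested fix can be written out as a small Python sketch. `ContainerState` and `classify` are hypothetical names that mirror only the fields used in the comment, and the returned strings stand in for `TaskExecutionStatus` and `FailureReason` values.

```python
from dataclasses import dataclass


@dataclass
class ContainerState:
    """Hypothetical mirror of the inspected container state fields."""
    status: str
    exit_code: int
    oom_killed: bool = False
    dead: bool = False


def classify(state):
    """Return (status, failure_reason) for a container in a terminal state."""
    status = state.status.lower()
    # Success requires both a clean exit and exit code 0.
    if status == "exited" and state.exit_code == 0:
        return ("Succeeded", "None")
    # The runtime killed the container or marked it dead.
    if state.oom_killed or state.dead or status == "dead":
        return ("Failed", "ExternalServiceError")
    # Anything else, including 'exited' with a non-zero code.
    return ("Failed", "Unknown")


print(classify(ContainerState(status="exited", exit_code=0)))
print(classify(ContainerState(status="exited", exit_code=1)))
```

The branch order follows the suggested diff: the success check comes first and requires exit code 0, so a non-zero exit can no longer be reported as `Succeeded`.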
foreach (var key in Event.TaskPluginArguments.Keys)
{
    if (key.StartsWith(Keys.EnvironmentVariableKeyPrefix, false, CultureInfo.InvariantCulture))
    {
        var envVarKey = key.Replace(Keys.EnvironmentVariableKeyPrefix, string.Empty);
        envvars[envVarKey] = Event.TaskPluginArguments[key];
        _logger.DockerEnvironmentVariableAdded(envVarKey, Event.TaskPluginArguments[key]);
Redact environment variable values in logs.
Task plug-in env vars can carry credentials or tokens. Logging the raw value here leaks secrets into application logs; keep only the key name or a redacted placeholder.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@src/TaskManager/Plug-ins/Podman/PodmanPlugin.cs` around lines 366 - 372, The
loop in PodmanPlugin (iterating Event.TaskPluginArguments and using
Keys.EnvironmentVariableKeyPrefix) currently logs raw env var values via
_logger.DockerEnvironmentVariableAdded(envVarKey,
Event.TaskPluginArguments[key]) which can leak secrets; change the logging to
only record the env var name or a redacted placeholder (e.g. "***REDACTED***")
when calling _logger.DockerEnvironmentVariableAdded, while still assigning the
real value into envvars[envVarKey]; if needed, update or add an overload on
_logger.DockerEnvironmentVariableAdded to accept a single parameter (key) to
avoid passing the secret value.
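A minimal Python sketch of the redaction described above: the real value is kept for the container, while the log line receives only the key. `collect_env_vars` and its `log` parameter are hypothetical; the `env_` prefix is the one defined in `Keys.cs`.

```python
ENV_PREFIX = "env_"  # mirrors Keys.EnvironmentVariableKeyPrefix


def collect_env_vars(arguments, log=print):
    """Extract env_-prefixed plug-in arguments without logging their values."""
    env = {}
    for key, value in arguments.items():
        if not key.startswith(ENV_PREFIX):
            continue
        # Slice off only the leading prefix; a global replace would also
        # remove later occurrences (env_my_env_name would become my_name).
        name = key[len(ENV_PREFIX):]
        env[name] = value  # the container still receives the real value
        log(f"Environment variable added {name}=***REDACTED***")
    return env
```

The slicing detail goes slightly beyond the finding: the quoted C# uses `key.Replace(...)`, which replaces every occurrence of the prefix, so a prefix-only strip may be worth applying at the same time.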
using var stream = await storageService.GetObjectAsync(input.Bucket, obj.FilePath, cancellationToken).ConfigureAwait(false) as MemoryStream;
using var fileStream = new FileStream(filePath, FileMode.CreateNew, FileAccess.Write);
stream!.WriteTo(fileStream);
🧩 Analysis chain
🏁 Script executed:
find . -name "PodmanPlugin.cs" -type f
Repository: Project-MONAI/monai-deploy-workflow-manager
Length of output: 136
🏁 Script executed:
wc -l ./src/TaskManager/Plug-ins/Podman/PodmanPlugin.cs
Repository: Project-MONAI/monai-deploy-workflow-manager
Length of output: 140
🏁 Script executed:
sed -n '440,465p' ./src/TaskManager/Plug-ins/Podman/PodmanPlugin.cs | cat -n
Repository: Project-MONAI/monai-deploy-workflow-manager
Length of output: 1562
🏁 Script executed:
sed -n '1,100p' ./src/TaskManager/Plug-ins/Podman/PodmanPlugin.cs | grep -n "storageService"
Repository: Project-MONAI/monai-deploy-workflow-manager
Length of output: 69
🏁 Script executed:
grep -n "storageService" ./src/TaskManager/Plug-ins/Podman/PodmanPlugin.cs
Repository: Project-MONAI/monai-deploy-workflow-manager
Length of output: 565
🏁 Script executed:
find . -name "*.cs" -type f | xargs grep -l "interface IStorageService"
Repository: Project-MONAI/monai-deploy-workflow-manager
Length of output: 71
🏁 Script executed:
rg "interface IStorageService" -t cs
Repository: Project-MONAI/monai-deploy-workflow-manager
Length of output: 69
🏁 Script executed:
rg "IStorageService" -t cs | head -20
Repository: Project-MONAI/monai-deploy-workflow-manager
Length of output: 3009
🏁 Script executed:
rg "GetObjectAsync" -t cs -A 2 | head -40
Repository: Project-MONAI/monai-deploy-workflow-manager
Length of output: 4025
🏁 Script executed:
fd "IStorageService" -t f
Repository: Project-MONAI/monai-deploy-workflow-manager
Length of output: 69
🏁 Script executed:
rg "interface IStorageService" -t cs --no-heading
Repository: Project-MONAI/monai-deploy-workflow-manager
Length of output: 69
🏁 Script executed:
find . -path "*/node_modules" -prune -o -name "*.cs" -type f -print | xargs grep -l "GetObjectAsync"
Repository: Project-MONAI/monai-deploy-workflow-manager
Length of output: 840
🏁 Script executed:
rg "GetObjectAsync.*Stream" -t cs -B 5 | head -50
Repository: Project-MONAI/monai-deploy-workflow-manager
Length of output: 2604
🏁 Script executed:
rg "Task.*GetObjectAsync" -t cs -A 1 | head -30
Repository: Project-MONAI/monai-deploy-workflow-manager
Length of output: 685
🏁 Script executed:
cat ./tests/UnitTests/WorkflowManager.Tests/DummyStorageService.cs | grep -A 50 "interface IStorageService"
Repository: Project-MONAI/monai-deploy-workflow-manager
Length of output: 69
🏁 Script executed:
cat ./tests/UnitTests/WorkflowManager.Tests/DummyStorageService.cs
Repository: Project-MONAI/monai-deploy-workflow-manager
Length of output: 6230
Remove the MemoryStream downcast and use CopyToAsync instead.
GetObjectAsync returns Task<Stream>, not Task<MemoryStream>. The cast with as MemoryStream will return null if the storage implementation returns any other Stream type, causing a NullReferenceException when calling stream!.WriteTo(fileStream). Additionally, WriteTo() is not available on the base Stream class.
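Use the null-coalescing operator to ensure the result is non-null, and switch to CopyToAsync(), which works on all Stream implementations. Note that the suggested fix also changes FileMode.CreateNew to FileMode.Create, so an existing file at filePath is overwritten instead of causing an IOException: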
🛠️ Suggested fix
- using var stream = await storageService.GetObjectAsync(input.Bucket, obj.FilePath, cancellationToken).ConfigureAwait(false) as MemoryStream;
- using var fileStream = new FileStream(filePath, FileMode.CreateNew, FileAccess.Write);
- stream!.WriteTo(fileStream);
+ using var stream = await storageService.GetObjectAsync(input.Bucket, obj.FilePath, cancellationToken).ConfigureAwait(false)
+ ?? throw new InvalidOperationException($"Storage returned no content for '{obj.FilePath}'.");
+ using var fileStream = new FileStream(filePath, FileMode.Create, FileAccess.Write);
+ await stream.CopyToAsync(fileStream, cancellationToken).ConfigureAwait(false);
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@src/TaskManager/Plug-ins/Podman/PodmanPlugin.cs` around lines 452 - 454, The
code incorrectly casts the result of storageService.GetObjectAsync(input.Bucket,
obj.FilePath, cancellationToken) to MemoryStream and calls WriteTo, which can
NRE for non-MemoryStream types; instead assign the returned Stream to a variable
(e.g., stream = await storageService.GetObjectAsync(... ) ?? throw new
InvalidOperationException(...)), remove the MemoryStream cast, and replace
stream.WriteTo(fileStream) with await stream.CopyToAsync(fileStream,
cancellationToken) so any Stream implementation is supported; keep using the
using declarations for stream and fileStream.
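The same idea, copying from whatever stream type the storage layer returns rather than assuming an in-memory buffer, looks like this in Python. `save_stream` is a hypothetical helper; `shutil.copyfileobj` plays the role of `CopyToAsync` in this sketch.

```python
import shutil


def save_stream(stream, destination_path):
    """Write any readable binary stream to destination_path.

    Only read() is required of the stream, so a BytesIO, a buffered reader,
    or a network response object would all work the same way.
    """
    if stream is None:
        raise ValueError(f"Storage returned no content for {destination_path!r}")
    # "wb" overwrites an existing file, like FileMode.Create in the fix.
    with open(destination_path, "wb") as out:
        shutil.copyfileobj(stream, out)
```

Because the copy happens in chunks, large objects do not need to fit in memory, which is a further reason to avoid depending on `MemoryStream`.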
Summary by CodeRabbit
New Features
Tests