Add ability to use syft binary instead of container for Linux detector#1776
Add ability to use syft binary instead of container for Linux detector#1776jasonpaulos wants to merge 3 commits intomainfrom
Conversation
There was a problem hiding this comment.
Pull request overview
Adds an alternate execution path for the Linux detector to run Syft via a user-specified local binary (Linux.SyftBinaryPath) instead of always invoking Syft through a Docker container, improving compatibility for environments where the Syft image/container approach isn’t desirable.
Changes:
- Introduces
ISyftRunnerplus Docker-based and binary-based Syft runner implementations, with DI registration for both. - Refactors
LinuxScanner/LinuxContainerDetectorto scan using anImageReferencemodel and a selectedISyftRunner. - Updates and adds unit tests to cover runner behavior and the new invocation flow.
Show a summary per file
| File | Description |
|---|---|
| test/Microsoft.ComponentDetection.Orchestrator.Tests/Experiments/LinuxApplicationLayerExperimentTests.cs | Updates detector constructors in experiment tests to match new DI signatures. |
| test/Microsoft.ComponentDetection.Detectors.Tests/LinuxScannerTests.cs | Refactors scanner tests to mock ISyftRunner and use ImageReference. |
| test/Microsoft.ComponentDetection.Detectors.Tests/LinuxContainerDetectorTests.cs | Updates detector tests for new runner selection and new scanner signatures. |
| test/Microsoft.ComponentDetection.Detectors.Tests/DockerSyftRunnerTests.cs | Adds tests validating Docker runner source/bind construction for each image kind. |
| test/Microsoft.ComponentDetection.Detectors.Tests/BinarySyftRunnerTests.cs | Adds tests validating binary runner version checks and command-line construction. |
| src/Microsoft.ComponentDetection.Orchestrator/Extensions/ServiceCollectionExtensions.cs | Registers new runner services/factory and ensures logging services are available. |
| src/Microsoft.ComponentDetection.Detectors/linux/LinuxScanner.cs | Moves Syft execution behind ISyftRunner; updates telemetry fields and call sites. |
| src/Microsoft.ComponentDetection.Detectors/linux/LinuxContainerDetector.cs | Selects runner based on Linux.SyftBinaryPath; passes runner through scan pipeline; refactors local image scanning to use ImageReference. |
| src/Microsoft.ComponentDetection.Detectors/linux/LinuxApplicationLayerDetector.cs | Wires new runner dependencies through the experimental detector subclass. |
| src/Microsoft.ComponentDetection.Detectors/linux/ISyftRunner.cs | Introduces the runner abstraction for executing Syft. |
| src/Microsoft.ComponentDetection.Detectors/linux/ImageReference.cs | Makes ImageReference/ImageReferenceKind public and keeps parsing/normalization logic here. |
| src/Microsoft.ComponentDetection.Detectors/linux/ILinuxScanner.cs | Updates scanner API to accept ImageReference and ISyftRunner. |
| src/Microsoft.ComponentDetection.Detectors/linux/IDockerSyftRunner.cs | Adds marker interface for Docker-backed runner. |
| src/Microsoft.ComponentDetection.Detectors/linux/IBinarySyftRunnerFactory.cs | Adds factory interface for creating binary-backed runners by path. |
| src/Microsoft.ComponentDetection.Detectors/linux/DockerSyftRunner.cs | Implements Docker container execution of Syft with per-kind source/bind mapping. |
| src/Microsoft.ComponentDetection.Detectors/linux/BinarySyftRunnerFactory.cs | Implements binary runner factory using ICommandLineInvocationService and ILoggerFactory. |
| src/Microsoft.ComponentDetection.Detectors/linux/BinarySyftRunner.cs | Implements local-binary execution of Syft, including --version availability check. |
Copilot's findings
- Files reviewed: 17/17 changed files
- Comments generated: 4
| /// <summary> | ||
| /// Specifies the type of image reference. | ||
| /// </summary> | ||
| internal enum ImageReferenceKind | ||
| public enum ImageReferenceKind | ||
| { |
There was a problem hiding this comment.
ImageReferenceKind/ImageReference were changed from internal to public, expanding the public API surface for what looks like an internal Linux-detector detail. Since Microsoft.ComponentDetection.Detectors already grants internals access to Microsoft.ComponentDetection.Orchestrator and test assemblies via InternalsVisibleTo, consider keeping these types (and related Syft runner abstractions) internal unless you explicitly want to support them as public API.
| using var record = new LinuxScannerTelemetryRecord | ||
| { | ||
| ImageToScan = imageHash, | ||
| ScannerVersion = ScannerImage, | ||
| ImageToScan = imageReference.Reference, | ||
| ScannerVersion = DockerSyftRunner.ScannerImage, | ||
| }; |
There was a problem hiding this comment.
LinuxScannerTelemetryRecord uses ImageToScan = imageReference.Reference and ScannerVersion = DockerSyftRunner.ScannerImage. For local images, Reference is a full normalized host path (can leak machine/user path details into telemetry), and when using the binary runner the scanner version will be incorrect. Consider using a sanitized/non-host identifier (e.g., OriginalInput) and capturing runner identity/version based on the ISyftRunner used.
See below for a potential fix:
var scannerVersion = syftRunner is DockerSyftRunner
? DockerSyftRunner.ScannerImage
: syftRunner.GetType().Name;
using var record = new LinuxScannerTelemetryRecord
{
ImageToScan = imageReference.OriginalInput,
ScannerVersion = scannerVersion,
| var arguments = CmdParameters | ||
| .Concat(scopeParameters) | ||
| .ToList(); | ||
| (stdout, stderr) = await syftRunner.RunSyftAsync( | ||
| imageReference, |
There was a problem hiding this comment.
LinuxScannerTelemetryRecord still has a SemaphoreFailure field, but semaphore acquisition now happens inside the Syft runners and the scanner no longer sets this telemetry signal. Consider propagating a clear “semaphore not acquired” result/exception from ISyftRunner.RunSyftAsync so the scanner can set record.SemaphoreFailure (or otherwise preserve this telemetry).
|
👋 Hi! It looks like you modified some files in the
If none of the above scenarios apply, feel free to ignore this comment 🙂 |
There was a problem hiding this comment.
Pull request overview
Adds a configurable execution path for the Linux detector to run Syft via a user-provided local binary (via Linux.SyftBinaryPath) instead of always using the Syft Docker image, by introducing a runner abstraction and updating the Linux scanning flow to use parsed image references.
Changes:
- Introduces
ISyftRunnerwith Docker (DockerSyftRunner) and local-binary (BinarySyftRunner) implementations plus a factory for the binary runner. - Refactors
LinuxScanner/LinuxContainerDetectorto accept anImageReferenceand anISyftRunner, and selects the runner based on detector args. - Updates DI registration and adjusts/adds unit tests for the new runner architecture.
Show a summary per file
| File | Description |
|---|---|
| test/Microsoft.ComponentDetection.Orchestrator.Tests/Experiments/LinuxApplicationLayerExperimentTests.cs | Updates detector instantiation due to constructor signature changes. |
| test/Microsoft.ComponentDetection.Detectors.Tests/LinuxScannerTests.cs | Refactors tests to mock ISyftRunner instead of IDockerService and updates call shapes. |
| test/Microsoft.ComponentDetection.Detectors.Tests/LinuxContainerDetectorTests.cs | Updates tests for new detector constructor + runner plumbing. |
| test/Microsoft.ComponentDetection.Detectors.Tests/DockerSyftRunnerTests.cs | Adds unit tests for Docker-based runner command/bind construction. |
| test/Microsoft.ComponentDetection.Detectors.Tests/BinarySyftRunnerTests.cs | Adds unit tests for binary runner availability checks and command construction. |
| src/Microsoft.ComponentDetection.Orchestrator/Extensions/ServiceCollectionExtensions.cs | Registers new runners/factory and adds logging to support ILoggerFactory usage. |
| src/Microsoft.ComponentDetection.Detectors/linux/LinuxScanner.cs | Refactors scanning to call an injected ISyftRunner and uses ImageReference. |
| src/Microsoft.ComponentDetection.Detectors/linux/LinuxContainerDetector.cs | Selects runner based on Linux.SyftBinaryPath and passes runner through scanning flow. |
| src/Microsoft.ComponentDetection.Detectors/linux/LinuxApplicationLayerDetector.cs | Updates derived detector constructor/base call to include new runner deps. |
| src/Microsoft.ComponentDetection.Detectors/linux/ISyftRunner.cs | Adds runner abstraction for executing Syft via different mechanisms. |
| src/Microsoft.ComponentDetection.Detectors/linux/ImageReference.cs | Makes ImageReference public and adds mapping to Syft --from kinds. |
| src/Microsoft.ComponentDetection.Detectors/linux/ILinuxScanner.cs | Updates interface to accept ImageReference and ISyftRunner. |
| src/Microsoft.ComponentDetection.Detectors/linux/IDockerSyftRunner.cs | Adds marker interface for Docker-backed runner DI clarity. |
| src/Microsoft.ComponentDetection.Detectors/linux/IBinarySyftRunnerFactory.cs | Adds factory abstraction for creating binary runners from a path. |
| src/Microsoft.ComponentDetection.Detectors/linux/DockerSyftRunner.cs | Implements Syft execution via Docker container with mount handling for local images. |
| src/Microsoft.ComponentDetection.Detectors/linux/BinarySyftRunnerFactory.cs | Implements factory to construct BinarySyftRunner using DI services. |
| src/Microsoft.ComponentDetection.Detectors/linux/BinarySyftRunner.cs | Implements Syft execution via a local binary with basic availability/version checks. |
Copilot's findings
Comments suppressed due to low confidence (1)
src/Microsoft.ComponentDetection.Detectors/linux/LinuxScanner.cs:110
- Same telemetry concern as in ScanLinuxAsync: ImageToScan is set to imageReference.Reference, which will be a full local path for OCI layouts/archives after normalization. Consider recording OriginalInput or a redacted path to avoid leaking host filesystem structure into telemetry.
using var record = new LinuxScannerTelemetryRecord
{
ImageToScan = imageReference.Reference,
ScannerVersion = DockerSyftRunner.ScannerImage,
};
- Files reviewed: 17/17 changed files
- Comments generated: 6
| using var record = new LinuxScannerTelemetryRecord | ||
| { | ||
| ImageToScan = imageHash, | ||
| ScannerVersion = ScannerImage, | ||
| ImageToScan = imageReference.Reference, | ||
| ScannerVersion = DockerSyftRunner.ScannerImage, | ||
| }; |
There was a problem hiding this comment.
ScannerVersion is always set to DockerSyftRunner.ScannerImage, even when the scan is executed via a BinarySyftRunner (Linux.SyftBinaryPath). This will make telemetry misleading; consider letting the selected ISyftRunner provide a version/identifier (e.g., image digest for Docker runner, syft --version output for binary runner) and record that instead.
| this.logger.LogWarning( | ||
| "Failed to enter the container semaphore for image {ImageReference}", | ||
| imageReference.Reference); | ||
| return (string.Empty, string.Empty); |
There was a problem hiding this comment.
When the container semaphore cannot be acquired, this returns ("", "") and only logs a warning. LinuxScannerTelemetryRecord has a SemaphoreFailure field, but it is no longer set anywhere, so this signal is lost in telemetry. Consider returning a distinct stderr/error indicator or throwing a dedicated exception so LinuxScanner can set record.SemaphoreFailure (and potentially include clearer failure context).
| this.logger.LogWarning( | |
| "Failed to enter the container semaphore for image {ImageReference}", | |
| imageReference.Reference); | |
| return (string.Empty, string.Empty); | |
| var stderr = $"Failed to acquire the container semaphore for image {imageReference.Reference}."; | |
| this.logger.LogWarning( | |
| "{SemaphoreFailureMessage}", | |
| stderr); | |
| return (string.Empty, stderr); |
| try | ||
| { | ||
| acquired = await BinarySemaphore.WaitAsync(SemaphoreTimeout, cancellationToken); | ||
| if (!acquired) | ||
| { | ||
| this.logger.LogWarning( | ||
| "Failed to enter the binary semaphore for image {ImageReference}", | ||
| imageReference.Reference); | ||
| return (string.Empty, string.Empty); | ||
| } |
There was a problem hiding this comment.
BinarySyftRunner mirrors DockerSyftRunner behavior on semaphore acquisition failure by returning empty stdout/stderr. This makes it hard for callers/telemetry to distinguish "semaphore throttled" from other failures, and LinuxScannerTelemetryRecord.SemaphoreFailure is now never set. Consider returning a recognizable error (stderr) or throwing a dedicated exception so upstream can record SemaphoreFailure and improve diagnostics.
| public class LinuxContainerDetector( | ||
| ILinuxScanner linuxScanner, | ||
| IDockerSyftRunner dockerSyftRunner, | ||
| IBinarySyftRunnerFactory binarySyftRunnerFactory, | ||
| IDockerService dockerService, | ||
| ILogger<LinuxContainerDetector> logger | ||
| ) : IComponentDetector |
There was a problem hiding this comment.
LinuxContainerDetector is public, and this change expands its constructor signature (and introduces new public abstractions like IDockerSyftRunner/IBinarySyftRunnerFactory). This is a breaking API change for any consumers constructing detectors directly. If external construction is supported, consider adding a backward-compatible constructor overload (non-primary constructor) that preserves the previous signature, or otherwise documenting/handling the breaking change explicitly.
See below for a potential fix:
{
/// <summary>
/// Initializes a new instance of the <see cref="LinuxContainerDetector"/> class using the legacy public constructor signature.
/// </summary>
/// <param name="linuxScanner">The Linux scanner.</param>
/// <param name="dockerService">The Docker service.</param>
/// <param name="logger">The logger.</param>
public LinuxContainerDetector(
ILinuxScanner linuxScanner,
IDockerService dockerService,
ILogger<LinuxContainerDetector> logger)
: this(linuxScanner, null!, null!, dockerService, logger)
{
}
| @@ -31,7 +31,7 @@ internal enum ImageReferenceKind | |||
| /// <summary> | |||
| /// Represents a parsed image reference from the scan input, with its type and cleaned reference string. | |||
| /// </summary> | |||
| internal class ImageReference | |||
| public class ImageReference | |||
| { | |||
There was a problem hiding this comment.
ImageReferenceKind and ImageReference were changed from internal to public, which expands the Detectors assembly public surface area. Since the .csproj already uses InternalsVisibleTo for Orchestrator and test assemblies, consider whether these types can remain internal (and keep related runner/scanner abstractions internal) to avoid locking in a new public API unnecessarily.
| /// <summary> | ||
| /// Gets the appropriate Syft runner based on detector arguments. | ||
| /// When <c>Linux.SyftBinaryPath</c> is provided, returns a binary runner via the factory; | ||
| /// otherwise returns the default Docker-based runner. | ||
| /// </summary> | ||
| /// <param name="detectorArgs">The arguments provided by the user.</param> | ||
| /// <returns>An <see cref="ISyftRunner"/> to use for scanning.</returns> | ||
| private ISyftRunner GetSyftRunner(IDictionary<string, string> detectorArgs) | ||
| { | ||
| if ( | ||
| detectorArgs != null | ||
| && detectorArgs.TryGetValue(SyftBinaryPathConfigKey, out var syftBinaryPath) | ||
| && !string.IsNullOrWhiteSpace(syftBinaryPath) | ||
| ) | ||
| { | ||
| this.logger.LogInformation( | ||
| "Using Syft binary at {SyftBinaryPath} for Linux container scanning", | ||
| syftBinaryPath); | ||
| return this.binarySyftRunnerFactory.Create(syftBinaryPath); | ||
| } | ||
|
|
||
| return this.dockerSyftRunner; | ||
| } |
There was a problem hiding this comment.
New behavior selects a Syft runner based on the Linux.SyftBinaryPath detector arg, but LinuxContainerDetectorTests currently don't cover this selection path (no assertions that the binary factory is used when the arg is present, and Docker runner otherwise). Adding unit tests here would help prevent regressions in runner selection and argument parsing.
By passing the detector argument
Linux.SyftBinaryPath, the Linux detector will invoke the given syft binary instead of using the syft docker image to scan images.