Skip to content

F-020: perf(services): semaphore-bound concurrent git show spawning#23

Open
Sephyi wants to merge 1 commit intodevelopmentfrom
audit/f-020-semaphore-git-show
Open

F-020: perf(services): semaphore-bound concurrent git show spawning#23
Sephyi wants to merge 1 commit intodevelopmentfrom
audit/f-020-semaphore-git-show

Conversation

@Sephyi
Copy link
Copy Markdown
Owner

@Sephyi Sephyi commented Apr 22, 2026

Summary

perf(services): semaphore-bound concurrent git show spawning.

Audit context

Closes audit entry F-020 from #3.

Verification

  • cargo fmt --check
  • cargo clippy --all-targets --all-features -- -D warnings
  • cargo test --all-targets

Note: one pre-existing test porcelain_exits_within_timeout_with_no_staged_changes is a known macOS cold-start flake that reproduces on unmodified development — unrelated to this change.

`fetch_file_contents` previously spawned one `git show` task per
staged file with no upper bound on concurrency. Each file spawns two
processes (staged blob + HEAD blob), so a 50-file stage fanned out
to ~100 subprocesses at once — enough to cause fork/FD pressure on
machines with a modest core count.

Wrap the `JoinSet` spawns in a `tokio::sync::Semaphore` whose permit
count is `available_parallelism() * 2`, clamped to `16..=32`. Permits
are acquired inside each spawned task and held for the task's
lifetime, so parallelism scales with the host but never exceeds a
safe ceiling. The public signature of `fetch_file_contents` is
unchanged; callers see identical behavior for small stages and
smoother subprocess spawning on large ones.

Uses `std::thread::available_parallelism()` to avoid adding a
`num_cpus` dependency.

Closes audit entry F-020 from #3.
Copilot AI review requested due to automatic review settings April 22, 2026 19:51
@Sephyi Sephyi added the audit Codebase audit cleanup (issue #3) label Apr 22, 2026
@Sephyi Sephyi self-assigned this Apr 22, 2026
Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR addresses audit finding F-020 by adding a concurrency bound around git show subprocess spawning in GitService::fetch_file_contents, preventing large staged sets from triggering unbounded process creation.

Changes:

  • Introduces a computed concurrency limit for git show subprocesses.
  • Wraps per-file content fetching in a tokio::sync::Semaphore permit to bound concurrent git show execution.
Comments suppressed due to low confidence (1)

src/services/git.rs:265

  • The semaphore is acquired inside each spawned task, so fetch_file_contents still creates one task per path immediately. On very large stages this can allocate a large number of pending tasks even though subprocess concurrency is bounded. If the goal is to bound overall work, consider acquiring the permit before spawning and moving it into the task (so only up to N tasks exist at once), or switching to a bounded-concurrency pattern that doesn't enqueue all tasks at once.
        let mut set = tokio::task::JoinSet::new();
        let work_dir: Arc<PathBuf> = Arc::new(self.work_dir.clone());
        let semaphore = Arc::new(tokio::sync::Semaphore::new(
            Self::git_show_concurrency_limit(),
        ));

        for path in paths {
            let work_dir = Arc::clone(&work_dir);
            let semaphore = Arc::clone(&semaphore);
            let path = path.clone();
            set.spawn(async move {
                // Semaphore is never closed, so acquire cannot fail.
                let _permit = semaphore
                    .acquire_owned()
                    .await
                    .expect("git-show semaphore closed unexpectedly");
                let staged =
                    Self::fetch_git_show(&work_dir, &format!(":0:{}", path.display())).await;
                let head =
                    Self::fetch_git_show(&work_dir, &format!("HEAD:{}", path.display())).await;
                (path, staged, head)
            });

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread src/services/git.rs
Comment on lines +227 to +234
/// (staged + HEAD); capping at `cores * 2` (clamped to 16..=32) keeps
/// parallelism high on beefy machines without causing fork/FD pressure on
/// large stages.
fn git_show_concurrency_limit() -> usize {
let cores = std::thread::available_parallelism()
.map(std::num::NonZeroUsize::get)
.unwrap_or(4);
(cores * 2).clamp(16, 32)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

audit Codebase audit cleanup (issue #3)

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants