[ML] Allow on-the-fly adjustment of the number of threads in use#2232
Merged
tveasey merged 6 commits into elastic:main on Mar 18, 2022
Conversation
droberts195 reviewed on Mar 17, 2022
edsavage added a commit to edsavage/ml-cpp that referenced this pull request on Feb 27, 2026:
Allows overriding the PR number from the command line, useful for local testing of the GitHub comment feature without being in a Buildkite PR build environment. Tested end-to-end against build elastic#2232 (Bayesian test timeout), posting to a throwaway PR. Both initial post and update-in-place (deduplication) verified working. Made-with: Cursor
edsavage added further commits to edsavage/ml-cpp that referenced this pull request on Mar 20, Mar 24, and Mar 26, 2026, each with the same commit message.
For PyTorch model inference we want to be able to control, on the fly, the number of threads we assign to parallel calls to forward, so that we can adjust the number of cores each model consumes on a node as cluster-wide tasks are added or removed. This PR lays the groundwork by adding a dynamic setting for the number of threads the pool will use. The idea is that we always start the process thread pool with the hardware concurrency (divided by the number of threads we give libtorch), but then adjust the number of threads actually used via a control message. Unused threads simply sit idle waiting to pop an empty queue, so they are effectively free.
cc @dimitris-athanasiou.