Refactor: drop Dist prefix from runtime types, files, and constants #587
Merged
ChaoWao merged 1 commit into hw-native-sys:main on Apr 17, 2026
Code Review
This pull request implements a major refactor of the distributed runtime by dropping the "Dist" prefix from class names, constants, and source files, and removing the project roadmap and associated status documentation. Feedback highlights a significant risk of name collisions resulting from moving generic names into the global namespace, suggesting the use of a proper C++ namespace for these definitions. Furthermore, inconsistencies were noted in the documentation updates, specifically regarding the naming of configuration types.
Mechanical rename sweep — no behavioural changes.
File renames (git mv, history preserved):
- src/common/distributed/ → src/common/hierarchical/
- src/common/hierarchical/dist_*.{h,cpp} → *.{h,cpp}
- tests/ut/cpp/test_dist_*.cpp → test_*.cpp
- tests/ut/py/test_dist_worker/ → test_worker/
- python/bindings/dist_worker_bind.h → worker_bind.h
- docs/distributed_level_runtime.md → docs/hierarchical_level_runtime.md
Type renames:
- DistWorker → Worker, DistOrchestrator → Orchestrator,
DistScheduler → Scheduler, DistWorkerManager → WorkerManager,
DistRing → Ring, DistScope → Scope, DistTensorMap → TensorMap,
DistReadyQueue → ReadyQueue, DistSubmitResult → SubmitResult,
DistTaskSlotState → TaskSlotState, DistAllocResult → AllocResult.
Constant renames:
- DIST_MAX_RING_DEPTH → MAX_RING_DEPTH, DIST_MAILBOX_SIZE → MAILBOX_SIZE,
DIST_HEAP_ALIGN → HEAP_ALIGN, DIST_INVALID_SLOT → INVALID_SLOT,
DIST_MAX_SCOPE_DEPTH → MAX_SCOPE_DEPTH, DIST_ALLOC_TIMEOUT_MS →
ALLOC_TIMEOUT_MS, DIST_DEFAULT_HEAP_RING_SIZE → DEFAULT_HEAP_RING_SIZE,
DIST_MAILBOX_ARGS_CAPACITY → MAILBOX_ARGS_CAPACITY.
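A rename sweep like this can be sketched as one whole-identifier substitution driven by the mapping above. This is a minimal illustration, not the tooling used for this PR (the commit does not say how the rename was performed); the `apply_renames` helper and the `RENAMES` table name are hypothetical, while the old and new names are copied from the lists above.

```python
import re

# Old -> new names, copied from the type and constant rename lists above.
RENAMES = {
    "DistWorker": "Worker",
    "DistOrchestrator": "Orchestrator",
    "DistScheduler": "Scheduler",
    "DistWorkerManager": "WorkerManager",
    "DistRing": "Ring",
    "DistScope": "Scope",
    "DistTensorMap": "TensorMap",
    "DistReadyQueue": "ReadyQueue",
    "DistSubmitResult": "SubmitResult",
    "DistTaskSlotState": "TaskSlotState",
    "DistAllocResult": "AllocResult",
    "DIST_MAX_RING_DEPTH": "MAX_RING_DEPTH",
    "DIST_MAILBOX_SIZE": "MAILBOX_SIZE",
    "DIST_HEAP_ALIGN": "HEAP_ALIGN",
    "DIST_INVALID_SLOT": "INVALID_SLOT",
    "DIST_MAX_SCOPE_DEPTH": "MAX_SCOPE_DEPTH",
    "DIST_ALLOC_TIMEOUT_MS": "ALLOC_TIMEOUT_MS",
    "DIST_DEFAULT_HEAP_RING_SIZE": "DEFAULT_HEAP_RING_SIZE",
    "DIST_MAILBOX_ARGS_CAPACITY": "MAILBOX_ARGS_CAPACITY",
}

# \b anchors every alternative to a whole identifier, so DistWorker does not
# match inside DistWorkerManager, and names outside the table are left alone.
_PATTERN = re.compile(
    r"\b(?:" + "|".join(re.escape(name) for name in RENAMES) + r")\b"
)


def apply_renames(text: str) -> str:
    """Return text with every listed identifier replaced by its new name."""
    return _PATTERN.sub(lambda match: RENAMES[match.group(0)], text)
```

Identifiers that are not in the table, such as DistChipProcess, pass through unchanged, so they would still need the separate comment and doc update described further down.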
Folder / doc rename: src/common/distributed/ → src/common/hierarchical/
to match the "hierarchical runtime" terminology used throughout the
level-composition docs. The corresponding doc file
docs/distributed_level_runtime.md is renamed to
docs/hierarchical_level_runtime.md for consistency. Internal identifiers
touched by the folder rename:
- CMake vars DISTRIBUTED_SRC{,_DIR}/DISTRIBUTED_SOURCES → HIERARCHICAL_*
- CMake project `distributed_ut` → `hierarchical_ut`;
function `add_distributed_test` → `add_hierarchical_test`
- Python private methods `_init_distributed` / `_start_distributed`
(and the `_distributed_started` flag) → `_init_hierarchical` /
`_start_hierarchical` / `_hierarchical_started`.
Python bindings keep the underscore-prefix convention so they don't
collide with the user-facing Python wrapper classes: C++ Worker is
bound as _Worker, C++ Orchestrator as _Orchestrator. The Python
simpler.worker.Worker factory and simpler.orchestrator.Orchestrator
facade wrap them.
Historical references to the pre-PR-D-2 classes DistChipProcess and
DistSubWorker in comments and docs have been updated to ChipProcess /
SubWorker.
Also removes three stale roadmap docs now that the refactor chain is
complete:
- docs/roadmap.md — every bullet is now covered by the per-component
docs (orchestrator/scheduler/worker-manager/task-flow); the
"Behavioural notes" content (release_ref +1 threshold, fork hygiene
setenv + pthread_atfork) already lives in orchestrator.md §8 and
§"Fork hygiene before fork". Removes six forward-reference
blockquotes ("see roadmap.md for landed-vs-planned") from the
per-component docs and the README doc table entry.
- src/{a2a3,a5}/runtime/tensormap_and_ringbuffer/docs/ROADMAP.md — all
three proposed features have landed (via a different API than the
roadmap described): MixedKernels + atomic cluster dispatch replaced
the allocate_cluster/free_cluster proposal (explicitly "Out of
Scope" in SUBMIT_BY_CLUSTER.md); PTO2LaunchSpec covers SPMD→MPMD
block_incore expansion; cube+vector co-scheduling rides on the
MixedKernels path. Current runtime design lives in
RUNTIME_LOGIC.md + SUBMIT_BY_CLUSTER.md.
Force-pushed from 8576df7 to d11e7e2.
Summary
Mechanical rename sweep across the hierarchical runtime, with no behavioural changes. Final PR in the refactor chain (follows #563, #564, #570, #572, #575, #578, #583).

File renames (history preserved via git mv)
- src/common/distributed/dist_*.{h,cpp} → *.{h,cpp}
- tests/ut/cpp/test_dist_*.cpp → test_*.cpp
- tests/ut/py/test_dist_worker/ → test_worker/
- python/bindings/dist_worker_bind.h → worker_bind.h

Type renames
DistWorker → Worker, DistOrchestrator → Orchestrator, DistScheduler → Scheduler, DistWorkerManager → WorkerManager, DistRing → Ring, DistScope → Scope, DistTensorMap → TensorMap, DistReadyQueue → ReadyQueue, DistSubmitResult → SubmitResult, DistTaskSlotState → TaskSlotState, DistAllocResult → AllocResult.

Constant renames
DIST_MAX_RING_DEPTH → MAX_RING_DEPTH, DIST_MAILBOX_SIZE → MAILBOX_SIZE, DIST_HEAP_ALIGN → HEAP_ALIGN, DIST_INVALID_SLOT → INVALID_SLOT, DIST_MAX_SCOPE_DEPTH → MAX_SCOPE_DEPTH, DIST_ALLOC_TIMEOUT_MS → ALLOC_TIMEOUT_MS, DIST_DEFAULT_HEAP_RING_SIZE → DEFAULT_HEAP_RING_SIZE, DIST_MAILBOX_ARGS_CAPACITY → MAILBOX_ARGS_CAPACITY.

Python binding naming
The C++ types Worker and Orchestrator are bound as _Worker and _Orchestrator to avoid colliding with the user-facing Python simpler.worker.Worker factory and simpler.orchestrator.Orchestrator facade, which wrap them. This matches the existing _ChipWorker → ChipWorker pattern.

Roadmap cleanup
Removes three stale roadmap docs now that the refactor chain is complete:
- docs/roadmap.md: every bullet is now covered by the per-component docs (orchestrator/scheduler/worker-manager/task-flow). The "Behavioural notes" content (release_ref +1 threshold, fork hygiene setenv + pthread_atfork) already lives in orchestrator.md §8 and §"Fork hygiene before fork". Also removes six forward-reference blockquotes ("see roadmap.md for landed-vs-planned") from the per-component docs and the README doc table entry.
- src/{a2a3,a5}/runtime/tensormap_and_ringbuffer/docs/ROADMAP.md: all three proposed features have landed, via a different API than the roadmap described. MixedKernels + atomic cluster dispatch replaced the allocate_cluster/free_cluster proposal (explicitly "Out of Scope" in SUBMIT_BY_CLUSTER.md); PTO2LaunchSpec covers SPMD→MPMD block_incore expansion; cube+vector co-scheduling rides on the MixedKernels path. Current runtime design lives in RUNTIME_LOGIC.md + SUBMIT_BY_CLUSTER.md.

Test plan
- pip install -e . builds cleanly
- […] torch, unrelated)
- ctest (test_tensormap, test_ring, test_scope, test_orchestrator, test_scheduler, test_a2a3_pto2_fatal, test_a5_pto2_fatal)
- grep -rnE "Dist[A-Z]|DIST_" src/ python/ returns no results
- ls src/common/distributed/dist_* / ls tests/ut/cpp/test_dist_* return nothing
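The grep check in the test plan can also be expressed as a small script, for example to run it from a unit test. This is a sketch under assumptions: the `find_leftovers` helper is hypothetical and not part of the repository, and it reuses the same `Dist[A-Z]|DIST_` pattern as the grep command.

```python
import re
from pathlib import Path

# Same pattern as the grep in the test plan: a Dist-prefixed CamelCase type
# or a DIST_-prefixed constant.
LEFTOVER = re.compile(r"Dist[A-Z]|DIST_")


def find_leftovers(*roots):
    """Return (path, line number, line) for every line that still matches."""
    hits = []
    for root in roots:
        for path in sorted(Path(root).rglob("*")):
            if not path.is_file():
                continue
            try:
                lines = path.read_text(encoding="utf-8").splitlines()
            except (UnicodeDecodeError, OSError):
                continue  # skip binary or unreadable files
            for number, line in enumerate(lines, start=1):
                if LEFTOVER.search(line):
                    hits.append((str(path), number, line.strip()))
    return hits
```

Calling `find_leftovers("src", "python")` from the repository root would mirror the grep invocation; an empty list corresponds to "returns no results".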