support extern func define in aicore #6

Merged
ChaoWao merged 1 commit into hw-native-sys:main from poursoul:refactor-func-define
Jan 28, 2026

Conversation

@poursoul
Collaborator

No description provided.

poursoul force-pushed the refactor-func-define branch from 94b7082 to 1a93f69 on January 28, 2026 03:25
poursoul force-pushed the refactor-func-define branch from 1a93f69 to d352caa on January 28, 2026 03:27
ChaoWao merged commit 8a6bf0b into hw-native-sys:main on Jan 28, 2026
poursoul deleted the refactor-func-define branch on March 11, 2026 01:08
ChaoWao added a commit to ChaoWao/simpler-fork that referenced this pull request Apr 15, 2026
Removes the fixed DIST_TASK_WINDOW_SIZE slot pool and the per-slot array
DistWorker used to carry. At L3 the slot state lives entirely in the
parent process's heap -- never crossed into child workers -- so the ring
index L2 uses to address shmem descriptors buys us nothing here. Only
the heap needs a pre-sized region for MAP_SHARED fork inheritance.

- DistRing:
  - init() drops window_size; takes only (heap_bytes, timeout_ms).
  - alloc() returns a monotonic task id; no back-pressure on slot count,
    only on heap space.
  - Owns the slot state pool as std::deque<std::unique_ptr<SlotState>>.
    push_back never invalidates existing pointers, so slot_state(id)
    returns a pointer that stays valid for the slot's lifetime without
    holding the mutex past the lookup.
  - released_ and slot_heap_end_ become std::vector<>, grown via
    push_back on alloc, indexed directly by task id.
  - advance_last_alive_locked no longer needs to undo the released bit
    (entries aren't recycled within a run; reset_to_empty clears them
    all at drain).
  - New reset_to_empty(): drops all slot state and zeroes counters.
    DistOrchestrator::drain() calls it right after active_tasks_ hits 0
    so each Worker.run() starts from task id 0 with bounded memory.

- DistOrchestrator::init drops slots/num_slots params. slot_state(id)
  delegates to ring.slot_state(id) with a nullptr->throw guard.
- DistScheduler::Config drops slots/num_slots; takes DistRing* and reads
  slot state via ring->slot_state(id) at every access site.
- DistWorker drops the std::unique_ptr<SlotState[]> member; slot state
  is now entirely in allocator_. DistWorker::init() is a straight
  passthrough to allocator_/orchestrator_/scheduler_.
- dist_types.h: remove DIST_TASK_WINDOW_SIZE constant.
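
The pointer-stability argument and the nullptr->throw guard described above can be sketched as follows. This is a minimal illustration, not the code from the commit: `SketchRing`, `checked_slot_state`, the `SlotState` fields, and the choice of `std::out_of_range` are all assumptions made for the example. Each `SlotState` lives in its own heap allocation behind a `std::unique_ptr`, so its address does not change when the container grows, and the pointer can be used after the mutex is released.

```cpp
#include <cstddef>
#include <cstdint>
#include <deque>
#include <memory>
#include <mutex>
#include <stdexcept>
#include <string>

// Hypothetical stand-in for the real SlotState; the fields are illustrative.
struct SlotState {
    std::int64_t task_id = -1;
    bool done = false;
};

// Sketch of a ring that owns slot state and hands out monotonic task ids.
class SketchRing {
public:
    // One new slot per call; the id is simply the slot's index.
    std::int64_t alloc() {
        std::lock_guard<std::mutex> lock(mu_);
        const auto id = static_cast<std::int64_t>(slots_.size());
        slots_.push_back(std::make_unique<SlotState>());
        slots_.back()->task_id = id;
        return id;
    }

    // The mutex covers only the lookup. The returned pointer stays valid
    // afterwards because growing the deque never moves the pointee.
    SlotState* slot_state(std::int64_t id) {
        std::lock_guard<std::mutex> lock(mu_);
        if (id < 0 || static_cast<std::size_t>(id) >= slots_.size()) {
            return nullptr;
        }
        return slots_[static_cast<std::size_t>(id)].get();
    }

private:
    std::mutex mu_;
    std::deque<std::unique_ptr<SlotState>> slots_;
};

// Orchestrator-side accessor: turns a failed lookup into an exception.
// The exception type is an assumption; the commit only says "throw".
inline SlotState& checked_slot_state(SketchRing& ring, std::int64_t id) {
    SlotState* s = ring.slot_state(id);
    if (s == nullptr) {
        throw std::out_of_range("unknown task id " + std::to_string(id));
    }
    return *s;
}
```

A caller would take the pointer once per access site, for example `SlotState& s = checked_slot_state(ring, id);`, and keep using it while other threads continue to call `alloc()`.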

Tests:
- test_dist_ring rewritten: drop window_size tests, add
  SlotAllocGrowsPastLegacyWindow (2048 allocs past the old 128 cap),
  SlotStateIsPointerStable (push_back doesn't invalidate refs),
  ResetToEmptyRequiresAllReleased, ResetToEmptyResetsCounters.
- test_dist_orchestrator / test_dist_scheduler fixtures drop the
  std::unique_ptr<SlotState[]> member and access via a local S(id)
  helper that calls ring.slot_state(id).
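
The behaviour the new tests exercise (growth past the old 128-entry window, and a reset that is only legal once every task has been released) might look roughly like the sketch below. It is written under assumptions: `SketchReleaseTracker`, its method names, the use of `std::vector<bool>`, and throwing `std::logic_error` on a premature reset are all hypothetical, since the commit message does not say how the real `reset_to_empty()` reports that condition.

```cpp
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <vector>

// Tracks released flags indexed directly by task id (no recycling in a run).
class SketchReleaseTracker {
public:
    // Grows the flag vector by one entry and returns the new task id.
    std::int64_t alloc() {
        released_.push_back(false);
        return static_cast<std::int64_t>(released_.size()) - 1;
    }

    // Marks a task released, then advances last_alive_ over the contiguous
    // released prefix. The flag is never cleared again within a run.
    void release(std::int64_t id) {
        released_.at(static_cast<std::size_t>(id)) = true;
        while (last_alive_ < released_.size() && released_[last_alive_]) {
            ++last_alive_;
        }
    }

    // End-of-run reset: drops all entries so the next run starts at id 0.
    // Assumption: a reset with unreleased tasks is reported by throwing.
    void reset_to_empty() {
        if (last_alive_ != released_.size()) {
            throw std::logic_error("reset_to_empty with unreleased tasks");
        }
        released_.clear();
        last_alive_ = 0;
    }

    std::size_t next_id() const { return released_.size(); }
    std::size_t last_alive() const { return last_alive_; }

private:
    std::vector<bool> released_;
    std::size_t last_alive_ = 0;
};
```

Because entries are only dropped at reset, memory use is bounded by the number of tasks in a single run rather than by a fixed window size.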

Docs:
- orchestrator.md section 5 rewritten to describe the three resources
  DistRing now owns (task id, heap, slot state) and the end-of-run
  reset contract.
- roadmap.md Dispatch internals bullet updated.

Plan (local, gitignored): PR-I moved to "in review"; Allowed Exception
hw-native-sys#6 kept (explains why L3 doesn't need a shmem slot ring).

No user-visible behaviour change: heap_ring_size still configurable via
Worker ctor, OUTPUT auto-alloc / WaW tag semantics unchanged, back-
pressure timeout still throws std::runtime_error on heap exhaustion.
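
As an illustration of the heap back-pressure contract mentioned above, the sketch below blocks a reservation until enough bytes are free and throws `std::runtime_error` if the timeout expires first. It is deliberately simplified and rests on assumptions: `SketchHeap` and its methods are hypothetical names, and it models the heap as a plain byte counter rather than a real ring over a MAP_SHARED region.

```cpp
#include <chrono>
#include <condition_variable>
#include <cstddef>
#include <mutex>
#include <stdexcept>

// Byte-counting stand-in for a pre-sized heap with timed back-pressure.
class SketchHeap {
public:
    SketchHeap(std::size_t heap_bytes, std::chrono::milliseconds timeout)
        : capacity_(heap_bytes), timeout_(timeout) {}

    // Waits until `bytes` fit; throws if they still do not fit at timeout.
    void reserve(std::size_t bytes) {
        std::unique_lock<std::mutex> lock(mu_);
        const bool fits = cv_.wait_for(lock, timeout_, [&] {
            return bytes <= capacity_ - used_;  // used_ never exceeds capacity_
        });
        if (!fits) {
            throw std::runtime_error("heap exhausted: back-pressure timeout");
        }
        used_ += bytes;
    }

    // Returns bytes to the heap and wakes any blocked reservations.
    void release(std::size_t bytes) {
        {
            std::lock_guard<std::mutex> lock(mu_);
            used_ = (bytes > used_) ? 0 : used_ - bytes;
        }
        cv_.notify_all();
    }

    std::size_t used() const {
        std::lock_guard<std::mutex> lock(mu_);
        return used_;
    }

private:
    const std::size_t capacity_;
    const std::chrono::milliseconds timeout_;
    mutable std::mutex mu_;
    std::condition_variable cv_;
    std::size_t used_ = 0;
};
```

In this model there is no limit on the number of outstanding tasks; the only resource that can make a producer wait is heap space.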
ChaoWao added a commit that referenced this pull request Apr 15, 2026
2 participants