Minor updates for worksplit_gpu with comfy-aimdo#13419

Merged

Kosinkadink merged 3 commits into Comfy-Org:worksplit-multigpu from rattus128:prs/for-kosa/worksplit-multigpu on Apr 16, 2026

Minor updates for worksplit_gpu with comfy-aimdo#13419
Kosinkadink merged 3 commits intoComfy-Org:worksplit-multigpufrom
rattus128:prs/for-kosa/worksplit-multigpu

Conversation


@rattus128 rattus128 commented Apr 15, 2026

The upcoming comfy-aimdo release has direct multi-GPU support (not yet released). This version of aimdo supports symmetric device init, so pass all devices to aimdo for upfront initialization (previous versions depended on lazy init of non-primary GPUs). Also fix the vbars_analyze call to be per-GPU, so it no longer reports free-able VRAM as a single global pool.
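The vbars_analyze fix can be illustrated with a minimal sketch. This is not comfy-aimdo's actual API: the `analyze_*` helpers and the numbers below are hypothetical, and serve only to show why summing free-able VRAM across GPUs into one pool misreports what any single device can actually reclaim.

```python
# Hypothetical sketch of the reporting bug; comfy-aimdo's real API differs.

def analyze_global(free_by_device):
    # Old behaviour: free-able VRAM across all GPUs summed into one pool.
    # Overstates headroom, since an allocation pinned to cuda:0 cannot
    # use memory freed on cuda:1.
    return sum(free_by_device.values())

def analyze_per_device(free_by_device):
    # Fixed behaviour: report free-able VRAM separately for each GPU.
    return dict(free_by_device)

free = {"cuda:0": 2_000, "cuda:1": 9_000}  # MB, made-up example values
print(analyze_global(free))      # 11000 -- misleading global pool
print(analyze_per_device(free))  # {'cuda:0': 2000, 'cuda:1': 9000}
```

With per-device reporting, an allocator deciding whether a model fits on cuda:0 sees 2000 MB, not a fictitious 11000 MB.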

A wheel that can be used is here:

https://github.com/Comfy-Org/comfy-aimdo/actions/runs/24448365803

Example test conditions:

Linux, 2x4090
Qwen 38GB (2512), 1328x1328, CFG=4, second-run performance.

pytorch version: 2.11.0+cu130
Set vram state to: NORMAL_VRAM
Device: cuda:0 NVIDIA GeForce RTX 4090 : cudaMallocAsync
Device: cuda:1 NVIDIA GeForce RTX 4090 : cudaMallocAsync
Using async weight offloading with 2 streams
Enabled pinned memory 115850.0
Using pytorch attention
aimdo: /project/src-posix/cuda-funchooks.c:126:DEBUG:aimdo_setup_hooks: hooks successfully installed
aimdo: /project/src/control.c:144:INFO:comfy-aimdo inited for GPU: NVIDIA GeForce RTX 4090 (VRAM: 24078 MB)
aimdo: /project/src/control.c:144:INFO:comfy-aimdo inited for GPU: NVIDIA GeForce RTX 4090 (VRAM: 24080 MB)
DynamicVRAM support detected and enabled
Python version: 3.12.3 (main, Mar  3 2026, 12:15:18) [GCC 13.3.0]
ComfyUI version: 0.18.1
comfy-aimdo version: 0.2.13.dev15
comfy-kitchen version: 0.2.8

1 GPU:

got prompt
Model QwenImage prepared for dynamic VRAM loading. 38968MB Staged. 0 patches attached.
100%|██████████████████████████████████████████████████████████████████████████████| 20/20 [01:02<00:00, 3.13s/it]
2 GPU:

Model QwenImage prepared for dynamic VRAM loading. 38968MB Staged. 0 patches attached.
Model QwenImage prepared for dynamic VRAM loading. 38968MB Staged. 0 patches attached.
100%|███████████████████████████████████████████████████████████████████████████████████| 20/20 [00:32<00:00, 1.62s/it]
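The timings above imply near-linear scaling across the two GPUs, which a quick calculation confirms:

```python
# Per-iteration times taken from the tqdm output above.
one_gpu_s_per_it = 3.13
two_gpu_s_per_it = 1.62

speedup = one_gpu_s_per_it / two_gpu_s_per_it
print(f"{speedup:.2f}x")  # about 1.93x, close to the ideal 2x for 2 GPUs
```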

Kosinkadink merged commit f0d550b into Comfy-Org:worksplit-multigpu on Apr 16, 2026
8 of 9 checks passed