[Relax] Share storage allocs among functions after cuda graph rewriting#16830

Merged
vinx13 merged 1 commit into apache:main from vinx13:feat/cuda-graph-merge-1
Apr 2, 2024
Conversation

Member

@vinx13 vinx13 commented Apr 1, 2024

This PR shares storage among different functions after CUDA graph rewriting. Because CUDA graphs cache storage, storage objects are not freed after function execution, which increases memory usage when there are multiple functions. Sharing the storage objects eliminates this overhead.
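To illustrate the memory accounting the description refers to, here is a minimal, hypothetical Python sketch (plain Python, not TVM code; the function names and sizes are invented): if each captured CUDA graph keeps its own storage alive, total memory grows with the number of functions, whereas a shared pool only needs the peak requirement of any one function, since the same storages can be reused across functions that do not run concurrently.

```python
# Hypothetical illustration of per-function vs. shared storage footprints.
# Function names and byte sizes are made up for demonstration.

def per_function_bytes(storage_sizes_by_func):
    # Each function's cached CUDA graph pins its own storages,
    # so the footprints of all functions add up.
    return sum(sum(sizes) for sizes in storage_sizes_by_func.values())

def shared_pool_bytes(storage_sizes_by_func):
    # With sharing, storages are reused across functions, so the pool
    # only needs to cover the largest single function's requirement.
    return max(sum(sizes) for sizes in storage_sizes_by_func.values())

funcs = {"prefill": [128, 256], "decode": [64, 128]}
print(per_function_bytes(funcs))  # 576: footprints accumulate
print(shared_pool_bytes(funcs))   # 384: only the peak is needed
```

This is only a model of the effect, not of the pass itself; the actual rewriting merges the storage allocations emitted for each captured function.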

It also updates the rewriting to prevent capturing storages and bindings that are used as function outputs. Previously we relied on the fact that output tensors are allocated with R.builtin.alloc_tensor; however, this behavior changed after we enabled storage planning for output tensors, which may also use R.memory.alloc_memory.

cc @tqchen

The github-actions bot requested a review from @tqchen on April 1, 2024 17:08.
@vinx13 force-pushed the feat/cuda-graph-merge-1 branch 3 times, most recently from 55c8b4e to 97dbcf6 on April 1, 2024 18:27.
@vinx13 force-pushed the feat/cuda-graph-merge-1 branch from 97dbcf6 to 045342f on April 1, 2024 19:27.
@vinx13 merged commit f83a329 into apache:main on Apr 2, 2024.
thaisacs pushed a commit to thaisacs/tvm that referenced this pull request on Apr 3, 2024.