
[NVRTC] Add NVSHMEM support to NVRTC compilation path#18681

Merged
spectrometerHBH merged 5 commits into apache:main from Kathryn-cat:nvrtc-nvshmem-ptr on Jan 24, 2026

Conversation

@Kathryn-cat (Contributor) commented on Jan 23, 2026

This PR adds NVSHMEM support to the NVRTC path in python/tvm/contrib/nvcc.py.

This is implemented as a separate compile stage and link stage for NVSHMEM programs. Tested locally via tests/python/disco/test_nvshmem.py.

Results show compilation is about 10-35% faster than the NVCC path; kernel performance is unchanged.

@gemini-code-assist (Contributor) commented:

Summary of Changes

Hello @Kathryn-cat, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request significantly enhances TVM's CUDA compilation capabilities by integrating NVSHMEM support into the NVRTC backend. This allows users to compile and run CUDA kernels that leverage NVSHMEM features using the NVRTC compiler, offering an alternative to the traditional NVCC path. The changes involve adapting the NVRTC compilation process to include necessary linking steps, handle specific CUDA header requirements, and manage different binary output formats, ensuring seamless operation for distributed GPU programming with NVSHMEM.

Highlights

  • NVRTC NVSHMEM Support: Enabled NVSHMEM compilation and linking within the NVRTC (NVIDIA Runtime Compilation) path, which was previously only supported by NVCC.
  • Compilation and Linking Stages: Implemented distinct compilation and linking stages for NVSHMEM programs using NVRTC, including handling cubin output and leveraging the CUDA driver API for linking.
  • CUDA Header Compatibility: Addressed compatibility issues by mapping cuda::std type traits to the std namespace and including CCCL paths for NVSHMEM headers, ensuring proper compilation.
  • Flexible Binary Loading: Updated the external kernel builder to handle both PTX and cubin binary formats, allowing for NVSHMEM-specific compilation outputs.
  • New NVSHMEM Kernel Tests: Introduced new tests to verify the correct compilation and execution of NVSHMEM kernels using both NVCC and NVRTC compilation modes.
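The PTX-vs-cubin handling described in the highlights above can be sketched with a simple format probe. `classify_cuda_binary` is a hypothetical helper (not from the PR) illustrating how a loader could tell the two formats apart: a cubin is an ELF object and carries the ELF magic bytes, while NVRTC-emitted PTX is plain text starting with comment lines and directives.

```python
def classify_cuda_binary(data: bytes) -> str:
    """Guess whether a compiler output blob is a cubin or PTX.

    A cubin is an ELF object, so it starts with the 4-byte ELF magic;
    PTX is ASCII text (comment lines and directives such as `.version`).
    """
    if data[:4] == b"\x7fELF":
        return "cubin"
    text = data.decode("utf-8", errors="ignore").lstrip()
    if text.startswith("//") or text.startswith(".version"):
        return "ptx"
    return "unknown"


# A loader can then branch on the detected format instead of assuming PTX:
print(classify_cuda_binary(b"\x7fELF\x02\x01\x01" + b"\x00" * 9))          # cubin
print(classify_cuda_binary(b"//\n// Generated by NVRTC\n.version 8.3\n"))  # ptx
```

This mirrors the kind of dispatch the "Flexible Binary Loading" change needs: the NVSHMEM path produces a linked cubin, while the plain NVRTC path produces PTX, and the external kernel builder must accept both.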


@gemini-code-assist (Bot) left a comment


Code Review

This pull request introduces NVSHMEM support for the NVRTC compilation path, which is a valuable addition. The implementation is well-structured, including necessary workarounds for NVRTC and comprehensive end-to-end tests for both nvcc and nvrtc paths. My main concern is a potential resource leak related to CUDA context management, which I've detailed in a specific comment.

Comment thread: python/tvm/contrib/nvcc.py (Outdated), lines +525 to +533
# Check if there's already a CUDA context; create one if not
result, context = cu.cuCtxGetCurrent()
if result != cu.CUresult.CUDA_SUCCESS or context is None or int(context) == 0:
    result, device = cu.cuDeviceGet(0)
    if result != cu.CUresult.CUDA_SUCCESS:
        raise RuntimeError(f"Failed to get CUDA device: {result}")
    result, context = cu.cuCtxCreate(None, 0, device)
    if result != cu.CUresult.CUDA_SUCCESS:
        raise RuntimeError(f"Failed to create CUDA context: {result}")

Severity: high

The CUDA context created here if one doesn't already exist is not destroyed. This can lead to a resource leak, as CUDA contexts hold significant GPU resources. It would be safer to ensure that if a context is created within this function, it is also destroyed before the function returns.

A try...finally block should be used to manage the context's lifecycle:

context_created = False
context = None
try:
    # get or create context
    # ...
    # linking logic
    # ...
finally:
    if context_created and context:
        cu.cuCtxDestroy(context)

This would ensure that any context created specifically for this compilation is properly cleaned up, preventing resource leaks. Additionally, the condition context is None or int(context) == 0 can be simplified to not context since the cuda-python CUcontext object has a __bool__ method that handles this check.
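The get-or-create-then-clean-up ownership pattern the review asks for can be shown generically. The sketch below uses stand-in `get_current`/`create`/`destroy` callables rather than the real cuda-python API, purely to illustrate that only a context this code created gets destroyed:

```python
import contextlib


@contextlib.contextmanager
def ensure_context(get_current, create, destroy):
    """Yield an existing context if there is one; otherwise create a
    temporary one and destroy it on exit. The three arguments are
    stand-ins for cuCtxGetCurrent / cuCtxCreate / cuCtxDestroy."""
    context = get_current()
    created = False
    if not context:  # mirrors the `not context` simplification above
        context = create()
        created = True
    try:
        yield context
    finally:
        # Only tear down a context this function itself created.
        if created and context:
            destroy(context)


# No pre-existing context: a temporary one is created, then destroyed.
destroyed = []
with ensure_context(lambda: None, lambda: "ctx", destroyed.append) as ctx:
    assert ctx == "ctx"
assert destroyed == ["ctx"]

# A context already exists: it is reused and never destroyed.
destroyed = []
with ensure_context(lambda: "existing", lambda: "ctx", destroyed.append) as ctx:
    assert ctx == "existing"
assert destroyed == []
```

The `finally` clause guarantees cleanup even if the linking logic inside the `with` body raises, which is exactly the leak the reviewer flagged.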

@Kathryn-cat (Author) replied: addressed

@Kathryn-cat (Author) commented:

@tvm-bot rerun

Comment thread: python/tvm/contrib/nvcc.py (Outdated)
return bytearray(binary_buf)
# link stage for NVSHMEM
if use_nvshmem:
    import ctypes  # pylint: disable=import-outside-toplevel
A reviewer (Member) commented: consider moving this into a separate function

@Kathryn-cat (Author) replied: addressed

@Kathryn-cat (Author) commented:

@tvm-bot re-run

1 similar comment from @Kathryn-cat

@spectrometerHBH spectrometerHBH merged commit 2004a8b into apache:main Jan 24, 2026
13 checks passed

Labels: None yet
Projects: None yet
3 participants