[TIR] Add CUDA int4 tensor core intrinsics#14598

Merged
tqchen merged 2 commits into apache:main from vinx13:feat/int4-tensor-intrin on Apr 12, 2023

Member

@vinx13 vinx13 commented Apr 11, 2023

This PR adds int4 tensor intrinsics for CUDA Tensor Cores.

cc @junrushao @tqchen @masahi

Collaborator

tvm-bot commented Apr 11, 2023

Thanks for contributing to TVM! Please refer to the contributing guidelines https://tvm.apache.org/docs/contribute/ for useful information and tips. Please request code reviews from Reviewers by @-ing them in a comment.

Generated by tvm-bot

@github-actions github-actions Bot requested review from junrushao, masahi and tqchen April 11, 2023 23:36
Member

@Hzfengsy Hzfengsy left a comment

LGTM. But I want to remind you that int4 Tensor Core support has been removed from 4th-generation Tensor Cores (RTX 40 series and Hopper).

Member

yzh119 commented Apr 12, 2023

@Hzfengsy, int4 Tensor Cores are still supported on the RTX 40 series, per the Ada whitepaper.

Member

@yzh119 yzh119 left a comment

A slight issue, otherwise LGTM.

*get_wmma_sync_intrin(16, 16, 16, "int8", "int32", True),
)

WMMA_SYNC_8x8x32_s4s4s32_TRANS_INTRIN = "wmma_sync_8x8x32_s4s4s32_trans"
Member

"wmma_sync_8x8x32_s4s4s32" is missing.

Member Author

Sub-byte Tensor Core operations only allow A in row-major and B in column-major layout, so only the `_trans` variant exists.

Member

@yzh119 yzh119 Apr 12, 2023

Oh that's interesting! Maybe we can leave a note somewhere.
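
For reference, here is a minimal sketch of what the layout constraint in this thread implies for the 8x8x32 int4 case: with A row-major and B column-major, B is effectively stored as an N x K array, so the computation is `C[i][j] += sum_k A[i][k] * B_t[j][k]`. This is plain Python reference semantics only, not the intrinsic added in this PR; the names `mma_s4_trans`, `check_int4`, and `b_t` are hypothetical and chosen for illustration.

```python
# Reference semantics for an 8x8x32 signed-int4 matrix multiply-accumulate
# with B supplied in transposed (column-major) form.
# Assumption: this mirrors what "s4s4s32_trans" denotes (int4 inputs,
# int32 accumulator, B transposed); it is not TVM's implementation.

M, N, K = 8, 8, 32  # tile shape taken from the intrinsic name


def check_int4(x):
    """Return x if it fits in a signed 4-bit integer, else raise."""
    if not -8 <= x <= 7:
        raise ValueError(f"{x} is outside the signed int4 range [-8, 7]")
    return x


def mma_s4_trans(a, b_t, c):
    """Compute c[i][j] + sum_k a[i][k] * b_t[j][k] for every (i, j).

    a   : M x K nested list, row-major A
    b_t : N x K nested list, i.e. B stored column-major (transposed)
    c   : M x N nested list of integer accumulators
    Returns a new M x N nested list; the inputs are not modified.
    """
    out = [row[:] for row in c]
    for i in range(M):
        for j in range(N):
            acc = out[i][j]
            for k in range(K):
                acc += check_int4(a[i][k]) * check_int4(b_t[j][k])
            out[i][j] = acc
    return out


# Small usage example with values kept inside the int4 range.
a = [[(i + k) % 16 - 8 for k in range(K)] for i in range(M)]
b_t = [[(j * 3 + k) % 16 - 8 for k in range(K)] for j in range(N)]
c = [[0] * N for _ in range(M)]
result = mma_s4_trans(a, b_t, c)
```

Both operands are indexed as `[row][k]`, which is why a separate non-transposed variant has no counterpart in this sketch.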

@tqchen tqchen merged commit c1d1e9f into apache:main Apr 12, 2023