[Core] Changes to support 0.2.0 flashinfer#11314
pavanimajety wants to merge 2 commits into vllm-project:main
👋 Hi! Thank you for contributing to the vLLM project. Once the PR is approved and ready to go, your PR reviewer(s) can run CI to test the changes comprehensively before merging.
…anged Signed-off-by: Pavani Majety <pmajety@nvidia.com>
c1e4b21 to 5439e7d
I found that flashinfer 0.2.0 uses more memory on rank 0 when tp>1. I built it from source in AOT mode. Is that normal?
@JaheimLee It seems we have a fix; we'll update to FlashInfer 0.2.0.post1. Thanks.
I still have this problem, and I got another error. Here is my code:
This pull request has merge conflicts that must be resolved before it can be merged.
Datatype and wrapper changes for flashinfer 0.2.0
Related: #11194