
Error when running vLLM inference on a V100 #3717

@Pobby321

Description


Reminder

  • I have read the README and searched the existing issues.

Reproduction

Running:

CUDA_VISIBLE_DEVICES=0 llamafactory-cli api LLaMA-Factory/examples/inference/qwen_vllm.yaml

fails with:

You can use float16 instead by explicitly setting the `dtype` flag in CLI, for example: --dtype=half

I have seen the same error reported before, but I pulled the latest code and it still fails.
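
For context, this message is the tail of vLLM's bfloat16 check: the V100's compute capability (7.0) is below the 8.0 that bfloat16 requires, so inference has to be forced to float16. A minimal sketch of the config-side workaround, assuming LLaMA-Factory exposes an infer_dtype key for its vLLM backend (the key name and the model/template values below are assumptions, not the contents of the shipped qwen_vllm.yaml):

# qwen_vllm.yaml (sketch; values are placeholders)
model_name_or_path: Qwen/Qwen1.5-7B-Chat
template: qwen
infer_backend: vllm
infer_dtype: float16  # V100 lacks bfloat16 support; do not leave this on auto

When calling vLLM directly rather than through llamafactory-cli, the equivalent fix is the --dtype=half flag named in the error message itself.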

Expected behavior

No response

System Info

No response

Others

No response

Metadata

    Labels

    solved (This problem has been already solved)
