Conversation
- Introduce VLLM model in the model registry.
- Update AVAILABLE_MODELS to include new models:
  - models/__init__.py: Added "aria", "internvideo2", "llama_vision", "oryx", "ross", "slime", "videochat2", "vllm", "xcomposer2_4KHD", "xcomposer2d5".
- Create vllm.py for the VLLM model implementation:
  - Implemented encoding for images and videos.
  - Added methods for generating responses and handling multi-round generation.
- Update MMMU tasks with new prompt formats and evaluation metrics:
  - mmmu_val.yaml: Added specific kwargs for prompt types.
  - mmmu_val_reasoning.yaml: Enhanced prompts for reasoning tasks.
  - utils.py: Adjusted evaluation rules and scoring for predictions.
- Add a script for easy model execution:
  - vllm_qwen2vl.sh: Script to run VLLM with specified parameters.
- Configure the environment for better performance and debugging.
- Added variables to control multiprocessing and NCCL behavior in miscs/vllm_qwen2vl.sh:
  - Set `VLLM_WORKER_MULTIPROC_METHOD` to `spawn` for compatibility.
  - Enabled `NCCL_BLOCKING_WAIT` to avoid hangs.
  - Increased `NCCL_TIMEOUT` to 18000000 for long-running processes.
  - Set `NCCL_DEBUG` to `DEBUG` for detailed logs.
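The same environment setup can be sketched from Python; the values simply mirror the shell script above, and the comments state the script's rationale, not guaranteed behavior. Note that `VLLM_WORKER_MULTIPROC_METHOD` must be set before vLLM is imported:

```python
import os

# Values mirror miscs/vllm_qwen2vl.sh; tune them for your own cluster.
os.environ["VLLM_WORKER_MULTIPROC_METHOD"] = "spawn"  # fork can deadlock with CUDA
os.environ["NCCL_BLOCKING_WAIT"] = "1"                # fail loudly instead of hanging
os.environ["NCCL_TIMEOUT"] = "18000000"               # allow long-running evaluations
os.environ["NCCL_DEBUG"] = "DEBUG"                    # verbose NCCL logs (value from the script)
```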
- Renamed representation scripts for clarity:
  - miscs/repr_scripts.sh -> miscs/model_dryruns/llava_1_5.sh
  - miscs/cicd_qwen2vl.sh -> miscs/model_dryruns/qwen2vl.sh
  - miscs/tinyllava_repr_scripts.sh -> miscs/model_dryruns/tinyllava.sh
  - miscs/vllm_qwen2vl.sh -> miscs/model_dryruns/vllm_qwen2vl.sh
- Updated parameters in the vllm_qwen2vl.sh script:
  - miscs/model_dryruns/vllm_qwen2vl.sh: Added `--limit=64` to the output path command.
@register_model("vllm")
class VLLM(lmms):
Since this processing is specific to VLMs, consider renaming the class to `vllm_vlm`.
I think this can remain as is; vllm supports both vision/language models and vision/language tasks. For example, you can actually use Qwen/Qwen2.5-0.5B-Instruct to evaluate mmlu_flan_n_shot_generative within our framework.
So it's better to support vllm through a single class.
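The `@register_model("vllm")` decorator seen in the diff can be illustrated with a minimal registry sketch. This is a hypothetical, simplified version for illustration only; the real lmms-eval registry and `AVAILABLE_MODELS` mapping differ in detail:

```python
# Hypothetical, simplified model registry (not the actual lmms-eval code).
AVAILABLE_MODELS: dict[str, type] = {}

def register_model(name: str):
    """Decorator that records a model class under a string key."""
    def decorator(cls):
        AVAILABLE_MODELS[name] = cls
        return cls
    return decorator

@register_model("vllm")
class VLLM:
    """One class can serve both language-only and vision/language models."""
```

A single registry key keeps the CLI surface small: `--model vllm` resolves to one class regardless of whether the checkpoint is text-only or multimodal.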
img.save(output_buffer, format="PNG")
byte_data = output_buffer.getvalue()
base64_str = base64.b64encode(byte_data).decode("utf-8")
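The three lines above come from a larger helper; a self-contained sketch of it might look as follows. The name `to_base64` is taken from a later commit message, while the RGB conversion and buffer handling are reconstructed assumptions:

```python
import base64
import io

from PIL import Image

def to_base64(img: Image.Image) -> str:
    """Encode a PIL image as a base64 PNG string."""
    output_buffer = io.BytesIO()
    img.convert("RGB").save(output_buffer, format="PNG")  # PNG is lossless
    byte_data = output_buffer.getvalue()
    return base64.b64encode(byte_data).decode("utf-8")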
Is base64 specific to qwen_vl? The vllm VLM interface accepts PIL images and then does model-specific processing.
great, let me change it
Oh, after checking the doc, I feel like we should keep the OpenAI-format messages so we don't need to apply a chat template.
Otherwise, if we use this format, we need to apply a model-specific chat template:

outputs = llm.generate({
    "prompt": prompt,
    "multi_modal_data": {"image": image_embeds},
})
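For contrast, a minimal sketch of the OpenAI-format message mentioned above, where the chat template is left to the serving layer. The helper name and the truncated base64 string are placeholder assumptions; only the message shape follows the OpenAI chat format:

```python
# Hypothetical helper: builds one OpenAI-format user message so that a chat
# entry point can apply the model-specific chat template for us.
def build_vision_message(base64_png: str, question: str) -> dict:
    return {
        "role": "user",
        "content": [
            {"type": "image_url",
             "image_url": {"url": f"data:image/png;base64,{base64_png}"}},
            {"type": "text", "text": question},
        ],
    }

message = build_vision_message("iVBORw0...", "Describe the image.")  # placeholder data
```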
+1 This looks great, and thanks for the support! I wonder whether this PR is ready for use or not / when it will be ready for use? i.e., can I just use this branch and run it?
- Simplify image conversion in the `to_base64` method:
  - vllm.py: Directly convert the input image to RGB format instead of copying it.
- Remove unnecessary base64 encoding for images:
  - vllm.py: Return the PIL image directly instead of converting it to base64.
- Update video frame processing to return PIL images:
  - vllm.py: Replace base64 encoding of frames with returning the PIL frames directly.
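The change described above amounts to handing vLLM RGB PIL images directly instead of base64 strings. A sketch under that assumption (both function names are illustrative, not from the codebase):

```python
from PIL import Image

def prepare_image(img: Image.Image) -> Image.Image:
    """Convert to RGB in one step; no copy, no base64 round-trip."""
    return img.convert("RGB")

def prepare_frames(frames: list) -> list:
    """Video frames are likewise passed along as PIL images."""
    return [prepare_image(f) for f in frames]
```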
Revert "Optimize image handling in VLLM model" - This reverts commit 469e1fc.
* Add VLLM model integration and update configurations
  - Introduce VLLM model in the model registry.
  - Update AVAILABLE_MODELS to include new models:
    - models/__init__.py: Added "aria", "internvideo2", "llama_vision", "oryx", "ross", "slime", "videochat2", "vllm", "xcomposer2_4KHD", "xcomposer2d5".
  - Create vllm.py for the VLLM model implementation:
    - Implemented encoding for images and videos.
    - Added methods for generating responses and handling multi-round generation.
  - Update MMMU tasks with new prompt formats and evaluation metrics:
    - mmmu_val.yaml: Added specific kwargs for prompt types.
    - mmmu_val_reasoning.yaml: Enhanced prompts for reasoning tasks.
    - utils.py: Adjusted evaluation rules and scoring for predictions.
  - Add a script for easy model execution:
    - vllm_qwen2vl.sh: Script to run VLLM with specified parameters.
* Set environment variables for VLLM script
  - Configure the environment for better performance and debugging.
  - Added variables to control multiprocessing and NCCL behavior in miscs/vllm_qwen2vl.sh:
    - Set `VLLM_WORKER_MULTIPROC_METHOD` to `spawn` for compatibility.
    - Enabled `NCCL_BLOCKING_WAIT` to avoid hangs.
    - Increased `NCCL_TIMEOUT` to 18000000 for long-running processes.
    - Set `NCCL_DEBUG` to `DEBUG` for detailed logs.
* Rename scripts and update paths
  - Renamed representation scripts for clarity:
    - miscs/repr_scripts.sh -> miscs/model_dryruns/llava_1_5.sh
    - miscs/cicd_qwen2vl.sh -> miscs/model_dryruns/qwen2vl.sh
    - miscs/tinyllava_repr_scripts.sh -> miscs/model_dryruns/tinyllava.sh
    - miscs/vllm_qwen2vl.sh -> miscs/model_dryruns/vllm_qwen2vl.sh
  - Updated parameters in the vllm_qwen2vl.sh script:
    - miscs/model_dryruns/vllm_qwen2vl.sh: Added `--limit=64` to the output path command.
* Optimize image handling in VLLM model
  - Simplify image conversion in the `to_base64` method:
    - vllm.py: Directly convert the input image to RGB format instead of copying it.
  - Remove unnecessary base64 encoding for images:
    - vllm.py: Return the PIL image directly instead of converting it to base64.
  - Update video frame processing to return PIL images:
    - vllm.py: Replace base64 encoding of frames with returning the PIL frames directly.
* Revert "Optimize image handling in VLLM model"
  - This reverts commit 469e1fc.
* use threads to encode visuals

---------
Co-authored-by: kcz358 <kaichenzhang358@outlook.com>
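The "use threads to encode visuals" commit can be sketched with a thread pool. The `encode` callable and worker count are stand-in assumptions; the point is that image encoding is largely I/O- and C-extension-bound, so threads help despite the GIL:

```python
from concurrent.futures import ThreadPoolExecutor

def encode_visuals(visuals, encode, max_workers=8):
    """Encode a batch of visuals concurrently, preserving input order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map yields results in submission order, unlike as_completed.
        return list(pool.map(encode, visuals))
```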
* docs: restructure README and v0.6 release notes
  - Restructure v0.6 doc Section 1 from bottom-up to top-down architecture overview: Pipeline (1.1) -> Model Interface (1.2) -> API Concurrency (1.3)
  - Restore original "Why lmms-eval?" voice with v0.6 insights integrated naturally (statistical rigor, evaluation as infrastructure)
  - Reorder README sections by user journey: Why -> Quickstart -> Usage -> Advanced
  - Collapse i18n language links into expandable details tag
  - Simplify quick links from 3 rows to 2 rows
  - Update title to "LMMs-Eval: Probing Intelligence in the Real World"
  - Add comprehensive CHANGELOG.md covering v0.6 highlights and ~182 commits
* fix: add CITATION.cff and remove dead repr_scripts.sh link
  - Copy CITATION.cff from feat/api-model-concurrency (enables GitHub's "Cite this repository" button and fixes broken FAQ reference)
  - Remove stale miscs/repr_scripts.sh link (file was renamed in #544, then deleted in #644; README was never updated)
* Fix formatting of evaluation components diagram
* Revise mermaid diagram for async pipeline with cache
  - Updated mermaid diagram to improve clarity and formatting.

---------
Co-authored-by: Pu Fanyi <FPU001@e.ntu.edu.sg>

Add vLLM model integration and update configurations
vLLM offers exceptional speed in evaluation. Moving forward, we should prioritize using this approach.