Official PyTorch implementation for paper "MedROV: Towards Real-Time Open-Vocabulary Detection Across Diverse Medical Imaging Modalities", accepted at WACV 2026. Paper Link
Traditional object detection models in medical imaging operate within a closed-set paradigm, limiting their ability to detect objects with novel labels. Open-vocabulary object detection (OVOD) addresses this limitation but remains underexplored in medical imaging due to dataset scarcity and weak text-image alignment. To bridge this gap, we introduce MedROV, the first real-time open-vocabulary detection model for medical imaging. To enable open-vocabulary learning, we curate a large-scale dataset, Omnis, with 600K detection samples across nine imaging modalities and introduce a pseudo-labeling strategy to handle missing annotations from multi-source datasets. Additionally, we enhance generalization by incorporating knowledge from a large pre-trained foundation model. By leveraging contrastive learning and cross-modal representations, MedROV effectively detects both known and novel structures. Experimental results demonstrate that MedROV outperforms the previous state-of-the-art foundation model for medical image detection with an average absolute improvement of 40 mAP50, and surpasses closed-set detectors by more than 3 mAP50, while running at 70 FPS, setting a new benchmark in medical detection.
To set up the environment and install the required packages, run the following commands:

```bash
conda create -n medrov python=3.10
conda activate medrov
git clone https://github.com/toobatehreem/MedROV
cd MedROV
pip install -e .
pip install open_clip_torch==2.23.0 transformers==4.35.2 matplotlib
```

To train the MedROV model, use the following code snippet:
```python
from ultralytics import YOLOWorld

# Load the model configuration and weights
model = YOLOWorld("yolov8l-worldv2.pt")

# Start training
results = model.train(data="data.yaml", epochs=20, batch=128, optimizer='AdamW', lr0=0.0002, weight_decay=0.05, device=(0,1,2,3))
```

The checkpoints for the trained MedROV model can be found here: MedROV
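The `data.yaml` passed to `model.train` follows the standard Ultralytics dataset format. A minimal sketch is shown below; the paths and class names are placeholders, not the actual Omnis-600K layout or label set:

```yaml
# Minimal Ultralytics-style dataset config (placeholder paths and classes)
path: /path/to/omnis   # dataset root directory
train: images/train    # training images, relative to path
val: images/val        # validation images, relative to path

# Class names; open-vocabulary training aligns these strings with text embeddings
names:
  0: tumor
  1: nodule
  2: lesion
```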
The Omnis-600K dataset can be found here: Omnis-600K
To evaluate the trained model, use the following code:

```python
from ultralytics import YOLOWorld

# Load the model configuration and weights
model = YOLOWorld("/MedROV/checkpoints/medrov.pt")

# Validate the model and keep the returned metrics object
metrics = model.val(data="data.yaml", device=0)

# Print evaluation metrics
print(f"Mean Average Precision @ .5:.95 : {metrics.box.map}")
print(f"Mean Average Precision @ .50 : {metrics.box.map50}")
print(f"Mean Average Precision @ .75 : {metrics.box.map75}")
```

We sincerely thank Ultralytics for providing the YOLOWorld code.