MedRegA is an interpretable bilingual generalist model for diverse biomedical tasks, distinguished by its ability to leverage regional information. MedRegA can perceive 8 modalities covering almost all body parts, showcasing significant versatility.
💡We establish Region-Centric tasks with a large-scale dataset, MedRegInstruct, where each sample is paired with coordinates of body structures or lesions.
💡Based on the proposed dataset, we develop a Region-Aware medical MLLM, MedRegA, a bilingual generalist medical AI system that performs both image-level and region-level medical vision-language tasks.
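Region-level outputs can be consumed programmatically. As a minimal sketch — assuming the model emits bounding boxes as bracketed groups of four integers (e.g. `[[x1, y1, x2, y2]]`) inside its text response; the exact output format is defined by the model and may differ — coordinates could be extracted like this:

```python
import re

def extract_boxes(response: str):
    """Extract (x1, y1, x2, y2) bounding boxes from a model response string.

    Assumes boxes appear as bracketed groups of four integers, e.g.
    "The lesion is at [[120, 85, 240, 190]]". This is an illustrative
    sketch only; adapt the pattern to the actual output format.
    """
    pattern = r"\[\s*(\d+)\s*,\s*(\d+)\s*,\s*(\d+)\s*,\s*(\d+)\s*\]"
    return [tuple(int(v) for v in m) for m in re.findall(pattern, response)]

print(extract_boxes("Lesion detected at [[120, 85, 240, 190]]."))
# → [(120, 85, 240, 190)]
```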
- Release the model.
- Release the demo code.
- Release the evaluation code.
- Release the training code.
- Release the data.
Please refer to the InternVL Installation guide to set up the environment.
Run the demo:

```shell
torchrun --nproc-per-node=1 src/demo.py
```
```bibtex
@article{wang2024interpretable,
  title={Interpretable bilingual multimodal large language model for diverse biomedical tasks},
  author={Wang, Lehan and Wang, Haonan and Yang, Honglong and Mao, Jiaji and Yang, Zehong and Shen, Jun and Li, Xiaomeng},
  journal={arXiv preprint arXiv:2410.18387},
  year={2024}
}
```
Our code builds upon InternVL. We thank the authors for releasing their code.