
[CVPR 2025 Highlight] Interpreting Object-level Foundation Models via Visual Precision Search


If you like our project, please give it a star ⭐ on GitHub for the latest updates.

Paper: arXiv:2411.16198 · License: CC BY-NC

📰 News & Update

  • [2025.04.04] Our paper was selected as a Highlight paper at CVPR 2025.

  • [2025.02.29] Our paper was accepted to CVPR 2025.

  • [2025.02.08] We released the official code of VPS, a new interpretation mechanism for object-level foundation models.

  • [2024.09.30] We began investigating the potential of interpretability in object detection.

🛠️ Environment

Our interpretation method relies on relatively common packages; the main requirement is PyTorch.
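
A minimal environment sketch (the exact package list beyond PyTorch is our assumption, not a pinned requirements file):

# Hypothetical setup; package versions are illustrative.
conda create -n vps python=3.10 -y
conda activate vps
pip install torch torchvision                 # core requirement
pip install opencv-python numpy matplotlib    # common utilities, assumed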

We provide code to explain Grounding DINO; please install its dependencies first: https://github.com/IDEA-Research/GroundingDINO.
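
For reference, Grounding DINO's own README installs it from source roughly like this:

git clone https://github.com/IDEA-Research/GroundingDINO.git
cd GroundingDINO
pip install -e .    # builds the custom CUDA ops; see the upstream README if this step fails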

For explaining Florence-2, please install its dependencies: https://huggingface.co/microsoft/Florence-2-large-ft
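
A sketch of the dependencies commonly listed for Florence-2 on the model card (transformers with trust_remote_code, plus timm and einops; we treat flash_attn as optional, depending on your CUDA setup):

pip install transformers timm einops
pip install flash_attn    # optional; requires a matching CUDA toolchain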

For explaining traditional detectors, please install MMDetection v3.3: https://github.com/open-mmlab/mmdetection/
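
MMDetection's documented install path goes through OpenMIM; a sketch pinning v3.3:

pip install -U openmim
mim install mmengine
mim install "mmcv>=2.0.0"
mim install mmdet==3.3.0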

In addition, please follow datasets/readme.md and ckpt/readme.md to organize the datasets and download the weights of the relevant detectors.
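
For orientation, the resulting layout should look roughly like this (datasets/coco/val2017 is the path used by the commands below; everything else is an assumed sketch, so defer to the two readmes):

VPS/
├── datasets/
│   └── coco/
│       └── val2017/      # COCO validation images
└── ckpt/                 # detector weights (see ckpt/readme.md)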

🧳 Quick Start

You can try interpreting a single image directly in the Jupyter notebooks below (a launch command follows the list).

  • Grounding DINO Interpretation (Detection): tutorial
  • Florence-2 Interpretation (Detection): tutorial
  • Florence-2 Interpretation (Visual Grounding): tutorial
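
If you are new to Jupyter, a minimal way to open them (the notebook paths are linked above; the command itself is generic):

pip install jupyter
jupyter notebook    # then open the tutorial notebook of interest in the browser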

😮 Highlights

We provide some results of our approach on interpreting object detection models.

Note: the tank picture is from the Internet.

Grounding DINO: (example figures in the repository)

Florence-2: (example figures in the repository)

🗝️ How to Run

Prepare the datasets following datasets/readme.md.

Download the benchmark files from https://huggingface.co/datasets/RuoyuChen/VPS_benchmark and put them into ./datasets.
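
One way to fetch them is the Hugging Face CLI (a sketch; downloading manually from the web page works too):

pip install -U "huggingface_hub[cli]"
huggingface-cli download RuoyuChen/VPS_benchmark --repo-type dataset --local-dir ./datasets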

Run (more instructions are in the folder ./scripts):

./script/groundingdino_coco_correct.sh

Visualization:

python -m visualization.visualize_ours \
    --explanation-dir submodular_results/grounding-dino-coco-correctly/slico-1.0-1.0-division-number-100 \
    --Datasets datasets/coco/val2017

Evaluate faithfulness:

python -m evals.eval_AUC_faithfulness \
    --explanation-dir submodular_results/grounding-dino-coco-correctly/slico-1.0-1.0-division-number-100

Evaluate localization:

python -m evals.eval_energy_pg \
    --Datasets datasets/coco/val2017 \
    --explanation-dir submodular_results/grounding-dino-coco-correctly/slico-1.0-1.0-division-number-100

👍 Acknowledgement

SMDL-Attribution: a SOTA attribution method based on submodular subset selection.

Grounding DINO: an open-set object detector.

Florence-2: a novel vision foundation model with a unified, prompt-based representation for a variety of computer vision and vision-language tasks.

MMDetection V3.3: an open source object detection toolbox based on PyTorch.

✏️ Citation

@article{chen2024interpreting,
  title={Interpreting Object-level Foundation Models via Visual Precision Search},
  author={Chen, Ruoyu and Liang, Siyuan and Li, Jingzhi and Liu, Shiming and Li, Maosen and Huang, Zheng and Zhang, Hua and Cao, Xiaochun},
  journal={arXiv preprint arXiv:2411.16198},
  year={2024}
}
