[TIP 2024] Cost Volume Aggregation in Stereo Matching Revisited: A Disparity Classification Perspective

cocowy1/DCANet_Stereo

Cost Volume Aggregation in Stereo Matching Revisited: A Disparity Classification Perspective (DCANet TIP 2024).

Cost aggregation plays a critical role in existing stereo matching methods. In this paper, we revisit cost aggregation in stereo matching from the perspective of disparity classification and propose a generic yet efficient Disparity Context Aggregation (DCA) module to improve the performance of CNN-based methods. Our approach builds on the insight that a coarse disparity class prior benefits disparity regression. To obtain such a prior, we first classify the pixels of an image into several disparity classes and treat pixels within the same class as a homogeneous region. We then generate homogeneous region representations and incorporate them into the cost volume to suppress irrelevant information while enhancing the matching ability for cost aggregation. With the help of these representations, efficient and informative cost aggregation can be achieved with only a shallow 3D CNN. Our DCA module is fully differentiable and compatible with different network architectures, so it can be plugged into existing networks to improve performance with little additional overhead. We demonstrate that the DCA module effectively exploits disparity class priors to improve cost aggregation. Based on DCA, we design a highly accurate network named DCANet, which achieves state-of-the-art performance on several benchmarks.
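To make the idea of a disparity class prior concrete, here is a framework-free toy sketch. It is not the DCA module from this repository. It assumes hard, equal-width disparity bins and a plain per-class feature average as the "region representation", whereas the real module is learned and differentiable. The function names `classify_disparity` and `region_representations` are hypothetical.

```python
from typing import Dict, List


def classify_disparity(coarse_disparity: List[float],
                       max_disparity: float,
                       num_classes: int) -> List[int]:
    """Assign each pixel to one of `num_classes` equal-width disparity bins.

    Assumption: hard binning of a coarse disparity estimate; used here only
    to illustrate what a "disparity class" is.
    """
    bin_width = max_disparity / num_classes
    labels = []
    for d in coarse_disparity:
        index = int(d // bin_width)
        # Clamp so out-of-range disparities fall into the first or last class.
        labels.append(min(max(index, 0), num_classes - 1))
    return labels


def region_representations(features: List[List[float]],
                           labels: List[int]) -> Dict[int, List[float]]:
    """Average the feature vectors of all pixels that share a class label.

    Assumption: mean pooling stands in for the region representation.
    """
    sums: Dict[int, List[float]] = {}
    counts: Dict[int, int] = {}
    for feature, label in zip(features, labels):
        if label not in sums:
            sums[label] = [0.0] * len(feature)
            counts[label] = 0
        sums[label] = [s + f for s, f in zip(sums[label], feature)]
        counts[label] += 1
    return {label: [s / counts[label] for s in total]
            for label, total in sums.items()}


# Four pixels: two near the camera plane boundary (small disparity), two far from it.
coarse = [1.0, 2.0, 50.0, 60.0]
feats = [[1.0, 0.0], [3.0, 0.0], [0.0, 2.0], [0.0, 4.0]]
labels = classify_disparity(coarse, max_disparity=64.0, num_classes=4)
regions = region_representations(feats, labels)
```

In this toy setting, pixels with similar coarse disparity share one representation, which is the kind of context the abstract describes injecting into the cost volume.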

DCA (Disparity Context Aggregation) Module Overview

Environment

Python 3.8
PyTorch 1.6.0

Create a virtual environment and activate it.

conda create -n DCANet python=3.8
conda activate DCANet

Dependencies

conda install pytorch torchvision torchaudio cudatoolkit=11.3 -c pytorch -c nvidia
pip install opencv-python
pip install scikit-image
pip install tensorboard
pip install matplotlib 
pip install tqdm
pip install chardet
pip install imageio
pip install thop
pip install timm==0.5.4
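After installing, a quick check of what is actually present in the environment can save a failed training run. The sketch below uses only the standard library (`importlib.metadata`, available from Python 3.8). The distribution names in `REQUIRED` are assumptions derived from the install commands above; in particular, a conda-installed PyTorch is assumed to register itself as `torch`.

```python
from importlib import metadata
from typing import Dict, Iterable, Optional

# Assumed distribution names, taken from the install commands in this README.
REQUIRED = [
    "torch", "torchvision", "opencv-python", "scikit-image", "tensorboard",
    "matplotlib", "tqdm", "chardet", "imageio", "thop", "timm",
]


def installed_versions(packages: Iterable[str]) -> Dict[str, Optional[str]]:
    """Map each distribution name to its installed version, or None if missing."""
    versions: Dict[str, Optional[str]] = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


if __name__ == "__main__":
    for name, version in installed_versions(REQUIRED).items():
        print(f"{name:15s} {version if version is not None else 'NOT INSTALLED'}")
```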

Disparity Classification Visualization on KITTI

Disparity Classification Visualization on SceneFlow

Visualization of Grad-CAM

1. Train on SceneFlow

Run main.py to train on the SceneFlow dataset. Please set datapath in main.py to your training data path.
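Before starting a long run, it can help to confirm that the path you set actually contains data. The following is a minimal sketch that makes no assumption about the directory layout main.py expects; it only counts files by extension under the given root. The helper name `summarize_datapath` and the extension list are hypothetical.

```python
from pathlib import Path
from typing import Dict

# Assumed extensions: common image formats plus .pfm, which SceneFlow uses
# for disparity maps.
DATA_SUFFIXES = {".png", ".jpg", ".jpeg", ".webp", ".pfm"}


def summarize_datapath(datapath: str) -> Dict[str, int]:
    """Count data files per extension under `datapath` (searched recursively)."""
    root = Path(datapath).expanduser()
    if not root.is_dir():
        raise FileNotFoundError(f"datapath is not a directory: {root}")
    counts: Dict[str, int] = {}
    for path in root.rglob("*"):
        suffix = path.suffix.lower()
        if path.is_file() and suffix in DATA_SUFFIXES:
            counts[suffix] = counts.get(suffix, 0) + 1
    return counts
```

An empty result, or one with images but no disparity files, usually means the path points at the wrong level of the dataset tree.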

2. Train on KITTI & ETH3D

Run train_kitti.py or train_eth3d.py to fine-tune on real-world datasets such as KITTI 2012, KITTI 2015, and ETH3D.

To generate predictions on the KITTI test set, run evaluate_kitti.py, which also prints the inference time. The resulting predictions can be submitted directly to the KITTI online evaluation server for benchmarking.
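For reference, the KITTI stereo benchmark stores disparity maps as 16-bit PNGs in which each pixel holds the disparity multiplied by 256, and a value of 0 marks an invalid pixel in the ground truth. The sketch below illustrates only that encoding on nested lists, without writing any image file. It is not taken from evaluate_kitti.py, which presumably handles the conversion itself; the function names are hypothetical.

```python
from typing import List

KITTI_SCALE = 256.0   # stored value = disparity * 256
UINT16_MAX = 65535    # largest value a 16-bit PNG pixel can hold


def encode_kitti_disparity(disparity: List[List[float]]) -> List[List[int]]:
    """Convert float disparities (in pixels) to KITTI's 16-bit integer encoding."""
    encoded = []
    for row in disparity:
        encoded_row = []
        for value in row:
            scaled = int(round(value * KITTI_SCALE))
            # Clamp to the representable uint16 range.
            encoded_row.append(min(max(scaled, 0), UINT16_MAX))
        encoded.append(encoded_row)
    return encoded


def decode_kitti_disparity(encoded: List[List[int]]) -> List[List[float]]:
    """Invert the encoding: stored integer / 256 gives disparity in pixels."""
    return [[value / KITTI_SCALE for value in row] for row in encoded]
```

The encoding gives a resolution of 1/256 pixel and a maximum representable disparity just under 256 pixels.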

3. Inference

Run my_img.py to run inference on KITTI 2012 and KITTI 2015 images. Please set datapath in my_img.py to your test data path.

Citation

If you find our work useful in your research, please consider citing our paper:

@article{wang2024cost,
  title={Cost Volume Aggregation in Stereo Matching Revisited: A Disparity Classification Perspective},
  author={Wang, Yun and Wang, Longguang and Li, Kunhong and Zhang, Yongjian and Wu, Dapeng Oliver and Guo, Yulan},
  journal={IEEE Transactions on Image Processing},
  year={2024},
  publisher={IEEE}
}

Acknowledgements

This project is based on GwcNet. We thank the original authors for their excellent work.
