Yingyan Li, Lue Fan, Jiawei He, Yuqi Wang, Yuntao Chen, Zhaoxiang Zhang and Tieniu Tan
This paper presents the LAtent World model (LAW), a self-supervised framework that predicts future scene features from current scene features and ego trajectories.
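The core idea can be sketched in a few lines of PyTorch. This is a minimal illustration under assumed shapes and module choices (the trajectory encoder, the two-layer MLP predictor, and all names below are ours for exposition), not the repository's actual implementation:

```python
# Minimal sketch of a latent world model: predict next-frame scene
# features from current features and the ego trajectory.
# All module names and tensor shapes are illustrative assumptions.
import torch
import torch.nn as nn

class LatentWorldModel(nn.Module):
    def __init__(self, feat_dim=256, traj_dim=6):
        super().__init__()
        # Embed the planned ego trajectory (e.g., a few future waypoints).
        self.traj_encoder = nn.Linear(traj_dim, feat_dim)
        # Fuse current scene features with the trajectory embedding.
        self.predictor = nn.Sequential(
            nn.Linear(feat_dim * 2, feat_dim),
            nn.ReLU(),
            nn.Linear(feat_dim, feat_dim),
        )

    def forward(self, cur_feat, ego_traj):
        # cur_feat: (B, N, C) scene feature tokens; ego_traj: (B, traj_dim)
        traj_emb = self.traj_encoder(ego_traj).unsqueeze(1).expand_as(cur_feat)
        return self.predictor(torch.cat([cur_feat, traj_emb], dim=-1))

model = LatentWorldModel()
cur_feat = torch.randn(2, 100, 256)   # current-frame features
ego_traj = torch.randn(2, 6)          # ego trajectory
pred_next = model(cur_feat, ego_traj)
next_feat = torch.randn(2, 100, 256)  # stand-in for real next-frame features
loss = nn.functional.mse_loss(pred_next, next_feat)
```

Because the supervision target is simply the feature extracted from the real next frame, no manual labels are needed, which is what makes the objective self-supervised.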
Create the conda environment and install the dependencies:

```shell
conda create -n law python=3.8 -y
conda activate law
pip install -r requirements.txt
pip install torch==1.9.1+cu111 torchvision==0.10.1+cu111 torchaudio==0.9.1 -f https://download.pytorch.org/whl/torch_stable.html
pip install mmcv-full==1.4.0
pip install mmdet==2.14.0
pip install mmsegmentation==0.14.1
pip install timm
```
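An optional sanity check can confirm the pinned versions took effect; the expected outputs follow directly from the commands above:

```python
# Verify the environment; expected versions match the install commands.
import torch, mmcv, mmdet, mmseg
print(torch.__version__)               # expect 1.9.1+cu111
print(torch.cuda.is_available())       # expect True on a CUDA 11.1 machine
print(mmcv.__version__)                # expect 1.4.0
print(mmdet.__version__)               # expect 2.14.0
print(mmseg.__version__)               # expect 0.14.1
```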
Install mmdetection3d (v0.17.1) from source:

```shell
conda activate law
git clone https://github.com/open-mmlab/mmdetection3d.git
cd /path/to/mmdetection3d
git checkout -f v0.17.1
python setup.py develop
pip install nuscenes-devkit==1.1.9
pip install yapf==0.40.1
```
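As above, a short optional check confirms the from-source build is importable:

```python
# Verify the from-source mmdetection3d install.
import mmdet3d
print(mmdet3d.__version__)  # expect 0.17.1
```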
For the annotation pickle files, download the train and val info files from VAD (a quick load check is sketched after the directory tree below).
Organize your dataset as follows:
```
LAW
├── projects/
├── data/
│   ├── can_bus/
│   ├── nuscenes/
│   │   ├── maps/
│   │   ├── samples/
│   │   ├── sweeps/
│   │   ├── v1.0-test/
│   │   ├── v1.0-trainval/
│   │   ├── vad_nuscenes_infos_temporal_train.pkl
│   │   ├── vad_nuscenes_infos_temporal_val.pkl
```
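To verify the pickle files are in place and readable, a minimal check like the following can help. The `infos`/`metadata` keys follow the usual mmdet3d info-file convention and are an assumption here, not a documented contract of this repo:

```python
# Hypothetical sanity check for the downloaded VAD info files.
import pickle

with open('data/nuscenes/vad_nuscenes_infos_temporal_train.pkl', 'rb') as f:
    data = pickle.load(f)
print(type(data))
if isinstance(data, dict):
    print(list(data.keys()))           # typically ['infos', 'metadata']
    print(len(data.get('infos', [])))  # number of training samples
```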
Training (this example runs the `law/default` config on 8 GPUs):

```shell
./tools/nusc_my_train.sh law/default 8
```
Evaluation:

```shell
./tools/dist_test $CONFIG $CKPT $NUM_GPU
```
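For example, a hypothetical invocation might look like `./tools/dist_test projects/configs/law/default.py work_dirs/law_default/latest.pth 8`; the config path, checkpoint path, and GPU count here are placeholders to replace with your own.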
| Method | L2 (m) 1s | L2 (m) 2s | L2 (m) 3s | L2 (m) Avg. | Collision (%) 1s | Collision (%) 2s | Collision (%) 3s | Collision (%) Avg. | Log and Checkpoints |
|---|---|---|---|---|---|---|---|---|---|
| LAW (Perception-Free) | 0.28 | 0.58 | 0.99 | 0.62 | 0.10 | 0.15 | 0.38 | 0.21 | Google Drive |
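The Avg. columns are the mean over the three horizons: (0.28 + 0.58 + 0.99) / 3 ≈ 0.62 m for L2, and (0.10 + 0.15 + 0.38) / 3 = 0.21% for collision rate.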
If you find this work helpful, please consider citing it as follows.
```bibtex
@misc{li2024enhancing,
  title={Enhancing End-to-End Autonomous Driving with Latent World Model},
  author={Yingyan Li and Lue Fan and Jiawei He and Yuqi Wang and Yuntao Chen and Zhaoxiang Zhang and Tieniu Tan},
  year={2024},
  eprint={2406.08481},
  archivePrefix={arXiv},
  primaryClass={cs.CV}
}
```