Sirui Xu
Hung Yu Ling
Yu-Xiong Wang
Liang-Yan Gui
University of Illinois Urbana-Champaign,
Electronic Arts
CVPR 2025 Highlight 🏆
We introduce InterMimic, a framework that enables a single policy to robustly learn from hours of imperfect MoCap data covering diverse full-body interactions with dynamic and varied objects, supporting both SMPL-X and Unitree G1 humanoids.
- [2025-04-05] We're excited by the overwhelming interest in humanoid robot support and are ahead of schedule in open-sourcing our Unitree G1 integration, starting with a small demo that supports the G1 with its original three-finger dexterous hands. Join us in exploring whole-body loco-manipulation with humanoid robots.
- [2025-04-04] InterMimic has been selected as a CVPR Highlight Paper 🏆. More exciting developments are on the way!
- [2025-03-25] We've officially released the codebase and a checkpoint for the teacher policy inference demo. Give it a try! ☕️
To set up the environment, follow these instructions:

- Create a new conda environment and install PyTorch:

  conda create -n intermimic python=3.8
  conda install pytorch torchvision torchaudio pytorch-cuda=11.6 -c pytorch -c nvidia
  pip install -r requirement.txt

  You may also build from environment.yml, which might contain redundancies:

  conda env create -f environment.yml
- Download and set up Isaac Gym.
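After these steps, a quick sanity check can help confirm the setup before running the demos. The snippet below is a minimal sketch, not part of the official instructions; it assumes the intermimic environment is active and that Isaac Gym's Python bindings are importable as isaacgym.

```bash
# Optional sanity checks (assumed setup, not part of the official instructions):
# 1. Confirm PyTorch was built with CUDA support and can see a GPU.
python -c "import torch; print(torch.__version__, torch.cuda.is_available())"
# 2. Confirm the Isaac Gym Python bindings import and a Gym instance can be acquired.
python -c "from isaacgym import gymapi; gymapi.acquire_gym(); print('Isaac Gym OK')"
```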
We've released a checkpoint for one of the 17 teacher policies on OMOMO, along with some sample data. To get started:
- Download the checkpoints and place them in the current directory.
- Then, run the following commands:

  conda activate intermimic
  sh scripts/test.sh
- 🔥 New! To try it on the Unitree G1 with its three-fingered dexterous hand:

  conda activate intermimic
  sh scripts/test_g1.sh
Roadmap:
- Release inference demo for the teacher policy
- Add support for Unitree-G1 with dexterous robot hands
- Release training pipeline for the teacher policy and processed MoCap data
- Release student policy distillation training, distilled reference data (physically correct HOI data❗️), and all related checkpoints
- Release evaluation pipeline for the student policy
- Release all data and processing scripts alongside the InterAct launch
- Release physics-based text-to-HOI and interaction prediction demo
If you find our work helpful, please cite:
@inproceedings{xu2025intermimic,
title = {{InterMimic}: Towards Universal Whole-Body Control for Physics-Based Human-Object Interactions},
author = {Xu, Sirui and Ling, Hung Yu and Wang, Yu-Xiong and Gui, Liang-Yan},
booktitle = {CVPR},
year = {2025},
}
Our data is sourced from InterAct. Please consider citing:
@inproceedings{xu2025interact,
title = {{InterAct}: Advancing Large-Scale Versatile 3D Human-Object Interaction Generation},
author = {Xu, Sirui and Li, Dongting and Zhang, Yucheng and Xu, Xiyan and Long, Qi and Wang, Ziyin and Lu, Yunzhi and Dong, Shuchang and Jiang, Hezi and Gupta, Akshat and Wang, Yu-Xiong and Gui, Liang-Yan},
booktitle = {CVPR},
year = {2025},
}
Please also consider citing the specific sub-dataset you used from InterAct.
Our integrated kinematic model builds upon InterDiff, HOI-Diff, and InterDreamer. Please consider citing the following if you find this component useful:
@inproceedings{xu2024interdreamer,
title = {{InterDreamer}: Zero-Shot Text to 3D Dynamic Human-Object Interaction},
author = {Xu, Sirui and Wang, Ziyin and Wang, Yu-Xiong and Gui, Liang-Yan},
booktitle = {NeurIPS},
year = {2024},
}
@inproceedings{xu2023interdiff,
title = {{InterDiff}: Generating 3D Human-Object Interactions with Physics-Informed Diffusion},
author = {Xu, Sirui and Li, Zhengyuan and Wang, Yu-Xiong and Gui, Liang-Yan},
booktitle = {ICCV},
year = {2023},
}
@article{peng2023hoi,
title = {HOI-Diff: Text-Driven Synthesis of 3D Human-Object Interactions using Diffusion Models},
author = {Peng, Xiaogang and Xie, Yiming and Wu, Zizhao and Jampani, Varun and Sun, Deqing and Jiang, Huaizu},
journal = {arXiv preprint arXiv:2312.06553},
year = {2023}
}
Our SMPL-X-based humanoid model is adapted from PHC. Please consider citing:
@inproceedings{Luo2023PerpetualHC,
title = {Perpetual Humanoid Control for Real-time Simulated Avatars},
author = {Luo, Zhengyi and Cao, Jinkun and Winkler, Alexander W. and Kitani, Kris and Xu, Weipeng},
booktitle = {ICCV},
year = {2023}
}
This repository builds upon the following excellent open-source projects:
- IsaacGymEnvs: Contributes to the environment code
- rl_games: Serves as the core reinforcement learning framework
- PHC: Used for data construction
- PhysHOI: Contributes to the environment code
- InterAct, OMOMO: Core resource for dataset construction
- InterDiff: Supports kinematic generation
- HOI-Diff: Supports kinematic generation
This codebase is released under the MIT License.
Please note that it also relies on external libraries and datasets, each of which may be subject to its own license and terms of use.