MLIP Arena is a unified platform for evaluating foundation machine learning interatomic potentials (MLIPs) beyond conventional error metrics. It focuses on revealing how physically sound the behavior learned by MLIPs is and on assessing their utilitarian performance in a way that is agnostic to the underlying model architecture. The platform's benchmarks are specifically designed to evaluate the readiness and reliability of open-source, open-weight models in accurately reproducing both qualitative and quantitative behaviors of atomic systems.
MLIP Arena leverages Prefect, a modern Pythonic workflow orchestrator, to enable advanced task/flow chaining and caching.
Note
Contributions of new tasks are very welcome! If you're interested in joining the effort, please reach out to Yuan at cyrusyc@berkeley.edu. See the project page for some outstanding tasks, or propose a new one in Discussion.
- [April 8, 2025] MLIP Arena accepted as an ICLR AI4Mat Spotlight! Huge thanks to all co-authors for their contributions!
pip install mlip-arena
Caution
We recommend a clean build in a new virtual environment because of compatibility issues among several popular MLIPs. We provide a single installation script using uv for minimal package conflicts and fast installation!
Linux
# (Optional) Install uv
curl -LsSf https://astral.sh/uv/install.sh | sh
source $HOME/.local/bin/env
# One script uv pip installation
bash scripts/install-linux.sh
# Or from command line
git clone https://github.com/atomind-ai/mlip-arena.git
cd mlip-arena
pip install torch==2.2.0
bash scripts/install-pyg.sh
bash scripts/install-dgl.sh
pip install -e .[test]
pip install -e .[mace]
# DeePMD
DP_ENABLE_TENSORFLOW=0 pip install -e .[deepmd]
Mac
# (Optional) Install uv
curl -LsSf https://astral.sh/uv/install.sh | sh
source $HOME/.local/bin/env
# One script uv pip installation
bash scripts/install-macosx.sh
MLIP Arena provides a unified interface for running all the compiled MLIPs. This can be done simply by looping through MLIPEnum:
from mlip_arena.models import MLIPEnum
from mlip_arena.tasks.md import run as MD
# from mlip_arena.tasks import MD # for convenient import
from mlip_arena.tasks.utils import get_calculator
from ase import units
from ase.build import bulk
atoms = bulk("Cu", "fcc", a=3.6)
results = []
for model in MLIPEnum:
    result = MD(
        atoms=atoms,
        calculator=get_calculator(
            model,
            calculator_kwargs=dict(),  # passing into calculator
            dispersion=True,
            dispersion_kwargs=dict(damping='bj', xc='pbe', cutoff=40.0 * units.Bohr),  # passing into TorchDFTD3Calculator
        ),
        ensemble="nve",
        dynamics="velocityverlet",
        total_time=1e3,  # 1 ps = 1e3 fs
        time_step=2,  # fs
    )
    results.append(result)
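As a quick sanity check on the time settings in the loop, the sketch below converts a total simulation time and a time step, both in femtoseconds as in the comments, into a number of integration steps. The helper name `n_md_steps` is hypothetical and not part of mlip_arena, and the assumption that the step count is simply the ratio of the two values is made here for illustration rather than taken from the library.

```python
def n_md_steps(total_time_fs: float, time_step_fs: float) -> int:
    """Hypothetical helper: number of integration steps for an MD run."""
    if time_step_fs <= 0:
        raise ValueError("time_step_fs must be positive")
    # Assumption: steps = total simulated time / time step, rounded to nearest.
    return int(round(total_time_fs / time_step_fs))


# The 1 ps run with a 2 fs time step used in the loop.
steps = n_md_steps(1e3, 2)
print(steps)  # 500
```

Each model in the loop therefore runs the same short trajectory, which keeps the comparison across models on equal footing.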
To run multiple benchmarks in parallel, call .submit on the task function (for example, MD.submit(...)) and wrap all the tasks in a flow, which dispatches them to workers for concurrent execution. See the Prefect documentation on tasks and flows for more details.
...
from prefect import flow
@flow
def run_all_tasks():
    futures = []
    for model in MLIPEnum:
        future = MD.submit(
            atoms=atoms,
            ...
        )
        futures.append(future)
    return [f.result(raise_on_failure=False) for f in futures]
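The submit-then-collect pattern in the flow can be tried without Prefect. The sketch below is only an analogy built on the standard library's `concurrent.futures`, not Prefect's API: `toy_task` and `run_all` are hypothetical names, and the failure handling approximates what `raise_on_failure=False` appears intended to do, namely gathering failures alongside successes instead of aborting the whole batch.

```python
from concurrent.futures import ThreadPoolExecutor


def toy_task(name: str) -> str:
    """Hypothetical stand-in for a benchmark task; fails for one input."""
    if name == "broken-model":
        raise RuntimeError(f"{name} failed")
    return f"{name}: done"


def run_all(names):
    """Submit every task first, then collect results in submission order."""
    with ThreadPoolExecutor(max_workers=4) as pool:
        futures = [pool.submit(toy_task, n) for n in names]
        results = []
        for f in futures:
            exc = f.exception()  # blocks until this task has finished
            # Keep the exception object instead of raising it, so one
            # failing model does not discard the other results.
            results.append(exc if exc is not None else f.result())
    return results


results = run_all(["model-a", "broken-model", "model-b"])
```

Submitting all tasks before reading any result is what allows them to run concurrently; calling each task and waiting for it in turn would serialize the work.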
For a more practical example, please refer to MOF classification.
The implemented tasks are available under mlip_arena.tasks.<module>.run, or via from mlip_arena.tasks import * for convenient imports (currently doesn't work if phonopy is not installed).
- OPT: Structure optimization
- EOS: Equation of state (energy-volume scan)
- MD: Molecular dynamics with flexible dynamics (NVE, NVT, NPT) and temperature/pressure scheduling (annealing, shearing, etc.)
- PHONON: Phonon calculation driven by phonopy
- NEB: Nudged elastic band
- NEB_FROM_ENDPOINTS: Nudged elastic band with convenient image interpolation (linear or IDPP)
- ELASTICITY: Elastic tensor calculation
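Because the convenient import currently fails when phonopy is missing, one option is to check for it first and fall back to the per-module path. The sketch below uses only the two import paths shown earlier (`mlip_arena.tasks.md.run` and `from mlip_arena.tasks import MD`); the helper names `module_available` and `load_md_task` are hypothetical, and returning `None` when mlip_arena itself is absent is a choice made for this example.

```python
import importlib
import importlib.util


def module_available(name: str) -> bool:
    """Return True if a top-level module can be found without importing it."""
    return importlib.util.find_spec(name) is not None


def load_md_task():
    """Hypothetical helper: return the MD task, or None without mlip_arena."""
    if not module_available("mlip_arena"):
        return None
    if module_available("phonopy"):
        # Convenient import; per the note above it needs phonopy installed.
        from mlip_arena.tasks import MD
        return MD
    # Fallback: import the task from its own module.
    return importlib.import_module("mlip_arena.tasks.md").run


MD = load_md_task()
print("MD task available:", MD is not None)
```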
MLIP Arena is now in pre-alpha. If you're interested in joining the effort, please reach out to Yuan at cyrusyc@berkeley.edu.
git lfs fetch --all
git lfs pull
streamlit run serve/app.py
Note
Please reuse, extend, or chain the general tasks defined above.
If you have pretrained MLIP models that you would like to contribute to MLIP Arena and benchmark in real time, there are two ways:
- Implement a new ASE Calculator class in mlip_arena/models/externals.
- Name your class with your awesome model name and add the same name to the registry with metadata.
Caution
Remove unnecessary outputs from the results class attribute to avoid errors in MD simulations. Please refer to the other class definitions for examples.
- Inherit from the Hugging Face ModelHubMixin class in your awesome model class definition. We recommend PyTorchModelHubMixin.
- Create a new Hugging Face Model repository and upload the model file using the push_to_hub function.
- Follow the template to code the I/O interface for your model here.
- Update the model registry with metadata.
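The caution about removing unnecessary outputs from results can be illustrated with a small filter. This is a minimal sketch under stated assumptions: the key set (energy, forces, stress) is assumed here to be what an MD run needs, and `ALLOWED_RESULT_KEYS` and `prune_results` are hypothetical names that do not exist in mlip_arena or ASE. The existing class definitions remain the authoritative examples.

```python
# Assumption: these are the only properties an MD run needs from a calculator.
ALLOWED_RESULT_KEYS = ("energy", "forces", "stress")


def prune_results(raw_outputs: dict) -> dict:
    """Hypothetical helper: drop model outputs that are not in the allow-list."""
    return {k: v for k, v in raw_outputs.items() if k in ALLOWED_RESULT_KEYS}


# Example with made-up model outputs; "latent" stands for any extra tensor
# a model might return that downstream MD code does not expect.
raw = {"energy": -3.5, "forces": [[0.0, 0.0, 0.0]], "latent": [1, 2, 3]}
pruned = prune_results(raw)
print(sorted(pruned))  # ['energy', 'forces']
```

Inside a custom calculator, such a filter would typically be applied just before the outputs are assigned to the results attribute.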
If you find the work useful, please consider citing the following:
@inproceedings{
chiang2025mlip,
title={{MLIP} Arena: Advancing Fairness and Transparency in Machine Learning Interatomic Potentials through an Open and Accessible Benchmark Platform},
author={Yuan Chiang and Tobias Kreiman and Elizabeth Weaver and Ishan Amin and Matthew Kuner and Christine Zhang and Aaron Kaplan and Daryl Chrzan and Samuel M Blau and Aditi S. Krishnapriyan and Mark Asta},
booktitle={AI for Accelerated Materials Design - ICLR 2025},
year={2025},
url={https://openreview.net/forum?id=ysKfIavYQE}
}