【ICLR 2025】🧪 Atomas: Hierarchical Adaptive Alignment on Molecule-Text for Unified Molecule Understanding and Generation
Project Page | Paper | Report Bug | Citation
- [2025/2/24]: We've released the source code and a pretrained checkpoint for Atomas. This includes everything needed to start using and extending Atomas in your own projects. Enjoy easy setup and quick integration!

We propose Atomas, a hierarchical molecular representation learning framework that jointly learns representations from SMILES strings and text. We design a Hierarchical Adaptive Alignment model that automatically learns fine-grained fragment correspondences between the two modalities and aligns their representations at three semantic levels. Atomas's end-to-end training framework supports both molecule understanding and generation, enabling a wider range of downstream tasks. Extensive experiments on retrieval and generation tasks demonstrate Atomas's superior performance, highlighting the efficacy of our method. Scaling experiments reveal Atomas's robustness and scalability. Additionally, visualization and qualitative analysis confirm the chemical significance of our approach.
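As a rough illustration of the idea of aligning two modalities at multiple semantic levels, the sketch below shows one way a multi-level contrastive objective could look in PyTorch. All names (`info_nce`, `pool`, the group counts) are hypothetical, simplified stand-ins and not Atomas's actual modules; in particular, Atomas learns the fragment groupings adaptively rather than using fixed chunks.

```python
# Hypothetical sketch of multi-level contrastive alignment (NOT the released
# implementation): token embeddings are pooled at three granularities and an
# InfoNCE loss is summed across levels.
import torch
import torch.nn.functional as F

def info_nce(a: torch.Tensor, b: torch.Tensor, tau: float = 0.07) -> torch.Tensor:
    """Symmetric InfoNCE between two batches of embeddings, shape (B, D)."""
    a, b = F.normalize(a, dim=-1), F.normalize(b, dim=-1)
    logits = a @ b.t() / tau                      # (B, B) similarity matrix
    labels = torch.arange(a.size(0), device=a.device)
    return 0.5 * (F.cross_entropy(logits, labels) +
                  F.cross_entropy(logits.t(), labels))

def pool(tokens: torch.Tensor, num_groups: int) -> torch.Tensor:
    """Mean-pool a (B, T, D) token sequence into num_groups fixed chunks,
    then average the chunks -- a crude stand-in for a learned grouping.
    Assumes T >= num_groups."""
    B, T, D = tokens.shape
    chunks = tokens[:, : (T // num_groups) * num_groups].reshape(B, num_groups, -1, D)
    return chunks.mean(dim=2).mean(dim=1)         # (B, D)

def hierarchical_alignment_loss(smiles_tokens: torch.Tensor,
                                text_tokens: torch.Tensor) -> torch.Tensor:
    """Sum InfoNCE over three semantic levels, from fine to coarse."""
    loss = torch.zeros((), device=smiles_tokens.device)
    for num_groups in (8, 4, 1):                  # illustrative fragment/group/molecule levels
        loss = loss + info_nce(pool(smiles_tokens, num_groups),
                               pool(text_tokens, num_groups))
    return loss
```

Here `smiles_tokens` and `text_tokens` would be `(batch, seq_len, dim)` outputs of the SMILES and text encoders, respectively.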
To get a local copy up and running, follow these simple steps.
To install requirements:
pip install -r requirements.txt
You can download pretrained models here:
- Atomas pre-trained on the PubchemSTM-distill dataset and fine-tuned on the CHEBI-20 dataset for the molecule generation task.
To train the model(s) in the paper, run this command:
python main.py --project Atomas --data_dir <your data path> --dataset <choose pubchem or chebi-20 dataset> --model_size <choose base or large> --task <choose genmol or gentext>
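For example, to fine-tune the base model on the CHEBI-20 dataset for molecule generation (the data path below is a placeholder):
python main.py --project Atomas --data_dir ./data --dataset chebi-20 --model_size base --task genmol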
To evaluate a model, run:
python eval.py --resume_from_checkpoint mymodel.ckpt
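The `--resume_from_checkpoint` flag suggests a PyTorch Lightning-style `.ckpt` file. As a convenience (not part of this repo), you can sanity-check a downloaded checkpoint before evaluating; the `"state_dict"` key layout is an assumption about the checkpoint format.

```python
# Hypothetical sanity check for a downloaded checkpoint (not part of this
# repo). Assumes a PyTorch Lightning-style .ckpt with a "state_dict" entry.
import torch

ckpt = torch.load("mymodel.ckpt", map_location="cpu")
print(sorted(ckpt.keys()))            # top-level checkpoint entries
state = ckpt.get("state_dict", ckpt)  # fall back to a raw state dict
print(f"{len(state)} weight tensors")
```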
If you find our work useful in your research, or if you use parts of this code, please consider citing our paper:
@article{zhang2024atomas,
  title={Atomas: Hierarchical alignment on molecule-text for unified molecule understanding and generation},
  author={Zhang, Yikun and Ye, Geyan and Yuan, Chaohao and Han, Bo and Huang, Long-Kai and Yao, Jianhua and Liu, Wei and Rong, Yu},
  journal={arXiv preprint arXiv:2404.16880},
  year={2024}
}
Yikun Zhang - yikun.zhang@stu.pku.edu.cn