This repository contains the data and code for "Machine Learning for Reaction Performance Prediction in Allylic Substitution Enhanced by Automatic Extraction of Substrate-Aware Descriptor".
pip install rdkit
pip install -U scikit-learn
pip install autogluon
pip install xgboost
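After installing, a quick check that the packages can be found may save time before opening the notebook. The snippet below is a minimal sketch, not part of the repository: the helper name `missing_packages` is hypothetical, and the import names are assumptions based on the usual mapping from pip package to module (for example, scikit-learn is imported as `sklearn`).

```python
from importlib.util import find_spec


def missing_packages(import_names):
    """Return the names from `import_names` that cannot be located for import."""
    # find_spec returns None when a top-level module is not installed.
    return [name for name in import_names if find_spec(name) is None]


# Assumed import names for the four pip packages listed above.
REQUIRED = ["rdkit", "sklearn", "autogluon", "xgboost"]

if __name__ == "__main__":
    missing = missing_packages(REQUIRED)
    print("missing:", ", ".join(missing) if missing else "none")
```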
The data used in the paper is in the data folder. Scripts for data cleaning and related preprocessing are available in the tools folder.
The files in gjf format contain the optimized 3D molecular conformations.
The scripts are in the utils folder.
- SMILES strings are converted into gjf files with Open Babel.
python smile2gif.py 'SAMPLE1.csv'
- Gaussian 09 and Multiwfn are used for the CDFT (conceptual density functional theory) calculations.
python slurm_generate.py 'gjf_path'
python copyChk.py 'gjf_path'
python copyChk_2.py 'gjf_path'
python submit_jobs.py 'script_path'
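The SMILES-to-gjf step in the first bullet can be sketched roughly as follows. This is an illustration under stated assumptions, not the contents of `smile2gif.py`: it assumes the `obabel` command-line tool is on the PATH, that the CSV has a column named `smiles` (a hypothetical column name), and the function names and the `mol_<index>.gjf` naming scheme are invented for the example.

```python
from __future__ import annotations

import csv
import shutil
import subprocess
from pathlib import Path


def build_obabel_command(smiles: str, out_path: str) -> list[str]:
    """Return the obabel argument list that writes one SMILES string to a file."""
    # "-:" passes the SMILES inline, "-O" names the output file (the format is
    # taken from the .gjf extension), "--gen3d" requests 3D coordinates.
    return ["obabel", f"-:{smiles}", "-O", out_path, "--gen3d"]


def convert_csv(csv_path: str, out_dir: str, column: str = "smiles") -> list[Path]:
    """Convert every SMILES in `column` of a CSV file; return the paths written."""
    if shutil.which("obabel") is None:
        raise RuntimeError("obabel executable not found on PATH")
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    written = []
    with open(csv_path, newline="") as handle:
        for index, row in enumerate(csv.DictReader(handle)):
            target = out / f"mol_{index}.gjf"  # hypothetical naming scheme
            subprocess.run(build_obabel_command(row[column], str(target)), check=True)
            written.append(target)
    return written
```

Calling `convert_csv("SAMPLE1.csv", "gjf_path")` would then write one gjf file per row. The coordinates produced by `--gen3d` are only a starting geometry; the optimization itself happens in the Gaussian step.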
A running example is given in demo.ipynb.