This repository collects papers on the application of AI to the study of environmental pollutants, focusing on three areas: Molecular Annotation for Emerging Contaminants, Pollutant Database/Benchmark, and Property Prediction of Pollutants.
- Tandem mass spectrum prediction for small molecules using graph transformers. Nature Machine Intelligence, 2024. paper
- Machine learning–enhanced molecular network reveals global exposure to hundreds of unknown PFAS. Science Advances, 2024. paper
- Efficiently predicting high resolution mass spectra with graph neural networks. ICML, 2023. paper
- Prefix-tree decoding for predicting mass spectra from molecules. NeurIPS, 2023. paper
- Multi-scale sinusoidal embeddings enable learning on high resolution mass spectrometry data. ICLR workshop, 2023. paper
- Annotating metabolite mass spectra with domain-inspired chemical formula transformers. Nature Machine Intelligence, 2023. paper
- Annotation of natural product compound families using molecular networking topology and structural similarity fingerprinting. Nature Communications, 2023. paper
- Joint structural annotation of small molecules using liquid chromatography retention order and tandem mass spectrometry data. Nature Machine Intelligence, 2022. paper
- Metabolite annotation from knowns to unknowns through knowledge-guided multi-layer metabolic networking. Nature Communications, 2022. paper
- MSNovelist: de novo structure generation from mass spectra. Nature Methods, 2022. paper
- Metabolite discovery through global annotation of untargeted metabolomics data. Nature Methods, 2021. paper
- Systematic classification of unknown metabolites using high-resolution fragmentation mass spectra. Nature Biotechnology, 2021. paper
- Database-independent molecular formula annotation using Gibbs sampling through ZODIAC. Nature Machine Intelligence, 2020. paper
- Feature-based molecular networking in the GNPS analysis environment. Nature Methods, 2020. paper
- Deep imitation learning for molecular inverse problems. NeurIPS, 2019. paper
- SIRIUS 4: a rapid tool for turning tandem mass spectra into metabolite structure information. Nature Methods, 2019. paper
- Natural products targeting strategies involving molecular networking: different manners, one goal. Natural Product Reports, 2019. paper
- Searching molecular structure databases with tandem mass spectra using CSI:FingerID. PNAS, 2015. paper
- Mass spectral molecular networking of living microbial colonies. PNAS, 2012. paper
- RepoRT: a comprehensive repository for small molecule retention times. Nature Methods, 2024. paper, code
- MassSpecGym: A benchmark for the discovery and identification of molecules. NeurIPS, 2024. paper, code
- Transformers enable accurate prediction of acute and chronic chemical toxicity in aquatic organisms. Science Advances, 2024. paper
- Times are changing but order matters: Transferable prediction of small molecule liquid chromatography retention times. ChemRxiv, 2024. paper
- OPERA models for predicting physicochemical properties and environmental fate endpoints. Journal of Cheminformatics, 2018. paper