Molecular pKa Prediction with Deep Learning and Chemical Fingerprints
Abstract
In drug discovery and design, determining molecular properties, in particular a molecule's pKa value, is essential for understanding and optimising the biological activity of drugs. Alongside traditional chemical methods, artificial intelligence techniques such as machine learning and deep learning are increasingly used to predict molecular properties and to support drug design. In this paper, we investigate the effect of molecular properties on pKa prediction and implement the prediction with a deep learning model. The model represents molecular structures using molecular weight together with chemical fingerprints such as Morgan fingerprints. The dataset used in this study contains 2093 molecular data points obtained from PubChem. The presented method predicts the pKa values of many molecules with 96.66% accuracy. This can save time and money in the drug discovery and design process and provide valuable guidance for experimental studies. The paper also presents a comprehensive analysis of the training process, accuracy metrics and performance of the deep learning model. Overall, the study evaluates the impact of molecular features on pKa prediction and demonstrates the success of the deep learning model in these predictions.
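The feature representation described in the abstract (a Morgan-style fingerprint combined with molecular weight) can be illustrated with a minimal, standard-library-only sketch. This is not the authors' implementation, and the abstract does not name the software they used; in practice a cheminformatics toolkit would normally generate the fingerprints. The hand-written molecular graph, the helper names (`circular_fingerprint`, `molecular_weight`, `featurize`), the 64-bit length and the radius of 2 are all assumptions chosen for illustration, and the hashing scheme is a simplification that ignores bond order, charge and aromaticity.

```python
import hashlib

# Average atomic masses (g/mol) for the few elements used in this toy example.
ATOMIC_MASS = {"H": 1.008, "C": 12.011, "N": 14.007, "O": 15.999}


def _stable_hash(text: str) -> int:
    """Deterministic 32-bit hash (Python's built-in hash() is salted per run)."""
    return int.from_bytes(hashlib.sha256(text.encode()).digest()[:4], "big")


def circular_fingerprint(atoms, bonds, radius=2, n_bits=64):
    """Simplified Morgan-style fingerprint for a hand-written molecular graph.

    atoms: list of (element, implicit_hydrogen_count) for heavy atoms
    bonds: list of (i, j) index pairs between heavy atoms (bond order ignored)
    """
    neighbours = {i: [] for i in range(len(atoms))}
    for i, j in bonds:
        neighbours[i].append(j)
        neighbours[j].append(i)

    # Radius 0: each atom is identified by element, degree and hydrogen count.
    ids = [
        _stable_hash(f"{el}|{len(neighbours[i])}|{h}")
        for i, (el, h) in enumerate(atoms)
    ]
    seen = set(ids)

    # Each iteration widens every atom's environment by one bond.
    for _ in range(radius):
        ids = [
            _stable_hash(f"{ids[i]}|{sorted(ids[j] for j in neighbours[i])}")
            for i in range(len(atoms))
        ]
        seen.update(ids)

    # Fold all environment identifiers into a fixed-length bit vector.
    bits = [0] * n_bits
    for identifier in seen:
        bits[identifier % n_bits] = 1
    return bits


def molecular_weight(atoms):
    """Sum heavy-atom masses plus the masses of their implicit hydrogens."""
    return sum(ATOMIC_MASS[el] + h * ATOMIC_MASS["H"] for el, h in atoms)


def featurize(atoms, bonds, n_bits=64):
    """Concatenate fingerprint bits with molecular weight as one feature vector."""
    return circular_fingerprint(atoms, bonds, n_bits=n_bits) + [molecular_weight(atoms)]


# Acetic acid, CH3-C(=O)-OH, written as heavy atoms with implicit hydrogens.
ACETIC_ACID_ATOMS = [("C", 3), ("C", 0), ("O", 0), ("O", 1)]
ACETIC_ACID_BONDS = [(0, 1), (1, 2), (1, 3)]

features = featurize(ACETIC_ACID_ATOMS, ACETIC_ACID_BONDS)
```

A vector like `features` (fingerprint bits followed by one molecular-weight value) is the kind of input a regression network could take to predict pKa; real fingerprints are usually much longer, for example 1024 or 2048 bits.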
Keywords
Details
Primary Language
English
Subjects
Quality Assurance, Chemometrics, Traceability and Metrological Chemistry
Journal Section
Research Article
Authors
Publication Date
June 30, 2025
Submission Date
October 31, 2024
Acceptance Date
April 28, 2025
Published in Issue
Year 2025 Volume: 46 Number: 2