Learning from the ligand: using ligand-based features to improve binding affinity prediction
- PMID: 31598630
- DOI: 10.1093/bioinformatics/btz665
Abstract
Motivation: Machine learning scoring functions for protein-ligand binding affinity prediction have been found to consistently outperform classical scoring functions. Structure-based scoring functions for universal affinity prediction typically use features describing interactions derived from the protein-ligand complex, with limited information about the chemical or topological properties of the ligand itself.
Results: We demonstrate that the performance of machine learning scoring functions is consistently improved by the inclusion of diverse ligand-based features. For example, a Random Forest (RF) combining the features of RF-Score v3 with RDKit molecular descriptors achieved Pearson correlation coefficients of up to 0.836, 0.780 and 0.821 on the PDBbind 2007, 2013 and 2016 core sets, respectively, compared to 0.790, 0.746 and 0.814 when using the features of RF-Score v3 alone. Excluding proteins and/or ligands that are similar to those in the test sets from the training set has a significant effect on scoring function performance, but does not remove the predictive power of ligand-based features. Furthermore, an RF using only ligand-based features is predictive at a level similar to classical scoring functions, and it appears to be predicting the mean binding affinity of a ligand for its protein targets.
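The comparison described in the Results can be sketched as follows. This is a minimal illustration, not the authors' code: it uses scikit-learn's `RandomForestRegressor` and NumPy, and the feature matrices are synthetic placeholders standing in for structure-based interaction features and ligand descriptors. The feature widths, sample counts, hyperparameters and helper names (`pearson_r`, `fit_and_score`) are assumptions made for the example. The synthetic target is deliberately built to depend on one column from each block, so the output says nothing about PDBbind and does not reproduce the reported correlations.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor


def pearson_r(x, y):
    """Pearson correlation coefficient between two 1-D sequences."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    xc = x - x.mean()
    yc = y - y.mean()
    return float((xc @ yc) / np.sqrt((xc @ xc) * (yc @ yc)))


def fit_and_score(X_train, y_train, X_test, y_test, seed=0):
    """Fit a Random Forest regressor and return Pearson r on the test set."""
    model = RandomForestRegressor(n_estimators=100, random_state=seed)
    model.fit(X_train, y_train)
    return pearson_r(model.predict(X_test), y_test)


# Synthetic placeholder data (assumption): real inputs would be computed
# from protein-ligand complexes and from the ligand alone.
rng = np.random.default_rng(0)
n_train, n_test = 300, 100
n = n_train + n_test
structure_feats = rng.normal(size=(n, 40))  # stand-in for interaction features
ligand_feats = rng.normal(size=(n, 20))     # stand-in for ligand descriptors

# Toy affinity that depends on one feature from each block plus noise.
affinity = structure_feats[:, 0] + ligand_feats[:, 0] + 0.1 * rng.normal(size=n)

# Combining feature sets is a column-wise concatenation.
combined = np.hstack([structure_feats, ligand_feats])

train, test = slice(0, n_train), slice(n_train, n)
r_structure = fit_and_score(
    structure_feats[train], affinity[train], structure_feats[test], affinity[test]
)
r_combined = fit_and_score(
    combined[train], affinity[train], combined[test], affinity[test]
)

print(f"structure-based features only: r = {r_structure:.3f}")
print(f"structure + ligand features:   r = {r_combined:.3f}")
```

In a real setting the train/test split would follow the benchmark's own partition (for example, a core set held out from the remaining complexes) rather than a simple index slice.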
Availability and implementation: Data and code to reproduce all the results are freely available at http://opig.stats.ox.ac.uk/resources.
Supplementary information: Supplementary data are available at Bioinformatics online.
© The Author(s) 2019. Published by Oxford University Press. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com.
Similar articles
- Extended connectivity interaction features: improving binding affinity prediction through chemical description. Bioinformatics. 2021 Jun 16;37(10):1376-1382. doi: 10.1093/bioinformatics/btaa982. Bioinformatics. 2021. PMID: 33226061
- Structural and Sequence Similarity Makes a Significant Impact on Machine-Learning-Based Scoring Functions for Protein-Ligand Interactions. J Chem Inf Model. 2017 Apr 24;57(4):1007-1012. doi: 10.1021/acs.jcim.7b00049. Epub 2017 Apr 5. J Chem Inf Model. 2017. PMID: 28358210
- Classical scoring functions for docking are unable to exploit large volumes of structural and interaction data. Bioinformatics. 2019 Oct 15;35(20):3989-3995. doi: 10.1093/bioinformatics/btz183. Bioinformatics. 2019. PMID: 30873528
- Learning from Docked Ligands: Ligand-Based Features Rescue Structure-Based Scoring Functions When Trained on Docked Poses. J Chem Inf Model. 2022 Nov 28;62(22):5329-5341. doi: 10.1021/acs.jcim.1c00096. Epub 2021 Sep 1. J Chem Inf Model. 2022. PMID: 34469150 Review.
- Application of Machine Learning Techniques to Predict Binding Affinity for Drug Targets: A Study of Cyclin-Dependent Kinase 2. Curr Med Chem. 2021;28(2):253-265. doi: 10.2174/2213275912666191102162959. Curr Med Chem. 2021. PMID: 31729287 Review.
Cited by
- A novel method for drug-target interaction prediction based on graph transformers model. BMC Bioinformatics. 2022 Nov 3;23(1):459. doi: 10.1186/s12859-022-04812-w. BMC Bioinformatics. 2022. PMID: 36329406 Free PMC article.
- Delta Machine Learning to Improve Scoring-Ranking-Screening Performances of Protein-Ligand Scoring Functions. J Chem Inf Model. 2022 Jun 13;62(11):2696-2712. doi: 10.1021/acs.jcim.2c00485. Epub 2022 May 17. J Chem Inf Model. 2022. PMID: 35579568 Free PMC article. Review.
- Machine learning approaches for biomolecular, biophysical, and biomaterials research. Biophys Rev (Melville). 2022 Jun 3;3(2):021306. doi: 10.1063/5.0082179. eCollection 2022 Jun. Biophys Rev (Melville). 2022. PMID: 38505413 Free PMC article. Review.
- ALDELE: All-Purpose Deep Learning Toolkits for Predicting the Biocatalytic Activities of Enzymes. J Chem Inf Model. 2024 Apr 22;64(8):3123-3139. doi: 10.1021/acs.jcim.4c00058. Epub 2024 Apr 4. J Chem Inf Model. 2024. PMID: 38573056 Free PMC article.
- Scoring Functions for Protein-Ligand Binding Affinity Prediction using Structure-Based Deep Learning: A Review. Front Bioinform. 2022 Jun 17;2:885983. doi: 10.3389/fbinf.2022.885983. Front Bioinform. 2022. PMID: 36187180 Free PMC article.