Supervised learning with decision tree-based methods in computational and systems biology
- PMID: 20023720
- DOI: 10.1039/b907946g
Abstract
At the intersection of artificial intelligence and statistics, supervised learning allows algorithms to build predictive models automatically from observations of a system alone. Over the last twenty years, supervised learning has been a tool of choice for analyzing the ever-growing and increasingly complex data generated in molecular biology, with successful applications in genome annotation, function prediction, and biomarker discovery. Among supervised learning methods, decision tree-based methods stand out as nonparametric methods with the distinctive feature of combining interpretability, efficiency, and, when used in ensembles of trees, excellent accuracy. The goal of this paper is to provide an accessible and comprehensive introduction to this class of methods. The first part of the review gives an intuitive but complete description of decision tree-based methods and discusses their strengths and limitations relative to other supervised learning methods. The second part surveys their applications in computational and systems biology.