Learning gene functional classifications from multiple data types
- PMID: 12015889
- DOI: 10.1089/10665270252935539
Abstract
In our attempts to understand cellular function at the molecular level, we must be able to synthesize information from disparate types of genomic data. We consider the problem of inferring gene functional classifications from a heterogeneous data set consisting of DNA microarray expression measurements and phylogenetic profiles from whole-genome sequence comparisons. We demonstrate the application of the support vector machine (SVM) learning algorithm to this functional inference task. Our results suggest the importance of exploiting prior information about the heterogeneity of the data. In particular, we propose an SVM kernel function that is explicitly heterogeneous. In addition, we describe feature scaling methods for further exploiting prior knowledge of heterogeneity by giving each data type different weights.
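The abstract does not spell out the form of the heterogeneous kernel or the weighting scheme, so the following is only a minimal sketch of one way the idea could be realized. It assumes that a separate polynomial kernel is computed on each data type (expression vectors and phylogenetic profiles) and that the two are combined as a weighted sum. The function names, the degree of 3, the weights `w_expr` and `w_phylo`, and the toy data are all hypothetical choices for illustration, not values taken from the paper.

```python
from typing import List, Sequence

Vector = Sequence[float]


def poly_kernel(x: Vector, y: Vector, degree: int = 3) -> float:
    """Polynomial kernel (x . y + 1) ** degree on a single data type."""
    dot = sum(a * b for a, b in zip(x, y))
    return (dot + 1.0) ** degree


def heterogeneous_kernel(
    expr_x: Vector,
    phylo_x: Vector,
    expr_y: Vector,
    phylo_y: Vector,
    w_expr: float = 1.0,
    w_phylo: float = 1.0,
    degree: int = 3,
) -> float:
    """Weighted sum of per-data-type kernels (assumed form, see lead-in).

    Features from the two data types are never multiplied together,
    which encodes the prior knowledge that they come from different sources.
    """
    k_expr = poly_kernel(expr_x, expr_y, degree)
    k_phylo = poly_kernel(phylo_x, phylo_y, degree)
    return w_expr * k_expr + w_phylo * k_phylo


def gram_matrix(
    expr: Sequence[Vector], phylo: Sequence[Vector], **kwargs: float
) -> List[List[float]]:
    """Kernel matrix over genes; row i of expr and phylo describe gene i."""
    n = len(expr)
    return [
        [
            heterogeneous_kernel(expr[i], phylo[i], expr[j], phylo[j], **kwargs)
            for j in range(n)
        ]
        for i in range(n)
    ]


# Toy data (hypothetical): two genes, 2 expression values and 3 profile bits each.
toy_expr = [[1.0, 0.0], [0.0, 1.0]]
toy_phylo = [[1.0, 0.0, 1.0], [1.0, 1.0, 0.0]]

K_equal = gram_matrix(toy_expr, toy_phylo)
K_weighted = gram_matrix(toy_expr, toy_phylo, w_phylo=0.5)
```

A matrix built this way could be passed to any SVM implementation that accepts a precomputed kernel. Because a non-negatively weighted sum of valid kernels is itself a valid kernel, changing `w_expr` and `w_phylo` is one simple way to give each data type a different weight.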